sql窗口函数学习笔记
窗口函数介绍:
一. 应用场景:
- 分组排序分析
- 分组求和,分组累加求和,求平均等
- 求数据同比环比时数据偏移
二.窗口函数介绍:
函数() over (partition by <分组列> order by <排序列> rows between 开始行 and 结束行)
- 窗口函数是通过窗口函数名+窗口数据集组成。
- 窗口函数是对where或者group by子句处理后的结果进行操作,所以窗口函数原则上只能写在select子句中。
- 窗口函数分为三大类:排序类窗口,聚合类窗口,取值函数(极值,偏移,切片等)
- 排序函数():rank() ,dense_rank(),row_number()
- 聚合函数():avg(),percent_rank(),cume_dist(),sum()
- 取值函数():first_value(),last_value(),nth_value(),lag(),lead(),ntile()
窗口函数应用:
一. 排序函数
- 基础数据:
- 使用分组函数按照班级分组,分数排序进行查询
select name AS "姓名" ,class as "班级",score as "分数"
,RANK() over(partition by class order by score desc) rank_id -- rank()可并列不可连续 1,1,3,4
,DENSE_RANK() over(partition by class order by score desc) dense_rank_id-- DENSE_RANK()可并列连续 1,1,2,3
,ROW_NUMBER() over(partition by class order by score DESC) row_number_id-- ROW_NUMBER()不可并列但连续 1,2,3,4
from mark
- rank()可并列不可连续 1,1,3,4
- DENSE_RANK()可并列连续 1,1,2,3
- ROW_NUMBER()不可并列但连续 1,2,3,4
结果:
二.聚合函数
- 基本聚合函数使用
select name AS "姓名" ,class as "班级",score as "分数"
,max(score) over(partition by class) as "最高得分"
,min(score) over(partition by class) as "最低得分"
,avg(score) over (partition by class ) as "平均分"
,sum(score) over(partition by class) as "总分"
,count(score) over (partition by class) as "score计数"
,count(1) over(partition by class) as "计数"
from mark
- max():最大值
- min():最小值
- sum():求和
- avg():求平均值
- count():计数
- 复杂聚合函数使用
select name AS "姓名" ,class as "班级",score as "分数"
,ROW_NUMBER() over(order by score desc) as "行数"
,CUME_DIST() over(order by score desc) as "cume_dist"
,concat(cast(100*CUME_DIST() over(order by score desc) as DECIMAL(10,1)),'%') as "cume_dist_%"
,RANK() over(order by score desc) rank_id -- rank()可并列不可连续 1,1,3,4
,PERCENT_RANK() over(order by score desc) as "percent_rank"
,concat(cast(100*PERCENT_RANK() over(order by score desc) as DECIMAL(10,1)),'%') as "percent_rank_%"
from mark
- cume_dist:大于等于当前行值得行数/分组内总行数,用于计算累计分布值 ,0<=x<=1
- percent_rank:窗口内当前行得RANK值-1/窗口内总行数-1(这里得rank值,就是rank()函数得返回值),用于计算当前行在整个分区内的排名百分位数,0<=x<=1