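Preface
This post continues my notes on the advanced SQL problems. The previous post had already grown to nearly ten thousand characters, which was getting long, so the remaining problems go into a new post instead of republishing the old one again and again.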
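SQL126 Average active days and monthly active users
This problem asks for statistics with one output row per month, so it needs a GROUP BY query.
First the month has to be extracted by formatting submit_time with date_format. Then, for each month, compute the average number of active days (the sum of every user's active days divided by the number of distinct users)
and the monthly active users (the number of users in that month whose submit_time is not null).
At first I used count(distinct(date_format(submit_time,'%Y-%m'))) to do the counting by month, but the numbers did not match.
After several attempts I found that distinct is the wrong tool here: inside one month group every row formats to the same month string, so the distinct count is always 1. A plain count(submit_time) counts every non-null submission. Note that count skips NULL values but does not remove duplicates by itself.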
select date_format(submit_time,'%Y%m') AS month,
round(count(submit_time)/count(distinct(uid)),2) AS avg_active_days,
count(distinct(uid)) AS mau
from exam_record
where year(submit_time)=2021 AND submit_time is not null
group by date_format(submit_time,'%Y%m')
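To make the behaviour of count concrete, here is a minimal sketch on a made-up table. The table count_demo and its rows are assumptions for illustration only, not the judge's exam_record data.

```sql
-- Hypothetical demo table (not the judge's data).
CREATE TABLE count_demo (uid INT, submit_time DATETIME);
INSERT INTO count_demo VALUES
  (1001, '2021-07-02 09:01:01'),
  (1001, '2021-07-02 10:30:00'),  -- same user, same day
  (1002, '2021-07-05 19:01:01'),
  (1003, NULL);                   -- unfinished attempt

SELECT COUNT(*)                                         AS all_rows,        -- 4: every row
       COUNT(submit_time)                               AS non_null_rows,   -- 3: NULL skipped, duplicates kept
       COUNT(DISTINCT uid)                              AS distinct_users,  -- 3: 1001, 1002, 1003
       COUNT(DISTINCT DATE_FORMAT(submit_time, '%Y%m')) AS distinct_months  -- 1: only '202107'
FROM count_demo;
```

The last column shows why the distinct count of formatted months cannot serve as a numerator: within a single month it collapses to 1.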
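Then one test case failed. On inspection, one user had submitted two different papers on the same day, and that day was being counted twice.
So the deduplication has to be done on the combination of uid and day.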
select date_format(submit_time,'%Y%m') AS month,
round((count(distinct uid,date_format(submit_time,'%Y%m%d')))/count(distinct uid),2)
AS avg_active_days,
count(distinct uid) AS mau
from exam_record
where year(submit_time)=2021
group by date_format(submit_time,'%Y%m')
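distinct works without parentheses here, because it is a keyword rather than a function.
The round call nests quite a few parentheses, so take care to match them.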
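The combined deduplication can be checked on a small made-up table. This is only a sketch: active_demo and its rows are hypothetical, and DATE(submit_time) is used in place of the date_format call purely for brevity.

```sql
-- Hypothetical demo table: user 1001 submits two papers on 2021-07-02.
CREATE TABLE active_demo (uid INT, exam_id INT, submit_time DATETIME);
INSERT INTO active_demo VALUES
  (1001, 9001, '2021-07-02 09:01:01'),
  (1001, 9002, '2021-07-02 10:30:00'),
  (1001, 9001, '2021-07-06 12:00:00'),
  (1002, 9001, '2021-07-05 19:01:01');

SELECT COUNT(submit_time)                     AS submissions,       -- 4
       COUNT(DISTINCT uid, DATE(submit_time)) AS active_user_days,  -- 3: (1001,07-02) (1001,07-06) (1002,07-05)
       COUNT(DISTINCT uid)                    AS mau                -- 2
FROM active_demo;
```

Dividing active_user_days by mau gives 1.5 here, whereas dividing submissions by mau would give 2. MySQL's multi-expression COUNT(DISTINCT ...) also skips any combination that contains a NULL, which is why the second query needs no explicit null filter.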
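SQL127 Monthly total and daily average number of questions practiced
This is similar to the previous problem: count the total number of questions per month and the daily average. group by is certainly needed, but the output also needs a row with the overall total (the third row of the expected result).
The daily average needs the number of days in the month, which DAY(LAST_DAY(yourColumnName)) returns.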
select date_format(submit_time,'%Y%m') AS submit_month
,count(date_format(submit_time,'%Y%m')) AS month_q_cnt
,round(count(date_format(submit_time,'%Y%m'))/DAY(LAST_DAY(submit_time)),3) AS avg_day_q_cnt
from practice_record
where year(submit_time) = 2021
group by date_format(submit_time,'%Y%m')
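This query raises an error:
SQL_ERROR_INFO: "Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'practice_record.submit_time' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by"
After looking it up, the cause is /DAY(LAST_DAY(submit_time)): the expression is computed from submit_time, a column that is neither in GROUP BY nor inside an aggregate, so within each month group it is still one value per row rather than a single number. Wrapping it in avg(), min() or max() collapses it to one value per group, which can then be used as the denominator. All rows of a month yield the same day count, so the three aggregates give the same result.
With that change the output is correct.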
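The fix can be reproduced on a tiny made-up table. practice_demo and its rows are assumptions for illustration; the commented-out query is the shape that only_full_group_by rejects.

```sql
-- Hypothetical demo table: two submissions in August, one in September.
CREATE TABLE practice_demo (uid INT, submit_time DATETIME);
INSERT INTO practice_demo VALUES
  (1001, '2021-08-02 11:41:01'),
  (1002, '2021-08-20 09:00:00'),
  (1001, '2021-09-01 19:38:01');

-- Rejected under ONLY_FULL_GROUP_BY: submit_time is neither grouped nor aggregated.
-- SELECT DATE_FORMAT(submit_time, '%Y%m'), DAY(LAST_DAY(submit_time))
-- FROM practice_demo
-- GROUP BY DATE_FORMAT(submit_time, '%Y%m');

-- Accepted: the per-row day count is collapsed to one value per group.
SELECT DATE_FORMAT(submit_time, '%Y%m')  AS submit_month,
       COUNT(*)                          AS month_q_cnt,
       MAX(DAY(LAST_DAY(submit_time)))   AS days_in_month
FROM practice_demo
GROUP BY DATE_FORMAT(submit_time, '%Y%m')
ORDER BY submit_month;
-- 202108 | 2 | 31
-- 202109 | 1 | 30
```

MySQL 5.7 and later also provide ANY_VALUE() for this situation. It tells the server that any row's value in the group is acceptable, which is true here because every row in a month has the same day count.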
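Next, the overall total has to be output as the last row.
The solutions I saw all use union or union all. union removes duplicate rows from the combined result and union all keeps them; both stack the results of two queries vertically into a single result set.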
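The difference between the two operators is easiest to see on constant rows. A minimal sketch:

```sql
-- UNION removes duplicate rows from the combined result.
SELECT 1 AS n
UNION
SELECT 1;        -- one row: 1

-- UNION ALL keeps every row from both sides.
SELECT 1 AS n
UNION ALL
SELECT 1;        -- two rows: 1, 1
```

For a summary row that can never equal a monthly row, the two return the same rows, and UNION ALL avoids the extra deduplication step.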
select date_format(submit_time,'%Y%m') AS submit_month
,count(date_format(submit_time,'%Y%m')) AS month_q_cnt
,round(count(date_format(submit_time,'%Y%m'))/avg(DAY(LAST_DAY(submit_time))),3) AS avg_day_q_cnt
from practice_record
where year(submit_time) = 2021
group by date_format(submit_time,'%Y%m')
union all
select '2021汇总' as submit_month,
count(submit_time) as month_q_cnt,
round(count(submit_time)/max(31),3) as avg_day_q_cnt
from practice_record
where year(submit_time) = 2021
order by submit_month
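The 31 is wrapped in max() only so that it can act as the denominator; a bare 31 would work as well, since a constant needs no aggregate. With 30 as the denominator the submission fails. The total cannot be produced by the grouped query itself and simply placed underneath, so it has to be computed in a separate query.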
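The trailing order by in the query above applies to the whole union, not just to its last select. A small sketch with constant rows (the counts are made up) shows the effect, and also why the summary label ends up last:

```sql
-- The unparenthesised ORDER BY sorts the combined result of all three SELECTs.
SELECT '202109' AS submit_month, 1 AS month_q_cnt
UNION ALL
SELECT '202108', 2
UNION ALL
SELECT '2021汇总', 3
ORDER BY submit_month;
-- 202108   | 2
-- 202109   | 1
-- 2021汇总 | 3
```

The month column is compared as a string. All three values share the prefix '2021', and the Chinese character that follows it in the summary label sorts after the digits, so the summary row lands at the bottom without any extra sort key.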
Afterword
Further problems will be published here.