Contents
Problem
Preparing the Data
Analyzing the Data
Summary
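Problem
Report the fraction of players who logged in again on the day after their first login, rounded to two decimal places. In other words, count the players who logged in on at least two consecutive days starting from their first login date, then divide that count by the total number of players.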
Preparing the Data
## Create the database
create database db;
use db;
## Create the table
Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int);
## Insert data into the table
Truncate table Activity;
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5');
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-02', '6');
insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1');
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0');
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5');
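Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1         | 2         | 2016-03-01 | 5            |
| 1         | 2         | 2016-03-02 | 6            |
| 2         | 3         | 2017-06-25 | 1            |
| 3         | 1         | 2016-03-02 | 0            |
| 3         | 4         | 2018-07-03 | 5            |
+-----------+-----------+------------+--------------+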
Analyzing the Data
Only the player with ID 1 logged in again on the day after their first login, so the answer is 1/3 ≈ 0.33.
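To make the arithmetic concrete, here is a minimal Python sketch that applies the definition directly to the sample rows. The names `ROWS` and `next_day_fraction` are illustrative only and do not appear in the original solution.

```python
from datetime import date, timedelta

# Sample rows from the Activity table, reduced to (player_id, event_date).
ROWS = [
    (1, date(2016, 3, 1)),
    (1, date(2016, 3, 2)),
    (2, date(2017, 6, 25)),
    (3, date(2016, 3, 2)),
    (3, date(2018, 7, 3)),
]


def next_day_fraction(rows):
    """Fraction of players who log in again the day after their first login."""
    # Collect the set of login dates for each player.
    logins = {}
    for player_id, event_date in rows:
        logins.setdefault(player_id, set()).add(event_date)
    # A player counts if (first login + 1 day) is among their login dates.
    returned = sum(
        1 for dates in logins.values()
        if min(dates) + timedelta(days=1) in dates
    )
    return round(returned / len(logins), 2)


FRACTION = next_day_fraction(ROWS)
print(FRACTION)  # 0.33
```

Player 1 qualifies (2016-03-01 followed by 2016-03-02), while players 2 and 3 do not, which gives 1 of 3 players.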
Step 1: Find each player's first login date
select player_id, min(event_date) as login
from Activity
group by player_id;
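Step 2: Add the consecutive-login condition
Left-join the first login dates back to Activity, matching only rows dated exactly one day after the first login. Players with no such row get NULL.
select p.player_id, p.login, a.event_date as next_day_login
from
(select player_id, min(event_date) as login
from Activity
group by player_id) p
left join Activity a
on p.player_id=a.player_id and datediff(a.event_date, p.login)=1;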
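Step 3: Compute the fraction of players who logged in again on the day after their first login, rounded to two decimal places
In MySQL, a.event_date is not null evaluates to 1 or 0, so its average over all players is the fraction we want.
select round(avg(a.event_date is not null), 2) fraction
from
(select player_id, min(event_date) as login
from Activity
group by player_id) p
left join Activity a
on p.player_id=a.player_id and datediff(a.event_date, p.login)=1;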
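As a quick sanity check, the same query shape can be run against an in-memory SQLite database from Python. This is a sketch under one stated assumption: SQLite has no DATEDIFF(), so the join condition uses `date(p.login, '+1 day')` as a stand-in for `datediff(a.event_date, p.login)=1`. The rest of the query mirrors the MySQL version.

```python
import sqlite3

# In-memory database holding the sample Activity rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Activity "
    "(player_id INT, device_id INT, event_date TEXT, games_played INT)"
)
conn.executemany(
    "INSERT INTO Activity VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2016-03-01", 5),
        (1, 2, "2016-03-02", 6),
        (2, 3, "2017-06-25", 1),
        (3, 1, "2016-03-02", 0),
        (3, 4, "2018-07-03", 5),
    ],
)

# Assumption: date(p.login, '+1 day') replaces MySQL's DATEDIFF(...) = 1,
# because SQLite does not provide DATEDIFF.
QUERY = """
SELECT ROUND(AVG(a.event_date IS NOT NULL), 2) AS fraction
FROM (SELECT player_id, MIN(event_date) AS login
      FROM Activity
      GROUP BY player_id) AS p
LEFT JOIN Activity AS a
  ON p.player_id = a.player_id
 AND a.event_date = date(p.login, '+1 day')
"""

fraction = conn.execute(QUERY).fetchone()[0]
print(fraction)  # 0.33
```

The average works as a fraction because the left join leaves exactly one row per player, which assumes (player_id, event_date) pairs are unique in Activity.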
Summary
- Use the min() function to find the first (earliest) date.
- Use the DATEDIFF() function to compute the difference in days between two dates.