Skip to content

SQL165 统计活跃间隔对用户分级结果

Static BadgeStatic Badge

题目描述

用户行为日志表tb_user_log

iduidartical_idin_timeout_timesign_cin
110990012021-08-31 10:00:002021-08-31 10:00:090
210990022021-11-04 11:00:552021-11-04 11:00:590
310890012021-09-01 10:00:012021-09-01 10:01:500
410890012021-11-03 10:00:012021-11-03 10:01:500
510490012021-11-02 10:00:282021-11-02 10:00:500
610490032021-09-03 11:00:452021-09-03 11:00:550
710590032021-11-03 11:00:532021-11-03 11:00:590
810290012021-10-30 10:00:002021-10-30 10:00:090
910390012021-10-21 10:00:002021-10-21 10:00:090
1010102021-10-01 10:00:002021-10-01 10:00:421

(uid-用户ID, artical_id-文章ID, in_time-进入时间, out_time-离开时间, sign_in-是否签到)

问题:统计活跃间隔对用户分级后,各活跃等级用户占比,结果保留两位小数,且按占比降序排序。

  • 用户等级标准简化为:忠实用户(近7天活跃过且非新晋用户)、新晋用户(近7天新增)、沉睡用户(近7天未活跃但更早前活跃过)、流失用户(近30天未活跃但更早前活跃过)。
  • 假设今天就是数据中所有日期的最大值。
  • 近7天表示包含当天T的近7天,即闭区间[T-6, T]。

输出示例

示例数据的输出结果如下:

user_graderatio
忠实用户0.43
新晋用户0.29
沉睡用户0.14
流失用户0.14

解释:

今天日期为2021.11.04,根据用户分级标准,用户行为日志表tb_user_log中忠实用户有:109、108、104;新晋用户有105、102;沉睡用户有103;流失用户有101;共7个用户,因此他们的比例分别为0.43、0.29、0.14、0.14。

SQL Schema

sql
DROP TABLE IF EXISTS tb_user_log;
CREATE TABLE tb_user_log
(
    id         INT PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
    uid        INT NOT NULL COMMENT '用户ID',
    artical_id INT NOT NULL COMMENT '视频ID',
    in_time    datetime COMMENT '进入时间',
    out_time   datetime COMMENT '离开时间',
    sign_in    TINYINT DEFAULT 0 COMMENT '是否签到'
) CHARACTER SET utf8
  COLLATE utf8_bin;

INSERT INTO tb_user_log(uid, artical_id, in_time, out_time, sign_in)
VALUES (109, 9001, '2021-08-31 10:00:00', '2021-08-31 10:00:09', 0),
       (109, 9002, '2021-11-04 11:00:55', '2021-11-04 11:00:59', 0),
       (108, 9001, '2021-09-01 10:00:01', '2021-09-01 10:01:50', 0),
       (108, 9001, '2021-11-03 10:00:01', '2021-11-03 10:01:50', 0),
       (104, 9001, '2021-11-02 10:00:28', '2021-11-02 10:00:50', 0),
       (104, 9003, '2021-09-03 11:00:45', '2021-09-03 11:00:55', 0),
       (105, 9003, '2021-11-03 11:00:53', '2021-11-03 11:00:59', 0),
       (102, 9001, '2021-10-30 10:00:00', '2021-10-30 10:00:09', 0),
       (103, 9001, '2021-10-21 10:00:00', '2021-10-21 10:00:09', 0),
       (101, 0, '2021-10-01 10:00:00', '2021-10-01 10:00:42', 1);

答案

sql
SELECT user_grade, ROUND(COUNT(uid) / (SELECT COUNT(DISTINCT uid) FROM tb_user_log), 2) AS `ratio`
FROM (SELECT uid,
             CASE
                 WHEN 最晚活跃距今天的天数 >= 30 THEN '流失用户'
                 WHEN 最晚活跃距今天的天数 >= 7 THEN '沉睡用户'
                 WHEN 最早活跃距今天的天数 < 7 THEN '新晋用户'
                 ELSE '忠实用户' END AS `user_grade`
      FROM (SELECT uid,
                   DATEDIFF((SELECT MAX(in_time) FROM tb_user_log), MIN(in_time))  AS `最早活跃距今天的天数`,
                   DATEDIFF((SELECT MAX(in_time) FROM tb_user_log), MAX(out_time)) AS `最晚活跃距今天的天数`
            FROM tb_user_log
            GROUP BY uid) a) b
GROUP BY user_grade
ORDER BY ratio DESC;