• Taking a set difference after aligning grouped subsets


    【Question】

    I have two tables: 1) users(id, registerdate) and 2) user_answer(userid, answer, updated_date).

    I want the count of zero usage per day, that is, how many users register but do not answer on each day. The results should look like this:

    Date      registedCount  notAnsweredCount
    15-09-02  20             10
    15-09-01  20             10
    15-08-31  12             4
    

    The data looks like this: the users table holds ((1,'15-09-01'),(2,'15-09-01'),(3,'15-09-01')) and the user_answer table holds ((1,0,'15-09-01')). Three users registered on Sep 01, 2015, but only one of them answered a question, so the result should be (Date => 15-09-01, registedCount => 3, notAnsweredCount => 2).

    Someone posted the answer below; the original poster said it looked close but gave no further feedback.

    SELECT date_range.aDay,
           COUNT(DISTINCT users.id) AS registedCount,
           SUM(IF(users.id IS NOT NULL AND user_answer.userid IS NULL, 1, 0)) AS notAnsweredCount
    FROM
    (
        SELECT DATE_ADD('2015-09-01', INTERVAL units.aCnt + tens.aCnt * 10 DAY) AS aDay
        FROM
        (
            SELECT 0 AS aCnt UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
        ) units
        CROSS JOIN
        (
            SELECT 0 AS aCnt UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
        ) tens
    ) date_range
    LEFT OUTER JOIN users
        ON date_range.aDay = users.registerdate
    LEFT OUTER JOIN user_answer
        ON users.id = user_answer.userid
    GROUP BY date_range.aDay
    

    【Answer】

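    There are two difficulties to solve. First, the rows must be grouped by a specified sequence of dates rather than by a field in the table. Second, for each day, a set difference must be computed between the ids registered that day and the userids that answered that day. This is hard to follow when written in SQL alone, so SPL can be used alongside SQL; written step by step, the code is easy to read: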

    Parameter settings: argBegin and argEnd, the start and end dates of the query range.

    SPL script:

        A
    1   $select id,registerdate from users where registerdate>=? and registerdate<=?; argBegin,argEnd
    2   $select userid,updated_date from user_answer where updated_date>=? and updated_date<=?; argBegin,argEnd
    3   =periods(argBegin,argEnd)
    4   =A1.align@a(A3,registerdate).(~.(id))
    5   =A2.align@a(A3,updated_date).(~.(userid))
    6   =A3.new(~:Date,A4(#).len():registedCount,(A4(#)\A5(#)).len():notAnsweredCount)

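    A1, A2: Retrieve the data from the two tables with SQL, restricted to the date range given by the parameters.

    A3: Use the periods function to generate the sequence of dates from argBegin to argEnd.

    A4, A5: Use the align function to align each set to the date sequence in A3, keeping all matching records for each date (the @a option), then take the id (or userid) values of each group.

    A6: Create a new table sequence with one row per date; "\" denotes set difference.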
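    For readers unfamiliar with SPL, the align-then-difference logic of the script can be sketched in Python. This is only an illustrative approximation, not the SPL implementation: the helper names periods, align_all and zero_usage are hypothetical, the in-memory tuples stand in for the rows fetched by the two SQL queries, and the sample rows are the ones given in the question.

```python
from datetime import date, timedelta


def periods(begin, end):
    """Every date from begin to end, inclusive (stand-in for step A3)."""
    return [begin + timedelta(days=i) for i in range((end - begin).days + 1)]


def align_all(rows, days, key, value):
    """Group rows by day, in the order of `days` (stand-in for A4/A5).

    Days without any matching row get an empty list, so the result
    always has exactly one group per day.
    """
    buckets = {d: [] for d in days}
    for row in rows:
        k = key(row)
        if k in buckets:  # rows outside the date range are ignored
            buckets[k].append(value(row))
    return [buckets[d] for d in days]


def zero_usage(users, answers, begin, end):
    """One (day, registedCount, notAnsweredCount) tuple per day (step A6)."""
    days = periods(begin, end)
    registered = align_all(users, days, key=lambda r: r[1], value=lambda r: r[0])
    answered = align_all(answers, days, key=lambda r: r[2], value=lambda r: r[0])
    result = []
    for day, reg, ans in zip(days, registered, answered):
        answered_ids = set(ans)
        # Set difference: registered that day minus answered that day.
        not_answered = [uid for uid in reg if uid not in answered_ids]
        result.append((day, len(reg), len(not_answered)))
    return result


# Sample data from the question: users(id, registerdate) and
# user_answer(userid, answer, updated_date).
users = [(1, date(2015, 9, 1)), (2, date(2015, 9, 1)), (3, date(2015, 9, 1))]
answers = [(1, 0, date(2015, 9, 1))]

report = zero_usage(users, answers, date(2015, 8, 31), date(2015, 9, 2))
for row in report:
    print(row)
```

    With the sample data, the row for 15-09-01 has registedCount 3 and notAnsweredCount 2, matching the result expected in the question; the two days without registrations still appear, with both counts at 0, because the grouping is driven by the date sequence rather than by the table contents.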

  • Original article: https://blog.csdn.net/raqsoft/article/details/127744668