• HiveSQL用户行为漏斗分析


    漏斗分析是一个数据分析模型,它能够科学反映一个业务过程从起点到终点各阶段用户转化情况。由于其能将各阶段环节都展示出来,故哪个阶段存在问题,就能一目了然。

    该需求要求统计一个完整的购物流程各个阶段的人数,具体说明如下:

    统计周期

    指标

    说明

    最近1、7、30日

    首页浏览人数

    最近1、7、30日

    商品详情页浏览人数

    最近1、7、30日

    加购人数

    最近1、7、30日

    下单人数

    最近1、7、30日

    支付人数

    支付成功人数

     1.表结构

     2.数据来源

     

     

     

     

     

     

     实现1:

    1)分别查询各个指标的一天数据

    1. select dt,
    2. sum(`if`(page_id = 'home', 1, 0)) home_count,
    3. sum(`if`(page_id = 'good_detail', 1, 0)) good_detail_count
    4. from dws_traffic_page_visitor_page_view_1d
    5. where dt = '2020-06-14'
    6. and page_id in ('home', 'good_detail')
    7. group by dt
    8. select dt,
    9. count(distinct (user_id)) cart_count
    10. from dws_trade_user_cart_add_1d
    11. where dt = '2020-06-14'
    12. group by dt
    13. select dt,
    14. count(distinct (user_id)) order_count
    15. from dws_trade_user_sku_order_1d
    16. where dt = '2020-06-14'
    17. group by dt
    18. select dt,
    19. count(distinct (user_id)) payment_count
    20. from dws_trade_user_payment_1d
    21. where dt = '2020-06-14'
    22. group by dt

    2)查询7、30天的首页浏览、详情页浏览

    1. --7d
    2. select dt,
    3. sum(`if`(page_id = 'home' and view_count_7d > 0, 1, 0)) home_count,
    4. sum(`if`(page_id = 'good_detail' and view_count_7d > 0, 1, 0)) good_detail_count
    5. from dws_traffic_page_visitor_page_view_nd
    6. where dt = '2020-06-14'
    7. and page_id in ('home', 'good_detail')
    8. group by dt
    9. --30d
    10. select dt,
    11. sum(`if`(page_id = 'home' and view_count_30d > 0, 1, 0)) home_count,
    12. sum(`if`(page_id = 'good_detail' and view_count_30d > 0, 1, 0)) good_detail_count
    13. from dws_traffic_page_visitor_page_view_nd
    14. where dt = '2020-06-14'
    15. and page_id in ('home', 'good_detail')
    16. group by dt

    3)7天和30天的数据来源,行相同,列不同、将7天和30天的查询合并

    (1)使用case when 对列进行分支

    1. select recent_days,
    2. sum(`if`(page_id = 'home' and view_count > 0, 1, 0)) home_count,
    3. sum(`if`(page_id = 'good_detail' and view_count > 0, 1, 0)) good_detail_count
    4. from (select page_id,
    5. recent_days,
    6. case recent_days
    7. when 7 then view_count_7d
    8. when 30 then view_count_30d
    9. end view_count
    10. from dws_traffic_page_visitor_page_view_nd lateral view explode(array(7, 30)) tmp as recent_days
    11. where dt = '2020-06-14'
    12. and page_id in ('home', 'good_detail')) t1
    13. group by recent_days

     4)将1与7、30查询进行union

    1. select 1 recent_days,
    2. sum(`if`(page_id = 'home' and view_count_30d > 0, 1, 0)) home_count,
    3. sum(`if`(page_id = 'good_detail' and view_count_30d > 0, 1, 0)) good_detail_count
    4. from dws_traffic_page_visitor_page_view_nd
    5. where dt = '2020-06-14'
    6. and page_id in ('home', 'good_detail')
    7. union
    8. select recent_days,
    9. sum(`if`(page_id = 'home' and view_count > 0, 1, 0)) home_count,
    10. sum(`if`(page_id = 'good_detail' and view_count > 0, 1, 0)) good_detail_count
    11. from (select page_id,
    12. recent_days,
    13. case recent_days
    14. when 7 then view_count_7d
    15. when 30 then view_count_30d
    16. end view_count
    17. from dws_traffic_page_visitor_page_view_nd lateral view explode(array(7, 30)) tmp as recent_days
    18. where dt = '2020-06-14'
    19. and page_id in ('home', 'good_detail')) t1
    20. group by recent_days

     

     5)加购人数7,30日查询

    (1)分别查询7、30

    1. select
    2. sum(`if`(cart_add_count_7d>0,1,0)) cart_count
    3. from dws_trade_user_cart_add_nd
    4. where dt = '2020-06-14'
    5. select
    6. sum(`if`(cart_add_count_30d>0,1,0)) cart_count
    7. from dws_trade_user_cart_add_nd
    8. where dt = '2020-06-14'

    (2)7日、30日查询合并

    1. select
    2. sum(`if`(`if`(recent_days=7,cart_add_count_7d,cart_add_count_30d) > 0, 1, 0)) cart_count
    3. from dws_trade_user_cart_add_nd lateral view explode(array(7, 30)) tmp as recent_days
    4. where dt = '2020-06-14'
    5. group by recent_days

    (3)后续字段相同合并

    1. select
    2. sum(`if`(`if`(recent_days=7,cart_add_count_7d,cart_add_count_30d) > 0, 1, 0)) cart_count
    3. from dws_trade_user_cart_add_nd lateral view explode(array(7, 30)) tmp as recent_days
    4. where dt = '2020-06-14'
    5. group by recent_days
    1. select
    2. sum(`if`(`if`(recent_days=7,coupon_reduce_amount_7d,coupon_reduce_amount_30d)>0,1,0)) order_count
    3. from dws_trade_user_sku_order_nd lateral view explode(array(7, 30)) tmp as recent_days
    4. where dt = '2020-06-14'
    5. group by recent_days
    1. select
    2. sum(`if`(`if`(recent_days=7,payment_count_7d,payment_amount_30d)>0,1,0)) order_count
    3. from dws_trade_user_payment_nd lateral view explode(array(7, 30)) tmp as recent_days
    4. where dt = '2020-06-14'
    5. group by recent_days

    (4)将字段补齐 1日与7、30日进行union

    1. select 1 recent_days,
    2. sum(`if`(page_id = 'home' and view_count_30d > 0, 1, 0)) home_count,
    3. sum(`if`(page_id = 'good_detail' and view_count_30d > 0, 1, 0)) good_detail_count
    4. from dws_traffic_page_visitor_page_view_nd
    5. where dt = '2020-06-14'
    6. and page_id in ('home', 'good_detail')
    7. union
    8. select recent_days,
    9. sum(`if`(page_id = 'home' and view_count > 0, 1, 0)) home_count,
    10. sum(`if`(page_id = 'good_detail' and view_count > 0, 1, 0)) good_detail_count
    11. from (select page_id,
    12. recent_days,
    13. case recent_days
    14. when 7 then view_count_7d
    15. when 30 then view_count_30d
    16. end view_count
    17. from dws_traffic_page_visitor_page_view_nd lateral view explode(array(7, 30)) tmp as recent_days
    18. where dt = '2020-06-14'
    19. and page_id in ('home', 'good_detail')) t1
    20. group by recent_days
    21. select 1,
    22. count(distinct (user_id)) cart_count
    23. from dws_trade_user_cart_add_1d
    24. where dt = '2020-06-14'
    25. union all
    26. select
    27. recent_days,
    28. sum(`if`(`if`(recent_days=7,cart_add_count_7d,cart_add_count_30d) > 0, 1, 0)) cart_count
    29. from dws_trade_user_cart_add_nd lateral view explode(array(7, 30)) tmp as recent_days
    30. where dt = '2020-06-14'
    31. group by recent_days
    32. select 1,
    33. count(distinct (user_id)) order_count
    34. from dws_trade_user_sku_order_1d
    35. where dt = '2020-06-14'
    36. union all
    37. select recent_days,
    38. sum(`if`(`if`(recent_days=7,coupon_reduce_amount_7d,coupon_reduce_amount_30d)>0,1,0)) order_count
    39. from dws_trade_user_sku_order_nd lateral view explode(array(7, 30)) tmp as recent_days
    40. where dt = '2020-06-14'
    41. group by recent_days
    42. select 1,
    43. count(distinct (user_id)) payment_count
    44. from dws_trade_user_payment_1d
    45. where dt = '2020-06-14'
    46. union all
    47. select recent_days,
    48. sum(`if`(`if`(recent_days=7,payment_count_7d,payment_amount_30d)>0,1,0)) order_count
    49. from dws_trade_user_payment_nd lateral view explode(array(7, 30)) tmp as recent_days
    50. where dt = '2020-06-14'
    51. group by recent_days

    (5)进行join

    1. select
    2. '2020-06-14' dt,
    3. t1.recent_days,
    4. home_count,
    5. good_detail_count,
    6. cart_count,
    7. order_count,
    8. payment_count
    9. from (select t1.recent_days,
    10. home_count,
    11. good_detail_count,
    12. cart_count,
    13. order_count
    14. from (select t1.recent_days,
    15. home_count,
    16. good_detail_count,
    17. cart_count
    18. from (select 1 recent_days,
    19. sum(`if`(page_id = 'home' and view_count_30d > 0, 1, 0)) home_count,
    20. sum(`if`(page_id = 'good_detail' and view_count_30d > 0, 1, 0)) good_detail_count
    21. from dws_traffic_page_visitor_page_view_nd
    22. where dt = '2020-06-14'
    23. and page_id in ('home', 'good_detail')
    24. union
    25. select recent_days,
    26. sum(`if`(page_id = 'home' and view_count > 0, 1, 0)) home_count,
    27. sum(`if`(page_id = 'good_detail' and view_count > 0, 1, 0)) good_detail_count
    28. from (select page_id,
    29. recent_days,
    30. case recent_days
    31. when 7 then view_count_7d
    32. when 30 then view_count_30d
    33. end view_count
    34. from dws_traffic_page_visitor_page_view_nd lateral view explode(array(7, 30)) tmp as recent_days
    35. where dt = '2020-06-14'
    36. and page_id in ('home', 'good_detail')) t1
    37. group by recent_days) t1
    38. left join
    39. (
    40. select 1 recent_days,
    41. count(distinct (user_id)) cart_count
    42. from dws_trade_user_cart_add_1d
    43. where dt = '2020-06-14'
    44. union all
    45. select recent_days,
    46. sum(`if`(`if`(recent_days = 7, cart_add_count_7d, cart_add_count_30d) > 0, 1, 0)) cart_count
    47. from dws_trade_user_cart_add_nd lateral view explode(array(7, 30)) tmp as recent_days
    48. where dt = '2020-06-14'
    49. group by recent_days) t2
    50. on t1.recent_days = t2.recent_days) t1
    51. left join
    52. (select 1 recent_days,
    53. count(distinct (user_id)) order_count
    54. from dws_trade_user_sku_order_1d
    55. where dt = '2020-06-14'
    56. union all
    57. select recent_days,
    58. sum(`if`(`if`(recent_days = 7, coupon_reduce_amount_7d, coupon_reduce_amount_30d) > 0, 1,
    59. 0)) order_count
    60. from dws_trade_user_sku_order_nd lateral view explode(array(7, 30)) tmp as recent_days
    61. where dt = '2020-06-14'
    62. group by recent_days
    63. ) t2
    64. on t1.recent_days = t2.recent_days) t1
    65. left join
    66. (select 1 recent_days,
    67. count(distinct (user_id)) payment_count
    68. from dws_trade_user_payment_1d
    69. where dt = '2020-06-14'
    70. union all
    71. select recent_days,
    72. sum(`if`(`if`(recent_days = 7, payment_count_7d, payment_amount_30d) > 0, 1, 0)) order_count
    73. from dws_trade_user_payment_nd lateral view explode(array(7, 30)) tmp as recent_days
    74. where dt = '2020-06-14'
    75. group by recent_days
    76. ) t2
    77. on t1.recent_days = t2.recent_days

  • 相关阅读:
    java计算机毕业设计西安市城市绿地管理系统源程序+mysql+系统+lw文档+远程调试
    Python自动化小技巧03——自动统计文件里面的文字和词汇频数
    程序员基础能力系列(3)——chrome快捷键总结
    如何在Odoo中添加水印?
    【密码学基础】Oblivious Transfer(不经意传输)
    【唠唠嵌入式】__嵌入式开发需要从0开始造轮子吗?
    [原创]移动相机九点标定工具原理及实现(包涵部分源码)
    蓝桥杯练习题(3的倍数)
    【SpringBoot+Vue】资产管理系统(源码+远程部署+代码讲解+答辩教学)
    降低模拟量信号干扰的10个有效方法
  • 原文地址:https://blog.csdn.net/asd623444055/article/details/125359067