Building the DWS Layer of a Hive New-Retail Data Warehouse Project (the Grouping Sets Model)

news2025/2/28 15:37:54

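Hive New-Retail Data Warehouse Project

Note: If you find this blog useful, please like and bookmark it. I post every week on artificial intelligence and big data, mostly original content: Python, Java, Scala, and SQL code; CV, NLP, and recommender systems; Spark, Flink, Kafka, HBase, Hive, Flume, and more. The posts are hands-on material plus walkthroughs of papers from top conferences, so we can improve together.
Today I continue sharing the Hive new-retail data warehouse project.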
#博学谷IT学习技术支持


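Contents

  • Hive New-Retail Data Warehouse Project
  • Preface
  • I. Introduction to the Grouping Sets Model
  • II. Functions and Responsibilities of the DWS Layer
  • III. Sales-Topic Statistics Wide Table
    • 1. Building the target table
    • 2. Implementation with Presto grouping sets syntax
  • Summary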


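Preface

This is a real-world Hive data warehouse build. It is fairly complex, so the goal here is to share the overall design thinking behind the warehouse.
The project is divided into six layers:
1. ODS layer
2. DWD layer
3. DWB layer
4. DWS layer
5. DM layer
6. RPT layer
Each layer has its own key concepts. I will walk through how to build the complete project, starting from the MySQL data source.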


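I. Introduction to the Grouping Sets Model

Grouping sets replace several GROUP BY queries stitched together with UNION ALL by a single query. I like this model because it is flexible and because it is typically faster: the engine can compute all the groupings from one read of the data instead of scanning the table once per UNION ALL branch.
Below is a demo.
Requirement:
Count the visiting userid values grouped by (month), by (day), and by (month, day), and return the three result sets together (to be inserted into one target wide table).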

create table test.t_user(
    month string, 
    day string, 
    userid string
) 
row format delimited fields terminated by ',';

-- Sample data
2015-03,2015-03-10,user1
2015-03,2015-03-10,user5
2015-03,2015-03-12,user7
2015-04,2015-04-12,user3
2015-04,2015-04-13,user2
2015-04,2015-04-13,user4
2015-04,2015-04-16,user4
2015-03,2015-03-10,user2
2015-03,2015-03-10,user3
2015-04,2015-04-12,user5
2015-04,2015-04-13,user6
2015-04,2015-04-15,user3
2015-04,2015-04-15,user2
2015-04,2015-04-16,user1
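  • The UNION ALL version: verbose, and typically slower because each branch scans the table again
-- Just three separate aggregations, combined afterwards with union all.
-- Note: union all requires every branch to return the same number of columns with compatible types, so null is used to pad the dimension that a branch does not group by.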
select month,
       null,
       count(userid)
from test.t_user
group by month

union all

select null,
       day,
       count(userid)
from test.t_user
group by day

union all

select month,
       day,
       count(userid)
from test.t_user
group by month,day;


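  • The grouping sets version
  • Aggregates by several dimension combinations at once; the result is equivalent to the UNION ALL of the GROUP BY result sets for each combination.
-- Hive syntax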
select 
    month,day,count(userid) 
from test.t_user 
    group by month,day 
grouping sets (month,day,(month,day));

-- Presto syntax
select
    month,
    day,
    count(userid) as cnt
from test.t_user
group by
    grouping sets (month, day, (month, day));
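To check the equivalence without a cluster, here is a minimal Python sketch (not Hive or Presto code) that runs the three groupings over the sample rows above and concatenates them, which is the UNION ALL view of grouping sets. The helper names `group_count` and `grouping_sets` are made up for illustration, and `None` stands in for the SQL NULL that pads a dimension not being grouped by.

```python
from collections import Counter

# Sample rows copied from the article: (month, day, userid)
ROWS = [
    ("2015-03", "2015-03-10", "user1"),
    ("2015-03", "2015-03-10", "user5"),
    ("2015-03", "2015-03-12", "user7"),
    ("2015-04", "2015-04-12", "user3"),
    ("2015-04", "2015-04-13", "user2"),
    ("2015-04", "2015-04-13", "user4"),
    ("2015-04", "2015-04-16", "user4"),
    ("2015-03", "2015-03-10", "user2"),
    ("2015-03", "2015-03-10", "user3"),
    ("2015-04", "2015-04-12", "user5"),
    ("2015-04", "2015-04-13", "user6"),
    ("2015-04", "2015-04-15", "user3"),
    ("2015-04", "2015-04-15", "user2"),
    ("2015-04", "2015-04-16", "user1"),
]


def group_count(rows, by_month, by_day):
    """One GROUP BY: count rows per key; None plays the role of the NULL padding."""
    return Counter(
        (month if by_month else None, day if by_day else None)
        for month, day, _userid in rows
    )


def grouping_sets(rows, sets):
    """Emulate GROUPING SETS by concatenating one GROUP BY result per set."""
    result = []
    for by_month, by_day in sets:
        counts = group_count(rows, by_month, by_day)
        for (month, day), cnt in sorted(counts.items()):
            result.append((month, day, cnt))
    return result


# The three sets from the demo: (month), (day), (month, day)
RESULT = grouping_sets(ROWS, [(True, False), (False, True), (True, True)])
for row in RESULT:
    print(row)
```

On this sample the output has 14 rows: 2 per-month rows, 6 per-day rows, and 6 per-(month, day) rows. Every userid in the sample is non-null, so counting rows matches `count(userid)`.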
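  • The grouping function
  • Purpose: grouping tells you which columns the current output row was grouped by.
  • For a given grouping set, the bit for a column is 0 if that column is part of the grouping set and 1 otherwise. In Presto, when grouping is called with several columns, the bits are combined into one number, with the rightmost argument as the least significant bit.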
select month,
       day,
       count(userid),
       grouping(month)      as m,
       grouping(day)        as d,
       grouping(month, day) as m_d
from test.t_user
group by
   grouping sets (month, day, (month, day));
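The values of m, d, and m_d returned by the query above can be worked out by hand. The following Python sketch assumes Presto's documented convention (0 when the column is in the grouping set, 1 otherwise, rightmost argument as the least significant bit); the function `grouping` here is an illustrative re-implementation, not an engine API, and other engines or older Hive versions may number bits differently.

```python
def grouping(grouping_set, *columns):
    """Return the grouping() value for one grouping set.

    Each argument contributes one bit: 0 if the column is part of the
    grouping set, 1 otherwise. The rightmost argument is the lowest bit.
    """
    value = 0
    for column in columns:
        bit = 0 if column in grouping_set else 1
        value = (value << 1) | bit
    return value


# The three grouping sets from the demo query
for gs in [("month",), ("day",), ("month", "day")]:
    m = grouping(gs, "month")
    d = grouping(gs, "day")
    m_d = grouping(gs, "month", "day")
    print(gs, "m =", m, "d =", d, "m_d =", m_d)
```

Rows grouped by (month) get m = 0, d = 1, m_d = 1 (binary 01); rows grouped by (day) get m = 1, d = 0, m_d = 2 (binary 10); rows grouped by (month, day) get 0 for all three.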

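II. Functions and Responsibilities of the DWS Layer

DWS layer: subject-oriented statistical analysis. This layer generally holds statistics at the finest granularity.

  • Dimension combinations:

    date
    date + city
    date + city + trade area
    date + city + trade area + store
    date + brand
    date + major category
    date + major category + middle category
    date + major category + middle category + minor category

  • Metrics:
    sales revenue, platform revenue, delivery sales amount, mini-program sales amount, Android app sales amount, iOS app sales amount, PC web store sales amount, order count, reviewed order count, negative-review order count, delivery order count, refund order count, mini-program order count, Android app order count, iOS app order count, PC web store order count.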

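III. Sales-Topic Statistics Wide Table

The end requirement is that the group_type column identifies which dimension combination each row's metrics were aggregated over.

1. Building the target table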

drop database if exists yp_dws;
create database if not exists yp_dws;

-- Sales-topic daily statistics wide table
DROP TABLE IF EXISTS yp_dws.dws_sale_daycount;
CREATE TABLE yp_dws.dws_sale_daycount(
   city_id string COMMENT '城市id',
   city_name string COMMENT '城市name',
   trade_area_id string COMMENT '商圈id',
   trade_area_name string COMMENT '商圈名称',
   store_id string COMMENT '店铺的id',
   store_name string COMMENT '店铺名称',
   brand_id string COMMENT '品牌id',
   brand_name string COMMENT '品牌名称',
   max_class_id string COMMENT '商品大类id',
   max_class_name string COMMENT '大类名称',
   mid_class_id string COMMENT '中类id',
   mid_class_name string COMMENT '中类名称',
   min_class_id string COMMENT '小类id',
   min_class_name string COMMENT '小类名称',

   -- Convention column: marks which dimension combination each row was aggregated over
   group_type string COMMENT '分组类型:store,trade_area,city,brand,min_class,mid_class,max_class,all',

   --   ======= daily statistics =======
   --   sales revenue
   sale_amt DECIMAL(38,2) COMMENT '销售收入',
   --   platform revenue
   plat_amt DECIMAL(38,2) COMMENT '平台收入',
   -- delivery sales amount
   deliver_sale_amt DECIMAL(38,2) COMMENT '配送成交额',
   -- mini-program sales amount
   mini_app_sale_amt DECIMAL(38,2) COMMENT '小程序成交额',
   -- Android app sales amount
   android_sale_amt DECIMAL(38,2) COMMENT '安卓APP成交额',
   -- iOS app sales amount
   ios_sale_amt DECIMAL(38,2) COMMENT '苹果APP成交额',
   -- PC web store sales amount
   pcweb_sale_amt DECIMAL(38,2) COMMENT 'PC商城成交额',
   -- order count
   order_cnt BIGINT COMMENT '成交单量',
   -- reviewed order count
   eva_order_cnt BIGINT COMMENT '参评单量comment=>cmt',
   -- negative-review order count
   bad_eva_order_cnt BIGINT COMMENT '差评单量negtive-comment=>ncmt',
   -- delivery order count
   deliver_order_cnt BIGINT COMMENT '配送单量',
   -- refund order count
   refund_order_cnt BIGINT COMMENT '退款单量',
   -- mini-program order count
   miniapp_order_cnt BIGINT COMMENT '小程序成交单量',
   -- Android app order count
   android_order_cnt BIGINT COMMENT '安卓APP订单量',
   -- iOS app order count
   ios_order_cnt BIGINT COMMENT '苹果APP订单量',
   -- PC web store order count
   pcweb_order_cnt BIGINT COMMENT 'PC商城成交单量'
)
COMMENT '销售主题日统计宽表'
PARTITIONED BY(dt STRING)
ROW format delimited fields terminated BY '\t'
stored AS orc tblproperties ('orc.compress' = 'SNAPPY');

2. Implementation with Presto grouping sets syntax

insert into yp_dws.dws_sale_daycount
with t0 as (
   select
     -- column pruning: select only the columns that are needed
     -- dimension columns
     od.dt,
     city_id,
     city_name,
     trade_area_id,
     trade_area_name,
     store_name,
     brand_id,
     brand_name,
     max_class_name,
     max_class_id,
     mid_class_name,
     mid_class_id,
     min_class_name,
     min_class_id,

     -- metric columns
    order_id,
    order_amount,
    total_price,
    plat_fee,
    delivery_fee,
    order_from,
    evaluation_id,
    geval_scores,
    delievery_id,
    refund_id,
    od.store_id,
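    row_number() over (partition by order_id,goods_id ) as rk1, -- rk1 = 1 keeps one row per (order_id, goods_id), filtering out duplicated dirty rows
    row_number() over (partition by order_id ) as rk2           -- rk2 = 1 marks one row per order, so each order is counted once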
   from yp_dwb.dwb_order_detail od
     left join  yp_dwb.dwb_shop_detail  sd on od.store_id = sd.id
     left join  yp_dwb.dwb_goods_detail gd on od.goods_id = gd.id
 )
select
    city_id,
    city_name,
    trade_area_id,
    trade_area_name,
    store_id,
    store_name,
    brand_id,
    brand_name,
    max_class_id,
    max_class_name,
    mid_class_id,
    mid_class_name,
    min_class_id,
    min_class_name,
    case when grouping(store_id) = 0      -- if
           then 'store'  -- date + city + trade area + store
         when grouping(trade_area_id) = 0  -- else if
           then 'trade_area'  -- date + city + trade area
         when grouping(city_id) = 0   -- else if
            then 'city'      -- date + city
         when grouping(brand_id) = 0  -- else if
           then 'brand'  -- date + brand
         when grouping(min_class_id) = 0  -- else if
           then 'min_class'  -- date + major + middle + minor category
         when grouping(mid_class_id) = 0   -- else if
           then 'mid_class'  -- date + major + middle category
         when grouping(max_class_id) = 0
           then 'max_class'  -- date + major category
         else
           'all'  -- date only
    end as group_type,
      -- total sales revenue
    case when grouping(store_id) = 0
            then sum(if(store_id is not null,total_price,0))
         when grouping(trade_area_id) = 0
            then sum(if(trade_area_id is not null,total_price,0))
         when  grouping(city_id) = 0
            then sum(if(city_id is not null,total_price,0))
         when grouping(brand_id) = 0
             then sum(if(brand_id is not null,total_price,0))
         when grouping(min_class_id) = 0
            then sum(if(min_class_id is not null,total_price,0))
        when grouping(mid_class_id) = 0
             then sum(if(mid_class_id is not null,total_price,0))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then sum(if(max_class_id is not null,total_price,0))
        else
            sum(if(dt is not null,total_price,0))  -- date only
    end as sale_amt,
    -- platform revenue
       case when grouping(store_id) = 0
            then sum(if(store_id is not null,plat_fee,0))
         when grouping(trade_area_id) = 0
            then sum(if(trade_area_id is not null,plat_fee,0))
         when  grouping(city_id) = 0
            then sum(if(city_id is not null,plat_fee,0))
         when grouping(brand_id) = 0
             then sum(if(brand_id is not null,plat_fee,0))
         when grouping(min_class_id) = 0
            then sum(if(min_class_id is not null,plat_fee,0))
        when grouping(mid_class_id) = 0
             then sum(if(mid_class_id is not null,plat_fee,0))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then sum(if(max_class_id is not null,plat_fee,0))
        else
            sum(if(dt is not null,plat_fee,0))  -- date only
    end as plat_amt,
     -- delivery sales amount
       case when grouping(store_id) = 0
            then sum(if(store_id is not null and delievery_id is not null,total_price,0))
         when grouping(trade_area_id) = 0
            then sum(if(trade_area_id is not null  and delievery_id is not null,total_price,0))
         when  grouping(city_id) = 0
            then sum(if(city_id is not null  and delievery_id is not null,total_price,0))
         when grouping(brand_id) = 0
             then sum(if(brand_id is not null  and delievery_id is not null,total_price,0))
         when grouping(min_class_id) = 0
            then sum(if(min_class_id is not null  and delievery_id is not null,total_price,0))
        when grouping(mid_class_id) = 0
             then sum(if(mid_class_id is not null and delievery_id is not null,total_price,0))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then sum(if(max_class_id is not null and delievery_id is not null,total_price,0))
        else
            sum(if(dt is not null and delievery_id is not null,total_price,0))  -- date only
    end as deliver_sale_amt,
     -- mini-program sales amount
       case when grouping(store_id) = 0
            then sum(if(store_id is not null and order_from = 'miniapp',total_price,0))
         when grouping(trade_area_id) = 0
            then sum(if(trade_area_id is not null and order_from = 'miniapp',total_price,0))
         when  grouping(city_id) = 0
            then sum(if(city_id is not null  and order_from = 'miniapp',total_price,0))
         when grouping(brand_id) = 0
             then sum(if(brand_id is not null  and order_from = 'miniapp',total_price,0))
         when grouping(min_class_id) = 0
            then sum(if(min_class_id is not null  and order_from = 'miniapp',total_price,0))
        when grouping(mid_class_id) = 0
             then sum(if(mid_class_id is not null and order_from = 'miniapp',total_price,0))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then sum(if(max_class_id is not null and order_from = 'miniapp',total_price,0))
        else
            sum(if(dt is not null and order_from = 'miniapp',total_price,0))  -- date only
    end as mini_app_sale_amt,
     -- Android app sales amount
       case when grouping(store_id) = 0
            then sum(if(store_id is not null and order_from = 'android',total_price,0))
         when grouping(trade_area_id) = 0
            then sum(if(trade_area_id is not null and order_from = 'android',total_price,0))
         when  grouping(city_id) = 0
            then sum(if(city_id is not null  and order_from = 'android',total_price,0))
         when grouping(brand_id) = 0
             then sum(if(brand_id is not null  and order_from = 'android',total_price,0))
         when grouping(min_class_id) = 0
            then sum(if(min_class_id is not null  and order_from = 'android',total_price,0))
        when grouping(mid_class_id) = 0
             then sum(if(mid_class_id is not null and order_from = 'android',total_price,0))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then sum(if(max_class_id is not null and order_from = 'android',total_price,0))
        else
            sum(if(dt is not null and order_from = 'android',total_price,0))  -- date only
    end as android_sale_amt,
      -- iOS app sales amount
    case when grouping(store_id) = 0
            then sum(if(store_id is not null and order_from = 'ios',total_price,0))
         when grouping(trade_area_id) = 0
            then sum(if(trade_area_id is not null and order_from = 'ios',total_price,0))
         when  grouping(city_id) = 0
            then sum(if(city_id is not null  and order_from = 'ios',total_price,0))
         when grouping(brand_id) = 0
             then sum(if(brand_id is not null  and order_from = 'ios',total_price,0))
         when grouping(min_class_id) = 0
            then sum(if(min_class_id is not null  and order_from = 'ios',total_price,0))
        when grouping(mid_class_id) = 0
             then sum(if(mid_class_id is not null and order_from = 'ios',total_price,0))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then sum(if(max_class_id is not null and order_from = 'ios',total_price,0))
        else
            sum(if(dt is not null and order_from = 'ios',total_price,0))  -- date only
    end as ios_sale_amt,
       -- PC web store sales amount
    case when grouping(store_id) = 0
            then sum(if(store_id is not null and order_from = 'pcweb',total_price,0))
         when grouping(trade_area_id) = 0
            then sum(if(trade_area_id is not null and order_from = 'pcweb',total_price,0))
         when  grouping(city_id) = 0
            then sum(if(city_id is not null  and order_from = 'pcweb',total_price,0))
         when grouping(brand_id) = 0
             then sum(if(brand_id is not null  and order_from = 'pcweb',total_price,0))
         when grouping(min_class_id) = 0
            then sum(if(min_class_id is not null  and order_from = 'pcweb',total_price,0))
        when grouping(mid_class_id) = 0
             then sum(if(mid_class_id is not null and order_from = 'pcweb',total_price,0))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then sum(if(max_class_id is not null and order_from = 'pcweb',total_price,0))
        else
            sum(if(dt is not null and order_from = 'pcweb',total_price,0))  -- date only
    end as pcweb_sale_amt,
    -- order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2 = 1,order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2 = 1,order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1,order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1,order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1,order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1,order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1,order_id,null))
        else
            count(if(dt is not null and rk2=1,order_id,null))  -- date only
    end as order_cnt,
     -- reviewed order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1 and evaluation_id is not null,order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2=1 and evaluation_id is not null,order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1 and evaluation_id is not null,order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1 and evaluation_id is not null,order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1 and evaluation_id is not null,order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and evaluation_id is not null,order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and evaluation_id is not null,order_id,null))
        else
            count(if(dt is not null and rk2=1 and evaluation_id is not null,order_id,null))  -- date only
    end as eva_order_cnt,

     -- negative-review order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))
        else
            count(if(dt is not null and rk2=1 and evaluation_id is not null and geval_scores <= 6,order_id,null))  -- date only
    end as bad_eva_order_cnt,
    -- delivery order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1 and delievery_id is not null,order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2=1 and delievery_id is not null,order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1 and delievery_id is not null,order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1 and delievery_id is not null,order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1 and delievery_id is not null,order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and delievery_id is not null,order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and delievery_id is not null,order_id,null))
        else
            count(if(dt is not null and rk2=1 and delievery_id is not null,order_id,null))  -- date only
    end as deliver_order_cnt,
    -- refund order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1 and refund_id is not null,order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2=1 and refund_id is not null,order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1 and refund_id is not null,order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1 and refund_id is not null,order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1 and refund_id is not null,order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and refund_id is not null,order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and refund_id is not null,order_id,null))
        else
            count(if(dt is not null and rk2=1 and refund_id is not null,order_id,null))  -- date only
    end as refund_order_cnt,
    -- mini-program order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1 and order_from = 'miniapp',order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2=1 and order_from = 'miniapp',order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1  and order_from = 'miniapp',order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1  and order_from = 'miniapp',order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1  and order_from = 'miniapp',order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and order_from = 'miniapp',order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and order_from = 'miniapp',order_id,null))
        else
            count(if(dt is not null and rk2=1 and order_from = 'miniapp',order_id,null))  -- date only
    end as miniapp_order_cnt,
       -- Android app order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1 and order_from = 'android',order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not  null and rk2=1 and order_from = 'android',order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1 and order_from = 'android',order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1 and order_from = 'android',order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1  and order_from = 'android',order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and order_from = 'android',order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and order_from = 'android',order_id,null))
        else
            count(if(dt is not null and rk2=1 and order_from = 'android',order_id,null))  -- date only
    end as android_order_cnt,
    -- iOS app order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1  and order_from = 'ios',order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2=1  and order_from = 'ios',order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1   and order_from = 'ios',order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1   and order_from = 'ios',order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1   and order_from = 'ios',order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and order_from = 'ios',order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and order_from = 'ios',order_id,null))
        else
            count(if(dt is not null and rk2=1 and order_from = 'ios',order_id,null))  -- date only
    end as ios_order_cnt,
      -- PC web store order count
    case when grouping(store_id) = 0
            then count(if(store_id is not null and rk2=1  and order_from = 'pcweb',order_id,null))
         when grouping(trade_area_id) = 0
            then count(if(trade_area_id is not null and rk2=1  and order_from = 'pcweb',order_id,null))
         when  grouping(city_id) = 0
            then count(if(city_id is not null and rk2=1   and order_from = 'pcweb',order_id,null))
         when grouping(brand_id) = 0
             then count(if(brand_id is not null and rk2=1   and order_from = 'pcweb',order_id,null))
         when grouping(min_class_id) = 0
            then count(if(min_class_id is not null and rk2=1   and order_from = 'pcweb',order_id,null))
        when grouping(mid_class_id) = 0
             then count(if(mid_class_id is not null and rk2=1 and order_from = 'pcweb',order_id,null))   -- the aggregate can be customized per dimension
        when grouping(max_class_id) = 0
             then count(if(max_class_id is not null and rk2=1 and order_from = 'pcweb',order_id,null))
        else
            count(if(dt is not null and rk2=1 and order_from = 'pcweb',order_id,null))  -- date only
    end as pcweb_order_cnt,
    dt
from t0
where rk1 = 1
group by
grouping sets (
  dt,
 (dt,city_id,city_name),
 (dt,city_id,city_name,trade_area_id,trade_area_name),
 (dt,city_id,city_name,trade_area_id,trade_area_name,store_id,store_name),
 (dt,brand_id,brand_name),
 (dt,max_class_id,max_class_name),
 (dt,max_class_id,max_class_name,mid_class_id,mid_class_name),
 (dt,max_class_id,max_class_name,mid_class_id,mid_class_name,min_class_id,min_class_name)
);
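
The order of the branches in the group_type CASE expression matters, because the grouping sets are nested: a store-level row also groups by city and trade area. The Python sketch below is an illustration only. It lists the eight grouping sets by their id columns (the matching name columns are left out for brevity) and applies the same most-specific-first checks; `group_type` and `CASE_ORDER` are hypothetical names, and `column in grouping_set` plays the role of `grouping(column) = 0`.

```python
# The eight grouping sets from the query, reduced to their id columns
GROUPING_SETS = [
    ("dt",),
    ("dt", "city_id"),
    ("dt", "city_id", "trade_area_id"),
    ("dt", "city_id", "trade_area_id", "store_id"),
    ("dt", "brand_id"),
    ("dt", "max_class_id"),
    ("dt", "max_class_id", "mid_class_id"),
    ("dt", "max_class_id", "mid_class_id", "min_class_id"),
]

# Same order as the CASE WHEN branches: most specific column first
CASE_ORDER = [
    ("store_id", "store"),
    ("trade_area_id", "trade_area"),
    ("city_id", "city"),
    ("brand_id", "brand"),
    ("min_class_id", "min_class"),
    ("mid_class_id", "mid_class"),
    ("max_class_id", "max_class"),
]


def group_type(grouping_set):
    """Return the label of the first CASE branch whose column is grouped."""
    for column, label in CASE_ORDER:
        if column in grouping_set:  # corresponds to grouping(column) = 0
            return label
    return "all"  # only dt is grouped


LABELS = [group_type(gs) for gs in GROUPING_SETS]
for gs, label in zip(GROUPING_SETS, LABELS):
    print(label, gs)
```

Each grouping set gets a distinct label. If city_id were checked before trade_area_id and store_id, the trade-area and store rows would all be labeled 'city'.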

This query relies mainly on the grouping function and the grouping sets syntax; if either is unfamiliar, look it up before reading on.
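
The metric columns all follow one conditional-aggregation pattern: `sum(if(cond, amount, 0))` for amounts and `count(if(cond, order_id, null))` for order counts, where count skips NULL and `rk2 = 1` selects one row per order. Here is a small Python sketch of that pattern. The rows are invented for illustration (they are not project data), and `sum_if` and `count_if` are hypothetical helpers, not Presto functions.

```python
from decimal import Decimal

# Invented order-detail rows, as they might look after the rk1 = 1 filter.
# Order o1 has two goods lines; only one of them carries rk2 = 1.
ROWS = [
    {"order_id": "o1", "order_from": "miniapp", "total_price": Decimal("10.00"), "rk2": 1},
    {"order_id": "o1", "order_from": "miniapp", "total_price": Decimal("5.50"), "rk2": 2},
    {"order_id": "o2", "order_from": "ios", "total_price": Decimal("8.00"), "rk2": 1},
    {"order_id": "o3", "order_from": "miniapp", "total_price": Decimal("2.25"), "rk2": 1},
]


def sum_if(rows, cond, column):
    """Mirror sum(if(cond, column, 0)): rows failing cond contribute 0."""
    return sum((r[column] if cond(r) else Decimal("0") for r in rows), Decimal("0"))


def count_if(rows, cond, column):
    """Mirror count(if(cond, column, null)): count() ignores NULL values."""
    return sum(1 for r in rows if cond(r) and r[column] is not None)


sale_amt = sum_if(ROWS, lambda r: True, "total_price")
mini_app_sale_amt = sum_if(ROWS, lambda r: r["order_from"] == "miniapp", "total_price")
order_cnt = count_if(ROWS, lambda r: r["rk2"] == 1, "order_id")
miniapp_order_cnt = count_if(
    ROWS, lambda r: r["rk2"] == 1 and r["order_from"] == "miniapp", "order_id"
)
print(sale_amt, mini_app_sale_amt, order_cnt, miniapp_order_cnt)
```

Amounts sum over every goods line, while order counts use only the `rk2 = 1` line, so order o1 adds 15.50 to the sales amount but counts as one order.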


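Summary

This post covered building the DWS layer of the Hive new-retail data warehouse project with the grouping sets model. Grouping sets suit sparse wide tables with many dimensions and many metrics: rows for different dimension combinations live in one wide table, which simplifies later queries. Each aggregated column can also be customized per dimension combination, which keeps the approach flexible.
If you have questions about the grouping function or the grouping sets syntax, leave a comment.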
