Hive级联求和

    技术2026-01-13  11

    Hive级联求和(通用方法)

    1 创建表

    create table t_salary_detail(username string,month string,salary int) row format delimited fields terminated by ',';

    2 加载数据

    load data local inpath '/root/hivedata/t_salary_detail.dat' into table t_salary_detail;

    3 数据

    A,2015-01,5 A,2015-01,15 B,2015-01,5 A,2015-01,8 B,2015-01,25 A,2015-01,5 A,2015-02,4 A,2015-02,6 B,2015-02,10 B,2015-02,5 A,2015-03,7 A,2015-03,9 B,2015-03,11 B,2015-03,6

    4 表数据

    +--------------------------+-----------------------+------------------------+--+ | t_salary_detail.username | t_salary_detail.month | t_salary_detail.salary | +--------------------------+-----------------------+------------------------+--+ | A | 2015-01 | 5 | | A | 2015-01 | 15 | | B | 2015-01 | 5 | | A | 2015-01 | 8 | | B | 2015-01 | 25 | | A | 2015-01 | 5 | | A | 2015-02 | 4 | | A | 2015-02 | 6 | | B | 2015-02 | 10 | | B | 2015-02 | 5 | | A | 2015-03 | 7 | | A | 2015-03 | 9 | | B | 2015-03 | 11 | | B | 2015-03 | 6 | +--------------------------+-----------------------+------------------------+--+

    5 需求:统计每个用户每个月获得多少小费?

    select username,month,sum(salary) as salary from t_salary_detail group by username,month;

    6 需求结果

    +-----------+----------+---------+--+ | username | month | salary | 累加金额 +-----------+----------+---------+--+ | A | 2015-01 | 33 | 33 | A | 2015-02 | 10 | 43 | A | 2015-03 | 16 | 59 | B | 2015-01 | 30 | 30 | B | 2015-02 | 15 | 45 | B | 2015-03 | 17 | +-----------+----------+---------+--+

    方法一 使用sum窗口函数计算级联求和问题。

    select a.username, a.month, a.salary, sum(a.salary) over (partition by (a.username) order by month) from (select username, month, sum(salary) as salary from t_salary_detail group by username,month)a;

    方法二 将月总金额表 自己连接 自己连接

    select t3.ausername username, t3.amonth month, max(t3.asalary) salary, sum(bsalary) sumsalary from (select a.username ausername, a.month amonth, a.salary asalary, b.username busername, b.month bmonth, b.salary bsalary from (select username, month, sum(salary) as salary from t_salary_detail group by username,month)a inner join (select username, month, sum(salary) as salary from t_salary_detail group by username,month)b where a.username=b.username and b.month<=a.month)t3 group by t3.ausername,t3.amonth;

    方法二 详解

    将月总金额表 自己连接 自己连接

    select A.*,B.* FROM (select username,month,sum(salary) as salary from t_salary_detail group by username,month) A inner join (select username,month,sum(salary) as salary from t_salary_detail group by username,month) B ons A.username=B.username where B.month <= A.month +-------------+----------+-----------+-------------+----------+-----------+--+ | a.username | a.month | a.salary | b.username | b.month | b.salary | +-------------+----------+-----------+-------------+----------+-----------+--+ | A | 2015-01 | 33 | A | 2015-01 | 33 | | A | 2015-02 | 10 | A | 2015-01 | 33 | | A | 2015-02 | 10 | A | 2015-02 | 10 | | A | 2015-03 | 16 | A | 2015-01 | 33 | | A | 2015-03 | 16 | A | 2015-02 | 10 | | A | 2015-03 | 16 | A | 2015-03 | 16 | | B | 2015-01 | 30 | B | 2015-01 | 30 | | B | 2015-02 | 15 | B | 2015-01 | 30 | | B | 2015-02 | 15 | B | 2015-02 | 15 | | B | 2015-03 | 17 | B | 2015-01 | 30 | | B | 2015-03 | 17 | B | 2015-02 | 15 | | B | 2015-03 | 17 | B | 2015-03 | 17 | +-------------+----------+-----------+-------------+----------+-----------+--+

    从上一步的结果中 进行分组查询,分组的字段是a.username a.month 求月累计值: 将b.month <= a.month的所有b.salary求和即可

    select A.username,A.month,max(A.salary) as salary,sum(B.salary) as accumulate from (select username,month,sum(salary) as salary from t_salary_detail group by username,month) A inner join (select username,month,sum(salary) as salary from t_salary_detail group by username,month) B on A.username=B.username where B.month <= A.month group by A.username,A.month order by A.username,A.month; +-------------+----------+---------+-------------+--+ | a.username | a.month | salary | accumulate | +-------------+----------+---------+-------------+--+ | A | 2015-01 | 33 | 33 | | A | 2015-02 | 10 | 43 | | A | 2015-03 | 16 | 59 | | B | 2015-01 | 30 | 30 | | B | 2015-02 | 15 | 45 | | B | 2015-03 | 17 | 62 | +-------------+----------+---------+-------------+--+

    方法三 使用窗口函数

    绝对路径

    select step, numbs, first_value(numbs) over (order by step) numb2, numbs/ first_value(numbs) over (order by step)*100 absolutePath from dw_oute_numbs;

    相对路径

    select step, numbs, LAG(numbs,1) over(order by step) numb2, numbs/(lag(numbs,1) over (order by step))*100 relativePath from dw_oute_numbs;
    Processed: 0.016, SQL: 9