Hive学习之Lateral View

    技术2026-04-23  19

    Hive学习之Lateral View

    Lateral view与UDTF函数如explode()一起使用,UDTF对每个输入行产生0或者多个输出行。Lateral view首先在基表的每个输入行应用UDTF,然后连接结果输出行与输入行组成拥有指定表别名的虚拟表。Lateralview的语法如下

    LateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)* fromClause: FROM baseTable (lateralView)*

    常见的UDTF函数有 json_tuple,parse_url_tuple,split, explode

    基表pageAds

    列名称列类型pageidSTRINGadid_listArray

    拥有两列的样例表如下,用户想统计出现在所有页面上的某个广告出现的总次数。

    pageidadid_listfront_page[1, 2, 3]contact_page[3, 4, 5]

    ==Lateral view结合explode()==可以将adid_list转化为单独的行(explode函数将数组中的每个元素做为一行返回):

    SELECT pageid, adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;

    输出的结果为:

    pageid (string)adid (int)“front_page”1“front_page”2“front_page”3“contact_page”3“contact_page”4“contact_page”5

    要统计特定广告的出现次数,可以使用count和group by:

    SELECT adid, count(1) FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid GROUP BY adid;

    输出结果为:

    int adidcount(1)1121324151

    FROM子句可以有多个LATERAL VIEW子句,后面的LATERALVIEW可以引用出现在LATERAL VIEW左侧的任何表中的列,例如:

    SELECT * FROM exampleTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(myCol1) myTable2 AS myCol2;

    LATERAL VIEW子句按照出现的顺序依次使用,例如存在下面的基表:

    Array col1Array col2[1, 2][a", “b”, “c”][3, 4][d", “e”, “f”]

    下面的查询:

    SELECT myCol1, col2 FROM baseTableLATERAL VIEW explode(col1) myTable1 AS myCol1;

    输出的结果为:

    int mycol1Array col21[a", “b”, “c”]2[a", “b”, “c”]3[d", “e”, “f”]4[d", “e”, “f”]

    再增加一个LATERAL VIEW后的查询语句为:

    SELECT myCol1, myCol2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(col2) myTable2 AS myCol2;

    输出的结果为:

    int myCol1string myCol21“a”1“b”1“c”2“a”2“b”2“c”3“d”3“e”3“f”4“d”4“e”4“f”
    Processed: 0.009, SQL: 9