Lateral view与UDTF函数如explode()一起使用,UDTF对每个输入行产生0或者多个输出行。Lateral view首先在基表的每个输入行应用UDTF,然后连接结果输出行与输入行组成拥有指定表别名的虚拟表。Lateralview的语法如下
LateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)* fromClause: FROM baseTable (lateralView)*常见的UDTF函数有 json_tuple,parse_url_tuple,split, explode
基表pageAds
列名称列类型pageidSTRINGadid_listArray拥有两列的样例表如下,用户想统计出现在所有页面上的某个广告出现的总次数。
pageidadid_listfront_page[1, 2, 3]contact_page[3, 4, 5]==Lateral view结合explode()==可以将adid_list转化为单独的行(explode函数将数组中的每个元素做为一行返回):
SELECT pageid, adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;输出的结果为:
pageid (string)adid (int)“front_page”1“front_page”2“front_page”3“contact_page”3“contact_page”4“contact_page”5要统计特定广告的出现次数,可以使用count和group by:
SELECT adid, count(1) FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid GROUP BY adid;输出结果为:
int adidcount(1)1121324151FROM子句可以有多个LATERAL VIEW子句,后面的LATERALVIEW可以引用出现在LATERAL VIEW左侧的任何表中的列,例如:
SELECT * FROM exampleTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(myCol1) myTable2 AS myCol2;LATERAL VIEW子句按照出现的顺序依次使用,例如存在下面的基表:
Array col1Array col2[1, 2][a", “b”, “c”][3, 4][d", “e”, “f”]下面的查询:
SELECT myCol1, col2 FROM baseTableLATERAL VIEW explode(col1) myTable1 AS myCol1;输出的结果为:
int mycol1Array col21[a", “b”, “c”]2[a", “b”, “c”]3[d", “e”, “f”]4[d", “e”, “f”]再增加一个LATERAL VIEW后的查询语句为:
SELECT myCol1, myCol2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(col2) myTable2 AS myCol2;输出的结果为:
int myCol1string myCol21“a”1“b”1“c”2“a”2“b”2“c”3“d”3“e”3“f”4“d”4“e”4“f”