spark sql substr函数

技术2022-07-11 123

开发中，经常进行模糊查询或者进行截取字符串进行模糊匹配，常用的就是substr函数或者substring函数。

使用语法: substr(string A, int start),substring(string A, int start) 两者用法一样,两个参数

返回值: string 说明：返回字符串A从start位置到结尾的字符串

1 2 3 举例演示：

spark.sql(“select substring(‘abcdeffff’,3) from data1”).show()

spark.sql(“select substring(‘abcdeffff’,-1) from data1”).show()

±----------------------------------+ |substring(abcdeffff, 3, 2147483647)| ±----------------------------------+ | cdeffff| | cdeffff| ±----------------------------------+

±-----------------------------------+ |substring(abcdeffff, -1, 2147483647)| ±-----------------------------------+ | f| | f| ±-----------------------------------+

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 hive> select substr(‘abcde’,1,2) 和selectsubstr(‘abcde’,0,2)结果一样ab，默认都是从第一位开始取.

import spark.implicits._ val data1 = Seq( ("4", "qi", "bj", "2019/09/02 12:12:12"), ("5", "SUN", "tj", "2019/09/02 13:13:13") ).toDF("useid", "name", "live", "ptime")

data1.show() data1.createOrReplaceTempView(“data1”) ±----±—±---±------------------+ |useid|name|live| ptime| ±----±—±---±------------------+ | 4| qi| bj|2019/09/02 12:12:12| | 5| SUN| tj|2019/09/02 13:13:13| ±----±—±---±------------------+

spark.sql(“select *, substr(ptime, 2,5) from data1”).show()

spark.sql(“select *, substr(ptime, 3,5) from data1”).show() ±----±—±---±------------------±---------------------+ |useid|name|live| ptime|substring(ptime, 2, 5)| ±----±—±---±------------------±---------------------+ | 4| qi| bj|2019/09/02 12:12:12| 019/0| | 5| SUN| tj|2019/09/02 13:13:13| 019/0| ±----±—±---±------------------±---------------------+

±----±—±---±------------------±---------------------+ |useid|name|live| ptime|substring(ptime, 3, 5)| ±----±—±---±------------------±---------------------+ | 4| qi| bj|2019/09/02 12:12:12| 19/09| | 5| SUN| tj|2019/09/02 13:13:13| 19/09| ±----±—±---±------------------±---------------------+ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 语法: substr(string A, int start, int len),substring(string A, intstart, int len)，用法一样，三个参数

返回值: string 说明：返回字符串A从start位置开始，长度为len的字符串

1 2 3 举例演示：

1 2 3 4 5 6 7 8 9 截取后5位;

截取起始超过自身长度：

截取长度超过自身长度:

spark.sql(“select substring(‘abcdeffff’,-5) from data1”).show() spark.sql(“select substring(‘abcdeffff’,10) from data1”).show() spark.sql(“select substring(‘abcdeffff’,12,5) from data1”).show() spark.sql(“select substring(‘abcdeffff’,3,25) from data1”).show()

±-----------------------------------+ |substring(abcdeffff, -5, 2147483647)| ±-----------------------------------+ | effff| | effff| ±-----------------------------------+

±-----------------------------------+ |substring(abcdeffff, 10, 2147483647)| ±-----------------------------------+ | | | | ±-----------------------------------+

±--------------------------+ |substring(abcdeffff, 12, 5)| ±--------------------------+ | | | | ±--------------------------+

±--------------------------+ |substring(abcdeffff, 3, 25)| ±--------------------------+ | cdeffff| | cdeffff| ±--------------------------+ ———————————————— 版权声明：本文为博主「楓尘林间」的原创文章，遵循CC 4.0 BY-SA版权协议，转载请附上原文出处链接及本声明。原文链接：https://blog.csdn.net/bowenlaw/article/details/102484978

Processed: 0.018, SQL: 9