开发中,经常进行模糊查询或者进行截取字符串进行模糊匹配,常用的就是substr函数或者substring函数。
使用语法: substr(string A, int start),substring(string A, int start) 两者用法一样,两个参数
返回值: string 说明:返回字符串A从start位置到结尾的字符串1 2 3 举例演示:
spark.sql(“select substring(‘abcdeffff’,3) from data1”).show()
spark.sql(“select substring(‘abcdeffff’,-1) from data1”).show()
±----------------------------------+ |substring(abcdeffff, 3, 2147483647)| ±----------------------------------+ | cdeffff| | cdeffff| ±----------------------------------+
±-----------------------------------+ |substring(abcdeffff, -1, 2147483647)| ±-----------------------------------+ | f| | f| ±-----------------------------------+
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 hive> select substr(‘abcde’,1,2) 和selectsubstr(‘abcde’,0,2)结果一样ab,默认都是从第一位开始取.
import spark.implicits._ val data1 = Seq( ("4", "qi", "bj", "2019/09/02 12:12:12"), ("5", "SUN", "tj", "2019/09/02 13:13:13") ).toDF("useid", "name", "live", "ptime")data1.show() data1.createOrReplaceTempView(“data1”) ±----±—±---±------------------+ |useid|name|live| ptime| ±----±—±---±------------------+ | 4| qi| bj|2019/09/02 12:12:12| | 5| SUN| tj|2019/09/02 13:13:13| ±----±—±---±------------------+
spark.sql(“select *, substr(ptime, 2,5) from data1”).show()
spark.sql(“select *, substr(ptime, 3,5) from data1”).show() ±----±—±---±------------------±---------------------+ |useid|name|live| ptime|substring(ptime, 2, 5)| ±----±—±---±------------------±---------------------+ | 4| qi| bj|2019/09/02 12:12:12| 019/0| | 5| SUN| tj|2019/09/02 13:13:13| 019/0| ±----±—±---±------------------±---------------------+
±----±—±---±------------------±---------------------+ |useid|name|live| ptime|substring(ptime, 3, 5)| ±----±—±---±------------------±---------------------+ | 4| qi| bj|2019/09/02 12:12:12| 19/09| | 5| SUN| tj|2019/09/02 13:13:13| 19/09| ±----±—±---±------------------±---------------------+ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 语法: substr(string A, int start, int len),substring(string A, intstart, int len),用法一样,三个参数
返回值: string 说明:返回字符串A从start位置开始,长度为len的字符串1 2 3 举例演示:
spark.sql(“select substring(‘abcdeffff’,3,5) from data1”).show() ±-------------------------+ |substring(abcdeffff, 3, 5)| ±-------------------------+ | cdeff| | cdeff| ±-------------------------+
1 2 3 4 5 6 7 8 9 截取后5位;
截取起始超过自身长度:
截取长度超过自身长度:
spark.sql(“select substring(‘abcdeffff’,-5) from data1”).show() spark.sql(“select substring(‘abcdeffff’,10) from data1”).show() spark.sql(“select substring(‘abcdeffff’,12,5) from data1”).show() spark.sql(“select substring(‘abcdeffff’,3,25) from data1”).show()
±-----------------------------------+ |substring(abcdeffff, -5, 2147483647)| ±-----------------------------------+ | effff| | effff| ±-----------------------------------+
±-----------------------------------+ |substring(abcdeffff, 10, 2147483647)| ±-----------------------------------+ | | | | ±-----------------------------------+
±--------------------------+ |substring(abcdeffff, 12, 5)| ±--------------------------+ | | | | ±--------------------------+
±--------------------------+ |substring(abcdeffff, 3, 25)| ±--------------------------+ | cdeffff| | cdeffff| ±--------------------------+ ———————————————— 版权声明:本文为博主「楓尘林间」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。 原文链接:https://blog.csdn.net/bowenlaw/article/details/102484978