spark sql substr函数

    技术2022-07-11  89

    开发中,经常进行模糊查询或者进行截取字符串进行模糊匹配,常用的就是substr函数或者substring函数。

    使用语法: substr(string A, int start),substring(string A, int start) 两者用法一样,两个参数

    返回值: string 说明:返回字符串A从start位置到结尾的字符串

    1 2 3 举例演示:

    spark.sql(“select substring(‘abcdeffff’,3) from data1”).show()

    spark.sql(“select substring(‘abcdeffff’,-1) from data1”).show()

    ±----------------------------------+ |substring(abcdeffff, 3, 2147483647)| ±----------------------------------+ | cdeffff| | cdeffff| ±----------------------------------+

    ±-----------------------------------+ |substring(abcdeffff, -1, 2147483647)| ±-----------------------------------+ | f| | f| ±-----------------------------------+

    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 hive> select substr(‘abcde’,1,2) 和selectsubstr(‘abcde’,0,2)结果一样ab,默认都是从第一位开始取.

    import spark.implicits._ val data1 = Seq( ("4", "qi", "bj", "2019/09/02 12:12:12"), ("5", "SUN", "tj", "2019/09/02 13:13:13") ).toDF("useid", "name", "live", "ptime")

    data1.show() data1.createOrReplaceTempView(“data1”) ±----±—±---±------------------+ |useid|name|live| ptime| ±----±—±---±------------------+ | 4| qi| bj|2019/09/02 12:12:12| | 5| SUN| tj|2019/09/02 13:13:13| ±----±—±---±------------------+

    spark.sql(“select *, substr(ptime, 2,5) from data1”).show()

    spark.sql(“select *, substr(ptime, 3,5) from data1”).show() ±----±—±---±------------------±---------------------+ |useid|name|live| ptime|substring(ptime, 2, 5)| ±----±—±---±------------------±---------------------+ | 4| qi| bj|2019/09/02 12:12:12| 019/0| | 5| SUN| tj|2019/09/02 13:13:13| 019/0| ±----±—±---±------------------±---------------------+

    ±----±—±---±------------------±---------------------+ |useid|name|live| ptime|substring(ptime, 3, 5)| ±----±—±---±------------------±---------------------+ | 4| qi| bj|2019/09/02 12:12:12| 19/09| | 5| SUN| tj|2019/09/02 13:13:13| 19/09| ±----±—±---±------------------±---------------------+ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 语法: substr(string A, int start, int len),substring(string A, intstart, int len),用法一样,三个参数

    返回值: string 说明:返回字符串A从start位置开始,长度为len的字符串

    1 2 3 举例演示:

    spark.sql(“select substring(‘abcdeffff’,3,5) from data1”).show() ±-------------------------+ |substring(abcdeffff, 3, 5)| ±-------------------------+ | cdeff| | cdeff| ±-------------------------+

    1 2 3 4 5 6 7 8 9 截取后5位;

    截取起始超过自身长度:

    截取长度超过自身长度:

    spark.sql(“select substring(‘abcdeffff’,-5) from data1”).show() spark.sql(“select substring(‘abcdeffff’,10) from data1”).show() spark.sql(“select substring(‘abcdeffff’,12,5) from data1”).show() spark.sql(“select substring(‘abcdeffff’,3,25) from data1”).show()

    ±-----------------------------------+ |substring(abcdeffff, -5, 2147483647)| ±-----------------------------------+ | effff| | effff| ±-----------------------------------+

    ±-----------------------------------+ |substring(abcdeffff, 10, 2147483647)| ±-----------------------------------+ | | | | ±-----------------------------------+

    ±--------------------------+ |substring(abcdeffff, 12, 5)| ±--------------------------+ | | | | ±--------------------------+

    ±--------------------------+ |substring(abcdeffff, 3, 25)| ±--------------------------+ | cdeffff| | cdeffff| ±--------------------------+ ———————————————— 版权声明:本文为博主「楓尘林间」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。 原文链接:https://blog.csdn.net/bowenlaw/article/details/102484978

    Processed: 0.013, SQL: 9