Kudu Table Data Migration

    Tech, 2022-07-11

    Copying table data to another table with the Kudu Command Line Tools

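    The two tables can be in the same cluster or in different clusters. They must have the same table schema, but their partition schemas may differ. If the destination table does not exist, the tool can create it with the same table schema and partition schema as the source table.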

    Usage: kudu table copy <master_addresses> <table_name> <dest_master_addresses> [-nocreate_table] [-dst_table=<table>] [-num_threads=<threads>] [-predicates=<predicates>] [-tablets=<tablets>] [-write_type=<type>]

    Arguments:

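    Name | Description | Type | Default
    master_addresses | Comma-separated list of the source cluster's Kudu master addresses, each of the form "hostname:port"; a cluster name can be used instead | string | none
    table_name | Name of the source table | string | none
    dest_master_addresses | Comma-separated list of the destination cluster's Kudu master addresses, each of the form "hostname:port"; a cluster name can be used instead | string | none
    create_table (optional) | Whether to create the destination table if it does not exist | bool | true
    dst_table (optional) | Name of the destination table the data is copied to. If empty, the source table's name is used | string | none
    num_threads (optional) | Number of threads to run. Each thread runs its own KuduSession | int32 | 2
    predicates (optional) | Predicates used to filter rows. Three types are supported: "Comparison", "InList" and "IsNull" | string | none
    tablets (optional) | Comma-separated list of IDs of the tablets to check. If not specified, all tablets are checked | string | none
    write_type (optional) | How data is copied to the destination table: "insert", "upsert" or an empty string. With an empty string no data is copied (useful when create_table is true) | string | insert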
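Long flag lists are easy to mistype, so it can help to assemble the command programmatically. The following is a minimal sketch, not part of the Kudu tooling: the helper name `build_copy_command` and its parameter names are hypothetical, and it emits only flags that appear in the usage line above (`-predicates` and `-tablets` are left out for brevity).

```python
import shlex
from typing import List, Optional


def build_copy_command(
    src_masters: List[str],
    table: str,
    dst_masters: List[str],
    create_table: bool = True,
    dst_table: Optional[str] = None,
    num_threads: Optional[int] = None,
    write_type: Optional[str] = None,
) -> List[str]:
    """Assemble the argv for `kudu table copy` (hypothetical helper)."""
    if not src_masters or not dst_masters:
        raise ValueError("both master address lists must be non-empty")
    # Positional arguments: source masters, source table, destination masters.
    cmd = ["kudu", "table", "copy",
           ",".join(src_masters), table, ",".join(dst_masters)]
    # Optional flags are added only when they differ from "not specified".
    if not create_table:
        cmd.append("-nocreate_table")
    if dst_table is not None:
        cmd.append(f"-dst_table={dst_table}")
    if num_threads is not None:
        cmd.append(f"-num_threads={num_threads}")
    if write_type is not None:
        cmd.append(f"-write_type={write_type}")
    return cmd


if __name__ == "__main__":
    argv = build_copy_command(
        ["node01:7051", "node02:7051", "node03:7051"],
        "my_database.my_table",
        ["cdh01:7051", "cdh02:7051", "cdh03:7051"],
        create_table=False,
        dst_table="my_database.my_table",
        num_threads=8,
        write_type="upsert",
    )
    # Print a shell-safe rendering for review before running it.
    print(shlex.join(argv))
```

The returned list can be passed to `subprocess.run` as-is. Keeping the masters as lists makes it harder to forget one of them by accident.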

    Usage example:

    Note: master_addresses must list every master of the cluster. If only some of them are given, the command fails with an error such as:

    Configuration error: Could not connect to the cluster: no leader master found. Client configured with 1 master(s) (192.168.1.101:7051) but cluster indicates it expects 3 master(s) (cdh01:7051,cdh02:7051,cdh03:7051)
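The error message itself names the masters the cluster expects, so the full list can be recovered from it. Here is a minimal sketch, assuming the message keeps the wording shown above; the function name `expected_masters` is hypothetical, and other Kudu versions may phrase the error differently.

```python
import re
from typing import List

# Matches the "expects N master(s) (addr1,addr2,...)" part of the error text.
EXPECTED_RE = re.compile(r"expects \d+ master\(s\) \(([^)]*)\)")


def expected_masters(error_message: str) -> List[str]:
    """Return the master addresses the cluster reports, or [] if none found."""
    match = EXPECTED_RE.search(error_message)
    if match is None:
        return []
    return [addr.strip() for addr in match.group(1).split(",") if addr.strip()]


if __name__ == "__main__":
    message = (
        "Configuration error: Could not connect to the cluster: no leader "
        "master found. Client configured with 1 master(s) (192.168.1.101:7051) "
        "but cluster indicates it expects 3 master(s) "
        "(cdh01:7051,cdh02:7051,cdh03:7051)"
    )
    # Join the result to get a value usable as a master address list.
    print(",".join(expected_masters(message)))
```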

    [root@node01 ~]# kudu table copy node01:7051,node02:7051,node03:7051 my_database.my_table cdh01:7051,cdh02:7051,cdh03:7051 -nocreate_table -dst_table=my_database.my_table -num_threads=8 -write_type=upsert
    I0701 13:56:29.095507 1551 table_scanner.cc:532] Scanned count: 1302487
    I0701 13:56:34.099725 1551 table_scanner.cc:532] Scanned count: 2425488
    I0701 13:56:39.105008 1551 table_scanner.cc:532] Scanned count: 3468528
    I0701 13:56:44.109151 1551 table_scanner.cc:532] Scanned count: 4453114
    I0701 13:56:49.113626 1551 table_scanner.cc:532] Scanned count: 5425016
    I0701 13:56:54.117858 1551 table_scanner.cc:532] Scanned count: 6464445
    ......
    I0701 14:55:03.737257 1551 table_scanner.cc:532] Scanned count: 535158673
    I0701 14:55:08.741680 1551 table_scanner.cc:532] Scanned count: 535302695
    I0701 14:55:13.745841 1551 table_scanner.cc:532] Scanned count: 535538444
    I0701 14:55:18.750126 1551 table_scanner.cc:532] Scanned count: 535752970
    I0701 14:55:23.754225 1551 table_scanner.cc:532] Scanned count: 535949697
    I0701 14:55:28.758299 1551 table_scanner.cc:532] Scanned count: 536157109
    I0701 14:55:33.762697 1551 table_scanner.cc:532] Scanned count: 536302601
    I0701 14:55:38.767024 1551 table_scanner.cc:532] Scanned count: 536385095
    I0701 14:55:43.771351 1551 table_scanner.cc:532] Scanned count: 536455367
    T 76ccd465c1a345a6808a5731da858044 scanned count 53658586 cost 1300.65 seconds
    Total count 536573786 cost 3563.88 seconds
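The final summary line is enough to estimate the copy throughput. Here is a minimal sketch, assuming the summary keeps the "Total count N cost S seconds" wording seen in this run; the names `parse_summary` and `rows_per_second` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the last line of the output above, e.g.
# "Total count 536573786 cost 3563.88 seconds".
SUMMARY_RE = re.compile(r"Total count (\d+) cost ([\d.]+) seconds")


def parse_summary(log_text: str) -> Optional[Tuple[int, float]]:
    """Return (row_count, elapsed_seconds) from the summary line, if present."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def rows_per_second(rows: int, seconds: float) -> float:
    """Average throughput; returns 0.0 when the elapsed time is not positive."""
    return rows / seconds if seconds > 0 else 0.0


if __name__ == "__main__":
    summary = parse_summary("Total count 536573786 cost 3563.88 seconds")
    if summary is not None:
        rows, seconds = summary
        print(f"{rows} rows in {seconds} s: {rows_per_second(rows, seconds):.0f} rows/s")
```

For the run shown above this works out to roughly 150,000 rows per second with 8 threads and upsert writes.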

    References:
    https://kudu.apache.org/docs/command_line_tools_reference.html#remote_replica
    https://kudu.apache.org/docs/hive_metastore.html
