Sometimes a program writes values to HBase that are not Strings but Longs, for example:
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.hbase.TableName
    import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory, Put, Table}
    import org.apache.hadoop.hbase.util.Bytes

    // Insert a single row
    def monitoringSinglePut(hbasetableName: String, topic: String, execTime: Long, schedulingTime: Long,
                            processTime: Long, num_records: Long, family: String, hbaseconf: Configuration): Unit = {
      val connection: Connection = ConnectionFactory.createConnection(hbaseconf)
      // The constructor argument is the row key (HashChoreWoker is a project helper not shown here)
      val put: Put = new Put(Bytes.toBytes(HashChoreWoker.getMD5(topic).substring(0, 10)))
      // String value: stored as readable text
      put.addColumn(Bytes.toBytes(family), Bytes.toBytes("dateDelay"), Bytes.toBytes(topic))
      // Long values: each is stored as 8 raw bytes
      put.addColumn(Bytes.toBytes(family), Bytes.toBytes("execTimeDelay"), Bytes.toBytes(execTime))
      put.addColumn(Bytes.toBytes(family), Bytes.toBytes("schedulingTimeDelay"), Bytes.toBytes(schedulingTime))
      put.addColumn(Bytes.toBytes(family), Bytes.toBytes("schedulingTimeEndTime"), Bytes.toBytes(processTime))
      put.addColumn(Bytes.toBytes(family), Bytes.toBytes("dataNumRecords"), Bytes.toBytes(num_records))
      // Get the table object
      val table: Table = connection.getTable(TableName.valueOf(hbasetableName))
      table.put(put)
      table.close()
      connection.close()
    }

In the HBase shell these values cannot be read directly. A scan shows the following:
    FCB66C31F7 column=info:dateDelay, timestamp=1593666635018, value=2020-07-02 13:10:35
    FCB66C31F7 column=info:execTimeDelay, timestamp=1593666635018, value=\x00\x00\x00\x00\x00\x00\x00\x01
    FCB66C31F7 column=info:schedulingTimeDelay, timestamp=1593666635018, value=\x00\x00\x00\x00\x00\x00\x00\x00
    FCB66C31F7 column=info:schedulingTimeEndTime, timestamp=1593666635018, value=\x00\x00\x01s\x0D\xEF\x04\xFC
    FD7231BE88 column=info:dataNumRecords, timestamp=1593666445022, value=\x00\x00\x00\x00\x00\x00\x00\x00
    FD7231BE88 column=info:dateDelay, timestamp=1593666445022, value=2020-07-02 13:07:25
    FD7231BE88 column=info:execTimeDelay, timestamp=1593666445022, value=\x00\x00\x00\x00\x00\x00\x00\x01
    FD7231BE88 column=info:schedulingTimeDelay, timestamp=1593666445022, value=\x00\x00\x00\x00\x00\x00\x00\x00
    FD7231BE88 column=info:schedulingTimeEndTime, timestamp=1593666445022, value=\x00\x00\x01s\x0D\xEC\x1E\xCD

Some values are readable: those are the ones we inserted as String.
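The `\x..` values in the scan are the 8 raw bytes of each Long, in big-endian order, which is the layout `Bytes.toBytes(Long)` writes. As a rough illustration, here is a minimal sketch that uses only `java.nio.ByteBuffer` from the JDK instead of the HBase `Bytes` class. The object name `LongBytesDemo` and its methods are hypothetical, and `shellFormat` only approximates how the shell escapes bytes.

```scala
import java.nio.ByteBuffer

// Hypothetical helper, not part of HBase: mimics the 8-byte big-endian layout of a Long cell.
object LongBytesDemo {
  // ByteBuffer is big-endian by default, matching the layout seen in the scan output.
  def encode(v: Long): Array[Byte] = ByteBuffer.allocate(8).putLong(v).array()

  // Read 8 big-endian bytes back into a Long.
  def decode(b: Array[Byte]): Long = ByteBuffer.wrap(b).getLong

  // Approximation of the shell display: printable ASCII as-is, everything else as \xNN.
  def shellFormat(b: Array[Byte]): String =
    b.map { x =>
      val c = x & 0xff
      if (c >= 32 && c < 127 && c != '\\') c.toChar.toString
      else "\\x" + f"$c%02X"
    }.mkString

  def main(args: Array[String]): Unit = {
    // The schedulingTimeEndTime cell of row FCB66C31F7; 0x73 is the printable character 's'.
    val cell = Array(0x00, 0x00, 0x01, 0x73, 0x0D, 0xEF, 0x04, 0xFC).map(_.toByte)
    println(decode(cell))
    println(shellFormat(encode(1L)))
  }
}
```

Decoding the cell `\x00\x00\x01s\x0D\xEF\x04\xFC` this way gives 1593666635004, an epoch time in milliseconds that is 14 ms earlier than the cell timestamp 1593666635018 in the same row, so the stored data is intact and only its display is opaque.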
When mapping this table in Hive, suppose we use the following statement directly:
    create external table if not exists streaming_monitoring (
      rowkey string,
      dateDelay string,
      execTimeDelay bigint,
      schedulingTimeDelay bigint,
      schedulingTimeEndTime bigint,
      dataNumRecords bigint
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, info:dateDelay, info:execTimeDelay, info:schedulingTimeDelay, info:schedulingTimeEndTim, info:dataNumRecords ")
    TBLPROPERTIES ("hbase.table.name" = "dc_sma:streaming_monitoring");

Querying the table then returns NULL for every bigint column:
    00CF1B6FBA 2020-07-02 13:25:35 NULL NULL NULL NULL
    0109883550 2020-07-02 13:19:55 NULL NULL NULL NULL
    018034ADC6 2020-07-02 13:06:25 NULL NULL NULL NULL
    01D18E46E2 2020-07-02 13:53:05 NULL NULL NULL NULL
    0259B6E73A 2020-07-02 13:13:55 NULL NULL NULL NULL
    02989FA99A 2020-07-02 13:49:10 NULL NULL NULL NULL
    02E99E1235 2020-07-02 13:07:10 NULL NULL NULL NULL
    034E5BB16B 2020-07-02 13:28:55 NULL NULL NULL NULL
    0374AF89A6 2020-07-02 13:09:25 NULL NULL NULL NULL
    03942E74AC 2020-07-02 13:46:05 NULL NULL NULL NULL

If we use this statement instead:
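    create external table if not exists streaming_monitoring (
      rowkey string,
      dateDelay string,
      execTimeDelay bigint,
      schedulingTimeDelay bigint,
      schedulingTimeEndTime bigint,
      dataNumRecords bigint
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, info:dateDelay, info:execTimeDelay#b, info:schedulingTimeDelay#b, info:schedulingTimeEndTim#b, info:dataNumRecords#b ")
    TBLPROPERTIES ("hbase.table.name" = "dc_sma:streaming_monitoring");

the bigint columns are read correctly. The one exception is schedulingTimeEndTime, which stays NULL only because the mapping above spells its qualifier info:schedulingTimeEndTim (the final "e" is missing), so it does not match the info:schedulingTimeEndTime column written to HBase. With the qualifier spelled info:schedulingTimeEndTime#b, that column should be read as well. The result: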
    00CF1B6FBA 2020-07-02 13:25:35 1 0 NULL 0
    0109883550 2020-07-02 13:19:55 0 1 NULL 0
    018034ADC6 2020-07-02 13:06:25 1 0 NULL 0
    01D18E46E2 2020-07-02 13:53:05 0 0 NULL 0
    0259B6E73A 2020-07-02 13:13:55 0 1 NULL 0
    02989FA99A 2020-07-02 13:49:10 1 0 NULL 0
    02E99E1235 2020-07-02 13:07:10 0 1 NULL 0
    034E5BB16B 2020-07-02 13:28:55 0 0 NULL 0
    0374AF89A6 2020-07-02 13:09:25 1 0 NULL 0
    03942E74AC 2020-07-02 13:46:05 0 0 NULL 0

Summary: for fields that were not written as String, append #b (binary storage) to the column in hbase.columns.mapping, and Hive will read them correctly.
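The difference between the two mappings can be illustrated with a small simulation. Without a suffix, Hive treats the cell bytes as the text of a number. With #b, it treats them as the raw binary encoding. The sketch below is a simplified model of those two readings, not Hive's actual SerDe code. The names `StorageTypeDemo`, `readAsString`, and `readAsBinary` are hypothetical, and `None` stands in for Hive's NULL.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8
import scala.util.Try

// Hypothetical model of the two storage types; not taken from Hive's source.
object StorageTypeDemo {
  // String storage (no suffix): the cell must contain decimal text such as "42".
  def readAsString(cell: Array[Byte]): Option[Long] =
    Try(new String(cell, UTF_8).toLong).toOption

  // Binary storage (#b): the cell must contain the 8-byte big-endian encoding of the Long.
  def readAsBinary(cell: Array[Byte]): Option[Long] =
    if (cell.length == 8) Some(ByteBuffer.wrap(cell).getLong) else None

  def main(args: Array[String]): Unit = {
    // A cell written the way Bytes.toBytes(1L) lays it out: seven zero bytes, then 0x01.
    val binaryCell = ByteBuffer.allocate(8).putLong(1L).array()
    println(readAsString(binaryCell)) // not decimal text, so nothing is parsed
    println(readAsBinary(binaryCell)) // decoded from the raw bytes
  }
}
```

Under this model, the 8 raw bytes of a Long never parse as decimal text, which matches the NULL columns from the first statement, while the binary reading recovers the values shown in the second result.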