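Steps:
1. Inspect the fields and their contents in the database. Here we can see that the crawl time includes hours, minutes, and seconds, and that the username is shown in full.
2. Add Java code to mask the username dynamically.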
The code goes in the Processor section. The code in the red box is the part that performs the dynamic masking:
import java.util.Collections;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException {
    if (first) {
        first = false;
    }

    Object[] r = getRow();
    if (r == null) {
        setOutputDone();
        return false;
    }

    // It is always safest to call createOutputRow() to ensure that your output row's Object[] is large
    // enough to handle any new fields you are creating in this step.
    r = createOutputRow(r, data.outputRowMeta.size());

    /* TODO: Your code here. (See Sample)

    // Get the value from an input field
    String foobar = get(Fields.In, "a_fieldname").getString(r);
    foobar += "bar";

    // Set a value in a new output field
    get(Fields.Out, "output_fieldname").setValue(r, foobar);
    */

    // Read the input field.
    String id = get(Fields.In, "评论者ID").getString(r);

    // Null or empty IDs are passed through unchanged; otherwise
    // Collections.nCopies(len - 2, ...) would throw for an empty string.
    if (id != null && !id.isEmpty()) {
        int len = id.length();
        if (len == 1) {
            id = "*";
        } else if (len == 2) {
            id = id.charAt(0) + "*";
        } else {
            // Keep the first and last characters and mask everything in between.
            String replace = String.join("", Collections.nCopies(len - 2, "*"));
            id = id.charAt(0) + replace + id.charAt(len - 1);
        }
    }
    get(Fields.Out, "评论者ID_mask").setValue(r, id);

    // Send the row on to the next step (output).
    putRow(data.outputRowMeta, r);
    return true;
}

3. Use a Select values step to blur the time.
4. Add an Excel output step that writes to an .xls file.
5. Connect the steps.
6. Run the transformation and check the output.
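To try the two transformations outside Kettle, the following standalone sketch reproduces the masking rule from step 2 and an approximation of the time blurring from step 3. The class name `MaskingSketch`, the method names `maskId` and `blurTime`, and the `yyyy-MM-dd HH:mm:ss` timestamp format are assumptions made for illustration. In the actual transformation the time blurring is done by the Select values step, not by Java code.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Collections;

// Hypothetical helper class; not part of Kettle.
class MaskingSketch {

    // Same rule as the processRow() code: keep the first and last
    // characters and replace everything in between with '*'.
    static String maskId(String id) {
        if (id == null || id.isEmpty()) {
            return id; // nothing to mask
        }
        int len = id.length();
        if (len == 1) {
            return "*";
        }
        if (len == 2) {
            return id.charAt(0) + "*";
        }
        String stars = String.join("", Collections.nCopies(len - 2, "*"));
        return id.charAt(0) + stars + id.charAt(len - 1);
    }

    // Drops the time of day and keeps only the date.
    // Assumes the input looks like "2021-03-15 14:22:05".
    static String blurTime(String timestamp) {
        DateTimeFormatter in = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        return LocalDateTime.parse(timestamp, in).toLocalDate().toString();
    }

    public static void main(String[] args) {
        System.out.println(maskId("alice_2020"));
        System.out.println(blurTime("2021-03-15 14:22:05"));
    }
}
```

Keeping the rule in a plain static method makes it easy to check edge cases such as one-character or empty IDs before pasting the logic into the User Defined Java Class step.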