In this third post of the blog series, I will continue to focus on the Query Plan and the Execution Summary.
Now, let's look at the Planner Timeline and Query Timeline sections of the profile:
Planner Timeline: 21.773ms
- Analysis finished: 1.432ms (1.432ms)
- Equivalence classes computed: 1.478ms (46.487us)
- Single node plan created: 21.005ms (19.526ms)
- Runtime filters computed: 21.020ms (14.599us)
- Distributed plan created: 21.096ms (76.103us)
- Lineage info computed: 21.156ms (59.924us)
- Planning finished: 21.773ms (617.260us)
Query Timeline: 14m8s
- Query submitted: 250.799us (250.799us)
- Planning finished: 24.359ms (24.108ms)
- Submit for admission: 25.375ms (1.016ms)
- Completed admission: 25.565ms (189.521us)
- Ready to start on 6 backends: 26.997ms (1.432ms)
- All 6 execution backends (7 fragment instances) started: 33.022ms (6.024ms)
- Rows available: 14m8s (14m8s)
- First row fetched: 14m8s (695.112ms)
- Unregister query: 14m8s (5.330ms)
- ComputeScanRangeAssignmentTimer: 110.741us
Each line here is easy to understand. We can see that planning took 21.773ms, the plan was submitted for admission at the 25.375ms mark, execution started on the worker nodes at the 33.022ms mark, the last row of data was ready at 14m8s, and the first row was fetched by the client at 14m8s. This gives you a clear picture of how long each phase took; if any phase is slow, it will stand out immediately, and we can then drill down further to understand what might have happened.
From the Query Timeline, we can clearly see that it took almost 14 minutes (14m8s) from the start of execution (All 6 execution backends) until data became available (Rows available). Because this query is just a count(*), there is no "Last row fetched" entry here.
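Spotting the slow phase by eye is easy in a short timeline, but for long profiles it can be scripted. Below is a minimal Python sketch, assuming the profile has been saved to a local text file (profile.txt is a hypothetical name); the duration formats and line patterns it parses are inferred from the excerpts above and are my own assumptions:

import re

# Map Impala profile duration units to seconds. The unit list is inferred
# from values like "14m8s", "24.359ms" and "250.799us" in the output above.
UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}

def to_seconds(text):
    return sum(float(v) * UNITS[u]
               for v, u in re.findall(r"([\d.]+)(h|ms|us|ns|m|s)", text))

# Timeline entries look like "- Planning finished: 24.359ms (24.108ms)",
# where the value in parentheses is the delta from the previous event.
entry = re.compile(r"-\s+(.+): .+ \((.+)\)")

deltas = []
with open("profile.txt") as f:  # hypothetical file name
    for line in f:
        m = entry.search(line)
        if m:
            deltas.append((to_seconds(m.group(2)), m.group(1)))

# The events with the largest deltas are the first places to look.
for seconds, name in sorted(deltas, reverse=True)[:3]:
    print(f"{seconds:10.3f}s  {name}")

Run against the profile above, the top entry would be the 14m8s delta at "Rows available", which is exactly where the investigation below starts.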
For example, take the following case (from another profile):
Query Compilation: 16.268ms
- Metadata of all 1 tables cached: 1.786ms (1.786ms)
- Analysis finished: 6.162ms (4.376ms)
- Value transfer graph computed: 6.537ms (374.918us)
- Single node plan created: 7.955ms (1.417ms)
- Runtime filters computed: 8.274ms (318.815us)
- Distributed plan created: 8.430ms (156.307us)
- Lineage info computed: 9.664ms (1.234ms)
- Planning finished: 16.268ms (6.603ms)
Query Timeline: 35m46s
- Query submitted: 0.000ns (0.000ns)
- Planning finished: 22.001ms (22.001ms)
- Submit for admission: 23.001ms (1.000ms)
- Completed admission: 23.001ms (0.000ns)
- Ready to start on 2 backends: 24.001ms (1.000ms)
- All 2 execution backends (2 fragment instances) started: 36.001ms (12.000ms)
- Rows available: 5m51s (5m51s)
- First row fetched: 5m52s (950.045ms)
- Last row fetched: 35m46s (29m53s)
- Released admission control resources: 35m46s (1.000ms)
- Unregister query: 35m46s (30.001ms)
- ComputeScanRangeAssignmentTimer: 0.000ns
This Impala query ran for a very long time, and the customer wanted to find out why. From the Query Timeline, we can clearly see that it took almost 6 minutes (5m51s) from the start of execution (All 2 execution backends) until data became available (Rows available). Those 6 minutes of execution may well be normal: with joins over many large datasets, for instance, it is common for a query to run for several minutes.
However, we can also see that Impala spent 30 minutes delivering the data back to the client, since the first row was fetched at around the 6-minute mark while the last row was not fetched until the 36-minute mark. From this, we can suspect a network issue between the Impala coordinator and the client (when the client, such as impala-shell or Hue, fetches data from the Impala coordinator host). Another possibility is that the client was doing extra work while fetching the results, such as printing them to the screen; since the returned result set can be large, that alone can be time-consuming.
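To illustrate the second possibility, here is a minimal sketch using the impyla client (the host name and query are placeholders, and a secured cluster would need extra connection arguments). Fetching in batches with no per-row work measures something close to pure transfer time; re-running with the print loop enabled shows how much client-side processing alone can add on a large result set:

import time
from impala.dbapi import connect  # from the impyla package

# Placeholder connection details; a real cluster may also need
# auth_mechanism and related arguments.
conn = connect(host="impala-coordinator.example.com", port=21050)
cur = conn.cursor()
cur.execute("SELECT * FROM some_db.some_large_table")  # hypothetical query

start, rows = time.time(), 0
while True:
    batch = cur.fetchmany(1024)  # fetch in batches, no per-row work
    if not batch:
        break
    rows += len(batch)
    # Uncomment to simulate a client that prints every row; on a large
    # result set this can dominate the total elapsed time:
    # for row in batch:
    #     print(row)
print(f"fetched {rows} rows in {time.time() - start:.1f}s")

If the silent loop is fast but the real client is slow, the bottleneck is on the client side rather than inside Impala.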
So this part of the profile can point us in the right direction when hunting for the bottleneck.
A few key things to look for in the Summary section (points 2 and 3 are illustrated with a small check right after the table below):
1) Read it bottom-up, from the last line to the first, because that is the order in which Impala executes
2) Compare the "Avg Time" and "Max Time" columns
3) Compare the "Rows" and "Est. #Rows" columns
4) Check the "Detail" column to see the join type of each operator
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
----------------------------------------------------------------------------------------------------------------------------------
03:AGGREGATE 1 316.053us 316.053us 1 1 40.00 KB 10.00 MB FINALIZE
02:EXCHANGE 1 14m8s 14m8s 6 1 0 0 UNPARTITIONED
01:AGGREGATE 6 2.746ms 5.205ms 6 1 67.00 KB 10.00 MB
00:SCAN KUDU 6 2m33s 14m8s 6.10M -1 299.00 KB 0 bms_ldgdb.ldg_690marketization_ledger t
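As a quick aid for points 2 and 3, the two comparisons can be written down directly, using the values from the "00:SCAN KUDU" row above (the 2x threshold is my own rule of thumb, not an Impala rule):

# Values taken from the "00:SCAN KUDU" row of the Summary above.
avg_time_s = 2 * 60 + 33   # "Avg Time"   = 2m33s
max_time_s = 14 * 60 + 8   # "Max Time"   = 14m8s
rows       = 6_100_000     # "#Rows"      = 6.10M
est_rows   = -1            # "Est. #Rows" = -1, i.e. no statistics

# A Max/Avg ratio well above ~2 suggests one host is far slower than the rest.
print(f"time skew: {max_time_s / avg_time_s:.1f}x")  # ~5.5x -> investigate

# An estimate of -1 means the planner had no stats to compare #Rows against;
# running COMPUTE STATS on the table would fix that for future queries.
if est_rows < 0:
    print("no stats available for this table")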
Very quickly, I noticed a big difference between the "Avg Time" and "Max Time" of the SCAN KUDU operator: the average time was 2 minutes 33 seconds, but the maximum time, on one of the 6 hosts, was 14 minutes 8 seconds.
So my next thought was to identify which hosts were executing slower than the others, and whether the slowness came from one single host. To do this, I searched the profile file for the string "id=0", where "0" is the operator number at the beginning of each row in the Summary section, in this case taken from "00:SCAN KUDU".
Searching from the beginning of the file for "id=0" produced the sections below; I have kept only the parts I care about (a small script can automate this extraction; see the sketch after these excerpts).
……
Instance 2945a77ff619defe:b658730000000005 (host=rrs-hdp-dn04:22000):(Total: 14m8s, non-child: 2.048ms, % non-child: 0.00%)
…
KUDU_SCAN_NODE (id=0):(Total: 14m8s, non-child: 14m8s, % non-child: 100.00%)
BytesRead(16s000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
- BytesRead: 0
- KuduRemoteScanTokens: 0 (0)
- NumScannerThreadsStarted: 1 (1)
- PeakMemoryUsage: 299.00 KB (306176)
- RowsRead: 1.02M (1016239)
- RowsReturned: 1.02M (1016239)
- RowsReturnedRate: 1.20 K/sec
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 1 (1)
- ScannerThreadsTotalWallClockTime: 14m8s
- MaterializeTupleTime(*): 14m8s
- ScannerThreadsSysTime: 49.992ms
- ScannerThreadsUserTime: 313.952ms
- ScannerThreadsVoluntaryContextSwitches: 1.67K (1669)
- TotalKuduScanRoundTrips: 1.56K (1560)
- TotalReadThroughput: 0.00 /sec
……
Instance 2945a77ff619defe:b658730000000002 (host=rrs-hdp-dn13:22000):(Total: 39s554ms, non-child: 3.704ms, % non-child: 0.01%)
…
KUDU_SCAN_NODE (id=0):(Total: 39s477ms, non-child: 39s477ms, % non-child: 100.00%)
BytesRead(1s000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
- BytesRead: 0
- KuduRemoteScanTokens: 0 (0)
- NumScannerThreadsStarted: 1 (1)
- PeakMemoryUsage: 122.00 KB (124928)
- RowsRead: 1.02M (1016532)
- RowsReturned: 1.02M (1016532)
- RowsReturnedRate: 25.75 K/sec
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 3 (3)
- ScannerThreadsTotalWallClockTime: 39s478ms
- MaterializeTupleTime(*): 39s474ms
- ScannerThreadsSysTime: 6.998ms
- ScannerThreadsUserTime: 137.979ms
- ScannerThreadsVoluntaryContextSwitches: 100 (100)
- TotalKuduScanRoundTrips: 89 (89)
- TotalReadThroughput: 0.00 /sec
……
Instance 2945a77ff619defe:b658730000000006 (host=rrs-hdp-dn05:22000):(Total: 10s311ms, non-child: 2.554ms, % non-child: 0.02%)
…
KUDU_SCAN_NODE (id=0):(Total: 10s235ms, non-child: 10s235ms, % non-child: 100.00%)
BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
- BytesRead: 0
- KuduRemoteScanTokens: 0 (0)
- NumScannerThreadsStarted: 1 (1)
- PeakMemoryUsage: 122.00 KB (124928)
- RowsRead: 1.02M (1017341)
- RowsReturned: 1.02M (1017341)
- RowsReturnedRate: 99.39 K/sec
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 2 (2)
- ScannerThreadsTotalWallClockTime: 10s235ms
- MaterializeTupleTime(*): 10s231ms
- ScannerThreadsSysTime: 20.996ms
- ScannerThreadsUserTime: 107.983ms
- ScannerThreadsVoluntaryContextSwitches: 34 (34)
- TotalKuduScanRoundTrips: 31 (31)
- TotalReadThroughput: 0.00 /sec
……
Instance 2945a77ff619defe:b658730000000001 (host=rrs-hdp-dn03:22000):(Total: 9s700ms, non-child: 2.187ms, % non-child: 0.02%)
…
KUDU_SCAN_NODE (id=0):(Total: 9s632ms, non-child: 9s632ms, % non-child: 100.00%)
BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
- BytesRead: 0
- KuduRemoteScanTokens: 0 (0)
- NumScannerThreadsStarted: 1 (1)
- PeakMemoryUsage: 122.00 KB (124928)
- RowsRead: 1.02M (1017071)
- RowsReturned: 1.02M (1017071)
- RowsReturnedRate: 105.58 K/sec
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 1 (1)
- ScannerThreadsTotalWallClockTime: 9s632ms
- MaterializeTupleTime(*): 9s628ms
- ScannerThreadsSysTime: 10.998ms
- ScannerThreadsUserTime: 110.983ms
- ScannerThreadsVoluntaryContextSwitches: 33 (33)
- TotalKuduScanRoundTrips: 31 (31)
- TotalReadThroughput: 0.00 /sec
……
Instance 2945a77ff619defe:b658730000000004 (host=rrs-hdp-dn11:22000):(Total: 8s165ms, non-child: 2.354ms, % non-child: 0.03%)
…
KUDU_SCAN_NODE (id=0):(Total: 8s090ms, non-child: 8s090ms, % non-child: 100.00%)
BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
- BytesRead: 0
- KuduRemoteScanTokens: 0 (0)
- NumScannerThreadsStarted: 1 (1)
- PeakMemoryUsage: 122.00 KB (124928)
- RowsRead: 1.02M (1018748)
- RowsReturned: 1.02M (1018748)
- RowsReturnedRate: 125.92 K/sec
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 2 (2)
- ScannerThreadsTotalWallClockTime: 8s091ms
- MaterializeTupleTime(*): 8s085ms
- ScannerThreadsSysTime: 9.998ms
- ScannerThreadsUserTime: 114.982ms
- ScannerThreadsVoluntaryContextSwitches: 31 (31)
- TotalKuduScanRoundTrips: 28 (28)
- TotalReadThroughput: 0.00 /sec
……
Instance 2945a77ff619defe:b658730000000003 (host=rrs-hdp-dn08:22000):(Total: 6s024ms, non-child: 2.396ms, % non-child: 0.04%)
…
KUDU_SCAN_NODE (id=0):(Total: 5s952ms, non-child: 5s952ms, % non-child: 100.00%)
BytesRead(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
- BytesRead: 0
- KuduRemoteScanTokens: 0 (0)
- NumScannerThreadsStarted: 1 (1)
- PeakMemoryUsage: 122.00 KB (124928)
- RowsRead: 1.01M (1014934)
- RowsReturned: 1.01M (1014934)
- RowsReturnedRate: 170.50 K/sec
- ScanRangesComplete: 1 (1)
- ScannerThreadsInvoluntaryContextSwitches: 0 (0)
- ScannerThreadsTotalWallClockTime: 5s952ms
- MaterializeTupleTime(*): 5s948ms
- ScannerThreadsSysTime: 8.998ms
- ScannerThreadsUserTime: 88.986ms
- ScannerThreadsVoluntaryContextSwitches: 25 (25)
- TotalKuduScanRoundTrips: 22 (22)
- TotalReadThroughput: 0.00 /sec
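Grepping by hand works, but with many hosts it is easier to script. Here is a minimal sketch, again assuming the profile is saved as profile.txt (a hypothetical name); the two regular expressions encode the "Instance ... (host=...)" and "KUDU_SCAN_NODE (id=0)" line formats seen above, which is an assumption about their stability across Impala versions:

import re

# "Instance ... (host=rrs-hdp-dn04:22000):(Total: 14m8s, ...)" carries the host;
# "KUDU_SCAN_NODE (id=0):(Total: 14m8s, ...)" carries the per-instance total.
host_re = re.compile(r"Instance .*\(host=([^)]+)\)")
node_re = re.compile(r"KUDU_SCAN_NODE \(id=0\):\(Total: ([^,]+)")

UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}

def to_seconds(text):
    return sum(float(v) * UNITS[u]
               for v, u in re.findall(r"([\d.]+)(h|ms|us|ns|m|s)", text))

results, host = [], None
with open("profile.txt") as f:  # hypothetical file name
    for line in f:
        m = host_re.search(line)
        if m:
            host = m.group(1)
        m = node_re.search(line)
        if m and host:
            results.append((to_seconds(m.group(1)), host))

# Slowest scan node first: any outlier host jumps out immediately.
for seconds, h in sorted(results, reverse=True):
    print(f"{seconds:8.1f}s  {h}")

Against the excerpts above, this would print rrs-hdp-dn04 at the top with about 848 seconds, followed by the other five hosts at 40 seconds or less.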
Here you can see that there are 6 KUDU_SCAN_NODE sections with id=0. Why 6? Because the #Hosts column for this operator in the Summary is 6:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
----------------------------------------------------------------------------------------------------------------------------------
03:AGGREGATE 1 316.053us 316.053us 1 1 40.00 KB 10.00 MB FINALIZE
02:EXCHANGE 1 14m8s 14m8s 6 1 0 0 UNPARTITIONED
01:AGGREGATE 6 2.746ms 5.205ms 6 1 67.00 KB 10.00 MB
00:SCAN KUDU 6 2m33s 14m8s 6.10M -1 299.00 KB 0 bms_ldgdb.ldg_690marketization_ledger t
Now I can clearly see which server node is the bottleneck: it is host rrs-hdp-dn04, which took 14 minutes 8 seconds, while all the other hosts took roughly 5 to 40 seconds. Averaging the scan time across these 6 hosts gives exactly the 2m33s shown as "Avg Time" in the Summary.
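A quick sanity check with the six per-instance totals quoted above (converted to seconds) confirms that figure:

# Per-instance KUDU_SCAN_NODE totals from the excerpts above, in seconds:
# dn04=14m8s, dn13=39s477ms, dn05=10s235ms, dn03=9s632ms, dn11=8s090ms, dn08=5s952ms
times = [848.0, 39.477, 10.235, 9.632, 8.090, 5.952]
print(f"avg = {sum(times) / len(times):.1f}s")  # ~153.6s, i.e. the 2m33s in the Summary

The single slow host drags the average up to 2m33s even though five of the six hosts finished in under a minute, which is exactly why comparing "Avg Time" against "Max Time" is such an effective skew detector.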
From this I concluded that something was wrong with the server rrs-hdp-dn04 and that it needed to be fixed.
Screenshots of the relevant pages told the same story: the table has 6 partitions, matching the 6 scan hosts. And since host rrs-hdp-dn04 took the longest, the next step was to go to rrs-hdp-dn04 and take a look.
That concludes the analysis. Since the cluster has a dedicated operations team, I only wanted to figure out why the query was so slow. I had actually hoped to tune the cluster, adjust some parameters, and test again, but my access is limited and I cannot log in to servers at will, so I had to stop here. I will follow up with parameter tuning and SQL optimization when I get the chance.