记录一次死锁分析

    技术2022-07-11  88

    现象:死锁发生日志

    org.springframework.dao.DeadlockLoserDataAccessException: Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction The error may involve com.jianpeng.jss.dal.mysql.mapper.JobRegistryDao.removeDead-Inline The error occurred while setting parameters SQL: DELETE FROM job_registry WHERE update_time < DATE_ADD(NOW(),INTERVAL -? SECOND) ### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction ; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:266) at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72) at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:73) at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:446) at com.sun.proxy.$Proxy59.delete(Unknown Source) at org.mybatis.spring.SqlSessionTemplate.delete(SqlSessionTemplate.java:310) at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:68) at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:59) at com.sun.proxy.$Proxy68.removeDead(Unknown Source) at com.jianpeng.jss.thread.JobRegistryMonitorHelper$1.run(JobRegistryMonitorHelper.java:41) at java.lang.Thread.run(Thread.java:748) Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)

    分析过程

    1 show engine innodb status 查看master thread的状态信息

    主要日志如下:

    ------------------------ LATEST DETECTED DEADLOCK ------------------------ 2020-07-01 07:24:39 0x7f1b60dfe700 *** (1) TRANSACTION: TRANSACTION 14811395, ACTIVE 0 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s) MySQL thread id 2700703, OS thread handle 139755565319936, query id 115677414 172.16.95.227 jp_prd updating DELETE FROM job_registry WHERE update_time < DATE_ADD(NOW(),INTERVAL -90 SECOND) *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 75 page no 3 n bits 80 index PRIMARY of table `jss`.`job_registry` trx id 14811395 lock_mode X locks rec but not gap waiting Record lock, heap no 5 PHYSICAL RECORD: n_fields 7; compact format; info bits 0 0: len 4; hex 80000008; asc ;; 1: len 6; hex 000000e20102; asc ;; 2: len 7; hex 57000001c904da; asc W ;; 3: len 8; hex 4558454355544f52; asc EXECUTOR;; 4: len 20; hex 6a6f622d6578656375746f722d6578616d706c65; asc job-executor-example;; 5: len 18; hex 3137322e31362e39352e3232373a39393938; asc 172.16.95.227:9998;; 6: len 4; hex 5efbc9b7; asc ^ ;; *** (2) TRANSACTION: TRANSACTION 14811394, ACTIVE 0 sec updating or deleting mysql tables in use 1, locked 1 4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1 MySQL thread id 2700802, OS thread handle 139755566130944, query id 115677413 172.16.95.226 jp_prd updating UPDATE job_registry SET `update_time` = NOW() WHERE `registry_group` = 'EXECUTOR' AND `registry_key` = 'job-executor-example' AND `registry_value` = '172.16.95.227:9998' *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 75 page no 3 n bits 80 index PRIMARY of table `jss`.`job_registry` trx id 14811394 lock_mode X locks rec but not gap Record lock, heap no 5 PHYSICAL RECORD: n_fields 7; compact format; info bits 0 0: len 4; hex 80000008; asc ;; 1: len 6; hex 000000e20102; asc ;; 2: len 7; hex 57000001c904da; asc W ;; 3: len 8; hex 4558454355544f52; asc EXECUTOR;; 4: len 20; hex 6a6f622d6578656375746f722d6578616d706c65; asc job-executor-example;; 5: len 18; hex 3137322e31362e39352e3232373a39393938; asc 172.16.95.227:9998;; 6: len 4; hex 5efbc9b7; asc ^ ;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 75 page no 5 n bits 80 index i_u of table `jss`.`job_registry` trx id 14811394 lock_mode X locks rec but not gap waiting Record lock, heap no 6 PHYSICAL RECORD: n_fields 2; compact format; info bits 0

    2 分析: 可以看到,事务(1)开始索引读,锁定update_time索引(附表索引配置:job_registry表的索引),事务(2)持有三字段的联合索引锁,对应记录的主键索引,准备更新时间字段,等待update_time索引锁。同时事务1锁定了update_time索引,等待主键锁。由此产生死锁。 3 处理: 一:该例是由于分别拿了主键锁和索引锁,然后互相等待,可以考虑,先查,然后都根据主键去删除或者更新,这样的话,对于查询来说是共享读锁,真正更新的时候主键索引唯一;但是这样逐个执行效率太低,数据量也不可控制,实际生产不推荐使用。 二:使用乐观锁,删除或者更新,任一先对该记录进行操作,另一个就拿不到记录锁。 三:悲观锁,两个业务入口进行控制,保证不同时执行。

    Processed: 0.011, SQL: 9