【作者】
劉博:攜程技術(shù)保障中心數(shù)據(jù)庫高級經(jīng)理,主要關(guān)注Sql server和Mysql的運(yùn)維和故障處理。
【環(huán)境】
版本號:5.6.21
隔離級別:REPEATABLE READ
【問題描述】
接到監(jiān)控報(bào)警,有一個(gè)線上的應(yīng)用DeadLock報(bào)錯(cuò),每15分鐘會準(zhǔn)時(shí)出現(xiàn),報(bào)錯(cuò)統(tǒng)計(jì)如下圖:

登錄Mysql服務(wù)器查看日志:
mysql> show engine innodb status/G*** (1) TRANSACTION:TRANSACTION 102973, ACTIVE 11 sec starting index readmysql tables in use 3, locked 3LOCK WAIT 4 lock struct(s), heap size 1136, 3 row lock(s)MySQL thread id 6, OS thread handle 140024996574976, query id 83 localhost us updatingUPDATE TestTableSET column1 = 1,Column2 = sysdate(),Column3= '026'Column4 = 0AND column5 = 485AND column6 = 'SEK'*** (1) WAITING FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 417 page no 1493 n bits 1000 index idx_column6 of table test.TestTable trx id 102973 lock_mode X waitingRecord lock, heap no 859 PHYSICAL RECORD: n_fields 2; compact format; info bits 00: len 3; hex 53454b; asc SEK;;1: len 8; hex 80000000007e1452; asc ~ R;;*** (2) TRANSACTION:TRANSACTION 102972, ACTIVE 26 sec starting index readmysql tables in use 3, locked 3219 lock struct(s), heap size 24784, 2906 row lock(s), undo log entries 7MySQL thread id 5, OS thread handle 140024996841216, query id 84 localhost us updatingUPDATE TestTableSET Column1 = 1,Column2 = sysdate(),Column3 = '026'Column4 = 0AND Column5 = 485AND Column6 = 'SEK'*** (2) HOLDS THE LOCK(S):RECORD LOCKS space id 417 page no 1493 n bits 1000 index idx_Column6 of table test.TestTable trx id 102972 lock_mode XRecord lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 00: len 8; hex 73757072656d756d; asc supremum;;Record lock, heap no 859 PHYSICAL RECORD: n_fields 2; compact format; info bits 00: len 3; hex 53454b; asc SEK;;1: len 8; hex 80000000007e1452; asc ~ R;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 601 page no 89642 n bits 1000 index idx_column6 of table test.TestTable trx id 32231892482 lock_mode X locks rec but not gap waitingRecord lock, heap no 38 PHYSICAL RECORD: n_fields 2; compact format; info bits 00: len 3; hex 53454b; asc SEK;;1: len 8; hex 80000000007eea14; asc ~ ;;
大致一看,更新同一索引的同一行,應(yīng)該是一個(gè)Block,報(bào)TimeOut的錯(cuò)才對,怎么會報(bào)DeadLock?
【初步分析】
先分析下(2) TRANSACTION,TRANSACTION 32231892482。
等待的鎖信息為:
0: len 3; hex 53454b; asc SEK;;1: len 8; hex 80000000007eea14; asc
持有的鎖信息為:
0: len 3; hex 53454b; asc SEK;;1: len 8; hex 80000000007eeac4; asc
再先分析下(1) TRANSACTION,TRANSACTION 32231892617。
等待的鎖信息為:
0: len 3; hex 53454b; asc SEK;;1: len 8; hex 80000000007eeac4; asc
于是可以畫出的死鎖表,兩個(gè)資源相互依賴,造成死鎖:
| TRANSACTION | Hold | Wait |
|---|---|---|
| 32231892617 | 53454b/80000000007eea14 | 53454b/80000000007eeac4 |
| 32231892482 | 53454b/80000000007eeac4 | 53454b/80000000007eea14 |
讓我們再看一下explain結(jié)果:
mysql>desc UPDATE TestTable SET Column1=1, Column2 = sysdate(),Column3 = '025' Column4 = 0 AND Column5 = 477 AND Column6 = 'SEK' /G;
*************************** 1. row ***************************
id: 1
select_type: UPDATE
table: TestTable
partitions: NULL
type: index_merge
possible_keys: column5_index,idx_column5_column6_Column1,idxColumn6
key: column5_index,idxColumn6
key_len: 8,9
ref: NULL
rows: 7
filtered: 100.00
Extra: Using intersect(column5_index,idxColumn6); Using where
可以看到 EXTRA 列:
Using intersect(column5_index,idxColumn6)
從5.1開始,引入了 index merge 優(yōu)化技術(shù),對同一個(gè)表可以使用多個(gè)索引分別進(jìn)行條件掃描。
相關(guān)文檔:http://dev.mysql.com/doc/refman/5.7/en/index-merge-optimization.html
The Index Merge method is used to retrieve rows with several range scans and to merge their results into one. The merge can produce unions, intersections, or unions-of-intersections of its underlying scans. This access method merges index scans from a single table; it does not merge scans across multiple tables.
【模擬與驗(yàn)證】
根據(jù)以上初步分析,猜測應(yīng)該就是intersect造成的,于是在測試環(huán)境模擬驗(yàn)證,開啟2個(gè)session模擬死鎖:
| 時(shí)間序列 | Session1 | Session2 |
|---|---|---|
| 1 | Begin; | |
| 2 | UPDATE TestTable SET Column2 = sysdate() Column4 = 0 AND Column5 = 47 AND Column6 = 'SEK 執(zhí)行成功,影響7行 | |
| 3 | Begin; | |
| 4 | UPDATE TestTable SET Column2 = sysdate(),Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK'; 被Blocking | |
| 5 | UPDATE TestTable SET Column2 = sysdate(),Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK'; 執(zhí)行成功 | ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction |
依據(jù)以上信息可以發(fā)現(xiàn)Session2雖然被Block了,但也獲取了一些Session1在時(shí)間序列5時(shí)所需資源的X鎖,可以再開啟一個(gè)查詢select count(Column5) from TestTable where Column5 = 485,設(shè)置SET TRANSACTION ISOLATION LEVEL SERIALIZABLE,去查詢Column5 = 485的行,觀察鎖等待的信息:
mysql> SELECT r.trx_id waiting_trx_id, r.trx_mysql_thread_id waiting_thread, r.trx_query waiting_query, b.trx_id
mysql> SELECT r.trx_id waiting_trx_id, r.trx_mysql_thread_id waiting_thread, r.trx_query waiting_query, b.trx_id blocking_trx_id, b.trx_mysql_thread_id blocking_thread, b.trx_query blocking_query FROM information_schema.innodb_lock_waits w INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id /G;*************************** 1. row ***************************waiting_trx_id: 103006waiting_thread: 36waiting_query: UPDATE TestTable SET Column1 = 1, Column2 = sysdate(), Column3 = '026' Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK'blocking_trx_id: 103003blocking_thread: 37blocking_query: NULL*************************** 2. row ***************************waiting_trx_id: 421500433538672waiting_thread: 39waiting_query: select count(Column5) from TestTable where Column5 = 485blocking_trx_id: 103006blocking_thread: 36blocking_query: UPDATE TestTable SET Column1 = 1, Column2 = sysdate(), Column3 = '026' Column4 = 0 AND Column5 = 485 AND Column6 = 'SEK'2 rows in set, 1 warning (0.00 sec)mysql> select * from information_schema.innodb_lock_waits /G;*************************** 1. row ***************************requesting_trx_id: 103006requested_lock_id: 103006:417:1493:859blocking_trx_id: 103003blocking_lock_id: 103003:417:1493:859*************************** 2. row ***************************requesting_trx_id: 421500433538672requested_lock_id: 421500433538672:417:749:2blocking_trx_id: 103006blocking_lock_id: 103006:417:749:22 rows in set, 1 warning (0.00 sec)
mysql> select * from INNODB_LOCKS /G;*************************** 1. row ***************************lock_id: 103006:417:1493:859lock_trx_id: 103006lock_mode: Xlock_type: RECORDlock_table: test.TestTablelock_index: idxColumn6lock_space: 417lock_page: 1493lock_rec: 859lock_data: 'SEK', 8262738*************************** 2. row ***************************lock_id: 103003:417:1493:859lock_trx_id: 103003lock_mode: Xlock_type: RECORDlock_table:test.TestTablelock_index: idxColumn6lock_space: 417lock_page: 1493lock_rec: 859lock_data: 'SEK', 8262738*************************** 3. row ***************************lock_id: 421500433538672:417:749:2lock_trx_id: 421500433538672lock_mode: Slock_type: RECORDlock_table: test.TestTablelock_index: column5_indexlock_space: 417lock_page: 749lock_rec: 2lock_data: 485, 8317620*************************** 4. row ***************************lock_id: 103006:417:749:2lock_trx_id: 103006lock_mode: Xlock_type: RECORDlock_table: test.TestTablelock_index: column5_indexlock_space: 417lock_page: 749lock_rec: 2lock_data: 485, 83176204 rows in set, 1 warning (0.00 sec)
可以看到Session2,trx_id 103006阻塞了trx_id 421500433538672,而trx_id 421500433538672 requested_lock也正好是lock_data: 485, 8317620。由此可見Session2雖然別block了,但是還是獲取到了Index column5_index相關(guān)的鎖。被Block是因?yàn)閕ntersect的原因,還需要idxColumn6的鎖,至此思路已經(jīng)清晰,對整個(gè)分配鎖的信息簡化一下,如下表格(請求到的鎖用青色表示,需獲取但未獲取到的鎖用紅色表示):
| 時(shí)間點(diǎn) | Session1 | Session2 |
|---|---|---|
| 1 | 477 SEK | |
| 2 | 485 SEK | |
| 3 | 485 SEK | 死鎖發(fā)生 |
可以看到485 SEK這兩個(gè)資源形成了一個(gè)環(huán)狀,最終發(fā)生死鎖。
【解決方法】
新聞熱點(diǎn)
疑難解答
圖片精選