如果系统出现死锁,一种方式是通过show engine innodb status查看,但是需要对MySQL锁有比较系统深入的了解,而且它只显示谁在等待锁,无法查看到谁持有锁;很多情况下其并不能显示所有锁状态信息;
今天讨论另一个方法,MySQL5.7 通过information_schema下的视图和sys库中的processlist表来查看是谁在持有锁;
首先我们复现一个死锁现象,查看一下事务隔离级别。
# session 1
[test]>show variables like "tx_isolation";
+---------------+-----------------+
| Variable_name | Value |
+---------------+-----------------+
| tx_isolation | REPEATABLE-READ |
+---------------+-----------------+
1 row in set (0.01 sec)
创建一个表,在表中插入一些数据
# session 1
[test]>show create table test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`c1` int(10) DEFAULT NULL,
`c2` char(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
[test]>select * from test;
+------+-------+
| c1 | c2 |
+------+-------+
| 1 | a,b |
| 2 | c,d,e |
| 3 | f,g |
+------+-------+
3 rows in set (0.00 sec)
接着我们在开启另一个命令行窗口,会话2
# session 2
[test]>select * from test;
+------+-------+
| c1 | c2 |
+------+-------+
| 1 | a,b |
| 2 | c,d,e |
| 3 | f,g |
+------+-------+
3 rows in set (0.00 sec)
在会话1中开启事务,并执行删除语句,不提交事务。
# session 1
[test]>begin;
Query OK, 0 rows affected (0.00 sec)
[test]>delete from test where c1 = 2;
Query OK, 1 row affected (0.03 sec)
在会话2中开启事务,并执行删除语句,不提交事务。
# session 2
[test]>begin;
Query OK, 0 rows affected (0.00 sec)
[test]>delete from test where c1 = 3;
会话2中的语句被阻塞。这时我们再开启一个命令行窗口,会话3,进入information_schema库中,执行
select * from innodb_locks;
# session 3
[(none)]>use information_schema;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
[information_schema]>select * from innodb_locks;
+----------------+-------------+-----------+-----------+---------------+-----------------+------------+-----------+----------+----------------+
| lock_id | lock_trx_id | lock_mode | lock_type | lock_table | lock_index | lock_space | lock_page | lock_rec | lock_data |
+----------------+-------------+-----------+-----------+---------------+-----------------+------------+-----------+----------+----------------+
| 1270714:24:3:2 | 1270714 | X | RECORD | `test`.`test` | GEN_CLUST_INDEX | 24 | 3 | 2 | 0x000000000200 |
| 1270709:24:3:2 | 1270709 | X | RECORD | `test`.`test` | GEN_CLUST_INDEX | 24 | 3 | 2 | 0x000000000200 |
+----------------+-------------+-----------+-----------+---------------+-----------------+------------+-----------+----------+----------------+
2 rows in set, 1 warning (0.00 sec)
可以看到2个锁争用,锁模式都是排他锁,可以看到对应2个锁的锁ID及其对应的表;
那么这2个锁,是谁阻塞谁呢?
我们可以通过innodb_lock_waits表来查看,我们执行 select * from innodb_lock_waits;
# session 3
[information_schema]>select * from innodb_lock_waits;
+-------------------+-------------------+-----------------+------------------+
| requesting_trx_id | requested_lock_id | blocking_trx_id | blocking_lock_id |
+-------------------+-------------------+-----------------+------------------+
| 1270714 | 1270714:24:3:2 | 1270709 | 1270709:24:3:2 |
+-------------------+-------------------+-----------------+------------------+
1 row in set, 1 warning (0.00 sec)
这个表可以看到事务ID为1270714的事务处于等待状态,所以1270709事务持有排他锁,未释放,1270714事务在等待对方释放,所以就造成了阻塞现象。
那么它们的事务ID我们知道了,如何知道对应的session id呢?
我们可用通过innodb_trx表来查看。
# session 3
[information_schema]>select * from innodb_trx\G
*************************** 1. row ***************************
trx_id: 1270714
trx_state: LOCK WAIT
trx_started: 2022-09-28 11:49:06
trx_requested_lock_id: 1270714:24:3:2
trx_wait_started: 2022-09-28 11:49:06
trx_weight: 2
trx_mysql_thread_id: 689
trx_query: delete from test where c1 = 3
trx_operation_state: starting index read
trx_tables_in_use: 1
trx_tables_locked: 1
trx_lock_structs: 2
trx_lock_memory_bytes: 1136
trx_rows_locked: 1
trx_rows_modified: 0
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 0
trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 2. row ***************************
trx_id: 1270709
trx_state: RUNNING
trx_started: 2022-09-28 11:49:00
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 3
trx_mysql_thread_id: 687
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 1
trx_lock_structs: 2
trx_lock_memory_bytes: 1136
trx_rows_locked: 4
trx_rows_modified: 1
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 0
trx_is_read_only: 0
trx_autocommit_non_locking: 0
2 rows in set (0.00 sec)
trx_mysql_thread_id 这个事务的会话id了,我们可以通过trx_mysql_thread_id对应的sys.processlist表中的conn_id,来查看当前事务正在做什么事。
# session 3
[information_schema]>select * from sys.processlist where conn_id = 687\G
*************************** 1. row ***************************
thd_id: 714
conn_id: 687
user:
db: test
command: Sleep
state: NULL
time: 412
current_statement: NULL
statement_latency: NULL
progress: NULL
lock_latency: 83.00 us
rows_examined: 3
rows_sent: 0
rows_affected: 1
tmp_tables: 0
tmp_disk_tables: 0
full_scan: NO
last_statement: delete from test where c1 = 2
last_statement_latency: 7.37 ms
current_memory: 0 bytes
last_wait: NULL
last_wait_latency: NULL
source: NULL
trx_latency: NULL
trx_state: NULL
trx_autocommit: NULL
pid: 3959
program_name: mysql
1 row in set (0.05 sec)
[information_schema]>select * from sys.processlist where conn_id = 689\G
*************************** 1. row ***************************
thd_id: 716
conn_id: 689
user:
db: test
command: Sleep
state: NULL
time: 632
current_statement: NULL
statement_latency: NULL
progress: NULL
lock_latency: 84.00 us
rows_examined: 0
rows_sent: 0
rows_affected: 0
tmp_tables: 0
tmp_disk_tables: 0
full_scan: NO
last_statement: delete from test where c1 = 3
last_statement_latency: 51.37 s
current_memory: 0 bytes
last_wait: NULL
last_wait_latency: NULL
source: NULL
trx_latency: NULL
trx_state: NULL
trx_autocommit: NULL
pid: 6807
program_name: mysql
1 row in set (0.05 sec)
这时我们就可以看到持有锁的事务在做什么操作了,例如上面的就是在执行删除记录操作。那么我们知道是哪个会话产生阻塞,就可以决定是等待完成,还是kill某一个会话了;
例如我们想kill掉持有锁的事务,我们直接kill trx_mysql_thread_id 就可以了
# session 3
[information_schema]>kill 687;
Query OK, 0 rows affected (0.00 sec)