6.2 锁问题排查

6.2.1 找出谁持有全局读锁

全局读锁通常是由flush table with read lock;这类语句添加的。在各种备份工具为了得到一致性备份,以及在具有主从复制架构的环境中做主备切换时常常使用这类语句。另外,还有一种情况,也是最难排查的一种情况,就是线上系统权限约束不规范,各种人员使用的数据库账号都具有RELOAD权限时,都可以对数据库加全局读锁。

在MySQL 5.7之前的版本中,要排查谁持有全局读锁,通常在数据库层面是很难直接查询到有用数据的(innodb_locks表也只能记录InnoDB层的锁信息,而全局读锁是Server层的锁,所以无法查询到)。从MySQL 5.7版本开始提供了performance_schema.metadata_locks表,用来记录一些Server层的锁信息(包括全局读锁和MDL锁等)。下面通过一个示例来演示如何使用performance_schema找出谁持有全局读锁。

首先,开启第一个会话,执行加全局读锁的语句。

# 执行加锁语句
mysql> flush table with read lock;
Query OK, 0 rows affected(0.00 sec)
# 查询以下加锁线程的process id,以便与后续排查过程对应
mysql> select connection_id();
+-----------------+
| connection_id()|
+-----------------+
| 4                |
+-----------------+
1 row in set(0.00 sec)

然后,开启第二个会话,执行可能对数据造成修改的任意语句,这里以update操作为例。

mysql> use sbtest
Database changed
mysql> select * from sbtest1 limit 1\G
*************************** 1. row ***************************
 id: 21
  k: 2483476
  c:09279210219-37745839908-56185699327-79477158641-86711242956-61449540392-42622804506-61031512845-36718422840-11028803849
pad: 96813293060-05308009118-09223341195-19224109585-45598161848
1 row in set(0.00 sec)
mysql> select connection_id();
+-----------------+
| connection_id()|
+-----------------+
| 5                |
+-----------------+
1 row in set(0.00 sec)
mysql> update sbtest1 set pad='xxx' where id=21; # 操作被阻塞

接下来,开启第三个会话,开始使用一些手段进行排查。

mysql> select connection_id();
+-----------------+
| connection_id()|
+-----------------+
| 16               |
+-----------------+
1 row in set(0.00 sec)
# 查询processlist信息,这里只能看到process id5的线程StateWaiting for global read lock,表示正在等待全局读锁
mysql> show processlist;
+----+-------+------+--------------+--------+-------------+-------+----------------+
| Id | User  | Host | db           | Command| Time         | State | Info           |
+----+-------+---------------------+--------+-------------+-------+----------------+
| 3  | qfsys | 192.168.2.168:41042  | NULL   | Binlog Dump | 11457  | Master has sent all binlog to slave; waiting for more updates   | NULL        |
| 4  | root | localhost | sbtest | Sleep | 234 | | NULL |
| 5  | root | localhost | sbtest | Query | 26 | Waiting for global read lock | update sbtest1 set pad='xxx' where id=21 |
| 16 | root | localhost | NULL | Query | 0 | starting | show processlist |
+----+-------+---------------------+--------+-------------+-------+----------------+
4 rows in set(0.00 sec)
# 继续查询information_schema.innodb_locksinnodb_lock_waitsinnodb_trx表,发现三个表均为空
mysql> select * from information_schema.innodb_locks;
Empty set, 1 warning(0.00 sec)
mysql> select * from information_schema.innodb_lock_waits;
Empty set, 1 warning(0.00 sec)
mysql> select * from information_schema.innodb_trx\G
Empty set(0.00 sec)
# 再使用show engine innodb status;查看(这里只需要看TRANSACTION段落即可),仍然无任何有用的锁信息
mysql> show engine innodb status;
......
=====================================
2018-06-25 13:01:43 0x7fe55ded8700 INNODB MONITOR OUTPUT
=====================================
......
------------
TRANSACTIONS
------------
Trx id counter 2527502
Purge done for trx's n:o < 2527500 undo n:o < 0 state: running but idle
History list length 3
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 422099353083504, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 422099353082592, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 422099353081680, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
--------
FILE I/O
......

通过上面的常规手段查询,无任何有用信息。这个时候,有GDB调试经验的老手估计就要开始使用gdb、strace、pstack等命令查看MySQL调用栈、线程等信息了,但这对于没有C语言基础的人来说,基本上是看天书。好在从MySQL 5.7版本开始提供了performance_schema.metadata_locks表,该表记录了各种Server层的锁信息(包括全局读锁和MDL锁等信息)。下面开启第四个会话查询该表试试。

# 通过performance_schema.metadata_locks表来排查谁持有全局读锁,全局读锁在该表中通常记录着同一个会话的OBJECT_TYPEglobalcommitLOCK_TYPE都为SHARED的两把显式锁
mysql> select * from performance_schema.metadata_locks where OWNER_THREAD_ID! =sys.ps_thread_id(connection_id())\G
*************************** 1. row ***************************
        OBJECT_TYPE: GLOBAL
      OBJECT_SCHEMA: NULL
        OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140621322913984
          LOCK_TYPE: SHARED  # 共享锁
      LOCK_DURATION: EXPLICIT  # 显式
        LOCK_STATUS: GRANTED  # 已授予
            SOURCE: lock.cc:1110
      OWNER_THREAD_ID: 94 # 持有锁的内部线程ID94
      OWNER_EVENT_ID: 16
*************************** 2. row ***************************
        OBJECT_TYPE: COMMIT
      OBJECT_SCHEMA: NULL
        OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140621322926064
          LOCK_TYPE: SHARED # 共享锁
      LOCK_DURATION: EXPLICIT  # 显式
        LOCK_STATUS: GRANTED # 已授予
            SOURCE: lock.cc:1194
      OWNER_THREAD_ID: 94 # 持有锁的内部线程ID94
          OWNER_EVENT_ID: 16
*************************** 3. row ***************************
        OBJECT_TYPE: GLOBAL
      OBJECT_SCHEMA: NULL
        OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 140621391527216
          LOCK_TYPE: INTENTION_EXCLUSIVE  # 意向排他锁
      LOCK_DURATION: STATEMENT  # 语句
        LOCK_STATUS: PENDING  # 状态为PENDING,表示正在等待被授予
            SOURCE: sql_base.cc:3190
      OWNER_THREAD_ID: 95  # 被阻塞的内部线程ID95
      OWNER_EVENT_ID: 38
3 rows in set(0.00 sec)
# 查看process id45的线程各自对应的内部线程ID是多少
mysql> select sys.ps_thread_id(4);
+---------------------+
| sys.ps_thread_id(4)|
+---------------------+
| 94 |   # process id4的线程对应的内部线程ID正好为94,说明就是process id4的线程持有了全局读锁
+---------------------+
1 row in set(0.00 sec)
mysql> select sys.ps_thread_id(5);
+---------------------+
| sys.ps_thread_id(5)|
+---------------------+
| 95 |   # process id5的线程对应的内部线程ID正好是95,说明就是process id5的线程在等待全局读锁
+---------------------+
1 row in set(0.00 sec)

如果是生产环境,综合上述信息,通过在processlist信息的行记录中找到与process id为4对应的User、Host、db信息,大致判断属于什么业务用途,找相关人员询问清楚,该处理的就处理,再顺便讨论今后如何避免这个问题。

6.2.2 找出谁持有MDL锁

在执行语句时,我们可能经常会遇到被阻塞等待MDL锁的情况。例如:使用show processlist;语句查看线程信息时可能会发现State字段值为“Waiting for table metadata lock”。那么,当遇到这种情况时,应该如何去排查是谁持有了MDL锁没有释放呢?下面我们尝试进行MDL锁的等待场景模拟(MDL锁记录对应的instruments为wait/lock/metadata/sql/mdl,默认未启用;对应的consumers为performance_schema.metadata_locks,在setup_consumers中只受全局配置项global_instrumentation控制,默认已启用)。

首先,打开两个会话,分别执行如下语句。

# 会话1,显式开启一个事务,并执行一条update语句更新sbtest1表不提交
mysql> use sbtest
Database changed
mysql> begin;
Query OK, 0 rows affected(0.00 sec)
mysql> update sbtest1 set pad='yyy' where id=1;
Query OK, 1 row affected(0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
# 会话2,对sbtest1表执行DDL语句添加一个普通索引
mysql> use sbtest
Database changed
mysql> alter table sbtest1 add index i_c(c);  # 被阻塞

然后,再开启一个会话,使用show processlist;语句查询线程信息,可以发现update语句正在等待MDL锁(Waiting for table metadata lock)。

mysql> show processlist;
+----+------+-----------+--------+---------+------+---------+---------------------+
| Id | User | Host       | db     | Command  | Time | State   | Info                 |
+----+------+-----------+--------+---------+------+---------+---------------------+
|92  |root |localhost  |sbtest |Query  |  121|Waiting for table metadata lock|alter table sbtest1 add index i_c(c)|
| 93 | root  | localhost | NULL    | Query | 0       | starting  | show processlist      |
| 94 | root | localhost | sbtest | Sleep  | 1078   |          | NULL                |
+----+------+-----------+--------+---------+------+----------+--------------------+
3 rows in set(0.00 sec)

在MySQL 5.7版本之前,我们不能从数据库层面很直观地去查询谁持有MDL锁信息(如果使用GDB之类的工具来查看,则需要具有一定的C语言基础)。现在,可以通过查询performance_schema.metadata_locks表得知MDL锁信息,发现有5行MDL锁记录,第一行为sbtest.sbtest1表的SHARED_WRITE锁,处于GRANTED状态,被136线程所持有(对应process id为94);后续4行为sbtest.sbtest1表的SHARED_UPGRADABLE、EXCLUSIVE锁,其中SHARED_UPGRADABLE处于GRANTED状态,EXCLUSIVE处于PENDING状态,被134线程所持有(对应process id为92),说明134线程在等待MDL锁。

mysql> select * from performance_schema.metadata_locks where OWNER_THREAD_ID! =sys.ps_thread_id(connection_id())\G
*************************** 1. row ***************************
        OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: sbtest
        OBJECT_NAME: sbtest1
OBJECT_INSTANCE_BEGIN: 139886013386816
          LOCK_TYPE: SHARED_WRITE
        LOCK_DURATION: TRANSACTION
        LOCK_STATUS: GRANTED
            SOURCE: sql_parse.cc:5996
      OWNER_THREAD_ID: 136
      OWNER_EVENT_ID: 721
*************************** 2. row ***************************
        OBJECT_TYPE: GLOBAL
      OBJECT_SCHEMA: NULL
        OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 139886348911600
          LOCK_TYPE: INTENTION_EXCLUSIVE
      LOCK_DURATION: STATEMENT
        LOCK_STATUS: GRANTED
            SOURCE: sql_base.cc:5497
      OWNER_THREAD_ID: 134
      OWNER_EVENT_ID: 4667
*************************** 3. row ***************************
        OBJECT_TYPE: SCHEMA
      OBJECT_SCHEMA: sbtest
        OBJECT_NAME: NULL
OBJECT_INSTANCE_BEGIN: 139886346748096
          LOCK_TYPE: INTENTION_EXCLUSIVE
      LOCK_DURATION: TRANSACTION
        LOCK_STATUS: GRANTED
            SOURCE: sql_base.cc:5482
      OWNER_THREAD_ID: 134
      OWNER_EVENT_ID: 4667
*************************** 4. row ***************************
        OBJECT_TYPE: TABLE
      OBJECT_SCHEMA: sbtest
        OBJECT_NAME: sbtest1
OBJECT_INSTANCE_BEGIN: 139886346749984
          LOCK_TYPE: SHARED_UPGRADABLE
      LOCK_DURATION: TRANSACTION
        LOCK_STATUS: GRANTED
            SOURCE: sql_parse.cc:5996
      OWNER_THREAD_ID: 134
      OWNER_EVENT_ID: 4669
*************************** 5. row ***************************
        OBJECT_TYPE: TABLE
      OBJECT_SCHEMA: sbtest
        OBJECT_NAME: sbtest1
OBJECT_INSTANCE_BEGIN: 139886348913168
          LOCK_TYPE: EXCLUSIVE
      LOCK_DURATION: TRANSACTION
        LOCK_STATUS: PENDING
            SOURCE: mdl.cc:3891
      OWNER_THREAD_ID: 134
      OWNER_EVENT_ID: 4748
5 rows in set(0.00 sec)

通过上述数据,我们知道是哪个线程持有了MDL锁。通过show processlist;语句的查询结果可以看到process id为94的线程已经长时间处于Sleep状态,但是在这里并不能看到这个线程执行了什么语句,可能需要查询information_schema.innodb_trx表,确认该线程是否存在一个没有提交的事务。如下所示,通过查询该表发现process id为94(trx_mysql_thread_id:94)的线程确实有一个未提交的事务,但并没有太多的有用信息,除了事务开始时间和process id(trx_started: 2018-01-14 01:19:25, trx_mysql_thread_id: 94)。

mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                  trx_id: 2452892
              trx_state: RUNNING
            trx_started: 2018-01-14 01:19:25
    trx_requested_lock_id: NULL
        trx_wait_started: NULL
              trx_weight: 3
      trx_mysql_thread_id: 94
......
1 row in set(0.00 sec)

此时,从我们掌握的所有数据信息来看,虽然知道是136线程的事务没有提交导致的134线程发生MDL锁等待,但是并不知道136线程正在做什么事情。我们当然可以杀掉136线程,让134线程继续往下执行,但是不知道136线程在执行什么语句,就无法找到相关的开发人员进行优化。所以,我们还可以借助performance_schema.events_statements_current表来查询某个线程正在执行或者说最后一次执行完成的语句事件信息(这里的信息并不一定可靠,因为该表只能记录每个线程当前正在执行和最近一次执行完成的语句事件信息,一旦这个线程执行新的语句,信息就会被覆盖),如下所示。

mysql> select * from performance_schema.events_statements_current where thread_id=136\G
*************************** 1. row ***************************
            THREAD_ID: 136
            EVENT_ID: 715
        END_EVENT_ID: 887
          EVENT_NAME: statement/sql/update
              SOURCE: socket_connection.cc:101
......
            SQL_TEXT: update sbtest1 set pad='yyy' where id=1
              DIGEST: 69f516aa8eaa67fd6e7bfd3352de5d58
          DIGEST_TEXT: UPDATE `sbtest1` SET `pad` = ? WHERE `id` = ?
      CURRENT_SCHEMA: sbtest
......
        MESSAGE_TEXT: Rows matched: 1  Changed: 1  Warnings: 0
......
1 row in set(0.00 sec)

在performance_schema.events_statements_current表的查询信息中,通过SQL_TEXT字段可以清晰地看到该线程正在执行的SQL语句是什么。如果是生产环境,那么现在你可以去找相关的开发人员进行交涉,下次碰到类似的语句必须及时提交,避免再次发生类似的问题。

6.2.3 找出谁持有表级锁

表级锁对应的instruments(wait/lock/table/sql/handler)默认已启用,对应的consumers为performance_schema.table_handles,在setup_consumers中只受全局配置项global_instrumentation控制,默认已启用。所以,在默认情况下,只需要设置系统配置参数performance_schema=ON即可。下面通过一个示例来演示如何找出谁持有表级锁。

首先,开启两个会话,第一个会话(会话1)对一个表(InnoDB引擎)执行显式加表级锁,第二个会话(会话2)对该表执行DML语句操作。

# 会话1,加表级锁
mysql> use sbtest
Database changed
mysql> select connection_id();
+-----------------+
| connection_id()|
+-----------------+
| 18               |
+-----------------+
1 row in set(0.00 sec)
mysql> lock table sbtest1 read;
Query OK, 0 rows affected(0.00 sec)
# 会话2,对该表执行update更新
mysql> use sbtest
Database changed
mysql> select connection_id();
+-----------------+
| connection_id()|
+-----------------+
| 19               |
+-----------------+
1 row in set(0.00 sec)
mysql> update sbtest1 set pad='xxx' where id=1; # 被阻塞

然后,开启第三个会话(会话3),使用show processlist;语句查询线程信息,可以发现update语句正在等待MDL锁(Waiting for table metadata lock)。

mysql> show processlist;
+----+---------+-----------+----------+---------+----------+----------+------------+
| Id | User    | Host      | db       | Command | Time     | State    | Info       |
+----+---------+-----------+----------+---------+----------+----------+------------+
|3   |qfsys   |192.168.2.168:41042   |NULL     |Binlog Dump|18565  |Master has sent all binlog to slave; waiting for more updates | NULL |
    | 18 | root    | localhost | sbtest   | Sleep   | 67        |           | NULL     |
    |19  |root    |localhost|sbtest   |Query    |51         |Waiting for table metadata lock | update sbtest1 set pad='xxx' where id=1 |
    | 20 | root    | localhost | NULL   | Query     | 0  | starting | show processlist |
    +----+---------+-----------+----------+----------+----------+---------+------------+
    4 rows in set(0.00 sec)

既然是等待MDL锁,那么在会话3中查询performance_schema.metadata_locks表,记录的顺序代表持有锁的时间顺序。

mysql> select * from performance_schema.metadata_locks where OWNER_THREAD_ID! =sys.ps_thread_id(connection_id())\G
 *************************** 1. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: sbtest
          OBJECT_NAME: sbtest1
 OBJECT_INSTANCE_BEGIN: 140622530920576
          LOCK_TYPE: SHARED_READ_ONLY
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: GRANTED
              SOURCE: sql_parse.cc:5996
      OWNER_THREAD_ID:113  # 内部ID113的线程被授予了SHARED_READ_ONLY,持有该锁的线程不允许其他线程修改sbtest1表的数据
      OWNER_EVENT_ID: 11
 *************************** 2. row ***************************
          OBJECT_TYPE: GLOBAL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
 OBJECT_INSTANCE_BEGIN: 140620517607728
          LOCK_TYPE: INTENTION_EXCLUSIVE
        LOCK_DURATION: STATEMENT
          LOCK_STATUS: GRANTED
              SOURCE: sql_base.cc:3190
      OWNER_THREAD_ID: 114  # 内部ID114的线程被授予了INTENTION_EXCLUSIVE,但这只是一个意向锁
      OWNER_EVENT_ID: 12
 *************************** 3. row ***************************
          OBJECT_TYPE: TABLE
        OBJECT_SCHEMA: sbtest
          OBJECT_NAME: sbtest1
 OBJECT_INSTANCE_BEGIN: 140620517607824
          LOCK_TYPE: SHARED_WRITE
        LOCK_DURATION: TRANSACTION
          LOCK_STATUS: PENDING
              SOURCE: sql_parse.cc:5996
      OWNER_THREAD_ID: 114  # 内部ID114的线程正在等待被授予SHARED_WRITE
      OWNER_EVENT_ID: 12
 3 rows in set(0.00 sec)

排查陷入僵局,我们知道MDL锁非常常见,对表的绝大部分操作都会先加MDL锁(performance_schema.metadata_locks表中记录的锁信息也不顶用了)。通常看到这些信息时,我们可能会立刻想到需要查询information_schema下的三个关于InnoDB引擎的锁和事务信息表(INNODB_LOCK_WAITS、INNODB_LOCKS、INNODB_TRX),我们尝试查看这三个表(在会话3中执行),可是发现都没有记录。

mysql> select * from information_schema.INNODB_TRX;
Empty set(0.00 sec)
mysql> select * from information_schema.INNODB_LOCKS;
Empty set, 1 warning(0.00 sec)
mysql> select * from information_schema.INNODB_LOCK_WAITS;
Empty set, 1 warning(0.00 sec)

当然,可能有的人会说,就4个线程,第二个会话的“Command”为Sleep,应该是它,把它杀掉试试看。是的,在该案例中确实可以做这个尝试,但如果是在生产环境中有数十个、上百个正常的长连接处于Sleep状态,该怎么办呢?这时就不能挨个去尝试了,我们可以尝试查询一些表级别的锁信息(通过会话3查询performance_schema.table_handles表)。

mysql> select * from performance_schema.table_handles where OWNER_THREAD_ID! =0\G
*************************** 1. row ***************************
        OBJECT_TYPE: TABLE
      OBJECT_SCHEMA: sbtest
        OBJECT_NAME: sbtest1
OBJECT_INSTANCE_BEGIN: 140622530923216
      OWNER_THREAD_ID: 113
      OWNER_EVENT_ID: 11
      INTERNAL_LOCK: NULL
      EXTERNAL_LOCK:READ EXTERNAL# 发现内部ID113的线程持有了sbtest1表的READ EXTERNAL表级锁,这也是为什么内部ID114的线程无法获取到MDL写锁的原因
1 row in set(0.00 sec)

通过上面查询到的相关数据,我们知道113线程对sbtest1表显式加了表级读锁,而且长时间处于Sleep状态,但是并不知道该线程正在执行什么SQL语句,我们可以通过performance_schema.events_statements_current表查询。

mysql> select * from performance_schema.events_statements_current where thread_id=113\G
*************************** 1. row ***************************
            THREAD_ID: 113
            EVENT_ID: 10
          END_EVENT_ID: 10
            EVENT_NAME: statement/sql/lock_tables
              SOURCE: socket_connection.cc:101
          TIMER_START: 18503556405463000
            TIMER_END: 18503556716572000
            TIMER_WAIT: 311109000
            LOCK_TIME: 293000000
            SQL_TEXT: lock table sbtest1 read # 这里可以看到,内部ID113的线程对表sbtest1执行了加读锁语句
            DIGEST: 9f987e807ca36e706e33275283b5572b
          DIGEST_TEXT: LOCK TABLE `sbtest1` READ
        CURRENT_SCHEMA: sbtest
......
1 row in set(0.00 sec)

从performance_schema.events_statements_current表的查询信息中,通过SQL_TEXT字段可以清晰地看到该线程正在执行的SQL语句是什么。如果是生产环境,现在你可以去找相关的开发人员确认,如果没有什么特殊操作,就可以尝试杀掉这个线程(在会话3中执行,processlist_id为18),同时针对这个问题进行优化,避免再发生类似的情况。

# 如何知道内部ID113的线程对应的process id是多少呢?可以通过performance_schema.threads表查询
mysql> select processlist_id from performance_schema.threads where thread_id=113;
+----------------+
| processlist_id |
+----------------+
| 18             |
+----------------+
1 row in set(0.00 sec)
# 执行kill
mysql> kill 18;
Query OK, 0 rows affected(0.00 sec)
mysql> show processlist;
+----+---------+----------+----------+-------------+---------+----------+----------+
| Id | User    | Host      | db       | Command     | Time    | State    | Info     |
+----+---------+----------+----------+-------------+---------+----------+----------+
| 3   | qfsys   | 192.168.2.168:41042  | NULL | Binlog Dump | 18994 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 19 | root   | localhost | sbtest   | Sleep | 480   |         | NULL      |
| 20 | root   | localhost | NULL    | Query    | 0  | starting | show processlist  |
+----+--------+------------+---------+-------------+----------+---------+----------+
3 rows in set(0.00 sec)
# 返回执行update语句的会话2,语句已经执行成功
mysql> update sbtest1 set pad='xxx' where id=1;
Query OK, 0 rows affected(7 min 50.23 sec)
Rows matched: 0 Changed: 0 Warnings: 0

6.2.4 找出谁持有行级锁

该案例中涉及的performance_schema.data_lock表是MySQL 8.0中新增的,在8.0之前的版本中不支持,这里仅作为针对MySQL 5.7的performance_schema的一个延伸学习

如果一个事务长时间未提交,我们虽然可以从information_schema.innodb_trx、performance_schema.events_transactions_current等表中查询到相应的事务信息,但却无从知道这个事务持有了哪些锁。虽然information_schema.innodb_locks表是用于记录事务的锁信息的,但需要在两个不同的事务发生锁等待时该表才会记录两个事务的锁信息。从MySQL 8.0开始,在performance_schema中提供了一个data_locks表用于记录任意事务的锁信息(同时废弃了information_schema.innodb_locks表),不需要有锁等待关系存在(注意,该表中只记录InnoDB存储引擎层的锁)。

首先,在MySQL 8.0中打开一个会话(会话1),显式开启一个事务。

mysql> use xiaoboluo
Database changed
mysql> select * from t_luoxiaobo limit 1;
+----+------+---------------------+
| id | test | datet_time          |
+----+------+---------------------+
|  2 | 1    | 2017-09-06 01:11:59 |
+----+------+---------------------+
1 row in set(0.00 sec)
mysql> begin;
Query OK, 0 rows affected(0.00 sec)
mysql> update t_luoxiaobo set datet_time=now()where id=2;
Query OK, 1 row affected(0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

接下来,打开另一个会话(会话2),查询data_locks表。

mysql> select * from data_locks\G
*************************** 1. row ***************************
            ENGINE: INNODB
      ENGINE_LOCK_ID: 55562:62
ENGINE_TRANSACTION_ID: 55562
          THREAD_ID: 54                # 持有线程内部ID
          EVENT_ID: 85
      OBJECT_SCHEMA: xiaoboluo        # 库名
        OBJECT_NAME: t_luoxiaobo       # 表名
      PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
        INDEX_NAME: NULL               # 索引名称
OBJECT_INSTANCE_BEGIN: 140439793477144
          LOCK_TYPE: TABLE             # 表级锁
          LOCK_MODE: IX                # IX
        LOCK_STATUS: GRANTED           # 被授予状态
          LOCK_DATA: NULL
*************************** 2. row ***************************
            ENGINE: INNODB
      ENGINE_LOCK_ID: 55562:2:4:2
ENGINE_TRANSACTION_ID: 55562
          THREAD_ID: 54                # 持有锁线程内部ID
          EVENT_ID: 85
      OBJECT_SCHEMA: xiaoboluo        # 库名
        OBJECT_NAME: t_luoxiaobo       # 表名
      PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
        INDEX_NAME: PRIMARY            # 索引为主键
OBJECT_INSTANCE_BEGIN: 140439793474104
          LOCK_TYPE: RECORD            # 记录锁
          LOCK_MODE: X                 # 排他锁
        LOCK_STATUS: GRANTED           # 被授予状态
          LOCK_DATA: 2  # 被锁定的数据记录,这里的记录对应的是 INDEX_NAME: PRIMARY value 2 rows in set(0.00 sec)

从查询结果中可以看到,有两行锁记录,第一行是对t_luoxiaobo表的IX锁,状态为GRANTED;第二行为使用主键索引的X锁,记录锁,状态为GRANTED。

现在,我们模拟两条DML语句发生锁等待的场景。新打开一个会话(会话3),在会话1中的事务未提交的情况下,会话3对t_luoxiaobo表执行同样的操作。

mysql> use xiaoboluo
Database changed
mysql> begin;
Query OK, 0 rows affected(0.00 sec)
mysql> update t_luoxiaobo set datet_time=now()where id=2;  # 被阻塞

回到会话2中,查询data_locks表,可以发现有4行锁记录。

mysql> select * from performance_schema.data_locks\G
*************************** 1. row ***************************
......
          THREAD_ID: 55
......
          LOCK_TYPE: TABLE
          LOCK_MODE: IX
        LOCK_STATUS: GRANTED
          LOCK_DATA: NULL
*************************** 2. row ***************************
            ENGINE: INNODB
      ENGINE_LOCK_ID: 55563:2:4:2
ENGINE_TRANSACTION_ID: 55563
          THREAD_ID: 55  # 内部线程ID
          EVENT_ID: 8
      OBJECT_SCHEMA: xiaoboluo
        OBJECT_NAME: t_luoxiaobo
      PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
        INDEX_NAME: PRIMARY  # 锁记录发生在哪个索引上
OBJECT_INSTANCE_BEGIN: 140439793480168
          LOCK_TYPE: RECORD  # 记录锁
          LOCK_MODE: X  # 排他锁
        LOCK_STATUS: WAITING  # 正在等待锁被授予
          LOCK_DATA: 2 # 锁定的索引value,这里与内部ID54的线程持有的主键值为2X锁完全一样,说明这里就是被内部ID54的线程阻塞了
*************************** 3. row ***************************
......
          THREAD_ID: 54
.......
          LOCK_TYPE: TABLE
          LOCK_MODE: IX
        LOCK_STATUS: GRANTED
          LOCK_DATA: NULL
*************************** 4. row ***************************
......
          THREAD_ID: 54
          EVENT_ID: 85
      OBJECT_SCHEMA: xiaoboluo
        OBJECT_NAME: t_luoxiaobo
      PARTITION_NAME: NULL
    SUBPARTITION_NAME: NULL
        INDEX_NAME: PRIMARY
OBJECT_INSTANCE_BEGIN: 140439793474104
          LOCK_TYPE: RECORD
          LOCK_MODE: X
        LOCK_STATUS: GRANTED
          LOCK_DATA: 2
4 rows in set(0.00 sec)

从上面的查询数据可以看到,performance_schema.data_locks表中新增了线程ID为55的两行锁记录,IX锁状态为GRANTED, X锁状态为WAITING,说明正在等待锁被授予。但这里并不能很直观地查看到锁等待关系,我们可以使用sys.innodb_lock_waits视图查看。

mysql> select * from sys.innodb_lock_waits\G
*************************** 1. row ***************************
              wait_started: 2018-01-14 21:51:59
                  wait_age: 00:00:11
            wait_age_secs: 11
              locked_table: `xiaoboluo`.`t_luoxiaobo`
      locked_table_schema: xiaoboluo
        locked_table_name: t_luoxiaobo
      locked_table_partition: NULL
  locked_table_subpartition: NULL
              locked_index: PRIMARY
              locked_type: RECORD
            waiting_trx_id: 55566
      waiting_trx_started: 2018-01-14 21:51:59
          waiting_trx_age: 00:00:11
waiting_trx_rows_locked: 1
  waiting_trx_rows_modified: 0
              waiting_pid: 8
            waiting_query: update t_luoxiaobo set datet_time=now()where id=2
          waiting_lock_id: 55566:2:4:2
        waiting_lock_mode: X
          blocking_trx_id: 55562
              blocking_pid: 7
            blocking_query: NULL
          blocking_lock_id: 55562:2:4:2
        blocking_lock_mode: X
        blocking_trx_started: 2018-01-14 21:34:44
          blocking_trx_age: 00:17:26
    blocking_trx_rows_locked: 1
  blocking_trx_rows_modified: 1
    sql_kill_blocking_query: KILL QUERY 7
sql_kill_blocking_connection: KILL 7
1 row in set(0.02 sec)

提示:在MySQL 5.7版本中,也可以使用sys.innodb_lock_waits视图查看,但是在MySQL 8.0中,该视图联结查询的表不同(把之前版本中使用的information_schema.innodb_locks表和information_schema.innodb_lock_waits表替换为了performance_ schema.data_locks表和performance_schema.data_lock_waits表)。另外,在MySQL 5.6及之前的版本中,在默认情况下并没有sys库,我们可以使用如下语句代替。

SELECT r.trx_wait_started AS wait_started,
      TIMEDIFF(NOW(), r.trx_wait_started)AS wait_age,
      TIMESTAMPDIFF(SECOND, r.trx_wait_started, NOW())AS wait_age_secs,
      rl.lock_table AS locked_table,
      rl.lock_index AS locked_index,
      rl.lock_type AS locked_type,
      r.trx_id AS waiting_trx_id,
      r.trx_started as waiting_trx_started,
      TIMEDIFF(NOW(), r.trx_started)AS waiting_trx_age,
      r.trx_rows_locked AS waiting_trx_rows_locked,
      r.trx_rows_modified AS waiting_trx_rows_modified,
      r.trx_mysql_thread_id AS waiting_pid,
      sys.format_statement(r.trx_query)AS waiting_query,
      rl.lock_id AS waiting_lock_id,
      rl.lock_mode AS waiting_lock_mode,
      b.trx_id AS blocking_trx_id,
      b.trx_mysql_thread_id AS blocking_pid,
      sys.format_statement(b.trx_query)AS blocking_query,
      bl.lock_id AS blocking_lock_id,
      bl.lock_mode AS blocking_lock_mode,
      b.trx_started AS blocking_trx_started,
      TIMEDIFF(NOW(), b.trx_started)AS blocking_trx_age,
      b.trx_rows_locked AS blocking_trx_rows_locked,
      b.trx_rows_modified AS blocking_trx_rows_modified,
      CONCAT('KILL QUERY ', b.trx_mysql_thread_id)AS sql_kill_blocking_query,
      CONCAT('KILL ', b.trx_mysql_thread_id)AS sql_kill_blocking_connection
  FROM information_schema.innodb_lock_waits w
      INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
      INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id
      INNER JOIN information_schema.innodb_locks bl ON bl.lock_id = w.blocking_lock_id
      INNER JOIN information_schema.innodb_locks rl ON rl.lock_id = w.requested_lock_id
ORDER BY r.trx_wait_started;