mysql deadlock when concurrent insert

MySQL中innodb引擎的锁有多种,具体可查看官方文档,这次碰到的死锁与Next-Key Locks没有关系,与Record LocksGap Locks有关,下面引用innodb锁的部分文档说明。

Shared and Exclusive Locks

InnoDB implements standard row-level locking where there are two types of locks, shared (S) locks and exclusive (X) locks.

  • A shared (S) lock permits the transaction that holds the lock to read a row.
  • An exclusive (X) lock permits the transaction that holds the lock to update or delete a row.

Intention Locks

InnoDB supports multiple granularity locking which permits coexistence of row-level locks and locks on entire tables. To make locking at multiple granularity levels practical, additional types of locks called intention locks are used. Intention locks are table-level locks in InnoDB that indicate which type of lock (shared or exclusive) a transaction requires later for a row in that table. There are two types of intention locks used in InnoDB (assume that transaction T has requested a lock of the indicated type on table t):

  • Intention shared (IS): Transaction T intends to set S locks on individual rows in table t.
  • Intention exclusive (IX): Transaction T intends to set X locks on those rows.

For example, SELECT ... LOCK IN SHARE MODE sets an IS lock and SELECT ... FOR UPDATE sets an IX lock.

Record Locks

A record lock is a lock on an index record. For example, SELECT c1 FROM t WHERE c1 = 10 FOR UPDATE; prevents any other transaction from inserting, updating, or deleting rows where the value of t.c1 is 10.

Record locks always lock index records, even if a table is defined with no indexes. For such cases, InnoDB creates a hidden clustered index and uses this index for record locking. See Section 14.8.2.1, "Clustered and Secondary Indexes".

Transaction data for a record lock appears similar to the following in SHOW ENGINE INNODB STATUS and InnoDB monitor output:

RECORD LOCKS space id 58 page no 3 n bits 72 index PRIMARY of table test.t trx id 10078 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 00000000274f; asc 'O;;
2: len 7; hex b60000019d0110; asc ;;

Gap Locks

A gap lock is a lock on a gap between index records, or a lock on the gap before the first or after the last index record. For example, SELECT c1 FROM t WHERE c1 BETWEEN 10 and 20 FOR UPDATE; prevents other transactions from inserting a value of 15 into column t.c1, whether or not there was already any such value in the column, because the gaps between all existing values in the range are locked.

A gap might span a single index value, multiple index values, or even be empty.

Gap locks are part of the tradeoff between performance and concurrency, and are used in some transaction isolation levels and not others.

Gap locking is not needed for statements that lock rows using a unique index to search for a unique row. (This does not include the case that the search condition includes only some columns of a multiple-column unique index; in that case, gap locking does occur.) For example, if the id column has a unique index, the following statement uses only an index-record lock for the row having id value 100 and it does not matter whether other sessions insert rows in the preceding gap:

  • SELECT * FROM child WHERE id = 100;

If id is not indexed or has a nonunique index, the statement does lock the preceding gap.

It is also worth noting here that conflicting locks can be held on a gap by different transactions. For example, transaction A can hold a shared gap lock (gap S-lock) on a gap while transaction B holds an exclusive gap lock (gap X-lock) on the same gap. The reason conflicting gap locks are allowed is that if a record is purged from an index, the gap locks held on the record by different transactions must be merged.
Gap locks in InnoDB are “purely inhibitive”, which means they only stop other transactions from inserting to the gap. They do not prevent different transactions from taking gap locks on the same gap. Thus, a gap X-lock has the same effect as a gap S-lock.

Gap locking can be disabled explicitly. This occurs if you change the transaction isolation level to READ COMMITTED or enable the innodb_locks_unsafe_for_binlog system variable (which is now deprecated). Under these circumstances, gap locking is disabled for searches and index scans and is used only for foreign-key constraint checking and duplicate-key checking.

  • 数据库在READ-COMMITTED隔离级别,对于外键约束和唯一键约束仍然会使用gap locking

There are also other effects of using the READ COMMITTED isolation level or enabling innodb_locks_unsafe_for_binlog. Record locks for nonmatching rows are released after MySQL has evaluated the WHERE condition. For UPDATE statements, InnoDB does a "semi-consistent" read, such that it returns the latest committed version to MySQL so that MySQL can determine whether the row matches the WHERE condition of the UPDATE.

Next-Key Locks

By default, InnoDB operates in REPEATABLE READ transaction isolation level. In this case, InnoDB uses next-key locks for searches and index scans, which prevents phantom rows (see Section 14.5.4, "Phantom Rows").

InnoDB Locks

IS/IX/S/X锁兼容矩阵:

X IX S IS
X 冲突 冲突 冲突 冲突
IX 冲突 兼容 冲突 兼容
S 冲突 冲突 兼容 兼容
IS 冲突 兼容 兼容 兼容

当InnoDB在判断行锁是否冲突的时候, 除了最基本的IS/IX/S/X锁的冲突判断外, InnoDB还将锁细分为如下几种子类型:

  1. Record Locks (RK)
    • 记录锁, 仅仅锁住索引记录的一行。
  2. Gap Locks (GK)
    • 区间锁, 仅仅锁住一个区间(开区间)。
    • READ-COMMITTED隔离级别下,对复合唯一索引也会有gap lock。
  3. Insert Intention Locks (IK)
    • Gap Locks中存在一种插入意向锁(Insert Intention Lock),在INSERT操作时产生。
  4. Next-Key Locks (NK)
    • Record Locks + Gap Locks, 半开半闭区间。
    • Next-Key Locks是为防止幻读的发生,而只有REPEATABLE-READ以及以上隔离级别才能防止幻读,所以在READ-COMMITTED隔离级别下面没有Next-Key Locks

RK/GK/IK/NK锁兼容矩阵:

Request Lock与Granted Lock之间的兼容矩阵:

RK GK IK NK
RK 冲突 兼容 兼容 冲突
GK 兼容 兼容 兼容 兼容
IK 兼容 冲突 兼容 冲突
NK 冲突 兼容 兼容 冲突

问题重现

应用程序集群有3台机器,由于并发的问题,3台机器同时往数据里插入记录,并且在插入数据过程有其他的数据校验,因为数据校验失败,事务需要回滚,而导致死锁产生。

死锁问题产生的SHOW ENGINE INNODB STATUS提示内容与网上的文章:并发insert操作导致的dead lock基本是一样的,事务隔离级别为READ-COMMITTED,有2个字段联合的唯一索引,状态提及内容中涉及Gap Lock:

------------------------LATEST DETECTED DEADLOCK------------------------2017-09-15 09:35:16 7fdb8b6fb700*** (1) TRANSACTION:TRANSACTION 606898669, ACTIVE 0.036 sec insertingmysql tables in use 1, locked 1LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1LOCK BLOCKING MySQL thread id: 5706233 block 5707263MySQL thread id 5707263, OS thread handle 0x7fdb867be700, query id 4809191957 10.28.30.213 dbadmin updateinsert into `t_deadlock` (user_id,type,value,active,created) values(3016018,'ADMIN','500',1,'2017-09-15 09:35:16')*** (1) WAITING FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 543110 page no 15540 n bits 584 index `ix_user_id_type` of table `test`.`t_deadlock` trx id 606898669 lock_mode X locks gap before rec insert intention waitingRecord lock, heap no 508 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 4; hex 802e0552; asc . R;; 1: len 13; hex 77785f73756273637269626564; asc subscribed;; 2: len 4; hex 8027c73d; asc ' =;;*** (2) TRANSACTION:TRANSACTION 606898668, ACTIVE 0.057 sec insertingmysql tables in use 1, locked 14 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1MySQL thread id 5706233, OS thread handle 0x7fdb8b6fb700, query id 4809191891 10.172.25.211 dbadmin updateinsert into `t_deadlock` (user_id,type,value,active,created) values(3016018,'ADMIN','2000',1,'2017-09-15 09:35:15.973')*** (2) HOLDS THE LOCK(S):RECORD LOCKS space id 543110 page no 15540 n bits 584 index `ix_user_id_type` of table `test`.`t_deadlock` trx id 606898668 lock mode S locks gap before recRecord lock, heap no 508 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 4; hex 802e0552; asc . R;; 1: len 13; hex 77785f73756273637269626564; asc subscribed;; 2: len 4; hex 8027c73d; asc ' =;;*** (2) WAITING FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 543110 page no 15540 n bits 584 index `ix_user_id_type` of table `test`.`t_deadlock` trx id 606898668 lock_mode X locks gap before rec insert intention waitingRecord lock, heap no 508 PHYSICAL RECORD: n_fields 3; compact format; info bits 0 0: len 4; hex 802e0552; asc . R;; 1: len 13; hex 77785f73756273637269626564; asc subscribed;; 2: len 4; hex 8027c73d; asc ' =;;*** WE ROLL BACK TRANSACTION (2)------------TRANSACTIONS------------

上述信息中关键的3个锁记录如下:

RECORD LOCKS space id 543110 page no 15540 n bits 584 index ix_user_id_type of table test.t_deadlock trx id 606898669 lock_mode X locks gap before rec insert intention waiting
RECORD LOCKS space id 543110 page no 15540 n bits 584 index ix_user_id_type of table test.t_deadlock trx id 606898668 lock mode S locks gap before rec
RECORD LOCKS space id 543110 page no 15540 n bits 584 index ix_user_id_type of table test.t_deadlock trx id 606898668 lock_mode X locks gap before rec insert intention waiting

MySQL 环境

mysql> SELECT version();+-----------+| version() |+-----------+| 5.6.24    |+-----------+1 row in set (0.00 sec)mysql> SHOW VARIABLES LIKE 'innodb_locks_unsafe_for_binlog';+--------------------------------+-------+| Variable_name                  | Value |+--------------------------------+-------+| innodb_locks_unsafe_for_binlog | OFF   |+--------------------------------+-------+1 row in set (0.01 sec)mysql> SELECT @@global.tx_isolation, @@session.tx_isolation, @@tx_isolation;+-----------------------+------------------------+----------------+| @@global.tx_isolation | @@session.tx_isolation | @@tx_isolation |+-----------------------+------------------------+----------------+| READ-COMMITTED        | READ-COMMITTED         | READ-COMMITTED |+-----------------------+------------------------+----------------+1 row in set (0.00 sec)

设置隔离级别

mysql> SET GLOBAL tx_isolation = 'READ-COMMITTED';Query OK, 0 rows affected (0.01 sec)mysql> SET SESSION tx_isolation = 'READ-COMMITTED';Query OK, 0 rows affected (0.00 sec)

在每个mysql session开启后都执行以下命令,mysql默认的隔离级别一般是REPEATABLE-READ

SET SESSION tx_isolation = 'READ-COMMITTED';

创建测试表

USE test;CREATE TABLE deadlock (  id bigint(20) NOT NULL AUTO_INCREMENT,  a smallint(5) unsigned NOT NULL DEFAULT '0',  b int(11) NOT NULL DEFAULT '0',  c int(11) NOT NULL DEFAULT '0',  d datetime NOT NULL DEFAULT '0000-00-00 00:00:00',  PRIMARY KEY (id),  UNIQUE KEY uniq_b_c_a (b,c,a)) ENGINE=InnoDB DEFAULT CHARSET=utf8;

复现死锁过程

T1(140104) T2(140105) T3(140106)
BEGIN; BEGIN; BEGIN;
INSERT INTO deadlock(a,b,c) VALUES(1,2,4);
INSERT INTO deadlock(a,b,c) VALUES(1,2,4);
INSERT INTO deadlock(a,b,c) VALUES(1,2,4);
ROLLBACK;
Query OK, 1 row affected (13.10 sec)
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction  

在事务T1没有ROLLBACK时,可以查看数据库引擎innodb锁的情况如下:

mysql> SELECT * FROM information_schema.INNODB_LOCKS;+----------------+-------------+-----------+-----------+-------------------+------------+------------+-----------+----------+-----------+| lock_id        | lock_trx_id | lock_mode | lock_type | lock_table        | lock_index | lock_space | lock_page | lock_rec | lock_data |+----------------+-------------+-----------+-----------+-------------------+------------+------------+-----------+----------+-----------+| 140106:270:4:3 | 140106      | S         | RECORD    | `test`.`deadlock` | unq_b_c_a  |        270 |         4 |        3 | 2, 4, 1   || 140104:270:4:3 | 140104      | X         | RECORD    | `test`.`deadlock` | unq_b_c_a  |        270 |         4 |        3 | 2, 4, 1   || 140105:270:4:3 | 140105      | S         | RECORD    | `test`.`deadlock` | unq_b_c_a  |        270 |         4 |        3 | 2, 4, 1   |+----------------+-------------+-----------+-----------+-------------------+------------+------------+-----------+----------+-----------+3 rows in set (0.01 sec)

如果T1未ROLLBACK,而是COMMIT的话,T2和T3会报唯一键冲突,提示内容如下:

ERROR 1062 (23000): Duplicate entry '2-3-1' for key 'unq_b_c_a'

SHOW ENGINE INNODB STATUS

mysql> SHOW ENGINE INNODB STATUS\G------------------------LATEST DETECTED DEADLOCK------------------------2017-09-16 09:50:09 12ff74000*** (1) TRANSACTION:TRANSACTION 140105, ACTIVE 32 sec insertingmysql tables in use 1, locked 1LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1MySQL thread id 154, OS thread handle 0x12ffb8000, query id 2318 localhost root updateinsert into deadlock(a,b,c) values(1,2,4)*** (1) WAITING FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 270 page no 4 n bits 72 index `unq_b_c_a` of table `test`.`deadlock` trx id 140105 lock_mode X insert intention waitingRecord lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;;*** (2) TRANSACTION:TRANSACTION 140106, ACTIVE 17 sec insertingmysql tables in use 1, locked 14 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1MySQL thread id 153, OS thread handle 0x12ff74000, query id 2320 localhost root updateinsert into deadlock(a,b,c) values(1,2,4)*** (2) HOLDS THE LOCK(S):RECORD LOCKS space id 270 page no 4 n bits 72 index `unq_b_c_a` of table `test`.`deadlock` trx id 140106 lock mode SRecord lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;;*** (2) WAITING FOR THIS LOCK TO BE GRANTED:RECORD LOCKS space id 270 page no 4 n bits 72 index `unq_b_c_a` of table `test`.`deadlock` trx id 140106 lock_mode X insert intention waitingRecord lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;;*** WE ROLL BACK TRANSACTION (2)----------------------------END OF INNODB MONITOR OUTPUT============================1 row in set (0.01 sec)

问题原因

  1. SHOW ENGINE INNODB STATUS\G看到的DEADLOCK相关信息,只会返回最后的2个事务的信息,而其实有可能有更多的事务才最终导致的死锁。
  2. 当有3个(或以上)事务对相同的表进行insert操作,如果insert对应的字段上有uniq key约束并且第一个事务ROLLBACK了,那其中一个将返回死锁错误信息。
  3. 死锁的原因
    • T1 获得 X 锁并 INSERT 成功
    • T2 试图 INSERT, 检查重复键需要获得 S 锁, 但试图获得 S 锁失败, 加入等待队列, 等待 T1
    • T3 试图 INSERT, 检查重复键需要获得 S 锁, 但试图获得 S 锁失败, 加入等待队列, 等待 T1
    • T1 ROLLBACK, T1 释放锁, 此后 T2, T3 获得 S 锁成功, 检查 duplicate-key, 之后 INSERT 试图获得 X 锁, 但 T2, T3 都已经获得 S 锁, 导致 T2, T3 死锁
  4. 避免此DEADLOCK
    • 我们都知道死锁的问题通常都是业务处理的逻辑造成的,既然是uniq key,同时多台不同服务器上的相同程序对其 INSERT 一模一样的value,这本身逻辑就不太完美。
  5. 故解决此问题:
    • 保证业务程序别再同一时间点并发的插入相同的值到相同的uniq key的表中
    • 上述实验可知,是由于第一个事务ROLLBACK了才产生的DEADLOCK,查明ROLLBACK的原因
    • 尽量减少完成事务的时间

References

  1. 14.5.1 InnoDB Locking
  2. 通过InnoDB监控状态分析锁占用
  3. mysql insert锁机制
  4. 并发insert操作导致的dead lock
  5. MySQL gap locks/next-key locks浅析