Using SHOW PROCESSLIST and mysqladmin debug Output in Conjunction with SHOW INNODB STATUS

When InnoDB appears hung, I know the natural reaction is to check SHOW ENGINE INNODB STATUS.

In fact, it’s the first thing I check when InnoDB tables are involved.

However, I just want to iterate how valuable SHOW FULL PROCESSLIST and/or mysqladmin debug outputs can be even when it seems mysqld is hung on on InnoDB table.

Two recent cases I’ve encountered illustrate why.

Case #1:

MySQL appeared hung on the following simple, single-row INSERT:

---TRANSACTION 0 2035648699, ACTIVE 76629 sec, process no 9047, 
OS thread id 3069426592, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
...
INSERT INTO test (id, parent, text) VALUES (180370, 70122, 'test table')

At least that’s what it seemed per the INNODB STATUS, but unfortunately, there wasn’t any further information to go on.

The next time it occurred, SHOW FULL PROCESSLIST was captured at the time.

Turns out, there was a *very* long SELECT running, but not from the same table, and no foreign keys (FKs) either. Turned out it was some crazy, auto-generated query that self-joined itself 548 times. So there were no locks, per se. This query itself held up everything, and thus also the INSERT.

Case #2:

This was a table that was also hanging on a certain, simple UPDATE. The UPDATE was based on te PK, so only one row was to be updated.

Yet, it hung, and it hung, longer than wait_timeout, interactive_timeout, and innodb_lock_wait_timeout. And there were no other transactions running in the INNODB STATUS.

Turned out, another client had issued a LOCK TABLE command on the table. Since LOCK TABLE is handled outside of the InnoDB storage engine, the lock doesn’t appear in SHOW INNODB STATUS output.

Using mysqladmin debug output, coupled with SHOW PROCESSLIST helped catch this offender.

At any rate, hope this helps, and happy troubleshooting. 🙂

 
 

Advanced InnoDB Deadlock Troubleshooting – What SHOW INNODB STATUS Doesn’t Tell You, and What Diagnostics You Should be Looking At

One common cause for deadlocks when using InnoDB tables is from the existence of foreign key constraints and the shared locks (S-lock) they acquire on referenced rows.

The reason I want to discuss them though is because they are often a bit tricky to diagnose, especially if you are only looking at the SHOW ENGINE INNODB STATUS output (which might be a bit counter-intuitive since one would expect it to contain this info).

Let me show a deadlock error to illustrate (below is from SHOW ENGINE INNODB STATUS\g):

------------------------
LATEST DETECTED DEADLOCK
------------------------
111109 20:10:03
*** (1) TRANSACTION:
TRANSACTION 65839, ACTIVE 19 sec, OS thread id 4264 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1024, 3 row lock(s), undo log entries 1
MySQL thread id 3, query id 74 localhost 127.0.0.1 root Updating
UPDATE parent SET age=age+1 WHERE id=1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6833 page no 3 n bits 72 index `PRIMARY` of table 
`test`.`parent` trx id 65839 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0

*** (2) TRANSACTION:
TRANSACTION 65838, ACTIVE 26 sec, OS thread id 768 starting index read, 
thread declared inside InnoDB 500
mysql tables in use 1, locked 1
7 lock struct(s), heap size 1024, 4 row lock(s), undo log entries 2
MySQL thread id 4, query id 75 localhost 127.0.0.1 root Updating
UPDATE parent SET age=age+1 WHERE id=2
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 6833 page no 3 n bits 72 index `PRIMARY` of table 
`test`.`parent` trx id 65838 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6833 page no 3 n bits 72 index `PRIMARY` of table 
`test`.`parent` trx id 65838 lock_mode X locks rec but not gap waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 0

*** WE ROLL BACK TRANSACTION (1)

Now, we do see a lot about what caused the deadlock above, but we are only seeing *half* of the picture.

Allow me to explain ..

First of all, note transaction #1 has been running for 19 seconds, while transaction #2 for 26 seconds. So, the output is referring to the newer transaction as #1 and the older as #2 (also somewhat counter-intuitive, but good to be aware of).

Now, what we can see clearly is this:

Transaction #1 (“UPDATE parent .. WHERE id=1”) is waiting on a lock from Transaction #2 (“UPDATE parent .. WHERE id=2”).

Thus TX #2 holds a lock (RECORD LOCKS space id 6833 page no 3 n bits 72 index, heap no 2), but is waiting on (RECORD LOCKS space id 6833 page no 3 n bits 72 index, heap no 3 – held by TX #1).

Clearly the 2 updates should not cause a conflict in and of themselves. Thus we know something must have happened earlier in the transaction(s).

For reference, here is how to reproduce it:

CREATE TABLE `parent` (
`id` INT NOT NULL AUTO_INCREMENT, 
`age` INT NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;

CREATE TABLE `child` (
`id` INT NOT NULL AUTO_INCREMENT, 
`age` INT NOT NULL,
`parent_id` INT NOT NULL,
PRIMARY KEY (`id`),
KEY `parent_id` (`parent_id`),
CONSTRAINT `fk_parent_id` FOREIGN KEY (`parent_id`) REFERENCES `parent` (`id`)
) ENGINE=InnoDB;

INSERT INTO parent (id, age) VALUES (1, 50);
INSERT INTO parent (id, age) VALUES (2, 60);
INSERT INTO child (id, age, parent_id) VALUES (1, 20, 1);
INSERT INTO child (id, age, parent_id) VALUES (2, 20, 1);

Then, open 2 connections (T1 and T2 – note order is opposite compared to what is shown in SHOW INNODB STATUS):

T1:

BEGIN;
UPDATE child SET age=age+1, parent_id=2 WHERE id=1;
UPDATE parent SET age=age+1 WHERE id=1;

T2:

BEGIN;
UPDATE child SET age=age+1, parent_id=2 WHERE id=2;
UPDATE parent SET age=age+1 WHERE id=1;

<-- T2 Hangs

T1:

UPDATE parent SET age=age+1 WHERE id=2;

<-- Deadlock (T1 completes, T2 is rolled back)

But why does this deadlock? Well, it is due to the foreign key. In fact, this example would not deadlock at all if no foreign key was defined on `parent`.`id`.

So then what exactly is preventing T2 from completing?

Here is what is happening behind the scenes, so-to-speak:

T1#1:

Obtains the following 2 locks:

X lock on `child`  where id=1 <-- due to the actual "UPDATE child" statement itself
S lock on `parent` where id=2 <-- due to the FK on parent.id

(Note this S lock means other S locks can be obtained on this row, but not X locks -- which is the crux of this issue).

T1#2:

Obtains the following lock:

X lock on `parent` where id=1 <-- due to the actual "UPDATE parent" statement itself

T2#1:

Obtains the following 2 locks:

X lock on `child`  where id=2 <-- due to the actual "UPDATE child" statement itself
S lock on `parent` where id=2 <-- due to the FK (again, this is okay since it is also a S-lock)

T2#2:

Tries to obtain the following, but hangs due to the existing X-lock from T1#2:

X lock on parent where id=1

T1#3:

Tries to obtain the following lock:

X lock on parent where id=2

However, since there are 2 S-locks on this row already (one from T1 and T2), and T1 now wants an X-lock on the same row, then there is a conflict.

Now this would normally just wait for the S-locks to be released, but since T2 is already "hung" waiting on the other lock to be released from T1, we now have the deadlock.

T1 wins the dispute, T2 rolls back thus releasing its locks, and T1 completes.

So all in all, quite a bit is going on there, but you only see about half of this information from the LATEST DETECTED DEADLOCK section of SHOW ENGINE INNODB STATUS output. And had I not posted the SHOW CREATE TABLE status (and prior tx statements), it'd be unclear as to what happened exactly.

Well, so how do you find out exactly what happened when locking problems happen to you?

SHOW INNODB STATUS only tells you so much. Furthermore, once the deadlock occurs, the winner moves on, and the loser is rolled back. *Meaning*, there is no longer any information about these "transactions" in the output as they are in the "past" now.

Therefore, in general, if you're having any locking issues (deadlocks, lock wait timeouts, hangs due to semaphore waits, and so forth), do yourself a favor and capture all of the following outputs at the time, if possible, to give you the best likelihood in tracking down the issue:

  1. SHOW ENGINE INNODB STATUS
    • This is generally very good, but it can get truncated, and simply may not contain every bit of info you need.
  2. Enable InnoDB Lock Monitor (enable the InnODb Lock Monitor by simply creating any innodb table named innodb_lock_monitor)
    • This logs a lot of extra lock information in the SHOW ENGINE INNODB STATUS output, but it can get truncated too.
  3. Run "mysqladmin debug"
    • Logs all lock info to the error log. Great because it logs all locks (i.e., none truncated) and it logs LOCK TABLE locks, which do not appear in SHOW INNODB STATUS even if on an InnoDB table, because LOCK TABLE is external to the InnoDB storage engine. Not so great because a bit cryptic to read, and I wouldn't solely reply on it, as it's often most helpful in conjuntion with other details).
  4. SHOW FULL PROCESSLIST
    • This will show all connected threads. Specifically, when it comes to 'hidden' locks, it would show a user that has been connected, but idle (but who could have issued a LOCK TABLE command).
  5. Error log
    • Of course, always check out the error log for messages and/or anything out of the ordinary. (Not to mention extra data will be logged to it from "mysqladmin debug" and innodb_lock_monitor.)
  6. SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCKS, INNODB_LOCK_WAITS, and INNODB_TRX tables
  7. SHOW CREATE TABLE outputs for each table involved