Even when the output of EXPLAIN doesn’t show “using temporary”, a temporary file may still be used in certain cases.
That’s not to say the query needs the temporary file to actually resolve the query (like what you’d see from the need for a derived table). But rather, the temporary file I’m speaking of is due to binary logging.
In particular, you can see this easily if using InnoDB, (most commonly) row-based binary logging, and you issue a large transaction, say a large
UPDATE (large meaning something larger than the size of binlog_cache_size). In this case, you’ll notice a temporary file being created in the MySQL tmpdir (if you were using
lsof or something similar to monitor the tmpdir).
If you have row-based binary logging enabled, this file would have a name like
ML*. If you are not using row-based binary logging, it will have a file name like
MY* (unlike the
#sql* tmp files most of us are accustomed to).
Say you run a large
UPDATE which updates millions of rows (note the
UPDATE is a single transaction in itself). If you have row-based binary log format enabled, then each row is written to the binary log. However, before being written to the binary log, the changes are first written to the binlog_cache (i.e., it will not write the tranaction to the binary log until the transaction commits).
With a small value of
binlog_cache_size (which the default is only 32K), it is easy to overrun this cache, hence you end up with a temporary file being created/used to store this information until the transaction completes.
So, to avoid this, you have a couple of options:
- o Increase binlog_cache_size (max is 4G)
- o Change overall binary log format to MIXED mode
- o Dynamically change session to MIXED/STATEMENT mode before running the query
(In either of the latter 2 options, the query must be deterministic.)
binlog_cache_size, there are a couple of things you can look at.
In the general case, you can monitor this cache from the
SHOW GLOBAL STATUS:
mysql> show global status like 'binlog%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Binlog_cache_disk_use | 652065 | | Binlog_cache_use | 38271899 | +-----------------------+----------+
In this case, you could compute the efficiency:
((1 - (652065/38271899))*100) = 98.3%.
This is not a terrible ratio, but there is definitely room for improvement. So again, when tuning this in the general case, increase in smaller intervals, and monitor the changes. Then iteratively re-apply this process until you’re comfortable with the value/ratio.
Now, in the more specific case, suppose you were monitoring the tmpdir with lsof, then you might see something like this:
chris@chris-linux:~/mysql-5.5.8/bin$ lsof | grep tmpdir mysqld 17499 chris 5u REG 8,5 0 246876 /mysql-5.5.8/tmpdir/ibr26n5u (deleted) mysqld 17499 chris 6u REG 8,5 0 246877 /mysql-5.5.8/tmpdir/ibqsbazO (deleted) mysqld 17499 chris 7u REG 8,5 0 246943 /mysql-5.5.8/tmpdir/ibYCyW27 (deleted) mysqld 17499 chris 8u REG 8,5 0 246947 /mysql-5.5.8/tmpdir/ibnrd9xr (deleted) mysqld 17499 chris 12u REG 8,5 0 246948 /mysql-5.5.8/tmpdir/ibT0q36K (deleted) mysqld 17499 chris 36u REG 8,5 320057735 246960 /mysql-5.5.8/tmpdir/MLcPabAK (deleted)
The tmp files named
ib* are related to InnoDB, but that is another discussion.
The file in question is
MLcPabAK, which grew to over 300M while the
UPDATE was executing.
Thus, if I was only trying to eliminate the need for the tmp file in this instance, I know my
binlog_cache_size would need to exceed 300M.
Morals of the story:
If running with InnoDB and row-based replication, you should check the value of Binlog_cache_disk_use (and compare to binlog_cache_use) and increase
binlog_cache_size if necessary. You may find it a good idea to tune this anyway.
Further, if running large, deterministic UPDATES, you might want to consider changing the
binlog_format for the duration of said queries, to simply avoid all of the extra I/O.
Upcoming change to MySQL 5.5.9:
Begining with MySQL 5.5.9,
binlog_cache_size sets the size for the transaction cache only, and the size of the statement cache is governed by the binlog_stmt_cache_size system variable.
Click here to give this article a “Thumbs Up” on Planet MySQL !
Tags: #sql*, binary log, binlog_cache_disk_use, binlog_cache_size, binlog_cache_use, binlog_stmt_cache_size, chris calender, deterministic, innodb, large transaction, lsof, max_binlog_cache_size, max_binlog_stmt_cache_size, MIXED, ML*, MY*, MySQL, reduce I/O, row-based, skysql, STATEMENT, temp file, temporary file, tmp file, tmpdir