Archive for April, 2009

How to Recover Data using the InnoDB Recovery Tool

Wednesday, April 15th, 2009

As you may or may not know, there is a tool called the InnoDB Recovery Tool, which allows you to recover data from InnoDB tables when you cannot otherwise get at it.

“This set of tools could be used to check InnoDB tablespaces and to recover data from damaged tablespaces or from dropped/truncated InnoDB tables.”

http://code.google.com/p/innodb-tools/

This is a very handy tool; however, the documentation on how to use it is a bit limited when it comes to actually recovering the data, so I thought I’d post a step-by-step tutorial on how to use this tool.

1. Download the InnoDB Recovery Tool (latest version is 0.3)

2. Unpack the download to the location of your choice
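
For example, assuming the download is a gzipped tarball (the exact file name may differ):

tar xzf innodb-recovery-tool-0.3.tar.gz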

3. Create your table_defs.h file using the create_defs.pl script. Note that the command below creates a table_defs.h file based on only one table, t1, from the database named test:

cd innodb-recovery-tool-0.3/
./create_defs.pl --user=root --password=mysql --db=test --table=t1 > table_defs.h

4. Copy the newly created table_defs.h file to the innodb-recovery-tool-0.3/include/ directory.
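
For example (assuming you are still in the innodb-recovery-tool-0.3/ directory from the previous step):

cp table_defs.h include/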

5. Now it is time to build/compile the InnoDB Recovery Tool:

cd innodb-recovery-tool-0.3/mysql-source/
./configure
cd ..
make

At this point, you’re almost ready to begin recovering the data. However, let me point out a couple of items at this stage. The InnoDB Recovery Tool documentation says you can use the page_parser program (built in the previous step) to split the tablespace up into its individual pages. In my case, however, page_parser didn’t work as well as I expected, which could be due to the corruption in my tablespace files (ibdata1 and ibdata2). So I simply ran the recovery against the entire ibdata files instead, and found that I recovered much more data that way than against the split-up pages. If you opt for this method, then you can skip steps 6, 7, and 8.

6. Should you want to use the page_parser, here is how you run it:

cd innodb-recovery-tool-0.3/
./page_parser -f /home/chris/Desktop/test/ibdata1 /home/chris/Desktop/test/ibdata2 -5

Note that the -f indicates the file(s) to use, and the -5 indicates the ibdata files are from MySQL version 5.0.

7. Should you use the page_parser, you must also load the ibdata file(s) and capture the InnoDB table monitor output. This part is described in the InnoDB Tools how-to.
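
The InnoDB monitors are enabled by creating specially named InnoDB tables; the table monitor is the one whose output lists the index IDs you’ll need in the next step. A minimal sketch:

mysql> CREATE TABLE innodb_table_monitor (a INT) ENGINE=InnoDB;
mysql> DROP TABLE innodb_table_monitor;

While the innodb_table_monitor table exists, InnoDB writes the monitor output to the MySQL error log roughly every 15 seconds; drop the table once you have captured what you need.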

8. After running the above, you’ll want to capture all of the primary key index positions for each table you want to recover. For instance, you might see something like “0 135” for the index position of a primary key. This will correspond to the folder named “0-135” that is created by page_parser.

9. Now you are ready to recover the data for the first table.

(Note that you could create a table_defs.h file based on all of the tables you want to recover, and then recover all of the data at once. The problem with this is that the data is all mixed together into one big file, so you might have a row from one table followed by a row from another table. If you’re good with sed/awk, this might not be a problem for you, as you can split it apart afterwards — see the example below. However, it might be easier to create a separate table_defs.h file for each table, and then recover the data table-by-table.)
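
Since every recovered row begins with its table name followed by a tab, a one-liner like this (a sketch, assuming the combined dump was saved to output.txt) will pull out just the rows for table t1:

awk -F'\t' '$1 == "t1"' output.txt > t1.txt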

If you want to recover the data based on the page_parser output, then you would use the following command:

./constraints_parser -f /home/chris/Desktop/innodb-recovery-0.3/pages-1239037839/0-135/50-00000050.page -5 -V

Note that the -V is for verbose mode. It is best to use this initially to make sure the data being recovered looks correct. Once you’ve verified that it does, simply run the above command without the -V and pipe the output to a text file.
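
For example, to dump the rows from that same page to a file (the output file name here is an arbitrary choice):

./constraints_parser -f /home/chris/Desktop/innodb-recovery-0.3/pages-1239037839/0-135/50-00000050.page -5 > t1_output.txt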

Should you not want to use the page_parser, and just run constraints_parser directly against the ibdata file(s), then issue the following command instead:

./constraints_parser -f /home/chris/Desktop/test/ibdata1 /home/chris/Desktop/test/ibdata2 -5 > output.txt

As for the recovered data itself, note that it is dumped in a tab-delimited text format (this is the default, and is not yet configurable).

For instance, here is a sample of data recovered for the t1 table:

t1 128992703 84118144 301989888 224000 33558272 268435456 ""
t1 0 0 34796032 0 530 838926338 ""
t1 1886545261 268455808 256 497 880803840 2949392 ""
t1 1398034253 1953654117 1952672116 2037609569 1952801647 1970173042 ""
t1 402667648 755047491 1431524431 1296388657 825372977 825308725 "5"
t1 536884352 755050563 1431524431 1296388658 842150450 842162531 "t"
t1 671103872 755053635 1431524431 1296388663 926365495 926365495 "77"
t1 524288 0 755056707 1431524431 1296388705 1668573558 ""
t1 524288 0 755059779 1431524431 1296388705 1668573558 ""
t1 524288 0 755062851 1431524431 1296388705 1668573558 ""
t1 525312 0 755065923 1431524431 1296388705 1668573558 ""
t1 524288 0 755068995 1431524431 1296388705 1668573558 ""
t1 524288 0 755072067 1431524431 1296388705 1668573558 ""
t1 524288 0 755075139 1431524431 1296388705 1668573558 ""
t1 525312 0 755078211 1431524431 1296388705 1668573558 ""
t1 524288 0 755081283 1431524431 1296388705 1668573558 ""
t1 524288 0 755084355 1431524431 1296388705 1668573558 ""
t1 524288 0 755047491 1431524431 1296388705 1668573558 ""
t1 524288 0 755047491 1431524431 1296388705 1668573558 ""
t1 0 0 0 0 0 0 ""
t1 0 0 0 0 0 0 ""
t1 0 0 0 0 0 0 ""
t1 0 0 0 0 0 0 ""
t1 0 0 0 0 0 0 ""

You can see that each line is prepended with the table name (followed by a tab).

You can also see at the end of the above output that there are a number of empty rows. These are just garbage rows, and can be deleted either before you import the data or afterwards. You’ll see similar rows in most of the recovered tables’ data as well. However, don’t just delete from the end of the file, as actual data rows are scattered throughout the files.

I’d also suggest creating some temporary tables using the same CREATE TABLE statements, but without any keys or indexes. This will allow you to import the data more easily, and then you can clean it up with simple SQL statements. After that, you can simply add back your primary keys, indexes, and foreign keys.
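
For instance, here is a sketch of such a staging table for t1. The column names and types are hypothetical placeholders matching the seven columns shown in the sample output above; use the columns from your real CREATE TABLE statement, just without the keys and indexes:

mysql> CREATE TABLE t1_recovered (c1 INT, c2 INT, c3 INT, c4 INT, c5 INT, c6 INT, c7 VARCHAR(50));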

Should you follow my approach and do this per-table, then you just need to create the new table_defs.h file, copy it into include/, re-run make, and re-run constraints_parser just as you did above. Since the tool is now built with the new table_defs.h file, it will extract the data for that table; no other changes need to be made.

10. Format the dump file(s) so that they can be imported into the appropriate table(s) (see the sketch after step 12).

11. Import the data, and clean up the garbage rows.

12. Re-create any needed indexes and/or foreign keys.
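
Here is a sketch of steps 10 through 12, assuming the rows for t1 were split out into t1.txt as shown earlier, and using the hypothetical t1_recovered staging table from above. First strip the leading table-name column (cut splits on tabs by default):

cut -f2- t1.txt > t1.tsv

Then import, filter out the garbage rows (the DELETE condition here is just an illustration), and add back the keys once the data is clean:

mysql> LOAD DATA INFILE '/home/chris/Desktop/test/t1.tsv' INTO TABLE t1_recovered
    ->   FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"';
mysql> DELETE FROM t1_recovered WHERE c1 = 0 AND c2 = 0 AND c3 = 0 AND c4 = 0 AND c5 = 0 AND c6 = 0 AND c7 = '';
mysql> ALTER TABLE t1_recovered ADD PRIMARY KEY (c1);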

SHOW PROXY PROCESSLIST : Another Proxy Tutorial Revisited

Wednesday, April 15th, 2009

In a previous post, Another Proxy Tutorial, I outlined how you could find and display the originating IP address of a particular query from within MySQL Proxy.

This was a nice illustration of how to accomplish the above; however, in practice, there were a couple of limitations. For one, the queries were “tagged” with the originating IP address, so the query cache would only work for the same query coming from the same IP address. If you’re using the query cache, then this is not an optimal solution. Secondly, you could only see the IP address for a currently executing query, and not for an idle connection. Thus, when issuing a SHOW PROCESSLIST command, one would see the IP address only for queries actually being executed, and not for any of the idle connections.

This was bothersome to me, so I worked up a better and more efficient solution that addresses both of these concerns.

In this new solution, I implement a new command called SHOW PROXY PROCESSLIST. This leaves the SHOW PROCESSLIST output unchanged; instead, if you want to see the originating IP address of every connection listed in the SHOW PROCESSLIST output, you simply issue a SHOW PROXY PROCESSLIST command.

Here is an example of the output when using 3 different connections from differing IP addresses:

mysql> SHOW PROXY PROCESSLIST; SHOW PROCESSLIST;
+------+---------------------+-------------------+
| Id   | IP Address          | Time              |
+------+---------------------+-------------------+
| 2    | 127.0.0.1:49528     | 04/15/09 13:22:14 |
| 3    | 192.168.1.100:1322  | 04/15/09 13:22:23 |
| 4    | 192.168.1.103:59374 | 04/15/09 13:22:44 |
+------+---------------------+-------------------+
3 rows in set (0.00 sec)

+----+------+-----------------+------+---------+------+-------+------------------+
| Id | User | Host            | db   | Command | Time | State | Info             |
+----+------+-----------------+------+---------+------+-------+------------------+
|  1 | root | localhost:49526 | NULL | Sleep   |   13 |       | NULL             |
|  2 | root | localhost:49529 | NULL | Query   |    0 | NULL  | SHOW PROCESSLIST |
|  3 | root | localhost:49533 | NULL | Sleep   |   50 |       | NULL             |
|  4 | root | localhost:49547 | NULL | Sleep   |   18 |       | NULL             |
+----+------+-----------------+------+---------+------+-------+------------------+
4 rows in set (0.00 sec)

Here, you can see I simply called both commands in a single call.

The first output is the SHOW PROXY PROCESSLIST output. In it, you can see the “Id”, which corresponds to the “Id” in the SHOW PROCESSLIST output. You can also see the originating IP address along with the client source port, as well as the timestamp of when the connection was created.

To implement this, I simply capture and store the client address information in an internal table. I’m able to capture this from within the read_auth_result() function.

function read_auth_result( auth )
	local state = auth.packet:byte()
	if state == proxy.MYSQLD_PACKET_OK then
		initialize_process_table()
		table.insert( proxy.global.process[proxy.connection.server["thread_id"]],
			{ ip = proxy.connection.client["address"], ts = os.time() } )
	end
end

So, within the function read_auth_result(), if state == proxy.MYSQLD_PACKET_OK, then we know the connection was authenticated, and hence I log this information at that point.

Note that I’m storing the value of proxy.connection.client["address"], which is the originating IP address along with the client source port (e.g., 192.168.1.100:1322). The nice thing about this is that you can then track down the connection by port on the originating machine too.

(Also note that in MySQL Proxy 0.7.0, there are even more of these address variables available, so you could also capture, say, the client destination port or the server source port. For more on this, please see Jan Kneschke’s post about this.)

On the remote machine (192.168.1.100), if I run the following command, I can track this connection via netstat:

netstat -an | grep 1322
tcp 0 0 192.168.1.100:1322 192.168.1.101:4050 ESTABLISHED

There is also another internal Proxy function that I use, disconnect_client(). This is called when a client disconnects, so when this function is entered for a particular connection, I simply delete that connection’s entry from the internal table.

function disconnect_client()
	local connection_id = proxy.connection.server.thread_id
	if connection_id then
		-- client has disconnected, set this to nil
		proxy.global.process[connection_id] = nil
	end
end

The rest of the code is basically to display this information in a nice format.

Here is the full code:

-- intercept our custom commands here, before the query is sent on to the server
function read_query( packet )
	if string.byte(packet) == proxy.COM_QUERY then
		if string.sub(packet, 2):lower() == 'select help' then
			return show_process_help()
		end
		if string.sub(packet, 2):lower() == 'show proxy processlist' then
			return show_process_table()
		end
		--debug output
		--dump_process_table()
		proxy.queries:append(1, packet )
		return proxy.PROXY_SEND_QUERY
	end
end

-- build a resultset from a header and a table of rows, and have the proxy
-- return it to the client directly (the query never reaches the server)
function make_dataset (header, dataset)
	proxy.response.type = proxy.MYSQLD_PACKET_OK

	proxy.response.resultset = {
		fields = {},
		rows = {}
	}
	for i,v in pairs (header) do
		table.insert(proxy.response.resultset.fields, {type = proxy.MYSQL_TYPE_STRING, name = v})
	end
	for i,v in pairs (dataset) do
		table.insert(proxy.response.resultset.rows, v )
	end
	return proxy.PROXY_SEND_RESULT
end

-- make sure the global process table, and this connection's slot in it, exist
function initialize_process_table()
	if proxy.global.process == nil then
		proxy.global.process = {}
	end
	if proxy.global.process[proxy.connection.server["thread_id"]] == nil then
		proxy.global.process[proxy.connection.server["thread_id"]] = {}
	end
end

-- implements SHOW PROXY PROCESSLIST: one row per tracked connection
function show_process_table()
	initialize_process_table()
	local header = { 'Id', 'IP Address', 'Time' }
	local rows = {}
	for t_i, t_v in pairs (proxy.global.process) do
		for s_i, s_v in pairs ( t_v ) do
			table.insert(rows, { t_i, s_v.ip, os.date('%c', s_v.ts) })
		end
	end
	return make_dataset(header, rows)
end

-- implements SELECT HELP
function show_process_help()
	local header = { 'command',  'description' }
	local rows = {
		{'SELECT HELP',                 'This command.'},
		{'SHOW PROXY PROCESSLIST',      'Show all connections and their true IP Address.'},
	}
	return make_dataset(header, rows)
end

function dump_process_table()
	initialize_process_table()
	print('current contents of process table')
	for t_i, t_v in pairs (proxy.global.process) do
		print ('session id: ', t_i)
		for s_i, s_v in pairs ( t_v ) do
			print ( 't', s_i, s_v.ip, s_v.ts )
		end
	end
	print ('---END PROCESS TABLE---')
end

-- a successful auth result means a new client connection: record its address
-- and connect time, keyed by the backend thread id
function read_auth_result( auth )
	local state = auth.packet:byte()
	if state == proxy.MYSQLD_PACKET_OK then
		initialize_process_table()
		table.insert( proxy.global.process[proxy.connection.server["thread_id"]],
			{ ip = proxy.connection.client["address"], ts = os.time() } )
	end
end

function disconnect_client()
	local connection_id = proxy.connection.server.thread_id
	if connection_id then
		-- client has disconnected, set this to nil
		proxy.global.process[connection_id] = nil
	end
end

If you saved this to a file named show-ips.lua, then you would start MySQL Proxy as follows:

mysql-proxy --proxy-lua-script=share/mysql-proxy/show-ips.lua

And then you would connect to the Proxy as follows:

mysql -u root -pmysql -h 127.0.0.1 -P4040

(Note the -h 127.0.0.1: without an explicit TCP host, the mysql client on Unix connects to localhost via the Unix socket and silently ignores the -P4040 option, bypassing the Proxy.)

After that, you’ll be able to issue the commands implemented here such as SHOW PROXY PROCESSLIST and SELECT HELP:

mysql> SELECT HELP;
+------------------------+-------------------------------------------------+
| command                | description                                     |
+------------------------+-------------------------------------------------+
| SELECT HELP            | This command.                                   |
| SHOW PROXY PROCESSLIST | Show all connections and their true IP Address. |
+------------------------+-------------------------------------------------+
2 rows in set (0.00 sec)

mysql> SHOW PROXY PROCESSLIST;
+------+---------------------+-------------------+
| Id   | IP Address          | Time              |
+------+---------------------+-------------------+
| 5    | 127.0.0.1:51128     | 04/15/09 14:09:42 |
| 6    | 192.168.1.100:2341  | 04/15/09 14:10:22 |
| 8    | 192.168.1.103:52722 | 04/15/09 14:18:49 |
+------+---------------------+-------------------+
3 rows in set (0.00 sec)

Lastly, note that you can easily integrate these functions into the quan.lua script that the Query Analyzer Proxy (for MySQL Enterprise Monitor) uses.
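
For example, here is a rough sketch of where the hook would go (quan.lua's actual read_query() implementation will differ; this only illustrates the placement):

function read_query( packet )
	if string.byte(packet) == proxy.COM_QUERY
	   and string.sub(packet, 2):lower() == 'show proxy processlist' then
		return show_process_table()
	end
	-- ... quan.lua's existing read_query() handling continues here ...
end

You would also need to copy over the supporting functions (make_dataset(), initialize_process_table(), show_process_table(), read_auth_result(), and disconnect_client()) so that the process table gets populated.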

