Author: Hong Bin
Does every DBA have experience of deleting libraries? What if you delete the library without backup? What happens when the service cannot be started after the backup recovery? What if table definition corrupted data cannot be read?
I once met a startup Internet company. Because of the irregular backup and recovery operation of maintenance personnel, the system tablespace files were initialized, tens of thousands of tables could not be read, and it took hours to rescue them.
When you find that the data cannot be read, it may not be that the data is lost, but that the DBMS cannot find the information describing the data.
background
Let's first look at some key InnoDB data dictionary tables that hold some table definition information that we need to use when we restore the table structure.
SYS_TABLES Describes InnoDB Table Information
CREATE TABLE `SYS_TABLES` ( `NAME` varchar(255) NOT NULL DEFAULT '', Table name `ID` bigint(20) unsigned NOT NULL DEFAULT '0', surface id `N_COLS` int(10) DEFAULT NULL, `TYPE` int(10) unsigned DEFAULT NULL, `MIX_ID` bigint(20) unsigned DEFAULT NULL, `MIX_LEN` int(10) unsigned DEFAULT NULL, `CLUSTER_NAME` varchar(255) DEFAULT NULL, `SPACE` int(10) unsigned DEFAULT NULL, Table space id PRIMARY KEY (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SYS_INDEXES Describes InnoDB Index Information
CREATE TABLE `SYS_INDEXES` ( `TABLE_ID` bigint(20) unsigned NOT NULL DEFAULT '0', and sys_tables Of id Corresponding `ID` bigint(20) unsigned NOT NULL DEFAULT '0', Indexes id `NAME` varchar(120) DEFAULT NULL, Index name `N_FIELDS` int(10) unsigned DEFAULT NULL, Index contains the number of fields `TYPE` int(10) unsigned DEFAULT NULL, `SPACE` int(10) unsigned DEFAULT NULL, Table space for storing indexes id `PAGE_NO` int(10) unsigned DEFAULT NULL, Indexed root page id PRIMARY KEY (`TABLE_ID`,`ID`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SYS_COLUMNS Describes Field Information of InnoDB Table
CREATE TABLE `SYS_COLUMNS` ( `TABLE_ID` bigint(20) unsigned NOT NULL, and sys_tables Of id Corresponding `POS` int(10) unsigned NOT NULL, Relative position of field `NAME` varchar(255) DEFAULT NULL, Field name `MTYPE` int(10) unsigned DEFAULT NULL, Field Coding `PRTYPE` int(10) unsigned DEFAULT NULL, Field Check Type `LEN` int(10) unsigned DEFAULT NULL, Field byte length `PREC` int(10) unsigned DEFAULT NULL, Field Accuracy PRIMARY KEY (`TABLE_ID`,`POS`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SYS_FIELDS describes field columns for all indexes
CREATE TABLE `SYS_FIELDS` ( `INDEX_ID` bigint(20) unsigned NOT NULL, `POS` int(10) unsigned NOT NULL, `COL_NAME` varchar(255) DEFAULT NULL, PRIMARY KEY (`INDEX_ID`,`POS`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The. / storage/innobase/include/dict0boot.h file defines the index id of each dictionary table and stores the data of the dictionary table in the page corresponding to the id.
Here we need to use undrop-for-innodb tool to recover data, which can read table space information to get pages and extract data from pages.
# wget https://github.com/chhabhaiya/undrop-for-innodb/archive/master.zip # yum install -y gcc flex bison # make # make sys_parser
. / sys_parser reads table structure information
sys_parser [-h <host>] [-u <user>] [-p <passowrd>] [-d <db>] databases/table
stream_parser reads InnoDB page from ibdata1 or ibd or partition tables
# ./stream_parser You must specify file with -f option Usage: ./stream_parser -f <innodb_datafile> [-T N:M] [-s size] [-t size] [-V|-g] Where: -h - Print this help -V or -g - Print debug information -s size - Amount of memory used for disk cache (allowed examples 1G 10M). Default 100M -T - retrieves only pages with index id = NM (N - high word, M - low word of id) -t size - Size of InnoDB tablespace to scan. Use it only if the parser can't determine it by himself.
c_parser reads records from innodb page and saves them to files
# ./c_parser Error: Usage: ./c_parser -4|-5|-6 [-dDV] -f <InnoDB page or dir> -t table.sql [-T N:M] [-b <external pages directory>] Where -f <InnoDB page(s)> -- InnoDB page or directory with pages(all pages should have same index_id) -t <table.sql> -- CREATE statement of a table -o <file> -- Save dump in this file. Otherwise print to stdout -l <file> -- Save SQL statements in this file. Otherwise print to stderr -h -- Print this help -d -- Process only those pages which potentially could have deleted records (default = NO) -D -- Recover deleted rows only (default = NO) -U -- Recover UNdeleted rows only (default = YES) -V -- Verbose mode (lots of debug information) -4 -- innodb_datafile is in REDUNDANT format -5 -- innodb_datafile is in COMPACT format -6 -- innodb_datafile is in MySQL 5.6 format -T -- retrieves only pages with index id = NM (N - high word, M - low word of id) -b <dir> -- Directory where external pages can be found. Usually it is pages-XXX/FIL_PAGE_TYPE_BLOB/ -i <file> -- Read external pages at their offsets from <file>. -p prefix -- Use prefix for a directory name in LOAD DATA INFILE command
Next, we demonstrate several data recovery scenarios for the scenario.
Scenario 1: drop table
Whether innodb_file_per_table is enabled or not, the recovery method is different. When the table is deleted by mistake, MySQL service should be stopped as soon as possible and not started. If innodb_file_per_table=ON, it is better to reload the file system in a read-only manner to prevent other processes from writing to the data of the block device before data coverage.
If you evaluate whether records are overwritten, you can use some records in the table as keywords to see if they can be filtered from ibdata1.
grep WOODYHOFFMAN ibdata1
Binary file ibdata1 matches
You can also use BVI (for smaller files) or hexdump-C (for larger files)
Take table sakila.actor as an example
CREATE TABLE `actor` ( `actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT, `first_name` varchar(45) NOT NULL, `last_name` varchar(45) NOT NULL, `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`actor_id`), KEY `idx_actor_last_name` (`last_name`) ) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8
First restore table structure information
1. Parsing System Table Space to Get page Information
./stream_parser -f /var/lib/mysql/ibdata1
2. Create a new schema to import the DDL of the system dictionary table
cat dictionary/SYS_* | mysql recovered
3. Create a Recovery Directory
mkdir -p dumps/default
4. Analyzing the dictionary table information contained in the system table space.
./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page -t dictionary/SYS_TABLES.sql > dumps/default/SYS_TABLES 2> dumps/default/SYS_TABLES.sql ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000002.page -t dictionary/SYS_COLUMNS.sql > dumps/default/SYS_COLUMNS 2> dumps/default/SYS_COLUMNS.sql ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page -t dictionary/SYS_INDEXES.sql > dumps/default/SYS_INDEXES 2> dumps/default/SYS_INDEXES.sql ./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000004.page -t dictionary/SYS_FIELDS.sql > dumps/default/SYS_FIELDS 2> dumps/default/SYS_FIELDS.sql
5. Import Restored Data Dictionary
cat dumps/default/*.sql | mysql recovered
6. Read the restored table structure information
./sys_parser -pmsandbox -d recovered sakila/actor
Since the 5.x version innodb engine does not fully record the structure information of the table, it will lose AUTO_INCREMENT attributes, secondary index and foreign key constraints, DECIMAL accuracy and other information.
If the frm file of mysql version 5.5 is deleted from the system, the frm file with the same name of the original table can be touch ed under the original directory, and the structure information and data of the table can also be read. If there are only frm files, you can use mysqlfrm --diagnostic /path/to/xxx.frm to get table structure information. Connecting mysql will display character set information.
- innodb_file_per_table=OFF
Because it is a shared table space mode, data pages are stored in ibdata1 and data can be extracted from ibdata1 files.
1. Get the table id of the table, sys_table holds the table id of the table, and the index id of the sys_table table table is 1, so get the table id from 0000000000001. page.
./c_parser -4Df pages-ibdata1/FIL_PAGE_INDEX/0000000000000001.page -t dictionary/SYS_TABLES.sql | grep sakila/actor 000000000B28 2A000001430D4D SYS_TABLES "sakila/actor" 158 4 1 0 0 "" 0 000000000B28 2A000001430D4D SYS_TABLES "sakila/actor" 158 4 1 0 0 "" 0
2. Use table id to get the primary key id of the table, sys_indexes have table index information, innodb index organizes the table, find the primary key id to find the data, sys_indexes index id is 3, so get the primary key id from 0000000000003. page.
./c_parser -4Df pages-ibdata1/FIL_PAGE_INDEX/0000000000000003.page -t dictionary/SYS_INDEXES.sql | grep 158 000000000B28 2A000001430BCA SYS_INDEXES 158 376 "PRIMARY" 1 3 0 4294967295 000000000B28 2A000001430C3C SYS_INDEXES 158 377 "idx\_actor\_last\_name" 1 0 0 4294967295 000000000B28 2A000001430BCA SYS_INDEXES 158 376 "PRIMARY" 1 3 0 4294967295 000000000B28 2A000001430C3C SYS_INDEXES 158 377 "idx\_actor\_last\_name" 1 0 0 4294967295
3. Knowing the primary key id, you can extract table data from corresponding page s and generate sql files.
./c_parser -4f pages-ibdata1/FIL_PAGE_INDEX/0000000000000376.page -t sakila/actor.sql > dumps/default/actor 2> dumps/default/actor_load.sql
4. Finally import the recovered data
cat dumps/default/*.sql | mysql sakila
- innodb_file_per_table=ON
This recovery process is basically the same as the above, but because of the independent table space mode, the data pages are stored in their respective ibd files, the ibd files are deleted, and the data pages can not be extracted through ibdata1, so the pages-ibdata1 directory can not find the data pages, and stream_parser needs to read the data page information from the block device. After scanning, the data is extracted in pages-sda1 directory.
./stream_parser -f /dev/sda1 -t 1000000k
Scenario 2: Corrupted InnoDB table
Damage occurred to InnoDB tables, and MySQL could not be started even if innodb_force_recovery=6
Similar errors may occur in the log
InnoDB: Database page corruption on disk or a failed InnoDB: file read of page 4.
At this time, the recovery strategy needs to extract the data page from the independent table space, delete the table space, and recreate the table to import the data.
1. Get the primary key index id of the fault table first
2. Obtain data records through index id page
select t.name, t.table_id, i.index_id, i.page_no from INNODB_SYS_TABLES t join INNODB_SYS_INDEXES i on t.table_id=i.table_id and t.name='test/sbtest1';
3. Because data pages may be partially corrupted, it is necessary to filter out "bad" data and retain good data.
For example, the first two rows of records are actually "bad" data and need to be filtered out.
root@test:~/recovery/undrop-for-innodb# ./c_parser -6f pages-actor.ibd/FIL_PAGE_INDEX/0000000000000015.page -t sakila/actor.sql > dumps/default/actor 2> dumps/default/actor_load.sql root@test:~/recovery/undrop-for-innodb# cat dumps/default/actor -- Page id: 3, Format: COMPACT, Records list: Invalid, Expected records: (0 200) 72656D756D07 08000010002900 actor 30064 "\0\0\0\0" "" "1972-09-20 23:07:44" 1050454E454C 4F50454755494E actor 19713 "ESSCâ–’" "" "2100-08-09 07:52:36" 00000000051E 9F0000014D011A actor 2 "NICK" "WAHLBERG" "2006-02-15 04:34:33" 00000000051E 9F0000014D0124 actor 3 "ED" "CHASE" "2006-02-15 04:34:33" 00000000051E 9F0000014D012E actor 4 "JENNIFER" "DAVIS" "2006-02-15 04:34:33" 00000000051E 9F0000014D0138 actor 5 "JOHNNY" "LOLLOBRIGIDA" "2006-02-15 04:34:33" 00000000051E 9F000001414141 actor 6 "AAAAA" "AAAAAAAAA" "2004-09-10 01:53:05" 00000000051E 9F0000014D016A actor 10 "CHRISTIAN" "GABLE" "2006-02-15 04:34:33" ...
You can add filter conditions to sql files, such as scope filtering through actor_id, and reading data pages with new sql files.
CREATE TABLE `actor` ( `actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT /*!FILTER int_min_val: 1 int_max_val: 300 */, `first_name` varchar(45) NOT NULL, `last_name` varchar(45) NOT NULL, `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`actor_id`), KEY `idx_actor_last_name` (`last_name`) ) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8;
4. Delete the fault table file, start MySQL with innodb_force_recovery=6, and delete metadata after startup
5. Create new tables to import restored data
Question: How do you know how many records have been lost?
When reading a data page, the expected number of records will be displayed at the beginning, and the actual number of recovered records will be displayed at the end. The difference is the number of lost records.
-- Page id: 3, Format: COMPACT, Records list: Invalid, Expected records: (0 200) -- Page id: 3, Found records: 197, Lost records: YES, Leaf page: YES
Scenario 3: How to recover data from disk or file system Corruption
In this case, protect the damaged block device from writing as soon as possible, and use the dd tool to read the mirror data for recovery.
Local approach
dd if=/dev/sdb of=/path/to/faulty_disk.img conv=noerror
Remote mode
remote server> nc -l 1234 > faulty_disk.img local server> dd if=/dev/sdb of=/dev/stdout conv=noerror | nc a.b.c.d 1234
After the disk image is saved, the subsequent recovery operation refers to Scenario 2.
summary
1. Never use copy data files as backup mode when the service is running. It seems that the data is backed up, but the actual data is inconsistent.
2. Use the physical backup tool xtrabackup/meb or logical backup correctly.
3. Restore and validate backup data periodically.
Hope you will never use these methods, do a good backup, validate diligently!
Reference resources
https://twindb.com/how-to-rec...
https://twindb.com/recover-co...
https://twindb.com/take-image...
https://twindb.com/data-loss-...
https://twindb.com/repair-cor...
https://twindb.com/resolving-...
https://dev.mysql.com/doc/ref...