Welcome to the MySQL technical articles shared by the GreatSQL community. If you have questions or want to learn more, leave a message in the comment area below and we will reply once we see it.
- Original content from the GreatSQL community may not be used without authorization. To reprint, please contact the editor and credit the source.
- 1. Find problems
- 2. Reproduce the problem
- 3. Check import file
- 4. Cause of problem
- 5. Dealing with problems
- 6. Summary
1. Find problems
During a data migration task, Xiaoling exported the data from the source database as a CSV file and then imported it into MySQL with LOAD DATA. She was surprised to find that the id field appeared to be lost, like this:
mysql> select * from t2;
+----+-------+---------------------+
| id | col1  | col2                |
+----+-------+---------------------+
| || TfdESTA |TESTA |4 | TEfdfdSTA | 5 | TEST5 | TESfddfdsfdsfdsfTA |TEST6 | TESffdfdfddTA
+----+-------+---------------------+
6 rows in set (0.00 sec)
The target database version and table structure are as follows:
mysql> select @@version;
+-----------+
| @@version |
+-----------+
| 8.0.25    |
+-----------+
1 row in set (0.00 sec)

mysql> show create table t2;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t2 | CREATE TABLE `t2` (
  `id` int NOT NULL AUTO_INCREMENT,
  `col1` varchar(69) DEFAULT NULL,
  `col2` varchar(79) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+----------------------------------------------------------------------------------
After calming down, Xiaoling used the following statement to verify that the primary key id was not actually lost. It appeared to be only a display problem:
mysql> select * from t2 where id=1;
+----+-------+------+
| id | col1  | col2 |
+----+-------+------+
|1 | TEST1 | TA
+----+-------+------+
1 row in set (0.00 sec)
So Xiaoling decided to check the whole operation process and find out the cause of the problem.
2. Reproduce the problem
First create a table t1
mysql> CREATE TABLE `t1` (
    -> `id` int NOT NULL AUTO_INCREMENT,
    -> `col1` varchar(60) DEFAULT NULL,
    -> `col2` varchar(70) DEFAULT NULL,
    -> PRIMARY KEY (`id`)
    -> );
Query OK, 0 rows affected (0.01 sec)
In the Windows environment, prepare a new test file t1.txt with Notepad. Viewed in the Linux environment, t1.txt reads as follows:
great@great-PC:~/Downloads/windows$ cat t1.txt
test1,test1
test2,test2
test3,test3
test4,test4
test5,test5
test6,test6
Load the data and view it. This time the id column is not garbled.
mysql> load data infile "/home/great/Downloads/windows/t1.txt" \
into table t1 FIELDS TERMINATED BY ',' (col1,col2);
Query OK, 6 rows affected (0.01 sec)
Records: 6 Deleted: 0 Skipped: 0 Warnings: 0

mysql> select * from t1;
+----+-------+--------+
| id | col1  | col2   |
+----+-------+--------+
| 1 | test1 | test1
| 2 | test2 | test2
| 3 | test3 | test3
| 4 | test4 | test4
| 5 | test5 | test5
| 6 | test6 | test6
+----+-------+--------+
6 rows in set (0.00 sec)
Strangely, though, the right-hand border of the table is missing. Insert a row manually, then query the table again.
mysql> insert into t1(col1,col2) values('test7','test7');
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1;
+----+-------+--------+
| id | col1  | col2   |
+----+-------+--------+
| 1 | test1 | test1
| 2 | test2 | test2
| 3 | test3 | test3
| 4 | test4 | test4
| 5 | test5 | test5
| 6 | test6 | test6
|  7 | test7 | test7  |
+----+-------+--------+
7 rows in set (0.00 sec)

mysql> select * from t1 where id=7;
+----+-------+-------+
| id | col1  | col2  |
+----+-------+-------+
|  7 | test7 | test7 |
+----+-------+-------+
1 row in set (0.00 sec)
Parse the binlog:
...
### INSERT INTO `test`.`t1`
### SET
###   @1=6 /* INT meta=0 nullable=0 is_null=0 */
###   @2='test6' /* VARSTRING(240) meta=240 nullable=1 is_null=0 */
###   @3='test6\r' /* VARSTRING(280) meta=280 nullable=1 is_null=0 */
# at 1223
#210923 11:19:09 server id 12345 end_log_pos 1250 Xid = 271
COMMIT/*!*/;
...
### INSERT INTO `test`.`t1`
### SET
###   @1=7 /* INT meta=0 nullable=0 is_null=0 */
###   @2='test7' /* VARSTRING(240) meta=240 nullable=1 is_null=0 */
###   @3='test7' /* VARSTRING(280) meta=280 nullable=1 is_null=0 */
# at 1500
#210923 11:19:15 server id 12345 end_log_pos 1527 Xid = 272
COMMIT/*!*/;
The binlog clearly shows that the col2 value loaded from the txt file is not 'test6' but 'test6\r', whereas the manually inserted value is exactly 'test7'.
Then check the hexadecimal value of the actually inserted data.
mysql> select id,hex(col2) from t1;
+----+--------------+
| id | hex(col2)    |
+----+--------------+
|  1 | 74657374310D |
|  2 | 74657374320D |
|  3 | 74657374330D |
|  4 | 74657374340D |
|  5 | 74657374350D |
|  6 | 74657374360D |
|  8 | 7465737437   |
+----+--------------+
7 rows in set (0.00 sec)
Each value loaded from the file has an extra 0D byte at the end compared with the manually inserted value.
mysql> select hex('\r') from t1;
+-----------+
| hex('\r') |
+-----------+
| 0D        |
| 0D        |
| 0D        |
| 0D        |
| 0D        |
| 0D        |
| 0D        |
+-----------+
7 rows in set (0.00 sec)

mysql> select unhex('0D') from t1;
+-------------+
| unhex('0D') |
+-------------+
|
|
|
|
|
|
|
+-------------+
7 rows in set (0.00 sec)
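The same comparison can be reproduced outside the database. The following is a minimal Python sketch, assuming the values are plain ASCII; `hex_of` is a hypothetical helper that mimics what HEX() shows for a string, not part of MySQL.

```python
def hex_of(value: str) -> str:
    """Return the upper-case hex encoding of an ASCII string (hypothetical helper)."""
    return value.encode("ascii").hex().upper()


loaded = "test1\r"  # value as stored after loading the Windows file
manual = "test7"    # value inserted by hand

print(hex_of(loaded))        # hex of the loaded value, ends in 0D
print(hex_of(manual))        # hex of the manual value, no trailing 0D
print(bytes.fromhex("0D"))   # the single byte 0D is the carriage return
```

The only difference between the two encodings is the trailing 0D byte, which is the carriage return.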
It is now fairly clear that the problem comes from a control character in the txt file. The t1 query above looked only slightly off because every col2 value has the same length, 5 characters. When the imported values have different lengths, the garbling becomes obvious:
mysql> load data infile "/home/great/Downloads/windows/oracle_objects.txt" \
into table t1 FIELDS TERMINATED BY ',' (col1,col2);
Query OK, 2088 rows affected (0.02 sec)
Records: 2088 Deleted: 0 Skipped: 0 Warnings: 0

mysql> select * from t1 limit 30 ;
+----+-----------------------------+------------+
| id | col1                        | col2       |
+----+-----------------------------+------------+
| test1 | test1
| test2 | test2
| test3 | test3
| test4 | test4
| test5 | test5
| test6 | test6
|  8 | test7                       | test7      |
| A | TABLE
| A1 | TABLE
| AAA | TABLE
| ABC123 | TABLE
| ABCDEF | TABLE
| ACTIVE_ALARMS | TABLE
|15 | ADDAUTH | PROCEDURE
|16 | ADDROLEAUTH | PROCEDURE
| AGENT_AVAIL_PRIV | TABLE
| AGE_STAT_FORBIDTIMELOGIN | TABLE
| AGE_STAT_NOTFORBIDTIMELOGIN | TABLE
| AGE_STAT_QUERYLARGE | TABLE
| APP_SCHEDULE_INFO | TABLE
|22 | AP_CLEARMUTEXROLL | PROCEDURE
|23 | AP_CLEARROLEBYSTSNUM | PROCEDURE
|24 | AP_CLEARROLEBYTEL | PROCEDURE
|25 | AP_CLEARSUPERROLL | PROCEDURE
|26 | AP_CREATE_RDDLV_FILE | PROCEDURE
|27 | AP_DEALACTLOG_OPERIP | PROCEDURE
|28 | AP_LOC_CRMBI_MSGSEND_LOG | PROCEDURE
|29 | AP_MENDCLICK | PROCEDURE
|30 | AP_PUB_UNLOCK_VERIFYCODE | PROCEDURE
|31 | AP_SETDBUSERANDPASS | PROCEDURE
+----+-----------------------------+------------+
30 rows in set (0.00 sec)
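The pattern of hidden and half-hidden ids is consistent with a terminal moving the cursor back to column 0 when it meets the 0D byte. Below is a minimal Python sketch of that effect. `render` and `format_row` are hypothetical helpers, the sample values are illustrative, and the sketch assumes the client pads each cell by character count, treating \r as one ordinary character.

```python
def render(line: str) -> str:
    """Return what a terminal would show for `line` (hypothetical helper).

    Models only the carriage return: \r moves the cursor to column 0 and
    later characters overwrite whatever is already on the line.
    """
    screen = []
    col = 0
    for ch in line:
        if ch == "\r":
            col = 0
            continue
        if col < len(screen):
            screen[col] = ch
        else:
            screen.append(ch)
        col += 1
    return "".join(screen)


def format_row(values, widths):
    """Pad each value to its column width, counting \r as one character."""
    cells = [v + " " * (w - len(v)) for v, w in zip(values, widths)]
    return "| " + " | ".join(cells) + " |"


widths = [2, 5, 10]
row_table = format_row(["14", "ABCDE", "TABLE\r"], widths)      # short col2 value
row_proc = format_row(["16", "ABCDE", "PROCEDURE\r"], widths)   # col2 fills the column

print(repr(render(row_table)))  # padding after \r wipes out the id
print(repr(render(row_proc)))   # only the closing border lands on the line start
```

With a short col2 value, the padding that follows \r overwrites the whole id cell. With a value that fills the column, only the closing " |" is written over the first two characters, so the id stays visible but shifted against the border.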
3. Check import file
To confirm that the imported file is the cause, check its contents in hexadecimal.
The hexdump plug-in for VS Code is used here, with the following result:
  Offset: 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000: 74 65 73 74 31 2C 74 65 73 74 31 0D 0A 74 65 73    test1,test1..tes
00000010: 74 32 2C 74 65 73 74 32 0D 0A 74 65 73 74 33 2C    t2,test2..test3,
00000020: 74 65 73 74 33 0D 0A 74 65 73 74 34 2C 74 65 73    test3..test4,tes
00000030: 74 34 0D 0A 74 65 73 74 35 2C 74 65 73 74 35 0D    t4..test5,test5.
00000040: 0A 74 65 73 74 36 2C 74 65 73 74 36 0D 0A          .test6,test6..
The dump shows that every 0D is followed by 0A.
mysql> select hex('\n') ;
+-----------+
| hex('\n') |
+-----------+
| 0A        |
+-----------+
1 row in set (0.00 sec)
So in the imported txt file, every line ends with \r\n as its newline.
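The same check can be scripted rather than read off a hex dump. This is a minimal Python sketch; `count_line_endings` is a hypothetical helper, and the sample bytes are the first two rows taken from the dump above.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR sequences in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# First two rows of t1.txt, copied from the hex dump.
sample = bytes.fromhex(
    "74657374312C74657374310D0A"
    "74657374322C74657374320D0A"
)

print(count_line_endings(sample))
```

To inspect a real file, read it in binary mode (for example `open(path, "rb").read()`) so that Python does not translate the line endings before they are counted.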
In different systems, there are different representations of line breaks.
The following is from Wikipedia
Representation of newlines in application software and operating systems:
ASCII-based or ASCII-compatible character sets use LF (Line Feed, U+000A), CR (Carriage Return, U+000D), or CR+LF. The newline conventions of various systems are listed below:
LF: on Unix or Unix compatible systems (GNU/Linux, AIX, Xenix, Mac OS X,...), BeOS, Amiga, RISC OS
CR+LF: DOS (MS-DOS, PC-DOS, etc.), Microsoft Windows operating system (Microsoft Windows), most non Unix systems
CR: Apple II family, Mac OS up to version 9
4. Cause of problem
As shown above, each line in the txt file ends with \r\n, while Linux uses \n as the line break, so the \r is inserted into the table as an ordinary character.
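The effect can be illustrated with a minimal Python sketch. `split_rows` is a hypothetical helper that only imitates splitting on line and field terminators. It ignores quoting and escaping and is not how MySQL implements LOAD DATA.

```python
def split_rows(data: bytes, line_term: bytes = b"\n", field_term: bytes = b","):
    """Split raw file bytes into rows of fields (hypothetical helper)."""
    rows = []
    for line in data.split(line_term):
        if line:  # skip the empty piece after the final terminator
            rows.append(line.split(field_term))
    return rows


# Two rows as saved on Windows, with CRLF line endings.
sample = b"test1,test1\r\ntest2,test2\r\n"

rows_lf = split_rows(sample)                        # split on \n only
rows_crlf = split_rows(sample, line_term=b"\r\n")   # split on \r\n

print(rows_lf)    # the last field of each row keeps a trailing \r
print(rows_crlf)  # fields are clean
```

Splitting on \n alone leaves the \r attached to the last field of every row, which matches the 'test6\r' value seen in the binlog.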
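When the mysql client prints the result, \r acts as a carriage return and moves the cursor back to the start of the line, so the rest of the row overwrites its beginning. That is why the query results at the start of the article looked garbled.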
5. Dealing with problems
When importing data with LOAD DATA, you can add lines terminated by '\r\n' to tell MySQL that the whole \r\n sequence is the line break.
Let's do it again.
mysql> show master status;
+---------------+----------+--------------+------------------+---------------------------------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                           |
+---------------+----------+--------------+------------------+---------------------------------------------+
| binlog.000013 |      192 |              |                  | b5457ec7-f50a-11eb-ac22-2cf05daaf63e:1-6182 |
+---------------+----------+--------------+------------------+---------------------------------------------+
1 row in set (0.00 sec)

mysql> load data infile "/home/great/Downloads/windows/t1.txt" \
into table t1 fields terminated by ','lines terminated by '\r\n'(col1,col2);
Query OK, 6 rows affected (0.00 sec)
Records: 6 Deleted: 0 Skipped: 0 Warnings: 0

mysql> select * from t1;
+----+-------+-------+
| id | col1  | col2  |
+----+-------+-------+
|  1 | test1 | test1 |
|  2 | test2 | test2 |
|  3 | test3 | test3 |
|  4 | test4 | test4 |
|  5 | test5 | test5 |
|  6 | test6 | test6 |
+----+-------+-------+
6 rows in set (0.00 sec)
Parse the binary log to see the actual insertion
...
### INSERT INTO `test`.`t1`
### SET
###   @1=6 /* INT meta=0 nullable=0 is_null=0 */
###   @2='test6' /* VARSTRING(240) meta=240 nullable=1 is_null=0 */
###   @3='test6' /* VARSTRING(280) meta=280 nullable=1 is_null=0 */
# at 532
#210923 14:39:18 server id 12345 end_log_pos 559 Xid = 385
COMMIT/*!*/;
...
The value inserted is correct.
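An alternative that the steps above do not use is to convert the line endings before the import, so that the default line terminator \n applies. Below is a minimal Python sketch under that assumption. `normalize_newlines` is a hypothetical helper and is deliberately naive: it would also rewrite any \r\n that appears inside a field value.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF line endings to LF (hypothetical helper, naive replace)."""
    return data.replace(b"\r\n", b"\n")


windows_bytes = b"test1,test1\r\ntest2,test2\r\n"
unix_bytes = normalize_newlines(windows_bytes)

print(unix_bytes)

# For a real file, read and write in binary mode so Python does not
# translate line endings on its own, for example:
#   with open(src, "rb") as f:
#       data = f.read()
#   with open(dst, "wb") as f:
#       f.write(normalize_newlines(data))
```

Specifying lines terminated by '\r\n' leaves the source file untouched, while converting first keeps the LOAD DATA statement identical across files. Which one fits better depends on the migration workflow.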
6. Summary
The whole problem is caused by the difference in line breaks between Linux and Windows. When importing data, check whether the file was produced on a different system and needs different handling, to avoid unexpected results.