MySQL million-row data import

Posted by showman on Fri, 18 Feb 2022 22:25:03 +0100

1. Method

  • Import directly from the source SQL file (not recommended)
  • Import with the load data infile command (recommended)
  • Switch the table's engine from InnoDB to MyISAM, then import (not recommended)
  • Merge single-row INSERT statements into multi-row INSERT statements, then import (recommended)
  • Ordered insertion via a stored procedure + transaction (try it yourself if you are interested; a sketch follows this list)
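
For the last option, here is a minimal sketch of what such a procedure might look like, using the t0 table defined later in this article (the batch_insert name and the generated row values are illustrative, not from the original post):

DELIMITER $$
CREATE PROCEDURE batch_insert(IN total INT)
BEGIN
    DECLARE i INT DEFAULT 1;
    -- a single transaction around all inserts instead of one commit per row
    START TRANSACTION;
    WHILE i <= total DO
        -- rows are generated in primary-key order
        INSERT INTO t0 (`name`, `age`, `description`)
        VALUES (CONCAT('user_', i), 20, 'generated row');
        SET i = i + 1;
    END WHILE;
    COMMIT;
END$$
DELIMITER ;

CALL batch_insert(1000000);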

1.1 Direct import (not recommended)

Not much to say about this one. It is simple and brute-force: just source the file directly.

use example_database;
source your_disk:/yourpath/example.sql

Summary: you can try this method for a million-row import, but importing a little over 20,000 rows took me half an hour; scale that up to millions and you can estimate the time yourself. This method is therefore not recommended.

1.2 Load data infile import (recommended)

1.2.1 Basic introduction

The load data infile statement reads rows from a text file into a table at very high speed. Before using this command, the mysqld process (the server) must already be running.

Note:

  • The file must be located in the database directory or be readable by everyone
  • You must have the FILE privilege on the server host (see the check below)
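
In practice, recent MySQL versions also restrict server-side file reads to the directory named by secure_file_priv. A quick sketch for checking both prerequisites (the output depends on your server configuration):

-- directory the server may read import files from
-- (empty string = unrestricted, NULL = server-side import disabled)
SHOW VARIABLES LIKE 'secure_file_priv';

-- check that the current account holds the FILE privilege
SHOW GRANTS FOR CURRENT_USER();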

1.2.2 Syntax

load data [low_priority] [local] infile 'file_name.txt' [replace | ignore]
into table tbl_name
[fields
    [terminated by '\t']
    [[optionally] enclosed by '"']
    [escaped by '\\']
]
[lines terminated by '\n']
[ignore number lines]
[(col_name, ...)]

Explanation:

  • If the low_priority keyword is specified, MySQL waits until no other clients are reading the table before inserting the data
  • If the local keyword is specified, the file is read from the client host; if local is not specified, the file must be located on the server
  • The replace and ignore keywords control how rows that duplicate existing unique key values are handled: with replace, the new row replaces the existing row with the same unique key value; with ignore, input rows that duplicate an existing row on a unique key are skipped
  • The fields keyword specifies how the fields in the file are delimited. If this keyword is used, the MySQL parser expects at least one of the following options:
    • terminated by: the field separator; the default is the tab character (\t)
    • enclosed by: the field enclosing character
    • escaped by: the escape character; the default is the backslash (\)
  • lines terminated by: the line separator between records; the default is the newline character (\n)

1.2.3 Examples

(1) Data file (/tmp/t0.txt)

"I love you!","20","He looks ordinary and often plays hooligans! ha-ha"
"Li Kui","21","He looks ordinary and often plays hooligans! ha-ha"
"Wang Ermi","20","He looks ordinary and often plays hooligans! ha-ha"
"Old three","24","Very strong"
"Old four","34","XXXXX"
"Old five","52","***%*¥*¥*¥*¥"
"kitten","45","Omit in the middle..."
"puppy","12","Will call"
"youngest sister","21","PP Very"
"Little villain","52","have two faces"
"God, God","96","Very handsome"
"MM coming","10",". . . "
"Praise the party","20","Good socialism"
"Good people","20","It's really good"
"fairly tall","10","Study well"
"Oblique three","60","Eyes slanted"
"Son of China","100","It's too powerful"
"rice","63","I love it"
"Apple","15","yummy"

(2) Sample table structure

CREATE TABLE `t0` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `name` char(20) NOT NULL,
  `age` tinyint(3) unsigned NOT NULL,
  `description` text NOT NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `idx_name` (`name`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

(3) Import command

# ignore: skip rows that duplicate an existing unique key
# fields terminated by ',': fields are comma-separated
# enclosed by '"': fields are wrapped in double quotes
# lines terminated by '\n': one record per line
load data infile '/tmp/t0.txt' 
	ignore into table t0 character set gbk 
	fields 
	terminated by ',' enclosed by '"' 
	lines 
	terminated by '\n' (`name`,`age`,`description`);
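
If the data file lives on the client machine rather than on the server, the local variant of the same command can be used; this assumes local_infile is enabled on the server and the client was started with --local-infile=1:

load data local infile '/tmp/t0.txt'
	ignore into table t0 character set gbk
	fields terminated by ',' enclosed by '"'
	lines terminated by '\n' (`name`,`age`,`description`);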

(4) Import result

mysql> select * from t0;
+----+----------+-----+----------------------------+
| id | name     | age | description                |
+----+----------+-----+----------------------------+
|  1 | I love you!   |  20 | He looks ordinary and often plays hooligans! ha-ha | 
|  2 | Li Kui     |  21 | He looks ordinary and often plays hooligans! ha-ha | 
|  3 | Wang Ermi   |  20 | He looks ordinary and often plays hooligans! ha-ha | 
|  4 | Old three     |  24 | Very strong                       | 
|  5 | Old four     |  34 | XXXXX                      | 
|  6 | Old five     |  52 | ***%*¥*¥*¥*¥           | 
|  7 | kitten     |  45 | Omit in the middle...             | 
|  8 | puppy     |  12 | Will call                     | 
|  9 | youngest sister     |  21 | PP Very                     | 
| 10 | Little villain   |  52 | have two faces                   | 
| 11 | God, God |  96 | Very handsome                   | 
| 12 | MM coming   |  10 | . . .                      | 
| 13 | Praise the party   |  20 | Good socialism                 | 
| 14 | Good people   |  20 | It's really good                   | 
| 15 | fairly tall     |  10 | Study well                   | 
| 16 | Oblique three     |  60 | Eyes slanted                       | 
| 17 | Son of China | 100 | It's too powerful               | 
| 18 | rice     |  63 | I love it                     | 
| 19 | Apple     |  15 | yummy                       | 
+----+----------+-----+----------------------------+
19 rows in set (0.00 sec)

Summary: importing millions of rows this way takes only tens of seconds. It is extremely fast, and it is recommended.

1.3 Engine replacement

The idea here is to use MyISAM instead of InnoDB as the table engine when creating the table, and then import the data.

Create table statement:

CREATE TABLE `t0` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `name` char(20) NOT NULL,
  `age` tinyint(3) unsigned NOT NULL,
  `description` text NOT NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `idx_name` (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
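
If the table already exists as InnoDB, one possible shortcut is to switch the engine in place around the import instead of recreating the table (note that each ALTER rebuilds the table, which itself costs time on large tables):

ALTER TABLE t0 ENGINE=MyISAM;
-- ... run the load data infile import here ...
ALTER TABLE t0 ENGINE=InnoDB;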

Timings when importing with the two methods above:

  • InnoDB + direct import: 30,000 rows inserted, taking half an hour.
    MyISAM + direct import: 30,000 rows inserted, taking ten minutes.
  • InnoDB + load data infile: millions of rows inserted, taking a little over 40 seconds.
    MyISAM + load data infile: millions of rows inserted, taking a little over 30 seconds.

Conclusion: importing after switching the engine is clearly faster, but the feature gap between MyISAM and InnoDB is still large, so this method is not recommended if the table will be used for further work.

1.4 Merge single-row SQL statements into multi-row statements, then import

The idea is to merge a million single-row INSERT statements into about 1,000 statements, each of which inserts 1,000 rows.

For example:

Multiple statements:

insert ignore into actor values (1 , 'PENELOPE' , 'GUINESS'  ,'2006-02-15 12:34:33');
insert ignore into actor values (2 , 'NICK' , 'WAHLBERG'  ,'2006-02-15 12:34:33');

One statement:

insert ignore into actor values
    (1 , 'PENELOPE' , 'GUINESS'  ,'2006-02-15 12:34:33'),
    (2 , 'NICK' , 'WAHLBERG'  ,'2006-02-15 12:34:33');
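
One caveat when merging: a 1,000-row INSERT can get large, and the server rejects any statement bigger than max_allowed_packet. A sketch for checking and, if necessary, raising it (the 64 MB value is just an example):

-- current limit in bytes
SHOW VARIABLES LIKE 'max_allowed_packet';

-- raise the limit for new connections (requires SUPER or SYSTEM_VARIABLES_ADMIN)
SET GLOBAL max_allowed_packet = 67108864;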

There are two ways to do the merging:

  • Merge by hand (only if you have time to burn)
  • Use a program: a Python or Java script that reads the file and merges the statements (a short sketch follows this list)
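
A minimal Python sketch of the scripted approach, assuming the input file holds one single-row insert per line in exactly the format shown above (the file names, batch size, and string-based parsing are illustrative; a real dump may need more careful parsing):

# merge_inserts.py - merge single-row INSERTs into multi-row ones (hypothetical sketch)
BATCH = 1000
PREFIX = "insert ignore into actor values "

with open("single_inserts.sql") as src, open("merged_inserts.sql", "w") as dst:
    values = []
    for line in src:
        line = line.strip().rstrip(";")
        if not line.startswith(PREFIX):
            continue  # skip anything that is not a single-row insert
        values.append(line[len(PREFIX):])  # keep only the "(...)" value tuple
        if len(values) == BATCH:
            dst.write(PREFIX + ",\n".join(values) + ";\n")
            values = []
    if values:
        dst.write(PREFIX + ",\n".join(values) + ";\n")  # flush the final partial batch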

Summary: with this approach, each statement inserts 1,000 rows and about 1,000 statements are executed. Importing 1 million rows takes only about 2 minutes. Also recommended.

Topics: Database MySQL Big Data