cause
Because of the characteristics of business, sometimes the data need complex query logic to get. Because the database itself does not support complex queries, so synchronizing data to capability middleware has become a common solution, and synchronizing to Elastic search is one of them.
Scheme Selection
The data source we use is MySql, which chooses to synchronize to ES. The idea is to reduce intrusiveness by subscribing to Binlog. There are also some solutions, such as: go-mysql-elasticsearch,canal,gravity
However, these schemes usually support only one-to-one indexing, that is, one table and one index, while the one-to-one and one-to-many relationships in data tables can not be reflected in the body of the index. This relationship is often needed in business. So we should consider gravity as a plug-in.
gravity is Mobay's open source data synchronization middleware. Currently, data sources support MySql and Mono, TiDB and PostgreSQL in development. Synchronization target support: MySQL/TiDB and Kafka, Elastic search is still in bate phase, and supports 6.
EsModel Synchronization Plug-in
Project address: gravity Welcome star:).
Synchronization strategy
- Supports the synchronization of main table and one-to-one and one-to-one multi-word table, which can be synchronized into an index structure.
- One-to-one relationships support synchronization in the form of tiles or sub-objects.
- Support ES Versions: 6, 7
For example, there are four tables: student, student_class, student_detail, student_parent. Student is the student table (main table), student_class class class (one-to-one sub-table), student_detail student details (one-to-one sub-table), student_parent student parents (one-to-many sub-table). Stud_class is synchronized in the form of sub-objects, and student_detail is synchronized in the form of tiles.
The Sql script is as follows:
DROP TABLE IF EXISTS `student`; CREATE TABLE `student` ( `id` bigint(20) NOT NULL, `name` varchar(64) NOT NULL DEFAULT '', `birthday` date NOT NULL DEFAULT '1970-01-01', `high` int(11) NOT NULL DEFAULT '0', `sex` tinyint(4) NOT NULL DEFAULT '1', `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; -- ---------------------------- -- Records of student -- ---------------------------- BEGIN; INSERT INTO `student` VALUES (1, 'Zhang San', '2010-02-05', 156, 1, '2019-08-29 19:55:36'); INSERT INTO `student` VALUES (2, 'Li Si', '2010-03-05', 176, 2, '2019-08-29 19:55:36'); INSERT INTO `student` VALUES (3, 'Wang Ping', '2010-03-05', 176, 2, '2019-08-29 20:09:03'); COMMIT; -- ---------------------------- -- Table structure for student_class -- ---------------------------- DROP TABLE IF EXISTS `student_class`; CREATE TABLE `student_class` ( `id` bigint(20) NOT NULL, `student_id` bigint(20) NOT NULL DEFAULT '0', `name` varchar(255) NOT NULL DEFAULT '', `student_count` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; -- ---------------------------- -- Records of student_class -- ---------------------------- BEGIN; INSERT INTO `student_class` VALUES (1, 1, 'One(1)class', 32); INSERT INTO `student_class` VALUES (2, 2, 'Two(2)class', 12); COMMIT; -- ---------------------------- -- Table structure for student_detail -- ---------------------------- DROP TABLE IF EXISTS `student_detail`; CREATE TABLE `student_detail` ( `id` bigint(20) NOT NULL, `student_id` bigint(20) NOT NULL DEFAULT '0', `introduce` varchar(255) NOT NULL DEFAULT '', `mobile` varchar(64) NOT NULL DEFAULT '', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; -- ---------------------------- -- Records of student_detail -- ---------------------------- BEGIN; INSERT INTO `student_detail` VALUES (1, 1, 'Introduction by Zhang San', '18888888888'); INSERT INTO `student_detail` VALUES (2, 2, 'Introduction of Li Si', '13333333333'); COMMIT; -- ---------------------------- -- Table structure for student_parent -- ---------------------------- DROP TABLE IF EXISTS `student_parent`; CREATE TABLE `student_parent` ( `id` bigint(20) NOT NULL, `student_id` bigint(20) NOT NULL DEFAULT '0', `name` varchar(255) NOT NULL DEFAULT '', `birthday` date NOT NULL DEFAULT '1970-01-01', `sex` tinyint(11) NOT NULL DEFAULT '1', `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; -- ---------------------------- -- Records of student_parent -- ---------------------------- BEGIN; INSERT INTO `student_parent` VALUES (1, 1, 'Zhang San's father', '1980-02-02', 1, '2019-08-29 20:00:58'); INSERT INTO `student_parent` VALUES (2, 1, 'Zhang San's Mother', '1982-07-07', 2, '2019-08-29 20:00:58'); INSERT INTO `student_parent` VALUES (3, 2, 'Four Fathers of Li', '1979-03-03', 1, '2019-08-29 20:00:58'); INSERT INTO `student_parent` VALUES (4, 2, 'Mother Li Si', '1981-06-06', 2, '2019-08-29 20:00:58'); COMMIT;
The synchronization configuration is as follows:
# name must be filled in name = "mysql2esmodelDemo" # Internal library name for loci, heartbeat, etc., default to _gravity internal-db-name = "_gravity" # # Definition of the Input plug-in, where mysql is used # [input] type = "mysql" mode = "replication" [input.config.source] host = "192.168.1.148" username = "root" password = "mysqldev" port = 3306 max-idle = 10 max-open = 10 [output] type = "esmodel" [output.config] # Ignore 400 (bad request) return # Elastic search returns 400 errors when the index name is not standardized and parsing errors occur # The default is false, which means that an exception will be thrown when a failure occurs and must be handled manually. These requests are ignored when set to true ignore-bad-request = true # # Target-side Elastic search configuration # - Required selection # [output.config.server] # Connected Elastic search address, required urls = ["http://192.168.1.152:9200"] # Whether to sniff nodes, default to false sniff = false # Timeout, default 1000ms timeout = 500 # # Target End Authentication Configuration # - Optional # [output.config.server.auth] username = "" password = "" [[output.config.routes]] match-schema = "test" # Master table match-table = "student" #Index Name index-name="student_index" #Type name, es7 item is invalid type-name="student" #Number of fragments shards-num=1 #Number of copies replicas-num=0 #Number of failed retries retry-count=3 #Included columns, default all include-column = [] #Excluded columns, not by default exclude-column = [] # List escalation strategy [output.config.routes.convert-column] name = "studentName" [[output.config.routes.one-one]] match-schema = "test" match-table = "student_detail" #foreign key column fk-column = "student_id" #Included columns, default all include-column = [] #Excluded columns, not by default exclude-column = [] # Schema, 1: Subobjects, 2 Index Flattening mode = 2 # Attribute object name, mode 1 valid, default to source table name hump structure property-name = "studentDetail" # Attribute prefix, mode 2 valid, default to source table name hump structure property-pre = "sd_" [output.config.routes.one-one.convert-column] introduce = "introduceInfo" [[output.config.routes.one-one]] match-schema = "test" match-table = "student_class" #foreign key column fk-column = "student_id" #Included columns, default all include-column = [] #Excluded columns, not by default exclude-column = [] # Schema, 1: Subobjects, 2 Index Flattening mode = 1 # Attribute object name, mode 1 valid, default to source table name hump structure property-name = "studentClass" # Attribute prefix, mode 2 valid, default to source table name hump structure property-pre = "sc_" [output.config.routes.one-one.convert-column] name = "className" [[output.config.routes.one-more]] match-schema = "test" match-table = "student_parent" #foreign key column fk-column = "student_id" #Included columns, default all include-column = [] #Excluded columns, not by default exclude-column = [] # Property object name, default to source table name hump structure property-name = "studentParent" [output.config.routes.one-more.convert-column] name = "parentName"
Compile to the project root directory make and execute the synchronization command
./bin/gravity -config ./docs/2.0/example-mysql2esmodel.toml
The index structure after synchronization is:
{ "state": "open", "settings": { "index": { "creation_date": "1567160065596", "number_of_shards": "1", "number_of_replicas": "0", "uuid": "noe_V-RdTr6QaFDy4fPRjA", "version": { "created": "7030199" }, "provided_name": "student_index" } }, "mappings": { "_doc": { "properties": { "birthday": { "type": "date" }, "studentParent": { "type": "nested", "properties": { "birthday": { "type": "date" }, "parentName": { "type": "text", "fields": { "keyword": { "ignore_above": 256, "type": "keyword" } } }, "create_time": { "type": "date" }, "sex": { "type": "long" }, "student_id": { "type": "long" }, "id": { "type": "long" } } }, "high": { "type": "long" }, "create_time": { "type": "date" }, "sex": { "type": "long" }, "studentName": { "type": "text", "fields": { "keyword": { "ignore_above": 256, "type": "keyword" } } }, "sd_student_id": { "type": "long" }, "introduceInfo": { "type": "text", "fields": { "keyword": { "ignore_above": 256, "type": "keyword" } } }, "sd_id": { "type": "long" }, "id": { "type": "long" }, "sd_mobile": { "type": "text", "fields": { "keyword": { "ignore_above": 256, "type": "keyword" } } }, "studentClass": { "properties": { "sc_id": { "type": "long" }, "className": { "type": "text", "fields": { "keyword": { "ignore_above": 256, "type": "keyword" } } }, "sc_student_count": { "type": "long" }, "sc_student_id": { "type": "long" } } } } } } }
The data samples are as follows:
{ "_index": "student_index", "_type": "_doc", "_id": "2", "_version": 5, "_score": 1, "_source": { "studentClass": { "className": "Two(2)class", "sc_id": 2, "sc_student_count": 12, "sc_student_id": 2 }, "sd_student_id": 2, "introduceInfo": "Introduction of Li Si", "sd_id": 2, "sd_mobile": "13333333333", "studentParent": [ { "birthday": "1981-06-06T00:00:00+08:00", "parentName": "Mother Li Si", "create_time": "2019-08-29T20:00:58+08:00", "sex": 2, "student_id": 2, "id": 4 }, { "birthday": "1979-03-03T00:00:00+08:00", "parentName": "Four Fathers of Li", "create_time": "2019-08-29T20:00:58+08:00", "sex": 1, "student_id": 2, "id": 3 } ], "birthday": "2010-03-05T00:00:00+08:00", "high": 176, "create_time": "2019-08-29T19:55:36+08:00", "sex": 2, "studentName": "Li Si", "id": 2 } }