1. Description
The description of each component is shown in the table below:
Component | Version | Servers | Installation tutorial | Role |
---|---|---|---|---|
canal | 1.1.5 | canal1, canal2, canal3 | canal HA installation + MySQL multi-node automatic switching | |
zookeeper | 3.6.3 | canal1, canal2, canal3 | ZooKeeper fully distributed installation | |
mysql | 8.0.25 | canal1, canal2 | MySQL master-master architecture setup | Source database |
mysql | 8.0.25 | canal3 | Install MySQL 8.0.25 on CentOS 7 | ClientAdapter remote configuration database |
postgresql | 13 | canal3 | Installing and configuring PostgreSQL 13 on CentOS 7 | Target database |
In PostgreSQL, create database db1 and table db1.public.tb1_1.
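A minimal sketch of that setup is shown below. Only the database name, table name, and the idA primary key (taken from tb1_1.yml later in this article) come from the source; the remaining columns are hypothetical placeholders.

```bash
# Sketch: create the target database and table; idA as primary key is taken
# from tb1_1.yml below, the other columns are hypothetical placeholders
psql -U postgres -c 'CREATE DATABASE db1;'
psql -U postgres -d db1 -c 'CREATE TABLE public.tb1_1 ("idA" int PRIMARY KEY, name varchar(64), c_time timestamp);'
```

The quotes around "idA" keep PostgreSQL from folding the identifier to lower case.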
2. canal.deployer configuration
- canal.properties configuration
```properties
canal.serverMode = tcp
```
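Because this deployment runs canal in HA across three nodes, canal.properties on every server also has to point at the ZooKeeper ensemble. A sketch; canal.zkServers is a standard canal.properties key, but the unpack path below is an assumption:

```bash
# Sketch: register the ZooKeeper ensemble in canal.properties on every node
# (canal.zkServers is a standard key; the unpack path is an assumption)
echo 'canal.zkServers = canal1:2181,canal2:2181,canal3:2181' >> /root/canal.deployer-1.1.5/conf/canal.properties
```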
- instance configuration
example_rdb_db1/instance.properties is configured to synchronize only the data of the db1 database, as follows:
```properties
#################################################
## mysql serverId , v1.0.26+ will autoGen
# canal2 is 1262 and canal3 is 1263
canal.instance.mysql.slaveId=1261

# enable gtid use true/false
canal.instance.gtidon=false

# position info
canal.instance.master.address=canal2:3306
canal.instance.master.journal.name=mysql-bin.000026
canal.instance.master.position=4
canal.instance.master.timestamp=
canal.instance.master.gtid=

# rds oss binlog
canal.instance.rds.accesskey=
canal.instance.rds.secretkey=
canal.instance.rds.instanceId=

# table meta tsdb info
canal.instance.tsdb.enable=true
#canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_tsdb
#canal.instance.tsdb.dbUsername=canal
#canal.instance.tsdb.dbPassword=canal

canal.instance.standby.address = canal1:3306
#canal.instance.standby.journal.name =
#canal.instance.standby.position =
#canal.instance.standby.timestamp =
#canal.instance.standby.gtid=

# username/password
canal.instance.dbUsername=canal
canal.instance.dbPassword=Canal_123
canal.instance.connectionCharset = UTF-8
# enable druid Decrypt database password
canal.instance.enableDruid=false
#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==

# table regex
canal.instance.filter.regex=db1\\.tb\\d_\\d
# table black regex
canal.instance.filter.black.regex=mysql\\.slave_.*
# table field filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.field=test1.t_product:id/subject/keywords,test2.t_company:id/name/contact/ch
# table field black filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.black.field=test1.t_product:subject/product_image,test2.t_company:id/name/contact/ch

# mq config
canal.mq.topic=example
# dynamic topic route by schema or table regex
#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
canal.mq.partition=0
# hash partition config
#canal.mq.partitionsNum=3
#canal.mq.partitionHash=test.table:id^name,.*\\..*
#canal.mq.dynamicTopicPartitionNum=test.*:4,mycanal:6
#################################################

# Heartbeat check needs to be turned on
canal.instance.detecting.enable = true
# Heartbeat check sql
canal.instance.detecting.sql = select 1
# Heartbeat check frequency
canal.instance.detecting.interval.time = 3
# Threshold of failed heartbeat checks; if exceeded, mysql link switching is triggered, e.g. switching to the standby machine to continue consuming binlog
canal.instance.detecting.retry.threshold = 3
# Whether to enable master/standby switching once heartbeat failures exceed the threshold
canal.instance.detecting.heartbeatHaEnable = true
```
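The journal.name/position values above are read off the master's current binlog coordinates, and the canal account needs binlog access. A hedged sketch of both steps; the GRANT follows canal's standard QuickStart privileges, and the password matches canal.instance.dbPassword above:

```bash
# Find the current binlog file and position on the master; the output fills
# canal.instance.master.journal.name / canal.instance.master.position
mysql -h canal2 -uroot -p -e 'SHOW MASTER STATUS;'

# Replication account with canal's standard QuickStart privileges;
# password matches canal.instance.dbPassword in instance.properties
mysql -h canal2 -uroot -p -e "CREATE USER 'canal'@'%' IDENTIFIED BY 'Canal_123'; GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';"
```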
3. Get ClientAdapter
The canal.adapter-1.1.5.tar.gz downloaded from GitHub has PostgreSQL compatibility problems, so compile the adapter from source instead. Please refer here.
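For reference, a typical source build is sketched below; the Maven flags follow canal's documented release build, but verify them against the guide linked above. Then upload the resulting tarball and unzip it:

```bash
# Sketch: build canal 1.1.5 from source (flags per canal's documented
# release build; verify against the linked guide)
git clone -b canal-1.1.5 https://github.com/alibaba/canal.git
cd canal
mvn clean install -Dmaven.test.skip -Denv=release
# the rebuilt canal.adapter-1.1.5.tar.gz then appears under target/
```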
```bash
[root@canal1 ~]# mkdir canal.adapter-1.1.5
[root@canal1 ~]# tar -zxvf canal.adapter-1.1.5.tar.gz -C canal.adapter-1.1.5
```
4. ClientAdapter configuration
4.1 conf/application.yml configuration
```yaml
server:
  port: 8081
logging:
  level:
    com.alibaba.otter.canal.client.adapter.rdb: DEBUG
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null

canal.conf:
  mode: tcp # tcp kafka rocketMQ rabbitMQ
  flatMessage: true
  zookeeperHosts: canal1:2181,canal2:2181,canal3:2181
  batchSize: 500
  syncBatchSize: 1000
  retries: -1 # -1 means infinite retry
  timeout:
  accessKey:
  secretKey:
  consumerProperties:
    # canal tcp consumer
    # canal.tcp.server.host: 127.0.0.1:11111
    canal.tcp.zookeeper.hosts: canal1:2181,canal2:2181,canal3:2181
    canal.tcp.batch.size: 500
    canal.tcp.username:
    canal.tcp.password:
    # kafka consumer
    kafka.bootstrap.servers: 127.0.0.1:9092
    kafka.enable.auto.commit: false
    kafka.auto.commit.interval.ms: 1000
    kafka.auto.offset.reset: latest
    kafka.request.timeout.ms: 40000
    kafka.session.timeout.ms: 30000
    kafka.isolation.level: read_committed
    kafka.max.poll.records: 1000
    # rocketMQ consumer
    rocketmq.namespace:
    rocketmq.namesrv.addr: 127.0.0.1:9876
    rocketmq.batch.size: 1000
    rocketmq.enable.message.trace: false
    rocketmq.customized.trace.topic:
    rocketmq.access.channel:
    rocketmq.subscribe.filter:
    # rabbitMQ consumer
    rabbitmq.host:
    rabbitmq.virtual.host:
    rabbitmq.username:
    rabbitmq.password:
    rabbitmq.resource.ownerId:

  srcDataSources:
    defaultDS:
      url: jdbc:mysql://canal1:3306/default_test?useUnicode=true # no data in the default_test database
      username: root
      password: Root_123
  canalAdapters:
  - instance: example_rdb_db1 # canal instance name or mq topic name
    groups: # an instance can be consumed by multiple groups in parallel
    - groupId: g1
      outerAdapters:
      - name: logger
#      - name: rdb
#        key: mysql1
#        properties:
#          jdbc.driverClassName: com.mysql.jdbc.Driver
#          jdbc.url: jdbc:mysql://127.0.0.1:3306/mytest2?useUnicode=true
#          jdbc.username: root
#          jdbc.password: 121212
#      - name: rdb
#        key: oracle1
#        properties:
#          jdbc.driverClassName: oracle.jdbc.OracleDriver
#          jdbc.url: jdbc:oracle:thin:@localhost:49161:XE
#          jdbc.username: mytest
#          jdbc.password: m121212
      - name: rdb
        key: postgres1 # unique value, corresponds to outerAdapterKey in the table mapping configuration
        properties:
          jdbc.driverClassName: org.postgresql.Driver
          jdbc.url: jdbc:postgresql://canal3:5432/db1
          jdbc.username: postgres
          jdbc.password: postgres123
          threads: 2
          commitSize: 3000
#      - name: hbase
#        properties:
#          hbase.zookeeper.quorum: 127.0.0.1
#          hbase.zookeeper.property.clientPort: 2181
#          zookeeper.znode.parent: /hbase
#      - name: es
#        hosts: 127.0.0.1:9300 # 127.0.0.1:9200 for rest mode
#        properties:
#          mode: transport # or rest
#          # security.auth: test:123456 # only used for rest mode
#          cluster.name: elasticsearch
#      - name: kudu
#        key: kudu
#        properties:
#          kudu.master.address: 127.0.0.1 # ',' split multi address
```
4.2 conf/rdb configuration
The adapter automatically loads every table mapping configuration file ending in .yml under conf/rdb.
- db1.yml
```yaml
#dataSourceKey: defaultDS
#destination: example
#groupId: g1
#outerAdapterKey: mysql1
#concurrent: true
#dbMapping:
#  database: mytest
#  table: user
#  targetTable: mytest2.user
#  targetPk:
#    id: id
#  mapAll: true
#  targetColumns:
#    id:
#    name:
#    role_id:
#    c_time:
#    test1:
#  etlCondition: "where c_time>={}"
#  commitBatch: 3000 # size of batch commit

## Mirror schema synchronize config
dataSourceKey: defaultDS
destination: example_rdb_db1
groupId: g1
outerAdapterKey: postgres1 # corresponds to the outerAdapters key in application.yml
concurrent: true # tables synchronized in parallel must have an immutable primary key that is not a foreign key of another synchronized table
dbMapping:
  mirrorDb: true
  database: db1 # source and target database name
```
- tb1_1.yml
Manually run the curl ETL command to fully synchronize the table defined in this file.
```yaml
dataSourceKey: defaultDS
destination: example_rdb_db1
groupId: g1
outerAdapterKey: postgres1
concurrent: true
dbMapping:
  database: db1
  table: tb1_1
  targetTable: tb1_1
  targetPk:
    idA: idA
  mapAll: true
  targetColumns:
    id:
    name:
    role_id:
    c_time:
    test1:
  etlCondition: "where c_time>={}"
  commitBatch: 3000 # size of batch commit

## Mirror schema synchronize config
#dataSourceKey: defaultDS
#destination: example_rdb_db1
#groupId: g1
#outerAdapterKey: postgres1 # corresponds to the outerAdapters key in application.yml
#concurrent: true # tables synchronized in parallel must have an immutable primary key that is not a foreign key of another synchronized table
#dbMapping:
#  mirrorDb: true
#  database: db1 # source and target database name
```
5. Start and stop
```bash
[root@canal1 ~]# /root/canal.adapter-1.1.5/bin/restart.sh
[root@canal1 ~]# /root/canal.adapter-1.1.5/bin/stop.sh
```
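After starting, it is worth tailing the adapter log to confirm that the instance subscribed; a sketch, assuming the default log layout under the unpack directory used in this article:

```bash
# Sketch: follow the adapter log (path assumes the default log layout
# under the unpack directory used in this article)
tail -f /root/canal.adapter-1.1.5/logs/adapter/adapter.log
```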
6. REST interface command
- Query destinations:
```bash
[root@canal1 ~]# curl http://canal1:8081/destinations
[{"destination":"example_rdb_db1","status":"on"}]
```
- Query the switch status of a destination
```bash
[root@canal1 ~]# curl http://canal1:8081/syncSwitch/example_rdb_db1
{"stauts":"on"}
```

(the misspelled "stauts" key is returned verbatim by the adapter)
- Manually trigger a full ETL:
```bash
[root@canal1 ~]# curl http://canal1:8081/etl/rdb/postgres1/tb1_1.yml -X POST -d ""
{"succeeded":true,"resultMessage":"Import RDB Data: 3"}
```
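Since tb1_1.yml defines etlCondition: "where c_time>={}", the ETL endpoint can also be called with a params form field to fill the "{}" placeholder; a sketch following the parameter style in canal's adapter documentation, with a made-up timestamp:

```bash
# Sketch: pass a value for the etlCondition "{}" placeholder
# (parameter style per canal's adapter documentation; timestamp is made up)
curl http://canal1:8081/etl/rdb/postgres1/tb1_1.yml -X POST -d "params=2021-01-01 00:00:00"
```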
- Toggle a client's incremental synchronization switch:
```bash
[root@canal1 ~]# curl http://canal1:8081/syncSwitch/example_rdb_db1/off -X PUT
{"code":20000,"message":"example: example_rdb_db1 Closing synchronization succeeded"}
```
- View the row count of the target table:
```bash
[root@canal1 ~]# curl http://canal1:8081/count/rdb/postgres1/tb1_1.yml
{"count":3,"targetTable":"tb1_1"}
```
7. Data synchronization mode
- Incremental synchronization: runs continuously through the db1.yml mirror configuration.
- Full synchronization of a single table (see the sketch after this list):
  - Move tb1_1.yml into the conf/rdb directory
  - Manually run the curl ETL full synchronization command
  - Move tb1_1.yml back out of the conf/rdb directory
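A sketch of that one-off backfill, assuming the file locations used in this article:

```bash
# Sketch: one-off full backfill of tb1_1 (paths assume the layout used in
# this article); incremental sync continues via db1.yml afterwards
mv /root/tb1_1.yml /root/canal.adapter-1.1.5/conf/rdb/
curl http://canal1:8081/etl/rdb/postgres1/tb1_1.yml -X POST -d ""
mv /root/canal.adapter-1.1.5/conf/rdb/tb1_1.yml /root/
```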
8. Outstanding bugs
- Running the adapter's bin/stop.sh deletes the example_rdb_db1/1001/cursor node from ZooKeeper, and after a restart the example_rdb_db1/1001/cursor node is still absent; only after inserting a row into the MySQL source database does the example_rdb_db1/1001/cursor node hold data again.
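The cursor node can be inspected from any ZooKeeper node; a sketch, assuming canal's default /otter/canal znode layout:

```bash
# Sketch: inspect the adapter's cursor znode (path assumes canal's default
# /otter/canal/destinations/<destination>/<clientId>/cursor layout)
zkCli.sh -server canal1:2181 get /otter/canal/destinations/example_rdb_db1/1001/cursor
```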
- A MySQL column was initially named scoreA; the table was then dropped and recreated with the column renamed to ageA, then dropped and recreated again with scoreA. The corresponding PostgreSQL column is also scoreA. Incremental synchronization then reports the following error:
```
c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Target column: ageA not matched
java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Target column: ageA not matched
```
- After manually running the curl ETL full synchronization command, the MySQL tables tb1_1 and tb1_2 contain 5 and 1 rows respectively, yet the synchronized row counts are 3 and 3.
- Saving the conf/application.yml configuration to MySQL (the remote configuration feature) does not work completely.
- Because of the above, we abandoned the canal ClientAdapter approach and implemented the synchronization manually, so this article is for reference only.