Log mining method of Henkel database

Posted by masgas on Tue, 02 Nov 2021 02:58:22 +0100

catalogue
environment
Document purpose
detailed information

environment
System platform: Linux x86-64 Red Hat Enterprise Linux 7
Version: 4.3.4.7
Document purpose
This paper mainly introduces how to mine the database wal log through walminer, including the installation and use of wal.

detailed information
1, Introduction to walminer

1. Brief introduction

Walminer is an open source software of Highgo. Walminer is a tool for parsing executed SQL statements from PostgreSQL WAL(write ahead logs) logs and generating corresponding undo SQL statements.

2. Version control

Support range: all versions of HGDB are supported
Versions before PG9.5.0 have not been tested
Walminer is used for PG9.5.0~PG10.X_ 10_ X
Versions after PG11 use WalMiner_11_X
Version 10.0.1
1.WalMiner supports resolving any wal log level above the minimum level.
2. There is no need to set the table to IDENTITY FULL mode.
3. Add analysis of wal records modified in the system table.
4. bug modification of mismatch between relfilenode and other database parsing
Version 11.0.1
1.WalMiner supports resolving any wal log level above the minimum level.
2. There is no need to set the table to IDENTITY FULL mode.
3. Add analysis of wal records modified in the system table.
4. bug modification of mismatch between relfilenode and other database parsing
5.11 supporting modification

3. Download address:
WalMiner

4. Use premise

The database log level configuration needs to be greater than minimum. If you want to do the most complete log mining, it is recommended to set it to logical.

2, Installation of walminer

1. Determine database version

highgo=# select version();
                                                 version                                                 
----------------------------------------------------------------------------------------------------------
 PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (NeoKylin 4.8.5-16), 64-bit
(1 row)
pg10.5,use walminer_10_x edition       

2. Database operation parameter setting

①Create archive directory
[root@mode2 pg_wal]# mkdir /hgdbbak/archive/ -p
②Modify database parameters                

The following three parameters must be set. According to the library log level configuration, they need to be greater than minimal                

wal_level Divided into minimal, replica, or logical,If you want to do the most complete log mining, it is recommended to set it to logical.                 

highgo=# alter system set wal_level = 'replica';
highgo=# alter system set archive_mode = on;
highgo=# alter system set archive_directory = '/hgdbbak/archive/';                

③Restart the database to take effect                

[root@mode2 pg_wal]# pg_ctl restart     

3. Install walminer

①To be downloaded walniner Folder code uploaded to the database contrib Directory, execute make;make install
[root@mode2 walminer]# pwd
/opt/HighGo4.3.4.7-see/share/postgresql/contrib/walminer
[root@mode2 walminer]# ls
datadictionary.c logminer.c Makefile pg_logminer.c walminer--1.0.sql xlogminer_contents.c xlogreader_logmine
datadictionary.h logminer.h organizsql.c pg_logminer.h walminer.control xlogminer_contents.h
[root@mode2 walminer]# make
Makefile:16: ../../src/Makefile.global: No such file or directory
Makefile:17: /contrib/contrib-global.mk: No such file or directory
make: *** No rule to make target `/contrib/contrib-global.mk'. Stop.
               

②Modify the file information according to the error information during compilation
[root@mode2 walminer]# vi Makefile                
top_builddir = /opt/HighGo4.3.4.7-see/lib/postgresql/pgxs    ----Modify this directory
include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
               
[root@mode2 walminer]# find / -name Makefile.global
/opt/HighGo4.3.4.7-see/lib/postgresql/pgxs/src/Makefile.global                
[root@mode2 walminer]# make
Makefile:17: /opt/HighGo4.3.4.7-see/lib/postgresql/pgxs//contrib/contrib-global.mk: No such file or directory
make: *** No rule to make target `/opt/HighGo4.3.4.7-see/lib/postgresql/pgxs//contrib/contrib-global.mk'.  Stop.                
③Configure compilation parameters
[root@mode2 walminer]# make USE_PGXS=1
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o pg_logminer.o pg_logm
pg_logminer.c: In function 'logminer_elog':
pg_logminer.c:151:2: warning: function might be possible candidate for 'gnu_printf' format attribute [-Wsuggest-attrib
  vfprintf(tempFileOpen, _(fmt), args);
  ^
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o logminer.o logminer.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o organizsql.o organizs
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o xlogreader_logminer.ominer.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o datadictionary.o data
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I./ -I/opt/HighGo4.3.4.7-see/include/postg/opt/HighGo4.3.4.7-see/include/postgresql/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o xlogminer_contents.o nts.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attributeity -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -o walminer.so pg_logminer.o logminer.ologreader_logminer.o datadictionary.o xlogminer_contents.o -L/opt/HighGo4.3.4.7-see/lib    -Wl,--as-needed -Wl,-rpath,.4.7-see/lib',--enable-new-dtags 
④install walminer
                   

[root@mode2 walminer]# make USE_PGXS=1 install
/usr/bin/mkdir -p '/opt/HighGo4.3.4.7-see/lib/postgresql'
/usr/bin/mkdir -p '/opt/HighGo4.3.4.7-see/share/postgresql/extension'
/usr/bin/mkdir -p '/opt/HighGo4.3.4.7-see/share/postgresql/extension'
/usr/bin/install -c -m 755  walminer.so '/opt/HighGo4.3.4.7-see/lib/postgresql/walminer.so'
/usr/bin/install -c -m 644 .//walminer.control '/opt/HighGo4.3.4.7-see/share/postgresql/extension/'
/usr/bin/install -c -m 644 .//walminer--1.0.sql  '/opt/HighGo4.3.4.7-see/share/postgresql/extension/'                    

⑤Create in database walminer extend                    

[root@mode2 walminer]# psql -U sysdba -d highgo
NOTICE: 
-------------------------------------------
Login User: sysdba
Login time: 2020-05-18 10:28:15.806217+08
Login Address: [local]
Last Login Status: SUCCESS
Login Failures: 0
Valied Until: infinity
-------------------------------------------
                   

psql (4.3.4.7)
Type "help" for help.                    

highgo=# create extension walminer;
CREATE EXTENSION

3, Use restrictions

1. Only DML statements can be parsed, and DDL statements are not processed.

Future changes: the parsing of DDL statements has been put into todolist, and various DDL statements may be supported step by step.

2. After such DDL statements as delete table, truncate table, change table space, change table field type and vacuum full are executed, the DML statements related to this table before the DDL statement will not be resolved.
Countermeasures: it is recommended to save a data dictionary before changing the table structure to ensure that the historical wal log can be parsed.
Future changes: now we have considered adding the function of saving data dictionary in walminer.

3. The parsing result depends on the data dictionary. (for example: create the table t1 with the owner of user1, but change the owner to user2 in the middle. In the parsing result, all T1 related operation owners will be marked as user2).
Countermeasures: it is recommended to save a data dictionary before changing the table structure to ensure that the historical wal log can be parsed.
Future changes: now we have considered adding the function of saving data dictionary in walminer.

4. The ctid attribute of the undo field in the parsing result is the value at the time of change. If the ctid is changed due to operations such as vacuum, this value will be inaccurate. For data that may have duplicate rows, we need to determine the number of tuple s corresponding to undo through this value, which does not mean that the undo statement can be executed directly.

5. After the DDL statement of the table field drop is executed, the values related to the field before the DDL statement will be resolved to the form of encode ('AD976BC56F ', hex). In addition, the user-defined type will also be resolved to this form.

6. Only wal files consistent with the timeline of the data dictionary can be parsed.

7.WalMiner is a personal product and has not been fully tested for the time being.

8. It is not recommended to use walminer to parse the wal log generated by a large number of copy statements (inserting a large number of data rows in the same transaction), which will lead to low efficiency and high memory consumption in the parsing process.

9. Parameter interpretation

--Parsing syntax:                    

select walminer_start('START_TIMSTAMP','STOP_TIMESTAMP','START_XID','STOP_XID')                    

--If all logs are analyzed:                
select   walminer_start('null','null',0,0);                
--Output system table modification results to $PGDATA/walminer/temp Below:                
select   walminer_start('null','null',0,0,true);          

START_TIMESTAMP: Specifies the earliest record entry in the output result, that is, the analysis data is output from this time; If the parameter value is empty, the output starts with the earliest data in the analysis log list; If the specified time of the parameter value is not included in the analyzed xlog list, that is, it is found through analysis that all are earlier than the specified time of the parameter, a null value is returned.

STOP_TIMESTAMP: Specifies the latest record entry in the data result, that is, if the output result is greater than this time, the analysis will be stopped and there is no need to continue to output; If the parameter value is empty, start_ All logs from timestamp are analyzed and output.
START_XID: function and start_ The same as timestamp, specify the starting XID value;
STOP_XID: function and stop_ Same as timestamp, specify the XID value of the end
Note: only one of the two groups of parameters can be valid input, otherwise an error will be reported.

4, Scenario 1 direct parsing of WAL log generation library

1. New table and test data

highgo=# create table t2(i int,j int, k varchar);               
CREATE TABLE
highgo=#
highgo=# insert into t2 values(1,1,'qqqqqq');
INSERT 0 1
highgo=# insert into t2 values(2,2,'wwwwww');
INSERT 0 1
highgo=# insert into t2 values(3,3,'eeeee');
INSERT 0 1
highgo=#  update t2 set k = '1111qqqqq' where i = 1;
UPDATE 1
highgo=# delete from t2 where j = 2;
DELETE 1
highgo=# insert into t2 values(4,4,'44444rrrrrr');                    

INSERT 0 1                    

2. View the wal log currently in use and switch the wal log

highgo=# select pg_walfile_name(pg_current_wal_lsn());
     pg_walfile_name     
--------------------------
 000000010000000000000002
(1 row)                     

highgo=# select pg_switch_wal();                    

 pg_switch_wal                     

---------------
                   

 0/2034158
(1 row)                

3. Load the wal log to the parser, and add PG this time_ A single wal log in the wal directory. Note that it is a relative path. You can add a single wal log or load PG_ All logs in the wal directory or wal log archive directory

highgo=# select walminer_wal_add('pg_wal/000000010000000000000002');
NOTICE:  Get data dictionary from current database.
  walminer_wal_add 
--------------------
 1 file add success
(1 row)                    


                   

Example 1: Loading wal Single in archive directory wal Log, using absolute path                    

highgo=# select walminer_wal_add('/hgdbbak/archive/000000010000000000000003');
NOTICE:  Get data dictionary from current database.
  walminer_wal_add 
--------------------
 1 file add success
(1 row)                    


                   

Example 2: add entire wal Archive logs in the directory, using absolute paths                    

highgo=# select walminer_wal_add('/hgdbbak/archive');                    

  walminer_wal_add 
--------------------
 2 file add success
(1 row)
                   

4. View the log loaded in the parser
highgo=# select walminer_wal_list();
walminer_wal_list

(/opt/HigoGO4.3.4.7-see/data/pg_wal/000000010000000000000003)
(1 row)
5. Perform parsing

highgo=# select walminer_start('NULL','NULL',0,0);
NOTICE:  Change Wal Segment To:pg_wal/000000010000000000000002
NOTICE:  Change Wal Segment To:(null)
   walminer_start   
---------------------
 walminer sucessful!
(1 row)            

6. View parsing results

highgo=# \x
Expanded display is on.
highgo=# select record_database,record_user,op_text,op_undo from walminer_contents;
-[ RECORD 1 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(1, 1, 'qqqqqq');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=1 AND "j"=1 AND "k"='qqqqqq' AND ctid = '(0,1)';
-[ RECORD 2 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(2, 2, 'wwwwww');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=2 AND "j"=2 AND "k"='wwwwww' AND ctid = '(0,2)';
-[ RECORD 3 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(3, 3, 'eeeee');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=3 AND "j"=3 AND "k"='eeeee' AND ctid = '(0,3)';
-[ RECORD 4 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | UPDATE "public"."t2" SET "k" = '1111qqqqq' WHERE "i"=1 AND "j"=1 AND "k"='qqqqqq';
op_undo         | UPDATE "public"."t2" SET "k" = 'qqqqqq' WHERE "i"=1 AND "j"=1 AND "k"='1111qqqqq' AND ctid = '(0,4)'
-[ RECORD 5 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | DELETE FROM "public"."t2" WHERE "i"=2 AND "j"=2 AND "k"='wwwwww';
op_undo         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(2, 2, 'wwwwww');
-[ RECORD 6 ]---+-----------------------------------------------------------------------------------------------------
record_database | highgo
record_user     | sysdba
op_text         | INSERT INTO "public"."t2"("i", "j", "k") VALUES(4, 4, '44444rrrrrr');
op_undo         | DELETE FROM "public"."t2" WHERE "i"=4 AND "j"=4 AND "k"='44444rrrrrr' AND ctid = '(0,5)';                
highgo=# \q                

Note: you can see the insert statement and corresponding undo statement just executed by parsing the content

7. Stop parsing and free memory

test=# select walminer_stop();
   walminer_stop  
-------------------
 walminer cleaned!
(1 row)      

5, Perform wal log parsing from non wal generated databases

1. Create extension in production Library

test=# create extension xlogminer;      

2. Create data dictionary

test=# select walminer_build_dictionary('/hgdbbak/test');
 walminer_build_dictionary 
---------------------------
 Dictionary build success!
(1 row)       

3. Create an extension in the test library

highgo=# create extension walminer ;
CREATE EXTENSION  

4. Load the dictionary library, move the dictionary created in the production library to / opt, and load the dictionary library

highgo=# select walminer_load_dictionary('/opt/test')
highgo-# ;
 walminer_load_dictionary 
--------------------------
 Dictionary load success!
(1 row) 

5. Load wal log

highgo=# select walminer_wal_add('/hgdbbak/archive');
  walminer_wal_add 
--------------------
 3 file add success
(1 row)                
highgo=# select walminer_wal_list();
              walminer_wal_list             
---------------------------------------------
 (/hgdbbak/archive/000000010000000000000003)
 (/hgdbbak/archive/000000010000000000000004)
 (/hgdbbak/archive/000000010000000000000005)
(3 rows)            

6. Start parsing wal log

highgo=# select walminer_start('null','null',900,1000);
NOTICE: Change Wal Segment To:/hgdbbak/archive/000000010000000000000003
NOTICE: wal record after time 2020-05-18 11:02:36+08 or 0/300d2e8 will be analyse completely
NOTICE: Change Wal Segment To:/hgdbbak/archive/000000010000000000000004
NOTICE: Change Wal Segment To:/hgdbbak/archive/000000010000000000000005
NOTICE: Change Wal Segment To:(null)
   walminer_start   
---------------------
 walminer sucessful!
(1 row)          

7. View resolved content

highgo=# \x
Expanded display is on.
highgo=# select * from walminer_contents ;
-[ RECORD 1 ]-----+-------------------------------------------------------------------------------------------------
----------------------------------------------
xid | 983
virtualxid | 1
timestamptz | 2020-05-18 13:42:56.797565+08
record_database | test
record_user | sysdba
record_tablespace | pg_default
record_schema | public
op_type | INSERT
op_text | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(2, 'be236f80d7aa78bf06ca288edbdc03
3e', 3001);
op_undo | DELETE FROM "public"."gt_test1" WHERE "id"=2 AND "passwd"='be236f80d7aa78bf06ca288edbdc033e' AND
 "num"=3001 AND ctid = '(0,1)';
-[ RECORD 2 ]-----+-------------------------------   

Topics: Database PostgreSQL