Detailed description of sersync and rsync

Posted by vurentjie on Mon, 31 Jan 2022 15:06:49 +0100

1.inotify+rsync

If you only need to synchronize data periodically, you can run rsync from a scheduled task on the client, but the time granularity of scheduled tasks cannot meet the requirements of real-time synchronization. Since version 2.6.13, the Linux kernel has provided the inotify file system monitoring mechanism, and real-time synchronization can be achieved by combining rsync with inotify.

There are several tools built on inotify: inotify itself (through the inotify-tools package), sersync and lsyncd. sersync was developed by Zhou Yang of Jinshan; it overcomes the shortcomings of inotify and provides several plug-ins as optional tools. This article first introduces the usage of inotify and its shortcomings, and then uses those shortcomings to introduce sersync and its usage.

1.1 installing inotify-tools

inotify is provided by the inotify-tools package. Before installing inotify-tools, make sure that the kernel version is 2.6.13 or later and that the following three items exist in the /proc/sys/fs/inotify directory, which indicates that the system supports inotify monitoring. The meaning of these three items is briefly explained below.

[root@node1 tmp]# ll /proc/sys/fs/inotify/
total 0
-rw-r--r-- 1 root root 0 Feb 11 19:57 max_queued_events
-rw-r--r-- 1 root root 0 Feb 11 19:57 max_user_instances
-rw-r--r-- 1 root root 0 Feb 11 19:57 max_user_watches

inotify-tools is available from the EPEL repository, or you can download the source package and compile it.

inotify-tools source package address: https://cloud.github.com/downloads/rvoicilas/inotify-tools/inotify-tools-3.14.tar.gz

The following is the compilation and installation process:

tar xf inotify-tools-3.14.tar.gz
cd inotify-tools-3.14
./configure --prefix=/usr/local/inotify-tools-3.14
make && make install
ln -s /usr/local/inotify-tools-3.14 /usr/local/inotify

The inotify-tools package provides only two commands.

[root@xuexi ~]# rpm -ql inotify-tools | grep bin/
/usr/bin/inotifywait
/usr/bin/inotifywatch

The inotifywait command waits for file changes, so it can be used for monitoring; it is the core command of inotify-tools. inotifywatch collects file system statistics, such as how many inotify events have occurred or how many times a file has been accessed, and is generally not used.

Here are the kernel parameters related to inotify.

(1)./proc/sys/fs/inotify/max_queued_events: the maximum number of events that can be queued for an inotify instance, allocated when inotify_init is called. Events beyond this value are discarded, but a Q_OVERFLOW event is triggered.

(2)./proc/sys/fs/inotify/max_user_instances: the maximum number of inotify instances that each real user can create.

(3)./proc/sys/fs/inotify/max_user_watches: the upper limit on the number of watches that each real user can create, that is, the maximum number of directories and files that the user's inotify instances can monitor. If the number of monitored files is huge, you need to increase this value appropriately.

For example:

[root@xuexi ~]# echo 30000000 > /proc/sys/fs/inotify/max_user_watches
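
Before raising the limit, it can help to estimate how many watches a tree needs. The following is only a rough sketch: it assumes that a recursive inotifywait places about one watch on each directory, and the function name count_watch_dirs is made up for illustration.

```shell
#!/bin/bash
# count_watch_dirs DIR: print the number of directories under DIR (DIR included).
# Assumption: a recursive inotifywait needs roughly one watch per directory.
count_watch_dirs() {
    find "$1" -type d | wc -l | tr -d ' '
}

# Example (not executed here):
#   count_watch_dirs /www
#   cat /proc/sys/fs/inotify/max_user_watches
```

If the first number approaches the second, the limit should be raised. A value written with echo is lost on reboot; the same setting is available as the sysctl key fs.inotify.max_user_watches, which can be placed in /etc/sysctl.conf.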

1.2 inotifywait command and event analysis

Options for inotifywait command:

-m: Monitor continuously. Without this option, inotifywait exits after the first event
-r: Recursive monitoring: monitor everything in the directory, including subdirectories. Recursive monitoring may exceed max_user_watches, in which case that value needs to be increased appropriately
@<file>: When a directory is monitored recursively, exclude the given file from monitoring. Whether the file path is relative or absolute depends on whether the monitored directory was given as a relative or absolute path
-q: (--quiet) Quiet mode, so that irrelevant information is not output
-e: Specify the events to monitor. Commonly monitored events are delete,create,attrib,modify,close_write
--exclude <pattern>: Do not monitor files matching the pattern, case sensitive
--excludei <pattern>: Do not monitor files matching the pattern, case insensitive
--timefmt: The time format used in the output after an event is triggered; optional. It is generally set to [--timefmt '%Y/%m/%d %H:%M:%S']
--format: User-defined output format, such as [--format '%w%f %e%T']
  %w: The monitored path that generated the event, which is not necessarily the specific file where the event occurred. For example, when a directory is monitored recursively and a file in it generates an event, the directory is output rather than the specific file
  %f: If a directory is monitored, the name of the specific file that generated the event. In all other cases, an empty string
  %e: The name of the generated event
  %T: The current time in the format defined by "--timefmt". "--timefmt" must be defined at the same time
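
To show how a custom --format can be consumed by a script, here is a minimal sketch that splits one output line into its event list and its path. It assumes the format '%e %w%f' (event names first, so that the path may contain spaces); the function and variable names are hypothetical.

```shell
#!/bin/bash
# parse_event LINE: split a line produced with --format '%e %w%f'
# into the globals EVENTS (comma-separated event names) and EVENT_PATH.
parse_event() {
    EVENTS=${1%% *}      # text before the first space
    EVENT_PATH=${1#* }   # everything after the first space (may contain spaces)
}

# has_event NAME: succeed if NAME is one of the names in $EVENTS.
has_event() {
    case ",$EVENTS," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

In a real pipeline, each line read from inotifywait -mrq --format '%e %w%f' would be passed to parse_event, after which has_event DELETE or has_event CLOSE_WRITE can select the action to take.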

inotifywait -e monitorable events:

access: File accessed
modify: File written
attrib: Metadata modified, including permissions, timestamps, extended attributes, and so on
close_write: A file that was opened for writing is closed
close_nowrite: A file that was opened read-only is closed, that is, the file was opened only for reading and closed after reading
close: The combination of close_write and close_nowrite. No matter how the file was opened, closing it triggers this event
open: File opened
moved_to: A file or directory is moved into the monitored directory, or moved within the monitored directory
moved_from: A file or directory is moved out of the monitored directory, or moved within the monitored directory
move: The combination of moved_to and moved_from
move_self: The monitored file or directory itself is moved. After the move, it is no longer monitored
create: A file or directory is created in the monitored directory
delete: A file or directory is deleted in the monitored directory
delete_self: The monitored file or directory itself is deleted. After deletion, it is no longer monitored
unmount: The file system containing the monitored directory is unmounted. After the unmount, the directory is no longer monitored
isdir: The event concerns a directory

Here are some examples:

[root@xuexi ~]# mkdir /longshuai

[root@xuexi ~]# inotifywait -m /longshuai   # The directory is monitored in the foreground mode. Since no monitored events are specified, all events are monitored
Setting up watches.
Watches established.

Open other sessions and perform some operations on the monitored directory to see what events will be triggered by each operation.

[root@xuexi ~]# cd  /longshuai    # Entering the directory does not trigger any events

(1). Create a file in the directory. This triggers the create, open, attrib, close_write and close events.

[root@xuexi longshuai]# touch a.log

/longshuai/ CREATE a.log
/longshuai/ OPEN a.log
/longshuai/ ATTRIB a.log
/longshuai/ CLOSE_WRITE,CLOSE a.log

If you create a directory, there are far fewer events triggered.

[root@xuexi longshuai]# mkdir b

/longshuai/ CREATE,ISDIR b

ISDIR indicates that the object that generated the event is a directory.

(2). Modify the file attributes and trigger the attrib event.

[root@xuexi longshuai]# chown 666 a.log

/longshuai/ ATTRIB a.log

(3). View the file with cat. This triggers the open, access, close_nowrite and close events.

[root@xuexi longshuai]# cat a.log

/longshuai/ OPEN a.log
/longshuai/ ACCESS a.log
/longshuai/ CLOSE_NOWRITE,CLOSE a.log

(4). Append to, write to or truncate the file. This triggers the open, modify, close_write and close events.

[root@xuexi longshuai]# echo "haha" >> a.log

/longshuai/ OPEN a.log
/longshuai/ MODIFY a.log
/longshuai/ CLOSE_WRITE,CLOSE a.log

(5). Open a file with vim and modify it. Temporary files are involved, so there are many events.

[root@xuexi longshuai]# vim a.log

/longshuai/ OPEN,ISDIR
/longshuai/ CLOSE_NOWRITE,CLOSE,ISDIR
/longshuai/ OPEN,ISDIR
/longshuai/ CLOSE_NOWRITE,CLOSE,ISDIR
/longshuai/ OPEN a.log
/longshuai/ CREATE .a.log.swp
/longshuai/ OPEN .a.log.swp
/longshuai/ CREATE .a.log.swx
/longshuai/ OPEN .a.log.swx
/longshuai/ CLOSE_WRITE,CLOSE .a.log.swx
/longshuai/ DELETE .a.log.swx
/longshuai/ CLOSE_WRITE,CLOSE .a.log.swp
/longshuai/ DELETE .a.log.swp
/longshuai/ CREATE .a.log.swp
/longshuai/ OPEN .a.log.swp
/longshuai/ MODIFY .a.log.swp
/longshuai/ ATTRIB .a.log.swp
/longshuai/ CLOSE_NOWRITE,CLOSE a.log
/longshuai/ OPEN a.log
/longshuai/ CLOSE_NOWRITE,CLOSE a.log
/longshuai/ MODIFY .a.log.swp
/longshuai/ CREATE 4913
/longshuai/ OPEN 4913
/longshuai/ ATTRIB 4913
/longshuai/ CLOSE_WRITE,CLOSE 4913
/longshuai/ DELETE 4913
/longshuai/ MOVED_FROM a.log
/longshuai/ MOVED_TO a.log~
/longshuai/ CREATE a.log
/longshuai/ OPEN a.log
/longshuai/ MODIFY a.log
/longshuai/ CLOSE_WRITE,CLOSE a.log
/longshuai/ ATTRIB a.log
/longshuai/ ATTRIB a.log
/longshuai/ MODIFY .a.log.swp
/longshuai/ DELETE a.log~
/longshuai/ CLOSE_WRITE,CLOSE .a.log.swp
/longshuai/ DELETE .a.log.swp

Here "ISDIR" marks an event on a directory. Also note that during the vim session, several temporary files (.swp, .swx and the backup file ending in ~) generate events as well. In real applications, events related to these temporary files should not be monitored.
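
One way to ignore such events in a script is to test the file name before acting on it. The sketch below is based only on the names that appear in the trace above (.swp, .swx, the ~ backup and the file 4913); the function name is hypothetical, and other editors or vim settings may use different names.

```shell
#!/bin/bash
# is_editor_temp PATH: succeed if the file name looks like one of the
# vim temporary files seen in the trace above.
is_editor_temp() {
    local name=${1##*/}          # strip the directory part
    case "$name" in
        .*.sw?) return 0 ;;      # swap files such as .a.log.swp / .a.log.swx
        *~)     return 0 ;;      # backup copy such as a.log~
        4913)   return 0 ;;      # short-lived file created during the save
        *)      return 1 ;;
    esac
}
```

A monitoring loop could call is_editor_temp "$path" && continue before triggering any action for the event.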

(6). Copy a file into the directory. This triggers the create, open, modify, close_write and close events. It is basically the same as creating a new file.

[root@xuexi longshuai]# cp /bin/find .

/longshuai/ CREATE find
/longshuai/ OPEN find
/longshuai/ MODIFY find
/longshuai/ MODIFY find
/longshuai/ CLOSE_WRITE,CLOSE find

(7). Move a file into and out of the directory.

[root@xuexi longshuai]# mv /tmp/after.log /longshuai

/longshuai/ MOVED_TO after.log

[root@xuexi longshuai]# mv /longshuai/after.log /tmp

/longshuai/ MOVED_FROM after.log

(8). Delete a file and trigger the delete event.

[root@xuexi longshuai]# rm -f a.log

/longshuai/ DELETE a.log

From the above test results, we can see that many actions involve the close event, and most of them are accompanied by the close_write event. Therefore, in most cases there is no real need to monitor the open, modify and close events when defining monitoring events. For close in particular, it is enough to monitor its sub-events close_write and close_nowrite. In general, inotify is used to monitor the addition, deletion and modification of files, not access to them, so monitoring close_write alone is usually sufficient.

In many cases, the operation to run after an event is triggered is defined per file. For example, if file a is monitored for changes (no matter what kind), operation A is executed immediately. However, a single operation on a file often triggers multiple events. For example, viewing a file with cat triggers the open, access, close_nowrite and close events, so operation A is likely to be executed repeatedly. In the following example, the echo action is executed whenever the keyword a.log appears in the last line of /var/log/messages.

# no -m here: inotifywait must exit after each modify event so that the loop body runs
while inotifywait -q -e modify /var/log/messages; do
  if tail -n1 /var/log/messages | grep 'a\.log'; then
    echo "haha"
  fi
done

Based on the above considerations, it is recommended to define the corresponding operations for the close_write, moved_to, moved_from, delete and isdir events of the monitored object, because they do not duplicate each other. (What is mainly wanted is the combination of create and isdir, but that combination cannot be specified as a whole, so only isdir is monitored.) If necessary, you can add other events and define their operations separately. For example:

[root@xuexi tmp]# cat a.sh
#!/bin/bash
#
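# '%e %w%f' prints the event names first, then the full path of the file
inotifywait -mrq -e delete,close_write,moved_to,moved_from,isdir --format '%e %w%f' /longshuai |\
while read -r events path;do
   if echo "$events" | grep -i delete &>/dev/null; then
       echo "At $(date +"%F %T"): $events $path" >>/etc/delete.log
   elif [ -e "$path" ]; then
       # skip paths that no longer exist (for example after moved_from)
       rsync -az --password-file=/etc/rsync_back.passwd "$path" rsync_backup@172.16.10.6::longshuai
   fi
done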

1.3 where should inotify be installed

inotify is a monitoring tool that monitors changes in directories or files and then triggers a series of operations.

Suppose there is a site publishing server A and three web servers B/C/D. The goal is that whenever files in the site directory on server A change, synchronization is triggered automatically and the changes are pushed to the web servers, so that they get the latest files as soon as possible. What needs to be clear is that the directory on A is monitored and pushed to servers B/C/D, so inotify-tools is installed on site publishing server A. In addition, rsync is generally configured to run in daemon mode on web servers B/C/D and listen on port 873 (this is not required, not even for sersync). In other words, as far as rsync is concerned, the monitoring end is the rsync client and the other machines are rsync servers.

Of course, this is only the most likely use case, not the only one. Moreover, inotify is an independent tool that has nothing to do with rsync; it just gives rsync a better way to achieve real-time synchronization.

1.4 inotify+rsync sample script (imperfect)

The following is an example inotify+rsync script that monitors the /www directory; it is also a version widely circulated on the Internet. Note, however, that the script is very poor and needs to be modified before actual use. It is only an example (it is acceptable if you do not care about resource consumption).

[root@xuexi www]# cat ~/inotify.sh
#!/bin/bash
 
watch_dir=/www
push_to=172.16.10.5
inotifywait -mrq -e delete,close_write,moved_to,moved_from,isdir --timefmt '%Y-%m-%d %H:%M:%S' --format '%w%f:%e:%T' $watch_dir \
--exclude=".*.swp" |\
while read line;do
  # logging some files which has been deleted and moved out
    if echo $line | grep -i -E "delete|moved_from" &>/dev/null;then
        echo "$line" >> /etc/inotify_away.log
    fi
  # from here, start rsync's function
    rsync -az --delete --exclude="*.swp" --exclude="*.swx" $watch_dir $push_to:/tmp
    if [ $? -eq 0 ];then
        echo "sent $watch_dir success"
    else
        echo "sent $watch_dir failed"
    fi
done

Then make the script executable and run it. Note that this script is meant to be run in the foreground for testing. If you want to run it in the background, remove the final if statement.

This script records which files are deleted or moved out of the monitored directory. Once an event is detected, the triggered rsync operation synchronizes the entire monitored directory $watch_dir and does not synchronize the temporary files generated by vim.

Run the monitoring script in the foreground.

[root@xuexi www]# ~/inotify.sh

Then test by copying files into and deleting files from the monitored directory /www.

For example, when a file is deleted, the deletion event is first recorded in the /etc/inotify_away.log file, and then rsync is run to delete the corresponding remote file.

/www/yum.repos.d/base.repo:DELETE:2017-07-21 14:47:46
sent /www success
/www/yum.repos.d:DELETE,ISDIR:2017-07-21 14:47:46
sent /www success

As another example, copying the /etc/pki directory to /www produces multiple rsync results.

sent /www success
sent /www success
sent /www success
sent /www success
sent /www success
sent /www success
......

Obviously, rsync is triggered many times because multiple files are copied, whereas rsync only needs to synchronize the /www directory to the remote end once. The redundant rsync runs are a waste of resources. It does not matter when a few files are copied, but when thousands of files are copied, rsync keeps being called for a long time. For example:

[root@xuexi www]# cp -a /usr/share/man /www

After more than 15000 files are copied to the /www directory, the script loops more than 15000 times and calls rsync more than 15000 times. Although each rsync run is cheap once the directory has been synchronized (a small waste is still a waste), the repeated runs consume a lot of time and network bandwidth.

1.5 shortcomings of inotify

Although inotify is integrated into the kernel and is often used at the application level to help rsync achieve real-time synchronization, it does not cooperate with rsync perfectly because its events are too fine-grained. Therefore, you should either improve the inotify+rsync script as much as possible or use the sersync tool. In addition, inotify has a bug.

1.5.1 inotify bug

When copying complex hierarchical directories (multi-level directories contain files) to the monitoring directory, or copying a large number of files to them, inotify often randomly omits some files. Because these missing files are not monitored, all subsequent monitoring operations will not be performed, for example, they will not be synchronized by rsync.

In fact, the problem described above is not a defect of inotify itself but of the inotifywait tool in the inotify-tools package. The man page of inotifywait also describes this bug.

BUGS
    There are race conditions in the recursive directory watching code which can cause events to be missed if they occur in a directory immediately after that directory is created.  This is probably not fixable.

In other words, higher-level tools (such as sersync, lsyncd, etc.) that make the inotify system calls directly may not have this bug.

To illustrate the impact of this bug, here are some examples.

The following example monitors the delete and close_write events on the /www directory, which initially contains no pki directory.

[root@xuexi ~]# inotifywait -mrq -e delete,close_write --format '%w%f:%e' /www

The /etc/pki directory that will be copied contains several subdirectories, which in turn contain further subdirectories. In total, there are 30 regular files under /etc/pki.

[root@xuexi www]# find /etc/pki/ -type f | wc -l
30

Open another terminal and copy the pki directory to /www.

[root@xuexi www]# cp -a /etc/pki /www

At the same time, some monitoring events will be generated on the monitoring terminal.

/www/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7:CLOSE_WRITE,CLOSE
/www/pki/rpm-gpg/RPM-GPG-KEY-CentOS-Debug-7:CLOSE_WRITE,CLOSE
/www/pki/rpm-gpg/RPM-GPG-KEY-CentOS-Testing-7:CLOSE_WRITE,CLOSE
/www/pki/tls/certs/Makefile:CLOSE_WRITE,CLOSE
/www/pki/tls/certs/make-dummy-cert:CLOSE_WRITE,CLOSE
/www/pki/tls/certs/renew-dummy-cert:CLOSE_WRITE,CLOSE
/www/pki/tls/misc/c_info:CLOSE_WRITE,CLOSE
/www/pki/tls/misc/c_issuer:CLOSE_WRITE,CLOSE
/www/pki/tls/misc/c_name:CLOSE_WRITE,CLOSE
/www/pki/tls/openssl.cnf:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/README:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/ca-legacy.conf:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/extracted/java/README:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/extracted/java/cacerts:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/extracted/pem/tls-ca-bundle.pem:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/extracted/pem/email-ca-bundle.pem:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/extracted/pem/objsign-ca-bundle.pem:CLOSE_WRITE,CLOSE
/www/pki/ca-trust/source/README:CLOSE_WRITE,CLOSE
/www/pki/nssdb/cert8.db:CLOSE_WRITE,CLOSE
/www/pki/nssdb/cert9.db:CLOSE_WRITE,CLOSE
/www/pki/nssdb/key3.db:CLOSE_WRITE,CLOSE
/www/pki/nssdb/key4.db:CLOSE_WRITE,CLOSE
/www/pki/nssdb/pkcs11.txt:CLOSE_WRITE,CLOSE
/www/pki/nssdb/secmod.db:CLOSE_WRITE,CLOSE

Count the events recorded above. There are 25 lines in total, that is, the copying of 25 files was detected, but the number of copied files (not counting directories and symlinks) is actually 30. In other words, inotify missed five files.

Testing shows that the number and identity of the missed files are not fixed but random (so with luck, nothing may be missed), and that only close_write events are missed; delete events are not affected. To confirm this bug, here are two more examples.

Copy the /usr/share/man directory to the monitored directory /www. This directory contains 15441 regular files, and its deepest subdirectories are only three levels down, so it is not complex.

[root@xuexi www]# find /usr/share/man/ -type f | wc -l
15441

To make it easier to count the monitored events, redirect the event output to the file /tmp/man.log.

[root@xuexi ~]# inotifywait -mrq -e delete,close_write,moved_to,moved_from,isdir --format '%w%f:%e' /www > /tmp/man.log

Start copying.

[root@xuexi www]# cp -a /usr/share/man /www

After copying, count the number of lines in /tmp/man.log, that is, the number of events monitored.

[root@xuexi www]# cat /tmp/man.log | wc -l
15388

Obviously, only 15388 files were detected, 53 fewer than the number of files actually copied, which again demonstrates the bug.

However, both of the above examples mainly monitor the close_write event. To make the proof rigorous, monitor all events. To simplify the later statistics, redirect the monitoring output to /tmp/pki.log, and delete the /www/pki directory before monitoring.

[root@xuexi ~]# rm -rf /www/pki

[root@xuexi ~]# inotifywait -mrq --format '%w%f:%e' /www > /tmp/pki.log

Copy the /etc/pki directory to the monitored directory /www.

[root@xuexi ~]# cp -a /etc/pki /www

Since all events are monitored, directory-related "ISDIR" events and symlink-related events are also written to /tmp/pki.log. To count the number of monitored files, remove the directory-related "ISDIR" lines from pki.log and then deduplicate by file so that multiple events for one file are counted only once. The following is the statistics command.

[root@xuexi www]# sed /ISDIR/d /tmp/pki.log | cut -d":" -f1 | sort -u | wc -l
32

The result is 32, two more than the 30 regular files. This is because the symlinks recorded in pki.log are also counted. However, the /www/pki directory contains 35 files and symlinks in total.

[root@xuexi www]# find /www/pki -type f -o -type l | wc -l
35

In other words, even when all events are monitored, some files are still missed, so I consider this a bug (in inotifywait, as noted above).
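
The counts above show how many files were missed but not which ones. As a rough sketch, the event log can be compared with the files actually present on disk. It assumes the log was written with --format '%w%f:%e' as in the commands above and that no path contains a colon or a newline; the function name missed_paths is made up.

```shell
#!/bin/bash
# missed_paths LOG DIR: print the files and symlinks that exist under DIR
# but never appear in LOG (one '%w%f:%e' record per line).
missed_paths() {
    local log=$1 dir=$2 seen
    seen=$(mktemp) || return 1
    # paths that produced at least one non-directory event
    sed '/ISDIR/d' "$log" | cut -d: -f1 | LC_ALL=C sort -u > "$seen"
    # comm -13 keeps only the lines unique to the second input (the disk listing)
    find "$dir" \( -type f -o -type l \) | LC_ALL=C sort -u | LC_ALL=C comm -13 "$seen" -
    rm -f "$seen"
}
```

For the example above, it would be called as missed_paths /tmp/pki.log /www/pki.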

Note, however, that this bug occurs only when multi-level directories containing multiple files are copied. It does not occur when a single file or a simple directory without subdirectories is copied. For inotify+rsync, since rsync usually synchronizes the entire directory rather than a single file after an event is triggered, this bug is not serious.

1.5.2 defects of inotify+rsync

Because of this bug, when using inotify+rsync you should always let rsync synchronize the directory rather than the individual files that generated events; otherwise files may be missed. In addition, synchronizing files one by one performs very poorly. The following description of the defects assumes that rsync synchronizes directories.

When using inotify+rsync, two aspects should be considered:

(1). inotify often generates multiple events for one file, and operating on multiple files in the same directory at once also generates multiple events, so inotify almost always triggers rsync to synchronize the directory multiple times. Because rsync synchronizes the whole directory, triggering it multiple times is completely unnecessary and wastes resources and network bandwidth. If, on the other hand, subdirectories are monitored separately level by level, real-time synchronization cannot be guaranteed.

(2). While a file is being edited, vim generates temporary files such as .swp and .swx. inotify monitors these as well, and they involve multiple events, so they may also be copied by rsync unless it is set to exclude them. In any case, these temporary files should not be synchronized. In extreme cases, synchronizing vim temporary files to the server may be fatal.

Because of these two defects, an inotify+rsync setup implemented with a script can hardly be made perfect, and even making it reasonably good is not easy. (Do not naively assume that the inotify+rsync examples found online, or those given by instructors in training videos, are perfect; they can only be regarded as syntactically correct usage examples that were swallowed whole without much thought.) In short, to ensure that inotify+rsync both performs well and does not synchronize temporary files, the monitored events, the loop and the rsync command must all be designed carefully.

In the process of designing inotify+rsync script, the following objectives should be considered or achieved as far as possible:

(1). Each file should generate as few monitoring events as possible, but events should not be omitted.

(2). Let rsync synchronize directories instead of individual files that generate events.

(3). Operating on multiple files in a directory at once generates multiple events and therefore triggers rsync multiple times. If such a batch of operations can trigger rsync only once, resource consumption is greatly reduced.

(4). When rsync synchronizes directories, consider whether to exclude some files and whether to add the "--delete" option.

(5). For performance, consider designing inotify+rsync scripts separately for subdirectories and different events.
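
Goal (3) can be approached in more than one way. The following is a minimal sketch of one of them, not the method used elsewhere in this article: block until the first event of a burst arrives, drain further events until the stream has been quiet for a short period, and only then run the sync command once. The function name run_batched and the QUIET variable are hypothetical, and the 1-second quiet period is an arbitrary assumption.

```shell
#!/bin/bash
# Quiet period in whole seconds; 1 is an arbitrary default.
QUIET=${QUIET:-1}

# run_batched CMD [ARGS...]: read event lines from stdin and run CMD once
# for every burst of lines.
run_batched() {
    local line
    while IFS= read -r line; do                    # wait for the first event of a burst
        while IFS= read -r -t "$QUIET" line; do    # drain events until the stream is quiet
            :
        done
        "$@"                                       # one sync for the whole burst
    done
}

# Example (not executed here), reusing the variables of the script above:
#   inotifywait -mrq -e delete,close_write,moved_to,moved_from,isdir "$watch_dir" |
#       run_batched rsync -az --delete "$watch_dir" "$push_to":/tmp
```

Events that arrive while the command is running stay in the pipe and start the next burst, so nothing is dropped; the cost is a delay of up to the quiet period before each synchronization.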

Analyze with the example script given above.

[root@xuexi www]# cat ~/inotify.sh
#!/bin/bash
 
watch_dir=/www
push_to=172.16.10.5
inotifywait -mrq -e delete,close_write,moved_to,moved_from,isdir --timefmt '%Y-%m-%d %H:%M:%S' --format '%w%f:%e:%T' $watch_dir \
--exclude=".*.swp" |\
while read line;do
  # logging some files which has been deleted and moved out
    if echo $line | grep -i -E "delete|moved_from" &>/dev/null;then
        echo "$line" >> /etc/inotify_away.log
    fi
  # from here, start rsync's function
    rsync -az --delete --exclude="*.swp" --exclude="*.swx" $watch_dir $push_to:/tmp
    if [ $? -eq 0 ];then
        echo "sent $watch_dir success"
    else
        echo "sent $watch_dir failed"
    fi
done 

The script monitors as few events as possible, so that rsync is triggered repeatedly as little as possible. It should be clear, however, that although the design goal is to minimize triggering events, the monitored events must still meet the actual requirements. If you do not know how to select the events to monitor, go back to the earlier section on the inotifywait command and event analysis. In addition, you can consider defining separate scripts for files, directories and subdirectories to monitor different events.

The main disadvantage of this script is that rsync is triggered repeatedly. In this script, rsync synchronizes the directory rather than a single file. Therefore, if multiple files in the directory are operated on at once, multiple events are generated and multiple rsync commands are triggered. In the earlier example of copying /usr/share/man, rsync was called more than 15000 times, although a single synchronization would have been enough. All the remaining synchronizations were completely redundant.

Therefore, the direction for improving the above script is to call rsync as few times as possible while still keeping synchronization real-time and complete. This can be achieved easily with the sersync tool; perhaps the author of sersync developed it precisely to solve this problem. The usage of sersync is introduced later.

1.6 optimal implementation of inotify+rsync

The shortcomings and improvement objectives of inotify+rsync have been mentioned above. The following is an example of improving inotify+rsync by modifying the shell script.

[root@xuexi tmp]# cat ~/inotify.sh
#!/bin/bash
 
###########################################################
#  description: inotify+rsync best practice               #
#  author     : Horse Golden Dragon                                   #
#  blog       : http://www.cnblogs.com/f-ck-need-u/       #
###########################################################
 
watch_dir=/www
push_to=172.16.10.5
 
# First to do is initial sync
rsync -az --delete --exclude="*.swp" --exclude="*.swx" $watch_dir $push_to:/tmp
 
inotifywait -mrq -e delete,close_write,moved_to,moved_from,isdir --timefmt '%Y-%m-%d %H:%M:%S' --format '%w%f:%e:%T' $watch_dir \
--exclude=".*.swp" >>/etc/inotifywait.log &
 
while true;do
     if [ -s "/etc/inotifywait.log" ];then
        grep -i -E "delete|moved_from" /etc/inotifywait.log >> /etc/inotify_away.log
        rsync -az --delete --exclude="*.swp" --exclude="*.swx" $watch_dir $push_to:/tmp
        if [ $? -ne 0 ];then
           echo "$watch_dir sync to $push_to failed at `date +"%F %T"`,please check it by manual" |\
           mail -s "inotify+Rsync error has occurred" root@localhost
        fi
        cat /dev/null > /etc/inotifywait.log
        rsync -az --delete --exclude="*.swp" --exclude="*.swx" $watch_dir $push_to:/tmp
    else
        sleep 1
    fi
done

To trigger rsync only once when multiple files in the directory are operated on at once, the events can no longer be processed one at a time by reading standard input in a while read line loop.

My approach is to record the events reported by inotifywait in the file /etc/inotifywait.log and then check that file in an infinite loop. If the file is not empty, rsync is called once, and inotifywait.log is cleared immediately after the synchronization to prevent repeated rsync calls. One situation must be considered, however: inotifywait may keep writing to inotifywait.log while rsync is running, and clearing the file may then cause files reported during the synchronization to be missed by rsync. Therefore, rsync is called once more after the file is cleared, which incidentally also works as a retry after a failed transfer. If no event has been detected, inotifywait.log is empty and the loop sleeps for 1 second. The script is therefore not 100% real-time, but a delay of 1 second is well worth the CPU time it saves.

The script calls rsync only twice for each batch of events. Although it cannot trigger rsync just once as sersync does, the difference is negligible.

In addition, the background symbol "&" after the inotifywait command in the script must not be omitted; otherwise the script stays in the inotifywait command and never enters the loop that follows. Note, however, that the background process started by the script (in a subshell) does not stop when the script ends; it is re-parented to the init/systemd process with pid=1. In this case, you can stop the script directly with "killall script_file", which also interrupts the background process in the script. If you want to implement this directly in the script, see: How to make shell scripts kill themselves.
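
One common way to make a script clean up its own background process is an EXIT trap. The sketch below is an assumption about how this could be done, not the method of the article referred to above; the names start_watcher, cleanup and watcher_pid are made up.

```shell
#!/bin/bash
watcher_pid=""

# cleanup: kill the background watcher, if one was started.
cleanup() {
    if [ -n "$watcher_pid" ]; then
        kill "$watcher_pid" 2>/dev/null
    fi
}
trap cleanup EXIT          # runs whenever the script exits
trap 'exit 1' INT TERM     # turn Ctrl-C / kill into a normal exit so the EXIT trap fires

# start_watcher CMD [ARGS...]: run CMD in the background and remember its PID.
start_watcher() {
    "$@" &
    watcher_pid=$!
}

# Example (not executed here):
#   start_watcher inotifywait -mrq -e delete,close_write /www >> /etc/inotifywait.log
```

Because the redirection is applied to the start_watcher call, the background command inherits it. A process killed with SIGKILL cannot run its traps, so this only covers normal exits and catchable signals.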

In fact, the above script is still far from perfect. A better approach is to add a check: if the monitored directory is very large (that is, it contains a large number of files), transfer only the changed file instead of synchronizing the whole directory; if it is not large, consider synchronizing the whole directory. sersync monitors directories but extracts each changed file and synchronizes it individually, and it provides a scheduled task that decides how often the whole directory is synchronized. All of this, including the multithreading, can be implemented in a shell script. If you are interested, you can write it yourself.
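
The check described above could look roughly like the following sketch. The function name sync_scope and the idea of a fixed file-count threshold are assumptions for illustration; a suitable threshold depends on the environment.

```shell
#!/bin/bash
# sync_scope DIR THRESHOLD: print "file" if DIR holds more than THRESHOLD
# regular files (sync only the changed file), otherwise print "dir"
# (sync the whole directory).
sync_scope() {
    local dir=$1 threshold=$2 count
    count=$(( $(find "$dir" -type f | wc -l) ))
    if [ "$count" -gt "$threshold" ]; then
        echo file
    else
        echo dir
    fi
}
```

Counting a very large tree is itself slow, so a real script would probably run this check occasionally rather than on every event.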

2.sersync

sersync is similar to inotify and is also used for monitoring, but it overcomes several shortcomings of inotify.

As mentioned earlier, inotify's biggest drawback is that it generates repeated events, or generates multiple events when several files in the same directory are operated on at once (for example, if the monitored directory contains 5 files, deleting the directory generates 6 events), which causes rsync to be called repeatedly. In addition, when a file is edited with vim, inotify reports events for vim's temporary files, but as far as rsync is concerned these events should not be monitored.
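To make the two problems concrete, here is a minimal sketch of the kind of filtering that is needed: it reads one path per line, drops vim's temporary files, and collapses repeated paths. The function name filter_events and the patterns (swap files such as .swp/.swx, backup files ending in "~", and vim's probe file named 4913) are my own assumptions about what to exclude; this is not sersync's actual filter.

```shell
#!/bin/bash
# Sketch only: filter_events and its exclude patterns are assumptions.
#
# Reads event paths on stdin, one per line. Drops editor temporary files
# and repeated paths, keeping the first occurrence of each path in order.
filter_events() {
    grep -Ev -e '\.sw[a-z]$' -e '~$' -e '(^|/)4913$' | awk '!seen[$0]++'
}
```

For example, feeding it the paths /www/a.txt, /www/.a.txt.swp and /www/a.txt again leaves a single /www/a.txt, so rsync would be triggered once for that file rather than three times.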

The best inotify+rsync script given above overcomes both of these problems. sersync overcomes them as well, with a simpler setup, and it has the added advantage of multithreading.

sersync project address: https://code.google.com/archive/p/sersync/ . The site has detailed documentation in Chinese covering download, installation, and usage.

sersync download address: https://code.google.com/archive/p/sersync/downloads.

sersync benefits:

1. sersync is written in C++ and filters out the temporary files and repeated file operations produced by the Linux file system, so when combined with rsync it saves running time and network resources and is therefore faster.

2. sersync is very simple to configure. The bin directory contains a statically compiled binary that can be used directly with the xml configuration file in the same directory.

3. sersync synchronizes using multiple threads. Especially when synchronizing large files, it can keep multiple servers synchronized in real time.

4. sersync has an error handling mechanism. It resynchronizes files that failed through a failure queue, and if that still fails, it retries the failed files at the configured interval.

5. sersync has its own crontab function. Just enable it in the xml configuration file and it performs a full synchronization at the configured interval, with no need to set up the system crontab separately.

6. sersync supports secondary development, so it can be extended.

In short, sersync filters repeated events to reduce load, has a built-in crontab function, calls rsync from multiple threads, and retransmits after failures.
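The failure queue idea can be imitated in a shell script. The sketch below is only an illustration of the concept and not how sersync implements it internally: retry_failed is a hypothetical helper that reads failed paths from a queue file, runs a sync command for each one, and writes back only the paths that fail again.

```shell
#!/bin/bash
# Sketch only: retry_failed and the queue file format (one path per line)
# are assumptions, not sersync's internal mechanism.
#
# retry_failed QUEUE_FILE COMMAND [ARGS...]
# Runs COMMAND ARGS... PATH for every PATH listed in QUEUE_FILE.
# Paths whose command fails again are kept in QUEUE_FILE for the next round.
retry_failed() {
    local queue="$1"
    shift
    local tmp path
    [ -s "$queue" ] || return 0          # nothing queued
    tmp=$(mktemp) || return 1
    # Read the queue on fd 3 so the command cannot swallow it via stdin.
    while IFS= read -r path <&3; do
        "$@" "$path" </dev/null || printf '%s\n' "$path" >> "$tmp"
    done 3< "$queue"
    mv "$tmp" "$queue"
}
```

Calling this from the main loop every so often, with the script's own rsync wrapper as COMMAND, would give roughly the "retry at a set interval" behaviour described in point 4.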

Recommendations:

(1) When the amount of synchronized directory data is small, rsync+inotify is recommended

(2) rsync+sersync is recommended when the amount of synchronized directory data is large (hundreds of GB or even more than 1 TB) and there are many files

In fact, apart from multithreading, the improved script given earlier under "the best implementation of inotify+rsync" already comes close to the core functionality of sersync, and its performance can approach that of sersync even when a large amount of data is synchronized.

The sersync toolkit does not require any installation and can be used after decompression.

[root@xuexi ~]# wget https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/sersync/sersync2.5.4_64bit_binary_stable_final.tar.gz
[root@xuexi ~]# tar xf sersync2.5.4_64bit_binary_stable_final.tar.gz
[root@xuexi ~]# cp -a GNU-Linux-x86 /usr/local/sersync
[root@xuexi ~]# echo 'PATH=$PATH:/usr/local/sersync' > /etc/profile.d/sersync.sh
[root@xuexi ~]# source /etc/profile.d/sersync.sh

The sersync directory /usr/local/sersync contains only two files: a binary program and a configuration file in xml format.

[root@xuexi ~]# ls /usr/local/sersync/
confxml.xml  sersync2

confxml.xml is the configuration file, and its content is easy to understand. The sample file is annotated below.

<?xml version="1.0" encoding="ISO-8859-1"?>
<head version="2.5">
    <host hostip="localhost" port="8008"></host>
    <debug start="false"/>           # Whether to enable debug mode. Throughout this file, false means off and true means on
    <fileSystem xfs="false"/>        # Whether the monitored file system is xfs
    <filter start="false">           # Whether to enable event filtering. Files matching the exclude rules below are not monitored
        <exclude expression="(.*)\.svn"></exclude>
        <exclude expression="(.*)\.gz"></exclude>
        <exclude expression="^info/*"></exclude>
        <exclude expression="^static/*"></exclude>
    </filter>
    <inotify>                         # The default monitored event is delete/close_write/moved_from/moved_to/create folder
        <delete start="true"/>
        <createFolder start="true"/>
        <createFile start="false"/>
        <closeWrite start="true"/>
        <moveFrom start="true"/>
        <moveTo start="true"/>
        <attrib start="false"/>
        <modify start="false"/>
    </inotify>
 
    <sersync>                       # Configuration section of rsync command
        <localpath watch="/www">    # The directory or file to synchronize. As with inotify+rsync, synchronizing a directory is recommended
            <remote ip="172.16.10.5" name="/tmp/www"/>  # Target host address and name. By default, name is an rsync daemon module name, so the remote end must already be running rsync in daemon mode
            <!--remote ip="IPADDR" name="module"-->     # The exception is when ssh start is enabled below: rsync then runs in remote shell mode and name is the target directory
        </localpath>
        <rsync>                      # Specify rsync options
            <commonParams params="-az"/>
            <auth start="false" users="root" passwordfile="/etc/rsync.pas"/>
            <userDefinedPort start="false" port="874"/><!-- port=874 -->
            <timeout start="false" time="100"/><!-- timeout=100 -->
            <ssh start="true"/>       # Whether to run rsync in remote shell mode instead of rsync daemon mode (enabled here, because name above is a directory path)
        </rsync>
        <failLog path="/tmp/rsync_fail_log.sh" timeToExecute="60"/><!--default every 60mins execute once-->  # Failure log; failed transfers are retried from it at this interval
        <crontab start="false" schedule="600"><!--600mins-->    # Whether to enable the built-in crontab function (periodic full synchronization)
            <crontabfilter start="false">       # Filter applied to the crontab's periodic full synchronization
                <exclude expression="*.php"></exclude>
                <exclude expression="info/*"></exclude>
            </crontabfilter>
        </crontab>
        <plugin start="false" name="command"/>
    </sersync>
 
    <plugin name="command">
        <param prefix="/bin/sh" suffix="" ignoreError="true"/>  <!--prefix /opt/tongbu/mmm.sh suffix-->
        <filter start="false">
            <include expression="(.*)\.php"/>
            <include expression="(.*)\.sh"/>
        </filter>
    </plugin>
 
    <plugin name="socket">
        <localpath watch="/opt/tongbu">
            <deshost ip="192.168.138.20" port="8009"/>
        </localpath>
    </plugin>
    <plugin name="refreshCDN">
        <localpath watch="/data0/htdocs/cms.xoyo.com/site/">
            <cdninfo domainname="ccms.chinacache.com" port="80" username="xxxx" passwd="xxxx"/>
            <sendurl base="http://pic.xoyo.com/cms"/>
            <regexurl regex="false" match="cms.xoyo.com/site([/a-zA-Z0-9]*).xoyo.com/images"/>
        </localpath>
    </plugin>
</head>

The configuration above is intended for rsync's remote shell mode (this requires <ssh start="true"/> in the rsync section), so there is no need to start an rsync daemon on the target host. Once the configuration file is ready, simply run the sersync2 command. Its usage is as follows:

[root@xuexi sersync]# sersync2 -h
set the system param
execute: echo 50000000 > /proc/sys/fs/inotify/max_user_watches
execute: echo 327679 > /proc/sys/fs/inotify/max_queued_events
parse the command param
_____________________________________________________________
parameter -d: enable daemon mode, so that sersync2 runs in the background
parameter -r: before monitoring starts, push the monitored directory to the remote host once with rsync,
            : that is, first make the remote directory consistent with the local one, then keep it synchronized incrementally through monitoring
parameter -n: number of daemon threads to start; the default is 10
parameter -o: configuration file to use; the default is confxml.xml
parameter -m: enable another module on its own; -m refreshCDN enables the refreshCDN module
parameter -m: enable another module on its own; -m socket enables the socket module
parameter -m: enable another module on its own; -m http enables the http module
without -m, the synchronization program runs by default
_____________________________________________________________

As the output shows, the sersync2 command always sets the inotify-related kernel parameters first.
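If you want to check what these parameters are currently set to, for example before and after starting sersync2, a small helper like the following sketch can print them. The function name show_inotify_limits is made up for illustration; the optional directory argument exists only so the function can be pointed at a different location.

```shell
#!/bin/bash
# Sketch only: show_inotify_limits is a hypothetical helper.
#
# show_inotify_limits [DIR]
# Prints "name = value" for each inotify tunable that is readable in DIR
# (default: /proc/sys/fs/inotify).
show_inotify_limits() {
    local dir="${1:-/proc/sys/fs/inotify}"
    local f
    for f in max_queued_events max_user_instances max_user_watches; do
        if [ -r "$dir/$f" ]; then
            printf '%s = %s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
}
```

Values written under /proc do not survive a reboot, which is not a problem here because sersync2 writes them again each time it starts.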

To start synchronizing, simply run the following command.

[root@xuexi ~]# sersync2 -r -d
set the system param
execute: echo 50000000 > /proc/sys/fs/inotify/max_user_watches
execute: echo 327679 > /proc/sys/fs/inotify/max_queued_events
parse the command param
option: -r      rsync all the local files to the remote servers before the sersync work
option: -d      run as a daemon
daemon thread num: 10
parse xml config file
host ip : localhost     host port: 8008
daemon start,sersync run behind the console
config xml parse success
please set /etc/rsyncd.conf max connections=0 Manually
sersync working thread 12  = 1(primary thread) + 1(fail retry thread) + 10(daemon sub threads)
Max threads numbers is: 22 = 12(Thread pool nums) + 10(Sub threads)
please according your cpu ,use -n param to adjust the cpu rate
------------------------------------------
rsync the directory recursivly to the remote servers once
working please wait...
execute command: cd /www && rsync -az -R --delete ./  -e ssh 172.16.10.5:/tmp/www >/dev/null 2>&1
run the sersync:
watch path is: /www

Note the rsync command in the "execute command" line of the output above. sersync changes into the monitored directory before running rsync, and rsync uses the "-R" option to synchronize relative paths with /www as the root, so the monitored directory itself is not copied to the remote end. This is why the target directory is set to /tmp/www in confxml.xml: the files under the local /www are then synchronized into /tmp/www on the target host. If the target were set to /tmp instead, they would be placed directly under /tmp on the target host.
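To see how the values in confxml.xml end up in that command line, the sketch below rebuilds the same kind of command from a watch directory, a host, and a target directory. It only prints the command and never runs rsync. The function name build_push_cmd is made up for illustration, and the layout simply mirrors the "execute command" line shown in the output above for remote shell mode.

```shell
#!/bin/bash
# Sketch only: build_push_cmd is a hypothetical helper that mirrors the
# command line printed by sersync2 above; it does not run rsync.
#
# build_push_cmd WATCH_DIR HOST TARGET_DIR [RSYNC_PARAMS]
build_push_cmd() {
    local watch="$1"
    local host="$2"
    local target="$3"
    local params="${4:--az}"      # default mirrors commonParams in the config
    # cd into the watch directory, then push "./" with relative paths (-R),
    # so the contents of WATCH_DIR land directly inside TARGET_DIR.
    printf 'cd %s && rsync %s -R --delete ./ -e ssh %s:%s\n' \
        "$watch" "$params" "$host" "$target"
}
```

For example, build_push_cmd /www 172.16.10.5 /tmp/www prints a command equivalent to the one in the log, which makes it easy to see that changing the third argument to /tmp would drop the files straight into /tmp on the target.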

To run multiple instances of sersync, that is, to monitor multiple directories, write a separate configuration file for each directory and run sersync2 once per configuration file, specifying the file with -o.

For example:

[root@xuexi ~]# sersync2 -r -d -o /etc/sersync.d/nginx.xml
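If there are several configuration files, starting them one by one gets tedious. A minimal sketch, assuming all of them live in /etc/sersync.d and end in .xml (the function name start_all_instances and the SERSYNC_BIN variable are made up for illustration):

```shell
#!/bin/bash
# Sketch only: start_all_instances and SERSYNC_BIN are hypothetical names.
# SERSYNC_BIN can be overridden, e.g. with "echo" for a dry run.
SERSYNC_BIN="${SERSYNC_BIN:-sersync2}"

# start_all_instances [CONF_DIR]
# Starts one sersync2 daemon per *.xml file in CONF_DIR
# (default: /etc/sersync.d), using the same options as the example above.
start_all_instances() {
    local conf_dir="${1:-/etc/sersync.d}"
    local conf
    for conf in "$conf_dir"/*.xml; do
        [ -e "$conf" ] || continue      # no .xml files: the glob stays literal
        "$SERSYNC_BIN" -r -d -o "$conf"
    done
}
```

Running it with SERSYNC_BIN=echo first prints the arguments that would be passed for each file without starting anything, which is a cheap way to check the list before launching the daemons.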

 
