Implementing a Windows-like recycle bin on Linux with shell scripts

Posted by shilpa_agarwal on Sun, 27 Feb 2022 05:00:05 +0100

When using Linux, we often run the rm -rf command, but this command is risky. For example, suppose we execute a shell script that contains the following statement:

# script.sh
rm -rf $HOME/$SOME_PATH

At this point, if the environment variable SOME_PATH happens to be empty because of some misconfiguration, everything in the HOME directory is wiped out directly. While developing a development kit, I once deleted a project by mistake, which removed the code of both the local and remote branches, and it took a lot of effort to dig up the commit id and recover it. When I complained about this to a colleague, he said he had once built a recycle bin feature for himself, so on the spur of the moment I tried to make a simple recycle bin too.
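One cheap safeguard for a statement like the one above is the ${var:?} parameter expansion, which aborts instead of substituting an empty string. The following is only a minimal sketch; the helper name build_target is made up for illustration and is not part of the scripts in this post.

```shell
#!/bin/bash
# build_target is a hypothetical helper, not part of the original script.
# It joins a base directory and a sub path, refusing empty components.
build_target() {
	local base=$1 sub=$2
	# ${var:?msg} prints msg and aborts the current (sub)shell
	# when var is unset or empty
	printf '%s/%s\n' "${base:?base directory is empty}" "${sub:?sub path is empty}"
}

# Usage sketch: rm is only reached when both parts are non-empty.
# target=$(build_target "$HOME" "$SOME_PATH") && rm -rf -- "$target"
```

Because the helper runs inside a command substitution, an empty variable only aborts that subshell, and the && keeps rm from running at all.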

To implement a function similar to windows recycle bin, the following issues need to be considered:

  • Linux is usually shared by multiple users. The script should not affect other users, that is, it should only work within the current user's scope. Ideally, the script cannot be escalated to root;
  • When the same file is deleted several times, you should be able to pick one of the deleted versions to recover from the recycle bin;
  • It should be possible to view the meta information of a deleted file, such as its size, deletion date, original path, and so on;
  • Files, directories, and soft links should all be handled correctly;
  • Wildcard arguments should be recognized, such as rm -rf test*, which matches test1, test2, and so on.

None of this is difficult to achieve.
For the first point, the script can be restricted to the user's HOME directory;
For the second point, a file can be uniquely identified by an md5 hash generated from its absolute path plus the deletion date, and a script for recovering files can be provided;
For the third point, a meta file can record the necessary information about each deleted file;
For the fourth point, there is little to worry about, since everything on Linux is a file;
For the fifth point, testing shows that the shell expands wildcard arguments into the actual file names before the script receives them.
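To make the second point concrete, here is a small sketch of how such an identifier could be derived. The function name make_id and the sample path are assumptions for illustration; md5sum comes from GNU coreutils.

```shell
#!/bin/bash
# make_id is a hypothetical helper: it derives a recycle bin name
# from the absolute path and the deletion time.
make_id() {
	local abs_path=$1 deleted_at=$2
	# md5sum prints "<hash>  -"; keep only the hash itself
	printf '%s+%s' "$abs_path" "$deleted_at" | md5sum | cut -d ' ' -f1
}

# the same file deleted at two different seconds gets two different names
id1=$(make_id /home/user/a.txt 2022-02-27T05:00:05)
id2=$(make_id /home/user/a.txt 2022-02-27T05:00:06)
echo "$id1"
echo "$id2"
```

Note that with a timestamp of one-second resolution, deleting the same path twice within the same second would still produce the same identifier.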

The feature is mainly divided into two parts, remove.sh and recover.sh, which jointly operate on a recycle directory and a meta file. The variables shared by the two are placed in init.conf.

Move files to the recycle bin

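#!/bin/bash
# remove.sh

# load TRASH and TRASH_META from the init.conf next to this script,
# so that the script also works from other working directories
source "$(dirname "$(realpath "$0")")/init.conf"
# init
if [ ! -d "${TRASH}" ]
then
	mkdir -p "${TRASH}"
	echo "recycle ${TRASH} created"
fi

if [ ! -f "${TRASH_META}" ]
then
	touch "${TRASH_META}"
	echo "file ${TRASH_META} created"
fi

# do remove; "$@" keeps file names with spaces intact
for f in "$@"
do
	# skip rm-style options such as -r, -f, -rf
	case "${f}" in
		-*) continue ;;
	esac
	# -e is false for a dangling soft link, so test -L as well
	if [ ! -e "${f}" ] && [ ! -L "${f}" ]
	then
		echo -e "\e[31m WARN \e[0m: ${f} not exists"
	else
		# -s: do not resolve soft links, record the path of the link itself
		real_path=$(realpath -s "${f}")
		if [ "${real_path}" = "${TRASH}" ] || [ "${real_path}" = "${TRASH_META}" ]
		then
			echo -e "\e[31m WARN \e[0m: refuse to delete ${f}"
		else
			# %Y is the calendar year (%G is the ISO week-based year)
			cur_time=$(date +%Y-%m-%dT%T)
			unique_file=${real_path}+${cur_time}
			encode_file=$(echo -n "${unique_file}" | md5sum | cut -d ' ' -f1)
			# mv the deleted file to recycle, write to meta file only on success
			if mv -- "${f}" "${TRASH}/${encode_file}"
			then
				echo "[FILE NAME]:${real_path}; [DELETE TIME]:${cur_time}; [MD5]:${encode_file}" >> "${TRASH_META}"
			fi
		fi
	fi
done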

remove.sh accepts the same parameter form as the rm command.
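The wildcard case needs no special code, because the shell expands the pattern before the script is started. A small sketch to see this for yourself; show_args is a hypothetical stand-in for remove.sh that only prints its arguments, and the file names are made up.

```shell
#!/bin/bash
# show_args is a hypothetical stand-in for remove.sh:
# it prints each argument it receives on its own line.
show_args() {
	local arg
	for arg in "$@"
	do
		printf '%s\n' "$arg"
	done
}

demo_dir=$(mktemp -d)
touch "$demo_dir/test1" "$demo_dir/test2" "$demo_dir/other"
# the shell replaces test* with the matching names before show_args runs
expanded=$(cd "$demo_dir" && show_args test*)
echo "$expanded"
rm -rf "${demo_dir:?}"
```

The function only ever sees test1 and test2, never the pattern itself, which is why a plain loop over the arguments is enough.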

In the shell script,

\e[31m xxx \e[0m

is a small trick: these escape sequences print the enclosed text in a different color on the command line so that it stands out. Here 31m selects red, and \e[0m switches back to the default color.
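If the same colored prefix is needed in several places, it could be wrapped in a small function. This is just a sketch; the name warn is my own, and it uses printf with \033 (the octal form of the escape character) so that it does not depend on echo -e.

```shell
#!/bin/bash
# warn is a hypothetical helper: print a red WARN prefix and a message
# to standard error.
warn() {
	# \033[31m switches to red, \033[0m resets the color
	printf '\033[31m WARN \033[0m: %s\n' "$*" >&2
}

warn "some_file not exists"
```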

The variables ${TRASH} and ${TRASH_META} are defined in init.conf. To prevent the recycle bin directory from being deleted by mistake, we make it a hidden directory. The parent directory of the recycle bin should be on a disk with plenty of space and be writable by the user. For simplicity, it is set to $HOME:

# hidden dir/file, to avoid being deleted unconsciously
TRASH=${HOME}/.recycle 
TRASH_META=${HOME}/.meta

The whole script is simple and clear. The only special case to consider is "self deletion": the script must refuse to delete the recycle directory and the meta file themselves.
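A slightly stricter variant of this check would also refuse anything that already lives inside the recycle directory. The sketch below is an assumption of mine, not what remove.sh does; is_protected and the sample paths are hypothetical.

```shell
#!/bin/bash
# is_protected is a hypothetical helper: succeed when the path is the
# recycle directory, something inside it, or the meta file.
is_protected() {
	local path=$1 trash=$2 meta=$3
	case "$path" in
		"$trash"|"$trash"/*|"$meta") return 0 ;;
		*) return 1 ;;
	esac
}

if is_protected /home/user/.recycle/abc /home/user/.recycle /home/user/.meta
then
	echo "refuse to delete"
fi
```

The quoted variables are matched literally, so a sibling such as /home/user/.recycle2 is not treated as protected.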
Next is recover.sh.

Recovering files from the recycle bin

#!/bin/bash
# recover.sh
source "$(dirname "$(realpath "$0")")/init.conf"
for md5 in "$@"
do
	# match only the [MD5] field at the end of the line, so that a path
	# containing the same text is not picked up; -f2- keeps ':' in the path
	recover_file=$(grep "\[MD5\]:${md5}\$" "${TRASH_META}" | cut -d ';' -f1 | cut -d ':' -f2-)
	if [ -z "${recover_file}" ]
	then
		echo -e "\e[31m WARN \e[0m: can not locate recover file, perhaps the md5(${md5}) you input is invalid"
	else
		if [ ! -e "${TRASH}/${md5}" ] && [ ! -L "${TRASH}/${md5}" ]
		then
			echo -e "\e[31m WARN \e[0m: ${TRASH}/${md5} not exists!"
		else
			mv "${TRASH}/${md5}" "${recover_file}"
			if [ $? != 0 ]
			then
				dir_path=$(dirname "${recover_file}")
				mkdir -p "${dir_path}"
				if [ $? != 0 ]
				then
					echo "failed to create directory ${dir_path}"
					continue
				fi
				# parent dirs have been created, try mv again
				mv "${TRASH}/${md5}" "${recover_file}" || continue
			fi
			# the file has been recovered, remove the specific line from meta file
			# (-i and -e are separate: "-ie" would create a backup file with suffix "e")
			sed -i -e "/\[MD5\]:${md5}\$/d" "${TRASH_META}"
		fi
	fi
done

recover.sh accepts several parameters, each of which is an md5 value. If the md5 value is not found in the meta file, or the file corresponding to the md5 value is no longer in the recycle bin, a warning is printed. Since each md5 occurs at most once, we can directly locate the absolute path of the original file by combining grep and cut.
Why is the parameter an md5 value?
In Windows, to restore a deleted file we open the recycle bin, select a file, and click restore. The equivalent here is to open the meta file, find the file to recover, take its md5 value, and pass it as a parameter to recover.sh.
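Reading the raw meta file works, but a tiny listing helper makes the "open the recycle bin" step more comfortable. The following is a sketch under the assumption that the meta file uses exactly the line format written by remove.sh; list_trash and the sample entry are made up for illustration.

```shell
#!/bin/bash
# list_trash is a hypothetical helper: print "md5  delete-time  path"
# for every entry of a meta file in the format written by remove.sh.
list_trash() {
	local meta_file=$1 line path deleted_at md5
	while IFS= read -r line
	do
		# -f2- keeps any further ':' (the time itself contains colons)
		path=$(printf '%s' "$line" | cut -d ';' -f1 | cut -d ':' -f2-)
		deleted_at=$(printf '%s' "$line" | cut -d ';' -f2 | cut -d ':' -f2-)
		md5=$(printf '%s' "$line" | cut -d ';' -f3 | cut -d ':' -f2)
		printf '%s  %s  %s\n' "$md5" "$deleted_at" "$path"
	done < "$meta_file"
}

# demo with a made-up entry
meta_demo=$(mktemp)
printf '%s\n' '[FILE NAME]:/home/user/a.txt; [DELETE TIME]:2022-02-27T05:00:05; [MD5]:0123456789abcdef0123456789abcdef' > "$meta_demo"
listing=$(list_trash "$meta_demo")
echo "$listing"
rm -f "$meta_demo"
```

The md5 in the first column can then be copied straight into the recover.sh command line. Paths that contain a ';' would break this simple field splitting.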

During recovery, we added one extra check: suppose we want to recover the file ${HOME}/test/1, but the test directory itself has been deleted. In that case the mv command fails. The correct way to handle this is to recreate the directory recursively with mkdir -p and then run mv again. The script simply assumes that a failed mv means the directory does not exist, which is not robust enough: other errors (such as insufficient disk space) are not handled.
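One way to make this step less dependent on guessing why mv failed is to create the parent directory up front and report each failure separately. This is only a sketch of that idea, not what recover.sh does; restore_file and its return codes are my own invention, and it additionally refuses to overwrite a file that already exists at the destination.

```shell
#!/bin/bash
# restore_file is a hypothetical helper: move src back to dest.
# Return codes (made up for this sketch):
#   1 = dest already exists, 2 = cannot create parent dir, 3 = mv failed
restore_file() {
	local src=$1 dest=$2
	if [ -e "$dest" ] || [ -L "$dest" ]
	then
		echo "refusing to overwrite existing ${dest}" >&2
		return 1
	fi
	# mkdir -p succeeds even when the directory is already there
	mkdir -p "$(dirname "$dest")" || return 2
	mv -- "$src" "$dest" || return 3
}

# demo in a throwaway directory: the parent "deleted/dir" does not exist yet
work=$(mktemp -d)
echo data > "$work/trashed"
restore_file "$work/trashed" "$work/deleted/dir/file.txt"
restore_status=$?
echo "status: ${restore_status}"
```

With distinct return codes, the caller can decide whether to keep the meta entry, for example when the disk is full.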

Other operations

Initialize script execution environment

We want the user to be able to run these commands from anywhere, so we can add their directory to the user's PATH environment variable. Assuming remove.sh, recover.sh, and init.conf are all in the remove directory, create an export.sh there to set up the environment variable:

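#!/bin/bash
# export.sh, run it from the remove directory
remove_root=$(pwd)
find_remove=$(echo ":${PATH}:" | grep -F ":${remove_root}:")
# add the directory only if PATH does not contain it yet
if [ -z "${find_remove}" ]
then
	# \${PATH} is escaped so that it is expanded when .bashrc is read, not now
	echo "export PATH=${remove_root}:\${PATH}" >> "${HOME}/.bashrc"
	# a script cannot change the PATH of the calling shell, so reload .bashrc there
	echo "run 'source ${HOME}/.bashrc' or open a new terminal to apply the change"
fi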
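Checking the current PATH does not help if export.sh is run again from a shell that has not yet reloaded .bashrc. A sketch of an alternative that looks at the rc file itself, so the line is appended at most once; add_path_once and the /opt/remove path in the usage line are hypothetical.

```shell
#!/bin/bash
# add_path_once is a hypothetical helper: append the export line to
# rc_file unless exactly that line is already present.
add_path_once() {
	local dir=$1 rc_file=$2
	# \${PATH} stays literal so that it is expanded when the rc file is read
	local line="export PATH=${dir}:\${PATH}"
	# -q quiet, -x whole-line match, -F fixed string (no regex)
	if ! grep -qxF -- "$line" "$rc_file" 2>/dev/null
	then
		printf '%s\n' "$line" >> "$rc_file"
	fi
}

# Usage sketch:
# add_path_once /opt/remove "${HOME}/.bashrc"
```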

Periodically empty the recycle bin

As time goes on, the space occupied by the recycle bin is bound to keep growing. In terms of intent, we originally ran rm to delete a file permanently; the recycle bin exists only to recover the few files deleted by mistake. A huge, bloated recycle bin is not what we want, so consider adding a scheduled task that cleans it regularly, for example by automatically removing files that have been in the recycle bin for more than 7 days.

#!/bin/bash
# clean_recycle.sh
source "$(dirname "$(realpath "$0")")/init.conf"
# timestamp of seven days ago; anything deleted before it has expired
sdb_ts=$(date -d '7 days ago' +%s)
if [ ! -f "${TRASH_META}" ]
then
	echo -e "\e[31m WARN \e[0m: meta file [${TRASH_META}] not exists!"
	exit
else
	while IFS= read -r LINE
	do
		# the time contains ':' itself, so take everything after the first ':'
		delete_time=$(echo -n "${LINE}" | cut -d ';' -f 2 | cut -d ':' -f 2-)
		md5=$(echo -n "${LINE}" | cut -d ';' -f 3 | cut -d ':' -f 2)
		if [ -z "${delete_time}" ] || [ -z "${md5}" ]
		then
			echo "invalid line: ${LINE}"
			continue
		fi
		delete_ts=$(date -d "${delete_time}" +%s) || continue
		# -lt compares numbers; "<" inside [ ] would be a redirection
		if [ "${delete_ts}" -lt "${sdb_ts}" ]
		then
			if [ ! -e "${TRASH}/${md5}" ] && [ ! -L "${TRASH}/${md5}" ]
			then
				echo "md5[${md5}] file not exists!"
			else
				# do real remove; ${TRASH:?} aborts if TRASH is empty
				rm -rf "${TRASH:?}/${md5}"
				sed -i -e "/\[MD5\]:${md5}\$/d" "${TRASH_META}"
			fi
		fi
	done < "${TRASH_META}"
fi

clean_recycle.sh hard-codes the seven-day threshold, so adjusting it means editing the script, which is not very flexible. A better way is to pass it as a parameter or define it in init.conf; I kept it simple here and did not go that far.
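As a rough idea of what the parameterized version could look like, the retention period can be turned into a function argument with a default of 7. This sketch assumes GNU date (for the -d option); cutoff_ts and is_expired are hypothetical names.

```shell
#!/bin/bash
# cutoff_ts is a hypothetical helper: epoch seconds of "N days ago",
# with N defaulting to 7 (GNU date assumed).
cutoff_ts() {
	local days=${1:-7}
	date -d "${days} days ago" +%s
}

# is_expired is a hypothetical helper: succeed when the deletion time
# (in the format stored in the meta file) is older than the cutoff.
is_expired() {
	local deleted_at=$1 days=${2:-7} deleted_ts
	# return 2 when the time cannot be parsed
	deleted_ts=$(date -d "$deleted_at" +%s) || return 2
	[ "$deleted_ts" -lt "$(cutoff_ts "$days")" ]
}

if is_expired 2022-02-27T05:00:05 7
then
	echo "older than 7 days"
fi
```

The number of days could then come from the command line or from a variable in init.conf, for example is_expired "$delete_time" "${1:-7}".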

The rm command in this script is risky: if the ${TRASH} variable is emptied by some other program, rm -rf ends up targeting a path directly under the root directory. Therefore, before calling rm, you must check that the md5 file to be deleted actually exists.
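On top of the existence check, the values themselves can be validated before rm ever runs. The sketch below is my own addition under the assumption that identifiers are always 32 lowercase hex characters, as produced by md5sum; is_md5 and purge_entry are hypothetical names.

```shell
#!/bin/bash
# is_md5 is a hypothetical helper: succeed only for exactly
# 32 lowercase hexadecimal characters.
is_md5() {
	case "$1" in
		*[!0123456789abcdef]*) return 1 ;;
	esac
	[ "${#1}" -eq 32 ]
}

# purge_entry is a hypothetical helper: delete one entry from the
# recycle directory, refusing an empty directory name or a bad md5.
purge_entry() {
	local trash=$1 md5=$2
	if [ -z "$trash" ] || ! is_md5 "$md5"
	then
		echo "refusing to delete: trash='${trash}' md5='${md5}'" >&2
		return 1
	fi
	rm -rf -- "${trash}/${md5}"
}

# Usage sketch:
# purge_entry "${TRASH}" "${md5}"
```

With both checks in place, a corrupted line in the meta file can no longer turn the rm into a path outside the recycle directory.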

This script can run either as a scheduled task or as a background process. Since background processes are often killed for various reasons in a real development environment, it is better to run it as a scheduled task once a day (at 23:59 in the entry below). Add the entry with crontab -e; crontab -l then shows it:

chmod +x clean_recycle.sh
crontab -l

59 23 * * * ${HOME}/remove/clean_recycle.sh
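If the entry should be installed from a script rather than typed into an editor, it helps to make that step repeatable. The sketch below only builds the new crontab text; merge_cron_entry is a hypothetical name, and the commented usage line assumes a cron implementation whose crontab command reads the new table from standard input when given "-".

```shell
#!/bin/bash
# merge_cron_entry is a hypothetical helper: read an existing crontab
# on standard input and print it with the given entry present exactly once.
merge_cron_entry() {
	local entry=$1
	# drop any previous copy of the entry (grep exits 1 if nothing is left)
	grep -vxF -- "$entry" || true
	printf '%s\n' "$entry"
}

entry='59 23 * * * ${HOME}/remove/clean_recycle.sh'
# Usage sketch (not executed here):
# crontab -l 2>/dev/null | merge_cron_entry "$entry" | crontab -
printf '' | merge_cron_entry "$entry"
```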

Finally, remove.sh itself may also be deleted by the rm command. A somewhat tricky way to prevent this is to use the chattr command to make the scripts immutable:

chattr +i remove.sh recover.sh clean_recycle.sh

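Once the immutable attribute is set, even sudo rm cannot delete these files until the attribute is removed again with chattr -i. Note that setting the attribute itself requires root privileges.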

With that, a simple recycle bin for Linux is in place.

Topics: Linux Windows bash