Design pattern: the Composite pattern, or how to design and implement a file system directory tree that supports recursive traversal

Posted by Jibberish on Wed, 05 Jan 2022 20:47:52 +0100

Introduction

The Composite pattern is entirely different from the "composition relationship" in object-oriented design (assembling two classes through composition). The Composite pattern discussed here is mainly used to process tree-structured data, where "data" can simply be understood as a set of objects.

Because of this specific application scenario, the data must be representable as a tree, which is why the pattern is not that common in practical development. However, when the data does fit a tree structure, applying the pattern pays off and makes the code very concise.

Principle and implementation of the Composite pattern

In the GoF book "Design Patterns", the Composite pattern is defined as follows: organize a group of objects into a tree structure to represent a "part-whole" hierarchy.

The Composite pattern lets the client (in many design pattern books, "client" refers to the code that uses the classes) handle single objects and composite objects with the same processing logic.

For example, suppose we are asked to design a class that represents a directory in a file system and conveniently supports the following operations:

  • Dynamically add or delete subdirectories or files under a directory
  • Count the number of files in the specified directory
  • Count the total size of files in the specified directory

The code is as follows. In this implementation, files and directories are both represented by the FileSystemNode class and are distinguished by the isFile attribute.

import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class FileSystemNode {
	private String path;
	private boolean isFile;
	private List<FileSystemNode> subNodes = new ArrayList<>();
	
	public FileSystemNode(String path, boolean isFile) {
		this.path = path;
		this.isFile = isFile;
	}

	public int countNumOfFiles() {
		if (isFile) {
			return 1;
		}
		int numOfFiles = 0;
		for (FileSystemNode fileOrDir : subNodes) {
			numOfFiles += fileOrDir.countNumOfFiles();
		}
		return numOfFiles;
	}
	
	public long countSizeOfFiles() {
		if (isFile) {
			File file = new File(path);
			// A file that does not exist on disk contributes 0 bytes
			if (!file.exists()) return 0;
			return file.length();
		}
		long sizeofFiles = 0;
		for (FileSystemNode fileOrDir : subNodes) {
			sizeofFiles += fileOrDir.countSizeOfFiles();
		}
		return sizeofFiles;
	}
	
	public String getPath() {
		return path;
	}
	
	public void addSubNode(FileSystemNode fileOrDir) {
		subNodes.add(fileOrDir);
	}
	
	public void removeSubNode(FileSystemNode fileOrDir) {
		int size = subNodes.size();
		int i = 0;
		for (; i < size; ++i) {
			if (subNodes.get(i).getPath().equalsIgnoreCase(fileOrDir.getPath())) {
				break;
			}
		}
		
		if (i < size) {
			subNodes.remove(i);
		}
	}
}

Functionally, the code above is fine and does what we want. However, if we are developing a large-scale system, then from the perspective of extensibility (files and directories may need different operations), business modeling (files and directories are two distinct business concepts), and code readability (treating files and directories separately matches how people think about the domain), it is better to model files and directories separately, as a File class and a Directory class. The refactored code is as follows:

public abstract class FileSystemNode {
	protected String path;
	
	public FileSystemNode(String path) {
		this.path = path;
	}
	
	public abstract int countNumOfFiles();
	
	public abstract long countSizeOfFiles();
	
	public String getPath() {
		return path;
	}
}

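// This File class has the same simple name as java.io.File,
// so java.io.File is referred to by its fully qualified name below.
public class File extends FileSystemNode {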
	public File(String path) {
		super(path);
	}
	
	@Override
	public int countNumOfFiles() {
		return 1;
	}
	
	@Override
	public long countSizeOfFiles() {
		java.io.File file = new java.io.File(path);
		if (!file.exists()) return 0;
		return file.length();
	}
}

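// Directory.java (each public class lives in its own source file)
import java.util.ArrayList;
import java.util.List;

public class Directory extends FileSystemNode {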
	private List<FileSystemNode> subNodes = new ArrayList<>();
	
	public Directory(String path) {
		super(path);
	}
	
	@Override
	public int countNumOfFiles() {
		int numOfFiles = 0;
		for (FileSystemNode fileOrDir : subNodes) {
			numOfFiles += fileOrDir.countNumOfFiles();
		}
		return numOfFiles;
	}
	
	@Override
	public long countSizeOfFiles() {
		long sizeofFiles = 0;
		for (FileSystemNode fileOrDir : subNodes) {
			sizeofFiles += fileOrDir.countSizeOfFiles();
		}
		return sizeofFiles;
	}
	
	public void addSubNode(FileSystemNode fileOrDir) {
		subNodes.add(fileOrDir);
	}
	
	public void removeSubNode(FileSystemNode fileOrDir) {
		int size = subNodes.size();
		int i = 0;
		for (; i < size; ++i) {
			if (subNodes.get(i).getPath().equalsIgnoreCase(fileOrDir.getPath())) {
				break;
			}
		}
		if (i < size) {
			subNodes.remove(i);
		}
	}
}

With the File and Directory classes in place, let's see how to use them to represent the directory tree of a file system. A concrete example follows:

public class Demo {
	public static void main(String[] args) {
		/**
		* /
		* /wz/
		* /wz/a.txt
		* /wz/b.txt
		* /wz/movies/
		* /wz/movies/c.avi
		* /xzg/
		* /xzg/docs/
		* /xzg/docs/d.txt
		*/
		Directory fileSystemTree = new Directory("/");
		Directory node_wz = new Directory("/wz/");
		Directory node_xzg = new Directory("/xzg/");
		fileSystemTree.addSubNode(node_wz);
		fileSystemTree.addSubNode(node_xzg);
		File node_wz_a = new File("/wz/a.txt");
		File node_wz_b = new File("/wz/b.txt");
		Directory node_wz_movies = new Directory("/wz/movies/");
		node_wz.addSubNode(node_wz_a);
		node_wz.addSubNode(node_wz_b);
		node_wz.addSubNode(node_wz_movies);
		File node_wz_movies_c = new File("/wz/movies/c.avi");
		node_wz_movies.addSubNode(node_wz_movies_c);
		Directory node_xzg_docs = new Directory("/xzg/docs/");
		node_xzg.addSubNode(node_xzg_docs);
		File node_xzg_docs_d = new File("/xzg/docs/d.txt");
		node_xzg_docs.addSubNode(node_xzg_docs_d);
		System.out.println("/ files num:" + fileSystemTree.countNumOfFiles());
		System.out.println("/wz/ files num:" + node_wz.countNumOfFiles());
	}
}

With this example in mind, let's revisit the definition of the Composite pattern: organize a group of objects (files and directories) into a tree structure to represent a "part-whole" hierarchy (the nested structure of directories and subdirectories). The Composite pattern lets the client handle single objects (files) and composite objects (directories) with the same processing logic (recursive traversal).
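To make "the same processing logic" concrete, here is a minimal sketch of a client method that accepts any node without knowing whether it is a file or a directory. The names Node, FileNode, DirNode and describe() are hypothetical, and file sizes are kept in memory rather than read from disk so that the sketch runs anywhere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; class names differ from the article's File/Directory on purpose.
class UniformClientDemo {
    // The client depends only on the abstract Node type,
    // so a single file and a whole directory tree are handled the same way.
    static String describe(Node node) {
        return node.getPath() + " -> " + node.countNumOfFiles() + " file(s), "
                + node.countSizeOfFiles() + " byte(s)";
    }

    public static void main(String[] args) {
        DirNode root = new DirNode("/");
        DirNode wz = new DirNode("/wz/");
        FileNode a = new FileNode("/wz/a.txt", 10);
        root.addSubNode(wz);
        wz.addSubNode(a);
        wz.addSubNode(new FileNode("/wz/b.txt", 20));

        System.out.println(describe(a));    // /wz/a.txt -> 1 file(s), 10 byte(s)
        System.out.println(describe(wz));   // /wz/ -> 2 file(s), 30 byte(s)
        System.out.println(describe(root)); // / -> 2 file(s), 30 byte(s)
    }
}

abstract class Node {
    protected final String path;

    Node(String path) { this.path = path; }

    String getPath() { return path; }

    abstract int countNumOfFiles();

    abstract long countSizeOfFiles();
}

// Leaf node. Assumption: the size is supplied in memory instead of read from disk.
class FileNode extends Node {
    private final long size;

    FileNode(String path, long size) { super(path); this.size = size; }

    @Override int countNumOfFiles() { return 1; }

    @Override long countSizeOfFiles() { return size; }
}

// Composite node: delegates to its children recursively.
class DirNode extends Node {
    private final List<Node> subNodes = new ArrayList<>();

    DirNode(String path) { super(path); }

    void addSubNode(Node node) { subNodes.add(node); }

    @Override int countNumOfFiles() {
        int numOfFiles = 0;
        for (Node child : subNodes) numOfFiles += child.countNumOfFiles();
        return numOfFiles;
    }

    @Override long countSizeOfFiles() {
        long sizeOfFiles = 0;
        for (Node child : subNodes) sizeOfFiles += child.countSizeOfFiles();
        return sizeOfFiles;
    }
}
```

Note that describe() contains no type check at all: the recursion lives inside the node classes, and the client simply calls the shared methods.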

In fact, the Composite pattern described here is less a design pattern than an abstraction of a business scenario into a data structure and an algorithm: the data can be represented as a tree, and the business requirements can be met by a recursive traversal of that tree.

Problem: the implementations of countNumOfFiles() and countSizeOfFiles() are not efficient, because every call traverses the whole subtree again. Is there a way to make these two functions faster?

Answer: the root cause is repeated computation, which recursive code should always guard against. One option is to store each (path, size) pair in a hash table so that the size can be returned directly by path; whenever a node is added or removed, the stored sizes are updated accordingly.
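The answer above can be sketched as follows. This is a variant rather than the hash table itself: each directory keeps its own cached file count and total size, and every add or remove pushes the change up through the ancestors, which assumes each node stores a reference to its parent. The names CachedNode, CachedFile, CachedDir and the in-memory size field are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of cached counters that are maintained on add/remove.
class CachedCountDemo {
    public static void main(String[] args) {
        CachedDir root = new CachedDir("/");
        CachedDir wz = new CachedDir("/wz/");
        root.addSubNode(wz);
        CachedFile a = new CachedFile("/wz/a.txt", 10);
        wz.addSubNode(a);
        wz.addSubNode(new CachedFile("/wz/b.txt", 20));
        System.out.println(root.countNumOfFiles() + " files, " + root.countSizeOfFiles() + " bytes");
        wz.removeSubNode(a);
        System.out.println(root.countNumOfFiles() + " files, " + root.countSizeOfFiles() + " bytes");
    }
}

abstract class CachedNode {
    protected final String path;
    protected CachedDir parent; // null for the root or for a detached node

    CachedNode(String path) { this.path = path; }

    abstract int countNumOfFiles();

    abstract long countSizeOfFiles();
}

// Assumption: the file size is fixed and supplied in memory.
class CachedFile extends CachedNode {
    private final long size;

    CachedFile(String path, long size) { super(path); this.size = size; }

    @Override int countNumOfFiles() { return 1; }

    @Override long countSizeOfFiles() { return size; }
}

class CachedDir extends CachedNode {
    private final List<CachedNode> subNodes = new ArrayList<>();
    private int cachedNumOfFiles;
    private long cachedSizeOfFiles;

    CachedDir(String path) { super(path); }

    void addSubNode(CachedNode node) {
        if (node.parent != null) throw new IllegalStateException("node already has a parent");
        // Refuse to attach an ancestor below itself, which would create a cycle.
        for (CachedDir d = this; d != null; d = d.parent) {
            if (d == node) throw new IllegalArgumentException("cannot add an ancestor as a child");
        }
        subNodes.add(node);
        node.parent = this;
        propagate(node.countNumOfFiles(), node.countSizeOfFiles());
    }

    void removeSubNode(CachedNode node) {
        if (subNodes.remove(node)) {
            node.parent = null;
            propagate(-node.countNumOfFiles(), -node.countSizeOfFiles());
        }
    }

    // Apply the change to this directory and to every ancestor up to the root.
    private void propagate(int numDelta, long sizeDelta) {
        for (CachedDir d = this; d != null; d = d.parent) {
            d.cachedNumOfFiles += numDelta;
            d.cachedSizeOfFiles += sizeDelta;
        }
    }

    // Both queries return the cached values without traversing the subtree.
    @Override int countNumOfFiles() { return cachedNumOfFiles; }

    @Override long countSizeOfFiles() { return cachedSizeOfFiles; }
}
```

The trade-off is that queries become constant-time while each add or remove costs time proportional to the depth of the tree. The cache also only stays correct if sizes change through these methods; a file that grows on disk would need an explicit update.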

Examples of application scenarios of the Composite pattern

We have just gone through the file system example. Here is one more example of the Composite pattern. If you understand these two examples, you have basically mastered the pattern. In a real project, when you meet a similar business scenario that can be represented as a tree, you only need to follow the same template.

Suppose we are developing an OA system (office automation system). The company's organizational structure contains two types of data: departments and employees. A department can contain sub-departments and employees. The table structure in the database is as follows:

We want to build the company's organization chart (the relationships among departments, sub-departments, and employees) in memory and provide an interface that calculates a department's salary cost (the total salary of all employees belonging to that department).

Departments contain sub-departments and employees. This is a nested structure that can be represented as a tree, and the salary cost of each department can be calculated by traversing that tree. From this perspective, the scenario can be designed and implemented with the Composite pattern.

The code is as follows. HumanResource is the parent class abstracted from the Department and Employee classes in order to unify the salary processing logic. The code in Demo is responsible for reading data from the database and building the organization chart in memory.

public abstract class HumanResource {
	protected long id;
	protected double salary;
	
	public HumanResource(long id) {
		this.id = id;
	}
	
	public long getId() {
		return id;
	}
	
	public abstract double calculateSalary();
}

public class Employee extends HumanResource {
	public Employee(long id, double salary) {
		super(id);
		this.salary = salary;
	}
	
	@Override
	public double calculateSalary() {
		return salary;
	}
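}

// Department.java (each public class lives in its own source file)
import java.util.ArrayList;
import java.util.List;

public class Department extends HumanResource {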
	private List<HumanResource> subNodes = new ArrayList<>();
	
	public Department(long id) {
		super(id);
	}
	
	@Override
	public double calculateSalary() {
		double totalSalary = 0;
		for (HumanResource hr : subNodes) {
			totalSalary += hr.calculateSalary();
		}
		this.salary = totalSalary;
		return totalSalary;
	}

	public void addSubNode(HumanResource hr) {
		subNodes.add(hr);
	}
}

// Code for building the organizational structure
import java.util.List;

public class Demo {
	private static final long ORGANIZATION_ROOT_ID = 1001;
	private DepartmentRepo departmentRepo; // Dependency injection
	private EmployeeRepo employeeRepo; // Dependency injection
	
	public void buildOrganization() {
		Department rootDepartment = new Department(ORGANIZATION_ROOT_ID);
		buildOrganization(rootDepartment);
	}
	
	private void buildOrganization(Department department) {
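		// Argument assumed to be department.getId(); the source line is cut off after "department"
		List<Long> subDepartmentIds = departmentRepo.getSubDepartmentIds(department.getId());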
		for (Long subDepartmentId : subDepartmentIds) {
			Department subDepartment = new Department(subDepartmentId);
			department.addSubNode(subDepartment);
			buildOrganization(subDepartment);
		}
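		// Argument assumed to be department.getId(); the source line is cut off after "department.g"
		List<Long> employeeIds = employeeRepo.getDepartmentEmployeeIds(department.getId());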
		for (Long employeeId : employeeIds) {
			double salary = employeeRepo.getEmployeeSalary(employeeId);
			department.addSubNode(new Employee(employeeId, salary));
		}
	}
}

Let's check the definition of the Composite pattern against this example: organize a group of objects (employees and departments) into a tree structure to represent a "part-whole" hierarchy (the nested structure of departments and sub-departments). The Composite pattern lets the client handle single objects (employees) and composite objects (departments) with the same processing logic (recursive traversal).
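To see the organization example run end to end, here is a minimal sketch that replaces the database with in-memory maps. The maps SUB_DEPARTMENTS, DEPARTMENT_EMPLOYEES and SALARIES are hypothetical stand-ins for DepartmentRepo and EmployeeRepo, the sample IDs and salaries are invented for illustration, and the classes are renamed OrgNode, Staff and Dept to keep them apart from the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrgDemo {
    // Hypothetical in-memory replacements for the repositories.
    static final Map<Long, List<Long>> SUB_DEPARTMENTS = new HashMap<>();
    static final Map<Long, List<Long>> DEPARTMENT_EMPLOYEES = new HashMap<>();
    static final Map<Long, Double> SALARIES = new HashMap<>();

    // Recursively builds the department with the given id, including its subtree.
    static Dept build(long departmentId) {
        Dept dept = new Dept(departmentId);
        for (long subId : SUB_DEPARTMENTS.getOrDefault(departmentId, Collections.emptyList())) {
            dept.addSubNode(build(subId));
        }
        for (long employeeId : DEPARTMENT_EMPLOYEES.getOrDefault(departmentId, Collections.emptyList())) {
            dept.addSubNode(new Staff(employeeId, SALARIES.get(employeeId)));
        }
        return dept;
    }

    public static void main(String[] args) {
        // Invented sample data: department 1001 contains department 1002 and employee 1.
        SUB_DEPARTMENTS.put(1001L, Arrays.asList(1002L));
        DEPARTMENT_EMPLOYEES.put(1001L, Arrays.asList(1L));
        DEPARTMENT_EMPLOYEES.put(1002L, Arrays.asList(2L, 3L));
        SALARIES.put(1L, 5000.0);
        SALARIES.put(2L, 6000.0);
        SALARIES.put(3L, 7000.0);

        Dept root = build(1001L);
        System.out.println(root.calculateSalary()); // prints 18000.0
    }
}

abstract class OrgNode {
    protected final long id;

    OrgNode(long id) { this.id = id; }

    abstract double calculateSalary();
}

// Leaf: an employee's cost is simply the employee's own salary.
class Staff extends OrgNode {
    private final double salary;

    Staff(long id, double salary) { super(id); this.salary = salary; }

    @Override double calculateSalary() { return salary; }
}

// Composite: a department's cost is the sum over sub-departments and employees.
class Dept extends OrgNode {
    private final List<OrgNode> subNodes = new ArrayList<>();

    Dept(long id) { super(id); }

    void addSubNode(OrgNode node) { subNodes.add(node); }

    @Override double calculateSalary() {
        double totalSalary = 0;
        for (OrgNode node : subNodes) totalSalary += node.calculateSalary();
        return totalSalary;
    }
}
```

Unlike buildOrganization() in the listing above, build() returns the node it creates, so the caller keeps a reference to the root and can query any subtree afterwards.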

Summary

The Composite pattern is less a design pattern than an abstraction of a business scenario into a data structure and an algorithm: the data can be represented as a tree, and the business requirements can be met by a recursive traversal of that tree.

The Composite pattern organizes a group of objects into a tree structure and treats single objects and composite objects alike as nodes of the tree, so that the processing logic is unified. It exploits the tree structure to process each subtree recursively, which in turn simplifies the code. The prerequisite for using the Composite pattern is that your business scenario can be represented as a tree. Its application scenarios are therefore relatively limited, and it is not a very commonly used design pattern.