7-30 Directory Tree: a clearly reasoned C++ solution

Posted by tarado on Thu, 16 Dec 2021 23:44:07 +0100


In the ZIP archive, the relative paths and names of all compressed files and directories are retained. When you open a ZIP archive using GUI software such as WinZIP, you can reconstruct the tree structure of the directory from this information. Please write a program to rebuild the tree structure of the directory.

Input format:

The first line of input gives a positive integer N (≤ 10^4), the number of files and directories in the ZIP archive. Each of the next N lines gives the relative path and name of a file or directory in the following format (no more than 260 characters per line):

The characters in paths and names are English letters only (case sensitive);
The symbol "\" appears only as a path separator;
Directories end with the symbol "\";
There are no duplicate input items;
The entire input does not exceed 2 MB.

Output format:

Assume that all paths are relative to the root directory. Starting from the root directory, each directory first outputs its own name, then all of its subdirectories in dictionary order, and then all of its files in dictionary order. Indent with spaces to reflect the nesting: each level of directory or file is indented 2 more spaces than the level above it.

Input example:

7
b
c\
ab\cd
a\bc
ab\d
a\d\a
a\d\z\

Output example:

root
  a
    d
      z
      a
    bc
  ab
    cd
    d
  c
  b

Analysis: this problem immediately brings to mind a similar one, 7-27 Genealogy Processing.

A brief review: 7-27 gives a tree structure and asks you to judge the relationship between two nodes. This problem gives file paths and asks you to output the directory structure, so it is very much like the inverse of 7-27.

program = data structure + algorithm

Design of data structure

First, the relation from a child node to its parent is one-to-one, while the relation from a parent to its children is one-to-many.

In 7-27 you only need to query the relationship between nodes, so a hash map that stores each node's parent is enough, and a query iterates upward through it. That is bottom-up and one-to-one, so a map recording parents suffices.
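
The bottom-up lookup described for 7-27 can be sketched roughly as follows. This is only an illustration, assuming each name maps to its parent's name in a std::unordered_map; the helper name isAncestor and the sample names in the usage note are made up for this sketch and do not come from the 7-27 solution.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: follows parent links upward from `node` and
// reports whether `candidate` is reached along the way.
bool isAncestor(const std::unordered_map<std::string, std::string>& parent,
                const std::string& candidate, const std::string& node) {
    auto it = parent.find(node);
    while (it != parent.end()) {
        if (it->second == candidate) return true;  // found on the upward path
        it = parent.find(it->second);              // climb one level
    }
    return false;  // reached a node with no recorded parent
}
```

With a map such as {Frank → Robert, Robert → John}, isAncestor(parent, "John", "Frank") climbs two links and returns true. Each node has exactly one parent, which is why one map entry per node is enough.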

In this problem you need to output the entire directory from the root down. That is top-down and one-to-many, so each node needs a structure that stores all of its children (①).

The problem has two kinds of entries, directories and files, and a directory and a file may share a name, so each node needs a flag marking which kind it is (②) along with its own name (③).

To sum up, the node data structure of the tree is:

struct node {
	string name;
	int isCata;				// 1 = directory, 0 = file
	vector<node*> child;	// Child pointer
};

Algorithm design

The overall framework of the solution is to build the tree first and then output it.

Directory tree
	|
	- 1. Build the tree
	|
	- 2. Output

1. Building the tree

While building the tree, scan each string from left to right and extract the words in it. A word ends at either a '\' or the end of the string.

  1. Ends with '\': the word is a directory name
  2. Ends at the end of the string: the word is a file name

Use a variable curRoot to hold a pointer to the current parent directory while processing a string. Each extracted name is inserted into the current parent directory; if the name is a directory, curRoot then moves to it.

 a\d\a
↑ Before traversing, curRoot = root

 a\d\a
  ↑ Extract directory a. It does not exist yet, so create it; curRoot = a

 a\d\a
    ↑ Extract directory d. It does not exist yet, so create it; curRoot = d

 a\d\a
      ↑ Extract file a. Create the file node under d; curRoot stays d
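
The word extraction traced above can be isolated into a small function. The following is a minimal sketch, not part of the original solution; the name splitPath and the (name, isDirectory) pair representation are assumptions made for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits one input line on '\' and returns
// (name, isDirectory) pairs in the order they appear.
std::vector<std::pair<std::string, bool>> splitPath(const std::string& path) {
    std::vector<std::pair<std::string, bool>> parts;
    std::string word;
    for (char ch : path) {
        if (ch == '\\') {
            parts.emplace_back(word, true);   // a word closed by '\' is a directory
            word.clear();
        } else {
            word += ch;                       // accumulate characters of the word
        }
    }
    if (!word.empty()) parts.emplace_back(word, false);  // trailing word is a file
    return parts;
}
```

For the string a\d\a this yields directory a, directory d, file a, which is the same sequence as the trace. For a\d\z\ the trailing word is empty, so all three parts are directories.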

To sum up, the tree-building process is:

1. Tree building: scan each string
	|
	- 1.1 str[i] == '\' : extract the directory name and switch the current parent directory
		
		- 1.1.1 The directory exists: switch to it directly
		
		- 1.1.2 The directory does not exist: create it, then switch
		
	- 1.2 i == str.size() : extract the file name and insert it into the current parent directory
	
	- 1.3 Otherwise: accumulate the character

The code is as follows:

	// Establish root node 
	node* root = new node;
	root->name = "root";
	root->isCata = 1;


	
	string tmp,str;
	node* curRoot;
	for(int j = 0; j < n; ++j) {
	
		// For each new path, start again from the root
		curRoot = root;

		getline(cin,str);
		
		
		for(int i = 0; i <= str.size(); ++i) {
			if(str[i] == '\\') {	// Case 1: directory, so switch the current directory

				// Search in the current parent directory to see if it exists
				int flag = 0;
				for(int k = 0; k < curRoot->child.size(); ++k) {
				// 1.1 there is this directory
					if(curRoot->child[k]->name == tmp && curRoot->child[k]->isCata == 1) {	
						// Then switch to the current directory
						curRoot =  curRoot->child[k];
						flag = 1;
						break;
					}
				}

				// 1.2 if there is no such directory, create one
				if(!flag) { 

					// Create node
					node* newnode = new node;
					newnode->name = tmp;
					newnode->isCata = 1;

					// Join parent directory
					curRoot->child.push_back(newnode) ;

					// Switch current directory
					curRoot = newnode;
				}

				// Word reset
				tmp.clear();
			
			// Case 2: end of the string, so any pending word is a file
			}else if(i == str.size()) {		
				if(!tmp.empty()) {	// Reached the end with a non-empty word: it is a file
					// Add file to parent node

					node* newnode = new node;
					newnode->name = tmp;
					newnode->isCata = 0;

					curRoot->child.push_back(newnode) ;
				}

				tmp.clear();
			} else {						// Case 3 Accumulate word letters
				tmp += str[i];
			}
		}
	}
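
The same logic can also be factored into two small functions, which makes the "find or create" step easier to see. This is a sketch under the same node layout as above; the helper names enterDir and insertPath are hypothetical, and like the original code it never frees the allocated nodes.

```cpp
#include <string>
#include <vector>

struct node {
    std::string name;
    int isCata;                 // 1 = directory, 0 = file
    std::vector<node*> child;   // child pointers
};

// Hypothetical helper: returns the subdirectory `name` of `cur`,
// creating it first if it does not exist (steps 1.1.1 and 1.1.2).
node* enterDir(node* cur, const std::string& name) {
    for (node* c : cur->child)
        if (c->isCata == 1 && c->name == name) return c;
    node* created = new node{name, 1, {}};
    cur->child.push_back(created);
    return created;
}

// Hypothetical helper: inserts one input line below `root`.
void insertPath(node* root, const std::string& path) {
    node* cur = root;           // every path starts at the root
    std::string word;
    for (char ch : path) {
        if (ch == '\\') {       // step 1.1: directory name complete
            cur = enterDir(cur, word);
            word.clear();
        } else {                // step 1.3: accumulate
            word += ch;
        }
    }
    // Step 1.2: a non-empty trailing word is a file
    if (!word.empty()) cur->child.push_back(new node{word, 0, {}});
}
```

The linear search in enterDir checks isCata as well as the name, so a file and a directory with the same name under one parent stay distinct.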

2. Output

The output is relatively simple: a DFS traversal prints the tree. Just pay attention to two points:

  1. Directories come first and files second, each in ascending dictionary order
  2. Track the depth, which determines the number of leading spaces

The code is as follows:

bool cmp(node* a, node* b) {
	if(a->isCata != b->isCata)	return a->isCata > b->isCata;
	else return a->name < b->name;
}

void dfs(node* root,int level) {
	if(root == NULL)	return ;

	// Output this node first
	for(int i  = 0; i < level; ++i)	printf("  ");
	printf("%s\n",root->name.c_str()) ;

	// Sort all children: directories first, then files, each in ascending dictionary order
	sort(root->child.begin(),root->child.end(),cmp);

	// Downward recursion
	for(int i = 0; i < root->child.size(); ++i)
		dfs(root->child[i],level+1);

}
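
To check the ordering and indentation without reading console output, the traversal can write into a string instead. The following is a sketch rather than the author's code: render is a hypothetical variant of dfs, and the tiny tree in the usage note is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct node {
    std::string name;
    int isCata;                 // 1 = directory, 0 = file
    std::vector<node*> child;   // child pointers
};

// Directories before files; within each kind, ascending by name.
bool cmp(const node* a, const node* b) {
    if (a->isCata != b->isCata) return a->isCata > b->isCata;
    return a->name < b->name;
}

// Hypothetical variant of dfs: appends the subtree to `out` instead of printing.
void render(node* cur, int level, std::string& out) {
    if (cur == nullptr) return;
    out += std::string(static_cast<std::size_t>(2 * level), ' ');  // 2 spaces per level
    out += cur->name + "\n";
    std::sort(cur->child.begin(), cur->child.end(), cmp);
    for (node* c : cur->child) render(c, level + 1, out);
}
```

For a root holding file b and directory d, where d holds file a and directory z, the rendered text lists d before b and z before a, because the comparator ranks the directory flag ahead of the name.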

3. Putting it together

Combining the above gives:

Directory tree
	|
	- 1. Tree building: scan each string
		|
		- 1.1 str[i] == '\' : extract the directory name and switch the current parent directory
					
			- 1.1.1 The directory exists: switch to it directly
		
			- 1.1.2 The directory does not exist: create it, then switch
		
		- 1.2 i == str.size() : extract the file name and insert it into the current parent directory
		
		- 1.3 Otherwise: accumulate the character
	
	- 2. Output
		|
		- 2.1 Custom sorting
		
		- 2.2 Leading spaces

The accepted (AC) code is as follows:

#include<cstdio>
#include<algorithm>
#include<iostream>
#include<string>
#include<vector>
#include<set>
#include<map>
#define MAXN 100010
using namespace std;


struct node {
	string name;
	int isCata;				// 1 = directory, 0 = file
	vector<node*> child;	// Child pointer
};

bool cmp(node* a, node* b) {
	if(a->isCata != b->isCata)	return a->isCata > b->isCata;
	else return a->name < b->name;
}

void dfs(node* root,int level) {
	if(root == NULL)	return ;

	// Output this node first
	for(int i  = 0; i < level; ++i)	printf("  ");
	printf("%s\n",root->name.c_str()) ;

	// Sort all children: directories first, then files, each in ascending dictionary order
	sort(root->child.begin(),root->child.end(),cmp);

	// Downward recursion
	for(int i = 0; i < root->child.size(); ++i)
		dfs(root->child[i],level+1);

}

int n;
int main() {

	scanf("%d",&n);
	getchar();

	node* root = new node;
	root->name = "root";
	root->isCata = 1;


	// Read in tree building process
	string tmp,str;
	node* curRoot;
	for(int j = 0; j < n; ++j) {
		// For each new path, start again from the root
		curRoot = root;

		getline(cin,str);
		// Split out words
		for(int i = 0; i <= str.size(); ++i) {
			if(i == str.size()) {
				if(!tmp.empty()) {	// Reached the end with a non-empty word: it is a file
					// Add file to parent node

					node* newnode = new node;
					newnode->name = tmp;
					newnode->isCata = 0;

					curRoot->child.push_back(newnode) ;
				}
//				cout <<"Get File: "<<tmp<<endl;
				tmp.clear();
			} else if(str[i] == '\\') {	// Directory: switch the current directory

//				cout <<"Get Catagory: "<<tmp<<endl;
				// Search in the current parent directory to see if it exists
				int flag = 0;
				for(int k = 0; k < curRoot->child.size(); ++k) {
					if(curRoot->child[k]->name == tmp && curRoot->child[k]->isCata == 1) {	// 1. There is this directory
						// Then switch to the current directory
						curRoot =  curRoot->child[k];
//						cout <<"   Already create catagory : "<<tmp<<endl;
						flag = 1;
						break;
					}
				}

				if(!flag) { // 2. If there is no such directory, create one
//					cout <<"  Do not create catagory : "<<tmp<<endl;
					// Create node
					node* newnode = new node;
					newnode->name = tmp;
					newnode->isCata = 1;

					// Join parent directory
					curRoot->child.push_back(newnode) ;

					// Switch current directory
					curRoot = newnode;
				}

				// Word reset
				tmp.clear();

			} else {						// Accumulate word letters
				tmp += str[i];
			}
		}
	}


	// Output process
	dfs(root,0);


	return 0;
}
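
As a design alternative, and not the structure used in the accepted code above, ordered containers can hold the children so that names stay sorted and no explicit sort or linear search is needed. The sketch below assumes a different node type; DirNode, insertPath and render are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <string>

// Alternative layout: std::map and std::set iterate in ascending key order.
struct DirNode {
    std::map<std::string, std::unique_ptr<DirNode>> dirs;  // subdirectories by name
    std::set<std::string> files;                           // file names
};

// Hypothetical helper: inserts one input line below `root`.
void insertPath(DirNode& root, const std::string& path) {
    DirNode* cur = &root;
    std::string word;
    for (char ch : path) {
        if (ch == '\\') {
            std::unique_ptr<DirNode>& slot = cur->dirs[word];  // find or create the entry
            if (!slot) slot.reset(new DirNode());
            cur = slot.get();
            word.clear();
        } else {
            word += ch;
        }
    }
    if (!word.empty()) cur->files.insert(word);
}

// Hypothetical helper: appends the subtree rooted at `dir` to `out`,
// subdirectories first, then files, with 2 spaces of indent per level.
void render(const DirNode& dir, const std::string& name, int level, std::string& out) {
    out += std::string(static_cast<std::size_t>(2 * level), ' ') + name + "\n";
    for (const auto& entry : dir.dirs)
        render(*entry.second, entry.first, level + 1, out);
    for (const std::string& file : dir.files)
        out += std::string(static_cast<std::size_t>(2 * (level + 1)), ' ') + file + "\n";
}
```

Keeping directories and files in separate containers also removes the need for the isCata flag, at the cost of a node type that no longer treats both kinds uniformly. Each directory lookup becomes logarithmic in the number of siblings instead of linear.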

Summary

Seen this way, the problem does not look difficult. But for a long time after I first attempted it, I felt lost whenever I met an unconventional problem. In fact I was scaring myself, especially with the exams of recent years. I hope to stay calm this year. Let's go!
