Data structures and algorithms (Java implementation) [hash table]

Posted by mart on Tue, 01 Feb 2022 08:37:30 +0100

1. Introduction

2. Hash table

(-1) Related terms

  • 1. Hash method (hashing)
    Choose a function, use it to compute each element's storage location from its key, and store the element at that location.
    To search for a given value k, apply the same function to k to compute its address, then compare k with the key of the element stored at that address to determine whether the search succeeds.

  • 2. Hash function: the conversion function used in the hash method.

  • 3. Hash table: a table constructed according to the above idea

  • 4. Conflict (collision): two different keys are mapped to the same hash address, that is, key1 ≠ key2 but H(key1) = H(key2)

(0) Storing by the hash function H(k) = k
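
As a rough illustration of storing by H(k) = k, the sketch below uses the key itself as the array index. The class name DirectTable, its capacity, and the sample values are assumptions made for this example only; they do not come from the original article.

```java
// Sketch: direct storage with H(k) = k (all names here are illustrative)
class DirectTable {
    private final String[] slots;

    DirectTable(int capacity) {
        slots = new String[capacity];
    }

    // H(k) = k: the key is used directly as the storage address
    static int hash(int k) {
        return k;
    }

    void put(int key, String value) {
        slots[hash(key)] = value; // throws if key is outside [0, capacity)
    }

    String get(int key) {
        return slots[hash(key)]; // null means "nothing stored here"
    }

    public static void main(String[] args) {
        DirectTable t = new DirectTable(100);
        t.put(42, "Alice");
        System.out.println(t.get(42)); // Alice
        System.out.println(t.get(7));  // null
    }
}
```

No two keys can collide under this function, but the array must be as large as the key range, which is why it only suits small, dense key sets.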

(1) Thinking about the storage process

  • Choose a function and use it to compute each element's storage location from its key: Loc(i) = H(key)
  • Conflict
    key1 and key2 are mapped to the same address (the keys differ, key1 ≠ key2, but their hash values are equal, H(key1) = H(key2))
    In hash-based lookup, conflicts are unavoidable because there are usually more possible keys than addresses; they can only be reduced as much as possible
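
A conflict is easy to reproduce. The minimal sketch below assumes the hash function H(key) = key mod 13 and two sample keys, 14 and 27; the class name CollisionDemo is made up for this illustration.

```java
// Sketch: two different keys that land on the same hash address
class CollisionDemo {
    // Assumed hash function for this example: H(key) = key mod 13
    static int hash(int key) {
        return key % 13;
    }

    public static void main(String[] args) {
        int key1 = 14;
        int key2 = 27;
        System.out.println(hash(key1)); // 1
        System.out.println(hash(key2)); // 1
        // Different keys, same address: this is a conflict
        System.out.println(key1 != key2 && hash(key1) == hash(key2)); // true
    }
}
```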

(2) Search procedure in case of storage conflict

3. Two problems to solve when using a hash table

  • (1) Constructing the hash function
    The function should be as simple as possible so that addresses are computed quickly;
    The addresses it computes for the keys should be evenly distributed over the hash table to reduce wasted space.
  • (2) Choosing a good conflict resolution scheme
    When searching, if the key is not found at the address computed by the hash function, the other candidate units are probed in the order laid down by the conflict resolution rule.

4. Construction methods of common hash functions

(1) Direct address method

(2) Division-remainder method
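
The division-remainder method is commonly written H(key) = key mod p, where p is usually chosen as a prime no larger than the table length m. The sketch below follows that convention; the class name DivisionHash and the helper largestPrimeAtMost are hypothetical names introduced for this example.

```java
// Sketch: division-remainder hashing with a prime modulus p <= m
class DivisionHash {
    // Largest prime p with p <= m, found by simple trial division
    static int largestPrimeAtMost(int m) {
        for (int p = m; p >= 2; p--) {
            if (isPrime(p)) {
                return p;
            }
        }
        throw new IllegalArgumentException("m must be at least 2");
    }

    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // H(key) = key mod p; Math.floorMod keeps the result in [0, p) for negative keys too
    static int hash(int key, int p) {
        return Math.floorMod(key, p);
    }

    public static void main(String[] args) {
        int m = 16;                      // table length
        int p = largestPrimeAtMost(m);
        System.out.println(p);           // 13
        System.out.println(hash(68, p)); // 3
        System.out.println(hash(-1, p)); // 12
    }
}
```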

5. Common conflict resolution methods

(1) Open address method

[1] Basic idea:

If there is a conflict, look for the next empty hash address. As long as the hash table is large enough, an empty address can always be found and the element stored there.
For example, with the division-remainder method: Hi = (Hash(key) + di) mod m, where di is an increment sequence.

[2] Common methods for determining di

[2.1] Linear probing

Hi = (Hash(key) + di) mod m (1 <= i < m)
Where: m is the hash table length (when the division-remainder method is used, its modulus is usually a prime no larger than the table length)
di is the increment sequence 1, 2, 3, ..., m-1, that is, di = i
Tip: on a conflict, try the following addresses one by one according to the increment sequence.
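
The probing rule above can be sketched as follows. This is a minimal illustration, assuming integer keys, Hash(key) = key mod p, and no deletions; the class name LinearProbingTable and its method names are invented for the example.

```java
// Sketch: open addressing with linear probing, Hi = (Hash(key) + di) mod m, di = i
class LinearProbingTable {
    private final Integer[] slots; // null marks an empty slot
    private final int m;           // table length
    private final int p;           // modulus of the hash function

    LinearProbingTable(int m, int p) {
        this.m = m;
        this.p = p;
        this.slots = new Integer[m];
    }

    private int hash(int key) {
        return Math.floorMod(key, p);
    }

    // Returns the address where key is stored, or -1 if the table is full
    int insert(int key) {
        int h = hash(key);
        for (int d = 0; d < m; d++) { // d = 0 is the home address, then di = 1, 2, ..., m-1
            int addr = (h + d) % m;
            if (slots[addr] == null) {
                slots[addr] = key;
                return addr;
            }
            if (slots[addr] == key) {
                return addr;          // key is already stored
            }
        }
        return -1;
    }

    // Returns the address of key, or -1 if it is not present
    int find(int key) {
        int h = hash(key);
        for (int d = 0; d < m; d++) {
            int addr = (h + d) % m;
            if (slots[addr] == null) {
                return -1;            // an empty slot ends the probe sequence
            }
            if (slots[addr] == key) {
                return addr;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingTable t = new LinearProbingTable(16, 13);
        System.out.println(t.insert(14)); // 1
        System.out.println(t.insert(27)); // 2 (address 1 is taken)
        System.out.println(t.insert(1));  // 3 (addresses 1 and 2 are taken)
        System.out.println(t.find(27));   // 2
        System.out.println(t.find(40));   // -1
    }
}
```

Deletion is deliberately left out: simply clearing a slot would cut probe sequences short, so a real table would need some form of deleted marker.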

[2.2] Example

(2) Chain address method (zipper method)

[1] Basic idea:

Records with the same hash address are linked into one singly linked list.
Set up m singly linked lists for the m hash addresses, and store their head pointers in an array, forming a dynamic structure.
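
The idea can be sketched with the standard library before writing the lists by hand. The example below assumes integer keys and H(key) = key mod m; the class name ChainedTable and its methods are illustrative, and java.util.LinkedList stands in for the hand-written singly linked list.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch: chain address method, one linked list per hash address
class ChainedTable {
    private final List<LinkedList<Integer>> buckets = new ArrayList<>();
    private final int m; // number of hash addresses

    ChainedTable(int m) {
        this.m = m;
        for (int i = 0; i < m; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    private int hash(int key) {
        return Math.floorMod(key, m);
    }

    // Appends key to the chain of its hash address (duplicates are ignored)
    void insert(int key) {
        LinkedList<Integer> chain = buckets.get(hash(key));
        if (!chain.contains(key)) {
            chain.add(key);
        }
    }

    boolean contains(int key) {
        return buckets.get(hash(key)).contains(key);
    }

    // Exposes one chain so the structure can be inspected
    List<Integer> chain(int address) {
        return buckets.get(address);
    }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable(13);
        for (int key : new int[] {14, 27, 1, 19}) {
            t.insert(key);
        }
        System.out.println(t.chain(1));    // [14, 27, 1]
        System.out.println(t.chain(6));    // [19]
        System.out.println(t.contains(5)); // false
    }
}
```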

[2] Example

6. Comparison and summary

Search efficiency analysis of hash table

Conclusion: division-remainder method + chain address method --> hash table

Conclusion verification:

Given the keys (19, 14, 23, 1, 68, 20, 84, 27, 55, 11, 10, 79), the hash function H(key) = key MOD 13, and a hash table of length m = 16, assume every record is searched with equal probability.

  • (1) Handle conflicts with the open address method + linear probing, that is, Hi = (H(key) + di) MOD m
  • (2) Handle conflicts with the chain address method
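
One way to check the comparison is to count how many key comparisons a successful search needs for each key and average over the 12 keys. The sketch below does this for both schemes. It assumes keys are inserted in the order given, that each chain appends new keys at its tail, and that no deletions occur, so finding a key costs as many comparisons as inserting it did. The class name AslDemo is made up for this example.

```java
// Sketch: average successful search length for the 12 keys under both schemes
class AslDemo {
    static final int[] KEYS = {19, 14, 23, 1, 68, 20, 84, 27, 55, 11, 10, 79};
    static final int P = 13; // H(key) = key MOD 13
    static final int M = 16; // hash table length

    // Total comparisons to find every key once, using linear probing
    static int linearProbingComparisons() {
        Integer[] table = new Integer[M];
        int total = 0;
        for (int key : KEYS) {
            int h = key % P;
            int d = 0;
            while (table[(h + d) % M] != null) {
                d++; // occupied: try the next address
            }
            table[(h + d) % M] = key;
            total += d + 1; // d failed probes plus the final hit
        }
        return total;
    }

    // Total comparisons to find every key once, using chaining (append at tail)
    static int chainingComparisons() {
        int[] chainLength = new int[P];
        int total = 0;
        for (int key : KEYS) {
            int h = key % P;
            chainLength[h]++;
            total += chainLength[h]; // position of the key within its chain
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(linearProbingComparisons() / (double) KEYS.length); // 2.5
        System.out.println(chainingComparisons() / (double) KEYS.length);      // 1.75
    }
}
```

Under these assumptions linear probing needs 30 comparisons in total (average 2.5) and chaining needs 21 (average 1.75), which is consistent with the conclusion above.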

7. Hand-written hash table in Java (bare-bones version)

(1) A coding test question from Google:

A company wants to record each new employee's information (id, gender, age, name, address, ...) when the employee reports for duty. Given an employee's id, it must be possible to look up all of that employee's information.

(2) Requirements

  • 1) Do not use a database; the faster the better => hash table
  • 2) When adding, ensure ids are inserted in ascending order [exercise: if ids do not arrive in ascending order, but each linked list must still be kept in ascending order, how would you solve it?]
  • 3) Implement the hash table with linked lists. The linked list has no dummy head node [that is, the first node of the linked list already stores employee information]
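
For the exercise in requirement 2), one possible approach is to walk each list until the first node with a larger id and link the new employee in front of it. The sketch below is only one way to do this; the names SortedEmpList, EmpNode, addByOrder, and ids are hypothetical and do not appear in the article's code.

```java
// Sketch: keep a headless linked list sorted by id, whatever the arrival order
class SortedEmpList {
    // Minimal stand-in for an employee node (illustrative only)
    static class EmpNode {
        int id;
        String name;
        EmpNode next;

        EmpNode(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    private EmpNode head; // no dummy head: head is the first employee

    void addByOrder(EmpNode emp) {
        // Empty list, or new id smaller than the first one: emp becomes the new head
        if (head == null || emp.id < head.id) {
            emp.next = head;
            head = emp;
            return;
        }
        // Stop at the last node whose successor does not have a smaller id
        EmpNode cur = head;
        while (cur.next != null && cur.next.id < emp.id) {
            cur = cur.next;
        }
        emp.next = cur.next;
        cur.next = emp;
    }

    // Ids in list order, separated by spaces
    String ids() {
        StringBuilder sb = new StringBuilder();
        for (EmpNode cur = head; cur != null; cur = cur.next) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(cur.id);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SortedEmpList list = new SortedEmpList();
        list.addByOrder(new EmpNode(15, "c"));
        list.addByOrder(new EmpNode(1, "a"));
        list.addByOrder(new EmpNode(8, "b"));
        System.out.println(list.ids()); // 1 8 15
    }
}
```

This sketch does not reject duplicate ids; a real implementation would have to decide whether to overwrite or refuse them.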

(3) Schematic diagram

(4) Code implementation

  • Employee (Emp)
//Represents an employee
class Emp {
	public int id;
	public String name;
	public Emp next; //next defaults to null
	public Emp(int id, String name) {
		super();
		this.id = id;
		this.name = name;
	}
}
  • Employee linked list
    Add
    Find
    Show all employees
//EmpLinkedList represents one linked list (one chain of the hash table)
class EmpLinkedList {
	//The head pointer points directly to the first Emp; there is no dummy head node
	private Emp head; //Defaults to null
	
	//Add an employee to the linked list
	//Note:
	//1. We assume ids are auto-incremented, that is, ids are always allocated from small to large,
	//   so the employee can simply be appended to the end of this linked list
	public void add(Emp emp) {
		//If this is the first employee
		if(head == null) {
			head = emp;
			return;
		}
		//Otherwise, use an auxiliary pointer to locate the last employee
		Emp curEmp = head;
		while(true) {
			if(curEmp.next == null) {//Reached the end of the linked list
				break;
			}
			curEmp = curEmp.next; //Move to the next node
		}
		//On exit, curEmp is the last node; append emp after it
		curEmp.next = emp;
	}
	
	//Print the employees in this linked list
	public void list(int no) {
		if(head == null) { //The linked list is empty
			System.out.println("Linked list "+(no+1)+" is empty");
			return;
		}
		System.out.print("Linked list "+(no+1)+" contains");
		Emp curEmp = head; //Auxiliary pointer
		while(true) {
			System.out.printf(" => id=%d name=%s\t", curEmp.id, curEmp.name);
			if(curEmp.next == null) {//curEmp is the last node
				break;
			}
			curEmp = curEmp.next; //Move to the next node
		}
		System.out.println();
	}
	
	//Find an employee by id
	//Returns the Emp if found, otherwise null
	public Emp findEmpById(int id) {
		//Check whether the linked list is empty
		if(head == null) {
			System.out.println("The linked list is empty");
			return null;
		}
		//Auxiliary pointer
		Emp curEmp = head;
		while(true) {
			if(curEmp.id == id) {//Found
				break;//curEmp now points to the employee we are looking for
			}
			//Exit condition
			if(curEmp.next == null) {//The whole list was traversed without finding the employee
				curEmp = null;
				break;
			}
			curEmp = curEmp.next;//Move to the next node
		}
		
		return curEmp;
	}
	
}

  • Hash table: an array whose elements are employee linked lists
    Add an employee (uses the MOD hash function)
    Find an employee by id (first compute which linked list the id belongs to)
    Show all employees
//Create HashTab to manage multiple linked lists
class HashTab {
	private EmpLinkedList[] empLinkedListArray;
	private int size; //Indicates how many linked lists there are
	
	//constructor 
	public HashTab(int size) {
		this.size = size;
		//Initialize empLinkedListArray
		empLinkedListArray = new EmpLinkedList[size];
		//Pitfall: do not forget to initialize each linked list; otherwise every array element is null
		for(int i = 0; i < size; i++) {
			empLinkedListArray[i] = new EmpLinkedList();
		}
	}
	
	//Add employee
	public void add(Emp emp) {
		//According to the employee's id, get the linked list to which the employee should be added
		int empLinkedListNO = hashFun(emp.id);
		//Add emp to the corresponding linked list
		empLinkedListArray[empLinkedListNO].add(emp);
		
	}
	//Traverse all linked lists, that is, the whole hash table
	public void list() {
		for(int i = 0; i < size; i++) {
			empLinkedListArray[i].list(i);
		}
	}
	
	//Find the employee based on the entered id
	public void findEmpById(int id) {
		//Use the hash function to determine which linked list to look up
		int empLinkedListNO = hashFun(id);
		Emp emp = empLinkedListArray[empLinkedListNO].findEmpById(id);
		if(emp != null) {//Found
			System.out.printf("Found in linked list %d: employee id = %d\n", (empLinkedListNO + 1), id);
		}else{
			System.out.println("The employee was not found in the hash table");
		}
	}
	
	//Hash function: simple modulo; Math.floorMod keeps the index non-negative even for a negative id
	public int hashFun(int id) {
		return Math.floorMod(id, size);
	}
	
	
}

(5) Effect



(6) Complete code

import java.util.Scanner;

public class HashTabDemo {

	public static void main(String[] args) {
		
		//Create hash table
		HashTab hashTab = new HashTab(7);
		
		//Write a simple menu
		String key = "";
		Scanner scanner = new Scanner(System.in);
		while(true) {
			System.out.println("add:  Add employee");
			System.out.println("list: Show employees");
			System.out.println("find: Find employees");
			System.out.println("exit: Exit the system");
			
			key = scanner.next();
			switch (key) {
			case "add":
				System.out.println("Enter id");
				int id = scanner.nextInt();
				System.out.println("Enter name");
				String name = scanner.next();
				//Create employee
				Emp emp = new Emp(id, name);
				hashTab.add(emp);
				break;
			case "list":
				hashTab.list();
				break;
			case "find":
				System.out.println("Enter the id to find");
				id = scanner.nextInt();
				hashTab.findEmpById(id);
				break;
			case "exit":
				scanner.close();
				System.exit(0);
			default:
				break;
			}
		}
		
	}

}

//The HashTab, Emp, and EmpLinkedList classes are identical to those listed in (4) Code implementation above.


Topics: Algorithm data structure