Big talk data structure - Search

Posted by visualed on Sat, 25 Dec 2021 15:38:25 +0100

1. Ordered table lookup

-Binary search

  • Binary search requires that the records in the linear table be sorted by key (usually in ascending order) and that the table use sequential storage, so that any record can be reached directly by its subscript.
/* Binary search over a[1..n], sorted in ascending order (a[0] is unused).
   Returns the subscript of key, or 0 if key is not found. */
int Binary_Search(int *a, int n, int key)
{
	int low, high, mid;
	low = 1;   /*Lowest subscript: the first record*/
	high = n;   /*Highest subscript: the last record*/
	while(low <= high)
	{
		mid = low + (high - low) / 2;   /*Halve; this form avoids overflow of low + high*/
		if(key < a[mid])   /*Key lies in the lower half*/
			high = mid - 1;
		else if(key > a[mid])   /*Key lies in the upper half*/
			low = mid + 1;
		else
			return mid;   /*Found*/
	}
	return 0;   /*Not found*/
}

-Interpolation lookup

  • Interpolation lookup is a search method that estimates the position of the key from its value relative to the keys of the smallest and largest records in the lookup table. Its core is the interpolation ratio (key - a[low]) / (a[high] - a[low]), which replaces the fixed 1/2 of binary search: mid = low + (high - low) * (key - a[low]) / (a[high] - a[low]).
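
The interpolation formula can be dropped into the binary search loop. The following is a minimal sketch, assuming the same 1-based array convention as Binary_Search; the name Interpolation_Search, the range check in the loop condition and the division-by-zero guard are assumptions added for illustration, not part of the original text.

```c
/* Interpolation search over a[1..n], sorted in ascending order (a[0] unused).
   Returns the subscript of key, or 0 if key is not found. */
int Interpolation_Search(const int *a, int n, int key)
{
	int low = 1, high = n, mid;
	/* The range check keeps the interpolated position inside [low, high] */
	while (low <= high && key >= a[low] && key <= a[high])
	{
		if (a[high] == a[low])   /* All keys in range are equal: avoid dividing by zero */
			return (a[low] == key) ? low : 0;
		/* Proportional estimate; long long avoids overflow in the product */
		mid = low + (int)(((long long)(high - low) * ((long long)key - a[low]))
		                  / ((long long)a[high] - a[low]));
		if (key < a[mid])
			high = mid - 1;
		else if (key > a[mid])
			low = mid + 1;
		else
			return mid;
	}
	return 0;
}
```

On evenly distributed keys the estimate usually lands close to the target, while on heavily skewed keys it can be slower than plain binary search.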

-Fibonacci search

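/* Fibonacci search over a[1..n], sorted in ascending order.
   Assumes F[] is a global array holding the Fibonacci sequence
   (F[0] = 0, F[1] = 1, F[2] = 1, ...) and that a can hold subscripts
   up to F[k] - 1 for padding. Returns 0 if key is not found. */
int Fibonacci_Search(int *a, int n, int key)
{
	int low, high, mid, i, k;
	low = 1;
	high = n;
	k = 0;
	while(n > F[k] - 1)   /*Find the position of n in the Fibonacci sequence*/
		k++;
	for(i = n + 1; i <= F[k] - 1; i++)   /*Pad the unfilled positions up to F[k] - 1 with the last value*/
		a[i] = a[n];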

	while(low <= high)
	{
		mid = low + F[k - 1] - 1;   /*Calculates the subscript of the current split*/
		if(key < a[mid])   /*If the search record is smaller than the current split record*/
		{
			high = mid - 1;
			k = k - 1;
		}
		else if(key > a[mid])
		{
			low = mid + 1;
			k = k - 2;
		}
		else
		{
			if(mid <= n)
				return mid;   /*If they are equal, the mid is the location found*/
			else
				return n;   /*If mid > n indicates that it is a complement value, n is returned*/
		}
	}
	return 0;
}
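
Fibonacci_Search reads from an array F that the listing never defines. A minimal sketch of how it might be set up follows; the names FIB_COUNT and Init_Fibonacci are hypothetical, and the count of 40 is an assumption chosen so that every value fits in a 32-bit int.

```c
#define FIB_COUNT 40   /* Hypothetical size; F[39] = 63245986 still fits in an int */

int F[FIB_COUNT];   /* Fibonacci numbers consulted by Fibonacci_Search */

/* Fill F so that F[0] = 0, F[1] = 1 and F[i] = F[i - 1] + F[i - 2]. */
void Init_Fibonacci(void)
{
	int i;
	F[0] = 0;
	F[1] = 1;
	for (i = 2; i < FIB_COUNT; i++)
		F[i] = F[i - 1] + F[i - 2];
}
```

Init_Fibonacci would need to be called once before the first search, and the searched array needs spare capacity up to subscript F[k] - 1 for the padding step.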

2. Linear index lookup

  • A linear index organizes the set of index items into a linear structure, also known as an index table.

-Dense index

  • In a dense index, every record in the data set has its own index item. The index items of a dense index must be kept sorted by key, so the index table can be searched with an ordered-table method such as binary search.
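
As a rough illustration, a dense index can be modeled as a sorted array of (key, position) pairs that is searched by binary search. The type IndexItem, its field names, the function name and the 0-based subscripts are all assumptions made for this sketch.

```c
/* One index item: a key plus the position of its record in the data set. */
typedef struct
{
	int key;
	int pos;   /* Subscript of the record in the (unordered) data array */
} IndexItem;

/* Binary search over idx[0..n-1], which is sorted by key.
   Returns the record position, or -1 if key is absent. */
int Dense_Index_Search(const IndexItem *idx, int n, int key)
{
	int low = 0, high = n - 1;
	while (low <= high)
	{
		int mid = low + (high - low) / 2;
		if (key < idx[mid].key)
			high = mid - 1;
		else if (key > idx[mid].key)
			low = mid + 1;
		else
			return idx[mid].pos;
	}
	return -1;
}
```

The records themselves can stay in any order; only the index table has to be sorted.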

-Block index

  • Block ordering divides the records of the data set into several blocks that are unordered within each block but ordered between blocks (for example, every key in the second block is greater than every key in the first block). For a block-ordered data set, each block corresponds to one index item. This indexing method is called a block index.
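
A block index search can be sketched in two steps: find the block through the ordered index, then scan that block sequentially. The struct BlockIndex, its fields (largest key, start subscript, length) and the function name are assumptions for illustration.

```c
/* One index item per block. */
typedef struct
{
	int max_key;   /* Largest key stored in the block */
	int start;     /* Subscript of the block's first record */
	int len;       /* Number of records in the block */
} BlockIndex;

/* Returns the subscript of key in data, or -1 if key is absent. */
int Block_Search(const int *data, const BlockIndex *idx, int nblocks, int key)
{
	int low = 0, high = nblocks - 1, b = -1, i;
	/* Step 1: binary search for the first block whose max_key >= key */
	while (low <= high)
	{
		int mid = low + (high - low) / 2;
		if (idx[mid].max_key >= key)
		{
			b = mid;
			high = mid - 1;
		}
		else
			low = mid + 1;
	}
	if (b < 0)   /* key is larger than every key in the data set */
		return -1;
	/* Step 2: sequential scan, because records inside a block are unordered */
	for (i = idx[b].start; i < idx[b].start + idx[b].len; i++)
		if (data[i] == key)
			return i;
	return -1;
}
```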

-Inverted index

  • In this index table, the attribute value is not looked up from the record; instead, the positions of the records are looked up from the attribute value, which is why it is called an inverted index.
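
One possible shape for such a table is sketched below: each entry maps an attribute value (here a word) to the numbers of the records that contain it. The struct InvertedItem, the fixed cap MAX_POSTINGS and the function name are hypothetical, and the table is assumed to be sorted by value so that binary search applies.

```c
#include <string.h>

#define MAX_POSTINGS 8   /* Hypothetical cap on record numbers per value */

typedef struct
{
	const char *value;           /* Attribute value, e.g. a word */
	int records[MAX_POSTINGS];   /* Numbers of the records containing it */
	int count;                   /* How many entries of records are used */
} InvertedItem;

/* Binary search over table[0..n-1], sorted by value with strcmp.
   Returns the matching entry, or NULL if the value is not indexed. */
const InvertedItem *Inverted_Lookup(const InvertedItem *table, int n, const char *value)
{
	int low = 0, high = n - 1;
	while (low <= high)
	{
		int mid = low + (high - low) / 2;
		int cmp = strcmp(value, table[mid].value);
		if (cmp < 0)
			high = mid - 1;
		else if (cmp > 0)
			low = mid + 1;
		else
			return &table[mid];
	}
	return NULL;
}
```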

3. Binary sort tree

  • A binary sort tree, also known as a binary search tree, is either an empty tree or a binary tree with the following properties:

    1) If its left subtree is not empty, the values of all nodes on the left subtree are less than the value of its root node;
    2) If its right subtree is not empty, the values of all nodes on the right subtree are greater than those of its root node;
    3) Its left and right subtrees are also binary sort trees.

-Binary sort tree lookup operation

/*Definition of binary linked list node structure of binary tree*/
typedef struct BiTNode   /*Node structure*/
{
	int data;   /*Node data*/
	struct BiTNode *lchild, *rchild;   /*Left and right child pointers*/
}BiTNode, *BiTree;
  • Search implementation for the binary sort tree:
/*Recursively search binary sort tree T for key. The pointer f points to the parent of T, and its initial value is NULL*/
/*If the search succeeds, the pointer p points to the matching node and TRUE is returned. Otherwise, p points to the last node visited on the search path and FALSE is returned*/
/*Status, TRUE and FALSE are assumed to be defined elsewhere, e.g. typedef int Status; with TRUE as 1 and FALSE as 0*/
Status SearchBST(BiTree T, int key, BiTree f, BiTree *p)
{
	if(!T)   /*The search was unsuccessful*/
	{
		*p = f;
		return FALSE;
	}
	else if(key == T->data)  /*Search succeeded*/
	{
		*p = T;
		return TRUE;
	}
	else if(key < T->data)
		return SearchBST(T->lchild, key, T, p);   /*Continue searching in the left subtree*/
	else
		return SearchBST(T->rchild, key, T, p);   /*Continue searching in the right subtree*/
}

-Binary sort tree insert operation

/*When there is no data element with keyword equal to key in binary sort tree T, insert key and return TRUE; otherwise, return FALSE*/
Status InsertBST(BiTree *T, int key)
{
	BiTree p, s;
	if(!SearchBST(*T, key, NULL, &p))   /*The search was unsuccessful*/
	{
		s = (BiTree)malloc(sizeof(BiTNode));
		s->data = key;
		s->lchild = s->rchild = NULL;
		if(!p)
			*T = s;   /*Insert s as the new root node*/
		else if(key < p->data)
			p->lchild = s;   /*Insert s as left child*/
		else
			p->rchild = s;   /*Insert s as right child*/
		return TRUE;
	}
	else
		return FALSE;   /*There are already nodes with the same keyword in the tree, so it is no longer inserted*/
}

-Binary sort tree delete operation

  • Deleting a node falls into three cases:
    1) The node is a leaf;
    2) The node has only a left subtree or only a right subtree;
    3) The node has both left and right subtrees.
/*If there is a data element with keyword equal to key in binary sort tree T, delete the data element node and return TRUE; otherwise, return FALSE*/
Status DeleteBST(BiTree *T, int key)
{
	if(!*T)   /*There is no data element with keyword equal to key*/
		return FALSE;
	else
	{
		if(key == (*T)->data)   /*Find the data element whose keyword is equal to key*/
			return Delete(T);
		else if(key < (*T)->data)
			return DeleteBST(&(*T)->lchild, key);
		else
			return DeleteBST(&(*T)->rchild, key);
	}
}

/*Delete node p from the binary sort tree and rejoin its left or right subtree*/
Status Delete(BiTree *p)
{
	BiTree q,s;
	if((*p)->rchild == NULL)   /*If the right subtree is empty, you only need to pick up its left subtree*/
	{
		q = *p; *p = (*p)->lchild; free(q);
	}
	else if((*p)->lchild == NULL)   /*Just reconnect its right subtree*/
	{
		q = *p; *p = (*p)->rchild; free(q);
	}
	else   /*The left and right subtrees are not empty*/
	{
		q = *p; s = (*p)->lchild;
		while(s->rchild)   /*Turn left, then go right to the end (find the predecessor of the node to be deleted)*/
		{
			q = s; s = s->rchild;
		}
		(*p)->data = s->data;   /*s points to the direct predecessor of the deleted node*/
		if(q != *p)
			q->rchild = s->lchild;   /*Reconnect the right subtree of q*/
		else
			q->lchild = s->lchild;   /*Reconnect the left subtree of q*/
		free(s);
	}
	return TRUE;
}

4. Balanced binary tree (AVL tree)

  • A balanced binary tree is a binary sort tree in which the heights of the left and right subtrees of every node differ by at most 1. The depth of a node's left subtree minus the depth of its right subtree is called the balance factor (BF) of that node.
/*Definition of binary linked list node structure of binary tree*/
typedef struct BiTNode   /*Node structure*/
{
	int data;   /*Node data*/
	int bf;   /*Balance factor of node*/
	struct BiTNode *lchild, *rchild;   /*Left and right child pointers*/
}BiTNode, *BiTree;

/*Right-rotate the binary sort tree rooted at p. Afterwards, p points to the new root, that is, the root of the left subtree before the rotation*/
void R_Rotate(BiTree *p)
{
	BiTree L;
	L = (*p)->lchild;   /*L points to the root of the left subtree of p*/
	(*p)->lchild = L->rchild;   /*The right subtree of L becomes the left subtree of p*/
	L->rchild = (*p);
	*p = L;   /*p points to the new root node*/
}

/*Left-rotate the binary sort tree rooted at p. Afterwards, p points to the new root, that is, the root of the right subtree before the rotation*/
void L_Rotate(BiTree *p)
{
	BiTree R;
	R = (*p)->rchild;   /*R points to the root of the right subtree of p*/
	(*p)->rchild = R->lchild;   /*The left subtree of R becomes the right subtree of p*/
	R->lchild = (*p);
	*p = R;   /*p points to the new root node*/
}
  • Let's look at the code for the left balancing rotation:
#define LH +1   /*Left high*/
#define EH 0   /*Equal height*/
#define RH -1   /*Right high*/
/*Left-balance the binary tree rooted at the node that pointer T points to*/
/*When the algorithm ends, pointer T points to the new root node*/
void LeftBalance(BiTree *T)
{
	BiTree L, Lr;
	L = (*T)->lchild;   /*L points to the root of the left subtree of T*/
	switch(L->bf)
	{/*Check the balance of the left subtree of T and rebalance accordingly*/
		case LH:   /*The new node was inserted into the left subtree of the left child of T: single right rotation*/
			(*T)->bf = L->bf = EH;
			R_Rotate(T);
			break;
		case RH:   /*The new node was inserted into the right subtree of the left child of T: double rotation*/
			Lr = L->rchild;   /*Lr points to the root of the right subtree of T's left child*/
			switch(Lr->bf)   /*Update the balance factors of T and its left child*/
			{
				case LH: (*T)->bf = RH;
					L->bf = EH;
					break;
				case EH: (*T)->bf = L->bf = EH;
					break;
				case RH: (*T)->bf = EH;
					L->bf = LH;
					break;
			}
			Lr->bf = EH;
			L_Rotate(&(*T)->lchild);   /*Left-rotate the left subtree of T*/
			R_Rotate(T);   /*Right-rotate T*/
	}
}
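
The insertion code in this section also calls RightBalance, which is not listed. A sketch of it as the mirror image of LeftBalance follows; it is an assumed reconstruction by symmetry, not the author's listing. The node type, macros and rotations are repeated only so that the block compiles on its own.

```c
#define LH +1   /* Left high */
#define EH 0    /* Equal height */
#define RH -1   /* Right high */

typedef struct BiTNode
{
	int data;
	int bf;   /* Balance factor */
	struct BiTNode *lchild, *rchild;
} BiTNode, *BiTree;

static void R_Rotate(BiTree *p)
{
	BiTree L = (*p)->lchild;
	(*p)->lchild = L->rchild;
	L->rchild = *p;
	*p = L;
}

static void L_Rotate(BiTree *p)
{
	BiTree R = (*p)->rchild;
	(*p)->rchild = R->lchild;
	R->lchild = *p;
	*p = R;
}

/* Right-balance the tree rooted at *T; afterwards *T is the new root. */
void RightBalance(BiTree *T)
{
	BiTree R, Rl;
	R = (*T)->rchild;   /* R points to the root of the right subtree of T */
	switch (R->bf)
	{
		case RH:   /* Inserted into the right subtree of the right child: single left rotation */
			(*T)->bf = R->bf = EH;
			L_Rotate(T);
			break;
		case LH:   /* Inserted into the left subtree of the right child: double rotation */
			Rl = R->lchild;
			switch (Rl->bf)   /* Update the balance factors of T and its right child */
			{
				case RH: (*T)->bf = LH; R->bf = EH; break;
				case EH: (*T)->bf = R->bf = EH; break;
				case LH: (*T)->bf = EH; R->bf = RH; break;
			}
			Rl->bf = EH;
			R_Rotate(&(*T)->rchild);   /* Right-rotate the right subtree of T */
			L_Rotate(T);               /* Left-rotate T */
	}
}
```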
  • Main insertion function:
/*If balanced binary sort tree T contains no node with the same keyword as e, insert a new node*/
/*whose data element is e and return 1; otherwise return 0. If the insertion unbalances the binary sort tree, a balancing rotation is performed*/
/*The boolean variable taller reports whether T has grown taller*/
Status InsertAVL(BiTree *T, int e, Status *taller)
{
	if(!*T)
	{/*Insert a new node; the tree grows taller, so set taller to TRUE*/
		*T = (BiTree)malloc(sizeof(BiTNode));
		(*T)->data = e;
		(*T)->lchild = (*T)->rchild = NULL;
		(*T)->bf = EH;
		*taller = TRUE;
	}
	else
	{
		if(e == (*T)->data)
		{/*If a node with the same keyword as e already exists in the tree, it will not be inserted*/
			*taller = FALSE;
			return FALSE;
		}
		if(e < (*T)->data)
		{/*Continue searching in the left subtree of T*/
			if(!InsertAVL(&(*T)->lchild, e, taller))   /*Not inserted*/
				return FALSE;
			if(*taller)   /*Inserted into the left subtree of T, and the left subtree has grown taller*/
			{
				switch((*T)->bf)   /*Check the balance of T*/
				{
					case LH:   /*The original left subtree is higher than the right subtree, so it needs to be left balanced*/
						LeftBalance(T);
						*taller = FALSE;
						break;
					case EH:   /*Originally, the left and right subtrees were equal in height, but now the left subtree is higher and the tree is higher*/
						(*T)->bf = LH;
						*taller = TRUE;
						break;
					case RH:   /*Originally, the right subtree was taller than the left subtree, but now the left and right subtrees are equal in height*/
						(*T)->bf = EH;
						*taller = FALSE;
						break;
				}
			}
		}
		else
		{/*Continue searching in the right subtree of T*/
			if(!InsertAVL(&(*T)->rchild, e, taller))   /*Not inserted*/
				return FALSE;
			if(*taller)   /*Inserted into the right subtree of T, and the right subtree has grown taller*/
			{
				switch((*T)->bf)   /*Check the balance of T*/
				{
					case LH:   /*Originally, the left subtree was taller than the right subtree, but now the left and right subtrees are equal in height*/
						(*T)->bf = EH;
						*taller = FALSE;
						break;
					case EH:   /*Originally, the left and right subtrees were equal in height, but now the right subtree is higher and the tree is higher*/
						(*T)->bf = RH;
						*taller = TRUE;
						break;
					case RH:   /*Originally, the right subtree was higher than the left subtree, so it needs to be right balanced*/
						RightBalance(T);
						*taller = FALSE;
						break;
				}
			}	
		}
	}
	return TRUE;   /*Inserted successfully*/
}

5. Multi-way search tree (B-tree)

  • Each node of a multi-way search tree can have more than two children, and each node can store multiple elements.

-2-3 tree

  • A 2-3 tree is a multi-way search tree in which each node has two children (we call it a 2-node) or three children (we call it a 3-node). A 2-node holds one element and a 3-node holds two.

-2-3-4 tree

  • A 2-3-4 tree extends the concept of the 2-3 tree by also allowing 4-nodes. A 4-node holds three elements (small, medium and large) and has four children (or no children).

-B tree

  • A B-tree is a balanced multi-way search tree. The 2-3 tree and the 2-3-4 tree are special cases of the B-tree. The maximum number of children of a node is called the order of the B-tree, so a 2-3 tree is a B-tree of order 3 and a 2-3-4 tree is a B-tree of order 4.
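
To make the idea of order concrete, here is a minimal sketch of a B-tree node and a search over it. The struct BTNode, the macro M (set to 4 here, which corresponds to a 2-3-4 tree) and the function name are assumptions for illustration; insertion and node splitting are left out.

```c
#include <stddef.h>

#define M 4   /* Assumed order: at most M children and M - 1 keys per node */

typedef struct BTNode
{
	int keynum;                /* Number of keys currently stored */
	int key[M - 1];            /* Keys in ascending order */
	struct BTNode *child[M];   /* child[i] holds the keys below key[i]; all NULL in a leaf */
} BTNode;

/* Returns the node containing key and stores its slot in *pos,
   or returns NULL if key is not in the tree. */
BTNode *BTree_Search(BTNode *T, int key, int *pos)
{
	while (T)
	{
		int i = 0;
		while (i < T->keynum && key > T->key[i])   /* Find the first key >= key */
			i++;
		if (i < T->keynum && key == T->key[i])
		{
			*pos = i;
			return T;
		}
		T = T->child[i];   /* Descend into the subtree between key[i - 1] and key[i] */
	}
	return NULL;
}
```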

-B+ tree

6. Hash table lookup

  • Hash technology establishes a definite correspondence f between the storage location of a record and its keyword, so that each keyword key corresponds to a storage location f(key). We call this correspondence f the hash function. Records stored this way occupy a contiguous block of storage space, which is called a hash table. The storage location corresponding to a keyword is called its hash address.
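
A small sketch of the idea: f(key) is computed by taking the remainder of the key, and the result is used directly as a subscript. The text above does not say how two keys with the same address are handled, so the linear probing used here is an assumption, as are the names HASHSIZE, NULLKEY, HashTable and the three functions. Keys are assumed to be non-negative.

```c
#define HASHSIZE 12      /* Assumed table length */
#define NULLKEY -32768   /* Marks an empty slot */

typedef struct
{
	int elem[HASHSIZE];   /* Contiguous storage for the records (keys only here) */
} HashTable;

void InitHashTable(HashTable *H)
{
	int i;
	for (i = 0; i < HASHSIZE; i++)
		H->elem[i] = NULLKEY;
}

/* Hash function f(key): remainder of division by the table length */
int Hash(int key)
{
	return key % HASHSIZE;
}

/* Stores key and returns its hash address, or -1 if the table is full. */
int InsertHash(HashTable *H, int key)
{
	int addr = Hash(key), i;
	for (i = 0; i < HASHSIZE; i++)
	{
		int slot = (addr + i) % HASHSIZE;   /* Linear probing on collision */
		if (H->elem[slot] == NULLKEY || H->elem[slot] == key)
		{
			H->elem[slot] = key;
			return slot;
		}
	}
	return -1;
}

/* Returns the hash address of key, or -1 if key is not stored. */
int SearchHash(const HashTable *H, int key)
{
	int addr = Hash(key), i;
	for (i = 0; i < HASHSIZE; i++)
	{
		int slot = (addr + i) % HASHSIZE;
		if (H->elem[slot] == NULLKEY)   /* An empty slot ends the probe sequence */
			return -1;
		if (H->elem[slot] == key)
			return slot;
	}
	return -1;
}
```

Lookup follows the same probe sequence as insertion, which is why a search can stop at the first empty slot.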

Topics: Algorithm data structure