String and string function summary

Posted by Sorrow on Fri, 15 Oct 2021 22:40:41 +0200

Learning objectives:

Master the use of the string functions in C and how to write your own simulation implementations of them.

String and character handling in C is cumbersome because the language has no string type. Strings are stored either as string literals (constant strings) or in character arrays. String literals must only be passed to string functions that do not modify them.

Learning content:

String functions with unlimited length: strlen, strcpy, strcat, strcmp
String functions with limited length: strncpy, strncat, strncmp
Memory operation functions: memcpy, memmove, memset, memcmp (introduced in the next article)
String lookup functions: strstr, strtok
Error message reporting: strerror

1: strlen

① : Standard Format: size_t strlen ( const char * str );

② : Meaning: the string pointed to by str must use '\0' as its end flag, and strlen returns the number of characters in front of that end flag (the '\0' itself is not counted).

③ : precautions

      (a) : the string pointed to by str must end with '\0'

      (b) : the strlen function returns size_t, which is an unsigned integer type!

④ : classic case:

#include <stdio.h>
#include <string.h>
int main()
{
	printf("%zu\n", strlen("abcdef")); // size_t is printed with %zu
	return 0;
}

⑤ : error prone:

#include <stdio.h>
#include <string.h>
int main()
{
	char arr1[] = "abcdef";
	char arr2[] = "abc";
	if ((strlen(arr2) - strlen(arr1)) > 0)
		printf("hehe\n");
	else
		printf("gaga\n");
	return 0;
}

Q: what does the above code print?

The result is hehe. strlen returns size_t, so strlen(arr2) - strlen(arr1) is computed in unsigned arithmetic: 3 - 6 cannot be negative and wraps around to a very large positive value. An unsigned value is greater than 0 whenever it is non-zero, so the condition is true.

⑥ : Simulation Implementation of strlen

Method 1: (counting method)

#include <stdio.h>
#include <assert.h>
size_t my_strlen(const char* s)
{
	assert(s);
	size_t count = 0;
	while (*s != '\0')
	{
		count++;
		s++;
	}
	return count;
}
int main()
{
	char arr[] = "abcdef";
	size_t ret = my_strlen(arr); // keep the result as size_t
	printf("%zu\n", ret);
	return 0;
}

Method 2: recursion

#include <stdio.h>
#include <assert.h>
size_t my_strlen(const char* s)
{
	assert(s);
	if (*s != '\0')
		return 1 + my_strlen(s + 1);
	else
		return 0;
}
int main()
{
	char arr[] = "abcdef";
	size_t ret = my_strlen(arr); // keep the result as size_t
	printf("%zu\n", ret);
	return 0;
}

Method 3: subtraction using pointer

#include <stdio.h>
#include <assert.h>
size_t my_strlen(const char* s)
{
	assert(s);
	const char* start = s; // const char*, to match the type of s
	while (*s != '\0')
		s++;
	return s - start;
}
int main()
{
	char arr[] = "abcdef";
	size_t ret = my_strlen(arr); // keep the result as size_t
	printf("%zu\n", ret);
	return 0;
}

2: strcpy

① : Standard Format: char * strcpy (char * destination, const char * source);

② : Meaning: copy the characters of source, up to and including '\0', into the space starting at destination, and finally return destination

③ : precautions:

      (a) : the source string must end with '\0' (the '\0' is copied as well)

      (b) : the target space must be large enough to hold the source string, and it must be modifiable

  ④ : Simulation Implementation:

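#include <stdio.h>
#include <assert.h>
char* my_strcpy(char* dest, const char* src)
{
	assert(dest && src);
	char* ret = dest;
	// Copy every character up to, but not including, '\0'
	while (*src != '\0')
	{
		*dest = *src;
		dest++;
		src++;
	}
	*dest = *src; // copy the terminating '\0'
	return ret;
}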
int main()
{
	char arr1[] = "********************";
	char arr2[] = "abcdef";
	char* ret = my_strcpy(arr1,arr2);
	printf("%s\n", arr1);
	return 0;
}

    

3: strcat

① : Standard Format: char * strcat (char * destination, const char * source);

② : Meaning: append a copy of the source string to the destination string. The '\0' at the end of the string pointed to by destination is overwritten by the first character of the source string, and a '\0' is written at the end of the combined "large string". In other words, the source string is appended together with its end flag. Returns destination.

③ : precautions:

    (a) : both the source string and the destination string must end with '\0'

    (b) : the target space should be large enough to hold the combined "large string"

④ : Simulation Implementation:

#include <stdio.h>
#include <assert.h>
char* my_strcat(char* destination, const char* source)
{
	assert(destination && source);
	char* ret = destination;
	while (*destination!= '\0')
	{
		destination++;
	}
	while (*source != '\0')
	{
		*destination = *source;
		destination++;
		source++;
	}
	*destination = '\0'; // terminate the combined string
	return ret;
}
int main()
{
	char arr1[] = "abcdef";
	char arr2[20] = "***";
	char* ret = my_strcat(arr2, arr1);
	printf("%s\n", ret);
	return 0;
}

4: strcmp

① : Standard Format: int strcmp (const char * str1, const char * str2);

② : Meaning: starting from the characters pointed to by str1 and str2, compare the two strings character by character. The comparison stops under the following circumstances:

    (a) : one of the two characters is '\0'

    (b) : the two characters are different

③ : Return:

< 0 : a mismatched character was encountered, and the ASCII value of the character pointed to by str1 is less than the ASCII value of the character pointed to by str2

= 0 : the characters were compared one by one and both strings reached '\0' at the same time

> 0 : a mismatched character was encountered, and the ASCII value of the character pointed to by str1 is greater than the ASCII value of the character pointed to by str2

④ : Simulation Implementation:

#include <stdio.h>
#include <assert.h>
int my_strcmp(const char* str1, const char* str2)
{
	assert(str1 && str2);
	// Advance while the two characters are the same
	while (*str1 == *str2)
	{
		// Both strings reached '\0' at the same time: they are equal
		if (*str1 == '\0')
		{
			return 0;
		}
		str1++;
		str2++;
	}
	// First mismatch: compare as unsigned char, as the library strcmp does
	if ((unsigned char)*str1 > (unsigned char)*str2)
	{
		return 1;
	}
	else
	{
		return -1;
	}
}
int main()
{
	char arr1[] = "abcdeg";
	char arr2[] = "abcdef";
	int ret = my_strcmp(arr1, arr2);
	printf("%d\n", ret);
	if (ret == 0)
	{
		printf("The two strings are the same\n");
	}
	else
		printf("The two strings are different\n");
	return 0;
}

The functions above all run blindly until they reach '\0'. Which library functions should you use if you want to operate on a specified number of characters to achieve similar results?

1: strncpy

① : Standard Format: char * strncpy (char * destination, const char * source, size_t num);

② : Meaning: copy num characters from source into the target space that starts at destination. Because a char takes up one byte, num can also be read as a number of bytes.

③ Note: num is compared with the number of characters in the string pointed to by source (including its '\0'). There are two main cases:

    (a) : num is greater than that number. The whole string pointed to by source (including its '\0') is copied, and the target space is then filled with '\0' until num characters in total have been written.

    (b) : num is less than that number. Only the first num characters are copied, so the '\0' of source is not copied and strncpy does not add one. The result is a valid string only if destination already contains a '\0' further along that is not overwritten; otherwise you have to write the '\0' yourself. If num is equal to that number, strncpy degenerates to strcpy.

④ : use case:

#include <string.h>
#include <stdio.h>
int main()
{
	char arr1[] = "abcdef";
	char arr2[20] = "*******";
	strncpy(arr2, arr1, 4);
	printf("%s\n", arr2);//Should print: abcd***
	return 0;
}

⑤ Simulation Implementation: the idea is the same as for strcpy, with the addition of a counter that limits the number of characters copied.

2: strncat

① : Standard Format: char * strncat (char * destination, const char * source, size_t num);

② : Meaning: append at most num characters of the string pointed to by source to the end of the string pointed to by destination. Appending starts at the end flag of the destination string, so that '\0' is overwritten.

③ : as with strncpy, the result depends on how num relates to the number of characters in the string pointed to by source:

  (a) : when num is greater than the number of characters, only the string pointed to by source is copied (including its '\0'), and nothing more is copied after that. Unlike strncpy, no padding is added

  (b) : when num is less than the number of characters, the first num characters are copied and then one more '\0' is added

④ : use case:

#include <stdio.h>
#include <string.h>
int main()
{
	char arr1[] = "abcdef";
	char arr2[100] = "***********************";
	char* ret = strncat(arr2, arr1, 4);
	printf("%s\n", ret);
	return 0;
}

3: strncmp

① : Standard Format: int strncmp (const char * str1, const char * str2, size_t num);

② : Meaning: starting from str1 and str2, compare the characters one by one, for at most num characters. The comparison stops early at the first mismatch or at '\0'. The return value has the same meaning as that of strcmp

③ : use case:

#include <stdio.h>
#include <string.h>
int main()
{
	char arr1[] = "abcdef";
	char arr2[] = "abcedf";
	int ret = strncmp(arr1, arr2, 4);
	printf("%d\n", ret);
	return 0;
}

String lookup function: strstr

① : Standard Format: char * strstr (const char * str1, const char * str2);

② : Meaning: search the string pointed to by str1 for the string pointed to by str2, and return the address of the first character of the first occurrence. NULL is returned when it is not found

③ : use case:

  ④ : Simulation Implementation: see the next blog post for the code. For now, master the usage first.

strtok: (the name does not say much, so please see the meaning)

  ① Format: char * strtok (char * str, const char * delimiters);

② : Meaning: the string pointed to by str contains zero or more of the separators listed in delimiters. To cut out the pieces of the string between those separators, use the strtok function. Specifically, starting from str, it finds the first non-separator character and treats it as the beginning, then finds the next separator (or '\0'). If it finds a separator, it changes that separator to '\0'. The string pointed to by str is modified at this point, so to keep the original string intact it is best to make a temporary copy and run strtok on the copy.

strtok then does two things. First, it returns the address of the first character of the piece it found; this is the return value of the function. Second, it implicitly stores the address of the character after the one it changed to '\0'. The stored address is what makes repeated calls work: to find the first piece, the first parameter of strtok is str, and to find each following piece, the first parameter of strtok is NULL! The delimiters are usually kept in a character array or a string constant.

③ : when does strtok return NULL?

(a) : the piece found by the previous call ended at the '\0' of the string, so the implicitly stored address is the address of that '\0'. If strtok is called again, NULL is returned

(b) : the stored address is that of the character after a separator that was changed to '\0' (you can remember it this way), but from there strtok meets only separators until it reaches '\0'. No piece is found, so strtok returns NULL

④ : use case:

  The above belongs to the case where the address of '\0' is the stored address and the separator is a string constant.

The above belongs to the case where the address following a written '\0' is the stored address and the separator is a character array.

Neither of the two cases above makes a temporary copy, so remember to make one yourself. This summary is my personal understanding, and it does predict the result of the function correctly. For further discussion, see the comment area.

Study time:

2021.10.10

Learning output:


1. Technical notes: 2
2. CSDN technology blog: 1
3. gitee

Topics: C