Java summary (String, StringBuilder, regular expressions)

Posted by ritter on Thu, 02 Sep 2021 08:33:42 +0200

  String class

String is used to represent a string. It has the following characteristics:

  • java.lang.String is declared final, so it cannot be subclassed.
  • Internally, a string encapsulates a character array together with the algorithms that operate on it.
  • Once a string is created, the object can never be changed, but the variable that references it can be reassigned.
  • Java strings are stored as Unicode in memory: each char is a fixed-length two-byte (16-bit) UTF-16 code unit, and characters outside the Basic Multilingual Plane take two chars.
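The last two points can be checked directly. This is a minimal sketch; the class name StringBasics, the helper names, and the sample values are illustrative and not from the original text.

```java
class StringBasics {
    // Number of UTF-16 code units, which is what length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "java";
        String upper = s.toUpperCase();      // returns a new String; s is untouched
        System.out.println(s + " " + upper); // java JAVA

        s = s + "!";                         // the variable now refers to a different object
        System.out.println(s);               // java!

        // U+1F600 lies outside the Basic Multilingual Plane, so it needs two chars.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(codeUnits(emoji));  // 2
        System.out.println(codePoints(emoji)); // 1
    }
}
```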

String constant pool

Java sets aside a space in heap memory, the string constant pool, to cache every string object created from a literal. When the same literal appears again later, the cached object is reused, which avoids accumulating many string objects with identical content and reduces memory overhead.

For a repeated string literal, the JVM first looks it up in the constant pool. If it is already there, the address of the existing object is returned.

String s1 = "123abc";//Literal
String s2 = "123abc";//Same literal as s1, so the pooled object is reused
System.out.println(s1==s2); //true: the same address indicates that s2 reuses the s1 object
String s3 = "123abc";

Note:

In general, what we want to compare is the content of two strings, so use the equals method of String rather than ==.
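The difference between comparing addresses and comparing content shows up as soon as a string is created with new. The sketch below uses an illustrative class name (PoolDemo); the calls are standard java.lang.String methods.

```java
class PoolDemo {
    public static void main(String[] args) {
        String s1 = "123abc";             // literal, cached in the constant pool
        String s2 = "123abc";             // same literal, same pooled object
        String s3 = new String("123abc"); // new always creates a separate object

        System.out.println(s1 == s2);          // true:  same object
        System.out.println(s1 == s3);          // false: different objects
        System.out.println(s1.equals(s3));     // true:  same content
        System.out.println(s1 == s3.intern()); // true:  intern() returns the pooled object
    }
}
```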

A compiler feature is triggered here:
When the compiler encounters an expression whose result can be determined at compile time, it evaluates the expression during compilation
and writes the result into the class file, so the JVM does not need to recompute it every time it executes the bytecode.
The line below is therefore compiled as if it had been written:
String s5 = "123abc";
As a result, s5 reuses the object in the constant pool, so its address is the same as that of s2
String s5 = "123" + "abc";
System.out.println(s2==s5); //true
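The folding only applies when every operand is a compile-time constant. A short sketch of the boundary follows; the class and variable names are illustrative.

```java
class ConstantFolding {
    public static void main(String[] args) {
        String pooled = "123abc";

        String folded = "123" + "abc";            // constant expression, folded by the compiler
        System.out.println(pooled == folded);     // true

        final String constPart = "123";           // final with a literal initializer: still a constant
        String alsoFolded = constPart + "abc";
        System.out.println(pooled == alsoFolded); // true

        String varPart = "123";                   // not final, so not a constant expression
        String runtime = varPart + "abc";         // concatenated at run time into a new object
        System.out.println(pooled == runtime);      // false
        System.out.println(pooled.equals(runtime)); // true
    }
}
```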

Common methods of string:

Return length int length()

Returns the length (number of characters) of the current string

String str = "I love java!";
int len = str.length();

Get element subscript indexOf()

Retrieves the position of the given string in the current string. If the current string does not contain the given content, the return value is -1.

String str = "";
int index = str.indexOf("l");//5

Overloaded method, which starts the search from a specified index:

index = str.indexOf("l",6);//3
//  arguments: the text to search for, then the index to start searching from

Retrieve the last occurrence lastIndexOf()

index = str.lastIndexOf("l");
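As a concrete illustration of the three lookups, here is a small sketch on a sample string. The string "thinking in java" and the class name are chosen for the example.

```java
class IndexOfDemo {
    public static void main(String[] args) {
        String str = "thinking in java";
        //            0123456789012345

        System.out.println(str.indexOf("in"));     // 2  (first occurrence)
        System.out.println(str.indexOf("in", 3));  // 5  (search starts at index 3)
        System.out.println(str.lastIndexOf("in")); // 9  (last occurrence)
        System.out.println(str.indexOf("xyz"));    // -1 (not found)
    }
}
```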

Extract a substring substring()

Extracts the part of the current string that lies within the specified range. The two parameters are the index of the start position and the index of the end position. Note: in the Java API, when two numbers denote a range, it usually includes the start but excludes the end.

String line = "";
//             012345678901
String str = line.substring(5,7);
System.out.println(str); //li

Overloaded method, which extracts from the specified index to the end of the string:

str = line.substring(5);
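indexOf and substring are often combined: find the boundaries first, then cut. The sketch below assumes a hypothetical host name with at least two dots; middleLabel is an illustrative helper, not part of any API.

```java
class SubstringDemo {
    // Returns the text between the first and the second dot.
    // Assumes the input contains at least two dots; otherwise substring throws.
    static String middleLabel(String host) {
        int start = host.indexOf(".") + 1;  // first character after the first dot
        int end = host.indexOf(".", start); // index of the second dot (exclusive end)
        return host.substring(start, end);
    }

    public static void main(String[] args) {
        String host = "www.example.com";       // hypothetical sample value
        //             012345678901234
        System.out.println(middleLabel(host)); // example
        System.out.println(host.substring(4)); // example.com
    }
}
```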

Remove whitespace trim()

Removes the whitespace characters at both ends of a string. Only the two ends are trimmed, not the middle.

String line = "   he llo         ";
line = line.trim();
System.out.println(line);  //he llo

Get the character at an index charAt()

Returns the character at the specified index of the string

String str = "thinking in java";
//Get the 10th character
char c = str.charAt(9); //i

Check the start or end startsWith() and endsWith()

Determines whether a string starts or ends with the specified string

// startsWith checks the beginning; endsWith checks the end

String line = "";
boolean starts = line.startsWith("echo"); //true
boolean ends = line.endsWith("li"); // false
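A quick sketch of both checks on a sample string (the string and class name are chosen for illustration), including the overload that tests a prefix at a given offset:

```java
class PrefixSuffixDemo {
    public static void main(String[] args) {
        String line = "thinking in java";
        //             0123456789012345

        System.out.println(line.startsWith("think")); // true
        System.out.println(line.endsWith("java"));    // true
        System.out.println(line.endsWith("Java"));    // false: the comparison is case sensitive
        System.out.println(line.startsWith("in", 9)); // true: prefix test starting at index 9
    }
}
```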

Case conversion toLowerCase() and toUpperCase()

String line = "Java";
String lower = line.toLowerCase(); //java
String upper = line.toUpperCase(); //JAVA


Convert other types to String.

int a = 123456;
String s1 = String.valueOf(a);

double d = 123.456;
String s2 = String.valueOf(d);

String s3 = a+"";//Concatenating any value with a string yields a string


Because String is an immutable object, a new object must be created every time its content is modified, so String is not suited to frequent modification. To solve this problem, Java provides the StringBuilder class.


StringBuilder is an API designed for modifying strings. It maintains a mutable char array internally; all modifications are made on this array, and its capacity expands automatically when needed. Modifications are fast and have low overhead. It provides methods for the common editing operations: append, delete, replace, and insert.
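To make the cost difference concrete, the sketch below builds the same text in a loop in both ways. The class and method names are illustrative. Both methods return the same result; the difference is in how many intermediate objects are created.

```java
class BuildInLoop {
    // Each += creates a new String and copies everything built so far.
    static String withConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // One mutable buffer is appended to in place; a String is created only at the end.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withConcat(5));                             // 01234
        System.out.println(withConcat(100).equals(withBuilder(100))); // true
    }
}
```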

String str = "java";
//Copy the given string into StringBuilder
//      StringBuilder builder = new StringBuilder(str);// Not thread safe
        StringBuffer builder = new StringBuffer(str);//Is thread safe    

  Append: append content

System.out.println(builder);//Output the contents of StringBuilder: java.lang
//                                                     0123456789

  Replace: replace part of the content

Index ranges in Java generally include the start and exclude the end

System.out.println(builder); //java.lang.String
//                             01234567890123456

Delete: delete some content

System.out.println(builder); //g.String

  Insert: insert operation

System.out.println(builder); //Learn g.String

Reverse: reverse the string

System.out.println(builder); //gnirtS.g nraeL
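The five operations can be chained on one builder. The sketch below is a reconstruction: the arguments passed to append, replace, delete, and insert are assumptions chosen so that each step produces the output shown in the corresponding section above.

```java
class BuilderOps {
    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder("java");

        builder.append(".lang");              // add to the end
        System.out.println(builder);          // java.lang

        builder.replace(5, 9, "lang.String"); // replace indices 5..8 ("lang")
        System.out.println(builder);          // java.lang.String

        builder.delete(0, 8);                 // remove indices 0..7 ("java.lan")
        System.out.println(builder);          // g.String

        builder.insert(0, "Learn ");          // insert before index 0
        System.out.println(builder);          // Learn g.String

        builder.reverse();                    // reverse the character order
        System.out.println(builder);          // gnirtS.g nraeL
    }
}
```

The same calls work unchanged on a StringBuffer, since both classes offer these methods.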

StringBuffer and StringBuilder

  • StringBuffer is thread safe: its methods are synchronized, so it is slightly slower
  • StringBuilder is not thread safe: its methods are not synchronized, so it is slightly faster
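What thread safety means in practice: when several threads append to one StringBuffer, no append is lost. This is a minimal sketch with illustrative names. With a StringBuilder in its place the result would be unpredictable, so that variant is not shown.

```java
class BufferThreads {
    // Appends 'x' perThread times from each of two threads and returns the final length.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer buffer = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                buffer.append('x'); // synchronized, so concurrent appends do not interfere
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(10_000)); // 20000
    }
}
```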

Regular expressions

  A regular expression describes the format of a string's content. It is usually used to check whether the content of a string meets the format requirements

Basic syntax

[]: matches a single character, which may be any of the characters listed inside the []

For example:

  • [abc]: this character can be a, b, or c
  • [a-z]: any lowercase letter
  • [a-zA-Z]: any letter
  • [a-zA-Z0-9_]: any digit, letter, or underscore
  • [^abc]: any character other than a, b, or c

Predefined characters

  • .: any single character (by default, anything except a line terminator)   // the dot
  • \d: any digit, equivalent to [0-9]
  • \w: any word character, equivalent to [a-zA-Z0-9_]
  • \s: any whitespace character
  • \D: any character that is not a digit
  • \W: any character that is not a word character
  • \S: any character that is not whitespace
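The character classes and predefined characters can be tried out with String.matches, which requires the whole string to match. This is a small sketch with an illustrative class name; note that each backslash is doubled inside a Java string literal.

```java
class CharClassDemo {
    public static void main(String[] args) {
        System.out.println("b".matches("[abc]"));  // true
        System.out.println("d".matches("[^abc]")); // true
        System.out.println("7".matches("\\d"));    // true
        System.out.println("_".matches("\\w"));    // true
        System.out.println("-".matches("\\w"));    // false
        System.out.println(" ".matches("\\s"));    // true
        System.out.println("\n".matches("."));     // false: . skips line terminators by default
        System.out.println("ab".matches("[abc]")); // false: one class matches exactly one character
    }
}
```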

Quantifiers and grouping


  • ?    :  the preceding element appears 0 or 1 times
    •     For example:
    •     [abc]? can match: a, b, c, or the empty string
  • +    :  the preceding element appears one or more times
    •     [abc]+ can match: aaaaaaaaaa or abcabcbabcbab...
    •     Cannot match: the empty string or abcfdfsbbbaqbb34bbwer
  • *    :  the preceding element appears any number of times (0 or more)
    •     Matches the same content as +, and also matches the empty string
  • {n}    :  the preceding element appears exactly n times
    •     For example:
    •     [abc]{3} can match: aaa or bbb or aab
    •     Cannot match: aaaa or aad
  • {n,m}    :  the preceding element appears at least n times and at most m times
    •     [abc]{3,5} can match: aaa or abcab or abcc
    •     Cannot match: aaaaaa or aabbd
  • {n,}    :  the preceding element appears n or more times
    •     [abc]{3,} can match: aaa or aaaaa.... or abcbabbcbabcbabcbba
    •     Cannot match: aa or abbdaw
  • ()    :  grouping; the content in parentheses is treated as a whole
    •     For example:
    •     (abc){3} means that abc appears 3 times as a whole. It can match abcabcabc
    •     Cannot match aaa or abcabc
  • (abc|def){3}    :  abc or def appears 3 times as a whole
    •     Can match: abcabcabc or defdefdef or abcdefabc
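Several of the examples in the list above can be verified with String.matches. The sketch below uses an illustrative class name; the patterns are the ones from the list.

```java
class QuantifierDemo {
    public static void main(String[] args) {
        System.out.println("".matches("[abc]?"));                // true
        System.out.println("".matches("[abc]+"));                // false
        System.out.println("".matches("[abc]*"));                // true
        System.out.println("aab".matches("[abc]{3}"));           // true
        System.out.println("aaaa".matches("[abc]{3}"));          // false
        System.out.println("abcab".matches("[abc]{3,5}"));       // true
        System.out.println("aa".matches("[abc]{3,}"));           // false
        System.out.println("abcabcabc".matches("(abc){3}"));     // true
        System.out.println("abcabc".matches("(abc){3}"));        // false
        System.out.println("abcdefabc".matches("(abc|def){3}")); // true
    }
}
```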

Regular expression methods supported by String

matches method

boolean matches(String regex)
Uses the given regular expression to verify that the current string meets the format requirements. Returns true if it does, otherwise false.
Regular expression for an email address:
 user name @ domain name
String mail = "";
String regex = "[a-zA-Z0-9_]+@[a-zA-Z0-9]+(\\.[a-zA-Z]+)+";
//The regex needs \. to match a literal dot; it is written \\. because \ is also the escape character in Java string literals
boolean match = mail.matches(regex);
if (match) {
    System.out.println("It is an email address");
} else {
    System.out.println("Not an email address");
}
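To see what this pattern accepts and rejects, the sketch below runs it against a few hypothetical addresses. The addresses, class name, and helper name are invented for illustration. The pattern is the simplified one from the text, not a complete email validator.

```java
class EmailCheck {
    // Simplified pattern from the text: user name, @, domain, then one or more ".letters" parts.
    static final String REGEX = "[a-zA-Z0-9_]+@[a-zA-Z0-9]+(\\.[a-zA-Z]+)+";

    static boolean looksLikeEmail(String s) {
        return s.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("user_1@example.com"));    // true
        System.out.println(looksLikeEmail("user@example.com.cn"));   // true
        System.out.println(looksLikeEmail("user@example"));          // false: no ".xxx" part
        System.out.println(looksLikeEmail("user.name@example.com")); // false: . is not allowed in the user name
    }
}
```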

split method

String[] split(String regex)

Splits the current string at every part that matches the regular expression and returns the pieces as an array

String str = "abc123def456ghi";
//Split on the runs of digits to get the letter parts
String[] arr = str.split("[0-9]+");
System.out.println(arr.length);  //3
System.out.println(Arrays.toString(arr));  //[abc, def, ghi]

str = "123,456,789,023";
//Split out all the numeric parts
arr = str.split(",");
System.out.println(Arrays.toString(arr));  //[123, 456, 789, 023]
//Consecutive delimiters produce empty strings between them. Trailing empty strings at the end of the result are dropped
str = ",,,123,,,456,789,023,,,,";
//Split out all the numeric parts
arr = str.split(",");
System.out.println(Arrays.toString(arr));  //[, , , 123, , , 456, 789, 023]

str = "123.456.789.023";
//Split out all the numeric parts
arr = str.split("\\.");//. matches any character in a regular expression, so it must be escaped to match a literal dot
System.out.println(Arrays.toString(arr));   //[123, 456, 789, 023]
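The dropping of trailing empty strings is controlled by the two-argument overload split(String regex, int limit). This is a short sketch with an illustrative class name and sample string.

```java
import java.util.Arrays;

class SplitLimitDemo {
    public static void main(String[] args) {
        String str = "a,b,,";

        // Default behaviour: trailing empty strings are removed.
        System.out.println(Arrays.toString(str.split(",")));     // [a, b]
        // A negative limit keeps every piece, including trailing empty strings.
        System.out.println(Arrays.toString(str.split(",", -1))); // [a, b, , ]
        // A positive limit caps the number of pieces; the last piece holds the rest.
        System.out.println(Arrays.toString(str.split(",", 2)));  // [a, b,,]
    }
}
```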

replaceAll method

String replaceAll(String regex,String str)

Replace the part of the current string that satisfies the regular expression with the given content

String str = "abc123def456ghi";
//Replace the numeric part of the current string with #NUMBER#
str = str.replaceAll("[0-9]+","#NUMBER#");
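The sketch below prints the result of that call and adds two related details: capture groups can be referenced in the replacement as $1, $2, and so on, and a literal $ in the replacement has to be quoted. The class name and the extra sample strings are illustrative.

```java
import java.util.regex.Matcher;

class ReplaceDemo {
    public static void main(String[] args) {
        String str = "abc123def456ghi";
        System.out.println(str.replaceAll("[0-9]+", "#NUMBER#"));   // abc#NUMBER#def#NUMBER#ghi
        System.out.println(str.replaceFirst("[0-9]+", "#NUMBER#")); // abc#NUMBER#def456ghi

        // $n in the replacement refers to the n-th capture group.
        String date = "2021-09-02";
        System.out.println(date.replaceAll("(\\d+)-(\\d+)-(\\d+)", "$3/$2/$1")); // 02/09/2021

        // A literal $ must be quoted, otherwise it is read as a group reference.
        String price = "price 42";
        System.out.println(price.replaceAll("[0-9]+", Matcher.quoteReplacement("$$"))); // price $$
    }
}
```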

Topics: Java