Go regular expressions

Posted by lilsim89 on Thu, 17 Oct 2019 20:48:22 +0200

In the previous two lessons we used regular expressions heavily to match the city list, individual cities, and user information. Besides regular expressions, third-party libraries such as goquery and xpath can also extract the useful information, but I chose regular expression matching as the more elegant option. So let's talk about regular expressions.

For example, when matching the city list, we want the URL of every city. Each city URL consists of the prefix http://www.zhenai.com/zhenghun/ followed by lowercase letters and digits, so it can be extracted with the following expression:

<a href="(http://www.zhenai.com/zhenghun/[0-9a-z]+)"[^>]*>([^<]+)</a>

[0-9a-z]+ matches a lowercase letter or digit one or more times, [^>]* matches any character other than > zero or more times, and [^<]+ matches any character other than < one or more times. We need both the URL and the name of each city, so each is wrapped in a capture group.

You can get the URL and city name with:

const (
   cityListReg = `<a href="(http://www.zhenai.com/zhenghun/[0-9a-z]+)"[^>]*>([^<]+)</a>`
 )

 compile := regexp.MustCompile(cityListReg)

 submatch := compile.FindAllSubmatch(contents, -1)

 for _, m := range submatch {
   fmt.Println("url:" , string(m[1]), "city:", string(m[2]))
 }
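
As a self-contained variant of the snippet above, here is a minimal sketch that runs the same pattern over a small piece of made-up markup. The City type, the extractCities helper, and the sample HTML are assumptions added for illustration; they are not part of the crawler from the earlier lessons.

```go
package main

import (
	"fmt"
	"regexp"
)

// cityListReg is the pattern from the text: group 1 is the URL, group 2 the city name.
const cityListReg = `<a href="(http://www.zhenai.com/zhenghun/[0-9a-z]+)"[^>]*>([^<]+)</a>`

// City holds one extracted link (hypothetical type, for illustration only).
type City struct {
	URL  string
	Name string
}

// extractCities returns every city link found in contents.
func extractCities(contents []byte) []City {
	re := regexp.MustCompile(cityListReg)
	var cities []City
	for _, m := range re.FindAllSubmatch(contents, -1) {
		// m[0] is the whole match, m[1] and m[2] are the two groups.
		cities = append(cities, City{URL: string(m[1]), Name: string(m[2])})
	}
	return cities
}

func main() {
	// Made-up sample markup standing in for the downloaded page.
	html := []byte(`<a href="http://www.zhenai.com/zhenghun/aba" class="x">Aba</a>
<a href="http://www.zhenai.com/zhenghun/beijing1">Beijing</a>`)
	for _, c := range extractCities(html) {
		fmt.Println("url:", c.URL, "city:", c.Name)
	}
}
```

Returning a slice of structs instead of printing inside the loop keeps the matching logic reusable by whatever consumes the city list.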

The following checks whether a string contains two g characters with at least one lowercase letter between them:

//Report whether the string contains g, one or more lowercase letters, then g
 match, _ := regexp.MatchString("g([a-z]+)g", "11golang11")
 //true
 fmt.Println(match)

Above we matched directly against a pattern string. For other regular expression tasks, compile the pattern into an optimized Regexp object first:

compile, err := regexp.Compile("smallsoup@gmail.com")

 if err != nil {
   //... regular syntax error, need to handle error
   fmt.Println(err)
 }

 //smallsoup@gmail.com
 fmt.Println(compile.FindString(text))

regexp.Compile returns a regular expression matcher and an error. When the pattern passed in does not conform to the regular expression syntax, the error is non-nil. For example, regexp.Compile("[smallsoup@gmail.com") reports a "missing closing ]" error.

Generally, the error only needs handling when the pattern comes from user input; a pattern you write yourself as a literal should never be wrong. In that case you can use compile := regexp.MustCompile("smallsoup@gmail.com"), which panics if the syntax is invalid.
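
The difference between the two functions can be sketched as follows. The compileUserPattern helper is a hypothetical name, and the escaped dot in the fixed pattern is my own tightening (an unescaped . matches any character), not something the original snippet does.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileUserPattern compiles a pattern supplied at run time and reports
// whether it was valid (hypothetical helper, for illustration only).
func compileUserPattern(pattern string) (*regexp.Regexp, bool) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		fmt.Println("invalid pattern:", err)
		return nil, false
	}
	return re, true
}

// emailRe is a fixed pattern, so MustCompile at package level is reasonable:
// a typo here would panic as soon as the program starts.
var emailRe = regexp.MustCompile(`smallsoup@gmail\.com`)

func main() {
	if _, ok := compileUserPattern("[smallsoup@gmail.com"); !ok {
		fmt.Println("rejected the malformed pattern")
	}
	fmt.Println(emailRe.FindString("my email is smallsoup@gmail.com"))
}
```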

text1 := `my email is aa@qq.com
  aa email is aa@gmail.com
  bb email is bb@qq.com
  cc email is cc@qq.com.cn
  `
 //To extract A, B and C from A@B.C, use the submatch (capture group) functions.
 comp := regexp.MustCompile(`([a-zA-Z0-9]+)@([a-zA-Z0-9.]+)\.([a-zA-Z0-9]+)`)

 //Use submatches to get the content captured by the parentheses in the regular expression
 submatchs := comp.FindAllStringSubmatch(text1, -1)

 //submatchs is a two-dimensional slice ([][]string).
 fmt.Println(submatchs)

 //Take out each match; each submatch is a slice ([]string).
 for _, submatch := range submatchs {
   fmt.Println(submatch)
 }

The result output is as follows:

[[aa@qq.com aa qq com] [aa@gmail.com aa gmail com] [bb@qq.com bb qq com] [cc@qq.com.cn cc qq.com cn]]
[aa@qq.com aa qq com]
[aa@gmail.com aa gmail com]
[bb@qq.com bb qq com]
[cc@qq.com.cn cc qq.com cn]
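
To make the indexing explicit: element 0 of each submatch is the whole match, and elements 1 to 3 are the three parenthesized groups. A minimal sketch for a single string, where splitEmail is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"regexp"
)

var emailParts = regexp.MustCompile(`([a-zA-Z0-9]+)@([a-zA-Z0-9.]+)\.([a-zA-Z0-9]+)`)

// splitEmail returns the A, B and C parts of the first A@B.C found in s.
// ok is false when s contains no match (hypothetical helper).
func splitEmail(s string) (a, b, c string, ok bool) {
	m := emailParts.FindStringSubmatch(s)
	if m == nil {
		return "", "", "", false
	}
	// m[0] is the whole match; m[1], m[2], m[3] are the capture groups.
	return m[1], m[2], m[3], true
}

func main() {
	a, b, c, ok := splitEmail("cc email is cc@qq.com.cn")
	fmt.Println(a, b, c, ok) // cc qq.com cn true
}
```

Because the second group is greedy and allows dots, it takes qq.com and leaves only cn for the third group, which matches the last line of the output above.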

The regexp package can also replace matched text. The example below shows that, followed by a few common expressions:

r := regexp.MustCompile("p([a-z]+)ch")
 fmt.Println(r) //----->p([a-z]+)ch
 //The regexp package can also be used to replace parts of a string with other values.
 fmt.Println(r.ReplaceAllString("a peach", "<smallsoup>")) //----->a <smallsoup>
 //The Func variants pass each match to a given function and use its result as the replacement.
 in := []byte("a smallsoup")
 out := r.ReplaceAllFunc(in, bytes.ToUpper)
 //"a smallsoup" contains no match, so it comes back unchanged; "a peach" would give "a PEACH".
 fmt.Println(string(out)) //----->a smallsoup
 /*#######################Common expressions###########################*/
 // Find Chinese characters
 testText := "Hello 你好吗, I like golang!"
 reg := regexp.MustCompile(`[\p{Han}]+`)
 fmt.Println(reg.FindAllString(testText, -1)) // ----->[你好吗]
 // Find runs of non-Chinese characters
 reg = regexp.MustCompile(`[\P{Han}]+`)
 fmt.Println(reg.FindAllString(testText, -1))        // ----->[Hello  , I like golang!]
 fmt.Printf("%q\n", reg.FindAllString(testText, -1)) // ----->["Hello " ", I like golang!"]
 //Email
 reg = regexp.MustCompile(`\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*`)
 fmt.Println(reg.MatchString("smallsoup@qq.com"))
 //User name and password:
 //Note: this pattern is unanchored and uses |, so any string containing a letter matches.
 reg = regexp.MustCompile(`[a-zA-Z]|\w{6,18}`)
 fmt.Println(reg.MatchString("w_dy_246"))

The output is as follows:

p([a-z]+)ch
a <smallsoup>
a smallsoup
[你好吗]
[Hello  , I like golang!]
["Hello " ", I like golang!"]
true
true

Process finished with exit code 0
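
Two points from the example above can be sketched in isolation: a Func replacement applied to input that actually matches, and an anchored user name check. The rule "starts with a letter, 6 to 18 word characters in total" is my assumption about what the user name pattern was meant to enforce, and the names shout and usernameRe are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	peachRe = regexp.MustCompile(`p([a-z]+)ch`)
	// Assumed rule: a leading letter plus 5 to 17 more word characters,
	// anchored with ^ and $ so the whole string must conform.
	usernameRe = regexp.MustCompile(`^[a-zA-Z]\w{5,17}$`)
)

// shout upper-cases every match of peachRe in s and leaves the rest alone.
func shout(s string) string {
	return peachRe.ReplaceAllStringFunc(s, strings.ToUpper)
}

func main() {
	fmt.Println(shout("a peach"))                   // a PEACH
	fmt.Println(shout("a smallsoup"))               // a smallsoup
	fmt.Println(usernameRe.MatchString("w_dy_246")) // true
	fmt.Println(usernameRe.MatchString("1abc"))     // false
}
```

ReplaceAllStringFunc is the string counterpart of ReplaceAllFunc, which avoids the []byte conversions when the input is already a string.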


Topics: Go Java Python Big Data Programming