Overview

The question mark is the optional operator in regex. It makes the character immediately before it optional, so the regex matches whether or not that character is present.

For example:

abcd?

This will match both “abc” and “abcd”.

The program

Let’s look at a similar example.

package main

import (
	"fmt"
	"regexp"
)

func main() {
	sampleRegexp := regexp.MustCompile("abcd?")

	match := sampleRegexp.Match([]byte("abc"))
	fmt.Printf("For abc: %t\n", match)

	match = sampleRegexp.Match([]byte("abcd"))
	fmt.Printf("For abcd: %t\n", match)

}

The output

For abc: true
For abcd: true

Several characters can also be made optional together by enclosing them in parentheses and placing the question mark after the closing parenthesis. For example:

abc(de)?

The program

package main

import (
	"fmt"
	"regexp"
)

func main() {
	sampleRegexp := regexp.MustCompile("abc(de)?")

	match := sampleRegexp.Match([]byte("abc"))
	fmt.Printf("For abc: %t\n", match)

	match = sampleRegexp.Match([]byte("abcde"))
	fmt.Printf("For abcde: %t\n", match)

	match = sampleRegexp.Match([]byte("abcd"))
	fmt.Printf("For abcd: %t\n", match)
}

The output

For abc: true
For abcde: true
For abcd: true

It matches “abc” and “abcde”.

It also matches “abcd”. You must be wondering why it matches “abcd”.

The Match function also reports a match if the given string or text contains a match for the regex as a substring. That is why it reports a match here: “abcd” contains the substring “abc”, which matches the regex. If we want to match the full string, we need to use anchor characters at the beginning and end of the regex. The caret (^) anchor is used at the beginning and the dollar ($) anchor is used at the end.

Let’s look at a similar example.

package main

import (
	"fmt"
	"regexp"
)

func main() {
	sampleRegexp := regexp.MustCompile("^abc(de)?$")

	match := sampleRegexp.Match([]byte("abc"))
	fmt.Printf("For abc: %t\n", match)

	match = sampleRegexp.Match([]byte("abcde"))
	fmt.Printf("For abcde: %t\n", match)

	match = sampleRegexp.Match([]byte("abcd"))
	fmt.Printf("For abcd: %t\n", match)
}

The output

For abc: true
For abcde: true
For abcd: false
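
To see why the unanchored regex reported a match for “abcd”, it can help to print the matched text instead of a boolean. The following is a minimal sketch, not part of the original programs: matchedText is a hypothetical helper name introduced here, and it relies on FindString, which returns the leftmost match or an empty string when there is none.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchedText is a hypothetical helper for this sketch. It returns the
// leftmost substring of input matched by pattern, or "" if nothing matches.
func matchedText(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	for _, input := range []string{"abc", "abcde", "abcd"} {
		// Without anchors the regex may match only part of the input.
		// With ^ and $ it has to cover the whole input.
		fmt.Printf("%-5s unanchored: %q anchored: %q\n",
			input,
			matchedText("abc(de)?", input),
			matchedText("^abc(de)?$", input))
	}
}
```

For “abcd”, the unanchored regex matches only the “abc” part, while the anchored regex finds nothing, which agrees with the false result above.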

The question mark operator is non-lazy

The question mark operator is non-lazy, that is, greedy. This means that it will try to match the optional pattern first.

In the world of regular expressions, non-lazy (also called greedy) means matching as much as possible. Lazy (also called non-greedy) means matching only as much as needed.

For example, take the regex

https?

If you try to match the following input string

Better is https

Then there are two options

  • Match “http”

  • Match “https”

It will always match “https” instead of “http”. The reason is that the operator is non-lazy. Even after it has matched “http”, it does not stop and tries to match the optional character as well. If the optional character matches, it returns “https”, otherwise “http”.

Let’s look at a similar example

package main

import (
	"fmt"
	"regexp"
)

func main() {
	sampleRegexp := regexp.MustCompile("https?")

	match := sampleRegexp.Find([]byte("Better is https"))
	fmt.Printf("Match: %s\n", match)
}

The output

Match: https

In the above program, we used the Find function, which returns the actual substring that matches the regex. Notice that the output is “https” and not “http”, because the question mark operator is non-lazy.

The double question mark operator

The double question mark operator (??) is lazy. Once it finds the first match, it does not try to match any further. Therefore, for the above text, it always gives “http” and not “https”.

Let’s look at an example

package main

import (
	"fmt"
	"regexp"
)

func main() {
	sampleRegexp := regexp.MustCompile("https??")

	match := sampleRegexp.Find([]byte("Better is https"))
	fmt.Printf("Match: %s\n", match)
}

The output

Match: http
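
The two operators can be compared side by side. The sketch below is illustrative only: find is a hypothetical helper name, and the anchored third case is an addition that does not appear in the programs above. It is included to show that laziness is a preference and not a hard limit: when the rest of the regex can only succeed if the optional character is consumed, the lazy operator still consumes it.

```go
package main

import (
	"fmt"
	"regexp"
)

// find is a hypothetical helper for this sketch. It returns the leftmost
// match of pattern in input, or "" if there is no match.
func find(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	// Greedy optional: prefers to consume the "s".
	fmt.Printf("https?    -> %q\n", find("https?", "Better is https"))

	// Lazy optional: prefers to skip the "s".
	fmt.Printf("https??   -> %q\n", find("https??", "Better is https"))

	// Lazy optional with anchors: skipping the "s" would leave "$"
	// unable to match, so the "s" is consumed after all.
	fmt.Printf("^https??$ -> %q\n", find("^https??$", "https"))
}
```

The first two lines repeat the results of the programs above (“https” and “http”), and the anchored case returns the full “https”.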

The question mark after a quantifier

A question mark ‘?’ placed after a quantifier makes that quantifier lazy, or non-greedy. The quantifier can be

  • Plus ‘+’ – one or more

  • Asterisk ‘*’ – zero or more

See the example below

package main

import (
	"fmt"
	"regexp"
)

func main() {
	sampleRegexp := regexp.MustCompile("http(s+?)")

	match := sampleRegexp.Find([]byte("Better is httpsss"))
	fmt.Printf("Match: %s\n", match)

	sampleRegexp = regexp.MustCompile("http(s*?)")

	match = sampleRegexp.Find([]byte("Better is httpsss"))
	fmt.Printf("Match: %s\n", match)
}

The output

Match: https
Match: http

In the above program, we have two cases

  • The question mark after the plus operator

  • The question mark after the asterisk operator

In both cases, the input string is

Better is httpsss

In the first case, we use a question mark after the plus operator in the regex

"http(s+?)"

It gives a match of “https” and not “httpsss”, because the question mark after the plus operator makes it non-greedy.

In the second case, we use a question mark after the asterisk operator in the regex

"http(s*?)"

When a question mark is used after the asterisk operator, it gives a match of “http” and not “httpsss”, because the question mark makes the asterisk non-greedy.
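
For contrast, the same input can be run against the greedy forms of both quantifiers. This is a small sketch under the same assumptions as the program above; fullMatch is a hypothetical helper name, and the greedy patterns http(s+) and http(s*) are additions made here for comparison.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch is a hypothetical helper for this sketch. It returns the
// leftmost match of pattern in input, or "" if there is no match.
func fullMatch(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	input := "Better is httpsss"

	// Greedy forms first, then their lazy counterparts.
	for _, pattern := range []string{"http(s+)", "http(s+?)", "http(s*)", "http(s*?)"} {
		fmt.Printf("%-10s -> %q\n", pattern, fullMatch(pattern, input))
	}
}
```

The greedy forms consume every “s” and return “httpsss”, while the lazy forms stop at the minimum each quantifier allows: one “s” for plus and none for the asterisk.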

Let’s look at another example

package main

import (
	"fmt"
	"regexp"
)

func main() {
	sampleRegexp := regexp.MustCompile("(a+?)(a*)")

	match := sampleRegexp.FindStringSubmatch("aaaaaaa")
	fmt.Printf("Match: %s Length: %d\n", match, len(match))

	sampleRegexp = regexp.MustCompile("(a*?)(a*)")

	match = sampleRegexp.FindStringSubmatch("aaaaaaa")
	fmt.Printf("Match: %s Length: %d\n", match, len(match))
}

The output

Match: [aaaaaaa a aaaaaa] Length: 3
Match: [aaaaaaa  aaaaaaa] Length: 3

In the above program, we also have two cases

  • There is a question mark after the plus operator

  • The question mark after the asterisk operator

In the first case, we have two capture groups

(a+?)(a*)

The first capture group matches a single ‘a’, while the second capture group matches the rest. This shows that a question mark used after the plus operator makes it non-greedy, or lazy.

In the second case, we again have two capture groups

(a*?)(a*)

The first capture group matches zero ‘a’ characters, while the second capture group matches all of them. This shows that a question mark used after the asterisk operator makes it non-greedy, or lazy.
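
How the split between the two groups changes when the first quantifier is greedy can be sketched as follows. The helper name groups and the greedy pattern (a+)(a*) are assumptions added here for comparison and are not part of the program above.

```go
package main

import (
	"fmt"
	"regexp"
)

// groups is a hypothetical helper for this sketch. It returns the full
// match followed by the text of each capture group, or nil if there is
// no match.
func groups(pattern, input string) []string {
	return regexp.MustCompile(pattern).FindStringSubmatch(input)
}

func main() {
	input := "aaaaaaa"

	for _, pattern := range []string{"(a+)(a*)", "(a+?)(a*)", "(a*?)(a*)"} {
		m := groups(pattern, input)
		// m[1] is the first capture group and m[2] is the second.
		fmt.Printf("%-10s first: %q second: %q\n", pattern, m[1], m[2])
	}
}
```

With the greedy (a+), the first group takes all seven characters and leaves the second group empty. With the lazy forms, the first group takes as little as it is allowed to and the second group takes the remainder.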

That’s all about the question mark operator in Go. Hope you enjoyed this article. Please share your feedback in the comments.

Also, check out our Golang advanced tutorial series: Golang Advanced Tutorial

The post Golang Regex: Optional operator or question mark (?) in a regular expression first appeared on Welcome To Golang By Example.