Besides special characters such as “^” and “*”, Python regular expressions also use “$” and “?”, which are the subject of this article.

1. The special character “$” anchors a match to the end of the string. For example, the regular expression “3$” matches a string ending in 3. The code is shown below.
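A minimal sketch of the “3$” case, assuming a sample string `line` that is made up for illustration and is not the article's original data. It uses `re.search`, because `re.match` only tries to match at the start of the string, which is why a pattern meant for `re.match` needs a leading “.*”.

```python
import re

# Hypothetical sample string; any string ending in "3" behaves the same way.
line = "study123"

# "3$" means: a "3" that is the last character of the string.
found = re.search(r"3$", line)
print(found.group(0))  # 3
print(found.span())    # (7, 8)

# re.match anchors at the start of the string, so the bare "3$" fails here.
at_start = re.match(r"3$", line)
print(at_start)        # None
```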

The pattern “.*3$” matches any string ending in 3. When the original string ends in 3, the pattern matches the whole string, so the original string is returned.

2. If the pattern is changed to “.*4$”, it matches any string ending in “4”, so a string that ends in 3 no longer matches, as shown in the following figure.
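The two patterns can be compared side by side. This is only a sketch: the sample string `line` is an assumption chosen for illustration.

```python
import re

# Hypothetical sample string that ends in "3".
line = "study123"

# ".*3$": any characters, then a "3" as the last character.
ends_in_3 = re.match(r".*3$", line)
# ".*4$": any characters, then a "4" as the last character.
ends_in_4 = re.match(r".*4$", line)

print(ends_in_3.group(0))  # study123
print(ends_in_4)           # None
```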

3. The special character “?” is used even more often. Placed after a quantifier, it switches that quantifier to non-greedy matching. By default matching is greedy: a quantifier such as “.*” consumes as many characters as the rest of the pattern allows.

4. The figure below is an example. The parentheses define a group: the part of the string matched inside the parentheses is captured so that it can be extracted. The pattern “.*(p.*p).*” reads as follows: the left “.*” matches any string, empty or not, followed by the character p; the middle “.*” matches any string, followed by another p; and the right “.*” matches any string. The intent is to extract the text between the two p’s, together with the p’s themselves.
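A minimal sketch of the greedy case. The sample string `line` is an assumption, built so that it contains the substring we would like to extract followed by one extra p; it is not the article's original data.

```python
import re

# Hypothetical sample: "aa", then p + ten c's + p, then one more p, then "123".
line = "aa" + "p" + "c" * 10 + "pp" + "123"

# Every ".*" here is greedy.
match_obj = re.match(r".*(p.*p).*", line)
if match_obj:
    # The leading ".*" swallows as much as it can, leaving only the
    # last two p's for the group.
    print(match_obj.group(1))  # pp
```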

But the output is “pp”, not the “pccccccccccp” we want. The reason is greedy matching: the leading “.*” first consumes as much of the string as it can and gives characters back only until the group can match, so the group is effectively located from the right end of the string and ends up as “pp”.

5. Change the pattern “.*(p.*p).*” to “.*?(p.*p).*”, adding the special character “?” just before the first “p”. The result is shown in the figure below.
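A sketch of this intermediate step, again assuming a made-up sample string `line` rather than the article's original data. Only the leading quantifier is non-greedy here.

```python
import re

# Hypothetical sample: "aa", then p + ten c's + p, then one more p, then "123".
line = "aa" + "p" + "c" * 10 + "pp" + "123"

# ".*?" is non-greedy, so the group starts at the first p.
# The ".*" inside the group is still greedy and runs to the last p.
match_obj = re.match(r".*?(p.*p).*", line)
if match_obj:
    print(match_obj.group(1))  # pccccccccccpp
```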

Now the match is located from the left, and the result is closer to what we want, but it still ends with two p’s. The reason is that the “.*” inside the group is still greedy, so it extends to the last p in the string.

6. Change the pattern “.*?(p.*p).*” to “.*?(p.*?p).*”, adding the special character “?” before the second “p” as well. The result is shown in the figure below.
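A sketch with both quantifiers made non-greedy. As before, the sample string `line` is an assumption chosen for illustration.

```python
import re

# Hypothetical sample: "aa", then p + ten c's + p, then one more p, then "123".
line = "aa" + "p" + "c" * 10 + "pp" + "123"

# The leading ".*?" stops at the first p, and the inner ".*?" stops
# at the nearest following p.
match_obj = re.match(r".*?(p.*?p).*", line)
if match_obj:
    print(match_obj.group(1))  # pccccccccccp
```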

Now the result is what we want: both quantifiers are non-greedy, so the group starts at the first p and stops at the nearest following p.

7. Once you understand non-greedy mode, the rest of regular expression matching is easier to follow. The following example returns “pcccp” in non-greedy mode.
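One way to reproduce a “pcccp” result is sketched below. Both the sample string and the pattern are assumptions, since the article's own code for this step is not available.

```python
import re

# Hypothetical sample string with several p's.
line = "pcccpcccccccpppp"

# Non-greedy group: start at the first p, stop at the nearest following p.
match_obj = re.match(r".*?(p.*?p).*", line)
if match_obj:
    print(match_obj.group(1))  # pcccp
```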

8. The example below returns “pcccpcccccccpppp” in both non-greedy and greedy modes.
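The following is only one plausible way to get the same result from both modes, offered as an assumption rather than as the article's code. When the group is followed by “$”, the non-greedy quantifier has to keep expanding until the match reaches the end of the string, so it ends up matching as much as the greedy one.

```python
import re

# Hypothetical sample string that starts and ends with p.
line = "pcccpcccccccpppp"

# "$" forces the group to end at the end of the string in both modes.
greedy = re.match(r"(p.*p)$", line)
lazy = re.match(r"(p.*?p)$", line)

print(greedy.group(1) == line)  # True
print(lazy.group(1) == line)    # True
```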

Non-greedy mode is very important for extracting strings when writing web crawlers, so it is worth understanding and mastering. So, have you got the hang of the regular expression special characters “$” and “?”?