There is quite a lot to learn about regular expressions, and today I’m going to share some of the basics of Python regular expressions. The special character covered today is the vertical bar “|”, which essentially expresses an “or” relationship.

1. Let’s go straight to the demo code. Suppose we need to match the string “dcpeng123” with the pattern “(dcpeng|dcpeng123)”. Remember to put the pattern in parentheses; otherwise the group method called afterwards to fetch the first group will raise an error.
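A minimal sketch of this first demo, assuming the original code used `re.match` and read the result with `group(1)`; the names `line`, `regex_str`, and `match_obj` are illustrative. The second half shows the error that appears when the parentheses are left out.

```python
import re

line = "dcpeng123"
regex_str = "(dcpeng|dcpeng123)"  # alternatives are tried from left to right

match_obj = re.match(regex_str, line)
if match_obj:
    print(match_obj.group(1))  # prints: dcpeng

# Without parentheses there is no group 1, so group(1) raises IndexError.
no_group = re.match("dcpeng|dcpeng123", line)
raised = False
try:
    no_group.group(1)
except IndexError as err:
    raised = True
    print("no capturing group:", err)
```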

The pattern “(dcpeng|dcpeng123)” means that the match succeeds as soon as either “dcpeng” or “dcpeng123” matches. Since “|” is essentially an “or” relationship, a result of “dcpeng” satisfies the pattern, and so does a result of “dcpeng123”. The alternatives are tried from left to right, so here the regular expression matches “dcpeng” first and the printed result is “dcpeng”.

2. Now we swap the order of the two alternatives in the pattern, giving “(dcpeng123|dcpeng)”.
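The swapped pattern can be sketched as follows, again assuming `re.match` and `group(1)`; the variable names are illustrative.

```python
import re

line = "dcpeng123"
swapped_pattern = "(dcpeng123|dcpeng)"  # the longer alternative now comes first

swapped_match = re.match(swapped_pattern, line)
print(swapped_match.group(1))  # prints: dcpeng123
```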

By the same analysis as in step 1, the matching result is now “dcpeng123”, so the details are not repeated here.

3. Next we change the original string to “dcpeng” and keep the pattern “(dcpeng123|dcpeng)” unchanged.
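A small sketch of this case, under the same assumptions (`re.match`, `group(1)`, illustrative names). Only the input string differs from the previous step.

```python
import re

short_line = "dcpeng"
pattern = "(dcpeng123|dcpeng)"

# "dcpeng123" cannot match the shorter string, so the second alternative is tried.
fallback_match = re.match(pattern, short_line)
print(fallback_match.group(1))  # prints: dcpeng
```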

The match result this time is “dcpeng”. The reason is that the pattern first tries “dcpeng123”, which does not match the original string; the special character “|” then moves the attempt on to “dcpeng”, which does match the original string, so the match succeeds and that result is output.

4. What if we want to extract only part of a string? Enclose just the part to extract in parentheses, and keep the rest of the pattern, outside the parentheses, consistent with the original string. For the string “dcpeng123”, the pattern becomes “(dcpeng|dccpeng)123”.
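A sketch of partial extraction, assuming the pattern “(dcpeng|dccpeng)123” implied by the text; the variable names are illustrative. `group(0)` is printed only to contrast the whole match with the captured part.

```python
import re

line = "dcpeng123"
partial_pattern = "(dcpeng|dccpeng)123"  # only the parenthesized part is captured

partial_match = re.match(partial_pattern, line)
print(partial_match.group(1))  # prints: dcpeng
print(partial_match.group(0))  # prints: dcpeng123 (the whole match)
```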

The output is “dcpeng”. It’s easy to make a mistake here: many people expect “dcpeng123”. Just remember that what we extract is the part in parentheses; whatever is outside them still has to match, but it is not part of the extracted result. Similarly, if we change the original string to “dccpeng123” and keep the pattern unchanged, the result is “dccpeng”.
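The second input can be checked the same way. This sketch assumes the same pattern as before and adds one hypothetical input without “123”, to show that the text outside the parentheses still has to match.

```python
import re

partial_pattern = "(dcpeng|dccpeng)123"

second_match = re.match(partial_pattern, "dccpeng123")
print(second_match.group(1))  # prints: dccpeng

# The trailing "123" is required even though it is not captured.
missing_suffix = re.match(partial_pattern, "dccpeng")
print(missing_suffix)  # prints: None
```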

5. If you really want the outer content in the result as well, add another layer of parentheses around the whole pattern, giving “((dcpeng|dccpeng)123)”. When the program runs, we get the matching result “dccpeng123”.
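A sketch of the nested pattern, assuming the outer parentheses wrap everything as described; the variable names are illustrative.

```python
import re

line = "dccpeng123"
nested_pattern = "((dcpeng|dccpeng)123)"  # outer group wraps the inner group and "123"

nested_match = re.match(nested_pattern, line)
print(nested_match.group(1))  # prints: dccpeng123
```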

Groups are numbered from the outermost parentheses inward, in the order of their opening parentheses. When the group method takes the contents of the first group, the result is the contents of the outermost parentheses, “dccpeng123”, so “123” has been extracted as well. Similarly, when the group method takes the contents of the second group, the result is the contents of the innermost parentheses, “dccpeng”.
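The group numbering can be sketched like this, assuming the same nested pattern; `groups()` is used here only as a convenient way to see both captures at once.

```python
import re

nested_match = re.match("((dcpeng|dccpeng)123)", "dccpeng123")

print(nested_match.group(1))  # outermost parentheses: dccpeng123
print(nested_match.group(2))  # innermost parentheses: dccpeng
print(nested_match.groups())  # prints: ('dccpeng123', 'dccpeng')
```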

At this point “123” has not been extracted, because the second group holds only what “(dcpeng|dccpeng)” matched. Extracting substrings with parentheses is very common in web crawlers, and it is a key part of Python regular expressions that is well worth mastering.