Last week, we shared ten articles on the basics of Python regular expressions. If you’re interested, see those earlier posts.

The example below extracts the date of the college entrance examination (gaokao). The standard way to write the date is 2018年6月7日 (June 7, 2018), but many people write 2018/6/7, 2018-6-7, or 2018-06-07, and some write only the year and month, such as 2018-06 or 2018年6月 (June 2018). In short, there are many ways to write a date, so how do we write one regular expression that matches all of them? The detailed walkthrough follows.


1. We first write a simple regular expression, then test it step by step and refine it until it matches every case.


This regular expression is fairly complex and may be hard to read at first, so let’s go through it layer by layer.

2. We read the regular expression from left to right. First, “.*” matches any character (except a newline, by default) repeated any number of times, which corresponds to “XXX” in the original string. There is nothing special about “gaokao time is”; it simply matches the same literal text in the original string.
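The behavior of this first part can be checked on its own. The following is a minimal sketch, assuming a sample string of the form "XXX gaokao time is ..."; the sample string and the variable names are illustrative assumptions, not the author's original code (年, 月 and 日 are the Chinese date units for year, month and day).

```python
import re

# Hypothetical sample string; the article's exact original string is not shown.
string1 = "XXX gaokao time is 2018年6月7日"

# ".*" consumes any leading characters ("XXX "), then the literal text must follow.
prefix_pattern = r".*gaokao time is"

prefix_match = re.match(prefix_pattern, string1)
matched_prefix = prefix_match.group(0) if prefix_match else None
print(matched_prefix)
```

Because ".*" is greedy, it first grabs the whole string and then backtracks until the literal text can match.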

3. “\d{4}” matches four consecutive digits, corresponding to the year “2018” in the original string. “[年/-]” matches any one of the characters 年 (the Chinese character for year), /, or -, corresponding to the character that follows the year 2018 in the original string.

4. “\d{1,2}” matches one or two consecutive digits, corresponding to the month “6” or “06” in the original string. “[月/-]” matches any one of the characters 月 (the Chinese character for month), /, or -, corresponding to the character that follows the month 6 or 06 in the original string. It works the same way as the separator after the year.
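The year and month parts described above can be tried in isolation. This is a small sketch under the assumption that the separators are 年 (year), 月 (month), "/" and "-"; the sample fragments and names are hypothetical.

```python
import re

# Year, separator, month, separator; 年 = year, 月 = month.
# A "-" placed last inside [...] is a literal hyphen, not a range.
year_month_pattern = re.compile(r"(\d{4})[年/-](\d{1,2})[月/-]")

# Hypothetical date fragments in the formats the article lists.
samples = ["2018年6月7日", "2018/6/7", "2018-6-7", "2018-06-07"]

year_month_pairs = []
for text in samples:
    m = year_month_pattern.search(text)
    year_month_pairs.append((m.group(1), m.group(2)) if m else None)

print(year_month_pairs)
```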

5. The next part is more complex. The “\d{1,2}” for the day works the same way as the one for the month. The key point is the character 日 (the Chinese character for day): some strings end with it and some do not, so we need the special character “|” to express an “or” relationship, and the special character “$” to match the end of the string.
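Putting steps 2 to 5 together, a first-version pattern might look like the sketch below. It is reconstructed from the description only, so the exact pattern, the sample strings and the names are assumptions rather than the author's original code.

```python
import re

# Reconstructed first-version pattern (an assumption based on steps 2 to 5).
# The inner group accepts either "<day>日" or a bare day number at the end
# of the string.
pattern_v1 = r".*gaokao time is (\d{4}[年/-]\d{1,2}[月/-](\d{1,2}日|\d{1,2}$))"

# Hypothetical sample strings: one with 日 (day) after the day number, one without.
string1 = "XXX gaokao time is 2018年6月7日"
string2 = "XXX gaokao time is 2018/6/7"

dates = []
for s in (string1, string2):
    m = re.match(pattern_v1, s)
    dates.append(m.group(1) if m else None)

print(dates)
```

The outer parentheses capture the whole date, so group(1) returns it regardless of which alternative matched the day.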

6. Now that the structure is clear, we verify the six original strings in turn to see whether they match. Below is the match result for the original string string2.


The match succeeds.

7. The following figure shows the match result for the original string string3.


The match succeeds.

8. Below is a match for the original string string4.


The match succeeds.

9. Below is a match for the original string string5.


It turns out that the pattern doesn’t match. Why is that?

10. The reason is that the month “\d{1,2}” is required to be followed by “[月/-]” (月 means month), while the date in string5 is “2018-06”, which ends right after the month with no further characters. It therefore does not fit the pattern, so the pattern needs to be improved here.
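The failure can be narrowed down by matching the pattern in two stages. This is a sketch with a hypothetical string5 and illustrative pattern names, not the original code.

```python
import re

string5 = "XXX gaokao time is 2018-06"  # hypothetical sample string

# Up to the month, the pattern still matches...
up_to_month = r".*gaokao time is (\d{4}[年/-]\d{1,2})"
# ...but requiring a separator after the month makes the match fail,
# because nothing follows "06" in this string.
with_separator = up_to_month + r"[月/-]"

matches_up_to_month = re.match(up_to_month, string5) is not None
matches_with_separator = re.match(with_separator, string5) is not None
print(matches_up_to_month, matches_with_separator)
```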


We need the special characters “|” and “$” to add an “or” option, as shown above; then the string matches.
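One possible form of this improvement is sketched below. It is an assumption about how the "or" option could be written, not the author's exact pattern: the separator after the month becomes optional, and the day group also accepts the end of the string.

```python
import re

# Sketch of one possible improved pattern (an assumption, not the original code):
# "([月/-]|$)" allows either a separator or the end of the string after the
# month, and the day group additionally accepts "$" on its own.
pattern_v2 = (
    r".*gaokao time is "
    r"(\d{4}[年/-]\d{1,2}([月/-]|$)(\d{1,2}日|\d{1,2}$|$))"
)

string5 = "XXX gaokao time is 2018-06"       # hypothetical: year and month only
string1 = "XXX gaokao time is 2018年6月7日"   # hypothetical: full date

m5 = re.match(pattern_v2, string5)
m1 = re.match(pattern_v2, string1)
date5 = m5.group(1) if m5 else None
date1 = m1.group(1) if m1 else None
print(date5, date1)
```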

Of course, it is also simpler to move “[月/-]” (the separator after the month) into the second pair of parentheses, as shown below.
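A sketch of what this simplified variant might look like follows. The exact pattern is an assumption reconstructed from the description, and string5 is a hypothetical sample.

```python
import re

# Sketch of the simplified variant (an assumption): the separator "[月/-]"
# lives inside the alternation, and a bare "$" covers dates that end
# right after the month.
pattern_v3 = (
    r".*gaokao time is "
    r"(\d{4}[年/-]\d{1,2}([月/-]\d{1,2}日|[月/-]\d{1,2}$|[月/-]$|$))"
)

string5 = "XXX gaokao time is 2018-06"  # hypothetical sample string
m = re.match(pattern_v3, string5)
date5_simplified = m.group(1) if m else None
print(date5_simplified)
```

Keeping every alternative inside one group makes each accepted ending visible at a glance: day with 日, bare day, separator only, or nothing.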


The following figure shows the match result for the original string string6.

You can see that the match succeeds.

After testing, we find that the improved regular expression successfully matches all six strings with their different date formats. Can you feel the magic of regular expressions?
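The full check over all six formats can be sketched as a small helper. The pattern is a reconstruction, and the function name extract_date and the six sample strings are hypothetical stand-ins for the article's string1 to string6.

```python
import re

# Reconstructed final pattern; an assumption, not the author's exact code.
DATE_PATTERN = re.compile(
    r".*gaokao time is "
    r"(\d{4}[年/-]\d{1,2}([月/-]\d{1,2}日|[月/-]\d{1,2}$|[月/-]$|$))"
)


def extract_date(text):
    """Return the date part of text, or None if the pattern does not match."""
    m = DATE_PATTERN.match(text)
    return m.group(1) if m else None


# Hypothetical versions of the six test strings, one per date format.
date_formats = ["2018年6月7日", "2018/6/7", "2018-6-7",
                "2018-06-07", "2018-06", "2018年6月"]
extracted = [extract_date("XXX gaokao time is " + d) for d in date_formats]

for original, result in zip(date_formats, extracted):
    print(original, "->", result)
```

Compiling the pattern once with re.compile avoids repeating the long pattern string at every call.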