NowCoder

Topic describes

Implement a function to match regular expressions including ‘.’ and ‘*’. The character ‘.’ in the pattern represents any character, while the character ‘*’ indicates that the character preceding it can occur any number of times (including zero).

In this case, matching means that all characters in a string match the entire pattern. For example, the string “aaa” matches the patterns “A.a” and “ab*ac*a”, but not “aa.a” and “ab*a”.

Their thinking

It should be noted that ‘.’ is used as an arbitrary character, while ‘*’ is used to repeat the preceding character. The function of ‘.’ is different from that of ‘*’, so it is not like repeating the previous character once.

recursive

Assuming that the main string is S and the length is SN, and the mode string is P and the length is PN, there are three cases of ‘normal character’, ‘*’ and ‘.’ for the current i-th bit of the mode string P. We discuss these three cases:

P [I] = p[I] = p[I] = p[I] = p[I] = p[I] = p[I]

If p [I] as’, it can match any character, directly see s [I] sn – 1 + 1… and p [I + 1 pn 1]…

If p[I] is ‘*’, it indicates that p[i-1] can be repeated 0 or more times, and p[I] and p[I] need to be considered as a whole.

If p/I – 1 0 times repeated, directly see s [I]… sn – 1 and p + 2… [I pn 1] if p/I – 1 repeat or multiple times, directly see s [I] sn – 1 + 1… and p] [I… pn – 1, but there is a premise: S [I]= =p[I] or p[I]= = ‘.’

Obviously the above process can be calculated recursively. Then the recursive trilogy is:

Match (s, p) -> bool indicates whether p can match s

Recursive termination condition:

If s and p are both null, that’s a good match. If S is not null, p is null, that’s a bad match. If S is null, p is not null, you need to calculate, you can’t give the result directly.

In the case of the previously discussed 1, 2, to merge, if s = = p | | p = = ‘. ‘, the match (s + 1, p + 1)

For case 3, match(s+1,p) if repeated one or more times, and match(s, p+2) if repeated 0 times.

There are a lot of solutions, non-recursive writing, dynamic programming and recursion, feel this problem belongs to the problem, can do it, the difficulty is high but also the pursuit of a variety of solutions is not the style of the novice. Okay, so I’m going to do a summary after I look at the recursive code. First, there must be two Pointers, one on the given string and one on the template. If they match, they all move backwards, but in this case there are two special symbols messing with it, so it’s special. First analyze the case of ‘.’. When we compare two Pointers to characters, the template character is ‘.’, so we can move both Pointers back at the same time. In this case, it’s easy. It’s equal by default. In other words, you just have to worry about whether or not you have a star.

Consider the relevant case of “. When we compare the characters of two Pointers, if the string is an S pointer and the template is a P pointer to a certain character. So if the next character of P is not “, then that’s the most common case, just move it back one bit. The s++, p++. When we analyze this, notice that p+1 refers to a double. So how do we move the pointer.

Method 1: there is a “, and s and p are not equal, which is equivalent to the template in the “” before the” appeared 0 times, then must move the p pointer back 2 bits, let the template skip “. (For ABGE and ACMN, s refers to B and P refers to C, so p refers to m, that is, p+2. Method 2: ” is present, and s and P refer to equality (where equality can be true or marked by ‘.’). For example, abCD and ABCF, where s and P point to B. This is s plus 1 and p plus 2. Like abbcd and ABcf, where s and P both point to B. Since B appears many times, there should be no hurry to move P, so s+1 is enough. “Cba “,” CbaA “, I could also say that the first A that p points to never occurs, even if you’re equal. So we can keep s stationary, p plus 2. As can be seen, mode 2 is actually divided into 3 cases, and in fact, we can not distinguish the specific method to deal with, so we can only use the three methods at the same time or operation side by side. The third one is not easy to think of, because even if it is equal, I still think you are not equal, ignore you, it’s true. Let me get this straight again. Since true equality is conflated with the marked equality of ‘.’, only ” is considered. If there is no asterisk, then there is a regular comparison, and a regular comparison must be equal or unequal. If an asterisk is present, we consider whether the two Pointers are equal or unequal, and the p pointer simply +2. If two Pointers are equal, there are three parallel cases. Add boundary conditions to those judgments, and it’s easy to understand and write code.

public class Solution {

    public boolean match(char[] str, char[] pattern)

    
{

        if (str == null || pattern == null)

            return false;

        return matchCore(str, 0, pattern, 0);

    }

    private boolean matchCore(char[] str, int s, char[] pattern, int p) {

        // The following 4 lines indicate the end of the recursion

        if (s == str.length && p == pattern.length)

            return true;

        if (s < str.length && p == pattern.length)

            return false;

 

        // If * occurs after P, the rule needs to be changed.

        if (p + 1 < pattern.length && pattern[p + 1] = =The '*') {

            // * * * * * * * * * * * *

            if ((s < str.length && pattern[p] == '. ')

                    || (s < str.length && pattern[p] == str[s])) {

                return matchCore(str, s, pattern, p + 2)

                        || matchCore(str, s + 1, pattern, p)

                        || matchCore(str, s + 1, pattern, p + 2);

            } else {// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

                return matchCore(str, s, pattern, p + 2);

            }

        }

        // * * * * * * * * * * * * If they are the same, each pointer is given +1

        if (s < str.length && (pattern[p] == str[s] || pattern[p] == '. '))

            return matchCore(str, s + 1, pattern, p + 1);

        // no * after p. Back you up, you dare to appear different, that must be false

        return false;

    }

}

Copy the code

tag

public boolean match(char[] str, char[] pattern) {



    int m = str.length, n = pattern.length;

    boolean[][] dp = new boolean[m + 1][n + 1];



    dp[0] [0] = true;

    for (int i = 1; i <= n; i++)

        if (pattern[i - 1] = =The '*')

            dp[0][i] = dp[0][i - 2];



    for (int i = 1; i <= m; i++)

        for (int j = 1; j <= n; j++)

            if (str[i - 1] == pattern[j - 1] || pattern[j - 1] = ='. ')

                dp[i][j] = dp[i - 1][j - 1];

            else if (pattern[j - 1] = =The '*')

                if (pattern[j - 2] == str[i - 1] || pattern[j - 2] = ='. ') {

                    dp[i][j] |= dp[i][j - 1]; // a* counts as single a

                    dp[i][j] |= dp[i - 1][j]; // a* counts as multiple a

                    dp[i][j] |= dp[i][j - 2]; // a* counts as empty

                } else

                    dp[i][j] = dp[i][j - 2];   // a* only counts as empty



    return dp[m][n];

}

Copy the code

regular

public class Solution {

    public boolean match(char[] str, char[] pattern)

    
{

        return new String(str).matches(new String(pattern));

    }

}

Copy the code