Topic: 1910 | | 28

Given two strings s and part, repeat the following for S until all substring parts are removed: Find the leftmost substring part in S and remove it from s. Return the rest of the string after deleting all the part substrings from S. A substring is a sequential sequence of characters in a string.

Train of thought

This type of String lookup has many uses in the Java String class, using indexof to find values, finding matching values to replace (or intercept, etc.), and recursing over time.

Code implementation

public static String removeOccurrences(String s, String part) {
    int pos = s.indexOf(part);
    while(pos ! = -1){
        s = s.replaceFirst(part,"");  // Substring can also be used
        pos = s.indexOf(part);
    }
    return s;
}
Copy the code

thinking

The time complexity and performance of this method need to be improved

Indexof:/** source: original string sourceOffset: original string search start value sourceCount: original string length target: matched string targetOffset: configured string start value targetCount: FromIndex: start value */
Copy the code

Indexof first finds the position equal to the first character of s, then matches each character of part one by one, with a direct return that does not match.

static int indexOf(String s,String part) {
    char[] source = s.toCharArray();
    char[] target = part.toCharArray();
    if (0 == source.length) {
        return (target.length == 0 ? 0 : -1);
    }
    if (part.length() == 0) {
        return 0;
    }
    char first = target[0];
    int max = source.length - target.length;
    for (int i = 0; i <= max; i++) {
        /* Look for first character. */
        if(source[i] ! = first) {while(++i <= max && source[i] ! = first); }/* Found first character, now look at the rest of v2 */
        if (i <= max) {
            int j = i + 1;
            int end = j + target.length - 1;
            for (int k =  1; j < end && source[j]
                    == target[k]; j++, k++);

            if (j == end) {
                /* Found whole string. */
                returni; }}}return -1;
}
Copy the code

Suppose there is a string abcda to abcdeabsda to see if a match exists

First match:

The following matching process:

In the search process, it was found that there was no match when index was matched to 4 for the first time, and index:1 was backtracked for the second time. This match was obviously repeated. Could we find a way to get the place where J jumped from the matched result, instead of backtracking to the original matched position? So there’s the KMP algorithm, invented by knuth-Morris-Pratt.

Example: Suppose there is a string S as abcdabeabcdabda, and there is a string part that needs to be checked in S to see if it exists.

First match

The idea of the KMP algorithm is to try to use this known information, not to move the “search location” back to the location where it has been compared, but to move it further back, thus increasing efficiency. So KMP keeps a record of where the move array needs to be used for part moves. How to calculate the array value is explained later.

Second match

If e and D are not known to match, the first six characters “ABCDAB” match. The “partial match value” of the last matching character B is 2, so use the following formula to calculate the number of digits moving backwards:

Move digit = Number of matched characters – corresponding partial match value

Since 6-2 equals 4, move the search term back 4 bits.

Third match

The position we need to move is 2 minus 0 = 2

Fourth match

The array j and k are not equal, they’re all moved back one bit

Matches k to the last character and returns (the value of j-k is needed to find the result).

Code implementation

public static int kmp(String haystack, String needle) {
    char[] hay = haystack.toCharArray();
    char[] need = needle.toCharArray();
    int i = 0,j=0;
    int[] next = getNext(needle);
    while(i<hay.length && j<need.length){
        if(j == -1 || hay[i] == need[j]){
            i++;
            j++;
        }else{ j = next[j]; }}if(needle.length() == j){
        return i-j;
    }else{
        return -1; }}Copy the code

Now, how do I get the next array

Get the Next array

When you build the Next array, there was a little bit of a question at first, right?

question-1Is the next array still related to the original string s?Copy the code

Answer: Irrelevant, only the part string, because the idea of KMP is that the position j that S moves does not go back to the previous position, but only moves k in the part string.

question-2The :part string needs to be moved by the formula: Move digit = number of matched characters - corresponding partial match value. So how do I get the matching value of this.Copy the code

Answer: The corresponding partial matching value is the next array. To get the next array, we can see where we want to jump to, as in the previous example:The position that s and Part already match is 6 bits (ABCDAB) and the position that needs to be moved is 4 bits (ABCD). The length of the maximum common prefix suffix in ABCDAB is 2.

Suffix: a string concatenated from the last character but not the first

In other words, the maximum length table of the common elements of each prefix suffix corresponding to the original pattern string substring is:

When we pass the above matching, we find that k matches the last character d, the common matching number is 2, that is, the calculation only matches (abcdab), so the array of next needs to be moved back one bit.

We now know that the next array can be retrieved from the common length of the prefix and suffix, but what about the code?

String: abcdabd start p[j]! = p[k], let’s look at a simpler character: aaabac

When P[k] == P[j],

Next [j] == next[j] +1 =k+1 Next [j] == k

When P [k]! = P [j],

Coordinate k k should pass back to the previous shorter matching substring to and j, k are used when the most stupid way before all of the existence of the substring to match, but considering the next array, the meaning of k corresponding to the next [k] k corresponding to the characters of the substring before the largest the length of the same prefix and suffix, is directly k left will be moved to the next [k] position, Continue to match j

This string looks misleading. Suppose the s string is ababcd and the part is abac.

The next match becomes j compared to next[k]

To sum up:

When p[j] = p[k], next[j]=k, j,k move back p[j]! = p[k], k=next[k], and then continue matching, all the way back to the original positionCopy the code

Code implementation

public static int[] getNext(String need) {
    char[] p = need.toCharArray();
    int[] next = new int[p.length];
    next[0] = -1;
    int k = -1;
    int j = 0;
    while (j < p.length - 1) {
        //p[k] indicates prefix, p[j] indicates suffix
        if (k == -1 || p[k] == p[j]) {
            Next [j] +1 == next[j] +1 =k+1
            k++;
            j++;
            next[j] = k;
        } else{ k = next[k]; }}return next;
}
Copy the code

Completion code:

public static int kmp(String haystack, String needle) {
    char[] hay = haystack.toCharArray();
    char[] need = needle.toCharArray();
    int i = 0,j=0;
    int[] next = getNext(needle);
    while(i<hay.length && j<need.length){
        if(j == -1 || hay[i] == need[j]){
            i++;
            j++;
        }else{ j = next[j]; }}if(needle.length() == j){
        return i-j;
    }else{
        return -1;
    }
}

public static int[] getNext(String need) {
    char[] p = need.toCharArray();
    int[] next = new int[p.length];
    next[0] = -1;
    int k = -1;
    int j = 0;
    while (j < p.length - 1) {
        //p[k] indicates prefix, p[j] indicates suffix
        if (k == -1 || p[k] == p[j]) {
            Next [j] +1 == next[j] +1 =k+1
            k++;
            j++;
            next[j] = k;
        } else{ k = next[k]; }}return next;
}
Copy the code

Git:gitee.com/zhwgit/leet…

Link: leetcode-cn.com/problems/re…

Leetcode-cn.com/problems/im…

Reference: 1. 2. www.cnblogs.com/zhangboy/p/…

3. www.ruanyifeng.com/blog/2013/0…