KMP

Off-topic: JUST learned KMP algorithm, a face of meng forced, but after a day of thinking, searching for information, manual drawing simulation of what is finally clear (may be, in fact, I have no bottom), here to write an analysis to sort out the idea.

First, what is the KMP algorithm and some basic concepts. This is a string matching algorithm that optimizes the one-by-one comparison method of violence, making it much less time complex (I can’t do time complex… KMP is the initials of three inventors.

Then some basic concepts:

1. S [] is a pattern string, that is, a relatively long string. 2. P [] is a template string, i.e., a short string. (This may not be rigorous…) 3. “Nontrivial prefix” : refers to all the header combinations of a string except the last character. 4. Nontrivial suffixes: all tail combinations of a string except the first character. 5. “Partial match value” : the length of the longest element shared by the prefix and suffix. 6. Next [] is the “partial matching value table”, that is, the Next array, which stores the “partial matching value” corresponding to each subscript, and is the core of KMP algorithm. (More on that later).

The idea: Instead of moving the P string back one bit at each mismatch, you move the P string back to a position where it can match the previous part next time, thus skipping most of the mismatch steps. The number of steps each p string moves is determined by looking in the next[] array.

Second, the meaning of the next array and manual simulation (specific method and code in the following)

P [1, next[j]] = p[j-next [j] + 1, j]



Manually simulate the next array:

The p = “abcab”

The next [1] : prefix = empty set — — suffix = empty set — — next [1] = 0;

The next [2] : prefix = {a} — — suffix = {b} — — next [2] = 0;

For the next [3] : prefix = {a, ab} — — suffix = {c, BC} — –, next [3] = 0;

For the next [4] : prefix = {a, ab, ABC} — — suffix = {a. ca, bca} — –, next [4] = 1;

For next[5] : prefix = {a, AB, ABC, abca}———— suffix = {b, ab, cab, bcab}————next[5] = 2;

Three, matching ideas and implementation code

KMP consists of two main steps: find the next array and match the string. Personally feel that the matching operation is easy to understand some, confused me all day is the idea of the next array. So let’s start with matching characters.

The s string and the P string both start at 1. I starts at 1, j starts at 0, and each time S [I] and P [j + 1] are compared

When the matching process reaches the figure above,

s[ a , b ] = p[ 1, j ] && s[ i ] ! = p[j + 1] = p[j + 1] = p[j + 1]

Where 1 is [1, next[j]], 3 is [j-next [j] + 1, j]. We know from the match that 1 is equal to 3, and 3 is equal to 2. So you just move the p string from 1 to 3. This operation can be done directly by j = next[j]. And so on and so forth, when j is equal to m, there is a match.

code

for(let i = 1, j = 0; i <= n; i++)
{
    while(j && s[i] ! = p[j+1]) j = ne[j];
    // if j has elements corresponding to p string, and s[I]! = p[j+1], then mismatch, move p string
    // We use while because we can still mismatch after moving, so we keep moving until the match or the whole p string moves behind (j = 0).

    if(s[i] == p[j+1]) j++;
    // The current element matches and j moves to the next digit of the p string
    if(j == m)
    {
        // If the match is successful, perform related operations
        j = next[j];  // Continue to match the next substring}} Note: Using the above matching method (I and j+1Comparison)Copy the code

Fourth, the next array of ideas and implementation code

The next array is evaluated by matching the template string to itself (the code is almost identical to the matching operation).

for(let i = 2, j = 0; i <= m; i++)
{
    while(j && p[i] ! = p[j+1]) j = next[j];

    if(p[i] == p[j+1]) j++;

    next[i] = j;
}
Copy the code

The code is almost the same as the code for the match operation, except that each time I is moved, the length that I has matched before I is recorded in the next array. Reference: algorithm basic course problem solution

Knowledge about front, KMP algorithm reference string matching, prefixes and suffixes of explanation, the author: nguyen other blog.csdn.net/maotianwang…