This is the 12th day of my participation in Gwen Challenge

Interval DP

There is a kind of DP problem that looks like a linear array, but if you look at it carefully, you’ll find that you’re dealing with subintervals. Let’s still solve an example using the 6-step problem-solving method.

Example: Perturb strings

The title

It is possible to perturb string S to produce string T using the algorithm described below.

Step 1. If the length of the string is 1, the algorithm stops.

Step 2. If the string length is greater than 1, perform the following steps:

Splits a string into two non-empty substrings at a random subscript. That is, if we know the string S, we can split it into two substrings x and y, such that s = x + y.

Decide randomly whether to “swap two substrings” or “keep the order of the two substrings unchanged”. That is, after this step, s could be either s = x + y or s = y + x.

Step 3. Continue recursive execution of the algorithm from Step 1 on the two substrings x and y.

There are two strings s1 and s2 of equal length. Check if S2 is the perturb of S1. If so, return true; Otherwise, return false.

Input: s1 = “great”, s2 = “rgeat”

Output: true,

Explanation: To obtain s2 from S1, perform the following operations:

Analysis of the

A. true/false B. true/false C. true/false D. true Because they don’t tell you how to do it.

Therefore, you can think of using the 6-step problem solving method instead of immediately following the problem idea and starting to slice strings.

1. Last step

Let’s look at the last step. Given two strings s1, s2, can s1 get s2 after the last step? (Assuming that s1 has length N, with subscripts starting at 0)

Let’s analyze that s1 can be operated in the following steps:

1) At position P, s1 is segmented into x = [0… p] and y = [p + 1, N);

2) Then, whether s1 = x + y and S1 = y + x can be spelt as S2.

P ranges from 0 to n-2, so there are n-1 options.

However, this operation does not reduce the size of the data to be processed and cannot divide a big problem into more small problems, that is to say, we only find an equivalent problem of the original problem.

That means we got that last step wrong! A rethink is needed. If s1 is the perturb of S2, then in the last step, there are two cases (only the parts with the same color need to be judged) :

Case 1. Cut both s1 and s2 in half, where s1 = x + y and s2 = c + d, then we only need to make sure that x is the disturbing string of C and y is the disturbing string of D.

Case 2: s1 = x + y, s2 = c + d, x = d, y = C

We found that the size of the original problem was greatly reduced after the last step was identified.

2. The problem of child

The original problem was:

Determine if s1 is the perturb of S2.

After the last step of processing, the problem to be handled is split into two cases:

Elseif s1 = x + y, s2 = c + d

Elseif s1 = x + y, s2 = c + d, <x, d>, < y, c>

Here, we can express the original problem as follows:

F (s1, s2) = true, s1 is a perturb of S2, false is not.

Then the last step can be expressed as:

F (s1, s2) = f (x, c) && f (y, d) | | f (x, d) && f (y, c) in which s1 = x + y, s2 = c + d

3. Recursive relations

We can express this recursive relation in pseudocode as follows:

boolean isScramble(s1, s2) {
  N = len(s1)
  f(s1, s2) = false;
  for cutPos in range(0, N) {
    // The first method of segmentation
    // s1 = x + y
    x = s1[0:cutPos]
    y = s1[cutPos+1, N)
    // s2 = c + d
    c = s2[0:cutPos]
    d = s2[cutPos+1,N)
    // see if the first one satisfies the condition
    f(s1,s2) = f(x,c) && f(y,d);
    if (f(s1,s2)) break;
    // The second way to split
    c = s2[0:N-cutPos-1]
    d = s2[N-cutPos:N)
    // see if the second one satisfies the condition
    f(s1, s2) = f(x,d) && f(y, c);
    if (f(s1,s2)) break; }}Copy the code

4. Representation of f(x)

And then we need to look at the representation of f of s1, s2.

If it were simpler, we could of course use HashMap<String,HashMap<String, Boolean>> double hash functions (a bit harder, s1+”#”+s2 as key, and then just one hash function).

But for this problem, there’s a better way to say it. <x, y> and <c, d> are substrings of the original string.

    // s1 = x + y
    x = s1[0:cutPos]
    y = s1[cutPos+1, N)
    // s2 = c + d
    c = s2[0:cutPos]
    d = s2[cutPos+1,N)
Copy the code

Then we can use the starting position and the ending position to express:

f(s1, s2) = f([s1_start, s1_end], [s2_start, s2_end])
Copy the code

At this point, we can assume that the message has become F (s1_start, s1_end, s2_start, s2_end). This 4-dimensional information can be processed by creating a 4-dimensional array. For example, dp[s1_start][s1_end][s2_start][s2_end].

However, if s1 string is a disturbance string of S2 string, then the length of the two should be equal. Therefore, the following relationship should be satisfied:

s1_end – s1_start = s2_end – s2_start

That is, two strings are always the same length, so we can compress 4-dimensional information into 3-dimensional information:

s1_start

s2_start

length
Copy the code

Because s1_end = s1_start + length and s2_end = s2_start + length. In other words, 3-dimensional information is exactly equivalent to 4-dimensional information. Dp [s1_start][s2_start][length] = dp[s2_start]

5. Initial conditions and boundaries

Despite the recursion in step 3, we soon find that there are a few items that need to be dealt with in advance, otherwise we can’t compute the results.

Both are empty strings: s1 = empty, s2 = empty. (In this case, it is given that no empty string will appear).

Both strings are 1 in length: Len (s1) = 1 and len(s2) = 1.

There are also operations that can be handled in advance, such as s1 and S2 having different string lengths. This directly returns False, because the perturbing string requires the two strings to be equal in length.

6. Compute the order

In our initial conditions, we have dealt with the case of a substring of length 1 with length 0 (an empty string). If you take two more steps, you should calculate whether any substring of length 2 is a perturbed string. And then you compute length 3, length 4, all the way to length N.

And I’m sure that when we get to length N, we’ll get to the final solution.

The complete code

After the previous 6 steps of DP analysis, you should now be able to write the following code:

boolean isScramble(String s1, String s2) {
  final int s1len = s1 == null ? 0 : s1.length();
  final int s2len = s2 == null ? 0 : s2.length();
  if(s1len ! = slen2) {return false;
  }
  // N indicates the length of the following string
  final int N = s1len;
  boolean[][][] dp = new boolean[N + 1][N + 1][N + 1];
  // The initial condition is length 1
  for (int s1start = 0; s1start < N; s1start++) {
    for (int s2start = 0; s2start < N; s2start++) {
      dp[s1start][s2start][1] = s1.charAt(s1start) == s2.charAt(s2start); }}// Then use the recursion formula to calculate other lengths
  // String interception, here we use the open and close principle [start, end]
  // In other words, end can be N.
  for (int len = 2; len <= N; len++) {
    for (int s1start = 0; s1start + len <= N; s1start++) {
      for (int s2start = 0; s2start + len <= N; s2start++) {
        // Now we need to compute two substrings
        // X = s1[s1start, s1start+len)
        // Y = s2[s2start, s2start+len)
        // Whether these two substrings are disturbed strings
        // So according to the recursion formula, we need to find the segmentation point
        // Cut the string
        // X is divided into two parts: X = a + b
        // Y = c + d
        // The length on the left is leftLen and the length on the right is len-leftlen
        for (int leftLen = 1; leftLen < len && ! dp[s1start][s2start][len]; leftLen++) {// The first type of segmentation, case 1
          // X = a + b, Y = c + d
          // [s1start, s1start + leftLen) <- a
          // [s2start, s2start + leftLen) <- c
          // [s1start + leftLen, s1start + len) <- b
          // [s2start + leftLen, s2start + len) <- d
          boolean c1 =
            // <a, c>
            dp[s1start][s2start][leftLen] &&
            // <b, d>
            dp[s1start + leftLen][s2start + leftLen][len - leftLen];
          // The second type of segmentation
          // X = a + b, Y = c + d
          // [s1start, s1start + leftLen) <- a
          // [s2start + len - leftLen, s2start + len) <- d
          // [s1start + leftLen, s1start + len) <- b
          // [s2start, s2start + len - leftLen) <- c
          boolean c2 =
            // <a, d>
            dp[s1start][s2start + len - leftLen][leftLen] &&
            // <b, c>dp[s1start + leftLen][s2start][len - leftLen]; dp[s1start][s2start][len] = c1 || c2; }}}}return dp[0] [0][N];
}
Copy the code

Complexity analysis: Time complexity O(N4), space complexity O(N3).

summary

In addition to the 6 steps of DP, the main points of the passage are as follows:

Correct understanding and selection of the last step — to ensure that the subproblem obtained in the last step is contracted;

Optimization of the space of hash functions – we derived step by step from hash functions to 4-dimensional arrays and finally to 3-dimensional arrays.