Problem (Interview Question 17.23. Max Black Square)

https://leetcode-cn.com/probl…

Problem description

Given a square matrix in which each cell (pixel) is either black or white, design an algorithm to find the largest subsquare whose four borders consist entirely of black pixels. Return an array [r, c, size], where r and c are the row and column of the subsquare's upper-left corner and size is its side length. If more than one subsquare qualifies, return the one with the smallest r; if there is still a tie, return the one with the smallest c. If no such subsquare exists, return an empty array.

Example 1:

Input: [[1,0,1],[0,0,1],[0,0,1]]
Output: [1,0,2]
Explanation: in the input, 0 represents black and 1 represents white; the bold elements form the largest qualifying subsquare.

Example 2:

Input: [[0,1,1],[1,0,1],[1,1,0]]
Output: [0,0,1]

Note: matrix.length == matrix[0].length <= 200

Prerequisite knowledge

  • Dynamic programming

Companies

  • None

Approach

Looking at the data range first: the matrix is at most $200 \times 200$, so brute force should be the intended approach; with this data range roughly $N^3$ operations will pass, where N is the side length of the matrix. As I mentioned in a previous article, $200^3$ is exactly 8 million, and going much beyond 10 million can easily exceed the time limit.

At first glance this looks the same as 221. Maximal Square, but it is not. Here the square may be hollow: only the four borders need to be all zeros, and the interior can contain anything, which makes it a completely different problem.

In the picture below, the red part is the answer. We only have to make sure all the border cells are 0; it does not matter if there are 1s inside.

We might as well start locally and see whether that opens up a line of attack.

This is a common technique: when you face a hard problem and do not know where to start, draw a picture and begin with special, local cases to get your thinking going.

Say we want to compute the largest black square whose bottom-right corner is the red cell in the figure below. From the current cell we can extend upward and leftward until we hit a non-zero cell.

In the example above, it is not hard to see that the side of its largest black square is at most min(4, 5).
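A minimal sketch of this probing, with illustrative helper names of my own; as in the problem, 0 represents black:

```python
def extend_up(matrix, r, c):
    # count consecutive zeros from (r, c) going up, including (r, c) itself
    count = 0
    while r >= 0 and matrix[r][c] == 0:
        count += 1
        r -= 1
    return count

def extend_left(matrix, r, c):
    # count consecutive zeros from (r, c) going left, including (r, c) itself
    count = 0
    while c >= 0 and matrix[r][c] == 0:
        count += 1
        c -= 1
    return count

grid = [[1, 0, 1],
        [0, 0, 1],
        [0, 0, 1]]
# from cell (2, 1): 3 zeros upward, 2 zeros leftward, so any black square
# with its bottom-right corner there has side at most min(3, 2) = 2
bound = min(extend_up(grid, 2, 1), extend_left(grid, 2, 1))
```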

So is the answer simply 4? It is for this case, but not in general. For example:

So the upper bound of the solution space is 4, but the lower bound may still be 1.

The lower bound is 1 because we only consider cells with value 0, and in the worst case a 0 cell forms a square of side 1 by itself.

With this, I have narrowed the algorithm down to a brute-force search: enumerate the whole solution space, pruning where possible.

To put it bluntly:

  • Does side length 1 work?
  • Does side length 2 work?
  • Does side length 3 work?
  • Does side length 4 work?
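In code form, that enumeration is just a loop over candidate sizes. Here `forms_square` is a hypothetical predicate that checks whether one side length yields an all-black border:

```python
def largest_feasible(bound, forms_square):
    # try side lengths 1..bound and keep the largest that works; note we
    # test every candidate, since a larger border can be complete even
    # when a smaller one is broken
    best = 0
    for size in range(1, bound + 1):
        if forms_square(size):
            best = size
    return best
```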

That covers the special case.

Once you understand the special case, the general case is not difficult.

Algorithm description:

  1. Scan the matrix once from left to right and from top to bottom.
  2. If a cell's value is 0, probe upward and leftward until the first non-zero cell; the upper bound for the side of the largest square at this cell is min(length extended upward, length extended leftward).
  3. Try the side lengths in [1, upper bound] one by one, and keep the largest one that actually forms a black-bordered square at the current cell.
  4. Maintain the maximum during the scan, and finally return the maximum together with its top-left corner.
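The four steps can be put together into a minimal brute-force sketch ($O(N^4)$, no memoization yet); the function names are mine, and 0 represents black as in the problem:

```python
def find_square_bruteforce(matrix):
    # direct translation of steps 1-4 above
    n = len(matrix)

    def run(r, c, dr, dc):
        # length of the zero run starting at (r, c) in direction (dr, dc)
        length = 0
        while 0 <= r < n and 0 <= c < n and matrix[r][c] == 0:
            length += 1
            r, c = r + dr, c + dc
        return length

    ans = []
    for i in range(n):
        for j in range(n):
            if matrix[i][j] != 0:
                continue
            # step 2: upper bound from the up/left extensions
            bound = min(run(i, j, -1, 0), run(i, j, 0, -1))
            # step 3: try every side length up to the bound
            for size in range(1, bound + 1):
                # the top edge must extend left, the left edge must extend up
                if (run(i - size + 1, j, 0, -1) >= size
                        and run(i, j - size + 1, -1, 0) >= size):
                    # step 4: keep the max; strict > plus the scan order
                    # yields the smallest r, then the smallest c, on ties
                    if not ans or size > ans[2]:
                        ans = [i - size + 1, j - size + 1, size]
    return ans
```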

Now the hard part is step 3: how to try each side length in [1, upper bound]. In fact this is not difficult; we only need to:

  • Walk up along the right edge, checking how far each cell extends to the left
  • Walk left along the bottom edge, checking how far each cell extends upward

Take a look at the picture below to make sense of it.

If a side length works, we keep pushing our luck with larger ones until it fails or we reach the upper bound.

Next, analyze the time complexity of the algorithm.

  • For each zero cell, probing upward and leftward costs O(N) in the worst case, where N is the side length.
  • Verifying one candidate side length also costs O(N) (walking its far edges), and there can be up to O(N) candidates per cell.

Since we run this logic for up to $O(N^2)$ cells, the total time complexity is $O(N^4)$.

In fact, each cell's extension is an independent subproblem, so we can memoize (for example with a hash table) the extension result of every cell, which brings the complexity down to $O(N^3)$.

For example, in the probing process above, if the extension results of the cell above and the cell to the left have already been computed, we can just use them, and this part of the probing drops to $O(1)$. It is then easy to see that the current cell depends on the cells above and to the left of it, so scanning the matrix from left to right and top to bottom is the right order: when we reach a cell, the results above and to the left are already available.

  1. For cell (4,5), look at the cell directly above it; if that cell is 1, the upward extension of (4,5) is just 1 (itself), and we stop.
  2. If the cell above is 0, query the memo for its upward extension.
  3. Add 1 to the memoized result to get how far (4,5) can extend upward.

For example, suppose we want to compute the side of the largest square with coordinate (4,5) as its bottom-right corner. The first step is to probe up to (3,5); once there, we do not need to continue upward, we just read the memo: the number of zeros ending at (4,5) going up is the number ending at (3,5) plus 1.
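Putting that recurrence into tables, here is a sketch of how the two memos are filled in a single pass (1-indexed with a padding row and column so no bounds checks are needed; 0 represents black):

```python
def build_extensions(matrix):
    n = len(matrix)
    # up[i][j]:   consecutive zeros ending at matrix cell (i-1, j-1) going up
    # left[i][j]: consecutive zeros ending at matrix cell (i-1, j-1) going left
    up = [[0] * (n + 1) for _ in range(n + 1)]
    left = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if matrix[i - 1][j - 1] == 0:
                up[i][j] = up[i - 1][j] + 1      # result above, plus itself
                left[i][j] = left[i][j - 1] + 1  # result to the left, plus itself
    return up, left

up, left = build_extensions([[1, 0, 1],
                             [0, 0, 1],
                             [0, 0, 1]])
```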

One last question: which data structure supports this query in $O(1)$ time? A hash map works; so does an array.

  • The advantage of a hash map is that you do not need to allocate the space up front. The disadvantage is that with large inputs, hash collision handling may cause a timeout; for example, Stone Game solved with a hash table can easily time out.
  • The pros and cons of arrays are roughly the opposite. An array needs its size specified in advance, but there are no collisions and no hash key to compute, so it often performs better. Furthermore, memory-contiguous structures such as arrays are CPU-cache-friendly, so at the same complexity they run faster; hash tables, which use linked lists or trees internally, are not as cache-friendly.

In summary, I recommend that you use arrays for storage.
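For intuition, a quick sketch of the two memo layouts (the values here are made up for illustration); both give $O(1)$ lookups:

```python
# hash-map style: no size needed up front, but keys must be hashed
memo_dict = {}
memo_dict[(3, 5)] = 2  # illustrative value

# array style: size fixed at allocation, contiguous in memory
n = 6
memo_arr = [[0] * n for _ in range(n)]
memo_arr[3][5] = 2

# same answer either way; the array avoids hashing and collision handling
assert memo_dict[(3, 5)] == memo_arr[3][5]
```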

That is pretty much it. This is dynamic-programming optimization; there is nothing magical about it: much of the time it is just brute-force enumeration plus memoization.

Code

Code support: Java, Python

Java Code:

class Solution {
    public int[] findSquare(int[][] matrix) {
        int[] res = new int[0];
        // dp[0][i][j]: consecutive zeros ending at (i-1, j-1) going up
        // dp[1][i][j]: consecutive zeros ending at (i-1, j-1) going left
        int[][][] dp = new int[2][matrix.length + 1][matrix[0].length + 1];
        int max = 0;
        for (int i = 1; i <= matrix.length; i++) {
            for (int j = 1; j <= matrix[0].length; j++) {
                if (matrix[i - 1][j - 1] == 0) {
                    dp[0][i][j] = dp[0][i - 1][j] + 1;
                    dp[1][i][j] = dp[1][i][j - 1] + 1;
                    int bound = Math.min(dp[0][i][j], dp[1][i][j]);
                    for (int k = 0; k < bound; k++) {
                        // top edge extends left, left edge extends up
                        if (dp[1][i - k][j] >= k + 1 && dp[0][i][j - k] >= k + 1) {
                            if (k + 1 > max) {
                                res = new int[3];
                                max = k + 1;
                                res[0] = i - k - 1;
                                res[1] = j - k - 1;
                                res[2] = max;
                            }
                        }
                    }
                }
            }
        }
        return res;
    }
}

Python Code:

class Solution:
    def findSquare(self, matrix: List[List[int]]) -> List[int]:
        n = len(matrix)
        # dp[i][j][0]: consecutive zeros ending at (i-1, j-1) going up
        # dp[i][j][1]: consecutive zeros ending at (i-1, j-1) going left
        dp = [[[0, 0] for _ in range(n + 1)] for _ in range(n + 1)]
        ans = []
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if matrix[i - 1][j - 1] == 0:
                    dp[i][j][0] = dp[i - 1][j][0] + 1
                    dp[i][j][1] = dp[i][j - 1][1] + 1
                    upper = min(dp[i][j][0], dp[i][j][1])
                    for k in range(upper):
                        # top edge extends left, left edge extends up
                        if min(dp[i - k][j][1], dp[i][j - k][0]) >= k + 1:
                            if not ans or k + 1 > ans[2]:
                                ans = [i - k - 1, j - k - 1, k + 1]

        return ans

Complexity analysis

  • Time complexity: $O(N^3)$, where N is the length of the matrix side.
  • Space complexity: the bottleneck is the memo, which has the same size as the matrix, so the space complexity is $O(N^2)$, where N is the length of the matrix side.

That's all for this article. If you have any thoughts on it, please leave me a message; I will read all of them when I have time. For more algorithm patterns, visit my LeetCode solutions repository: https://github.com/azl3979858… . It already has 38K stars. You can also follow my WeChat public account "力扣加加" and let it take you through the hard parts of algorithms.