Just a little digression

Last time I ran a small poll on my official account: "Vote for the programming language you'd like the solutions written in". Here are the findings:

For the rest, it’s mostly Go.

Since Java and Python together account for more than 60% of the votes, this time I tried to write the solutions in both languages. Thanks to @captainz for the Java code. To keep the article from getting bloated and long, I put all of the code for this article (Java and Python) on the companion website, at: https://leetcode-solution.cn/…

If you don't have a VPN, the site may be slow to open.

Main text

Hello, I'm Lucifer. Today I'm going to walk you through the topic "Heap". First, the outline of this article, a mind map I drew with MindMap; I will keep refining it, and gradually flesh out the other topics as well.

You can also open the source file with the VS Code plugin Blink-Mind and browse it; there are some notes in it that you can click through and read. You can get the mind map's source file from my official account "力扣加加 (LeetCode Plus)"; the mind map will keep gaining content over time. The VS Code plugin address:
https://marketplace.visualstu…

This series covers the following topics:

  • I've finished solving almost all the linked-list problems, and I discovered these things…
  • I've finished solving almost all the tree problems, and I discovered these things…
  • I've finished solving almost all the stack problems, and I discovered these things… (Part One)

<!-- more -->

This is the second part. If you haven't read the first part yet, I strongly recommend reading it first: "I've finished solving almost all the heap problems, and I discovered these things… (Part One)".

This is the second part, and it is more practical: three techniques and four applications. These two topics are designed specifically to teach you how to solve problems. With them in hand, most heap problems on LeetCode are easy to crack (of course, I'm only talking about the heap problems on LeetCode).

Warning: most of the problems in this article are hard difficulty on LeetCode. That is because, as discussed earlier, many of the heap problems are hard.

Note

Before the main course, let’s have an appetizer.

Two concepts are introduced here: tuples, and simulating a max-heap. I'm covering them up front just in case you're not familiar with them.

Tuples

A heap can store more than just single values. For example, 1, 2, 3 and 4 in [1,2,3,4] are all single values. Besides single values, a heap can also store compound values, such as objects or tuples.

Here we introduce storing tuples, a technique that will be used extensively later, so be sure to master it. For example: [(1,2,3), (4,5,6), (2,1,3), (4,2,8)].

h = [(1, 2, 3), (4, 5, 6), (2, 1, 3), (4, 2, 8)]
heapq.heapify(h)

heapq.heappop(h)  # (1, 2, 3)
heapq.heappop(h)  # (2, 1, 3)
heapq.heappop(h)  # (4, 2, 8)
heapq.heappop(h)  # (4, 5, 6)

The heap structure can be graphically represented as follows:

A brief explanation of the results of the above code execution.

By default, the first value of the tuple is used as the key for comparison. If the first values are equal, the second values are compared, and so on. For example, for (4,5,6) and (4,2,8) above, the first values are equal, so the second values are compared; since 5 is larger than 2, (4,2,8) leaves the heap first.

Using this technique does two things:

  1. Carrying extra information. Say I want to find the k-th smallest value in a two-dimensional matrix; the value is naturally the comparison key, but the process also needs the row and column information, so a tuple such as (val, row, col) fits well.
  2. Sorting by two keys, a primary key and a secondary key. There are two typical uses:

    2.1 Both keys in the same order, e.g., both ascending or both descending.

    2.2 The two keys in different orders, i.e., one descending and the other ascending.

Due to space constraints, I won't expand on these here. Pay attention to them as you practice; I may write a separate article on this when I get the chance.
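Still, a minimal illustration of case 2.2 (my own toy example, not from the original): negate the secondary key when pushing, and restore it when popping, so one min-heap yields ascending order on the first key and descending order on the second.

```python
import heapq

# Ascending on the first key, descending on the second, with one min-heap:
# store (a, -b) instead of (a, b), then negate b back on the way out.
pairs = [(1, 5), (2, 1), (1, 9), (2, 7)]
h = [(a, -b) for a, b in pairs]
heapq.heapify(h)
result = [(a, -nb) for a, nb in [heapq.heappop(h) for _ in range(len(pairs))]]
print(result)  # [(1, 9), (1, 5), (2, 7), (2, 1)]
```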

If your programming language has no heap, or its heap implementation doesn't support tuples, you can make it work with a simple tweak, essentially by customizing the comparison logic.
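For instance, even in Python you can wrap the values in a class that defines its own comparison. This is a sketch of the idea (the class name `Entry` is mine, not from the original):

```python
import heapq

class Entry:
    """Hypothetical wrapper: carries a payload and customizes the comparison
    the heap uses, for values that are not directly comparable."""
    def __init__(self, key, payload):
        self.key = key
        self.payload = payload

    def __lt__(self, other):
        # heapq only needs "less than" to order the heap
        return self.key < other.key

h = []
for key, payload in [(3, "c"), (1, "a"), (2, "b")]:
    heapq.heappush(h, Entry(key, payload))
print(heapq.heappop(h).payload)  # a
```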

Simulating a max-heap

Python has no max-heap, so here I simulate one with a min-heap. Negate every number before you push it: if the original number is 5, push -5 instead. After this treatment, the min-heap can serve as a max-heap. Just remember to negate the value again when you pop it, to restore the original.

Code example:

h = []
A = [1, 2, 3, 4, 5]
for a in A:
    heapq.heappush(h, -a)

-1 * heapq.heappop(h)  # 5
-1 * heapq.heappop(h)  # 4
-1 * heapq.heappop(h)  # 3
-1 * heapq.heappop(h)  # 2
-1 * heapq.heappop(h)  # 1

Graphic representation would look like this:

So much for setting the stage. Now let’s get down to business.

Three techniques

Technique 1 – Fixed-size heap

This technique means keeping the size of the heap fixed at k, which in code is achieved by pushing one new element for every one popped. Since the heap starts out empty, we first push elements one by one until the heap reaches size k, so strictly speaking we keep the heap size no larger than k.

A typical use of a fixed-size heap is finding the k-th smallest number. The simplest way to find the k-th smallest number is to build a min-heap, push all the numbers into it, and then pop k times; the last number popped is the k-th smallest.

However, instead of pushing everything first, we can build a max-heap (note: not a min-heap) and keep its size at k. If the heap's size exceeds k after a new number is pushed, we compare the top of the heap with the new number and remove the larger of the two. This guarantees that the heap holds the smallest k numbers seen so far, and the largest of those k, the top of the heap, is exactly the k-th smallest. That is why we build a max-heap rather than a min-heap.

To sum it up in one sentence: a fixed-size max-heap of size k quickly finds the k-th smallest number, while a fixed-size min-heap of size k quickly finds the k-th largest number. For example, the third problem of the 2020-02-24 weekly contest, 5663 "Find the Kth Largest XOR Coordinate Value", can be solved with the fixed min-heap technique (that problem asks for the k-th largest).
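Here is a minimal sketch of the fixed-size heap idea (the helper name `kth_largest` is my own): keep a min-heap of size at most k while scanning; whatever survives at the top is the k-th largest.

```python
import heapq

def kth_largest(nums, k):
    """Fixed min-heap of size k: after the scan, the heap holds the k
    largest numbers, and the top of the heap is the k-th largest."""
    h = []
    for num in nums:
        heapq.heappush(h, num)
        if len(h) > k:        # keep the heap size no larger than k
            heapq.heappop(h)  # discard the smallest; it cannot be in the top k
    return h[0]

print(kth_largest([3, 1, 5, 12, 2, 11], 3))  # 5
```

Flipping the signs of the stored values turns the same skeleton into the k-th smallest variant.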

That may not sound very impressive yet, so let me give you two examples to help it sink in.

295. Find Median from Data Stream

Problem description
The median is the middle value of an ordered list. If the list length is even, the median is the mean of the two middle values. For example, the median of [2,3,4] is 3, and the median of [2,3] is (2 + 3) / 2 = 2.5.
Design a data structure that supports the following two operations:
void addNum(int num) - add an integer from the data stream to the data structure.
double findMedian() - return the median of all elements so far.
Example:
addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2
Follow up:
1. If all integers in the data stream are in the range [0, 100], how would you optimize your solution?
2. If 99% of the integers in the data stream are in the range [0, 100], how would you optimize your solution?
Approach

This is actually a special case of finding the k-th smallest number.

  • If the list length n is odd, then k = (n + 1) / 2, and the median is the k-th number. For example, if n is 5, then k = (5 + 1) / 2 = 3.
  • If the list length n is even, then the median is the average of the k1-th and k2-th numbers, where k1 = (n + 1) / 2 (integer division) and k2 = k1 + 1. For example, if n is 6, then k1 = 3 and k2 = 4.

So we can maintain two fixed-size heaps of sizes $\lfloor (n + 1) \div 2 \rfloor$ and $n - \lfloor (n + 1) \div 2 \rfloor$; in other words, the two heap sizes differ by at most 1. More precisely, $0 \le \lfloor (n + 1) \div 2 \rfloor - (n - \lfloor (n + 1) \div 2 \rfloor) \le 1$.

Based on the knowledge mentioned above, we can:

  • Build a max-heap that stores the smallest $\lfloor (n + 1) \div 2 \rfloor$ numbers, so that its top is the largest of the smallest $\lfloor (n + 1) \div 2 \rfloor$ numbers, which is the median in the odd case.
  • Build a min-heap that stores the largest $n - \lfloor (n + 1) \div 2 \rfloor$ numbers, so that its top is the smallest of those; combined with the top of the max-heap, it gives us the median in the even case.

Armed with this knowledge, all that remains is how to maintain the size of the two heaps.

  • If the max-heap has fewer elements than the min-heap, move the smallest element of the min-heap into the max-heap. Since the min-heap maintains the largest numbers and the max-heap the smallest numbers, the top of the min-heap is always greater than or equal to the top of the max-heap, and the two heap tops together give the median in the even case.
  • If the max-heap has two more elements than the min-heap, move the largest element of the max-heap into the min-heap. The reasoning is the same as above.

At this point, you can probably see why we build two separate heaps, one max-heap and one min-heap: the reasons are exactly as described above.

Fixed-size heaps have more uses than this; let's do another problem.

Code
class MedianFinder:
    def __init__(self):
        self.min_heap = []
        self.max_heap = []
    def addNum(self, num: int) -> None:
        if not self.max_heap or num < -self.max_heap[0]:
            heapq.heappush(self.max_heap, -num)
        else:
            heapq.heappush(self.min_heap, num)
        if len(self.max_heap) > len(self.min_heap) + 1:
            heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))
        elif len(self.min_heap) > len(self.max_heap):
            heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))
    def findMedian(self) -> float:
        if len(self.min_heap) == len(self.max_heap): return (self.min_heap[0] - self.max_heap[0]) / 2
        return -self.max_heap[0]

1.3.1 (code)

857. Minimum Cost to Hire K Workers

Problem description
There are N workers. The i-th worker has work quality quality[i] and minimum wage expectation wage[i]. We want to hire exactly K workers to form a paid group, and must pay them according to the following rules:
1. Every worker in the group is paid in proportion to their quality compared with the other workers in the group.
2. Every worker in the group must receive at least their minimum wage expectation.
Return the least amount of money needed to form such a group.
Example 1: Input: quality = [10,20,5], wage = [70,50,30], K = 2. Output: 105.00000. Explanation: we pay worker 0 a wage of 70 and worker 2 a wage of 35.
Example 2: Input: quality = [3,1,10,10,1], wage = [4,8,2,2,7], K = 3. Output: 30.66667. Explanation: we pay worker 0 a wage of 4, and workers 2 and 3 a wage of 13.33333 each.
Note: 1 <= K <= N <= 10000, where N = quality.length = wage.length; 1 <= quality[i] <= 10000; 1 <= wage[i] <= 10000; answers within 10^-5 of the correct value are accepted.
Approach

The problem asks us to choose K workers, pay each in proportion to their quality relative to the others in the group, and ensure every worker in the group receives at least their minimum expected wage.

In other words, within the same group of k workers, the ratio of pay to quality must be one fixed value for everyone, and to pay the least we want that ratio as low as possible. Please make sure you understand this sentence first; everything below builds on this premise.

Let's define an efficiency metric for each worker, equal to q / w. As mentioned above, all k workers in a group are paid at the same rate, and to guarantee the minimum wages, that rate must be determined by the lowest q / w in the group (its weakest link); otherwise someone would receive less than their minimum expected wage.
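As a quick sanity check of this reasoning (my own arithmetic, not from the original): in example 1, hiring workers 0 and 2 at the group rate set by the lowest q / w reproduces the expected total of 105.

```python
# Example 1: quality = [10,20,5], wage = [70,50,30], K = 2; hire workers 0 and 2.
quality, wage = [10, 20, 5], [70, 50, 30]
group = [0, 2]
# The group is paid at the lowest q/w among its members: min(10/70, 5/30) = 1/7.
rate = min(quality[i] / wage[i] for i in group)
# Each worker i is paid quality[i] / rate, i.e., 70 and 35.
total = sum(quality[i] / rate for i in group)
print(round(total, 5))  # 105.0
```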

So we can write the following code:

class Solution:
    def mincostToHireWorkers(self, quality: List[int], wage: List[int], K: int) -> float:
        eff = [(q / w, q, w) for q, w in zip(quality, wage)]
        eff.sort(key=lambda a: -a[0])
        ans = float('inf')
        for i in range(K - 1, len(eff)):
            h = []
            k = K - 1
            rate, _, total = eff[i]
            # Pick the K - 1 lowest wages among the workers at least as efficient
            # as worker i. Since eff is sorted in descending order, those are the
            # workers before index i; use a heap to take the lowest wages.
            for j in range(i):
                heapq.heappush(h, eff[j][1] / rate)
            while k > 0:
                total += heapq.heappop(h)
                k -= 1
            ans = min(ans, total)
        return ans

1.3.2 (code)

This approach pushes a large number of elements and pops only k of them, which doesn't take good advantage of the heap's dynamic nature; it only uses its extremum property.

A better approach is to use the fixed heap technique.

Here's another way to think about it. We pick k workers, take the lowest efficiency among them as the group rate, use that rate to compute the total wage, and look for the lowest total. Therefore we can fix a max-heap of size k over the workers' qualities and, through the same kind of maintenance as in the previous problem, ensure the heap always holds the k smallest qualities among the candidates considered so far.

The previous solution stored triples (q/w, q, w), which is actually unnecessary: knowing two of the values determines the third, so storing two is enough. Since the comparison key should be the efficiency ratio, we keep it plus either q or w; here I chose q, i.e., we store (q/w, q) tuples.

Specifically, the total pay is $\displaystyle \sum_{n=1}^{k}{q}_{n} \div rate$, where rate is the current q/w, which is also the minimum q/w among the k chosen workers.

Code
class Solution:
    def mincostToHireWorkers(self, quality: List[int], wage: List[int], K: int) -> float:
        effs = [(q / w, q) for q, w in zip(quality, wage)]
        effs.sort(key=lambda a: -a[0])
        ans = float('inf')
        h = []
        total = 0
        for rate, q in effs:
            heapq.heappush(h, -q)  # negate q to simulate a max-heap
            total += q
            if len(h) > K:
                total += heapq.heappop(h)  # popped value is negative: drops the largest quality
            if len(h) == K:
                ans = min(ans, total / rate)
        return ans

1.3.3 (code)

Technique 2 – Multi-way merge

This technique was actually mentioned earlier when we talked about super ugly numbers, but we didn’t give this type of problem a name.

In fact, "multi-pointer optimization" might be a more fitting name for this technique, but that name is too plain and too easy to confuse with the two-pointer technique, so I gave it a distinctive name: multi-way merge.

  • "Multi-way" refers to there being multiple candidate routes. In code, we represent them with multiple pointers.
  • "Merge" refers to the fact that the result may be the longest or shortest of the candidate routes, or perhaps the k-th one. So we need to compare the candidates' results and, according to the problem statement, discard or select one or more routes.

This description is rather abstract, so let me give you a few examples to help you understand.
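First, the pattern in its purest form: merging k sorted lists with one pointer per list (a sketch of my own; Python's built-in `heapq.merge` does essentially the same thing):

```python
import heapq

def merge_k_sorted(lists):
    """Multi-way merge: one pointer per list; the heap always yields the
    smallest current candidate across all lists."""
    # seed the heap with (value, list index, element index) for each non-empty list
    h = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(h)
    out = []
    while h:
        val, i, j = heapq.heappop(h)
        out.append(val)
        if j + 1 < len(lists[i]):  # advance this list's pointer
            heapq.heappush(h, (lists[i][j + 1], i, j + 1))
    return out

print(merge_k_sorted([[1, 4, 7], [2, 5], [3, 6, 8]]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```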

I've prepared four hard-difficulty problems for you here. Master this pattern and you can happily go AC all four of them.

1439. Find the Kth Smallest Sum of a Matrix With Sorted Rows

Problem description
You are given an m x n matrix mat and an integer k. Every row of the matrix is sorted in non-decreasing order. You may pick exactly one element from each row to form an array. Return the k-th smallest array sum among all possible arrays.
Example 1: Input: mat = [[1,3,11],[2,4,6]], k = 5. Output: 7. Explanation: picking one element per row, the first k smallest arrays are [1,2], [1,4], [3,2], [3,4], [1,6]; the 5th sum is 7.
Example 2: Input: mat = [[1,3,11],[2,4,6]], k = 9. Output: 17.
Example 3: Input: mat = [[1,10,10],[1,4,5],[2,3,6]], k = 7. Output: 9. Explanation: picking one element per row, the first k smallest arrays are [1,1,2], [1,1,3], [1,4,2], [1,4,3], [1,1,6], [1,5,2], [1,5,3]; the 7th sum is 9.
Example 4: Input: mat = [[1,1,10],[2,2,9]], k = 7. Output: 12.
Constraints: m == mat.length; n == mat[i].length; 1 <= m, n <= 40; 1 <= k <= min(200, n^m); 1 <= mat[i][j] <= 5000; mat[i] is a non-decreasing array.
Approach

The problem gives us m one-dimensional arrays of the same length and asks us to pick one number from each of the m arrays, so m numbers in total, such that their sum is the k-th smallest among all possible picks.

A simple idea is to use multiple pointers. Specifically, we use m pointers, one per one-dimensional array; each pointer's position indicates which element of that array is currently picked.

Take mat = [[1,3,11],[2,4,6]], k = 5 as an example.

  • We initialize two pointers, p1 and p2, pointing at the start of the two one-dimensional arrays. In code, all pointers are initialized to 0.
  • The sum under the two pointers is 1 + 2 = 3, which is the smallest sum.
  • Next, we move one of the pointers. We can move either p1 or p2.
  • The second smallest sum is therefore the smaller of the two sums obtained by moving p1 versus moving p2. Here, moving either p1 or p2 gives 5, so both the second and the third smallest sums are 5.

At this point the search has branched into two states (the bold numbers mark the pointer positions):

  1. [1, **3**, 11], [**2**, 4, 6], with sum 5
  2. [**1**, 3, 11], [2, **4**, 6], with sum 5

Next, both states proceed in parallel.

For state 1, the next move again splits into two cases:

  1. [1, 3, **11**], [**2**, 4, 6], with sum 13
  2. [1, **3**, 11], [2, **4**, 6], with sum 7

For state 2, the next move likewise splits into two cases:

  1. [**1**, 3, 11], [2, 4, **6**], with sum 7
  2. [1, **3**, 11], [2, **4**, 6], with sum 7

Comparing these four cases, we conclude that the 4th, 5th, and 6th smallest sums are all 7. But the 7th smallest is not necessarily 13: by similar reasoning, the 7th smallest might be hidden in the new states produced by splitting the 7s above, and indeed it is. So we have to keep executing the logic above.

Further, we can extend the above idea to the general case.

As mentioned, we want the k-th smallest sum. The smallest sum is easy to compute: it is the sum of the first element of every one-dimensional array. We then observe that the second smallest can be derived from the smallest by moving exactly one pointer, which splits the state into m new candidates, one per array; the second smallest is the minimum over those candidates. Later steps work the same way: each time we pick the smallest sum among all current candidates and split it further. It is not hard to see that the extremum changes after every split, a clear sign of a dynamic extremum problem, and using a heap is a great fit.

So how should the code be written?

As I said, we need to initialize m pointers, all set to 0. The corresponding pseudocode:

h = []
# sum(vec[0] for vec in mat) is the sum of the first element of each of the m arrays.
# [0] * m initializes a pointer array of length m, filled with zeros.
cur = (sum(vec[0] for vec in mat), [0] * m)
heapq.heappush(h, cur)

Next, each time we move one pointer, splitting off a new branch. Each pop from the heap yields the current smallest sum; popping k times gives the k-th smallest. Pseudocode:

for 1 to k:
    # acc is the current sum, pointers holds the positions of the m pointers
    acc, pointers = heapq.heappop(h)
    # Move one pointer of the pointer array at a time. Every move forks a new
    # branch; there are up to m possible moves, one per array.
    for i, pointer in enumerate(pointers):
        if pointer != len(mat[0]) - 1:
            new_pointers = copy(pointers)
            new_pointers[i] += 1
            heapq.heappush(h, (acc + mat[i][pointer + 1] - mat[i][pointer], new_pointers))

This is the core code for all multi-way merge problems, so keep it in mind.

The code looks like a lot, but it’s actually only seven lines without comments.

There is a problem with the pseudocode above. Say we have two one-dimensional arrays with pointers initialized to zero. If we first move the pointer of the first array and then the pointer of the second, the pointer array becomes [1, 1], i.e., both pointers point at index 1. But if we first move the pointer of the second array and then the pointer of the first, the pointer array is also [1, 1]. If left unchecked, the same state gets counted twice, producing a wrong answer.

One possible solution is to record every pointer state in a hash set, so the same state is never processed more than once. To do this we make a small adjustment to the pointer array: use tuples instead of arrays, because arrays (lists) can't be hashed directly. Please refer to the code section for details.

Multi-way merge problems are all quite similar in both idea and code. Please make sure you thoroughly understand this problem; the later ones won't be analyzed in such detail.

Code
class Solution:
    def kthSmallest(self, mat, k: int) -> int:
        h = []
        cur = (sum(vec[0] for vec in mat), tuple([0] * len(mat)))
        heapq.heappush(h, cur)
        seen = {cur[1]}
        for _ in range(k):
            acc, pointers = heapq.heappop(h)
            for i, pointer in enumerate(pointers):
                if pointer != len(mat[0]) - 1:
                    t = list(pointers)
                    t[i] = pointer + 1
                    tt = tuple(t)
                    if tt not in seen:
                        seen.add(tt)
                        heapq.heappush(h, (acc + mat[i][pointer + 1] - mat[i][pointer], tt))
        return acc

1.3.4 (code)

719. Find K-th Smallest Pair Distance

Problem description
Given an integer array, return the k-th smallest distance among all pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.
Example 1: Input: nums = [1,3,1], k = 1. Output: 0. Explanation: all the pairs are (1,3) -> 2, (1,1) -> 0, (3,1) -> 2, so the 1st smallest pair distance is 0, from the pair (1,1).
Constraints: 2 <= len(nums) <= 10000; 0 <= nums[i] < 1000000; 1 <= k <= len(nums) * (len(nums) - 1) / 2.
Approach

There are $C_n^2 = n \times (n - 1) \div 2$ pairs in total.

So we can use a double loop to enumerate all the pairs, sort them in ascending order, and take the k-th one.

In fact, we can use the fixed-size heap technique to maintain a max-heap of size k, such that the element at the top of the heap is the k-th smallest; this was explained in the fixed-heap section and won't be repeated.

class Solution:
    def smallestDistancePair(self, nums: List[int], k: int) -> int:
        h = []
        for i in range(len(nums)):
            for j in range(i + 1, len(nums)):
                a, b = nums[i], nums[j]
                # heap is full and the new distance is smaller than the current k-th smallest
                if len(h) == k and -abs(a - b) > h[0]:
                    heapq.heappop(h)
                if len(h) < k:
                    heapq.heappush(h, -abs(a - b))
        return -h[0]

1.3.5 (code)

However, this optimization doesn't help much, because the bottleneck of the algorithm is the $N^2$ enumeration, and that is what we should try to optimize.

If we sort the array, the smallest pair distance must be one of the adjacent gaps nums[i] - nums[i - 1], where i is an integer from 1 to n - 1; which i it is depends on which gap is smallest. Then the multi-way merge idea from above applies.

Say nums[i] - nums[i - 1] is the smallest distance. Then the second smallest must be among the remaining adjacent gaps plus the new candidate split off from nums[i] - nums[i - 1]. As for how to split: just as above, we simply advance the second index from i to i + 1. The pointer array here has a fixed length of 2, not m as in the previous problem; I've named the two pointers fr and to, for FROM and TO.

Code
class Solution(object):
    def smallestDistancePair(self, nums, k):
        nums.sort()
        # n - 1 initial candidates: all adjacent gaps
        h = [(nums[i + 1] - nums[i], i, i + 1) for i in range(len(nums) - 1)]
        heapq.heapify(h)
        for _ in range(k):
            diff, fr, to = heapq.heappop(h)
            if to + 1 < len(nums):
                heapq.heappush(h, (nums[to + 1] - nums[fr], fr, to + 1))
        return diff

1.3.6 (code)

Since the time complexity is related to k, and k can be on the order of $N^2$ at most, this method actually times out as well. But it validates the idea, and with a slight change to the problem you might well be able to use it.

This problem can also be solved with binary search; since that deviates from the heap theme, it is only discussed briefly here.

When you see "the k-th smallest number", think of heaps and binary search. Binary search applies because finding the k-th smallest number is essentially finding a number such that exactly k - 1 numbers are smaller than it, and this count is very often monotonic, which is exactly what binary search needs.

In this problem, the largest pair distance is the maximum of the array minus its minimum, which we might call max_diff. We can ask:

  • How many pairs have a distance less than max_diff?
  • How many pairs have a distance less than max_diff - 1?
  • How many pairs have a distance less than max_diff - 2?
  • How many pairs have a distance less than max_diff - 3?
  • How many pairs have a distance less than max_diff - 4?
  • …

And we know the answers to these questions are monotonically non-increasing, which should bring binary search to mind. We keep asking until we find an x with exactly k - 1 pairs less than it. But asking the question this way is problematic, for two reasons:

  1. There may be more than one x for which exactly k - 1 pairs have a smaller distance.
  2. We can't even be sure such an x exists. For example, if the pair distances are [1,1,1,1,2] and we're asked for the 3rd smallest, there is no x with exactly two distances less than it.

So we adjust the question: find the smallest x such that at least k pairs have a distance less than or equal to x. This is exactly the leftmost variant of binary search. For the leftmost template, please refer to my binary search article.
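For reference, the leftmost template in sketch form (the predicate name `ok` is my own): find the smallest x in [l, r] for which a monotone condition holds.

```python
def leftmost(l, r, ok):
    """Smallest x in [l, r] with ok(x) true; assumes ok is monotone
    (false ... false true ... true). Returns r + 1 if no x qualifies."""
    while l <= r:
        mid = (l + r) // 2
        if ok(mid):
            r = mid - 1   # mid works; try to go further left
        else:
            l = mid + 1
    return l

# e.g., the smallest x in [0, 100] with x * x >= 60 is 8
print(leftmost(0, 100, lambda x: x * x >= 60))  # 8
```

In the problem at hand, `ok(mid)` would be "at least k pairs have a distance <= mid".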

Code:

class Solution:
    def smallestDistancePair(self, A: List[int], K: int) -> int:
        A.sort()
        l, r = 0, A[-1] - A[0]

        def count_ngt(mid):
            # count pairs whose distance is not greater than mid (sliding window)
            slow = 0
            ans = 0
            for fast in range(len(A)):
                while A[fast] - A[slow] > mid:
                    slow += 1
                ans += fast - slow
            return ans

        while l <= r:
            mid = (l + r) // 2
            if count_ngt(mid) >= K:
                r = mid - 1
            else:
                l = mid + 1
        return l

1.3.7 (code)

632. Smallest Range Covering Elements from K Lists

Problem description
You have k lists of integers sorted in non-decreasing order. Find the smallest range that includes at least one number from each of the k lists. We define the range [a,b] to be smaller than [c,d] if b - a < d - c, or if a < c when b - a == d - c.
Example 1: Input: nums = [[4,10,15,24,26],[0,9,12,20],[5,18,22,30]]. Output: [20,24]. Explanation: list 1: [4,10,15,24,26], 24 is in [20,24]; list 2: [0,9,12,20], 20 is in [20,24]; list 3: [5,18,22,30], 22 is in [20,24].
Example 2: Input: nums = [[1,2,3],[1,2,3],[1,2,3]]. Output: [1,1].
Example 3: Input: nums = [[10,10],[11,11]]. Output: [10,11].
Example 4: Input: nums = [[10],[11]]. Output: [10,11].
Example 5: Input: nums = [[1],[2],[3],[4],[5],[6],[7]]. Output: [1,7].
Constraints: nums.length == k; 1 <= k <= 3500; 1 <= nums[i].length <= 50; -10^5 <= nums[i][j] <= 10^5; nums[i] is sorted in non-decreasing order.
Approach

In essence, this problem asks us to take one number from each of the k one-dimensional arrays to form a new array A, such that the difference (diff) between A's maximum and minimum values is smallest.

This problem is a little similar to the previous one, with a twist: this one is a matrix, while the previous one was a one-dimensional array. But we can treat the two-dimensional matrix like the one-dimensional case and reuse the same idea.

The previous idea relied on the fact that, after sorting, the smallest diff occurs between adjacent elements. Here we can't sort the two-dimensional array as a whole, and even if we did, it isn't clear what principle we would sort it by.

But we can reuse the idea from the last two problems. Specifically: use a min-heap to obtain the minimum of the currently picked values, and track their maximum with a variable; the diff then follows. Each time a pointer is advanced, a new diff is produced; repeat the process while maintaining the global minimum diff.

The precondition of this algorithm is that all k lists are sorted in ascending order, the same ascending-order principle as in the problem above. Once sorted, we maintain one pointer per list and apply the idea above.

Take nums = [[1,2,3],[1,2,3],[1,2,3]] as an example:

  • [1, 2, 3]
  • [1, 2, 3]
  • [1, 2, 3]

We first pick the minimum of every row, i.e., [1,1,1]; here the diff is 0, the current maximum is 1, and the minimum is 1. Next, we keep examining the remaining candidates to see whether a better choice exists.

The next candidate may come from case 1:

  • [1, 2, 3]
  • [1, 2, 3]
  • [1, **2**, 3] (move this row's pointer one step, from index 0 to index 1)

Or case 2:

  • [1, 2, 3]
  • [1, **2**, 3] (move this row's pointer one step, from index 0 to index 1)
  • [1, 2, 3]

…

These cases then split into further cases, just as in the problems above, so I won't repeat the details.

Code
class Solution:
    def smallestRange(self, matrix: List[List[int]]) -> List[int]:
        l, r = -10**9, 10**9
        # push the smallest element of every row into the heap, recording row and column
        h = [(row[0], i, 0) for i, row in enumerate(matrix)]
        heapq.heapify(h)
        # the maximum among the currently picked elements
        max_v = max(row[0] for row in matrix)
        while True:
            min_v, row, col = heapq.heappop(h)
            # max_v - min_v is the current diff; update the answer if it is smaller
            if max_v - min_v < r - l:
                l, r = min_v, max_v
            if col == len(matrix[row]) - 1:
                return [l, r]
            # advance this row's pointer
            heapq.heappush(h, (matrix[row][col + 1], row, col + 1))
            max_v = max(max_v, matrix[row][col + 1])

1.3.8 (code)

1675. Minimize Deviation in Array

Problem description
You are given an array nums of n positive integers. You can perform two types of operations on any element of the array any number of times: if the element is even, divide it by 2 (for example, if the array is [1,2,3,4], you can apply this to the last element to get [1,2,3,2]); if the element is odd, multiply it by 2 (for example, if the array is [1,2,3,4], you can apply this to the first element to get [2,2,3,4]). The deviation of the array is the maximum difference between any two elements. Return the minimum deviation the array can have after performing some number of operations.
Example 1: Input: nums = [1,2,3,4]. Output: 1. Explanation: you can transform the array to [1,2,3,2], then to [2,2,3,2]; the deviation is 3 - 2 = 1.
Example 2: Input: nums = [4,1,5,20,3]. Output: 3. Explanation: after two operations you can transform the array to [4,2,5,5,3]; the deviation is 5 - 2 = 3.
Example 3: Input: nums = [2,10,8]. Output: 3.
Constraints: n == nums.length; 2 <= n <= 10^5; 1 <= nums[i] <= 10^9.
Approach

The problem says we can perform any number of operations on each element, but in fact the number of meaningful operations is finite:

  • An odd number can only be doubled once, because after doubling it becomes even.
  • An even number can be halved repeatedly until it becomes odd, which is clearly a finite number of operations.

Take [1,2,3,4] from the problem. We can:

  • turn 1 into 2;
  • turn 2 into 1;
  • turn 3 into 6 (or leave it as 3);
  • turn 4 into 2 or 1.

Graphic representation would look like this:

Isn’t this equivalent to finding the smallest range covering one number from each row of a two-dimensional array such as [[1,2], [1,2], [3,6], [1,2,4]]? Isn’t that exactly the problem above?
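The mapping above can be sketched as a small helper (the name `candidates` is mine, not from the original solution):

```python
def candidates(num):
    # All values an element can be turned into, in increasing order.
    # An odd number can only be doubled once; an even number can be
    # halved repeatedly until it becomes odd.
    if num % 2 == 1:
        return [num, num * 2]
    row = []
    while num % 2 == 0:
        row.append(num)
        num //= 2
    row.append(num)  # the final odd value
    return row[::-1]

print(candidates(1))  # [1, 2]
print(candidates(4))  # [1, 2, 4]
print(candidates(3))  # [3, 6]
```

Feeding every element of [1,2,3,4] through it yields exactly the two-dimensional array above.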

Here I simply encapsulate the solution of the problem above as an API and call it; see the code for details.

code
class Solution:
    def smallestRange(self, martrix: List[List[int]]) -> List[int]:
        l, r = -10**9, 10**9
        h = [(row[0], i, 0) for i, row in enumerate(martrix)]
        heapq.heapify(h)
        max_v = max(row[0] for row in martrix)

        while True:
            min_v, row, col = heapq.heappop(h)
            if max_v - min_v < r - l:
                l, r = min_v, max_v
            if col == len(martrix[row]) - 1:
                return [l, r]
            heapq.heappush(h, (martrix[row][col + 1], row, col + 1))
            max_v = max(max_v, martrix[row][col + 1])

    def minimumDeviation(self, nums: List[int]) -> int:
        matrix = [[] for _ in range(len(nums))]
        for i, num in enumerate(nums):
            if num & 1 == 1:
                matrix[i] += [num, num * 2]
            else:
                temp = []
                while num and num & 1 == 0:
                    temp += [num]
                    num //= 2
                temp += [num]
                matrix[i] += temp[::-1]
        a, b = self.smallestRange(matrix)
        return b - a

(code 1.3.9)

Tip # 3 – hindsight

The trick is that when you go from left to right, you don’t know what’s on the right until you get to the right.

If you want to know what’s on the right-hand side, one easy way is to traverse twice: on the first pass, record the data; on the second pass, use the data recorded on the first. This is the approach we use most. Sometimes, though, we can traverse to a given element and then look back, storing data as we go, so only one traversal is needed. Concretely: collect data as you go from left to right, and pick from the collection whenever it is needed. If what we pick each time is a maximum or minimum, and that extremum changes dynamically, a heap can speed things up. Intuitively, it is like using a time machine to go back in time, achieving the purpose of hindsight.

If you don’t yet know what that means, it doesn’t matter. Let’s go through a couple of examples; after you’ve seen them, come back and reread this passage.

871. Minimum Number of Refueling Stops

Problem description
A car travels from a starting position to a destination, which is target miles east of the start. Along the way there are gas stations: each stations[i] represents a station that is stations[i][0] miles east of the start and holds stations[i][1] liters of gas. Assume the car's tank is infinite and initially holds startFuel liters; the car uses one liter of gas per mile. When the car reaches a station, it may stop and refuel, transferring all of that station's gas into the car. What is the least number of refueling stops the car must make to reach its destination? If it cannot reach the destination, return -1. Note: if the car arrives at a station with zero fuel left, it can still refuel there; if it arrives at the destination with zero fuel left, it still counts as having arrived. Example 1: Input: target = 1, startFuel = 1, stations = [] Output: 0 Explanation: We can reach the destination without refueling. Example 2: Input: target = 100, startFuel = 1, stations = [[10,100]] Output: -1 Explanation: We can't even reach the first station. Example 3: Input: target = 100, startFuel = 10, stations = [[10,60],[20,30],[30,30],[60,40]] Output: 2 Explanation: We start with 10 liters and drive to the station 10 miles away, using 10 liters of fuel, then refuel from 0 to 60 liters. Then we drive from mile 10 to mile 60 (using 50 liters) and refuel from 10 to 50 liters. Then we drive to the destination. We stopped at 2 of the stations along the way, so we return 2. Tip:  1 <= target, startFuel, stations[i][1] <= 10^9 0 <= stations.length <= 500 0 < stations[0][0] < stations[1][0] < ... < stations[stations.length-1][0] < target
Train of thought

To minimize the number of refuels, we would ideally not refuel at all. So when must we refuel? The answer: when, without refueling, we cannot reach the next stop.

The pseudocode description is:

cur = startFuel  # current fuel
last = 0         # position of the previous station
for i, fuel in stations:
    cur -= i - last  # the distance between two adjacent stations is i - last
    if cur < 0:
        # We must have refueled at some earlier station, otherwise we couldn't get here.
        # But at which station? Intuition says to be greedy: pick the station
        # with the most gas. If that is still not enough, keep picking the next
        # largest station until there is no gas left to take or cur > 0.

It says to choose the station with the most gas; if that is not enough, continue with the station with the second-most gas. This dynamic-extremum scenario is exactly where a heap shines.

Specifically, it is:

  • Each time the car passes a station, push that station's fuel amount onto the heap.
  • Drive as far as possible; keep driving as long as the fuel is not less than 0.
  • If the fuel drops below 0, take the largest fuel amount from the heap and add it to the tank; if the fuel is still below 0, continue taking the largest from the heap.
  • If the fuel is at least 0 after refueling, continue driving and repeat the previous step; otherwise return -1, meaning the destination cannot be reached.

So how does this algorithm reflect hindsight? Put yourself into the problem and simulate it. Imagine you are the driver, and your goal is the minimum number of refuels. When you arrive at a stop, you never know whether your fuel is enough to reach the next one, and even if it isn't, it might have been better to fill up at an earlier stop. So in reality you can't really know whether you should fill up at the current station: there is too little information.

So what would I do? I would just keep driving without refueling until I ran out of gas, and only then think: "if only I had filled up at some earlier stop". If "going back" and refueling that way gets us to the destination, then those are exactly the stops that reach the destination with the fewest refuels. Why didn't I say so earlier? Isn't that being wise after the event?

This hindsight is reflected in the fact that we wait until we run out of gas, and only then think about which earlier stop we should have filled up at.

So what hindsight essentially solves is this: we cannot get the optimal solution from the current information alone; we have to gather all the information and then look back. Concretely, we can iterate over the stations, recording the fuel at each station into an array, and every time we "foresee" that we can't reach the next station, we take the largest values from this array until the fuel is non-negative. Based on this, we can use a heap instead of an array to optimize taking the maximum.

code
class Solution:
    def minRefuelStops(self, target: int, startFuel: int, stations: List[List[int]]) -> int:
        stations += [(target, 0)]
        cur = startFuel
        ans = 0

        h = []
        last = 0
        for i, fuel in stations:
            cur -= i - last
            while cur < 0 and h:
                cur -= heapq.heappop(h)
                ans += 1
            if cur < 0:
                return -1
            heapq.heappush(h, -fuel)

            last = i
        return ans

(code 1.3.10)

1488. Avoid Flood in The City

Problem description
Your country has an infinite number of lakes, all initially empty. When it rains over the nth lake, if the nth lake is empty it fills with water; otherwise that lake floods. Your goal is to avoid floods in any lake. You are given an integer array rains where: rains[i] > 0 means that on day i it rains over lake rains[i]; rains[i] == 0 means that on day i there is no rain and you can choose one lake and drain it. Return an array ans where: ans.length == rains.length; ans[i] == -1 if rains[i] > 0; ans[i] is the lake you choose to drain on day i if rains[i] == 0. If there are multiple valid answers, return any of them; if there is no way to prevent a flood, return an empty array. Note that if you choose to drain a full lake, it becomes empty, but if you choose to drain an empty lake, nothing happens (see Example 4). Example 1: Input: rains = [1,2,3,4] Output: [-1,-1,-1,-1] Explanation: After the first day, the full lakes are [1]; after the second day, [1,2]; after the third day, [1,2,3]; after the fourth day, [1,2,3,4]. There is no day on which you can drain any lake, and no lake floods. Example 2: Input: rains = [1,2,0,0,2,1] Output: [-1,-1,2,1,-1,-1] Explanation: After the first day, the full lakes are [1]; after the second day, [1,2]. On the third day we drain lake 2, leaving [1] full; on the fourth day we drain lake 1, so no lake is full for the moment. After the fifth day, the full lakes are [2]; after the sixth day, [1,2]. As you can see, no flood occurs under this plan. [-1,-1,1,2,-1,-1] is another feasible flood-free answer. Example 3: Input: rains = [1,2,0,1,2] Output: [] Explanation: After the second day, the full lakes are [1,2].
We can drain a lake on the third day, but after that it rains over lakes 1 and 2 again, so no matter which lake we drain on the third day, the other one floods. Example 4: Input: rains = [69,0,0,0,69] Output: [-1,69,1,1,-1] Explanation: Any answer of the form [-1,69,x,y,-1], [-1,x,69,y,-1] or [-1,x,y,69,-1], with 1 <= x, y <= 10^9, is feasible. Example 5: Input: rains = [10,20,20] Output: [] Explanation: Since lake 20 gets rain on two consecutive days, there is no way to stop the flood. 1 <= rains.length <= 10^5 0 <= rains[i] <= 10^9
Train of thought

If calling the previous problem "hindsight" was a bit of a stretch, the next two fit the name perfectly.

So the problem says we can drain a lake when it's not raining. If multiple lakes are full, which one should we drain? Obviously, the lake that will next flood the soonest. But in reality there is no way to know which lake it will rain on on which future day; even a weather forecast is not 100% reliable.

But in code we can. We can traverse the rains array to find out which lake gets rain on which day. Armed with this information, we can be wise after the event.

“Today is a beautiful day. I open my eyes: Lake 2 will flood tomorrow, so let's drain it today, otherwise it will flood.”

Like the problem above, we traverse the rains array and simulate the daily changes, not draining any lake even on sunny days. We record the sunny days during the simulation, and when a flood is about to occur, we look back and decide which lake should have been drained on which earlier sunny day. So the hindsight is that we wait until the flood is upon us before thinking about what we should have done on an earlier day.

Algorithm:

  • Traverse rains and simulate the changes day by day.
  • If the current rains value is 0, it's a sunny day; we don't drain any lake yet, but we record the day into a sunny array.
  • If the value is greater than 0, it rains on some lake, so check whether that lake floods, i.e. whether it already had water before the rain. This tells us to use a data structure to record the state of each lake, for example when each lake was last filled; a lake floods when rain falls on a lake that is already full.
  • If a lake is about to flood, go to the sunny array and find a sunny day after the lake was last filled, and drain it on that day so it won't flood.

I didn't use a heap in this problem, and that was deliberate. The point is to show you that hindsight is not unique to the heap; it is really a general algorithmic idea, like traversing backwards. Often, though, hindsight scenarios need to dynamically take maximums and minimums, and that is when a heap should be considered, which brings us back to the central idea at the beginning of the article. So use these techniques flexibly; don't apply them mechanically.

The next problem is an absolute hindsight + heap optimization problem.

code
class Solution:
    def avoidFlood(self, rains: List[int]) -> List[int]:
        # On an unused sunny day, draining lake 1 is always legal
        # (draining an empty lake does nothing).
        ans = [1] * len(rains)
        last = {}    # lake -> the last day it was filled
        sunny = []   # sunny days (in ascending order) not yet used

        for i, rain in enumerate(rains):
            if rain > 0:
                ans[i] = -1
                if rain in last:
                    # The lake is already full: hindsight says we should have
                    # drained it on some sunny day after it was last filled.
                    # bisect is from the Python standard library.
                    j = bisect.bisect_right(sunny, last[rain])
                    if j == len(sunny):
                        return []
                    ans[sunny.pop(j)] = rain
                last[rain] = i
            else:
                sunny.append(i)
        return ans

(code 1.3.11)

1642. Furthest Building You Can Reach

Problem description
You are given an integer array heights representing the heights of buildings, along with some bricks and ladders. You start your journey from building 0 and keep moving to the next building, possibly using bricks or ladders. While moving from building i to building i + 1 (0-indexed): if the current building's height is greater than or equal to the next building's height, you need neither a ladder nor bricks; if the current building is lower than the next, you can use either one ladder or (heights[i + 1] - heights[i]) bricks. Using the given ladders and bricks optimally, return the index of the furthest building you can reach (0-indexed).

Example 1: Input: heights = [4,2,7,6,9,14,12], bricks = 5, ladders = 1 Output: 4 Explanation: Starting from building 0, you can complete the journey with this plan: reach building 1 without bricks or ladders, since 4 >= 2; reach building 2 using 5 bricks (you must use bricks or a ladder, since 2 < 7); reach building 3 without bricks or ladders, since 7 >= 6; reach building 4 using your only ladder (you must use bricks or a ladder, since 6 < 9). You cannot get past building 4, because there are no more bricks or ladders. Example 2: Input: heights = [4,12,2,7,3,18,20,3,19], bricks = 10, ladders = 2 Output: 7 Example 3: Input: heights = [14,3,19,3], bricks = 17, ladders = 0 Output: 3 Constraints:  1 <= heights.length <= 10^5 1 <= heights[i] <= 10^6 0 <= bricks <= 10^9 0 <= ladders <= heights.length
Train of thought

We can see a ladder as an unlimited supply of bricks that can be used only once; naturally we want to spend the ladders where they matter most. Again, in real life there is no way to know in advance when to use a ladder and when to use bricks.

That's okay; we continue with hindsight, and one pass suffices. At first, use the ladders without thinking. When the ladders run out, start being wise after the event: "if only we had used bricks on that earlier climb". When would bricks have been the better choice? Obviously, when the height difference covered by the ladder back then is smaller than the height difference now.

To put it bluntly: I used a ladder to climb a 5-meter gap, and now I face a 10-meter gap with no ladder left, so I have to use 10 bricks. If I had used five bricks back then, I could use the ladder now and save five bricks.

This prompts us to store the height differences of the climbs spanned by ladders; when the ladders are used up, we "exchange" an earlier ladder use for bricks. In the example above, exchanging saves us the 10 bricks and costs 5 bricks, which is equivalent to gaining 5 bricks.

If ladders were used several times before, which use should we "exchange" first? Obviously the one with the largest height difference, so the exchange yields the most bricks. This prompts us each time to pick the largest stored height difference and remove it after the exchange. What data structure fits this dynamic-extremum scenario? A heap, of course.

code
class Solution:
    def furthestBuilding(self, heights: List[int], bricks: int, ladders: int) -> int:
        h = []
        for i in range(1, len(heights)):
            diff = heights[i] - heights[i - 1]
            if diff <= 0:
                continue
            if bricks < diff and ladders > 0:
                ladders -= 1
                if h and -h[0] > diff:
                    bricks -= heapq.heappop(h)
                else:
                    continue
            bricks -= diff
            if bricks < 0:
                return i - 1
            heapq.heappush(h, -diff)
        return len(heights) - 1

(code 1.3.12)

The four applications

Next is the last part of this article “four applications”, the purpose is to help you consolidate the previous knowledge through these several examples.

1. topK

Finding the topK is an important application of the heap. It was actually introduced earlier, in the section on fixed-size heaps.

Here’s a direct quote:

“Actually, the easiest way to find the k-th smallest number is to build a min-heap, put all the numbers into the heap first, and then pop one by one, k times in total. The last number popped is the k-th smallest. However, instead of putting all the numbers into the heap first, we can build a max-heap (note: not a min-heap) and keep its size at k. If the heap's size exceeds k after a new number is pushed, pop the top, which removes the larger number. This guarantees that the numbers in the heap are the smallest k of all numbers seen, and the largest of those k, i.e. the top of the heap, is exactly the k-th smallest. That's why we build a max-heap instead of a min-heap here.”

Besides the k-th smallest number, we can also get the smallest k numbers: the only difference from finding the k-th smallest above is that we need to collect all of the popped numbers along the way, not just the last one.

Note that sometimes the weight is not the value in the original array itself; it can also be a distance, a frequency, and so on.
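The fixed-size max-heap described in the quote above can be sketched like this (the function name `smallest_k` is mine; Python's heapq is a min-heap, so values are negated to simulate a max-heap):

```python
import heapq

def smallest_k(nums, k):
    # Keep a max-heap (negated values) of size at most k.
    # Its top is always the largest of the k smallest seen so far.
    h = []
    for num in nums:
        heapq.heappush(h, -num)
        if len(h) > k:
            heapq.heappop(h)  # evict the current largest
    return sorted(-x for x in h)

print(smallest_k([5, 1, 4, 2, 3], 3))  # [1, 2, 3]
```

The k-th smallest is then just the last element of the returned list.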

Related Topics:

  • Interview Question 17.14. Smallest K Numbers
  • 347. Top K Frequent Elements
  • 973. K Closest Points to Origin

Many of the k-th-style problems on LeetCode can be done with a heap. Besides the heap, some k-th problems have patterns to be found; those can be solved with divide and conquer plus recursion. I won't expand on that here; leave me a message if you're interested.

2. Weighted shortest path

In fact, I mentioned this in the previous section, but only in passing. Quote: "But can BFS really not be implemented with a priority queue? Of course it can! For example, for the shortest path problem on a weighted graph, BFS with a plain queue does not work, but BFS with a priority queue does, because paths differ in weight; isn't that exactly what priority queues were designed for? A typical BFS implementation using a priority queue is Dijkstra's algorithm."

Dijkstra's algorithm mainly solves the shortest-distance problem from one starting point to other points in a graph with non-negative edge weights.

The basic idea of the algorithm is greedy: each time, traverse all neighbors and pick the one with the least distance among them; essentially it is a breadth-first traversal. Here we use the heap data structure to find the point with the smallest cost in $O(logN)$ time, where N is the size of the heap.

Code template:

def dijkstra(graph, start, end):
    # Each heap entry is a tuple (cost, i), meaning
    # "the distance from start to i is cost".
    heap = [(0, start)]
    visited = set()
    while heap:
        (cost, u) = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            return cost
        for v, c in graph[u]:
            if v in visited:
                continue
            next = cost + c
            heapq.heappush(heap, (next, v))
    return -1

(code 1.4.1)

You can see that the code template is basically similar to BFS. If you set the heap key to steps yourself, you can also use it to simulate a plain BFS; this was covered in the previous section and will not be repeated here.

An example might look like this:

E -- 1 --> B -- 1 --> C -- 1 --> D -- 1 --> F
 \                                         /\
  \                                        ||
    -------- 2 ---------> G ------- 1 ------

We use the adjacency matrix to construct:

G = {
    "B": [["C", 1]],
    "C": [["D", 1]],
    "D": [["F", 1]],
    "E": [["B", 1], ["G", 2]],
    "F": [],
    "G": [["F", 1]],
}

shortDistance = dijkstra(G, "E", "C")
print(shortDistance)  # E --> B --> C == 2

With this algorithm template, you can go and AC 743. Network Delay Time.

Complete code:

class Solution:
    def dijkstra(self, graph, start, end):
        heap = [(0, start)]
        visited = set()
        while heap:
            (cost, u) = heapq.heappop(heap)
            if u in visited:
                continue
            visited.add(u)
            if u == end:
                return cost
            for v, c in graph[u]:
                if v in visited:
                    continue
                next = cost + c
                heapq.heappush(heap, (next, v))
        return -1

    def networkDelayTime(self, times: List[List[int]], N: int, K: int) -> int:
        graph = collections.defaultdict(list)
        for fr, to, w in times:
            graph[fr - 1].append((to - 1, w))
        ans = -1
        for to in range(N):
            dist = self.dijkstra(graph, K - 1, to)
            if dist == -1:
                return -1
            ans = max(ans, dist)
        return ans

(code 1.4.2)

Did you get it?

The above algorithm is not optimal; I just want to illustrate the idea of encapsulating Dijkstra as an API and calling it. A better approach is to compute the distance information to all points in a single traversal rather than recomputing each time, which greatly reduces the time complexity. This matters a lot when computing the distances from one point to every point in the graph. What adjustment do we make to the algorithm to achieve this?

Tip: You can use a dist hash table to record the shortest distance from the starting point to each point. If you figure it out, you can verify it with LeetCode problem 882~

In fact, you only need to make a small adjustment, since the adjustment is small, it is better to look directly at the code.

Code:

class Solution:
    def dijkstra(self, graph, start):
        # dist records, for every reached node, the shortest distance from start.
        heap = [(0, start)]
        dist = {}
        while heap:
            (cost, u) = heapq.heappop(heap)
            if u in dist:
                continue
            dist[u] = cost
            for v, c in graph[u]:
                if v in dist:
                    continue
                next = cost + c
                heapq.heappush(heap, (next, v))
        return dist

    def networkDelayTime(self, times: List[List[int]], N: int, K: int) -> int:
        graph = collections.defaultdict(list)
        for fr, to, w in times:
            graph[fr - 1].append((to - 1, w))
        dist = self.dijkstra(graph, K - 1)
        return -1 if len(dist) != N else max(dist.values())

(code 1.4.3)

You can see that we just replaced visited with dist and everything else stays the same. Moreover, dist is just a visited with values attached, so it also plays the role of visited here.

If you need to compute the shortest paths from one node to all the other nodes, you can use dist (a HashMap) to record the shortest paths from the starting point to all the points instead of visited (a HashSet).

There are many similar problems. Let me give you another one: 787. Cheapest Flights Within K Stops. Problem description:

There are n cities connected by m flights. Each flight starts from city u and arrives at city v with price w. Given all the cities and flights, together with a departure city src and a destination dst, your task is to find the cheapest price from src to dst with at most k stops. If there is no such route, output -1. Example 1: Input: n = 3, edges = [[0,1,100],[1,2,100],[0,2,500]] src = 0, dst = 2, k = 1 Output: 200

Explanation: The cheapest price from city 0 to city 2 with at most 1 stop costs 200, as shown in red in the figure of the original problem. Example 2: Input: n = 3, edges = [[0,1,100],[1,2,100],[0,2,500]] src = 0, dst = 2, k = 0 Output: 500

Explanation: The cheapest price from city 0 to city 2 with at most 0 stops costs 500, as shown in blue. Tip: n is in the range [1, 100] and city labels go from 0 to n - 1; the number of flights is in the range [0, n * (n - 1) / 2]; each flight has the format (src, dst, price); the price of each flight is in the range [1, 10000]; k is in the range [0, n - 1]; there are no duplicate flights and no self-loops.

This problem is not fundamentally different from the one above. I'll again encapsulate the routine as an API and use it; see the code for the details.

The only special point of this problem is that exceeding k stops also counts as unreachable. That's easy: we just carry one more field, steps, in the heap tuples. This steps is exactly the distance of an unweighted BFS. If a popped state has steps greater than K, it is invalid, and we simply skip it with continue.

class Solution:
    def dijkstra(self, graph, start, end, K):
        # The heap tuples carry one extra field, steps, which is the
        # number of edges taken -- the "distance" of an unweighted BFS.
        heap = [(0, start, 0)]
        visited = set()
        while heap:
            (cost, u, steps) = heapq.heappop(heap)
            if (u, steps) in visited:
                continue
            visited.add((u, steps))
            if steps > K:
                continue
            if u == end:
                return cost
            for v, c in graph[u]:
                if (v, steps + 1) in visited:
                    continue
                next = cost + c
                heapq.heappush(heap, (next, v, steps + 1))
        return -1

    def findCheapestPrice(self, n: int, flights: List[List[int]], src: int, dst: int, K: int) -> int:
        graph = collections.defaultdict(list)
        for fr, to, price in flights:
            graph[fr].append((to, price))
        return self.dijkstra(graph, src, dst, K + 1)

(code 1.4.4)

3. Factorization

This one applies the two techniques above. I mentioned it earlier, in the section on "313. Super Ugly Number".

To review the definition: an ugly number is a positive integer whose prime factors include only 2, 3 and 5. So an ugly number is essentially a number that, when fully factorized, decomposes into nothing but the factors 2, 3 and 5.
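To make the heap perspective concrete, here is a minimal sketch (the helper name `nth_ugly_heap` is mine) that generates ugly numbers by repeatedly popping the smallest one and pushing its multiples by 2, 3 and 5:

```python
import heapq

def nth_ugly_heap(n):
    # Min-heap of candidate ugly numbers; a set de-duplicates,
    # since e.g. 6 = 2 * 3 = 3 * 2 would otherwise be pushed twice.
    h = [1]
    seen = {1}
    for _ in range(n - 1):
        cur = heapq.heappop(h)
        for f in (2, 3, 5):
            if cur * f not in seen:
                seen.add(cur * f)
                heapq.heappush(h, cur * f)
    return heapq.heappop(h)

print(nth_ugly_heap(10))  # 12 (the sequence starts 1, 2, 3, 4, 5, 6, 8, 9, 10, 12)
```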

There are quite a few ugly-number problems on LeetCode, and most of them can also be solved from a heap perspective. Sometimes, however, the number of factors is limited, and the problem can easily be solved without a heap. For example, 264. Ugly Number II can be solved by keeping three pointers. That technique was also discussed in the previous section and will not be repeated here.
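For comparison, a sketch of the three-pointer approach for 264. Ugly Number II (the helper name `nth_ugly_number` is mine):

```python
def nth_ugly_number(n):
    # dp[i] is the (i+1)-th ugly number; p2/p3/p5 point at the next
    # dp values to be multiplied by 2, 3 and 5 respectively.
    dp = [1] * n
    p2 = p3 = p5 = 0
    for i in range(1, n):
        dp[i] = min(dp[p2] * 2, dp[p3] * 3, dp[p5] * 5)
        if dp[i] == dp[p2] * 2:
            p2 += 1
        if dp[i] == dp[p3] * 3:
            p3 += 1
        if dp[i] == dp[p5] * 5:
            p5 += 1
    return dp[-1]

print(nth_ugly_number(10))  # 12
```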

Some problems are not about ugly numbers, but they explicitly mention factor-like information and ask for the k-th largest or smallest value. In such cases the heap should be the first choice. If the problem contains other information, such as orderedness, binary search may also be considered. Which method to use depends on the specific case, but you need to be familiar with both methods first.

4. Heap sort

All three applications above were mentioned earlier to a greater or lesser extent; heap sort was not.

Few problems test heap sort directly, but interviews may well ask about it, and learning heap sort is important for understanding key algorithmic ideas such as divide and conquer. Personally, I feel that heap sort, building a binary tree, and building a segment tree are all very similar; master one and the others follow by analogy.

In fact, after learning from the previous heaps, we can encapsulate a heap sort in a very simple way.

Here is a simple example of heap sorting using the heap API:

h = [9, 5, 2, 7]
heapq.heapify(h)
ans = []
while h:
    ans.append(heapq.heappop(h))
print(ans)  # [2, 5, 7, 9]

Once you understand the example, it’s not hard to encapsulate it as a generic heap sort.

def heap_sort(h):
    heapq.heapify(h)
    ans = []
    while h:
        ans.append(heapq.heappop(h))
    return ans

This method is simple enough, and if you understood the earlier heap principles, writing a heap sort by hand is not hard. The downside of this approach is that it is not in place, meaning you need extra space to carry the result, which is $O(N)$. Strictly speaking, after the heap-sort call returns, the memory of the original array can be freed, so in theory no space is wasted; but the space complexity we compute is the maximum memory in use at any moment, so an in-place algorithm is undoubtedly better. If you don't like this implementation, you can also do it in place. It isn't hard, just a small modification of the earlier heap implementation; for reasons of space I won't go into it here.
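The in-place modification hinted at above can be sketched as follows: build a max-heap inside the array itself, then repeatedly swap the top with the end of the shrinking unsorted region (a sketch of the standard technique, not code from the original article):

```python
def heap_sort_inplace(nums):
    n = len(nums)

    def sift_down(i, size):
        # Sink nums[i] until both children are no larger than it.
        while 2 * i + 1 < size:
            j = 2 * i + 1
            if j + 1 < size and nums[j + 1] > nums[j]:
                j += 1  # pick the larger child
            if nums[i] >= nums[j]:
                break
            nums[i], nums[j] = nums[j], nums[i]
            i = j

    # Heapify: sift down every non-leaf node, bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # Repeatedly move the current maximum to the end.
    for end in range(n - 1, 0, -1):
        nums[0], nums[end] = nums[end], nums[0]
        sift_down(0, end)
    return nums

print(heap_sort_inplace([9, 5, 2, 7]))  # [2, 5, 7, 9]
```

This sorts with $O(1)$ extra space, at the cost of writing the sift operation by hand instead of relying on the heapq API.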

conclusion

Heaps and queues are inextricably linked. For many problems, my first thought is to use a heap. Then I notice that every value pushed into the heap is exactly the previous value plus 1, rather than jumping by +2, +3 and so on, so a plain queue performs better. Examples are 649. Dota2 Senate and 1654. Minimum Jumps to Reach Home.

There’s only one center of the heap, and that’s the dynamic extremum.

An extremum is just a maximum or a minimum, and that part is not hard to see. If we want the maximum, we can use a max-heap; if we want the minimum, we can use a min-heap. In fact, without the word "dynamic", a heap is often unnecessary: for example, you can simply traverse once to find the largest value. The "dynamic" part is not easy to see, and that is exactly where the difficulty of the problem lies. You need to first analyze the problem and realize that it is a dynamic-extremum problem; only then should heap optimization come to mind.

There are many implementations of heaps, for example skip-list based, array-based binary heaps, and red-black-tree based implementations. Here we introduced the two mainstream implementations and described the binary heap in detail: not only is its implementation simple, it also performs well in many situations, so we recommend focusing on mastering the binary heap implementation.

For the binary heap implementation, the core point is to always maintain the heap property. What property exactly? That the weight of a parent node is not greater than the weight of its children (for a min-heap). To achieve this, we use sift-up and sift-down operations when pushing to and popping from the heap, swapping elements appropriately. Specifically, sifting up swaps a node with its larger parent, and sifting down swaps a node with the smaller of its two children, provided of course that it has children and a child is smaller than it.

We did not analyze heapify in detail, but if you understood the heap operations in this article, it is actually quite easy: heapify itself is just a repeated sifting process, except that it turns operations that are discrete in time into a one-time operation.

In addition, I have introduced the skills of three heaps, respectively:

  • A fixed-size heap can not only solve the k-th problem but also effectively reuse computed results to avoid repeated calculation.
  • Multiway merge is essentially a brute-force solution; there is no essential difference from brute-force recursion. If you rewrite it as recursion, it is recursion without memoization, so it is more like a backtracking algorithm.
  • Hindsight. Some information we cannot use at the moment can be stored in a data structure for later retrieval. The data structure can be many things; common ones are hash tables and heaps. You can also think of this technique as "settling accounts later". Some people are more comfortable with that term; whatever you call it, it means the same thing.

Finally, I introduce four kinds of applications, all of which have been discussed more or less before, except heap sort. They are:

  • topK
  • Weighted shortest path
  • factorization
  • Heap sort

All four applications still revolve around the heap's one central idea, the dynamic extremum; they are just flexible uses of this feature. So when solving problems, just remember "dynamic extremum". If you can figure out that a problem is about dynamic extrema, be sure to consider the heap. Next, run through the complexity in your head; comparing it against the problem's data range gives a rough estimate of whether the approach is feasible.

If you have any opinions on this, please leave me a message; I will check all the answers when I have time. For more algorithm patterns, visit my LeetCode solutions repository: https://github.com/azl3979858… . It already has 39K stars. You can also follow my official account "LeetCode Plus Plus", which takes you through the hard nut of algorithms bite by bite.