To understand the TOP K problem in depth, you must first understand the concept of “heap”.

The definition of the heap

To be a heap, at least two of the following conditions must be met.

  1. It has to be a complete binary tree.
  2. The value of any node must be greater than or less than the value of all nodes in its tree. These two cases correspond to the big top heap and small top heap respectively

Here are two examples:

How to implement a heap?

As mentioned in the previous article, complete binary trees are well suited to arrays as data structures. So the heap, as a complete binary tree, must be best implemented using arrays.

And the rule is very simple, let’s summarize it again is:

If the subscript of a node is I, the subscript of its left child is 2i, and the subscript of its right child is 2i+1

The parent is I /2.

If you still don’t understand the full binary tree – array mapping, read the previous article or draw your own.

Implementing a heap is all about inserting and deleting data into the heap.

Insert algorithm

It’s really easy to insert, and if you look at that picture, let’s take this big top heap. If we insert a value, we insert it at the end of the heap, compare it to its parent, swap if it’s bigger, and finish if it’s smaller.

Let’s say we insert a 10 into the big top heap,

And you can see that this bottom-up insertion algorithm is pretty straightforward.

Delete heap top element algorithm

According to the definition of big top heap and small top heap, the top element is always the largest or smallest. After the top element is deleted, the value of the new top element will be the left and right children of the original top element.

So obviously we naively assumed that the algorithm for removing the top of the heap would be (assuming the big top heap) :

After deleting the top of the heap, we compare the value of the left child with the value of the right child. Whoever has the larger value goes to the top of the heap and stays there. After the son got to the top of the pile,

The son’s son does the same thing and moves up to see who is bigger. It looks like this algorithm is airtight and understandable, but this algorithm is easy

The following exceptions occur.

So you can see that the algorithm here is going to cause some situation where the complete binary tree of the heap is destroyed.

So we need to improve it, and the improved algorithm is as follows:

To delete the top element, first delete the top element and then put the last position of the heap on the top of the heap, and then compare the size from top to bottom and swap places.

Let’s look at a diagram of this algorithm:

Obviously, this algorithm avoids the problem of array holes caused by the first algorithm.

Is this the heap that’s used to sort the heap?

Yes, the heap in heap sort is the heap that implements the data structure. It’s a little bit more complicated to implement, but for a brief explanation (I won’t go into detail because heap sort isn’t really useful) :

Given an unordered array, how can a heap be used to sort it?

The first choice is to build a heap for this unordered array. We have described the insertion algorithm, so for this unordered array, we can assume that the first element of the array is the position of the top of the heap, and then follow our insertion algorithm to insert the following elements. (There are actually friendlier ways to build a heap, but that’s not the point today, so if you’re interested, you can do your own research.)

Now that I have this heap, how do I sort it? Let’s say it’s a big top heap, and actually sorting the heap is just like our algorithm for deleting the top elements, you delete the top elements, and then the second dozen elements get to the top. And so on…

Why isn’t heap sort used?

In fact, heap sort performance seems to be “ok”, with the same time complexity as fast sort, O(nlogn) and stable sorting. But it has a fatal flaw:

The heap relies so much on swaps that it looks like the same time complexity as fast sort, but because you swap so many times, it’s not as good as fast sort.

It’s like a weightlifting competition. Everybody has the same score. You’re a little heavier than me.

Long-winded heap sorting so rubbish that you still tell him why?

I was born to be useful ah, brother, pile this thing dry pure sort is not good, but other things have wonderful use.

A priority queue is almost like a heap.

A priority queue is a queue that, regardless of the order of entry, the order of exit is the order of priority. If you think about it, is it consistent with our definition of the big top heap or the small top heap? Volley PriorityBlockingQueue is a thread-safe priority queue composed of a heap of data structures.

Let’s do the classic TOP K problem

With the above foundation, it is much easier to look at the TOP K problem. Leetcode 703 题 : leetcode-cn.com/problems/kt…

Basically, you’re given an array, and every time you insert data, you’re asked to return the KTH largest element in that array.

Here’s how to solve the problem:

1. So every time we insert something in the array, we do a quick sort, and then we just pick out the KTH size. This is the simplest and stupidest way of thinking. Because it’s too slow to sort all the data every time you insert it.

  1. After know a priority queue so a thing, we can maintain a small cap pile k, each time there is a new element in We will see if the value of the new element is bigger than the small value of pile top, if is greater than he Then delete the little pile jacking pile top elements, and then insert the new element to the value of the later, And then we take this little top heap and the top element is our KTH largest element.
class KthLargest { PriorityQueue<Integer> queue; int maxHeapSize; Public KthLargest(int k, int[] nums) {public KthLargest(int k, int[] nums) {queue = new PriorityQueue<Integer>(k); maxHeapSize=k;if(k < nums.length) {// Construct a heap of size kfor(int i = 0; i < k; i++) { queue.add(nums[i]); } // The size of the object must be determinedfor (int j = k; j < nums.length; j++) {
                if (nums[j] > queue.peek()) {
                    queue.poll();
                    queue.offer(nums[j]);
                } else{}}}else {
            for (Integer integer : nums) {
                queue.add(integer);
            }
        }
    }

    public int add(int val) {
        
        if(queue.size()<maxHeapSize)
        {
            queue.offer(val);
            return queue.peek();
        }
        
        if (val > queue.peek()) {
            queue.poll();
            queue.offer(val);
        }
        returnqueue.peek(); } } /** * Your KthLargest object will be instantiated and called as such: * KthLargest obj = new KthLargest(k, nums); * int param_1 = obj.add(val); * /Copy the code

Take a look at the effect:

Well, it’s ok.

Then take a final look at the Leetcode 239 question leetcode-cn.com/problems/sl…

This problem is made easier with priority queues. Just maintain a big top heap while iterating through the array, and the problem will be solved. (Code is not up, very simple)

Of course, there is a better way to solve this problem, which is to use a double-endian queue instead of a heap, if you are interested in studying it.

Is there any real production problem that the heap or priority queue solves?

In fact, there are many production problems that can be solved. For example, when we make Android customized system, suppose that the user has N tasks there, and each task has a specific execution time. When the time comes, the user’s task, such as alarm clock, will be executed. So how do we fulfill this requirement?

Does the task list check every second to see if the time matches the current time? Obviously not. It’s inefficient.

We can store these tasks as a small top heap, so that as soon as the system time passes by a second, we can see if it is the same as the top heap element.