preface

Heap (Heap), the word is not strange, because when we are learning JavaScript reference types, we all know the reference type stored in the Heap memory, and other language is different, you may not have direct access to Heap memory location in space and operating Heap memory space, only action object references in the stack memory address. But how does the inside of the heap work? This article will talk about this problem in detail and understand the implementation details of the heap. This article uses JavaScript to achieve the heap, so there may be certain language characteristics.

1. Some properties of the heap

1. A heap is usually an array object that can be thought of as a tree.

2. The heap is always a complete binary tree.

3. Drawing a path from the root of the heap to any node can always get the order from small to large (the smallest heap) or from large to small (the largest heap).

4. Since the heap is a complete binary tree, for a node with index K, the indexes of its left and right sons are 2k+1 and 2K +2 (if any) respectively.

The maximum heap: The minimum heap The following is not a heap:The figure above does not satisfy a complete binary tree, so it is not a heap (The node with a value of 19 is missing its right son)By the way, this is not a heap, and a heap must be able to draw any path from the root node to a node, and its node values should always be in the order from large to small or from small to large.

2, heap implementation

Since we implement the heap in JavaScript, since the array can grow indefinitely, we don’t need to set the maximum size for the heap, and our heap won’t be full. But for other languages, if you want to do this you need to implement a dynamically expanded array.

As usual, we’ll go to the code first, and then explain how the code works.

/* * maximum heap */
class MaxHeap {

    /** * defines the maximum number of sentinels that all elements inserted into the heap must be smaller than */
    #MAX_VAL = 1000

    /** * defines a memory space to store data */
    #data = []

    /** * The number of elements in the heap */
    #size = 0

    constructor(. nums) {
        // Set the sentry
        this.#data[0] = this.#MAX_VAL
        // Initializes an array element
        nums.forEach((v, i) = > {
            this.#data[i + 1] = v
            this.#size++
        })
        this.buildHeap()
    }

    isEmpty() {
        return this.#size === 0
    }

    /** * inserts a valid value * into the heap@param {number} val 
     */
    insert(val) {
        if (this.MAX_VAL <= val) {
            throw `can not insert val bigger than The ${this.#MAX_VAL}`
        }
        // The heap capacity is expanded by 1
        this.#size++
        // Make I point to the current new location
        let i = this.#size
        // There is no need to add constraint I > 0 because there are sentinels
        while (this.#data[Math.floor(i / 2)] < val) {
            this.#data[i] = this.#data[Math.floor(i / 2)]
            i = Math.floor(i / 2)}this.#data[i] = val
    }

    deleteMax() {
        if (this.isEmpty()) {
            console.warn("can not delete max from empty heap")
            return
        }
        // Fetch the top element of the heap
        let maxVal = this.#data[1]
        // Remove the last element from the heap and reduce the size of the heap
        let temp = this.#data[this.#size--]
        let parent, child
        // Start starts at the root node
        // If parent * 2 is out of bounds, the leaf node has been reached
        // The iteration condition is to drop the parent down
        for (parent = 1; parent * 2< =this.#size; parent = child) {
            // Suppose the left son is older
            child = parent * 2
            // Do not cross the boundary and the right son is larger than the left son
            if(child ! =this.#size && this.#data[child + 1] > this.#data[child]) {
                // Make sure the right son is older
                child++
            }
            if (temp == this.#data[child]) {
                break
            } else {
                // Add the elements of the left and right son
                this.#data[parent] = this.#data[child]
            }
        }
        // Put temp on the current parent node
        this.#data[parent] = temp
        // The JavaScript language needs to do this step to make the array smaller and free up space
        this.#data.length--
        return maxVal
    }
    
    /* Filter down: resize the subheap of heap data[p] to the maximum heap */
    percDown(p) { 
        let parent, child;
        let temp = this.#data[p]
        temp = this.#data[p]; /* Retrieve the root value */
        for (parent = p; parent * 2< =this.#size; parent = child) {
            child = parent * 2;
            if(child ! =this.#size && this.#data[child] < this.#data[child + 1]) {
                child++;  /* child refers to the larger of the left and right children */
            }
            /* find the right location */
            if (temp >= this.#data[child]) {
                break;
            }
            else {
                /* filter X */
                this.#data[parent] = this.#data[child]; }}this.#data[parent] = temp;
    }

    /*----------- build maximum heap -----------*/
    buildHeap() {
        /* Adjust the elements in data to meet the maximum heap order */
        /* All size elements already exist in data[] */
        /* Starts at the parent of the last node and goes to the root node 1 */
        for (let i = Math.floor(this.#size / 2); i > 0; i--) {
            this.percDown(i); }}}Copy the code

2.1. Insert a value into the maximum heap

1, the sentry

In this implementation, the first element in the array (subscript 0) is not used, but is used as a sentry, and with this sentry we can improve the efficiency of the program by reducing one judgment at insert time. In addition, due to the influence of sentries, our left and right son’s subscripts are also changed to 2k and 2K +1(if any).

2, insert,

First, when we want to insert an element into the heap (the value of the element is less than 1000), the size of the heap must increase first.

However, because the elements in the heap need to be ordered, the values we insert are not necessarily the smallest (in this case, the largest) in the path, so we need to find the right place for our elements. Based on the nature of the heap mentioned above, we can determine that the parent of the current node is at I /2 (for JavaScript languages, we need to round down).

The whole process is shown below:

Expand the capacity of the heap

I keep going along the parent node until I find the right place, and I keeps going up and upFinally found the right locationComplete the insert

Assuming the above process, we do you want to change into 101, why don’t we increase circulation exit restrictions I > 0, when we I go to 1 to this place, at the moment the parent node is zero, because the parent node is greater than 101, so at the moment is automatically realized I > 0 of the constraints, but in a loop because we are missing a judgment conditions, The efficiency of the program is definitely improved, which is the benefit of sentry.

For those of you who know sorting algorithms, does this process sound familiar? Yeah, it’s just insertion sort, ha ha ha.

3, Delete maximum value in the heap we delete we don’t delete randomly, we only delete a maximum value. But after removing this maximum, how do we rearrange the rest of our elements into a heap? Let’s look at the process.

First of all, if the heap is empty, then it is definitely not allowed to delete elements, and the function simply returns.

If the heap is not empty, then we take out the first valid element of the heap for the function to return. Because the heap is smaller, we need to reduce the size, but not directly, because we had elements in this position before, so we have to use a temp to keep track of that element and insert it into the appropriate place later.

Now let’s start at position 1 and rearrange the rest of the elements into piles. Again, depending on the order of the heap, we need to find a larger value from the left and right son to put in the current position. ifparent * 2 > this.#sizeIf the current node has no left and right children, then it is relatively simple. We can directly place temp in this position. The diagram below:Temp =1; I *2 = 2; temp =1; I *2 = 2;

If this is not the case, we need to place the larger values of the left and right sons at the current root node. First we need to know the left son (parent*2), and assuming that the left son is the larger one, if we can find the right son (child! = this.#size, if the left son is equal to the size, then the right son is equal to the left son.

Parent =child; parent=child; parent=child; parent=child; parent=child; parent=child

Once we’ve done that, we know where temp is stored. Temp is the current parent node.

This series of processes is roughly as shown below:

Place 31 as a temporary variable to remove a node from the heap, and then start from the root node to determine where temp is stored. Filter all the way down to find an appropriate insertion position for Temp.Since there are no left and right sons, the current parent node is the final location where Temp needs to be stored. This example is based on the exit condition that the current parent node has no left and right sons. Readers can also try to construct the situation that there are children after 35 to think about and experience the execution process of the algorithm.

This.#data.length– if you do not reduce the length of the array, the dirty data will not be recycled, resulting in waste. If you manually adjust the length of the array, the size of the array will be reduced. This is a bit of a JavaScript quirk, but with frameworks like Vue, it doesn’t trigger updates, so be careful. For C#, Java, etc., the size of the array is fixed after the array is given. This problem does not exist, because the heap management content is so large that it does not matter.

3. Build the heap

Once we know how to delete the maximum value of the heap, it’s actually easier to build the heap. Ha, ha, ha.

Suppose we have a parent node p whose left and right subtrees already form left and right subheaps. Then, we just need to adjust the position of the P node to adjust it into a heap.

So for any of our arrays, how do we start with that? The answer is to start with the parent of the last node, and then apply the heap-building operations to each node before it. By the time we adjust to the first node, the heap is complete.

This is the idea behind the above code.

One of the application scenarios of heap — heap sort

This article will only talk about heap sorting in one of the heap application scenarios. Because we already know how to organize a bunch of numbers into a heap.

Is naturally you will think of, built a heap of several fabric first, and then has been to perform operation deleteMax heap or deleteMin operation, but because in a sorting algorithm, we can only according to array by management of the fast memory area, obviously, we need to use a temporary array to save the value delete first, And then you take the derivative of the temporary array back into the original array.

Its code is roughly as follows:

/** * Resize the root node of the array to the maximum heap *@param {Array<number>} arr 
 * @param {number} p 
 * @param {number} size 
 */
function percDown(arr, p, size) {
    /* Note that there are no sentinels in heap sorting to be noted */
    /* Resize the arr[p] subheap to the maximum heap */
    let parent, child
    let temp = arr[p]
    for (parent = p; parent * 2 + 1 < size; parent = child) {
        child = parent * 2 + 1
        if(child ! = size -1 && arr[child + 1] > arr[child]) {
            child++
        }
        if (temp >= arr[child]) {
            break
        } else {
            arr[parent] = arr[child]
        }
    }
    arr[parent] = temp
}

/** * Removes a maximum value * from the heap@param {Array<number>} heap 
 * @param {number} length 
 * @returns * /
function deleteMax(heap, length) {
    /* Select the element with the largest key from the largest heap H and delete a node */
    let parent, child;
    if (!Array.isArray(heap) || length === 0) {
        console.warn("Maximum heap is empty");
        return
    }
    // Since the user is not aware of the presence of the sentry in the sort, the subscripts need to start from 0 to length-1
    let size = length - 1
    let maxVal = heap[0];
    let temp = heap[size - 1];
    // The root node starts at 0
    for (parent = 0; parent * 2 + 1 <= size; parent = child) {
        child = parent * 2 + 1;
        if(child ! = length && heap[child] < heap[child +1]) {
            child++;
        }
        if (temp >= heap[child]) {
            break;
        }
        else {
            heap[parent] = heap[child];
        }
    }
    heap[parent] = temp;
    return maxVal
}

/** * resize the array to the maximum heap *@param {Array<number>} arr 
 */
function buildHeap(arr) {
    /* Create the maximum heap */
    for (let i = Math.floor(arr.length / 2) - 1; i >= 0; i--) { percDown(arr, i, arr.length); }}/** * heap sort the array *@param {Array<number>} arr 
 * @returns * /
function heapSort(arr) {
    if (!Array.isArray(arr)) {
        return
    }
    buildHeap(arr)
    let tempArr = []
    for (let i = 0; i < arr.length; i++) {
        tempArr.push(deleteMax(arr, arr.length - i))
    }

    for (let i = 0; i < tempArr.length; i++) {
        arr[i] = tempArr[i]
    }
}
Copy the code

One point of particular interest in the above code is that the sentry element is present in the heap in section 2, but in the sorting algorithm the user is unaware of the sentry element and therefore has to sort from the index 0. Note that when the root node starts at 0, the index of the left child is 2*parent+1.

Another problem is that we can’t change the size of the array when we’re building a heap of the array passed in by the user, so we have to manually pass in length to control how much memory the heap can manage.

However, the obvious problem with the above algorithm is that there is an extra TEMP_ARR, which is equivalent to the maximum amount of data you could have sorted out is 1G. Because of the temporary array (extra space complexity O(N)), you can only sort out half of the data, which is definitely not a good solution.

So let’s imagine other sort schemes in our head, like selection sort

export function selectionSort(arr) {
    if (!Array.isArray(arr)) {
        return
    }
    let temp = null
    for (let i = 0; i < arr.length; i++) {
        I +1-length-1 = 0- I +1-length-1
        for (let j = i + 1; j < arr.length; j++) {
            // Find a maximum value from the unordered fragment and place it at the current position. Continue processing the unordered fragment until it is complete
            if (arr[i] > arr[j]) {
                temp = arr[j]
                arr[j] = arr[i]
                arr[i] = temp
            }
        }
    }
}
Copy the code

Ma sa ka!!As if we can also like selection sort, ah, just want to change the direction, because of every time is from the beginning to the end, so we sort of process from the tail end, every time we take out the heap heap the values in the tail element of the next position, and then the rest elements to build into a new maximum heap, so on, until we don’t have any element can build the heap, Then it’s done. Ha, ha, ha, nice! .

Its process is roughly as shown in the figure below: Then you keep doing this until finally the heap no longer manages the elements, and the sort is done.

The code implementation is as follows:

/** * Resize the root node of the array to the maximum heap *@param {Array<number>} arr 
 * @param {number} p 
 * @param {number} size 
 */
function percDown(arr, p, size) {
    /* Note that there are no sentinels in heap sorting to be noted */
    /* Resize the arr[p] subheap to the maximum heap */
    let parent, child
    let temp = arr[p]
    for (parent = p; parent * 2 + 1 < size; parent = child) {
        child = parent * 2 + 1
        if(child ! = size -1 && arr[child + 1] > arr[child]) {
            child++
        }
        if (temp >= arr[child]) {
            break
        } else {
            arr[parent] = arr[child]
        }
    }
    arr[parent] = temp
}

/* Heap sort */
function heapSort(arr) {
    if (!Array.isArray(arr)) {
        return
    }
    /* Create the maximum heap */
    for (let i = Math.floor(arr.length / 2) - 1; i >= 0; i--) {
        percDown(arr, i, arr.length);
    }
    for (let i = arr.length - 1; i > 0; i--) {
        /* Delete the maximum heap top */
        let temp = arr[0]
        arr[0] = arr[i]
        arr[i] = temp
        percDown(arr, 0, i); }}Copy the code

Time and space complexity analysis:

Theorem: Heapsort processing N random permutations of the average number of comparisons is 2nlogn-o (NLogLogN)

So heapsort can be written as O(NLogN), and it’s a little bit better than that.

4, summarize

The content of this paper comes from “Data Structure” tree (bottom) and sorting (top), which is taught by MoOCs in Zhejiang University of China. The algorithm running process diagram is drawn after the author’s thinking and converted into JavaScript.

In this article, the reader can see the significance of sentry in practical development. It is necessary to pay attention to the difference of sentry element in heap and heap sort, because sentry element will directly lead to the difference of index method of left and right children.

In addition, in the process of heap sorting, it is particularly important to note that the length of the heap is not the length of the array, but we need to manually control.

In addition to being used for sorting, the heap has several uses, such as:

Priority queues can be implemented; Use heap to calculate TOP K; Use the heap to find the median.

These applications are not discussed in this article, but interested readers are advised to check them out.

In this article, readers can see that the knowledge of data structure is complementary to each other (insertion sort and selection sort). If you master the existing knowledge points, it will be relatively easy to accept new knowledge points. For data structure this course, relatively abstract, may not be easy to understand, must use the brain and hands, will certainly improve. A journey of a thousand miles begins with a single step, and the year 2022 has already begun. I would like to wish you a happy old age and wish you all better achievements in 2022.

The drawing tool for this article is excalidraw.com/ readers interested can mark it.

Due to the limited level of the author, it is inevitable that there will be mistakes in the writing process. If there are any mistakes, please correct them. Please contact the author at [email protected], your opinions will help me make better progress. This article is the author’s original, if reproduced please contact the author.