Contents

  • Preface
  • 1. Find the distance between the two farthest nodes in a binary tree
  • 2. Java autoboxing and unboxing
  • 3. When does the CMS garbage collector pause user threads during collection?
  • 4. Why does ConcurrentHashMap not need locking for reads?
  • 5. How does Redis dictionary rehashing differ from rehashing in JDK maps like HashMap?
  • References

Preface

I recently interviewed for a backend engineer position at Headline, and I failed miserably.

After coming back, I put together a summary of the process. I will keep it objective and stick to the facts: this article mainly reviews the technical questions that came up during the interview.

1. Find the distance between the two farthest nodes in a binary tree

This is an original LeetCode problem: Diameter of Binary Tree.

We need to find the distance between the two farthest nodes of a binary tree. First, a few observations:

  1. The diameter of a tree either lies entirely in its left subtree, lies entirely in its right subtree, or passes through the root node.
  2. Diameter of a tree = for each node in the tree, compute the longest path that passes through that node; the maximum over all nodes is the diameter of the tree.
  3. The longest path through a node can be decomposed into: the depth of the deepest leaf in its left subtree + the depth of the deepest leaf in its right subtree (plus the two connecting edges).

So the plan is:

  1. For each node, recursively compute the longest path passing through it (the depth of the deepest leaf in its left subtree plus the depth of the deepest leaf in its right subtree), and take the maximum over all nodes.
  2. The depth of the deepest leaf under a node equals the greater of (deepest leaf of its left subtree + 1) and (deepest leaf of its right subtree + 1).
  3. While computing the deepest leaves, the path length through the current node (deepest leaf on the left + deepest leaf on the right + 2 edges) is already available, so to avoid recomputation we record the maximum diameter seen so far during the same recursion.

The code is as follows:

    // Requires java.util.concurrent.atomic.AtomicReference.
    public int diameterOfBinaryTree(TreeNode root) {
        // Holds the maximum diameter (in edges) seen so far during the recursion.
        AtomicReference<Integer> ret = new AtomicReference<>(0);
        find(root, ret);
        return ret.get();
    }

    // Returns the number of edges from `node` down to its deepest leaf,
    // and updates `result` with the longest path passing through `node`.
    private int find(TreeNode node, AtomicReference<Integer> result) {
        if (node == null) return 0;
        int left = 0, right = 0;
        if (node.left != null) left = find(node.left, result) + 1;
        if (node.right != null) right = find(node.right, result) + 1;
        result.set(Math.max(result.get(), left + right));
        return Math.max(left, right);
    }

The code is straightforward: recursively find the deepest leaf of each node's left subtree and right subtree. During that calculation, the diameter through the current node is recorded as a candidate, so the maximum diameter is known by the time the recursion finishes.
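To sanity-check it, here is a small, hypothetical driver. It assumes the two methods above sit in a class called Solution and uses the usual LeetCode-style TreeNode definition (both names are my own assumptions, not from the original post):

// Hypothetical helper class: the standard LeetCode-style TreeNode.
class TreeNode {
    int val;
    TreeNode left, right;
    TreeNode(int val) { this.val = val; }
}

public class DiameterDemo {
    public static void main(String[] args) {
        // Build the tree    1
        //                  / \
        //                 2   3
        //                / \
        //               4   5
        // The longest path is 4 -> 2 -> 1 -> 3 (or 5 -> 2 -> 1 -> 3), i.e. 3 edges.
        TreeNode root = new TreeNode(1);
        root.left = new TreeNode(2);
        root.right = new TreeNode(3);
        root.left.left = new TreeNode(4);
        root.left.right = new TreeNode(5);
        // Assumes the diameterOfBinaryTree/find methods above live in a class named Solution.
        System.out.println(new Solution().diameterOfBinaryTree(root)); // prints 3
    }
}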

2. Java autoboxing and unboxing

Java added automatic boxing and unboxing in version 1.5. In essence, it is an automatic conversion between a primitive type and its corresponding wrapper type.

In the following code:

public class BoxTest {
    public static void main(String[] args) {
        Integer a = 10; // boxing
        int b = a;      // unboxing
    }
}

Compiling this code and then decompiling it (for example with javap -c BoxTest) shows clearly where the boxing and unboxing happen, at constant pool entries #2 and #3 in the bytecode.

The valueOf and intValue methods of Integer are called, respectively.
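For reference, the decompiled bytecode of main looks roughly like this (a javap -c sketch; the exact constant pool indices depend on the compiler):

 0: bipush        10
 2: invokestatic  #2    // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
 5: astore_1
 6: aload_1
 7: invokevirtual #3    // Method java/lang/Integer.intValue:()I
10: istore_2
11: return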

3. When does the CMS garbage collector pause user threads during collection?

We won’t cover all garbage collectors here, but you can take a look at the JVM’s data areas and garbage collection.

As a reminder, a CMS collection runs in four main phases: initial mark (stop-the-world), concurrent mark, remark (stop-the-world), and concurrent sweep.

So user threads still need to be suspended during the initial mark and remark phases; the other phases run concurrently with the application.
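If you want to observe this yourself on JDK 8 (where CMS is still available), one way is to enable CMS together with detailed GC logging; the flags below exist, though the exact log wording varies by JDK build:

# Hypothetical command line; replace your-app.jar with a real application.
java -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -jar your-app.jar
# The stop-the-world entries in the GC log (e.g. "CMS Initial Mark", "CMS Final Remark")
# correspond to the two pause phases; concurrent mark and sweep run alongside user threads.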

4. Why does ConcurrentHashMap not need locking for reads?

The ConcurrentHashMap Node is defined as:

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;
}

As you can see, the fields are declared as follows:

  1. hash is final, so it cannot change after initialization.
  2. key is final, so it cannot change after initialization either.
  3. val is declared volatile.
  4. The next pointer to the following node is also declared volatile.

Now look at a call to the get(Object) method:

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    // Compute the hash value
    int h = spread(key.hashCode());
    // Read the head of the hash bucket via tabAt; tabAt is a thread-safe
    // (volatile) read, guaranteed by Unsafe.
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        // If the first node of the bucket is the key we are looking for, return it
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        // A negative hash means the first node is a special node (e.g. the root of a tree bin)
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;
        // Otherwise the first node is the head of a linked list; traverse it
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}

During this process:

  1. The head node of the hash bucket is read through tabAt, which is thread safe.
  2. Traversal follows each node's next field; because it is volatile, updates are visible across threads, so there are no visibility problems.
  3. The value returned is read from the node's volatile val field.

Therefore, the get process can read the correct value without taking any lock.
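For reference, in JDK 8 tabAt is essentially a volatile read of the array slot through Unsafe (paraphrased below), which is why reading the bucket head is safe without a lock:

// Paraphrased from JDK 8's ConcurrentHashMap internals: U is the Unsafe
// instance, and ABASE/ASHIFT describe the Node[] array memory layout.
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
    // A volatile array-element read: always sees the most recently published head node.
    return (Node<K,V>) U.getObjectVolatile(tab, ((long) i << ASHIFT) + ABASE);
}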

5. How does Redis dictionary rehashing differ from rehashing in JDK maps like HashMap?

This is a broad question, and my personal understanding is as follows.

  1. HashMap allocates a second table only temporarily while it is resizing, whereas a Redis dictionary always keeps references to two tables (ht[0] and ht[1]) and only allocates space for the second one when a rehash is needed.
  2. A HashMap rehash is a one-shot, centralized operation, whereas Redis rehashes progressively (incrementally).

The rehash process of HashMap is familiar, so let's focus on Redis's progressive rehashing.

First, rehashing means re-hashing all the data in the original table and storing it in a new, larger table in order to expand capacity.

However, Redis is a single-threaded, high-performance service. If a hash table holds hundreds of millions of entries, a full rehash would take a long time, and Redis could not serve requests during that period, which is unacceptable.

Therefore, Redis implements progressive rehashing. The process is as follows:

  1. If the current data is in ht[0], first allocate sufficient space for ht[1].
  2. The dictionary maintains a variable, rehashIndex = 0, which records the current progress of the rehash.
  3. During the rehash, every add, delete, update, or lookup on the dictionary, after completing its actual work, also rehashes the bucket at index rehashIndex of ht[0] into ht[1] and then increments rehashIndex by one.
  4. As operations continue, all of ht[0]'s entries are eventually rehashed and the rehash process ends.

There are two issues that were not addressed in the above process:

  1. What if the server is idle? If no requests come in for hours, keeping two tables around the whole time wastes memory.

The solution is that Redis's periodic (timed) tasks also perform a bit of rehashing, so that when the server is idle the rehash still finishes quickly.

  2. How does the hash table serve requests while it is maintaining two tables?

Workaround: new keys are added directly to ht[1], so the number of entries in ht[0] only ever decreases and the rehash is guaranteed to finish. Deletions, updates, lookups and other operations are tried on ht[0] first; if nothing is found there, they are retried on ht[1].
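Putting these pieces together, here is a toy, hypothetical Java sketch of the idea. The real Redis dictionary is written in C; names such as ProgressiveDict, ht0/ht1, and rehashIndex follow this article, not the actual Redis source, and updates to existing keys are deliberately simplified away:

class ProgressiveDict<K, V> {
    static class Entry<K, V> {
        final K key;
        V val;
        Entry<K, V> next;
        Entry(K key, V val, Entry<K, V> next) { this.key = key; this.val = val; this.next = next; }
    }

    Entry<K, V>[] ht0, ht1;   // the dictionary always holds references to two tables
    int rehashIndex = -1;     // -1 means "not currently rehashing"

    @SuppressWarnings("unchecked")
    ProgressiveDict(int capacity) { ht0 = (Entry<K, V>[]) new Entry[capacity]; }

    // Begin a rehash: only now is space for ht[1] allocated.
    @SuppressWarnings("unchecked")
    void startRehash() { ht1 = (Entry<K, V>[]) new Entry[ht0.length * 2]; rehashIndex = 0; }

    // Move one bucket from ht[0] to ht[1]; piggybacked on every user operation.
    // (Redis's periodic task would call something like this while the server is idle.)
    void rehashStep() {
        if (rehashIndex < 0) return;
        while (rehashIndex < ht0.length && ht0[rehashIndex] == null) rehashIndex++;
        if (rehashIndex == ht0.length) {      // rehash finished: swap the tables
            ht0 = ht1; ht1 = null; rehashIndex = -1; return;
        }
        for (Entry<K, V> e = ht0[rehashIndex]; e != null; ) {
            Entry<K, V> next = e.next;
            int idx = bucket(e.key, ht1.length);
            e.next = ht1[idx]; ht1[idx] = e;
            e = next;
        }
        ht0[rehashIndex++] = null;
    }

    // Toy simplification: assumes the key is not already present.
    // During a rehash, new keys go straight into ht[1], so ht[0] only shrinks.
    void put(K key, V val) {
        rehashStep();
        Entry<K, V>[] target = (rehashIndex >= 0) ? ht1 : ht0;
        int idx = bucket(key, target.length);
        target[idx] = new Entry<>(key, val, target[idx]);
    }

    // Lookups try ht[0] first and fall back to ht[1] while a rehash is in progress.
    V get(K key) {
        rehashStep();
        V v = lookup(ht0, key);
        return (v != null || rehashIndex < 0) ? v : lookup(ht1, key);
    }

    private V lookup(Entry<K, V>[] tab, K key) {
        for (Entry<K, V> e = tab[bucket(key, tab.length)]; e != null; e = e.next)
            if (e.key.equals(key)) return e.val;
        return null;
    }

    private int bucket(Object key, int len) { return (key.hashCode() & 0x7fffffff) % len; }
}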

The benefit of progressive rehashing is obvious: it takes a divide-and-conquer approach, spreading the cost of the rehash over the individual operations on the hash table and avoiding the performance spike of a centralized rehash.

The downside is that both hash tables have to be kept in memory for as long as the rehash lasts, which uses somewhat more memory; and if the Redis server needs to rehash when memory is already nearly full, a large number of keys may end up being evicted.

References

Redis Design and Implementation (2nd Edition)

The end.

All of the above are my personal thoughts; if there are any mistakes, you are welcome to point them out in the comments.

Reprints are welcome; please credit the author and keep the link to the original.

Contact email: [email protected]

For more study notes, see my personal blog or follow the WeChat official account <huyan10> (HuYan ten).