They say March and April are the golden season for job hunting. I was looking for a job because my personal career plan was no longer consistent with the content of my current work (the specific reason shall not be revealed; the leadership wanders over here to browse from time to time, haha). In April I sent my resume to ten companies (ranging in size from dozens of employees to tens of thousands), took part in seven phone interviews, and received 5 offers, which is not bad. Below are some of the most basic and common interview questions, so let's get right to the point.

1. What is synchronized? What is the difference between synchronized and Lock? How is a biased lock acquired and revoked?

Here I will analyze the third question. Back when I was still a rookie, I would listen to the experts around me say that synchronized is a heavyweight lock whose overhead is so large that it should be used sparingly, and all I could do was gaze at them in blank admiration, understanding nothing. But a rookie cannot stay a rookie forever, so I decided to study the principles behind synchronized and see why the experts said so. As it turns out, the JDK team could not bear the endless ridicule from programmers all over the world, and synchronized was optimized in JDK 1.6. Prior to JDK 1.6, synchronized was implemented purely as a heavyweight lock (thread blocking). By the time my generation started working, JDK 1.8 was already in use, so the "heavyweight" reputation of synchronized really dates from before JDK 1.6.

Synchronized: a lock is actually applied to an object, so that object is called the lock object. In Java, any object can be a lock object. Obviously, the focus here is on the object, so let's take a look at how an object is laid out in the virtual machine. The in-memory layout of a Java object has three main parts: the object header, the instance data, and the alignment padding, as illustrated by a diagram.
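To make the "any object can be a lock object" point concrete, here is a minimal sketch (the class and names are mine, not from this article): four threads bump a shared counter while synchronizing on a plain Object.

```java
// Any object can serve as the lock object for synchronized.
public class SyncDemo {
    private static final Object LOCK = new Object(); // the lock object
    private static int counter = 0;

    static void increment() {
        synchronized (LOCK) { // monitorenter on LOCK
            counter++;        // critical section
        }                     // monitorexit on LOCK
    }

    static int getCount() {
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(getCount()); // 40000: no updates lost under the lock
    }
}
```

Without the synchronized block, the unsynchronized `counter++` would lose updates under contention; with it, the count is always 40000.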

Lock-related information is stored in the Mark Word. Let's take a look at the contents of the Mark Word on a 32-bit virtual machine.

Let's take a look at the states a synchronized lock can be in: lock-free, biased, lightweight, and heavyweight. A lock can only escalate as thread contention increases; it never degrades. This strategy improves the efficiency of acquiring and releasing locks.

2. Biased locking

Background: in most cases, locks are not contended by multiple threads and are always acquired multiple times by the same thread. Biased locks were introduced to eliminate unnecessary CAS operations and thus reduce the cost for a thread to acquire a lock. A biased lock is, as the name implies, biased in favor of the first thread to acquire it. If the lock is never acquired by another thread during subsequent execution, and no other thread competes for it, the thread holding the biased lock never needs to synchronize. If another thread contends for the lock while it is held, the thread holding the biased lock is suspended, and the JVM revokes the biased lock and upgrades it to a lightweight lock.

Biased lock acquisition and revocation process:

  1. Examine the Mark Word: check whether the biased-lock bit is set to 1 and the lock flag bits are 01, i.e., confirm that the object is in a biasable state.
  2. If it is, check whether the thread ID in the Mark Word points to the current thread. If it does, go to step 5; otherwise, go to step 3.
  3. If the thread ID does not point to the current thread, compete for the lock with a CAS operation. If the competition succeeds, set the thread ID in the Mark Word to the current thread ID and then perform step 5. If the competition fails, go to step 4.
  4. If the CAS fails to obtain the biased lock, there is contention. When the global safepoint is reached, the thread that originally acquired the biased lock is suspended, the biased lock is upgraded to a lightweight lock, and the thread that was blocked at the safepoint continues to execute the synchronized code. Note that revoking a biased lock causes a stop-the-world pause.
  5. Execute synchronization code.
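The CAS in step 3 is the same compare-and-swap primitive that java.util.concurrent.atomic exposes. A toy illustration (the names are hypothetical; the JVM really does this on the Mark Word, not on an AtomicLong): the first caller installs its thread ID, and a second caller's CAS fails, signalling contention.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of "compete for the lock via CAS": install our thread ID
// only if no owner is recorded yet.
public class CasDemo {
    static final AtomicLong ownerThreadId = new AtomicLong(0); // 0 = unowned

    static boolean tryBias(long threadId) {
        // Atomically: if current value is 0, set it to threadId and return true;
        // otherwise leave it unchanged and return false.
        return ownerThreadId.compareAndSet(0, threadId);
    }

    public static void main(String[] args) {
        System.out.println(tryBias(1)); // true  - first thread wins the bias
        System.out.println(tryBias(2)); // false - CAS fails: contention detected
    }
}
```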

Whether bias locking is enabled or not can be controlled using the JVM’s parameters:

  • Enable biased locking: -XX:+UseBiasedLocking -XX:BiasedLockingStartupDelay=0
  • Disable biased locking: -XX:-UseBiasedLocking

A textual description is too abstract, so let's draw a picture to understand it.

3. Lightweight locks

Lightweight locks are upgraded from biased locks: the biased lock operates while only one thread is entering the synchronized block, and it upgrades to a lightweight lock when a second thread joins the contention.

Lightweight lock acquisition process: when code enters the synchronization block, if the synchronization object is in the lock-free, non-biasable state (lock flag bits are "01", biased-lock bit is "0"), the virtual machine first creates a space called a Lock Record in the current thread's stack frame, used to store a copy of the lock object's current Mark Word (officially called the Displaced Mark Word). After the copy succeeds, the VM uses a CAS operation to try to change the object's Mark Word to a pointer to the Lock Record, and sets the owner pointer in the Lock Record to the object's Mark Word. If the update succeeds, the thread owns the lock on the object, and the object's Mark Word lock flag is set to "00", indicating that the object is in the lightweight locked state. If the update fails, the virtual machine first checks whether the object's Mark Word points to the current thread's stack frame. If it does, the current thread already holds the lock and can enter the synchronization block directly. Otherwise, multiple threads are competing for the lock. When a competing thread fails to acquire the lightweight lock after spinning several times, the lightweight lock inflates to a heavyweight lock: the lock flag changes to "10", the Mark Word stores a pointer to the heavyweight lock (a mutex), and the threads waiting for the lock block until the holder releases it and wakes them.

The number of spins can be changed with the VM parameter -XX:PreBlockSpin. The default value is 10.

The above process, again, is easier to understand with a picture.

4. Heavyweight locks

Heavyweight locks are also known as mutex locks, because they rely on the object's internal monitor, and the monitor in turn relies on the operating system's mutex (MutexLock). This is why people say synchronized is expensive once it escalates to a heavyweight lock. Upgrading to a heavyweight lock blocks the threads waiting to acquire the lock, and a blocked thread does not consume CPU; however, blocking or waking a thread requires the operating system to perform a state transition between user mode and kernel mode, and that transition takes a lot of time, possibly longer than executing the user code itself. So let me just draw a classic picture.
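At the Java level, what the monitor buys you is wait/notify-style blocking. A small sketch (the class is mine, for illustration only): the consumer thread blocks inside the object's monitor until the producer signals it.

```java
// wait() parks the calling thread inside the object's monitor (releasing it);
// notify() moves one waiting thread back to the monitor's entry list.
public class MonitorDemo {
    private final Object monitor = new Object();
    private boolean ready = false;

    void produce() {
        synchronized (monitor) {
            ready = true;
            monitor.notify(); // wake one thread blocked in wait()
        }
    }

    void consume() throws InterruptedException {
        synchronized (monitor) {
            while (!ready) {    // loop guards against spurious wakeups
                monitor.wait(); // releases the monitor and blocks
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MonitorDemo demo = new MonitorDemo();
        Thread consumer = new Thread(() -> {
            try { demo.consume(); } catch (InterruptedException ignored) { }
        });
        consumer.start();
        Thread.sleep(100); // give the consumer time to block (illustrative only)
        demo.produce();
        consumer.join();
        System.out.println("consumer finished");
    }
}
```

Both the blocking in wait() and the wakeup in notify() go through the OS scheduler, which is exactly the user-mode/kernel-mode transition cost described above.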

Let’s make a comparison of several locks.

Of course, there are other common interview questions, such as: is synchronized an optimistic or a pessimistic lock? Are optimistic locks always better than pessimistic locks? What are the three classic problems with CAS? What is AQS? How does ReentrantLock achieve reentrancy? How do Java threads synchronize with each other? What synchronizers do you know? Have you ever used a thread pool? Explain the constructor parameters of a thread pool and how it works. What does volatile do, and what is the difference between volatile and synchronized? Have you used ThreadLocal? Talk about how you use it and how it works. What is an immutable object, and how does it help with writing concurrent code?

5. Persistence of Redis

Redis keeps all of its data in memory. If the server suddenly goes down, all the data would be lost, so there must be a mechanism to guarantee that in an emergency the data is not lost, or that only a little is lost. That means Redis must, according to some strategy, write the data in memory out to disk, so that when the Redis service restarts it can restore the data into memory from what is on disk.

Redis persistence mechanisms: AOF, RDB, and hybrid persistence (supported since version 4.0; more on that in a later Redis article).

(1) RDB.

RDB (Snapshot) Persistence: Saves a snapshot of full data at a point in time.

RDB persistence is a full backup: periodically, the full amount of data currently in Redis memory is written to a snapshot file. Redis is a single-threaded program, responsible for handling read and write requests from multiple clients, and also for periodically writing the in-memory data to the RDB snapshot file. Writing the RDB file is an IO operation that would seriously affect Redis performance; read and write requests could even be blocked during persistence. So Redis needs to serve read and write requests and persist data at the same time, which in turn leads to another problem: during persistence, the data in memory keeps changing. Suppose Redis is persisting a large data structure and, midway through, a client sends a request to delete that very structure. What does Redis do while the persistence is not yet complete?

Redis implements snapshot persistence using the operating system's multi-process copy-on-write mechanism. During persistence, it calls fork() from glibc (the C library on Linux) to create a child process, and the snapshot persistence is handled entirely by the child, while the parent process continues to handle the clients' read and write requests. When the child process is created, it shares the code and data segments in memory with the parent process; this is a mechanism of the Linux operating system, which has the parent and child share as much memory as possible in order to save memory resources.

A brief introduction to copy-on-write and fork.

fork: the fork() function creates a nearly identical process through a system call; the two processes can do exactly the same thing, or different things if their initial arguments or variables differ. When a process calls fork(), the system allocates resources to the new process, such as space for data and code, and then copies all the values of the original process into the new one, with only a few values differing from the original (such as the return value of fork() itself). It is effectively a clone of the caller.

Copy On Write:

Resources are copied only when they need to be written; before that they are shared in read-only mode. This technique delays the copying of pages in the address space until a write actually occurs. In Linux, fork() produces a child identical to the parent; the child then typically calls exec() to start executing a different program.

Back to the question above: the way this is handled is that the child process persists the data currently in memory and never modifies the data structures. If the parent process receives a write request, it copies the affected piece of data in memory and modifies the copy. So even if some data is modified during persistence, what Redis persists to the RDB is still the unmodified data, which is why RDB files are called "snapshot" files: the data seen by the child process is fixed at the moment the child is created, while the parent process only ever modifies copies. To go a little further: Redis's full in-memory data set consists of many "pages", each 4KB in size. When a client modifies data in one of those pages, that page is copied in memory; this copying is referred to as "page separation". During persistence, as more and more pages are separated, memory usage keeps growing, but it will not exceed twice the original memory, because in practice almost no pages get separated: read and write requests touch only a small portion of the original data, and most Redis data remains "cold". Let me draw it as a graph.

To summarize the RDB snapshot persistence process:

  • Redis uses the fork function to make a copy of the current process (child process).
  • The parent process continues to receive and process commands from the client, while the child process begins to write data from memory to temporary files on disk.
  • When the child process finishes writing all data, it replaces the old RDB file with the temporary file. At this point, the snapshot operation is complete.

Note: Redis does not modify the RDB file during the snapshot process; it replaces the old file with the new one only after the snapshot is completed, meaning that the RDB file is complete at any moment. This makes it possible to back up the Redis database periodically simply by copying RDB files. RDB files are compressed binary files that occupy less space than the data in memory, which also makes them convenient to transfer.

RDB snapshot generation mode:

Manual trigger

  • SAVE: block the Redis server process until the RDB file is created.
  • BGSAVE: fork a child process to create the RDB file without blocking the server process. The LASTSAVE command shows the time of the most recent successful backup.

Automatic trigger

  • A timed trigger according to the save rules in redis.conf (performs BGSAVE)

  • During master/slave replication, the master node triggers it automatically
  • Executing debug reload
  • Executing shutdown when AOF persistence is not enabled
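For reference, the automatic save rules live in redis.conf; the stock configuration ships with rules like these (format: save <seconds> <changes>):

```conf
# BGSAVE runs automatically when any rule matches
save 900 1       # after 900 s if at least 1 key changed
save 300 10      # after 300 s if at least 10 keys changed
save 60 10000    # after 60 s if at least 10000 keys changed
```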

With that in mind, we can draw a diagram that shows the flow of RDB snapshot persistence.

(2) AOF.

The AOF log stores the sequential instruction stream of the Redis server, that is, a record of every instruction that modifies the data in memory. When Redis receives a modification instruction from a client, it first validates the parameters; if validation passes, it stores the instruction in the AOF log file, that is, saves it to disk first, and then executes the instruction. When Redis is restarted after a crash, the instructions in the AOF file can be read back and replayed sequentially to recover the data; executing the recorded instructions again in order restores the state from before the crash. The AOF process is represented by a graph.
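For intuition, what actually lands in the AOF file is just the commands in the Redis protocol (RESP) text form. A SET key value would be recorded roughly like this (each line is terminated by \r\n; the arrow annotations are mine, not part of the file):

```text
*3        <- an array of 3 elements: the command name plus two arguments
$3        <- the next bulk string is 3 bytes long
SET
$3
key
$5
value
```

This is why the article calls AOF logs "very readable": you can open the file and see the command history directly.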

As Redis runs over a long period, the AOF log grows larger and larger. If the Redis service restarts and has to replay a huge AOF file command by command, it will take a long time, during which Redis cannot serve requests. So the AOF file needs to be slimmed down, and this slimming process is called AOF rewrite. The principle of AOF rewrite is that the main process forks a child process, which converts the data in memory into a series of Redis operation instructions and serializes them into a new AOF log file; the write operations received during serialization are then appended to the new AOF file, and once appending is done the new file immediately replaces the old one, completing the "weight loss" job. Note that when Redis appends instructions to the AOF file, it does not write directly to disk; it writes to a memory cache allocated by the operating system kernel, and the kernel asynchronously flushes the instructions from that cache into the AOF file. Let me draw it as a graph.

AOF related parameters:

AOF rewrite is controlled by two parameters: auto-aof-rewrite-percentage (default 100), the percentage by which the AOF file must grow relative to its size after the last rewrite, and auto-aof-rewrite-min-size (default 64MB), the minimum size the file must reach before a rewrite is considered.

For example, if the AOF file was 128MB after the last rewrite, a rewrite is triggered once the file reaches 256MB: it has grown by 100%, and 256MB > 64MB.
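In redis.conf, the rewrite thresholds look like this (the values shown are the stock defaults):

```conf
# Rewrite when the AOF has grown this % beyond its size after the last rewrite...
auto-aof-rewrite-percentage 100
# ...but never before the file reaches this minimum size
auto-aof-rewrite-min-size 64mb
```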

Here’s a comparison of Redis persistence:

Advantages and disadvantages of RDB:

Advantages:

  • RDB generates multiple data files, each representing the Redis data at a certain moment. This multiple-file approach is very suitable for cold backups, and such complete data files can be shipped to remote, secure storage.
  • When RDB persistence is performed, the impact on the Redis service to handle read and write requests is minimal, allowing Redis to maintain high performance because the main Redis process only needs to fork a child process to perform disk IO operations for RDB persistence. The process of generating an RDB file is to write the data in the current memory into the file at a time, while AOF needs to first convert a small amount of data in the current memory into operation instructions, and then write the instructions to the memory cache, and then write them to the disk.
  • Restarting and restoring Redis data directly from RDB data files is much faster than AOF persistence. AOF stores instruction logs. During data recovery, all instruction logs should be played back and executed to recover all data in memory. The RDB, on the other hand, is a data file that can be loaded directly into memory during recovery.

Disadvantages:

  • If you want to lose as little data as possible when Redis fails, RDB is not as good as AOF. RDB snapshot files are generally generated every 5 minutes or more, so if the Redis process goes down you have to accept losing the last 5 minutes of data. This is also the biggest disadvantage of RDB: it is not suitable as the first-choice recovery scheme, because relying on RDB first means losing more data.
  • Every time RDB forks a child process to generate the snapshot data file, if the data set is very large, the service provided to clients may pause for milliseconds or even seconds. Therefore the interval between RDB snapshots should not be too short; otherwise, with a large data set, the frequent forks will repeatedly hurt Redis performance.

Advantages and disadvantages of AOF:

Advantages:

  • AOF can better protect against data loss. Generally, AOF will execute fsync operation every second through a background thread and lose data for a maximum of one second.
  • AOF log files are written in append-only mode, so there is no disk-seek overhead and write performance is very high; the file is also less prone to corruption, and even if the tail of the file is truncated it is easy to repair.
  • Even when the AOF log file grows very large, the background rewrite operation does not affect client reads and writes, because the rewrite compacts the instructions into a minimal log sufficient to rebuild the current data; once the new log file is ready, it is swapped in for the old one.
  • AOF log files record commands in a very readable format, which is ideal for emergency recovery after catastrophic mistaken deletions. For example, if someone accidentally runs flushall and wipes all the data, then as long as the background rewrite has not yet happened, you can copy the AOF file, remove the final flushall command from its tail, and play the file back to restore all the data.

Disadvantages:

  • AOF log files are usually larger than RDB data snapshot files for the same data.
  • The write performance of AOF is lower than that of RDB, because AOF is usually configured to fsync the log file once per second. Fsync at that rate is still quite fast, but if you want to guarantee that no data is lost at all, you can fsync on every write, and then Redis performance drops significantly.
  • The recovery speed based on AOF file is not as fast as that based on RDB file.
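The fsync cadence discussed in the points above is controlled by appendfsync in redis.conf; a typical fragment:

```conf
appendonly yes          # turn AOF on
appendfsync everysec    # fsync once per second: the usual trade-off
# appendfsync always    # fsync on every write: safest, slowest
# appendfsync no        # let the OS decide when to flush: fastest, least safe
```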

So how do we choose Redis persistence in a real project? Here are my personal suggestions:

  • Don't use only RDB, because that would cause you to lose a lot of data.
  • Don't use only AOF either: first, recovery through AOF is slower; second, its reliability is not as good as RDB, since an RDB file stores the actual data at a certain moment, whereas AOF holds only operation instructions, and reconstructing data by replaying instructions may not be 100% reliable.
  • AOF and RDB persistence mechanisms are used in a comprehensive way. AOF is used to ensure that data is not lost as the first choice for data recovery. RDB is used for varying degrees of cold backup and for quick data recovery when AOF files are lost or corrupted and unavailable.

There are many common interview questions about Redis, such as: why is single-threaded Redis so fast? What are the Redis data types and their usage scenarios? What are the high-availability solutions for Redis? You said you used Sentinel, so how do you deal with split-brain? How does Redis implement distributed locking, what problems come up, and how do you solve them? What are cache breakdown, cache penetration, and cache avalanche, and how do you handle them? How do you keep Redis and the DB consistent? Does Redis support transactions, and how exactly do they work? These topics will be covered in the upcoming Redis articles.

Finally, let me share some interview experience. You will usually be asked to introduce yourself first; the self-introduction must be smooth, so practice it in advance and do not stammer. Put the projects you know best first, because the interviewer tends to ask in the order things appear on your resume. For example, if you list multithreading first, you will usually be asked about synchronized, Lock, and the actual use of multithreading in your project (be well prepared for this). If you answer that your distributed locks are implemented with Redis, then the Redis questions I listed above are very likely to follow. If your resume says you are familiar with MySQL, you will be asked about MySQL: the storage engines are a standard question; you say you have used InnoDB and MyISAM, so talk about the differences between the two engines; then InnoDB indexes, clustered versus non-clustered indexes; then slow queries, execution-plan analysis, table-design tips, index-design tips, and SQL-writing tips will surely follow. Once the conversation turns to tuning, the JVM comes up, from the memory model to garbage-collection algorithms, garbage collectors, the class-loading mechanism, memory leaks, and so on. Finally: do you have actual tuning experience for online environments, and what did you do? For example, in a production environment the Full GC occurs hundreds of times while the Minor GC occurs only a few times; or, in production no OOM is reported but the user threads stop executing, and the intuitive symptom is that the application writes no more logs; please analyze the possible causes. These are the most common questions in the Java-basics part of the interview process, and they lead on to framework- and project-related questions.
Talk as much as you can about what you are good at, and don't let the interviewer lead you around, because the interview will only last so long; try to maximize your strengths and minimize your weaknesses. During the interview, try to note down the questions you failed to answer, and make a summary afterwards: figure out why you missed them, whether you genuinely did not know, or expressed yourself unclearly, or were simply too nervous. After attending several interviews you will find that the basic questions are largely the same, you will be much less nervous, and your odds of getting the offer will improve greatly.