This article was first published on Juejin (Nuggets); please do not reprint it. If you find the content useful, feel free to share it or give it a like!


Before writing this article, I open-sourced a Java learning guide that covers Java basics and some back-end (Java-oriented) knowledge. So far it has 6.1k stars and 1.5k forks, with 10 PRs closed and 10 issues. I open-sourced it so that more people can read and contribute, which helps keep the documentation correct and of good quality. My own ability, time, and breadth and depth of knowledge are limited, so a good project cannot succeed without the joint efforts of others.

In addition, I think you can learn something from this article whether you are a front-end or a back-end developer (some parts may lean toward Java).

My technical level is limited, so corrections are welcome. Please forgive any poor writing!


  • Preface
  • How to write a resume
    • 1.1 Why is a resume important?
    • 1.2 Three things you must know
    • 1.3 Understand the two rules
    • 1.4 How to write project experience?
    • 1.5 How to write professional skills?
    • 1.6 Open-source programmer resume templates
    • 1.7 Some other tips
  • Two Summary of common computer network interview questions
    • Review of common computer network questions
    • 2.1 Differences between TCP and UDP
    • 2.2 The process from entering a URL in the browser to displaying the home page
    • 2.3 Relationships between the various protocols and HTTP
    • 2.4 HTTP long and short connections
    • 2.5 TCP three-way handshake and four-way wave
  • Three Linux
    • 3.1 A brief introduction to the Linux file system
    • 3.2 Do you know some common Linux commands?
  • Four MySQL
    • 4.1 My understanding of MySQL's two common storage engines: MyISAM and InnoDB
    • 4.2 Do you know about database indexes?
    • 4.3 Common optimizations for large tables
  • Five Redis
    • 5.1 Introduction to Redis
    • 5.2 Why use Redis / why use caching
    • 5.3 Why use Redis instead of Map/Guava for caching?
    • 5.4 Differences between Redis and Memcached
    • 5.5 Common Redis data structures and their usage scenarios
    • 5.6 Setting expiration times in Redis
    • 5.7 Redis memory eviction mechanism
    • 5.8 Redis persistence mechanism (how to ensure data can be recovered after Redis restarts)
    • 5.9 Solutions to cache avalanche and cache penetration
    • 5.10 How to solve the concurrent key contention problem in Redis
    • 5.11 How to ensure data consistency between the cache and the database in dual-write mode
  • Six Java
    • 6.1 Java basics
    • 6.2 Java Collections Framework
    • 6.3 Java multithreading
    • 6.4 Java Virtual Machine
    • 6.5 Design patterns
  • Seven Data structures
  • Eight Algorithms
    • 8.1 An example (handwritten quicksort)
  • Nine Spring
    • 9.1 Scopes of Spring beans
    • 9.2 Isolation levels in Spring transactions
    • 9.3 Transaction propagation behavior in Spring transactions
    • 9.4 AOP
    • 9.5 IoC
  • Ten Real-world scenario questions
  • Closing words


Whether it is campus recruitment or experienced-hire recruitment, you cannot avoid interviews and written tests, so how you prepare for them is particularly important. There are patterns to both written tests and interviews, and by "patterns" I mean that technical interviews can be prepared for in advance. That said, I strongly dislike and oppose the exam-style approach of memorizing all kinds of questions in advance. I find it extreme, and it is useless in front of an interviewer with even a little experience. It is better to learn step by step.

Plan well in the tent, and you can win the battle a thousand miles away! Rather than going in unprepared, I think you can prepare for an interview from the following aspects:

  1. Your self-introduction. (Don't just say, "My name is xyz, my gender is ..., I'm from ..., I went to school at ..., and I like to do ...". Remember to talk more about what is not on your resume and about what makes you better than others.)
  2. Which knowledge points your interview may involve, and which of them are the key ones.
  3. Which questions are asked frequently, and how to answer them. (Rote memorization is strongly discouraged. First, how much can you remember that way, and for how long? Second, learning by memorization is hard to stick with.)
  4. How to write your resume.

There is some truth to the saying that 80% of the offers are in the hands of 20% of the people. Ability is indeed a large part of what decides whether you pass an interview, but if your mindset is poor or you are unlucky, you may still not get a satisfactory offer. Luck aside, on mindset: do not be discouraged by a failed interview or doubt your own ability. After each failed interview, reflect on why it failed, and you will find yourself getting stronger and stronger.

In addition, it’s important for you to be clear:

  1. Be careful what you put on your resume. This may be where the interviewer asks a lot of questions.
  2. Most fresh graduates looking for a job have no work experience or internship experience;
  3. It is important to present your project experience well.

The author's ability is limited. If anything here is wrong or differs from your view, please bear with me and feel free to point it out.

If you want to know more about me, you can follow me on GitHub or my WeChat official account: "Java Interview Handbook".

How to write a resume

As the saying goes, "To do a good job, one must first sharpen one's tools." Preparing a good resume plays a vital role in finding a good job.

1.1 Why is a resume important?

If you apply online, your resume will be screened by HR, who may spend as little as 10 seconds on each resume before deciding whether you pass or fail.

If you come in through an internal referral and your resume has no strengths, there is nothing the person referring you can do, no matter how hard they try.

In addition, even if you pass the screening, the interviewer will use your resume to decide whether you are worth their time.

1.2 Three things you must know

  1. Most fresh graduates looking for a job have no work experience or internship experience;
  2. Be careful what you put on your resume. This may be where the interviewer asks a lot of questions.
  3. It is important to present your project experience well.

1.3 Understand the two rules

There are two widely accepted ways to write a resume: STAR and FAB.

STAR rule (Situation, Task, Action, Result):

  • Situation: the situation in which the event took place;
  • Task: how you defined your task;
  • Action: what action you took in response to the situation;
  • Result: what the outcome was and what you learned from it.

FAB rule (Feature, Advantage, Benefit):

  • Feature: what you are (your characteristics);
  • Advantage: what advantages you have over others;
  • Benefit: what a recruiter will get if they hire you.

1.4 How to write project experience?

It is normal to have one or two projects on your resume, but few of them really stand out to an interviewer. When writing project experience, consider the following points:

  1. An overview of the project's overall design
  2. What you were responsible for, what you did, and what role you played in the project
  3. What you learned from the project, which technologies you used, and which new technologies you applied
  4. In addition, the project description should ideally reflect your overall qualities, such as how you coordinated team members to develop together or how you solved a thorny problem you ran into.

1.5 How to write professional skills?

Ask yourself what you know, and then see what the company needs. The average HR person may not be very technical, so he or she may be looking for keywords in your specific skills. If you don’t have the skills that the company requires, you can spend a few days learning them and then include them on your resume. For example, you could write:

  • Dubbo: expert
  • Spring: expert
  • Docker: proficient
  • SOA distributed development: proficient
  • Spring Cloud: basic understanding

1.6 Open source programmer resume template sharing

Here is an open-source collection of resume templates for programmers on GitHub. It includes templates for PHP, iOS, Android, web front-end, Java, C/C++, and NodeJS programmers, as well as an architect template and a general programmer template. Github:…

To learn how to write a high quality resume with Markdown, see…

1.7 Some other tips

  1. Avoid subjective statements and vague adjectives as much as possible; keep the writing concise and clear, with a logical structure.
  2. Pay attention to layout (it does not need to be colorful) and use Markdown syntax where possible.
  3. If you have a blog or a personal technical site, it will be a plus.
  4. If you are active on GitHub, that will also earn you extra points.
  5. Do not write anything you don't know or anything deceptive.
  6. List project experience in reverse chronological order; what matters is highlights, not quantity.
  7. If there is a lot of content, you don't need to compress it onto one page. Just keep the layout clean and tidy.
  8. A good way to end your resume is: "Thank you for taking the time to read my resume, and I look forward to working with you in the future." This sentence shows that you are polite.

Two Summary of common computer network interview questions

Review of common computer network questions

  • TCP three-way handshake and four-way wave
  • The process from entering a URL in the browser to displaying the home page
  • How does TCP ensure reliable transmission
  • Differences between HTTP and HTTPS
  • The difference between TCP and UDP
  • Common status codes.

Here are some answers to common questions!

2.1 Differences between TCP and UDP

UDP does not need to establish a connection before transmitting data, and the remote host does not need to send any acknowledgement after receiving a UDP datagram. Although UDP does not provide reliable delivery, in some situations it is the most efficient way to work (typically real-time communication), such as QQ voice, QQ video, and live streaming.

TCP provides a connection-oriented service. A connection must be established before data transfer and released afterwards. TCP does not provide broadcast or multicast services. Because TCP has to provide reliable, connection-oriented transport, it inevitably adds a lot of overhead, such as acknowledgements, flow control, timers, and connection management. (Its reliability shows in the three-way handshake that establishes the connection before any data is sent; in the acknowledgement, window, retransmission, and congestion-control mechanisms used during the transfer; and in releasing the connection after the transfer to save system resources.) This not only makes the header of the protocol data unit much larger, but also consumes a lot of processor resources. TCP is typically used for file transfer, sending and receiving mail, and remote login.
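As a rough illustration of the difference, the sketch below sends one message over each protocol on the loopback interface using the standard java.net classes. The class name TcpVsUdpSketch and its two helper methods are made up for this example, and it assumes a machine where loopback networking is available; it is not a benchmark or a production pattern.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one message over UDP and one over TCP, both on loopback.
class TcpVsUdpSketch {

    // UDP: no connection is set up; the datagram is simply addressed and sent.
    static String udpRoundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // UDP gives no delivery guarantee, so do not wait forever
            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(payload, payload.length, loopback, receiver.getLocalPort()));
            byte[] buffer = new byte[1024];
            DatagramPacket received = new DatagramPacket(buffer, buffer.length);
            receiver.receive(received);
            return new String(received.getData(), 0, received.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // TCP: constructing the client Socket connects first (three-way handshake), then data flows.
    static String tcpRoundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            client.getOutputStream().write(message.getBytes(StandardCharsets.UTF_8));
            client.shutdownOutput(); // half-close so the reader sees end-of-stream
            return new String(accepted.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("UDP received: " + udpRoundTrip("ping"));
        System.out.println("TCP received: " + tcpRoundTrip("hello"));
    }
}
```

Note that the UDP path needs a receive timeout because nothing tells the sender whether the datagram arrived, while the TCP path gets connection setup, ordering, and end-of-stream signalling from the protocol itself.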

2.2 The process of entering the URL address in the Browser ->> display the home page

Baidu seems to like asking that question the most.

Which protocols are used during the whole process of opening a web page?

Image source: Illustrated HTTP

In general, it can be divided into the following processes:

  1. DNS resolution
  2. TCP connection establishment
  3. Sending the HTTP request
  4. The server processes the request and returns an HTTP response
  5. The browser parses and renders the page
  6. The connection ends

For details, refer to the following article:
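To make the first three steps a little more concrete, here is a small sketch that takes a URL and derives the host to resolve (step 1), the port to connect to (step 2), and the text of a minimal GET request (step 3). The class and method names are invented for illustration, and the request is deliberately simplified (for example, the Host header omits non-default ports); a real browser sends many more headers.

```java
import java.net.URI;

// Hypothetical helper: derives what a client needs for DNS, TCP connect, and the HTTP request.
class UrlToRequestSketch {

    // Step 2: the explicit port if present, otherwise the default for the scheme.
    static int portOf(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // Step 3: a minimal HTTP/1.1 GET request for the URL's path and query.
    static String buildGetRequest(URI uri) {
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // an empty path means the site root
        }
        if (uri.getRawQuery() != null) {
            path += "?" + uri.getRawQuery();
        }
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + uri.getHost() + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://www.example.com/index.html");
        System.out.println("Host to resolve via DNS: " + uri.getHost());
        System.out.println("TCP port to connect to: " + portOf(uri));
        System.out.print(buildGetRequest(uri));
    }
}
```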


2.3 Relationships between Various Protocols and HTTP

Interviewers usually ask questions like this to test your understanding of the computer network knowledge system.

Image source: Illustrated HTTP

2.4 HTTP long and short connections

Short connections are used by default in HTTP/1.0. In other words, each time the client and server perform an HTTP operation, a connection is established, and it is closed when the task is complete. When the client browser accesses an HTML page or another type of web page that contains other web resources (such as JavaScript files, image files, and CSS files), the browser establishes a new HTTP session each time it encounters such a resource.

Starting with HTTP/1.1, persistent (long) connections are used by default to keep the connection alive. When HTTP uses a persistent connection, this line is added to the response header:

Connection: keep-alive

With a persistent connection, the TCP connection used to transfer HTTP data between the client and the server is not closed after a web page has been opened. When the client accesses the server again, it continues to use the established connection. Keep-Alive does not keep the connection open forever; it has a hold time, which can be set in the server software (such as Apache). To implement persistent connections, both the client and the server need to support them.

The long and short connections of HTTP protocol are essentially the long and short connections of TCP protocol.

— What are HTTP Long and Short Connections?
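The defaults described above can be summed up in a few lines. The following sketch decides whether a connection stays open from the protocol version and the value of the Connection header; the class name and method are hypothetical, and real servers also apply timeouts and other limits that are not modeled here.

```java
// Hypothetical helper that mirrors the defaults: HTTP/1.0 closes, HTTP/1.1 keeps the connection.
class KeepAliveSketch {

    // connectionHeader is the value of the Connection header, or null if the header is absent.
    static boolean isPersistent(String httpVersion, String connectionHeader) {
        if (connectionHeader != null) {
            String value = connectionHeader.trim();
            if (value.equalsIgnoreCase("close")) {
                return false; // either side may ask to close explicitly
            }
            if (value.equalsIgnoreCase("keep-alive")) {
                return true; // explicit opt-in, needed on HTTP/1.0
            }
        }
        // No explicit header: fall back to the protocol default.
        return "HTTP/1.1".equals(httpVersion);
    }

    public static void main(String[] args) {
        System.out.println(isPersistent("HTTP/1.0", null));
        System.out.println(isPersistent("HTTP/1.1", null));
        System.out.println(isPersistent("HTTP/1.1", "close"));
    }
}
```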

2.5 TCP three-way handshake and four-way wave (a frequent interview topic)

To deliver data to the target accurately, TCP uses a three-way handshake strategy.

Graphic illustration:

Image source: Illustrated HTTP

Simple schematic:

  • Client – sends a packet with the SYN flag – first handshake – Server
  • Server – sends a packet with the SYN/ACK flags – second handshake – Client
  • Client – sends a packet with the ACK flag – third handshake – Server

Why three handshakes?

The purpose of the three-way handshake is to establish a reliable communication channel. Speaking of communication, it is simply the sending and receiving of data. The main purpose of the three-way handshake is to confirm that the sending and receiving of each other is normal.

First handshake: the Client cannot confirm anything; the Server confirms that the peer's sending and its own receiving are normal.

Second handshake: the Client confirms that its own sending and receiving are normal and that the peer's sending and receiving are normal; the Server still only knows that its own receiving and the peer's sending are normal.

Third handshake: the Client confirms that its own sending and receiving are normal and that the peer's sending and receiving are normal; the Server confirms that its own sending and receiving are normal and that the peer's sending and receiving are normal.

So three handshakes are enough to confirm that sending and receiving work on both sides, and none of the three can be omitted.
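The reasoning above can be written down as a tiny model of what each side has confirmed after each handshake. This is only a bookkeeping sketch with made-up names (HandshakeKnowledgeSketch, Fact); it does not open any sockets.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model: which facts each side has confirmed after N completed handshakes.
class HandshakeKnowledgeSketch {

    enum Fact { OWN_SEND_OK, OWN_RECEIVE_OK, PEER_SEND_OK, PEER_RECEIVE_OK }

    // The client learns everything at once when the SYN/ACK (second handshake) arrives:
    // its SYN got through, and the server was able to send a reply back.
    static Set<Fact> clientKnows(int handshakes) {
        return handshakes >= 2 ? EnumSet.allOf(Fact.class) : EnumSet.noneOf(Fact.class);
    }

    // The server learns "peer can send, I can receive" from the SYN (first handshake),
    // and only learns that its own sending works when the final ACK (third handshake) arrives.
    static Set<Fact> serverKnows(int handshakes) {
        if (handshakes >= 3) {
            return EnumSet.allOf(Fact.class);
        }
        if (handshakes >= 1) {
            return EnumSet.of(Fact.PEER_SEND_OK, Fact.OWN_RECEIVE_OK);
        }
        return EnumSet.noneOf(Fact.class);
    }

    // True once both sides have confirmed all four facts.
    static boolean bothSidesVerified(int handshakes) {
        return clientKnows(handshakes).size() == Fact.values().length
                && serverKnows(handshakes).size() == Fact.values().length;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            System.out.println("after handshake " + n + ": client=" + clientKnows(n)
                    + " server=" + serverKnows(n));
        }
    }
}
```

In this model two handshakes leave the server unsure whether its own packets reach the client, which is why the third one is needed.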

Why is the SYN sent back?

The receiver sends the SYN back to the sender to tell the sender that the message it received is indeed the one the sender sent.

SYN is the handshake signal used by TCP/IP to establish a connection. When a normal TCP connection is established between the client and the server, the client first sends a SYN message, the server responds with a SYN-ACK to indicate that it received the message, and the client finally responds with an ACK (Acknowledgement: in data communication, a transmission control character sent from the receiving station to the transmitting station to confirm that the data sent has been received correctly). In this way, a reliable TCP connection can be established between the client and the server, and data can then be passed between them.

If the SYN is sent back, why is an ACK also needed?

Communication between two parties is only correct if the messages sent in both directions are correct. Sending back the SYN proves that the channel from the sender to the receiver works, but the channel from the receiver to the sender still needs an ACK signal to verify it.

To close a TCP connection, a "four-way wave" is required:

  • Client – sends a FIN to shut down data transfer from the client to the server
  • Server – on receiving the FIN, sends back an ACK whose acknowledgement number is the received sequence number plus 1. Like a SYN, a FIN consumes one sequence number
  • Server – closes its connection to the client and sends a FIN to the client
  • Client – sends back an ACK for acknowledgement and sets the acknowledgement number to the received sequence number plus 1
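The four steps and their "sequence number plus 1" acknowledgements can be sketched as follows. The class name and the starting sequence numbers passed in are arbitrary values chosen for illustration, not anything a real TCP stack would expose in this form.

```java
import java.util.List;

// Hypothetical trace of the four-way wave; clientSeq and serverSeq are made-up FIN sequence numbers.
class FourWaveSketch {

    static List<String> closeSequence(long clientSeq, long serverSeq) {
        return List.of(
                // 1. The client has no more data to send.
                "client -> server: FIN seq=" + clientSeq,
                // 2. The server acknowledges the FIN: ack = received seq + 1.
                "server -> client: ACK ack=" + (clientSeq + 1),
                // 3. The server has finished sending too and closes its side.
                "server -> client: FIN seq=" + serverSeq,
                // 4. The client acknowledges the server's FIN: ack = received seq + 1.
                "client -> server: ACK ack=" + (serverSeq + 1));
    }

    public static void main(String[] args) {
        closeSequence(100, 300).forEach(System.out::println);
    }
}
```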

Why four waves?

Either party can send a connection-release notification once it has finished sending data, and it enters the half-closed state after the other party acknowledges the notification. When the other party also has no more data to send, it sends its own connection-release notification, and once that has been acknowledged the TCP connection is completely closed.

Here is an example: A and B are on a phone call. As the call is coming to an end, A says, "I have nothing more to say," and B replies, "I know." But B may still have something to say, and A cannot ask B to end the call at A's own pace, so B may go on talking for a while. Finally B says, "I'm done," and A answers, "I know." Only then does the call end.

The above is fairly general; here is a more detailed article:


Three Linux

3.1 A brief introduction to the Linux file system

Introduction to the Linux file system

In the Linux operating system, all resources managed by the operating system, such as network interface cards, disk drives, printers, I/O devices, ordinary files, and directories, are treated as files.

In other words, there is an important concept in Linux: everything is a file. This reflects the UNIX philosophy, and because Linux is a rewrite of UNIX, the concept carried over. On UNIX systems, all resources are treated as files, including hardware devices. UNIX treats each piece of hardware as a file, usually called a device file, so that users can access the hardware by reading and writing files.

File types and directory structures

Linux supports five file types:

The Linux directory structure is as follows:

The Linux file system is layered, like an upside-down tree, with the root directory at the top:

Common directory description:

  • /bin: stores binary executable files (ls, cat, mkdir, etc.); commonly used commands are usually here;
  • /etc: stores system management and configuration files;
  • /home: the root directory for all users' files and the base of each user's home directory. For example, the home directory of the user named user is /home/user, which can be written as ~user;
  • /usr: stores system applications;
  • /opt: the location where additional, optional application packages are installed. In general, we can install tomcat and similar software here;
  • /proc: a virtual file system directory that is a mapping of system memory. It can be accessed directly to obtain system information;
  • /root: the home directory of the superuser (system administrator);
  • /sbin: stores binary executable files that only root can access. System-level administrative commands and programs used by the system administrator, such as ifconfig, are stored here;
  • /dev: stores device files;
  • /mnt: the mount point where system administrators mount temporary file systems. The system provides this directory so that users can temporarily mount other file systems;
  • /boot: stores the files used for system boot;
  • /lib: stores the library files needed to run the system;
  • /tmp: stores all kinds of temporary files; it is a public temporary file storage area;
  • /var: stores data that changes at runtime; it is also the overflow area for some large files, such as the log files of various services (system startup logs and so on);
  • /lost+found: usually empty; after an unexpected shutdown, "homeless" files (the equivalent of .chk files on Windows) are left here.

3.2 Do you know some Common Linux Commands?

Directory switching commands

  • cd usr: switch to the usr directory under the current directory
  • cd .. (or cd ../): switch to the parent directory
  • cd /: switch to the system root directory
  • cd ~: switch to the user's home directory
  • cd -: switch back to the previous directory

Directory operation command (add, delete, modify and check)

  1. mkdir directory-name: create a directory (add)

  2. ls or ll (ll is an alias for ls -l; it shows detailed information about all directories and files in the directory): view directory information

  3. find directory parameters: search for a directory (search)

  4. mv directory-name new-directory-name: rename a directory (change)

    Note: mv can rename not only directories but also files, archives, and so on. The mv command can rename a file or directory, or move a file from one directory to another. The other use of mv is described next.

  5. mv directory-name new-location: move a directory to a new location, i.e. cut (change)

    Note: mv can cut not only directories but also files and archives. Also, mv and cp give different results: mv "moves" a file, so the number of files does not increase, while cp copies a file, so the number of files increases.

  6. cp -r directory-name destination: copy a directory (change); -r means recursive copy

    Note: cp can copy not only directories but also files, archives, and so on. When copying files and archives, the -r flag is not needed.

  7. rm [-rf] directory: delete a directory (delete)

    Note: rm can delete not only directories but also files and archives. To make it easier to remember, you can use rm -rf directory/file/archive for any directory or file.

File operation commands (add, delete, modify, and check)

  1. touch file-name: create a file (add)

  2. cat/more/less/tail file-name: view a file (check)

    • cat: prints the whole file, so only the last screenful remains visible
    • more: shows the percentage read; press Enter for the next line and the space bar for the next page
    • less: use PgUp and PgDn on the keyboard to page up and down, and q to quit
    • tail -10: view the last 10 lines of the file; press Ctrl+C to end

    Note: the tail -f command can be used to monitor a file dynamically, such as a tomcat log file. Logs change as the program runs, so you can run tail -f catalina-2016-11-11.log to watch the file change.

  3. vim file: modify the file content (change)

    The vim editor is a powerful tool on Linux and an enhanced version of the vi editor. It has many commands and shortcuts, which are not described here. You do not need to study it in depth; being able to edit and modify a file with vim in the basic way is enough.

    In actual development, the vim editor is mainly used to modify configuration files. The general steps are:

    vim file -> enter the file in command mode -> press i to enter edit mode -> edit the file -> press Esc to leave edit mode -> type :wq or :q! (:wq writes the content and exits, i.e. saves; :q! forces an exit without saving)

  4. rm -rf file: delete a file (delete)

    As with deleting directories, just remember rm -rf file

Commands for compressing files

1) Package and compress files:

In Linux, packaged (archive) files usually end in .tar, and gzip-compressed files end in .gz.

Packaging and compression are usually done together, and the resulting file name usually ends in .tar.gz. Command: tar -zcvf archive-name files-to-archive

z: compress with gzip

c: package (create an archive)

v: show the progress

f: specify the file name

For example, suppose the test directory contains three files: aa.txt, bbb.txt, and ccc.txt. To package the test directory and name the archive test.tar.gz, run: tar -zcvf test.tar.gz aa.txt bbb.txt ccc.txt or tar -zcvf test.tar.gz /test/

2) Extract an archive:

Command: tar [-xvf] archive-file

x: extract


1 To extract test.tar.gz in /test into the current directory, run tar -xvf test.tar.gz

2 To extract an archive in /test into /usr under the root directory, run tar -xvf xxx.tar.gz -C /usr (-C specifies the target directory for extraction; note the capital C, since lowercase -c means create)

Other Common Commands

  • pwd: displays the current location

  • grep string-to-search file-to-search --color: search command; --color highlights the matches

  • ps -ef / ps aux: both commands show the processes currently running on the system; they differ only in display format. To look for a specific process, use this form: ps aux | grep redis (shows processes whose line contains the string redis)

    Note: ps (Process Status) with these options lists the status of all processes, so it is usually combined with grep to check the status of a particular process.

  • kill -9 PID: kills the process with that PID (-9 forcibly terminates the process)

    Use ps to find the process first, and then kill it with kill

  • Network communication commands:

    • Run ifconfig to view the NIC information of the current system
    • Run ping to check the connection to a given machine
    • Run netstat -an to view the ports in use on the current system
  • shutdown: shutdown -h now shuts down immediately; shutdown +5 "System will shutdown after 5 minutes" shuts down after 5 minutes and sends the warning message to logged-in users.

  • reboot: reboot restarts the system; reboot -w simulates a restart (it only writes the record and does not actually reboot).

Four MySQL

4.1 Talk about my understanding of two common storage engines of MySQL: MyISAM and InnoDB

Comparison and summary of the two:

  1. COUNT operations: because MyISAM caches table metadata (number of rows, etc.), COUNT(*) on a well-structured query costs very little. InnoDB has no such cache.
  2. Support for transactions and crash recovery: MyISAM emphasizes performance. Each query is atomic and it runs several times faster than InnoDB, but it does not provide transaction support. InnoDB provides transactions, foreign keys, and other advanced database features: transaction-safe (ACID-compliant) tables with commit, rollback, and crash-recovery capabilities.
  3. Support for foreign keys: MyISAM does not support foreign keys; InnoDB does.

MyISAM is better suited to read-intensive tables, and InnoDB is better suited to write-intensive tables. When the master and slave databases are separated, MyISAM is often chosen as the storage engine for the master database. In general, InnoDB is a good choice if you need transaction support and have a high frequency of concurrent reads (MyISAM's table locks are too coarse-grained, so when there are many concurrent writes to a table, many queries have to wait). If you have a large amount of data (MyISAM supports compression to reduce the disk footprint) and you do not need transactions, MyISAM is the best choice.

4.2 Do you know the Database Index?

The main data structures used by MySQL indexes are the BTree index and the hash index. The underlying data structure of a hash index is a hash table, so for the vast majority of single-record lookups a hash index gives the fastest query performance. For most other scenarios, a BTree index is recommended.

MySQL's BTree index uses the B+Tree variant of the B-tree, but the two main storage engines implement it differently.

  • MyISAM: the data field of a B+Tree leaf node stores the address of the data record. During index retrieval, the B+Tree search algorithm is first used to search the index. If the specified key exists, the value of its data field is fetched, and the corresponding data record is then read using that value as the address. This is called a "non-clustered index".
  • InnoDB: the data file itself is the index file. In MyISAM the index file and the data file are separate, whereas an InnoDB table data file is itself an index structure organized as a B+Tree, and the data field of each leaf node holds the complete data record. The key of this index is the primary key of the table, so the InnoDB table data file is itself the primary index. This is called a "clustered index". All other indexes are secondary (non-clustered) indexes. The data field of a secondary index stores the value of the corresponding primary key rather than an address, which is another difference from MyISAM. When searching by the primary index, the data can be retrieved directly from the node where the key is found. When searching by a secondary index, you first fetch the primary key value and then look it up in the primary index. Therefore, when designing a table, it is not recommended to use very long fields as the primary key, nor to use non-monotonic fields as the primary key, because that causes frequent splitting of the primary index. PS: From the "Java Engineer Training"
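The two lookup paths can be contrasted with a small in-memory sketch. Here a TreeMap stands in for a B+Tree and an array stands in for the MyISAM data file; the class name, the sample rows, and the helper methods are all invented for illustration and say nothing about how MySQL actually lays out its pages.

```java
import java.util.TreeMap;

// Hypothetical model of the two index layouts; TreeMap is only a stand-in for a B+Tree.
class IndexLookupSketch {

    // MyISAM style: the index leaf holds the record's address in a separate data file.
    static String myisamLookup(TreeMap<String, Integer> index, String[] dataFile, String key) {
        Integer address = index.get(key);
        return address == null ? null : dataFile[address];
    }

    // InnoDB style: the secondary index leaf holds the primary key,
    // and the clustered (primary) index leaf holds the complete row.
    static String innodbSecondaryLookup(TreeMap<String, Integer> secondary,
                                        TreeMap<Integer, String> clustered, String key) {
        Integer primaryKey = secondary.get(key);                    // first lookup: secondary index
        return primaryKey == null ? null : clustered.get(primaryKey); // second lookup: primary index
    }

    // Sample data: rows "id,name", looked up by name.
    static String demoMyisam(String name) {
        String[] dataFile = {"1,alice", "2,bob"};
        TreeMap<String, Integer> nameIndex = new TreeMap<>();
        nameIndex.put("alice", 0); // value is an address (array offset here)
        nameIndex.put("bob", 1);
        return myisamLookup(nameIndex, dataFile, name);
    }

    static String demoInnodb(String name) {
        TreeMap<Integer, String> clustered = new TreeMap<>();
        clustered.put(1, "1,alice");
        clustered.put(2, "2,bob");
        TreeMap<String, Integer> nameIndex = new TreeMap<>();
        nameIndex.put("alice", 1); // value is the primary key, not an address
        nameIndex.put("bob", 2);
        return innodbSecondaryLookup(nameIndex, clustered, name);
    }

    public static void main(String[] args) {
        System.out.println(demoMyisam("bob"));
        System.out.println(demoInnodb("bob"));
    }
}
```

The extra hop through the primary index in the InnoDB-style path is why the text advises against long primary keys: every secondary index entry carries a copy of the key.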

In addition, I would like to recommend some good articles about indexing:

  • juejin.im/post/684490…

4.3 For the common optimization of large tables

When a single MySQL table holds too many records, the CRUD performance of the database degrades significantly. Some common optimization measures are as follows:

  1. Limit the data range: be sure to disallow queries without any condition that limits the data range. For example, when users query their order history, we can limit the range to one month;

  2. Read/write separation: the classic database split scheme, in which the primary database handles writes and the secondary databases handle reads;

  3. Cache: use MySQL's cache, and consider an application-level cache for heavy, rarely updated data;

  4. Vertical partition:

    Split according to how closely the tables in the database are related. For example, if the user table contains both login information and basic user information, you can split it into two separate tables, or even put them into separate databases.

    In simple terms, vertical splitting means splitting a table by columns: a table with many columns is split into several tables. The figure below should make this easier to understand.

    Advantages of vertical splitting: rows become smaller, fewer blocks are read per query, and the number of I/O operations goes down. In addition, vertical partitioning simplifies the table structure and makes it easier to maintain.

    Disadvantages of vertical splitting: the primary key becomes redundant, the redundant columns need to be managed, and Join operations are introduced (this can be solved by joining at the application layer). In addition, vertical partitioning makes transactions more complex;

  5. Horizontal partition:

    Keep the table structure unchanged and spread the rows across shards according to some policy. Each shard of data ends up in a different table or database, which achieves distribution. Horizontal splitting can support very large amounts of data.

    Horizontal splitting splits a table by rows. Once a table grows past roughly 2 million rows, it starts to slow down. At that point the data in one table can be split into multiple tables. For example, we can split the user information table into several user information tables to avoid the performance impact of too much data in a single table.

    One thing to note: splitting into multiple tables only solves the problem of a single table being too large. If the tables still live on the same machine, it does little for MySQL's concurrency capacity, so horizontal splitting is best done across databases.

    Horizontal splitting supports very large data volumes and requires relatively few changes on the application side, but cross-shard transactions are hard to solve, cross-shard Joins perform poorly, and the logic is complex. The author of "The Way of Java Engineers" recommends avoiding sharding where possible, because it adds complexity to logic, deployment, and operations, and a properly optimized ordinary table can handle data volumes below 10 million rows without much trouble. If sharding is necessary, prefer a client-side sharding architecture, which saves one network hop to the middleware.

    Here are two common schemes for database sharding:

    • Client proxy: The sharding logic lives on the application side, packaged in a JAR, and is implemented by modifying or wrapping the JDBC layer. Dangdang's Sharding-JDBC and Alibaba's TDDL are two commonly used implementations.
    • Middleware proxy: A proxy layer is added between the application and the data, and the sharding logic is maintained centrally in the middleware service. Mycat, 360's Atlas, and NetEase's DDB are implementations of this architecture.
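As a rough idea of what client-side routing does, here is a minimal sketch that maps a user id to a database and table by modulo. This is not how Sharding-JDBC or TDDL are implemented; the database count, table count, and naming scheme are assumptions made up for the example.

```java
// Minimal sketch of client-side shard routing by modulo.
// DB_COUNT, TABLES_PER_DB, and the naming scheme are hypothetical.
class ShardRouterSketch {
    static final int DB_COUNT = 2;       // assumed number of databases
    static final int TABLES_PER_DB = 4;  // assumed number of tables per database

    // Returns "database.table" for the given user id.
    static String route(long userId) {
        // floorMod keeps the slot non-negative even for negative ids.
        int slot = (int) Math.floorMod(userId, (long) (DB_COUNT * TABLES_PER_DB));
        int db = slot / TABLES_PER_DB;
        int table = slot % TABLES_PER_DB;
        return "user_db_" + db + ".user_" + table;
    }

    public static void main(String[] args) {
        System.out.println(route(10L)); // user_db_0.user_2
        System.out.println(route(13L)); // user_db_1.user_1
    }
}
```

Modulo routing is simple, but changing the number of shards later moves most rows, which is one of the operational costs the text warns about.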

Five Redis

Eleven must-know questions about Redis! The last two are not finished yet. If you need them, follow my Github or WeChat official account "Java Interview Clearance Manual" for updates.

  1. Introduction to Redis
  2. Why use Redis / why use a cache
  3. Why use Redis instead of Map/Guava for caching?
  4. Differences between Redis and memcached
  5. Common Redis data structures and their usage scenarios
  6. Setting an expiration time in Redis
  7. The Redis memory eviction mechanism
  8. The Redis persistence mechanism (how to make sure data can be recovered if Redis goes down)
  9. Solutions to cache avalanche and cache penetration
  10. How to solve concurrent contention for a key in Redis
  11. How to keep the cache and the database consistent when writing to both

5.1 Introduction to Redis

In short, Redis is a database, but unlike traditional databases it keeps its data in memory, so reads and writes are very fast, and Redis is therefore widely used as a cache. Redis is also often used for distributed locks. It provides several data types to support different business scenarios, and it supports transactions, persistence, Lua scripting, LRU eviction, and a variety of clustering solutions.

5.2 Why use redis/Why use caching

Look at this from two angles: "high performance" and "high concurrency."

High performance:

Suppose a user accesses some data in the database for the first time. This can be slow because the data is read from disk. If we store that data in the cache, the next access can fetch it straight from the cache. Operating on the cache means operating on memory, so it is very fast. If the data changes in the database, the corresponding entry in the cache just needs to be updated as well.

High concurrency:

A cache can handle far more requests than the database can, so we can move part of the data from the database into the cache. Part of the users' requests then go straight to the cache without touching the database.
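The read path described above is often called cache-aside. Here is a minimal single-threaded sketch of it. A ConcurrentHashMap stands in for Redis and a plain function stands in for the slow database read; both, along with all the names, are assumptions for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside read path. The map stands in for Redis and the function
// stands in for a slow database query; all names are hypothetical.
class CacheAsideSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> database;
    private int databaseReads = 0; // counts how often we fall through to the "database"

    CacheAsideSketch(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;            // cache hit: served from memory
        }
        databaseReads++;              // cache miss: go to the database
        String value = database.apply(key);
        if (value != null) {
            cache.put(key, value);    // populate the cache for the next reader
        }
        return value;
    }

    // Call this after the database row changes so the next read reloads it.
    void invalidate(String key) {
        cache.remove(key);
    }

    int databaseReads() {
        return databaseReads;
    }

    public static void main(String[] args) {
        CacheAsideSketch users = new CacheAsideSketch(key -> "profile-of-" + key);
        users.get("42");
        users.get("42");
        System.out.println(users.databaseReads()); // 1
    }
}
```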

5.3 Why use Redis instead of Map/Guava for caching?

The following is adapted from a question by a SegmentFault user, address:…

Caches are divided into local caches and distributed caches. In Java, for example, a local cache can be implemented with the built-in Map or with Guava. Its main features are that it is lightweight and fast, but its life cycle ends when the JVM exits, and when there are multiple instances each one keeps its own copy of the cache, so the copies are not consistent with each other.

Using something like Redis or memcached is called a distributed cache. With multiple instances, all of them share the same cached data, so the cache is consistent. The downside is that the Redis or memcached service has to be kept highly available, which makes the overall architecture more complex.

5.4 Differences between Redis and memcached

I have summarized four differences between Redis and memcached. Companies now generally use Redis for caching, and Redis itself keeps getting more powerful!

  1. Redis supports richer data types (and so more complex application scenarios): besides simple K/V data, Redis also provides list, set, zset, hash and other data structures. Memcached supports only a simple String type.
  2. Redis supports data persistence: it can save in-memory data to disk and reload it on restart, while memcached keeps all data in memory only.
  3. Cluster mode: memcached has no native cluster mode and relies on the client to spread data across the nodes. Redis supports cluster mode natively.
  4. Memcached uses a multi-threaded, non-blocking I/O multiplexing network model; Redis uses a single-threaded I/O multiplexing model.


5.5 Analysis of Common Redis data structures and Usage Scenarios

1. String

Common commands: set,get,decr,incr,mget, etc.

The String data structure is a simple key-value type, and the value can be a number as well as a string. Typical uses: ordinary key-value caching, and counters such as the number of posts or the number of followers.

2. Hash

Common commands: hget, hset, hgetall, etc.

A hash is a mapping table of string fields and values. Hashes are particularly useful for storing objects, because later on you can change the value of a single field of the object. For example, we can use the hash data structure to store user information, product information, and so on. Here I use the hash type to store some information about myself:

key=JavaUser293847
value={ "id": 1, "name": "SnailClimb", "age": 22, "location": "Wuhan, Hubei" }


3. List

Common commands: lpush, rpush, lpop, rpop, lrange, etc.

A list is a linked list. The Redis list has many application scenarios and is one of the most important data structures in Redis. For example, a microblog's following list, follower list, and message list can all be implemented with the Redis list structure.

The Redis list is implemented as a doubly linked list, so it supports reverse lookup and traversal, which is convenient but costs some extra memory.

You can also use the lrange command, which reads a given number of elements starting from a given element, to implement paging on top of a list. This is a great feature: it gives you simple, high-performance paging, such as the pull-down, page-by-page loading you see on Twitter.
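To show how a page number turns into lrange arguments, here is a small sketch. It does not talk to Redis: lrangeBounds computes the start and stop indexes, and the lrange helper imitates the command on an in-memory list for non-negative indexes only. Both helper names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Paging arithmetic for LRANGE-style reads. No Redis client is used;
// the helpers below are hypothetical stand-ins.
class ListPagingSketch {
    // Returns {start, stop} for a 1-based page number.
    // Both ends are inclusive, as with the LRANGE command.
    static long[] lrangeBounds(int page, int pageSize) {
        long start = (long) (page - 1) * pageSize;
        return new long[] {start, start + pageSize - 1};
    }

    // In-memory imitation of LRANGE for non-negative indexes.
    static <T> List<T> lrange(List<T> list, long start, long stop) {
        if (start >= list.size() || start > stop) {
            return new ArrayList<>();
        }
        int end = (int) Math.min(stop, list.size() - 1);
        return new ArrayList<>(list.subList((int) start, end + 1));
    }

    public static void main(String[] args) {
        List<Integer> feed = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            feed.add(i);
        }
        long[] bounds = lrangeBounds(3, 10);                   // page 3, 10 items per page
        System.out.println(bounds[0] + " " + bounds[1]);       // 20 29
        System.out.println(lrange(feed, bounds[0], bounds[1])); // [20, 21, 22, 23, 24]
    }
}
```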


4. Set

Common commands: sadd, spop, smembers, sunion, etc.

A set offers functionality similar to a list, except that a set removes duplicates automatically.

A set is a good choice when you need to store a collection of data without duplicates. It also provides an important operation that a list does not: checking whether a member is in the set. Intersection, union, and difference are all easy to compute with sets.

For example, in a microblog application you can store all the accounts a user follows in one set and all of the user's followers in another. Redis then makes features such as common follows, common followers, and common interests very easy to implement. Each of these is just an intersection, and the command looks like this:

sinterstore key1 key2 key3    # store the intersection of key2 and key3 in key1
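The same "common follows" idea can be tried out in plain Java, with HashSet standing in for the Redis set. This is only an analogy to show the operation; the user names are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Set intersection in plain Java, as an analogy for SINTERSTORE.
class CommonFollowSketch {
    // Returns a new set and leaves both inputs untouched.
    static Set<String> intersect(Set<String> first, Set<String> second) {
        Set<String> result = new HashSet<>(first);
        result.retainAll(second);
        return result;
    }

    public static void main(String[] args) {
        Set<String> aliceFollows = new HashSet<>(Arrays.asList("carol", "dave", "erin"));
        Set<String> bobFollows = new HashSet<>(Arrays.asList("dave", "erin", "frank"));
        // Contains exactly dave and erin (iteration order is not guaranteed).
        System.out.println(intersect(aliceFollows, bobFollows));
    }
}
```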

5. Sorted Set

Common commands: zadd,zrange,zrem,zcard, etc

Compared to set, the sorted set adds a weighted parameter, score, so that the elements in the set can be sorted by score.

For example, in the live broadcast system, the real-time ranking information includes the list of online users in the live broadcast room, various gift leaderboards, bullet-screen messages (which can be understood as message leaderboards by message dimensions) and other information, which is suitable for storage using the SortedSet structure in Redis.

5.6 Setting an expiration time in Redis

Redis lets you set an expiration time on the values stored in it, which is very useful for a cache. For example, tokens and login information in typical projects, and especially SMS verification codes, are only valid for a limited time. With a traditional database you usually have to check for expiry yourself, which hurts performance.

When we set a key, we can give it an expire time, which specifies how long the key stays alive.

If you set a batch of keys to live for one hour, how does Redis delete them once the hour is up?

Periodic deletion + lazy deletion.

You can probably guess what these two delete methods mean by their names.

  • Periodic deletion: By default, every 100ms Redis randomly picks some of the keys that have an expiration time, checks whether they have expired, and deletes the ones that have. Note that the selection is random. Why random? If Redis held hundreds of thousands of keys, iterating over every key with an expiration time every 100ms would put a huge load on the CPU.
  • Lazy deletion: Periodic deletion can leave many expired keys undeleted after their time is up, which is why lazy deletion exists. An expired key that periodic deletion has missed stays in memory until your system looks it up, and only then does Redis delete it. That is lazy deletion, and it really is lazy enough!
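The two strategies can be sketched with a toy in-memory store. This is not Redis's implementation: the fake clock, the sampling loop, and every name below are assumptions chosen to keep the example small and deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy key-value store showing lazy deletion plus periodic sampling.
// All names are hypothetical; the clock is a plain field so tests can move it.
class ExpirySketch {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>(); // key -> deadline in ms
    private final Random random = new Random();
    long now = 0; // fake clock

    void set(String key, String value, long ttlMillis) {
        data.put(key, value);
        expiresAt.put(key, now + ttlMillis);
    }

    // Lazy deletion: an expired key is only removed when someone reads it.
    String get(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= now) {
            data.remove(key);
            expiresAt.remove(key);
            return null;
        }
        return data.get(key);
    }

    // Periodic deletion: check up to sampleSize randomly chosen keys
    // that have a deadline, and remove the expired ones.
    int sweep(int sampleSize) {
        List<String> keys = new ArrayList<>(expiresAt.keySet());
        int removed = 0;
        for (int i = 0; i < sampleSize && !keys.isEmpty(); i++) {
            String key = keys.remove(random.nextInt(keys.size()));
            if (expiresAt.get(key) <= now) {
                data.remove(key);
                expiresAt.remove(key);
                removed++;
            }
        }
        return removed;
    }

    // Number of keys still held in memory, expired or not.
    int size() {
        return data.size();
    }
}
```

With this sketch, a key whose deadline has passed still counts toward size() until either get touches it or sweep happens to sample it, which is exactly the gap that leads to the memory problem discussed next.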

But setting an expiration time alone is not enough. Think about it: what if periodic deletion misses a lot of expired keys, and nobody reads them in time, so lazy deletion never kicks in either? A large number of expired keys pile up in memory and Redis runs out of memory. How do we solve this?

The Redis memory eviction mechanism.

5.7 Redis memory eviction mechanism (MySQL holds 20 million rows, Redis holds only 200,000: how do you make sure the data in Redis is the hot data?) /redis-stabl/…

Redis offers six eviction policies:

  1. volatile-lru: evicts the least recently used keys from the set of keys that have an expiration time (server.db[i].expires)
  2. volatile-ttl: evicts the keys that are closest to expiring from the set of keys that have an expiration time (server.db[i].expires)
  3. volatile-random: evicts random keys from the set of keys that have an expiration time (server.db[i].expires)
  4. allkeys-lru: when memory is insufficient for a new write, evicts the least recently used key from the whole key space (this is the most commonly used policy)
  5. allkeys-random: evicts random keys from the whole data set (server.db[i].dict)
  6. noeviction: never evicts data, so new writes fail with an error when memory is insufficient. Probably nobody uses this!
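To get a feel for what "least recently used" means, here is a minimal exact LRU cache built on LinkedHashMap. Keep in mind this is only an illustration of the idea: Redis itself approximates LRU by sampling keys rather than keeping a strict access order, and the class name and capacity below are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Exact LRU cache on top of LinkedHashMap, to illustrate the allkeys-lru idea.
// Redis uses an approximated LRU; this class is a hypothetical teaching aid.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCacheSketch(int capacity) {
        // accessOrder = true: iteration order goes from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" is now the most recently used
        cache.put("c", 3);   // evicts "b", the least recently used
        System.out.println(cache.keySet()); // [a, c]
    }
}
```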

Note: I have only briefly summarized expiration and memory eviction here; I will write a dedicated article on them later!

5.8 Redis persistence mechanism (how to ensure that the data can be recovered after the restart of Redis)

Many times we need to persist data, that is, write data from memory to hard disk, mostly to reuse the data later (such as restarting the machine, recovering data after machine failure), or to back up the data to a remote location in case of system failure.

One important difference between Redis and memcached is that Redis supports persistence, and in two different ways. One is called snapshotting (RDB); the other is the append-only file (AOF). Each has its advantages and disadvantages, and below I go into what they are, how to use them, and how to choose the right one for you.

Snapshotting persistence (RDB)

Redis can create a snapshot to get a copy of the data held in memory at a point in time. Once a snapshot exists, you can back it up, copy it to another server to create a replica with the same data (the Redis master/slave structure, mainly used to improve Redis performance), or leave it in place to be loaded when the server restarts.

Snapshot persistence is the default mode used by Redis. By default, the redis.conf configuration file contains the following settings:

save 900 1      # after 900 seconds (15 minutes), if at least 1 key has changed, Redis automatically triggers BGSAVE to create a snapshot
save 300 10     # after 300 seconds (5 minutes), if at least 10 keys have changed, Redis automatically triggers BGSAVE to create a snapshot
save 60 10000   # after 60 seconds (1 minute), if at least 10000 keys have changed, Redis automatically triggers BGSAVE to create a snapshot

AOF (append-only file) persistence

AOF persistence is closer to real time than snapshot persistence, so it has become the mainstream persistence solution. By default, Redis does not enable AOF persistence. It can be enabled with the appendonly parameter:

appendonly yes

After AOF persistence is enabled, every command that changes data in Redis is appended to the AOF file on disk. The AOF file is saved in the same location as the RDB file, which is set by the dir parameter. The default file name is appendonly.aof.

The Redis configuration file offers three different AOF sync policies:

appendfsync always    # sync to disk on every write command; this seriously slows Redis down
appendfsync everysec  # sync to disk once per second
appendfsync no        # let the operating system decide when to sync

To balance data safety and write performance, consider the appendfsync everysec option, which has Redis sync the AOF file once per second with almost no performance impact. Even if the system crashes, at most one second of data is lost. When the disk is busy with writes, Redis also gracefully slows itself down to match the disk's maximum write speed.

Supplementary content: AOF rewrite

An AOF rewrite can produce a new AOF file that holds the same database state as the original AOF file, but is smaller.

"AOF rewrite" is a somewhat misleading name. The rewrite works by reading the key-value pairs currently in the database; the program never needs to read, analyze, or write the existing AOF file.

When BGREWRITEAOF is executed, the Redis server maintains an AOF rewrite buffer that records every write command the server executes while the child process is creating the new AOF file. When the child process finishes, the server appends the entire contents of the rewrite buffer to the end of the new AOF file, so that the new and old AOF files describe the same database state. Finally, the server replaces the old AOF file with the new one, which completes the rewrite.
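Why the rewritten file is smaller is easiest to see in a toy model. In the sketch below, the "AOF" is just a list of command strings and the rewrite regenerates one SET per live key from the current state, without ever reading the old log. It ignores the child process and the rewrite buffer, and all names are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of an append-only log and its rewrite. It leaves out the child
// process and the rewrite buffer; every name is hypothetical.
class AofRewriteSketch {
    final Map<String, String> db = new LinkedHashMap<>();
    final List<String> aof = new ArrayList<>();

    void set(String key, String value) {
        db.put(key, value);
        aof.add("SET " + key + " " + value); // every write is appended to the log
    }

    void del(String key) {
        db.remove(key);
        aof.add("DEL " + key);
    }

    // Builds the new log from the current database state, not from the old log.
    List<String> rewrite() {
        List<String> fresh = new ArrayList<>();
        for (Map.Entry<String, String> entry : db.entrySet()) {
            fresh.add("SET " + entry.getKey() + " " + entry.getValue());
        }
        return fresh;
    }

    public static void main(String[] args) {
        AofRewriteSketch server = new AofRewriteSketch();
        server.set("counter", "1");
        server.set("counter", "2");
        server.set("counter", "3");
        server.set("temp", "x");
        server.del("temp");
        System.out.println(server.aof.size()); // 5
        System.out.println(server.rewrite());  // [SET counter 3]
    }
}
```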

For more on this, check out my post:


5.9 Solutions to cache avalanche and cache penetration

Cache avalanche

In brief: a large part of the cache expires or fails at the same moment, so all subsequent requests fall on the database, which collapses under the flood of requests in a short period of time.

Solution (as covered in the video course referenced at the end of this section):

  • Before the event: keep the whole Redis cluster as highly available as possible, replace failed machines as soon as you can, and choose an appropriate memory eviction policy.
  • During the event: use a local EhCache cache plus Hystrix rate limiting and degradation to keep MySQL from crashing.
  • After the event: use the Redis persistence mechanism to restore the cached data as quickly as possible.

Cache penetration

In brief: usually an attacker deliberately requests data that does not exist in the cache, so every such request falls on the database, which collapses under the flood of requests in a short period of time.

Solution: There are many effective ways to deal with cache penetration. The most common is a Bloom filter: hash all the data that can possibly exist into a sufficiently large bitmap, so that a request for data that definitely does not exist is intercepted by the bitmap and never puts query pressure on the underlying storage. An even simpler approach (the one we used) is to cache empty results as well: if a query returns nothing (whether because the data does not exist or because of a system failure), we still cache the empty result, but with a short expiration time of no more than five minutes.
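Here is a minimal Bloom filter sketch to show why it can reject definitely-missing keys. The bitmap size, the number of hash functions, and the way the second hash is derived are all illustrative assumptions, not tuned values; in a real project you would more likely reach for an existing library.

```java
import java.util.BitSet;

// Minimal Bloom filter. Sizes and hash choices are illustrative assumptions.
class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // A second, independent-ish hash in the style of FNV-1a, computed over chars.
    private static int secondHash(String key) {
        int h = 0x811c9dc5;
        for (int i = 0; i < key.length(); i++) {
            h ^= key.charAt(i);
            h *= 0x01000193;
        }
        return h;
    }

    // Double hashing: the i-th position is h1 + i * h2, reduced to [0, size).
    private int position(String key, int i) {
        return Math.floorMod(key.hashCode() + i * secondHash(key), size);
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false means "definitely not present"; true means "possibly present".
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

The useful property is the one-sided error: a key that was added is never reported as missing, so a false answer can safely short-circuit the request before it reaches the database. A true answer still needs the normal cache and database lookup, because false positives are possible.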



5.10 How do I Solve the Concurrent Contention Key Problem in Redis

Concurrent contention for a key means that several systems operate on the same key at the same time, but the operations execute in a different order than we expected, which leads to a different result!

A recommended solution: a distributed lock (both ZooKeeper and Redis can implement one). (If there is no concurrent contention for the key, do not use a distributed lock, because it hurts performance.)

The distributed lock can be built on ZooKeeper's ephemeral sequential nodes. The general idea is that whenever a client wants to lock a method, it creates a unique ephemeral sequential node under the directory node that corresponds to that method in ZooKeeper. Deciding who holds the lock is simple: the client whose node has the smallest sequence number holds it. To release the lock, the client deletes its ephemeral node once the business logic is done. Because the node is ephemeral, it also disappears if the service goes down, which avoids deadlocks caused by a lock that can never be released.
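The "smallest sequence number wins" rule can be sketched without a ZooKeeper server. The code below only does the comparison on node names; it assumes names of the form prefix-NNNNNNNNNN, where the "lock-" prefix is made up for the example, and it does not create, watch, or delete any real nodes.

```java
import java.util.Arrays;
import java.util.List;

// Lock-ownership check on sequential node names. No ZooKeeper client is used;
// the "lock-" prefix and the helper names are hypothetical.
class SequentialLockSketch {
    // Extracts the numeric suffix, e.g. "lock-0000000003" -> 3.
    static long sequenceOf(String node) {
        return Long.parseLong(node.substring(node.lastIndexOf('-') + 1));
    }

    // A client holds the lock when no sibling has a smaller sequence number.
    static boolean holdsLock(String myNode, List<String> children) {
        long mine = sequenceOf(myNode);
        for (String child : children) {
            if (sequenceOf(child) < mine) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "lock-0000000007", "lock-0000000003", "lock-0000000012");
        System.out.println(holdsLock("lock-0000000003", children)); // true
        System.out.println(holdsLock("lock-0000000007", children)); // false
    }
}
```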

In practice, of course, reliability comes first, so ZooKeeper is the first choice.



5.11 How Can I Ensure Data Consistency between the Cache and the Database in Dual-Write Mode?

As soon as you use a cache, you store and write data in two places, the cache and the database, and as soon as you write to both you have a data consistency problem. So how do you solve it?

In general, if your system does not strictly require the cache and the database to be consistent, and the cache may occasionally differ slightly from the database, it is best not to use the following scheme: serialize read and write requests by routing them into an in-memory queue, which guarantees that no inconsistency can occur.

Serialization drastically reduces system throughput, so you need several times more machines than usual to support the same production traffic.


  • Java Engineer Interview Assault season 1 (probably the best Java interview assault course ever) – Chinese Huxton Teacher. See the video address below!
    • Link:…
    • Password: 5 i58

Six Java

6.1 Java Basics

The difference between overloading and overriding

Overloading: happens within the same class. The method name must be the same and the parameter list must differ (in type, number, or order); the return type and access modifier may differ. It is resolved at compile time.

Overriding: happens between a parent class and a subclass. The method name and parameter list must be the same; the return type must be the same as or a subtype of the parent's; the exceptions thrown must be the same as or narrower than the parent's; the access modifier must be the same as or wider than the parent's. If the parent method is private, the subclass cannot override it.
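A small example makes the compile-time versus run-time difference visible. The class and method names are made up for the demonstration.

```java
// Overloading is chosen by the compiler from the static argument type;
// overriding is chosen at run time from the actual object type.
class OverloadOverrideSketch {
    static class Animal {
        String sound() {
            return "...";
        }
    }

    static class Dog extends Animal {
        @Override
        String sound() { // same name and parameter list as the parent method
            return "woof";
        }
    }

    // Two overloads: same name, different parameter types.
    static String describe(Object value) {
        return "Object";
    }

    static String describe(String value) {
        return "String";
    }

    public static void main(String[] args) {
        Object text = "hello";
        System.out.println(describe(text));    // Object (static type decides)
        System.out.println(describe("hello")); // String

        Animal pet = new Dog();
        System.out.println(pet.sound());       // woof (runtime type decides)
    }
}
```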

What is the difference between String, StringBuffer, and StringBuilder? Why is String immutable?

Mutability

To put it simply: the String class stores its characters in an array declared with the final keyword, private final char value[], so a String object is immutable. StringBuilder and StringBuffer both inherit from AbstractStringBuilder, which also stores the characters in an array, char[] value, but without the final keyword, so both kinds of object are mutable. (This describes JDK 8 and earlier; since JDK 9 the backing array is a byte[].)

The constructors of StringBuilder and StringBuffer both call the parent class constructor, that is, AbstractStringBuilder's. You can check the source code.

abstract class AbstractStringBuilder implements Appendable, CharSequence {
    char[] value;   // not final, so the buffer can be replaced when it grows
    int count;      // number of characters currently in use

    AbstractStringBuilder() {
    }

    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }
}

Thread safety

String objects are immutable, so they can be treated as constants and are thread-safe. AbstractStringBuilder is the common parent class of StringBuilder and StringBuffer. It defines some basic string operations, such as expandCapacity, append, insert, indexOf and other public methods. StringBuffer adds a synchronization lock to its methods, or to the methods it calls, so it is thread-safe. StringBuilder does not lock its methods, so it is not thread-safe.

Performance

Every change to a String produces a new String object, and the reference is then pointed at the new object. StringBuffer operates on the StringBuffer object itself each time, instead of creating new objects and changing the reference. In the same situation, StringBuilder is only about 10% to 15% faster than StringBuffer, at the cost of not being thread-safe.
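The difference is easy to check with reference comparisons. A minimal sketch, with made-up method names:

```java
// Shows that String concatenation yields a new object while
// StringBuilder.append mutates and returns the same object.
class StringMutabilitySketch {
    static boolean concatCreatesNewObject() {
        String original = "a";
        String changed = original + "b"; // builds a brand-new String
        // The original object is untouched and the result is a different object.
        return changed != original && original.equals("a") && changed.equals("ab");
    }

    static boolean appendKeepsSameObject() {
        StringBuilder builder = new StringBuilder("a");
        StringBuilder returned = builder.append("b"); // append returns this
        return returned == builder && builder.toString().equals("ab");
    }

    public static void main(String[] args) {
        System.out.println(concatCreatesNewObject()); // true
        System.out.println(appendKeepsSameObject());  // true
    }
}
```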

Summary of the use of the three:

  1. Working with a small amount of data: use String
  2. Working with a large amount of data in a string buffer from a single thread: use StringBuilder
  3. Working with a large amount of data in a string buffer from multiple threads: use StringBuffer

Autoboxing and unboxing

Boxing: wrapping a primitive type in its corresponding reference (wrapper) type;

Unboxing: converting a wrapper type back to the primitive type;
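A short example of where the compiler inserts these conversions, including the classic pitfall of unboxing null. The class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;

// Where boxing and unboxing happen implicitly.
class BoxingSketch {
    static int sum(List<Integer> values) {
        int total = 0;
        for (Integer value : values) {
            total += value; // unboxing: value.intValue()
        }
        return total;
    }

    // Unboxing a null wrapper throws NullPointerException.
    static boolean unboxingNullThrows() {
        Integer missing = null;
        try {
            int value = missing; // implicit missing.intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Integer boxed = 42;    // boxing: the compiler emits Integer.valueOf(42)
        int primitive = boxed; // unboxing: boxed.intValue()
        System.out.println(primitive + sum(Arrays.asList(1, 2, 3))); // 48
        System.out.println(unboxingNullThrows());                    // true
    }
}
```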

== and equals

==: checks whether the addresses of two objects are equal, that is, whether the two references point to the same object. (For primitive types == compares values; for reference types == compares memory addresses.)

equals(): checks whether two objects are equal. There are two cases:

  • Case 1: The class does not override equals(). Comparing two objects of this class with equals() is then the same as comparing them with "==".
  • Case 2: The class overrides equals(). We usually override equals() to compare the contents of two objects, returning true if their contents are equal (that is, the objects are considered equal).

Here’s an example:

public class test1 {
    public static void main(String[] args) {
        String a = new String("ab"); // a is a reference
        String b = new String("ab"); // b is another reference; the object contents are the same
        String aa = "ab"; // placed in the constant pool
        String bb = "ab"; // looked up in the constant pool
        if (aa == bb) // true
            System.out.println("aa==bb");
        if (a == b) // false, not the same object
            System.out.println("a==b");
        if (a.equals(b)) // true
            System.out.println("aEQb");
        if (42 == 42.0) { // true
            System.out.println("true");
        }
    }
}


  • String overrides the equals method: Object's equals compares the memory addresses of the objects, while String's equals compares their values.
  • When a String is created from a literal, the virtual machine looks in the constant pool for an existing object with the same value. If there is one, it is assigned to the current reference; if not, a new String object is created in the constant pool.

Some summary of the final keyword

The final keyword is used in three main places: variables, methods, and classes.

  1. For a final variable of a primitive type, the value cannot be changed once it has been initialized. For a final variable of a reference type, the variable cannot be pointed at another object after it has been initialized.
  2. When a class is declared final, it cannot be inherited. All member methods in a final class are implicitly final.
  3. There are two reasons to use final methods. The first is to lock the method so that no inheriting class can change its meaning. The second is efficiency: in early Java implementations, final methods were turned into inline calls. But if the method is too large, you may not see any performance gain from inlining, and current Java versions no longer need final for these optimizations. All private methods in a class are implicitly final.

6.2 Java Collections Framework

ArrayList vs. LinkedList: similarities and differences

  • 1. Thread safety: Neither ArrayList nor LinkedList is synchronized, so neither is thread-safe.
  • 2. Underlying data structure: ArrayList uses an Object array; LinkedList uses a doubly linked list (circular in JDK 1.6 and earlier; the circular link was removed in JDK 1.7).
  • 3. Whether insertion and deletion are affected by element position: (1) ArrayList is backed by an array, so the time complexity of inserting and deleting is affected by the element's position. For example, add(E e) appends the element to the end of the list by default, which is O(1). But inserting or deleting at a given position i (add(int index, E element)) is O(n-i), because the (n-i) elements from position i onward all have to shift back or forward by one. (2) LinkedList is backed by a linked list, so inserting or deleting a node is not affected by its position and is approximately O(1) once the node has been located, whereas the array-based ArrayList is approximately O(n).
  • 4. Fast random access: LinkedList does not support efficient random element access, whereas ArrayList implements the RandomAccess interface and does. Fast random access means quickly getting an element by its index (the get(int index) method).
  • 5. Memory footprint: ArrayList wastes space mainly by reserving spare capacity at the end of the list, while LinkedList costs more per element than ArrayList, because each node stores its successor and predecessor references as well as the data.
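The RandomAccess point can be checked directly, since it is just a marker interface. A minimal sketch, with a made-up helper name:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// RandomAccess is a marker interface: ArrayList implements it, LinkedList does not.
class ListAccessSketch {
    static boolean supportsFastRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        List<String> arrayList = new ArrayList<>();
        List<String> linkedList = new LinkedList<>();
        System.out.println(supportsFastRandomAccess(arrayList));  // true
        System.out.println(supportsFastRandomAccess(linkedList)); // false

        // Both lists expose the same API; only the cost differs.
        arrayList.add("a");
        arrayList.add("c");
        arrayList.add(1, "b");   // shifts "c" one slot to the right
        linkedList.add("a");
        linkedList.add("c");
        linkedList.add(1, "b");  // walks to index 1, then relinks the nodes
        System.out.println(arrayList.equals(linkedList)); // true
    }
}
```

Generic code can use the same instanceof check to decide between an index-based loop and an iterator.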

Supplement: the doubly linked list data structure

A doubly linked list is a linked list in which every node holds two pointers, one to its direct successor and one to its direct predecessor. From any node you can therefore easily reach both of its neighbors. It is common to build a circular doubly linked list, in which the last node also links back to the first.

ArrayList vs. Vector

All methods of the Vector class are synchronized. Two threads can safely access the same Vector, but if only one thread accesses it, the code still pays the cost of synchronization.

ArrayList is not synchronized, so it is recommended whenever thread safety is not required.

The underlying implementation of HashMap

1) Before JDK 1.8

Before JDK 1.8, HashMap was backed by an array combined with linked lists, that is, a chained hash table. HashMap passes the key's hashCode through a perturbation function to get the hash value, then uses (n - 1) & hash to find where the element goes (n is the length of the array). If that slot already holds an element, HashMap compares its hash value and key with those of the element being stored: if they are the same, the old value is overwritten; if not, the conflict is resolved by separate chaining (the "zipper method").

The perturbation function is HashMap's hash method. It exists to guard against poorly implemented hashCode() methods; in other words, applying it reduces collisions.

Source of the HashMap hash method in JDK 1.8:

The hash method in JDK 1.8 is simpler than the one in JDK 1.7, but the principle is the same.

    static final int hash(Object key) {
      int h;
      // key.hashCode(): returns the key's hash code
      // ^ : bitwise XOR
      // >>>: unsigned right shift; the sign bit is ignored and vacated bits are filled with 0
      return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

Compare the hash method source of JDK1.7 HashMap.

static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).

    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

The performance of the JDK 1.7 hash method is slightly worse than that of the JDK1.8 hash method because of the 4 perturbations.
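To see what the perturbation buys, take two hash codes that differ only in their high bits. With a small table, (n - 1) & hash looks at the low bits only, so without the XOR step both land in the same slot. The sketch below copies the JDK 1.8 spreading step into a helper; the helper names and the sample values are made up.

```java
// Effect of the JDK 1.8 spreading step on bucket selection.
class HashSpreadSketch {
    // Same operation as the JDK 1.8 hash method, applied to a raw hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int first = 0x10000;  // differs from second only in the high 16 bits
        int second = 0x20000;
        int n = 16;

        // Without spreading, both collide in bucket 0.
        System.out.println(indexFor(first, n) + " " + indexFor(second, n)); // 0 0
        // With spreading, the high bits influence the index.
        System.out.println(indexFor(spread(first), n) + " " + indexFor(spread(second), n)); // 1 2
    }
}
```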

Separate chaining (the "zipper method") combines linked lists with an array: you create an array in which every cell is a linked list, and when a hash conflict occurs you add the conflicting value to the list in that cell.

2) JDK 1.8 and later

Compared with earlier versions, JDK 1.8 changed substantially how hash conflicts are resolved. When the length of a list exceeds the threshold (8 by default), the list is converted into a red-black tree to reduce search time.

TreeMap, TreeSet, and HashMap from JDK 1.8 onward all use red-black trees underneath. Red-black trees exist to fix a weakness of plain binary search trees, which can degenerate into a linear structure in some cases.

Recommended Reading:

  • “8 series of Java HashMap rethink” :

The difference between HashMap and Hashtable

  1. Thread safety: HashMap is not thread-safe; Hashtable is thread-safe, because basically all of its methods are marked synchronized. (If you need thread safety, use ConcurrentHashMap instead!)
  2. Efficiency: Because it skips synchronization, HashMap is slightly more efficient than Hashtable. In addition, Hashtable is basically obsolete, so don't use it in your code.
  3. Support for null keys and null values: HashMap allows null as a key, but only one such key can exist; any number of keys can map to null values. Hashtable throws a NullPointerException as soon as a null key or a null value is put in.
  4. Initial capacity and expansion: (1) If no initial capacity is specified, Hashtable starts at 11 and grows to 2n+1 on each expansion, while HashMap starts at 16 and doubles on each expansion. (2) If an initial capacity is given, Hashtable uses it as is, whereas HashMap rounds it up to a power of two. In other words, HashMap always uses a power of two as the size of its hash table; the reason is explained later.
  5. Underlying data structure: Since JDK1.8, HashMap resolves hash conflicts quite differently. When a list grows longer than the threshold (8 by default), it is converted into a red-black tree to reduce search time. Hashtable has no such mechanism.
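
The null-handling difference in point 3 is easy to check with a small experiment. The helper names below (`acceptsNullKey`, `acceptsNullValue`) are invented for this sketch; only `HashMap`, `Hashtable`, and `Map.put` come from the JDK.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Returns true if the map accepts a null key, false if put throws NullPointerException.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Returns true if the map accepts a null value, false if put throws NullPointerException.
    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("key", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap null key:     " + acceptsNullKey(new HashMap<>()));
        System.out.println("Hashtable null key:   " + acceptsNullKey(new Hashtable<>()));
        System.out.println("HashMap null value:   " + acceptsNullValue(new HashMap<>()));
        System.out.println("Hashtable null value: " + acceptsNullValue(new Hashtable<>()));
    }
}
```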

Why is the length of a HashMap a power of two

For HashMap access to be efficient, collisions should be as rare as possible, which means data should be distributed as evenly as possible. As mentioned above, hash values range from -2147483648 to 2147483647, about 4 billion possible slots, so as long as the hash function maps keys evenly and sparsely, collisions are unlikely in ordinary applications. The problem is that an array 4 billion entries long does not fit in memory, so the hash value cannot be used directly. Instead, it is first taken modulo the array length, and the remainder is used as the storage position, that is, the array index.

How should the algorithm be designed?

The first thing that comes to mind is the % operator. But here is the key point: when the divisor is a power of two, the remainder equals a bitwise AND with the divisor minus one (that is, hash % length == hash & (length - 1) for a non-negative hash, provided length is 2^n). Because the bitwise & is cheaper than %, this explains why the length of a HashMap is a power of two.
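
The equivalence between the remainder and the bit mask can be checked with a short sketch. The method names are made up for illustration. Note one assumption: Java's plain `%` returns a negative result for a negative hash, so the comparison below uses `Math.floorMod`, which always yields a non-negative remainder.

```java
class PowerOfTwoIndex {
    // Index via remainder; floorMod keeps the result non-negative for negative hashes.
    static int indexByMod(int hash, int length) {
        return Math.floorMod(hash, length);
    }

    // Index via bit mask; only equivalent to the remainder when length is a power of two.
    static int indexByMask(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        for (int hash : new int[] { 0, 17, 12345, -7, Integer.MIN_VALUE }) {
            System.out.println(hash + " -> mod " + indexByMod(hash, length)
                    + ", mask " + indexByMask(hash, length));
        }
    }
}
```

With a length that is not a power of two (for example 10), the two methods disagree, which is why HashMap insists on power-of-two capacities.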

HashMap multi-threaded operations cause infinite loops

In multithreaded code, put can send a HashMap into an infinite loop, and the cause is HashMap's resize() method. (This applies to JDK1.7 and earlier, where nodes are moved by head insertion; JDK1.8 moves them by tail insertion, which avoids the loop, but HashMap is still not thread-safe.) Expansion creates a new array and copies the original data into it. Because each array slot holds a linked list, the lists must be copied too, and with multiple threads that copying can produce a circular list. To see how, simulate two threads expanding at the same time. Suppose the current HashMap has capacity 2 (threshold 1), the hash addresses are 0 and 1, and elements A and B sit at hash address 0. Now element C is added, and its hash address works out to 1. Since the threshold is exceeded, there is not enough space and resize must be called. Under multithreading this creates a race condition, which plays out as follows:

Thread 1: reads the current state of the HashMap; just as it is about to expand, thread 2 intervenes

Thread 2: reads the HashMap and expands it

Thread 1: continues execution

Thread 1 copies A to the new hash table and then copies B to the head of the chain, in front of A (B.next = A). Because thread 2 has already finished its own resize and relinked the same nodes, thread 1 goes on to process A again and sets A.next = B. The result is a circular list: B.next = A; A.next = B.

Difference between HashSet and HashMap

If you've seen the source code for HashSet, you know that HashSet is implemented on top of HashMap. (The HashSet source is very short, because apart from clone(), writeObject(), and readObject(), which HashSet has to implement itself, all of its methods call HashMap methods directly.)
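
To illustrate the delegation idea, here is a simplified sketch of a set built on a HashMap. It is not the JDK source; `MapBackedSet` is a hypothetical class, and the shared dummy value is just one straightforward way to fill the map's value slot.

```java
import java.util.HashMap;

// Simplified sketch of a set that delegates to a map; not the JDK's HashSet source.
class MapBackedSet<E> {
    // Shared dummy value stored against every key.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns the previous value, so null means the element was not there yet.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // remove returns the dummy value only if the element was present.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Uniqueness comes for free: the elements are the map's keys, and a map cannot hold the same key twice.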

Difference between ConcurrentHashMap and Hashtable

The main differences between ConcurrentHashMap and Hashtable lie in how thread safety is implemented.

  • Underlying data structure: In JDK1.7, ConcurrentHashMap is implemented as a segmented array + linked lists. In JDK1.8 it uses the same structure as HashMap in JDK1.8: array + linked list / red-black tree. The underlying structure of Hashtable is similar to that of HashMap before JDK1.8, namely array + linked list, where the array is the main body and the linked lists exist mainly to resolve hash conflicts.
  • How thread safety is achieved (important): ① In JDK1.7, ConcurrentHashMap splits the entire bucket array into segments (Segment) and uses segment locks. Each lock guards only part of the data in the container, so threads accessing different segments do not contend for a lock, which raises concurrent throughput. (By default there are 16 segments, so in theory up to 16 threads can write at once, where Hashtable allows only one.) In JDK1.8 the Segment concept is abandoned; the structure is implemented directly as a Node array + linked list + red-black tree, with concurrency controlled by synchronized and CAS. The whole thing looks like an optimized, thread-safe HashMap. The Segment data structure is still visible in JDK1.8, but its attributes have been simplified and it remains only for compatibility with older versions. ② Hashtable uses a single lock: it relies on synchronized for thread safety, which is very inefficient. While one thread is inside a synchronized method, other threads that call a synchronized method are blocked or left polling. For example, while one thread is adding an element with put, no other thread can add with put or even read with get, so contention grows and efficiency drops.

A comparison of the two:



JDK1.7 ConcurrentHashMap:

ConcurrentHashMap thread safety concrete implementation/underlying concrete implementation

①JDK1.7 (schematic diagram above)

First, data is divided into segments for storage, and then each segment of data is assigned a lock. When a thread uses the lock to access one segment of data, data in other segments can also be accessed by other threads.

ConcurrentHashMap consists of a Segment array and HashEntry arrays.

Segment extends ReentrantLock, so each Segment is itself a reentrant lock. HashEntry stores the key-value data.

static class Segment<K, V> extends ReentrantLock implements Serializable {}

A ConcurrentHashMap contains an array of Segments. A Segment is structured like a HashMap: an array plus linked lists. Each Segment contains an array of HashEntry objects, and each HashEntry is a node in a linked list. Each Segment guards the elements of its own HashEntry array: to modify data in that array, a thread must first acquire the corresponding Segment's lock.
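
The segment-lock idea can be sketched in a few lines. This is a deliberately simplified illustration and not the JDK1.7 source: `SegmentedMap` is a hypothetical class, each segment wraps a plain `HashMap` instead of a `HashEntry` array, and `get` takes the lock too for simplicity (the real implementation reads without locking).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Simplified illustration of segment locking; not the JDK1.7 ConcurrentHashMap source.
class SegmentedMap<K, V> {
    // Each segment is itself a lock and guards only its own small table.
    private static final class Segment<K, V> extends ReentrantLock {
        private final Map<K, V> table = new HashMap<>();
    }

    private final Segment<K, V>[] segments;

    @SuppressWarnings("unchecked")
    SegmentedMap(int segmentCount) {
        segments = new Segment[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new Segment<>();
        }
    }

    // Pick the segment responsible for a key.
    private Segment<K, V> segmentFor(Object key) {
        return segments[Math.floorMod(key.hashCode(), segments.length)];
    }

    V put(K key, V value) {
        Segment<K, V> s = segmentFor(key);
        s.lock(); // only this segment is locked; other segments stay available
        try {
            return s.table.put(key, value);
        } finally {
            s.unlock();
        }
    }

    V get(Object key) {
        Segment<K, V> s = segmentFor(key);
        s.lock();
        try {
            return s.table.get(key);
        } finally {
            s.unlock();
        }
    }
}
```

Two threads writing keys that fall into different segments never wait for each other, which is the whole point of splitting the lock.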

②JDK1.8 (schematic diagram above)

ConcurrentHashMap drops the Segment lock and uses CAS plus synchronized to guarantee concurrency safety. The data structure is similar to HashMap in JDK1.8: array + linked list / red-black tree.

synchronized locks only the head node of the current linked list or red-black tree, so as long as hashes do not collide, threads do not contend for the same lock, which improves efficiency considerably.
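
From the caller's side, the practical payoff is that per-key updates are atomic without any external locking. Here is a small sketch; `ConcurrentCountDemo` and `countConcurrently` are names invented for the example, while `ConcurrentHashMap.merge` is the real JDK method.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCountDemo {
    // Starts `threads` threads that each increment the same key `perThread` times.
    static int countConcurrently(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // merge performs the read-modify-write atomically for this key
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000));
    }
}
```

The same loop written against a plain HashMap with get followed by put could lose updates, because two threads may read the same old value.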

Summary of the underlying data structure of the collection framework



1. List

  • ArrayList: Object array

  • Vector: Object array

  • LinkedList: doubly linked list (circular in JDK1.6 and earlier; JDK1.7 removed the cycle)

2. Set

  • HashSet (unordered, unique): implemented on top of HashMap, which it uses internally to hold its elements

  • LinkedHashSet: inherits from HashSet and is implemented internally through LinkedHashMap. This is similar to how LinkedHashMap itself is implemented on top of HashMap, with slight differences.

  • TreeSet (ordered, unique): red-black tree (a self-balancing binary search tree)

3. Map

  • HashMap: Before JDK1.8, HashMap consisted of an array plus linked lists. The array is the main body, and the linked lists exist mainly to resolve hash conflicts (the "zipper method"). Since JDK1.8, when a list grows longer than the threshold (8 by default), it is converted into a red-black tree to reduce search time.
  • LinkedHashMap: LinkedHashMap inherits from HashMap, so underneath it is still a chained hash structure made of an array plus linked lists or red-black trees. On top of that structure, LinkedHashMap adds a doubly linked list that preserves the insertion order of the key-value pairs, and by manipulating this list it also implements access-order logic. See "LinkedHashMap source code detailed analysis (JDK1.8)".
  • Hashtable: array + linked list. The array is the main body, and the linked lists exist mainly to resolve hash conflicts.
  • TreeMap: red-black tree (a self-balancing binary search tree)

6.3 Java Multithreading

For Java multithreading, the questions that come up most in interviews are: ① pessimistic and optimistic locking (see my article on the topic), ② synchronized and volatile, ③ the difference between reentrant and non-reentrant locks, ④ what problems multithreading solves, ⑤ what problems thread pools solve, ⑥ how thread pools work, ⑦ what to watch out for when using thread pools, ⑧ how AQS works, ⑨ the ReentrantLock source code, its design principles, and its overall flow.

In the multithreading part, the interviewer will most likely ask whether you have actually used multithreading in a project. So if you have real experience with Java multithreading in your projects, it's a plus!

6.4 Java Virtual Machine

The Java Virtual Machine (JVM) is one of the most commonly asked topics in an interview: ① Java memory areas, ② garbage collection algorithms, ③ garbage collectors, ④ JVM memory management, ⑤ JVM tuning. Check out my two articles:


6.5 Design Patterns

For design patterns, the most common request is to handwrite a singleton (note that the singleton pattern has several different implementations), or to describe how a common design pattern is used in one of your projects. The interviewer may also ask about the difference between the abstract factory and factory method patterns, or about the idea behind the factory pattern.

Take a good look at the proxy pattern, the observer pattern, and the (abstract) factory pattern. These three design patterns are also important.
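
Since the handwritten singleton comes up so often, here are two of the usual thread-safe variants as a sketch: double-checked locking and the static holder class. The class names are illustrative, and other variants (eager initialization, enum) exist as well.

```java
// Variant 1: lazy initialization with double-checked locking.
class DclSingleton {
    // volatile prevents other threads from seeing a partially constructed instance
    private static volatile DclSingleton instance;

    private DclSingleton() { }

    static DclSingleton getInstance() {
        if (instance == null) {                  // first check, without locking
            synchronized (DclSingleton.class) {
                if (instance == null) {          // second check, under the lock
                    instance = new DclSingleton();
                }
            }
        }
        return instance;
    }
}

// Variant 2: static holder class; class loading guarantees thread safety.
class HolderSingleton {
    private HolderSingleton() { }

    // Holder is loaded, and INSTANCE created, only on the first getInstance() call.
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```

In an interview it helps to be able to explain why the `volatile` keyword and the second null check are both needed in the first variant.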

7 Data Structures

Data structure questions often cover: binary trees, red-black trees (you may well be asked to draw one!), binary search trees (BST), self-balancing binary search trees, a comparison of the advantages and disadvantages of B-trees, B+ trees, and B* trees, and LSM trees.

Data structures are important and relatively hard to learn. I suggest learning them step by step. Make sure you understand how each one works, and ideally implement it in code yourself.

8 Algorithms

Common encryption algorithms and sorting algorithms need to be understood in advance, and you should be able to write the sorting algorithms on your own.

I think the most exciting, most stressful, and most challenging part of an interview is writing algorithms by hand. Most interview algorithm questions come from LeetCode and the book 剑指 Offer (Coding Interviews), so I suggest squeezing out a little time every day to practice algorithm problems.

Two must-have websites for practicing problems:

LeetCode:

  • LeetCode (China) official website
  • How to use LeetCode efficiently

Nowcoder (牛客网):

  • Nowcoder home page

8.1 Example (handwritten quicksort)

The interviewer may ask what sorting algorithms you know, and whether you can write one other than bubble sort and selection sort. Or the interviewer might ask directly: "Can you write me a quicksort?"

The basic idea of quicksort: using a chosen reference value, split the records to be sorted into two independent parts, one in which every element is less than the reference value and one in which every element is greater. Then apply the same operation to each part, until no further split is possible (recursion works well here).

Here is a simple quicksort that I wrote. The reference value I chose is the first element of the array.

import java.util.Arrays;

public class QuickSort {

	public static void main(String[] args) {
		// Sample input for illustration only; replace with your own data
		int[] num = { 5, 3, 8, 1, 9, 2 };
		QuickSort.quickSort(num, 0, num.length - 1);
		System.out.println(Arrays.toString(num));
	}

	public static void quickSort(int[] a, int start, int end) {
		// Index at which the array is split
		int ref;
		if (start < end) {
			// Partition the range around the reference value
			ref = partition(a, start, end);
			// Recursively sort the two parts on either side of the split
			quickSort(a, start, ref - 1);
			quickSort(a, ref + 1, end);
		}
	}

	/**
	 * Selects a reference value and partitions the given range around it.
	 *
	 * @param a     the array
	 * @param start index of the first element of the range
	 * @param end   index of the last element of the range
	 * @return the final index of the reference value, used as the next split point
	 */
	public static int partition(int[] a, int start, int end) {
		// Take the first value of the range as the reference value (key data)
		int refvalue = a[start];
		while (start < end) {
			// Walk from the right toward the left until an element less than the reference value is found
			while (end > start && a[end] >= refvalue) {
				end--;
			}
			// Move that element into the free slot on the left
			a[start] = a[end];

			// Walk from the left toward the right until an element greater than the reference value is found
			while (end > start && a[start] <= refvalue) {
				start++;
			}
			// Move that element into the free slot on the right
			a[end] = a[start];
		}
		// start and end have met; this is where the reference value belongs
		a[start] = refvalue;
		return start;
	}
}

Time complexity analysis:

  • In the best case, every partition splits the range evenly, and the time complexity of quicksort is O(n log n).
  • In the worst case, when the sequence to be sorted is already in ascending or descending order, the time complexity is O(n^2).

Space complexity analysis:

  • In the best case, the depth of the recursion tree is log2(n), so the space complexity is O(log n).
  • In the worst case, n-1 nested recursive calls are needed, so the space complexity is O(n). In the average case, the space complexity is O(log n).

A simple way to optimize:

Three-way partitioning quicksort: the core idea is to split the data into three parts, with elements less than the comparison value on the left, elements greater than it on the right, and elements equal to it in the middle. Its distinguishing feature is that elements equal to the comparison value are left in place and excluded from further recursion. For data with many repeated values, three-way partitioning therefore beats ordinary quicksort. Because it performs slightly more comparisons overall, it has no advantage over ordinary quicksort on data with few duplicates.
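
A minimal sketch of three-way partitioning is shown below, using the well-known scheme with three moving indices (often attributed to Dijkstra). The class and variable names are chosen for this example, and, as in the quicksort above, the first element of each range serves as the comparison value.

```java
import java.util.Arrays;

class ThreeWayQuickSort {
    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int pivot = a[lo];
        int lt = lo; // a[lo..lt-1] holds elements less than pivot
        int gt = hi; // a[gt+1..hi] holds elements greater than pivot
        int i = lo;  // a[lt..i-1] holds elements equal to pivot
        while (i <= gt) {
            if (a[i] < pivot) {
                swap(a, lt++, i++);
            } else if (a[i] > pivot) {
                swap(a, i, gt--); // do not advance i: the swapped-in element is unexamined
            } else {
                i++;
            }
        }
        // Elements equal to pivot (a[lt..gt]) are already in place and are skipped.
        sort(a, lo, lt - 1);
        sort(a, gt + 1, hi);
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = { 3, 1, 3, 2, 3, 1 };
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

When the input is all one value, a single pass places every element in the middle section and no recursion happens at all, which is where the gain over ordinary quicksort comes from.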

9 Spring

If your resume says you know Spring Boot or Spring Cloud, the interviewer may ask about both technologies, for example the difference between Spring Boot and Spring. So be careful what you put on your resume, and be very familiar with everything you do put on it.

In addition, how AOP is implemented, dynamic versus static proxies, the Spring IOC initialization process, how IOC works, and how you would implement an IOC container yourself are all frequently asked.

9.1 Scope of Spring Beans

9.2 Isolation levels in Spring transactions

The TransactionDefinition interface defines five constants that represent isolation levels:

  • TransactionDefinition.ISOLATION_DEFAULT: uses the default isolation level of the underlying database. MySQL defaults to REPEATABLE_READ; Oracle defaults to READ_COMMITTED.
  • TransactionDefinition.ISOLATION_READ_UNCOMMITTED: the lowest isolation level. It allows reading data changes that have not yet been committed, which may lead to dirty reads, phantom reads, or non-repeatable reads.
  • TransactionDefinition.ISOLATION_READ_COMMITTED: allows reading data that concurrent transactions have already committed. It prevents dirty reads, but phantom reads and non-repeatable reads can still occur.
  • TransactionDefinition.ISOLATION_REPEATABLE_READ: repeated reads of the same field return consistent results unless the data is modified by the transaction itself. It prevents dirty reads and non-repeatable reads, but phantom reads can still occur.
  • TransactionDefinition.ISOLATION_SERIALIZABLE: the highest isolation level, fully compliant with ACID isolation. All transactions execute one after another, so they cannot interfere with each other at all; this level prevents dirty reads, non-repeatable reads, and phantom reads. However, it can seriously hurt performance, so it is not usually used.
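
The four concrete levels correspond to the JDBC isolation levels defined as constants on `java.sql.Connection`, so the list above can be summarized as a small lookup that runs without Spring on the classpath. This is only a sketch: `IsolationAnomalies` and `possibleAnomalies` are hypothetical names, and the anomaly sets are the ones stated in the list.

```java
import java.sql.Connection;

class IsolationAnomalies {
    // Returns the read anomalies that remain possible at the given JDBC isolation level.
    static String possibleAnomalies(int level) {
        switch (level) {
            case Connection.TRANSACTION_READ_UNCOMMITTED:
                return "dirty read, non-repeatable read, phantom read";
            case Connection.TRANSACTION_READ_COMMITTED:
                return "non-repeatable read, phantom read";
            case Connection.TRANSACTION_REPEATABLE_READ:
                return "phantom read";
            case Connection.TRANSACTION_SERIALIZABLE:
                return "none";
            default:
                throw new IllegalArgumentException("unknown isolation level: " + level);
        }
    }

    public static void main(String[] args) {
        System.out.println(possibleAnomalies(Connection.TRANSACTION_READ_COMMITTED));
    }
}
```

Each step up the list removes one anomaly, at the cost of more locking or versioning work in the database.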

9.3 Transaction propagation behavior in Spring transactions

Cases that support the current transaction:

  • TransactionDefinition.PROPAGATION_REQUIRED: if a transaction currently exists, join it; if not, create a new one.
  • TransactionDefinition.PROPAGATION_SUPPORTS: if a transaction currently exists, join it; if not, continue running non-transactionally.
  • TransactionDefinition.PROPAGATION_MANDATORY: if a transaction currently exists, join it; if not, throw an exception. (mandatory: compulsory)

Cases that do not support the current transaction:

  • TransactionDefinition.PROPAGATION_REQUIRES_NEW: create a new transaction; if a transaction currently exists, suspend it.
  • TransactionDefinition.PROPAGATION_NOT_SUPPORTED: run non-transactionally; if a transaction currently exists, suspend it.
  • TransactionDefinition.PROPAGATION_NEVER: run non-transactionally; if a transaction currently exists, throw an exception.

Other cases:

  • TransactionDefinition.PROPAGATION_NESTED: if a transaction currently exists, create a transaction that runs nested inside it; if not, this behaves the same as TransactionDefinition.PROPAGATION_REQUIRED.
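
The seven behaviors boil down to a decision on two inputs: the propagation setting and whether a transaction already exists. The sketch below encodes the rules listed above as a plain decision table. The `Propagation` enum and `decide` method are hypothetical and written for this example; they are not Spring's API and do not start real transactions.

```java
class PropagationSketch {
    // Hypothetical enum mirroring the seven propagation behaviors; not Spring's API.
    enum Propagation { REQUIRED, SUPPORTS, MANDATORY, REQUIRES_NEW, NOT_SUPPORTED, NEVER, NESTED }

    // Describes what happens for a given setting, depending on whether a transaction exists.
    static String decide(Propagation propagation, boolean txExists) {
        switch (propagation) {
            case REQUIRED:
                return txExists ? "join current" : "create new";
            case SUPPORTS:
                return txExists ? "join current" : "run non-transactionally";
            case MANDATORY:
                return txExists ? "join current" : "throw exception";
            case REQUIRES_NEW:
                return txExists ? "suspend current, create new" : "create new";
            case NOT_SUPPORTED:
                return txExists ? "suspend current, run non-transactionally" : "run non-transactionally";
            case NEVER:
                return txExists ? "throw exception" : "run non-transactionally";
            case NESTED:
                return txExists ? "create nested" : "create new";
            default:
                throw new IllegalArgumentException("unknown propagation: " + propagation);
        }
    }

    public static void main(String[] args) {
        for (Propagation p : Propagation.values()) {
            System.out.println(p + ": with tx -> " + decide(p, true)
                    + "; without tx -> " + decide(p, false));
        }
    }
}
```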

9.4 AOP

AOP is generally implemented with the proxy pattern. In Java, JDK dynamic proxies are the usual choice, but as is well known, JDK dynamic proxies can only proxy interfaces, not classes. Spring AOP therefore switches between mechanisms as needed, since it supports CGLIB, AspectJ, and JDK dynamic proxies alike.

  • If the target object's class implements an interface, Spring AOP uses a JDK dynamic proxy to generate the AOP proxy class.
  • If the target object's class does not implement an interface, Spring AOP uses CGLIB to generate the AOP proxy class. Either way, the selection is completely transparent to the developer, who does not need to care about it.
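
To make the first case concrete, here is a small sketch of a JDK dynamic proxy that wraps an interface implementation with before/after logic, which is the basic mechanism behind interface-based AOP. It uses only `java.lang.reflect.Proxy` and `InvocationHandler` from the JDK; `Greeter`, `SimpleGreeter`, and `withLogging` are names invented for the example, and this is not how Spring wires its proxies internally.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class JdkProxyDemo {
    interface Greeter {
        String greet(String name);
    }

    static class SimpleGreeter implements Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Wraps the target in a proxy that records a line before and after every call.
    static Greeter withLogging(Greeter target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("before " + method.getName());
            Object result = method.invoke(target, args); // delegate to the real object
            log.add("after " + method.getName());
            return result;
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Greeter greeter = withLogging(new SimpleGreeter(), log);
        System.out.println(greeter.greet("Spring"));
        System.out.println(log);
    }
}
```

Note that `Proxy.newProxyInstance` takes a list of interfaces, which is exactly why this mechanism cannot proxy a class that implements none.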

Check out the following articles for this part:

  • juejin.im/post/684490…

9.5 IOC

Spring IOC initialization process:

IOC source code reading


10 Practical Scenario Questions

I think practical scenario questions test your ability to apply what you know and to think a problem through. I suggest getting into the habit of thinking things over in everyday work, so that you don't panic when such questions come up in an interview. If you really don't know what to say, the interviewer may give you a hint. Whatever you do, don't pretend to understand something you don't and answer at random. The interviewer may ask questions like these:

  1. Suppose you are building a banking app, and several people may send money to the same account at the same time. What problems can arise, and how do you solve them (locking)?
  2. How do you ensure the quality and correctness of your code?
  3. Inventory can be deducted when the order is placed or when the payment is made. Analyze the advantages and disadvantages of each.
  4. Wages must be paid to 100,000 people at the same time. How do you design a concurrent program that finishes everything within 1 minute?
  5. If you were asked to design system XXX, how would you design it?
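
For the banking question, one possible starting point for an answer is to guard the balance with a lock so that concurrent deposits cannot overwrite each other. The sketch below is a single-process illustration only: `Account` and `simulate` are hypothetical names, and a real system would also need database transactions or distributed locking, which are outside the scope of this example.

```java
import java.util.concurrent.locks.ReentrantLock;

class Account {
    private final ReentrantLock lock = new ReentrantLock();
    private long balance;

    // The read-modify-write on balance happens entirely under the lock.
    void deposit(long amount) {
        lock.lock();
        try {
            balance += amount;
        } finally {
            lock.unlock();
        }
    }

    long balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    // Simulates `senders` threads each depositing 1 unit `times` times into one account.
    static long simulate(int senders, int times) {
        Account account = new Account();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    account.deposit(1);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 5_000));
    }
}
```

Without the lock, `balance += amount` is a separate read and write, so two senders can read the same old balance and one deposit is lost.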

Final thoughts

Finally, a few more points:

  1. Be careful what you put on your resume, and be very familiar with everything you do put on it, because interviewers generally base their questions on your resume.
  2. It's also important to have one project you know inside out. Interviewers are likely to ask a lot about it, so review the projects you've worked on before the interview.
  3. When discussing fundamentals with the interviewer, such as the use of design patterns or multithreading, tie them to concrete project scenarios or to how you use them day to day.
  4. Pay attention to your open source GitHub projects. Interviewers may dig into them for questions.
  5. Find out in advance about the values of the company you are interviewing with, to judge whether you are a good fit.

In addition, I personally feel that an interview is like a new journey, where both failure and success are normal. So don't lose heart because you failed an interview, and don't become complacent because you passed one. A better future awaits you. Keep going!

Besides, the author has set himself a challenge here: I will write a systematic summary of Dubbo, ZooKeeper, and related topics in the future. I'm sure you'll get something out of it!

Finally, I attach a link for soliciting articles.

There are also three partners for the event: