Content from teacher Tao Hui's Geek Time course "System Performance Tuning: Must Know, Must Master", available by subscription on Geek Time.

Coincidentally, the solid-state drive in my computer just broke, so I took notes as I read this part.

First, a summary of the contents:

  1. The external performance metrics of SSDs: IOPS, bandwidth/throughput, and access latency.
  2. How SSDs work internally: the concepts of cells, pages, and blocks.
  3. How to design SSD-friendly applications:
     - Data structures: avoid in-place updates; separate hot data from cold data.
     - I/O processing: avoid long, heavy writes; avoid overfilling the SSD's capacity.

Hard disks that are faster than HDDs

Many people know traditional hard drives better: they have been used in industry for decades and are described in many textbooks. So, to introduce SSD performance, let me walk you through the external performance metrics and features by way of comparison.

The performance of a disk is characterized by IOPS, bandwidth/throughput, and access latency. IOPS is the number of read/write requests the system processes per second. Access latency is the time between the initiation of an I/O request and the completion of that request by the storage system. Throughput, or bandwidth, measures the actual data transfer rate.

The internal operation of a traditional hard disk should be familiar. In simple terms, when an application sends an I/O request to the disk, the request enters the disk's I/O queue. When it is that I/O's turn to access data, the head must move mechanically to where the data is stored: it seeks to the appropriate track, the platter rotates to the appropriate sector, and only then is the data transferred. For a common hard disk, random I/O latency is about 8 ms, I/O bandwidth is about 100 MB per second, and random IOPS is about 100.

There are many kinds of SSDs. By cell technology, there are single-level and multi-level designs; by quality and performance, there are enterprise-grade and consumer-grade products; by interface and protocol, there are SAS, SATA, PCIe, and NVMe. The table below compares the three performance metrics for HDDs and SSDs, taking SSDs using the popular NVMe protocol as the example.

| Metric | HDD | NVMe SSD |
| --- | --- | --- |
| Random I/O latency | ~8 ms | microseconds (roughly 100x lower) |
| I/O bandwidth | ~100 MB/s | several GB/s |
| Random IOPS | ~100 | hundreds of thousands |

As you can see, the random I/O latency of an SSD is about a hundred times lower than that of a traditional hard drive, generally at the microsecond level; I/O bandwidth is also many times higher, up to several gigabytes per second; and random IOPS is thousands of times higher, reaching hundreds of thousands.
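To make these three metrics concrete, here is a minimal user-space measurement sketch in Python. The file path is a made-up assumption (it should point to a pre-created file of at least a few hundred MB), and because these reads go through the OS page cache, true device numbers would require bypassing it (e.g. O_DIRECT with aligned buffers):

```python
import os, random, time

PATH = "testfile.bin"   # hypothetical pre-created large file
BLOCK = 4096            # read 4 KB per request (one page-sized I/O)
N = 1000                # number of random reads

fd = os.open(PATH, os.O_RDONLY)
size = os.fstat(fd).st_size

start = time.perf_counter()
for _ in range(N):
    offset = random.randrange(0, size - BLOCK)
    os.pread(fd, BLOCK, offset)   # one random read, queue depth 1
elapsed = time.perf_counter() - start
os.close(fd)

# At queue depth 1, IOPS is roughly the reciprocal of average latency.
print(f"avg latency: {elapsed / N * 1e6:.1f} us")
print(f"IOPS: {N / elapsed:.0f}")
print(f"throughput: {N * BLOCK / elapsed / 1e6:.1f} MB/s")
```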


Performance features and mechanisms of SSDs

Internally, SSDs work very differently from HDDs, so let's start with a few concepts: cells, pages, and blocks. Today's mainstream SSDs are NAND-based and store bits in cells. Each cell can store one or more bits. Every erase of a cell shortens its life, so a cell can only withstand a limited number of erasures. The more bits a cell stores, the lower the manufacturing cost and the larger the SSD's capacity, but also the lower the endurance (the number of erase cycles it can sustain).

Addendum: NAND flash rules

  1. NAND flash does not support overwriting: a write can only go to an empty (erased) cell. When data is modified, the whole page is copied into a cache, the corresponding data is modified there, and the changed data is written out to a new page, while the original page is marked invalid. To write to a location that holds invalid data, the page must first be erased before new data can be stored there.
  2. Writes happen per page; erases happen per block. Before a block is erased, the valid pages in it must be moved elsewhere.
  3. Each block can only be erased a limited number of times (it has a lifetime); too many erasures turn it into a bad block.

A page contains many cells. A typical page size is 4 KB, and the page is the smallest unit of reads and writes. Unlike an HDD, which can overwrite any byte directly, an SSD has no "overwrite" operation. Once a page has been written, it cannot be partially rewritten; it must be erased and reset as a whole together with the other pages adjacent to it. Multiple pages are grouped into blocks. A typical block size is 512 KB or 1 MB, i.e., about 128 or 256 pages. The block is the basic unit of erasure; each erase resets all pages within the entire block.
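To make the no-overwrite and erase-per-block rules concrete, here is a minimal toy model in Python. This is a sketch of the mechanism only: real controllers and flash translation layers are far more sophisticated, and the sizes are just the typical values mentioned above.

```python
# Toy model of NAND page writes and block erases (illustration only).

PAGE_SIZE = 4 * 1024       # 4 KB page: the smallest read/write unit
PAGES_PER_BLOCK = 128      # 128 pages -> 512 KB block: the erase unit

class ToyFlash:
    def __init__(self, num_blocks):
        # Each physical page is 'free', 'valid', or 'invalid'.
        self.pages = [['free'] * PAGES_PER_BLOCK for _ in range(num_blocks)]
        self.mapping = {}  # logical page number -> (block, page)

    def write(self, logical_page):
        # No overwrite: updating a logical page marks the old physical
        # page invalid and writes the new version to a fresh page.
        if logical_page in self.mapping:
            b, p = self.mapping[logical_page]
            self.pages[b][p] = 'invalid'
        b, p = self._find_free_page()
        self.pages[b][p] = 'valid'
        self.mapping[logical_page] = (b, p)

    def _find_free_page(self):
        for b, block in enumerate(self.pages):
            for p, state in enumerate(block):
                if state == 'free':
                    return b, p
        raise RuntimeError('no free pages: garbage collection needed')

    def erase_block(self, b):
        # Erase resets the whole block; valid pages must be moved first.
        assert all(s != 'valid' for s in self.pages[b]), 'move valid pages first'
        self.pages[b] = ['free'] * PAGES_PER_BLOCK

flash = ToyFlash(num_blocks=4)
flash.write(0)   # first write of logical page 0
flash.write(0)   # "update": old page invalidated, a new page consumed
print(sum(s == 'invalid' for blk in flash.pages for s in blk))  # 1
```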

  1. If an SSD already stores a lot of data, writing a page often requires moving existing data first.
  2. Erasure is done per block and is relatively slow, usually a few milliseconds. So with synchronous I/O, the issuing application may experience significant write latency caused by block erases.
  3. Maintaining a threshold of free blocks is necessary for fast write responses; this is the purpose of SSD garbage collection (GC), which ensures that future page writes can be served quickly from brand-new pages.

Write amplification, or WA, is a disadvantage of SSDs compared to HDDs: the amount of data physically written to the SSD can be many times the amount written at the application layer. On the one hand, page-level writes require moving existing data to empty a page; on the other hand, GC operations also move user data in order to perform block-level erases. Because an SSD can sustain only a limited number of erasures, also known as program/erase (P/E) cycles, write amplification shortens the SSD's life.

Personal observation: when you write, the drive has to move data around to make room for you, and moving data is itself writing. So the act of writing gets amplified.

For each block, once the maximum erase count is reached, the block dies. For SLC blocks, the typical number of P/E cycles is 100,000; for MLC blocks, 10,000; for TLC blocks, it can be in the thousands. To preserve SSD capacity and performance, erase counts must be balanced across blocks, and SSD controllers implement a "wear leveling" mechanism to achieve this. During wear leveling, data is moved between blocks to even out the wear, and this mechanism also contributes to write amplification.

Personal summary: you can't let some blocks simply die early, so long-lived blocks have to amortize the load of short-lived ones; that means moving data around, which in turn causes write amplification.
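Restated as a formula, write amplification is just the ratio of physical to logical writes:

$$\mathrm{WA} = \frac{\text{bytes physically written to flash}}{\text{bytes written by the application}} \;\geq\; 1$$

and, as the lifetime discussion below shows, the SSD's life shrinks roughly in proportion to WA.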

Designing SSD-friendly programs

SSDs beat HDDs by a factor of thousands in IOPS, roughly a hundredfold in access latency, and tens of times in throughput. However, SSDs have their drawbacks: high cost, smaller capacity, and limited write endurance. As technology has advanced, these three shortcomings have been lessened in recent years, and more and more systems use SSDs to relieve application I/O bottlenecks. Many deployments show that SSDs bring significant application performance improvements over HDDs. In most deployment scenarios, however, SSDs are treated merely as "faster HDDs" and their potential is underutilized: although applications perform better with SSDs as storage, those gains come mainly from the higher IOPS and bandwidth SSDs provide. If, going further, applications are designed to be SSD-friendly by taking SSD internals into account, application performance can be improved even more, and SSD life can be extended to reduce operating costs. Let's look at a series of SSD-friendly design changes that can be made at the application layer.

Why design SSD-friendly software and applications?

SSD-friendly programs have three benefits:

  1. Better application performance;
  2. More efficient SSD I/O;
  3. Longer SSD service life.

Let me explain them separately.

Better application performance. Migrating from HDDs to SSDs generally means better application performance thanks to the better I/O performance of SSDs, but simply adopting SSDs without changing the application's design may not achieve optimal performance. We had an application that did exactly that. It needed to write continuously to files to save its data, and the main performance bottleneck was disk I/O. With HDDs, the maximum application throughput was 142 queries per second (QPS), and that was the best we could get regardless of how the application design was changed or tuned. When the same application was migrated to SSDs, throughput increased to 20,000 QPS, a 140-fold speedup, mainly due to the higher IOPS the SSDs provide. After further optimizing the application design to make it SSD-friendly, throughput rose to 100,000 QPS, a further fivefold improvement over the straightforward port. The secret was to issue I/O from multiple concurrent threads, which exploits the internal parallelism of SSDs. Remember: multiple I/O threads don't help with HDDs, because an HDD has only one head.

More efficient storage I/O. The smallest internal I/O unit on an SSD is a page, say 4 KB in size. Therefore even a single-byte read or write must be performed at page level. Application writes can thus cause larger physical writes on the SSD, which is write amplification (WA). Because of this, if an application's data structures or I/O patterns are not SSD-friendly, write amplification becomes unnecessarily large and the SSD's I/O capacity cannot be fully utilized.

Longer service life. SSDs wear out because each storage cell can sustain only a limited number of program/erase cycles. In fact, SSD life depends on four factors: SSD capacity, the maximum number of erase cycles, the write amplification factor, and the application's write rate. For example, take a 1 TB SSD rated for 10,000 erase cycles, with an application writing at 100 MB per second: with a write amplification of 4, the SSD lasts only about 10 months. An SSD rated for 3,000 erase cycles with a write amplification factor of 10 would last only about a month. Given the relatively high cost of SSDs, we hope applications are SSD-friendly so that the SSDs last longer.
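The two lifetime figures above can be verified with quick arithmetic. A sketch of the calculation (assuming decimal units, i.e. 1 TB = 10^12 bytes and 1 MB = 10^6 bytes):

```python
def ssd_lifetime_days(capacity_bytes, pe_cycles, write_amplification,
                      host_write_rate_bps):
    """Rough lifetime estimate: total flash endurance (capacity x P/E
    cycles) divided by the physical write rate (host rate x WA)."""
    total_endurance = capacity_bytes * pe_cycles
    physical_rate = host_write_rate_bps * write_amplification
    return total_endurance / physical_rate / 86_400  # seconds -> days

TB, MB = 10**12, 10**6

# 1 TB SSD, 10,000 P/E cycles, WA = 4, application writing 100 MB/s
print(ssd_lifetime_days(1 * TB, 10_000, 4, 100 * MB) / 30)   # ~9.6 months

# Same write rate, but 3,000 P/E cycles and WA = 10
print(ssd_lifetime_days(1 * TB, 3_000, 10, 100 * MB) / 30)   # ~1.2 months
```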

SSD-friendly design principles

When we design our programs, we can make them SSD-friendly to reap the three benefits mentioned earlier. So what are the SSD-friendly programming options? I've summarized four principles here, which fall broadly into two categories: data structures and I/O processing.

1. Data structure: Avoid in-place update optimizations

Traditional HDDs have high seek latency, so applications that use HDDs are often optimized to perform in-place updates, which avoid extra seeks (for example, always appending at the end of a file). As shown in the figure below, when performing random updates, an HDD generally reaches only about 170 QPS, while in-place updates on the same HDD reach about 280 QPS, much higher (left figure).

(Figure: random vs. in-place update throughput on an HDD.)
For SSDs, random and sequential read/write performance is similar, so in-place updates provide no IOPS advantage.

(Figure: random vs. in-place update throughput on an SSD.)
Personal summary: on an SSD, a random update can simply be written to a fresh empty page, avoiding the copy-and-move that an in-place overwrite triggers, whereas HDDs favor in-place updates.

Question: an in-place update involves a read-modify-write cycle, while a random update is just a write. Is a random update guaranteed not to touch pages that have already been written? Answer: an in-place update is guaranteed to touch an already-written page, but a random update is not; it may land on a fresh page.
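One common way to follow this principle is an append-only (log-structured) write path. Below is a minimal sketch; the file name, record format, and `load_latest` helper are all made up for illustration:

```python
import json, os

# Append-only update log: instead of seeking back and overwriting a
# record in place, every update is appended as a new version, and the
# latest entry for a key wins on replay. (A sketch; real systems add
# compaction, checksums, and an index.)

LOG_PATH = "updates.log"  # hypothetical file name

def append_update(key, value):
    record = json.dumps({"key": key, "value": value})
    with open(LOG_PATH, "a") as f:
        f.write(record + "\n")

def load_latest():
    state = {}
    if os.path.exists(LOG_PATH):
        with open(LOG_PATH) as f:
            for line in f:
                rec = json.loads(line)
                state[rec["key"]] = rec["value"]  # later entries win
    return state

append_update("user:42", {"likes": 10})
append_update("user:42", {"likes": 11})   # supersedes, never overwrites
print(load_latest()["user:42"])           # {'likes': 11}
```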


2. Data structure: Separate hot data from cold data

For almost all storage-backed applications, the data stored on disk is not accessed with equal probability. Consider a social-networking application that needs to track user activity. A straightforward design for user data storage is to pack all users together in the same place (for example, one file on the SSD), ordered by some user attribute such as registration time. Later, when the activity of a popular user needs updating, the SSD is accessed (read/modify/write) at page granularity, so if a user's record is smaller than a page, neighboring users' data is accessed along with it. If the application doesn't actually need the neighbors' data, the extra traffic not only wastes I/O bandwidth but also needlessly wears down the SSD. To mitigate this, hot data should be separated from cold data when SSDs are the storage device. The separation can happen at different levels or in different ways: different files, different regions of a file, or different tables.
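A minimal sketch of the idea, with made-up file names and a hypothetical `is_hot` heuristic:

```python
import json

# Hot/cold separation: route frequently updated records and rarely
# touched records to different files, so page-level writes to hot
# data never drag cold neighbors along.

HOT_PATH, COLD_PATH = "hot_users.log", "cold_users.log"

def is_hot(user):
    return user["updates_per_day"] > 100  # hypothetical threshold

def store(user):
    path = HOT_PATH if is_hot(user) else COLD_PATH
    with open(path, "a") as f:
        f.write(json.dumps(user) + "\n")

store({"id": 1, "updates_per_day": 500, "activity": []})  # -> hot file
store({"id": 2, "updates_per_day": 2, "activity": []})    # -> cold file
```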

3. I/O processing: Avoid long and heavy writes

SSDs typically have a GC mechanism that continually reclaims storage blocks for later use. GC can run in background or foreground mode. The SSD controller generally maintains a threshold of free blocks; whenever the number of free blocks drops below the threshold, background GC starts. Because background GC runs asynchronously (non-blocking), it does not affect the application's I/O latency. But if the rate at which blocks are consumed exceeds the GC rate and background GC cannot keep up, foreground GC is triggered. During foreground GC, each block must be erased on demand (blocking) before the application can use it, and the block-release operation can take several milliseconds or more, causing significant I/O latency for the writing application. For this reason, it is best to avoid long, heavy writes so that foreground GC never kicks in.
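One application-level tactic consistent with this advice is to pace large writes so background GC can keep the free-block pool above its threshold. A minimal sketch; the chunk size and pause are made-up tuning knobs, not values from the course:

```python
import time

CHUNK = 4 * 1024 * 1024     # write 4 MB at a time
PAUSE_S = 0.01              # brief pause between chunks

def paced_write(path, data):
    """Split one long, heavy write into short bursts, giving the
    drive's background GC time to reclaim blocks in between."""
    with open(path, "wb") as f:
        for off in range(0, len(data), CHUNK):
            f.write(data[off:off + CHUNK])
            f.flush()
            time.sleep(PAUSE_S)   # let background GC catch up

paced_write("big_output.bin", b"\x00" * (64 * 1024 * 1024))
```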

4. I/O processing: Prevent SSD storage from being too full

An overly full SSD hurts both write amplification and GC-induced write performance. During GC, blocks must be erased to create free blocks, and before a block can be erased its valid data must be moved and preserved; sometimes several storage blocks must be compacted to yield one free block. The number of blocks that must be compacted to produce a free one depends on disk space usage. Suppose the disk is, on average, A percent full: to free one block, roughly 1/(1−A) blocks must be compacted. Clearly, the higher the SSD's space usage, the more blocks must be moved to free one, which consumes more resources and leads to longer I/O wait times. For example, if A = 80%, about 5 blocks must be moved to free one block; when A = 95%, about 20 blocks must be moved.
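The 1/(1−A) figure follows from a quick back-of-the-envelope argument: if every block is, on average, a fraction A full of valid data, then compacting n blocks frees n(1−A) blocks' worth of space, and setting that equal to one whole block gives

$$n(1-A) = 1 \quad\Rightarrow\quad n = \frac{1}{1-A}; \qquad A = 0.80 \Rightarrow n = 5, \qquad A = 0.95 \Rightarrow n = 20.$$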

Conclusion

Storage systems are built on either traditional hard disks or solid-state disks (SSDs), and SSDs provide much higher I/O performance than traditional hard disks. SSDs are generally the choice when a system has demanding IOPS or latency requirements. Many file systems, database systems, and data infrastructures are designed specifically for SSDs and offer significant performance improvements over HDD-based systems. SSDs work differently from HDDs, so if you design SSD-friendly applications, you can realize the full performance potential of SSDs.

Recently, I have been running my own knowledge planet (知识星球), which aims to help friends who want to learn about big data get started, and to advance together with those already working as big data developers. It's free, but that doesn't mean you'll gain nothing from it. If you're interested, feel free to join.