If you ask about the differences between memory and disk, most developers can list one or two. Indeed, at a recent internal team sharing session, four or five colleagues gave different answers: memory is fast, disks have large capacity, disks can store data long-term, and disks are physically larger.

This is all true, of course, but any question is worth looking beyond the surface for the reason behind it. When I asked why most computers have large disk capacity, why disk storage is slow, why disks can hold data for a long time, and even why disks have to be physically so big, not everyone could clearly explain the reasons; some had no feel for these two kinds of storage at all and did not know from which angle to answer. Although most developers touch both of them every day as they work on their machines, let me share my understanding, in the hope of conveying a general feel for some of the details and characteristics.

As a volatile form of storage, DRAM cannot hold data when the power is off. This is because memory uses capacitors to store binary bits. Each capacitor is tiny, but memory is manufactured very densely so that it can hold a great many bits. Each capacitor stores one bit, and each unit of storage is called a memory cell. The overall structure can be simplified as shown in the figure below:



If you look at the grid area in the diagram, this is the arrangement of memory cells, which is simply a two-dimensional matrix. Let's assume we have a very small memory divided into 16 supercells, with each supercell containing 8 memory cells; then this memory can hold up to 16 × 8 = 128 bits of information.
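
To make this concrete, here is a minimal Python sketch (the names and layout are my own, not any real hardware API) that models this tiny memory as a 4 × 4 matrix of supercells, each holding 8 bits:

```python
# A toy model of the 16-supercell memory described above:
# a 4 x 4 matrix where each supercell stores 8 bits (one byte).
ROWS, COLS, BITS_PER_SUPERCELL = 4, 4, 8

# Represent each supercell as a list of 8 bits, all initially 0.
memory = [[[0] * BITS_PER_SUPERCELL for _ in range(COLS)] for _ in range(ROWS)]

total_bits = ROWS * COLS * BITS_PER_SUPERCELL
print(total_bits)  # 128 bits, i.e. 16 bytes in total
```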

So how are reads and writes actually implemented?

In the figure, the address line marked ADDR runs between the storage controller and the block of memory cells. The number 2 above it indicates that there are two address lines. In operation, the two address lines carry address information by sending 0 and 1 signals.

To specify a supercell at a particular row and column of the memory matrix, the storage controller sends electrical signals down the ADDR lines twice: the first transmission specifies the target row and the second specifies the target column. For example, suppose the first transmission puts 0 and 1 on the two address lines respectively, giving the value 01, which selects the second row of the memory matrix (note that the first row is row 0); the second transmission then puts 1 and 1 on the address lines, giving the value 11, which selects the fourth column. Together they pinpoint the supercell in the second row and fourth column, and the storage controller then reads or writes data at that supercell.
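
Below is a rough sketch of this two-step addressing, again with hypothetical names of my own; real DRAM timing (row and column strobes and so on) is more involved, but the row-then-column idea is the same:

```python
# Two-step addressing: the controller first puts a row address on the
# ADDR lines, then a column address, reusing the same two wires.
ROWS, COLS = 4, 4
memory = [[f"supercell({r},{c})" for c in range(COLS)] for r in range(ROWS)]

def decode(addr_signals):
    """Turn the bits on the two address lines into an index.
    addr_signals is a tuple like (0, 1), most significant line first."""
    return addr_signals[0] * 2 + addr_signals[1]

row = decode((0, 1))  # value 01 -> row 1, i.e. the second row
col = decode((1, 1))  # value 11 -> col 3, i.e. the fourth column
print(memory[row][col])  # supercell(1,3): second row, fourth column
```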

How is the data in memory transferred?

You may have noticed the double arrow marked DATA in the figure. The number 8 indicates that there are eight data lines, which means the memory can transfer at most 8 bits, i.e. one byte, per read or write. The details need not be explored further; we just need to know that read and write operations are carried out through the address lines and data lines connected to the two-dimensional matrix of memory cells.
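
Continuing in the same toy style (the helper name read_byte is my own invention), a read drives all 8 bits of the addressed supercell onto the data lines at once, which is why one access moves exactly one byte:

```python
# One memory access: the 8 cells of the addressed supercell are
# placed on the 8 data lines in parallel, i.e. one byte per access.
def read_byte(supercell_bits):
    """supercell_bits is the list of 8 bits stored in one supercell."""
    value = 0
    for bit in supercell_bits:  # each bit travels on its own data line
        value = (value << 1) | bit
    return value

print(read_byte([0, 1, 0, 0, 0, 0, 0, 1]))  # 0b01000001 -> 65
```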

So that’s the basic picture of memory that I want to share. The first impression it leaves me with is that memory relies on electricity for reading, writing, and storing. When I picture memory at work, I imagine countless currents flowing rapidly through the memory chip; keep in mind that electrical signals propagate at a substantial fraction of the speed of light.

As a kind of non-volatile storage, the disk (hard disk) is suitable for long-term storage of data. To understand why, we again need to start from the disk's structure.

To simplify, let me use the optical drive as an analogy. We know that an optical drive reads data by aiming a laser at the disc, which spins as it plays. The disc records binary information through changes in the pits pressed into its surface; the laser is aimed at one region, and the disc's rotation changes which part of the surface passes under the laser head, so different areas can be read and eventually all of the data is covered. Reading data from a hard disk works in a similar way.

Here, it is important to understand the concepts of sectors and tracks. The following image shows a platter surface to help you form a first impression of the disk:



You can transfer your mental image of a CD to the disk platter. As shown above, the platter surface can be understood as many tracks of different radii; each track is a hollow ring, and each ring is made up of multiple sectors. The sector is the smallest unit in which a disk stores data (usually 512 bytes), so one platter surface contains a great many sectors.
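
As a back-of-the-envelope illustration (the track and sector counts below are made up for the example; real drives vary, and modern ones put more sectors on outer tracks), the capacity of one platter surface is simply tracks × sectors per track × sector size:

```python
# Rough capacity of one platter surface, under the simplifying
# assumption that every track holds the same number of sectors.
TRACKS_PER_SURFACE = 50_000  # hypothetical figure for illustration
SECTORS_PER_TRACK = 2_000    # hypothetical figure for illustration
BYTES_PER_SECTOR = 512       # the classic sector size

surface_bytes = TRACKS_PER_SURFACE * SECTORS_PER_TRACK * BYTES_PER_SECTOR
print(f"{surface_bytes / 10**9:.1f} GB per surface")  # 51.2 GB
```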

A compact disc is read by a laser head floating above it, and a disk has a similar component pointed at its surface, as shown below:



There is a drive arm above the platter, and at the end of the arm is the read/write head (similar to the laser head of a CD player). The drive arm can swing horizontally to different angles, and the platter rotates at a certain rate. These two characteristics are very important.

What is the process of reading and writing data to a disk?

As we saw above, memory locates data and transfers it to and from the memory cells through electrical signals; the whole physical process is electrical. To read data from a disk, you must first determine the target address (the sector's location) and then have the read/write head positioned at the beginning of that sector.

So how does the read/write head locate the target sector?

From the structure of the platter we know the process: first, the drive arm's angle is adjusted to position the head over the track containing the target sector; then the platter rotates until the target sector sits directly under the read/write head. This series of steps is a mechanical movement (now you know why ordinary hard disks are often called mechanical hard disks). When the disk performs many read and write operations whose target data is scattered across different sectors of the surface, the drive arm must constantly adjust its angle and the platter must keep rotating into position. Because this is a mechanical process, this is exactly where the disk's read/write bottleneck lies.
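
To get a feel for the cost, here is a rough estimate of a single random access, using the textbook decomposition into seek time, rotational latency, and transfer time (the numbers are ballpark figures for a 7200 RPM drive, not measurements of any particular model):

```python
# Rough cost of one random disk access:
#   T_access = T_seek + T_rotation + T_transfer
RPM = 7200
AVG_SEEK_MS = 9.0         # typical average seek time, assumed here
SECTORS_PER_TRACK = 400   # hypothetical figure

ms_per_revolution = 60_000 / RPM         # 8.33 ms per full turn
avg_rotation_ms = ms_per_revolution / 2  # on average, wait half a turn
transfer_ms = ms_per_revolution / SECTORS_PER_TRACK  # one sector passes by

total_ms = AVG_SEEK_MS + avg_rotation_ms + transfer_ms
print(f"~{total_ms:.2f} ms per random access")  # ~13.2 ms, vs nanoseconds for DRAM
```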

In an actual hard disk, there is often not just one platter but several stacked vertically, as shown below:



Each platter has two surfaces, each with its own read/write head, but this does not prevent us from understanding the read/write process by considering only one surface.
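
Extending the earlier surface estimate (same made-up figures), the capacity of the whole drive simply multiplies by the number of surfaces:

```python
# Whole-drive capacity: platters x 2 surfaces each, then the same
# per-surface arithmetic as before (hypothetical figures).
PLATTERS = 3
SURFACES = PLATTERS * 2  # each platter has two recordable faces
TRACKS_PER_SURFACE = 50_000
SECTORS_PER_TRACK = 2_000
BYTES_PER_SECTOR = 512

drive_bytes = SURFACES * TRACKS_PER_SURFACE * SECTORS_PER_TRACK * BYTES_PER_SECTOR
print(f"{drive_bytes / 10**9:.1f} GB drive")  # 307.2 GB with these numbers
```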

Ok, so that’s the basic understanding of the structure of memory and disk. With that in mind, I believe going back to the questions at the start of the article yields more fundamental explanations. Memory storage is fast because it is a process of electrical transmission, and electrical signals travel extremely fast, while disk reading is a mechanical process with inherent limitations. Disk capacity is large because disks use magnetic material to store data, which suits long-term storage, and the manufacturing cost per byte is low, making disks economical in the current environment; memory, by contrast, is expensive per byte and is better suited to the role of a data cache. The physical size of a disk is determined by the way it is constructed: there are multiple platter surfaces and drive arms that must perform mechanical movements, and the precision machinery has certain space requirements of its own.

In fact, in sharing this topic I just want to give developers one takeaway: be sensitive to the cost of data storage. Today's CPUs are powerful, and their computation speed is beyond our everyday intuition, but I/O speed has not improved nearly as much, which is where the processing bottleneck of many projects now lies. With this understanding, when we find ourselves frequently manipulating data on disk we can picture the physical process behind it, which automatically raises a mental alarm: mechanical movement is too slow. This is why so much subsequent optimization is really about reducing disk I/O, and that is a topic that will be talked about for a long time, no matter how advanced the technology becomes.
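
As one tiny, self-contained illustration of "reducing disk I/O" (the file name here is hypothetical), reading a file in large chunks issues far fewer read requests than reading it byte by byte, even though both move the same data:

```python
# Reading the same file with different request sizes: fewer, larger
# reads mean fewer trips to the storage device.
def count_reads(path, chunk_size):
    """Return how many read calls it takes to consume the whole file.
    buffering=0 disables Python's own buffer, so each read is a real
    system call rather than being absorbed by a library-level cache."""
    calls = 0
    with open(path, "rb", buffering=0) as f:
        while f.read(chunk_size):
            calls += 1
    return calls

# Hypothetical usage with some local file:
# print(count_reads("data.bin", 1))          # one call per byte -- very slow
# print(count_reads("data.bin", 64 * 1024))  # 64 KB chunks -- far fewer calls
```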