I wrote a similar article before (non-CS front-end developers, pay attention: computer fundamentals at your fingertips!), but that was a year ago, and I feel I have grown a lot this year. This time the content is much richer than the previous article. Without further ado, hop on!

Preface – How important computer fundamentals really are

Take the principles of computer organization as an example. In an upcoming iteration, one student's task is to build a formula-calculation feature, something like a JavaScript calculator but with a lot of business-specific detail. The task has many subtleties and requires understanding how numbers work in JavaScript: JavaScript stores numbers as IEEE 754 64-bit floating-point values, and 64-bit floating point has some inherent limitations:

  • What is the largest integer a 64-bit floating-point number can represent safely? You need to be able to explain to the product manager why larger integer arithmetic isn't supported: the largest safe integer in JavaScript is 2 to the power of 53, minus 1. Why? You can't answer that without knowing how the machine represents numbers
  • JavaScript is often criticized because 0.3 - 0.2 does not equal 0.1. Why? You need to explain to the product manager how to handle precision issues, and you can't do that without understanding how decimals are represented in floating point and converted back to base 10
  • How many decimal digits can JavaScript represent, and why that many?
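
These claims are easy to verify in any JS console; the outputs below are what V8 (Chrome/Node.js) prints:

console.log(Number.MAX_SAFE_INTEGER);  // 9007199254740991, i.e. 2^53 - 1
console.log(2 ** 53 === 2 ** 53 + 1);  // true, above 2^53 integers lose precision
console.log(0.3 - 0.2);                // 0.09999999999999998, not 0.1
console.log(0.3 - 0.2 === 0.1);        // false
console.log(0.1 + 0.2);                // 0.30000000000000004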

The brief history of computers

Don't underestimate this history; it matters to us too. The evolution of I/O devices, for example, is a microcosm of the history of computing, and understanding that evolution gives you a larger framework in which to place the more detailed concepts.

The vacuum tube age

  • The background: For military computing, such as ballistic trajectory calculations, there was a need for a computing device that could replace the human brain.
  • Specific output: The first electronic digital computer: ENIAC(1946) was produced in this context
    • How did it compute, roughly? Computers of the time had many logic elements wired together, and computation was carried out with high and low voltage levels, which can represent the binary digits 0 and 1.

    • The main logic element was the vacuum tube, about half the size of a palm, which also means the machine itself was enormous. Vacuum tubes draw a lot of power, so the computer's power consumption was high as well. And since the computer could only recognize binary 0s and 1s, it could only be programmed in machine language: programmers wrote programs on punched paper tape, as in the chart, with a hole standing for 0 and no hole standing for 1

  • Problems: high power consumption, large size, and poor stability, since the logic elements were frequently damaged by being switched on and off

Transistor age

The picture below shows the vacuum tube on the far right, next to a transistor and an integrated circuit. You can see that the transistor is much smaller than the tube.

  • The background: the hope that computers could do better than the previous era in size, power consumption, computing power, and so on.
  • The output: the electrical properties of transistors let them replace vacuum tubes, and since transistors are much smaller, computers became much smaller too
    • Procedural programming languages emerged, along with the prototype of the operating system. Building a computer took tens of thousands to hundreds of thousands of transistors, all soldered onto circuit boards by hand, which was very error-prone

The age of small and medium scale integrated circuits

  • Specific output: integrated-circuit technology made computers smaller and smaller, with lower power consumption and higher reliability than hand-soldered transistors. Computers were still mainly used for scientific computation; some high-level languages appeared, and so did time-sharing operating systems

The age of VLSI

  • As integrated-circuit technology kept improving, large-scale and very-large-scale integrated circuits appeared, and with them the microprocessor and the microcomputer, which is what our home computers are today. Take Apple's A13 processor: it is built on a 7-nanometer process, and a CPU the size of a fingernail integrates about 8.5 billion transistors.

The basic components of computer hardware

Von Neumann system

  • On early computers such as ENIAC, every step of a calculation and every instruction to be executed had to be set up manually by programmers, which wasted an enormous amount of time

  • To solve this problem, von Neumann proposed the stored-program concept: instructions are loaded into the computer's memory in advance, in the form of binary code, and the machine then fetches and executes them one by one, starting from the address of the first instruction, until the program ends. This automatic execution mechanism makes the computer far more efficient than manual operation

  • The von Neumann architecture is centered on the arithmetic unit, whereas our modern computers are centered on memory. The arithmetic-unit-centered layout is not worth dwelling on, so let's go straight to the components of a modern, memory-centered computer

First, the five basic components of a computer are: the input device (such as a keyboard), memory (such as the RAM stick), the arithmetic unit (inside the CPU), the controller (inside the CPU), and the output device (such as a display). Let's look at how these basic hardware components process data

The solid lines in the figure above are data lines, which are the paths of data exchange, and the dashed lines are control lines and feedback lines, which are the paths of command transmission

  • First, the input device converts our data into the binary 0s and 1s the computer can recognize; the code we type directly is not something the computer understands as-is.

  • The data processed by the input device is then stored in memory (with the controller directing the input device); memory holds both data and program instructions

  • The controller then fetches the program instructions to be executed directly from memory. After fetching an instruction, the controller analyzes what it is supposed to do (an instruction is divided into an operation code and an address code); what it analyzes is the operation code, which says exactly what to do

  • Suppose it is a read operation, i.e. fetching data from memory into the CPU. The address to read from is written in the address code, so the CPU tells memory which address to fetch from, and memory hands the data straight back

  • When the operation finishes, the result is written back to memory, and memory can pass the result on to the output device (under the control of the controller)

  • Finally, on the output device, such as the monitor, we see the data we want

Next comes the dry stuff, which is a bit boring, so take a 5-minute break, grab a chicken leg first, then let's continue!

That's the basic flow of a computer. Now let's take an actual piece of JavaScript code as an example:

Suppose our JS code runs the line let a = 1 + 1. How do the five components above handle it?

  • First, the keyboard input code let a = 1 + 1 will be parsed into binary code, which is placed into memory under the control of the controller

  • Once the code is saved in memory, the CPU's controller begins fetching instructions from memory and works out that the instruction is an addition (1 + 1 is performed first; the result of 1 + 1 is assigned to the variable a afterwards).

  • The controller then directs the arithmetic unit, which takes the two 1s straight from memory, performs the addition to get the result, and returns it to memory, storing it at some address in memory

  • The controller then executes the second instruction, which by now amounts to let a = 2, since the 2 has already been computed; this instruction is an assignment (assign the result of 1 + 1 to the variable a, which is really just a memory address).

  • In other words, the controller directs the arithmetic unit to carry out the 1 + 1 addition and obtain the result 2

  • At the end, if we print console.log(a), then because a is essentially a memory address, the CPU looks up the value stored at that address, and that is what console.log displays

  • After the value to display is obtained, memory passes the data on to the display, so we see the result 2 on the screen

Computer programming language

The diagram below shows the difference between interpreted languages like JavaScript and Python and compiled languages like C and C++, and helps explain why interpreted languages are generally slower than compiled ones.

High-level languages are generally translated into machine language in one of two ways:

  • One is compilation: a compiler converts the high-level language into binary code ahead of time, as C does. That makes C particularly fast, because the compiled machine code runs directly on the system, as shown above; the downside is that the compilation step itself can be slow.
  • The other is interpretation, as in js: each line of code is translated into machine language (possibly assembly code or bytecode) and executed as it goes, interpreting a line and then executing that line

Note that with the first approach, a large amount of high-level code is translated into machine language in one go, which gives the compiler plenty of room to optimize the code; interpreted languages find that kind of optimization hard to do. In the V8 engine, however, JS still gets optimized: execution is split into a compile phase and an execute phase, and some code optimization is done during compilation. This approach of executing code immediately after compiling it is usually called JIT (Just-In-Time) compilation

Positional (carry) counting systems (emphasis)

This chapter mainly introduces base conversion, such as how to convert base 10 to base 2, and how to convert base 2 to base 10.

You do need to master this. For example, LeetCode has an easy problem, Excel Sheet Column Number, which is essentially base-26 to base-10 conversion; it is hard to solve without understanding base conversion. A sketch is shown below.
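
A minimal sketch of that idea (the function name is my own): treat the column title as a base-26 number where A = 1 ... Z = 26, and apply the usual "multiply by the base, add the digit" rule.

function titleToNumber(columnTitle) {
  let result = 0;
  for (const ch of columnTitle) {
    const digit = ch.charCodeAt(0) - 64; // 'A'.charCodeAt(0) === 65, so A -> 1
    result = result * 26 + digit;        // shift left one base-26 place, add the new digit
  }
  return result;
}

console.log(titleToNumber('A'));  // 1
console.log(titleToNumber('Z'));  // 26
console.log(titleToNumber('AB')); // 28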

How can an arbitrary base be converted to decimal

For example, how to convert base 2 101.1 to base 10. (parseInt(‘101.1’, 2) is not acceptable because parseInt returns an integer.)

The conversion method (add up the weights): binary 101.1 = 1 × 2^2 + 0 × 2^1 + 1 × 2^0 + 1 × 2^(-1) = 5.5 in decimal

The rule: multiply every binary digit by the corresponding power of 2, and note that the digits after the point are multiplied by negative powers.

Here is a question: the first plane of Unicode contains characters in the range 0000-FFFF (hexadecimal). What is hexadecimal FFFF in base 10?
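
You can check the answer with the same add-the-weights method, or with parseInt:

const fromWeights = 15 * 16 ** 3 + 15 * 16 ** 2 + 15 * 16 ** 1 + 15 * 16 ** 0;
console.log(fromWeights);          // 65535
console.log(parseInt('FFFF', 16)); // 65535, so the basic plane holds 65536 code points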

Decimal integer to arbitrary base

The method is divide-and-mod: repeatedly divide by the base and keep the remainders. For example, from base 10 to base 2:

Let's convert 89 to binary:

89 ÷ 2 = 44, remainder 1

44 ÷ 2 = 22, remainder 0

22 ÷ 2 = 11, remainder 0

11 ÷ 2 = 5, remainder 1

5 ÷ 2 = 2, remainder 1

2 ÷ 2 = 1, remainder 0

1 ÷ 2 = 0, remainder 1

And then you sort the remainder from the bottom up

1011001

So that’s 89 in binary
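
A minimal sketch of the same divide-and-mod procedure in JS (the function name is my own), with the built-in toString(2) as a cross-check:

function decimalToBinary(n) {
  if (n === 0) return '0';
  let bits = '';
  while (n > 0) {
    bits = (n % 2) + bits;   // the remainder becomes the next bit, read from bottom up
    n = Math.floor(n / 2);
  }
  return bits;
}

console.log(decimalToBinary(89)); // "1011001"
console.log((89).toString(2));    // "1011001", the built-in does the same job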

Decimal fractions to base n

Let's take base 2 as an example and use the "multiply by 2, take the integer part" method. The specific approach is:

  • Multiply the fraction by 2 and take out the integer part of the product
  • Multiply the remaining fractional part by 2 again to get another product, and take out its integer part as well
  • This continues until the fractional part of the product is zero or the desired precision is achieved

Base n works the same way

Let’s take a specific example

For example, decimal 0.25 is converted to binary

  • 0.25 * 2 = 0.5, take the integer part: 0
  • 0.5 * 2 = 1.0, take the integer part: 1

That is, decimal 0.25 in binary is 0.01 (the first integer part taken is the highest bit, the last one taken is the lowest bit).
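
A minimal sketch of this multiply-by-2 procedure (the function name and the 16-bit cutoff are my own choices):

function fractionToBinary(x, maxBits = 16) {
  let bits = '0.';
  while (x > 0 && bits.length < maxBits + 2) {
    x *= 2;
    const intPart = Math.floor(x); // the integer part is the next bit, high bit first
    bits += intPart;
    x -= intPart;                  // keep only the fractional part and repeat
  }
  return bits;
}

console.log(fractionToBinary(0.25)); // "0.01"
console.log(fractionToBinary(0.1));  // "0.0001100110011001", it never terminates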

Now we can try converting decimal 0.1 and 0.2 to binary ourselves, to see why 0.1 + 0.2 does not equal 0.3

0.1 (decimal) = 0.0001100110011001... (binary). The calculation for converting decimal 0.1 to binary:

0.1 * 2 = 0.2, integer part 0; continue with 0.2
0.2 * 2 = 0.4, integer part 0; continue with 0.4
0.4 * 2 = 0.8, integer part 0; continue with 0.8
0.8 * 2 = 1.6, integer part 1; drop the 1 and continue with 0.6
0.6 * 2 = 1.2, integer part 1; drop the 1 and continue with 0.2
0.2 * 2 = 0.4, integer part 0; continue with 0.4
0.4 * 2 = 0.8, integer part 0; continue with 0.8
0.8 * 2 = 1.6, integer part 1; drop the 1 and continue with 0.6
0.6 * 2 = 1.2, integer part 1; drop the 1 and continue with 0.2
...

So the integer parts taken are, in turn, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1... You can clearly see that the sequence falls into an endless loop.

Let’s look at 0.2

0.2 converted to binary:
0.2 * 2 = 0.4, integer part 0
0.4 * 2 = 0.8, integer part 0
0.8 * 2 = 1.6, integer part 1; drop the 1, continue with 0.6
0.6 * 2 = 1.2, integer part 1; drop the 1, continue with 0.2
0.2 * 2 = 0.4, integer part 0
0.4 * 2 = 0.8, integer part 0 ... and so on: keep multiplying by 2 and taking the integer part, round and round, giving 0.00110011... (repeating)

So 0.1 and 0.2 cannot be represented exactly in binary, and of course their sum does not come out as exactly 0.3
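
The engine itself can show these never-ending expansions, cut off at double precision; the outputs below are what V8 prints:

console.log((0.1).toString(2)); // "0.0001100110011001100110011001100110011001100110011001101"
console.log((0.2).toString(2)); // "0.001100110011001100110011001100110011001100110011001101"
console.log(0.1 + 0.2);         // 0.30000000000000004
console.log(0.1 + 0.2 === 0.3); // false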

Truth value and machine number

Such as:

+15 => 01111(base 2)

-8 => 11000 (base 2)

The truth value is the number form we use in daily life, such as +15 or -8. The machine number is the form stored in the machine, i.e. binary: in 01111, the leading 0 marks a positive number, 1111 is the stored value, and converted to base 10 that is 15

So the machine number 01111 corresponds to the truth value +15

Character encoding

byte

  • Inside a computer, all information is ultimately a binary value
  • Each bit has two states, 0 and 1, so eight bits can represent 256 different states; a group of eight bits is called a byte.

unit

  • 8 bits = 1 byte
  • 1024 bytes = 1K
  • 1024K = 1M
  • 1024M = 1G
  • 1024G = 1T

Base in JavaScript

Writing numbers in different bases

let a = 0b10100; // binary
let b = 0o24;    // octal
let c = 20;      // decimal
let d = 0x14;    // hexadecimal
console.log(a == b); // true
console.log(b == c); // true
console.log(c == d); // true

Base conversion

  • From base 10 to any base: number.toString(targetBase)

    console.log(c.toString(2)); // "10100"
  • From any base to base 10: parseInt('string in the source base', sourceBase); any fractional part is truncated

    console.log(parseInt('10100', 2)); // 20

ASCII

At first computers were used only in the United States, and an eight-bit byte can encode 256 different states. States 0-32 were reserved for special purposes: when a terminal or printer encounters one of these agreed bytes, it must perform an agreed action, for example:

  • When it encounters 0x0A (decimal 10), the terminal starts a new line;
  • When it encounters 0x07, the terminal beeps at people;

Spaces, punctuation marks, digits, and upper- and lower-case letters were then all assigned consecutive byte states, up to number 127, so that the computer could store English text using different bytes

These 128 symbols (including the 32 non-printable control symbols) occupy only the lower seven bits of a byte, with the highest bit uniformly set to 0

This scheme is called ASCII encoding

GB2312

Later, some Western European countries that did not use English found that their letters were not in ASCII. To store their text, they used the space after 127 for the new letters, up to the last value, 255. For example, the code for the French é is 130. Of course, which symbol a value stands for varies by country: 130 is é in French encodings but gimel (ג) in Hebrew ones.

The set of characters from 128 to 255 is called the extended character set.

To represent Chinese characters, China scrapped the symbol assignments after 127 and made new rules:

  • A byte value below 127 keeps its original meaning, but two bytes greater than 127 joined together represent one Chinese character.
  • The first byte (called the high byte) runs from 0xA1 to 0xF7, and the second byte (the low byte) from 0xA1 to 0xFE;
  • So about 7,000 simplified Chinese characters can be composed: (247 - 161) × (254 - 161) = 7,998.
  • Mathematical symbols, Japanese kana, and even the digits, punctuation marks, and letters that already existed in ASCII were re-encoded as two-byte codes. These are the full-width characters, while the original ones below 127 are the half-width characters.
  • This character scheme is called GB2312; GB2312 is a Chinese extension of ASCII

GBK

Later even that was not enough, so the requirement that the low byte must be above 127 was simply dropped: as long as the first byte was greater than 127, it marked the start of a Chinese character. Nearly 20,000 more Chinese characters (including traditional characters) and symbols were added.

Every country developed its own encoding standards the way China did, with the result that nobody could understand anybody else's encoding, and nobody's software supported everybody else's

Unicode

The international standards body ISO scrapped all the regional encoding schemes and started over with an encoding that includes every culture and every letter and symbol on the planet! Unicode is, of course, a large set, now with room for more than a million symbols.

  • ISO: the International Organization for Standardization.
  • Universal Multiple-Octet Coded Character Set, abbreviated UCS, commonly known as Unicode

ISO stated directly that all characters must be represented in two bytes, i.e. 16 bits. For the half-width ASCII characters, Unicode keeps the original code values unchanged but expands their length from 8 bits to 16 bits, while characters from other cultures and languages are re-encoded entirely.

From Unicode onwards, a half-width English letter and a full-width Chinese character are equally "one character"! And, in this scheme, each is the same two bytes

  • A byte is an 8-bit physical storage unit,
  • A character is a culturally relevant symbol.

Planes

Unicode uses the numbers 0 to 0x10FFFF, each of which can correspond to a character (of course, some are not assigned yet, and some are reserved for private use). Each of these numbers is a Code Point.

These code points are divided into 17 planes. It’s actually 17 groups with fancy names

Planes 3 to 14 are mostly unused; the TIP (Plane 3) is reserved for mapping oracle bone script, bronze inscriptions, small seal script, and other ideographic characters. PUA-A and PUA-B are the private use areas, there for everyone's own fun, holding custom characters.

Plane 0 is conventionally called the Basic Plane; the remaining ones are called Supplementary Planes.

UTF-32

UTF-32 stores every code point in four bytes: the code point is written out as a 32-bit binary number, padded with 0s on the left if there are not enough bits.

Four bytes is 4 × 8 = 32 bits, which can represent 2^32 distinct numbers, and those numbers could correspond to 2^32 characters; but the characters we normally use sit between 0 and 2^16, so you can see that UTF-32 is particularly wasteful of space

UTF-16

UTF-16 uses two bytes for basic-plane characters and four bytes for supplementary-plane characters. That is, a UTF-16 encoding is either 2 bytes long (U+0000 to U+FFFF) or 4 bytes long (U+010000 to U+10FFFF)

UTF-8

UTF-8 is a variable-length encoding, with characters taking anywhere from 1 to 4 bytes. The more commonly used a character is, the shorter its encoding. The first 128 characters are represented with only one byte, exactly matching ASCII.

Code point range      | Bytes in UTF-8
0x0000 - 0x007F       | 1
0x0080 - 0x07FF       | 2
0x0800 - 0xFFFF       | 3
0x010000 - 0x10FFFF   | 4

The scope of Chinese in Unicode

4E00-9FA5: CJK Unified Ideographs
2E80-A4CF: CJK Radicals Supplement, Kangxi Radicals, Ideographic Description Characters, CJK Symbols and Punctuation, Hiragana, Katakana, Bopomofo, Hangul Compatibility Jamo, Kanbun, Bopomofo Extended, CJK Strokes, Katakana Phonetic Extensions, Enclosed CJK Letters and Months, CJK Compatibility, CJK Unified Ideographs Extension A, Yijing Hexagram Symbols, CJK Unified Ideographs, Yi Syllables, Yi Radicals
F900-FAFF: CJK Compatibility Ideographs
FE30-FE4F: CJK Compatibility Forms
FF00-FFEF: Halfwidth and Fullwidth Forms (full-width ASCII and punctuation, half-width Katakana, half-width Hangul letters)

For general use, 4E00-9FA5 is already enough; if you want broader coverage, use 2E80-A4CF || F900-FAFF || FE30-FE4F

Look familiar? 4e00-9fa5 is exactly the range used in regular expressions that match Chinese. Now you know where it comes from
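
The same range used as a regular-expression character class, the common "does this string contain only Chinese characters" check:

const cjk = /^[\u4e00-\u9fa5]+$/;
console.log(cjk.test('张三'));   // true
console.log(cjk.test('abc'));    // false
console.log('\u4e00', '\u9fa5'); // "一" "龥", the first and last characters of the range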

How do you get the UTF-8 form in JavaScript?

You can use encodeURIComponent

encodeURIComponent('张')
"%E5%BC%A0"

Also, we usually say a Chinese character takes two bytes, but that is not accurate. How many bytes it takes depends entirely on the encoding: UTF-8 and UTF-16 can encode the same Unicode code point into different numbers of bytes.

Our pages are usually UTF-8 encoded, and at the underlying binary level a Chinese character is usually 3 bytes.
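
You can count the actual UTF-8 bytes with TextEncoder (available in browsers and Node.js); the result matches the %E5%BC%A0 seen above:

const bytes = new TextEncoder().encode('张');
console.log(bytes);        // Uint8Array(3) [229, 188, 160], i.e. E5 BC A0, three bytes
console.log(bytes.length); // 3
console.log(new TextEncoder().encode('a').length); // 1, ASCII stays a single byte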

How does JavaScript use Unicode internally

While JavaScript source files can have any kind of encoding, JavaScript internally converts them to UTF-16 before execution.

JavaScript strings are UTF-16 sequences, as the ECMAScript standard says:

When a String contains actual text data, each element is treated as a single UTF-16 code unit
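
A quick way to see this in a console: a supplementary-plane character (here U+20BB7, outside the basic plane) takes two UTF-16 code units, i.e. a surrogate pair, while a basic-plane character takes one:

const s = '𠮷';
console.log(s.length);                      // 2, two UTF-16 code units
console.log(s.charCodeAt(0).toString(16));  // "d842", the high surrogate
console.log(s.charCodeAt(1).toString(16));  // "dfb7", the low surrogate
console.log(s.codePointAt(0).toString(16)); // "20bb7", the actual code point
console.log('张'.length);                   // 1, a basic-plane character is one code unit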

Fixed-point numbers (emphasis)

Fixed point and floating point numbers

Unsigned number

In an unsigned number, every bit of the machine word is a value bit and there is no sign bit, so it is equivalent to a non-negative number. (The machine word length is the number of binary bits the computer can process in one integer operation, which is what we mean by a 32-bit or 64-bit machine.)

For example, the range of 8-bit unsigned integers is binary: 00000000-11111111

In base 10, that is 0 to 255

Notice that when we talk about unsigned numbers, we’re talking about integers, not decimals

Fixed-point representation of signed numbers

Let’s take a look at how fixed-point integers and decimals are represented in a computer.

  • Fixed-point integer: the sign bit comes first, with 0 usually meaning positive and 1 meaning negative; the decimal point implicitly sits after the last bit and is hidden
  • Fixed-point fraction: the sign bit comes first, with 0 usually meaning positive and 1 meaning negative; the hidden decimal point sits right after the sign bit. The value part of the fraction can also be called the mantissa, the term we will use when we get to floating-point numbers.

Both fixed-point integers and fixed-point fractions can be represented in original code (sign-magnitude), inverse code (one's complement), or complement code (two's complement); integers can also be represented in shift code. We'll explain what these mean below.

The original code (sign-magnitude)

In the original code, the value bits hold the absolute value of the truth value, and the sign bit is 0 for a positive number, 1 for a negative number. Assume our machine word length is 8 bits.

Let's take +19 and -19 as examples.

The original code of +19 is 0,0010011, and the original code of -19 is 1,0010011

The original-code representation of a fixed-point fraction works the same way

The inverse code (one's complement)

If the sign bit is 0, the inverse code is the same as the original code

If the sign bit is 1, all the value bits are inverted (the sign bit stays 1)

The complement code (two's complement)

The complement is obtained as follows:

Complement of a positive number = its original code

Complement of a negative number = its inverse code + 1 (add 1 at the lowest bit)

The shift code (offset/excess code)

The shift code is the complement with the sign bit inverted, and it can only represent integers. Why do we need it? Because shift codes make it very convenient to compare two numbers. See the diagram below:

You can see that with shift codes, comparing bit by bit from left to right, whichever number encounters a 1 first is the larger one

Why do we need all these inverse, complement, and shift codes?

Where does the original code go wrong? Say we compute 14 + (-14).

The result should be 0, but if we convert both numbers to binary original code and add them as fixed-point numbers, the sum is not 0, as shown in the figure below

So what do we do? With original code, getting this right really requires a subtraction, 14 - 14. But that would mean the computer has to contain both an adder and a subtractor, and a subtractor is much more complex to build. To simplify the hardware, some clever people worked out how to replace subtraction with addition, and that is exactly what the complement code introduced above is for.

So how do we compute 14 + (-14) correctly?

We add 14 to the complement of -14.

00001110 + 11110010 (the complement of -14) = 100000000. Since the machine word length is 8 bits, it can hold at most 8 binary digits, so the leftmost 1 is naturally discarded by the machine and the final result is 00000000, i.e. 0.
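
The same 8-bit arithmetic can be reproduced in JS by masking to 8 bits; Int8Array shows how the machine reads that bit pattern:

const complementOfMinus14 = 0b11110010;        // invert 00001110, then add 1
const sum = 0b00001110 + complementOfMinus14;  // 0b100000000 = 256
console.log(sum.toString(2));                  // "100000000", 9 bits
console.log((sum & 0xFF).toString(2));         // "0", the 9th bit is discarded, so 14 + (-14) = 0
console.log(new Int8Array([0b11110010])[0]);   // -14, an 8-bit signed view reads the pattern as -14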

Floating-point numbers

Why do we need floating-point numbers? Mainly because fixed-point numbers are particularly wasteful of space for very large values. Take 1.2 × 10^20 in base 10: written this way it only needs the 1.2 and the 20 to describe it, while a fixed-point number has to spell out every single digit. Fixed-point representation simply cannot express as large a number in as little space as floating point can.

Let's use an example to understand floating-point representation. Take the number +302657264526, written here as a fixed-point integer. In scientific notation we would write it as roughly +3.026 × 10^+11. So if we want to store the number in its scientific-notation form, we can take the base 10 for granted and save just +11 and +3.026, from which the scientific form, and hence the number, can be reconstructed

We give +11 and +3.026 two names: in a floating-point number they are the order code (exponent) and the mantissa, as shown below

Note that the order code is divided into its sign and its value part, and the mantissa likewise into its sign and its value part.

The order code is positive when the decimal point moves to the right and negative when it moves to the left; its value shows how many places the decimal point has moved.

The mantissa indicates the accuracy of a numeric value.

The order code reflects the magnitude of the number, and the mantissa reflects its precision. Why? In the example above, +11 says how far the decimal point moves: the more places it moves, the larger the number. As for the mantissa, +3.0265748 carries more digits than +3.026 and therefore describes the number more precisely

In a binary computer, the order code is usually a fixed-point integer stored in complement or shift code, and the mantissa is usually a fixed-point fraction stored in original or complement code.

A floating-point number is represented as N = r^E × M, where r is the radix, which is 2 (playing the role the 10 plays in scientific notation), E is the order code (exponent), and M is the mantissa.

IEEE 754 double-precision floating-point numbers

The common IEEE 754 standard defines single precision (float) and double precision (double); the figure shows the difference between them.

Because JavaScript numbers are double-precision floating-point values, we will only cover doubles. The mantissa field of a double is 52 bits, yet it effectively represents 53 bits. Why? Because a normalized mantissa always has the form 1.xxx, the leading 1 does not need to be stored: it is hidden, giving one extra bit for free.

The mantissa is stored in original code (sign-magnitude) form.

Now for the order code (exponent field). The exponent is stored as an 11-bit unsigned integer, giving a range of [0, 2047] (2^11 - 1 = 2047). The problem is that an unsigned field cannot express negative exponents. We could introduce a sign bit, but then subtraction on that signed fixed-point field would drag the complement back in, which is a nuisance. So a tricky shortcut is used instead: a bias of 1023 is uniformly subtracted from the stored exponent, turning the range into [-1023, 1024]

Because the all-0s and all-1s exponent fields are reserved for special purposes (after subtracting the bias they correspond to -1023 and 1024), the usable exponent range becomes [-1022, 1023]. So what are those special uses of all-0s and all-1s?

When the exponent bits E are all 0 but the mantissa M is not all 0, the number is denormalized: the hidden leading 1 of the mantissa (the 1 in 1.xxx) becomes 0

When the exponent bits E are all 0 and the mantissa M is all 0, the truth value is +0 or -0

When the exponent bits E are all 1 and the mantissa M is all 0, the value is infinity

When the exponent bits E are all 1 and the mantissa M is not all 0, the value is NaN
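
A minimal sketch of pulling a double apart into those three fields with DataView (the helper name is my own):

function doubleBits(x) {
  const buf = new ArrayBuffer(8);
  const view = new DataView(buf);
  view.setFloat64(0, x); // big-endian by default, so byte 0 is the most significant
  let bits = '';
  for (let i = 0; i < 8; i++) {
    bits += view.getUint8(i).toString(2).padStart(8, '0');
  }
  return { sign: bits[0], exponent: bits.slice(1, 12), mantissa: bits.slice(12) };
}

console.log(doubleBits(1));        // exponent "01111111111" = 1023, i.e. a true exponent of 0
console.log(doubleBits(Infinity)); // exponent all 1s, mantissa all 0s
console.log(doubleBits(NaN));      // exponent all 1s, mantissa not all 0s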

Instruction system (Understanding)

Format of instruction

First of all, what is an instruction? It is the smallest functional unit of computer operation. The set of all instructions a computer supports constitutes the machine's instruction system, also known as the instruction set. For example, the well-known x86 architecture (Intel PCs) and ARM architecture (mobile phones) have different instruction sets.

For example, there was news a while back that Apple was developing its own chips based on the ARM architecture (a reduced instruction set) to replace the Intel chips that use a complex instruction set.

An instruction is a statement in machine language, a meaningful group of binary bits. An instruction usually consists of an operation code (OP) plus an address code (A)

  • The opcode says what to do: add (say, for 1 + 1), halt, and so on.
  • The address code gives the address of the data involved, usually a memory address, e.g. the operands needed to carry out the addition.

Based on how many operand address codes an instruction contains, instructions can be divided into the following formats. A few examples (not an exhaustive list) will give you a feel for it, especially the three-address instruction, which shows the general shape of an instruction

1. Zero address instruction

Only the opcode OP is given, with no explicit address code. Such an instruction has two possibilities:

  • Instructions that need no operands at all, such as no-op or halt instructions
  • Instructions whose operands are implicit, for example taken from the top of a stack

2. Three-address instruction

Instruction meaning: (A1) OP (A2) -> A3

It represents fetching data from A1 and A2 addresses, performing OP operations, and finally storing data to A3 addresses.

Addressing mode

What does addressing look for? What we store in computers are instructions and data, and those are the guys we’re looking for.

Instruction addressing

There are two kinds of instruction addressing: one is sequential addressing and the other is jump addressing.

1. Sequential addressing: the program counter (PC) is simply incremented by one, so instructions are executed in the order they are stored in memory

2. Jump addressing: implemented by branch (transfer) instructions. The address of the next instruction is given by the current instruction itself, rather than by the PC automatically incrementing (so instructions are no longer executed strictly in sequence)

Data addressing

Data addressing determines the real address specified by the address code of the instruction. There are roughly ten addressing modes; we will cover only three, since this part only needs to be understood at a high level.

Direct addressing

In direct addressing, the address code carried in the instruction is itself the effective address of the operand, as shown below

As shown in the figure above, address A corresponds to the operand we want

Indirect addressing

As shown in the figure above, address A does not correspond to an operand, but to another address, which refers to the operand

Base addressing

Here the address found in the instruction is not the actual memory address we want: it has to be added to a base address, so the instruction's address acts as an offset. See the diagram below:

CISC and RISC

  • CISC (Complex Instruction Set Computer): a single instruction performs a complex basic function. x86 computers, for example, as used in laptops and desktops. The instruction set is relatively rich, with dedicated instructions for specific functions, so handling special tasks is more efficient.

  • RISC (Reduced Instruction Set Computer): a single instruction performs one basic "action", and complex functions are performed by combining several instructions. The ARM architecture, used mainly in phones and tablets, is an example. Designers focus on the frequently used instructions and make them as simple and fast as possible; uncommon functions are built by combining instructions, so implementing special functions on a RISC machine may be less efficient.

CPU + GPU (Understand)

  • We already walked through the general process of the CPU and memory working together in the first section, so refer back to it if needed.

I’ll add some CPU internals here

The two most important parts of the CPU are the arithmetic unit and the controller. Let’s take a look at the main functions of the arithmetic unit

2.1 Main components of arithmetic unit

As shown above, the most important part of the arithmetic unit is the ALU (arithmetic logic unit), which performs arithmetic and logical operations. The others, MQ and ACC, are just registers and we don't need to worry about them.

2.2 Main Components of the controller

The most important part of the controller is the CU (control unit), whose job is to analyze instructions and issue control signals.

The IR (instruction register) stores the instruction currently being executed.

The PC (program counter) stores the address of the instruction to be fetched.

2.3 Example – The execution process of fetch command

First, the process of fetching instructions is as follows

  • Step 1: the PC holds the address of the instruction to execute. To know what to do next, the CPU has to go to memory, so the PC sends the address of the instruction to be executed to the MAR (the memory address register, the place that holds the address to be accessed)
  • Steps 2 and 3: using the address in the MAR, the instruction is fetched from the storage body and placed in the MDR (the memory data register, the place dedicated to holding data in transit)
  • Step 4: the contents of the MDR are sent on to the IR; the IR is where the instruction just taken out of storage is kept

The instructions are then analyzed and executed as follows

  • Step 5: the IR passes the instruction to the CU, which analyzes it. Suppose the analysis shows it is a fetch-data instruction; the instruction is then executed (the fetch instruction carries an address code, and data is fetched from storage at that address).

  • Steps 6 and 7: the address code in the IR is sent to the MAR, and the MAR goes to the storage body to locate the data at that address

  • Step 8: the retrieved data is returned to the MDR (where data in transit is held)

  • Step 9: the data in the MDR is placed into a register of the arithmetic unit, and the fetch-data process is complete.

Next, let's supplement this with some content about the GPU.

The GPU (Graphics Processing Unit), also called the graphics processor, is the core component and the "heart" of the graphics card. It is a chip designed for complex mathematical and geometric computation, and its best-known use is graphics and image processing (the graphics card).

CPU and GPU

We can take a look at the CPU versus GPU

  • The CPU is like one old professor who can do both Olympiad math and elementary arithmetic; the GPU is like 1,000 elementary-school students who can only do elementary arithmetic.

  • From this we can see that the GPU devotes more of its space (transistors) to execution units, rather than to complex control units and caches the way the CPU does. (The CPU has to support parallel and serial operation at the same time, be general enough to handle many different data types, and support complex general-purpose logic decisions, which brings in a lot of branch and interrupt handling.)

  • All of this makes the CPU's internal structure extremely complex and shrinks the share of the chip devoted to computation: roughly 25% of a CPU is ALU, versus about 90% of a GPU (the GPU faces highly uniform, mutually independent, large-scale data and a pure computing environment with no interrupts, so GPU chips can be much simpler than CPU chips). That is why GPUs have such strong computing power.

Application of GPU acceleration in front end

First we need to understand when GPU acceleration (hardware acceleration) is enabled, and then we can discuss how to apply it. The situations below promote an element into its own compositing layer:

  1. CSS properties with 3D or perspective transforms

  2. Video elements that use accelerated video decoding

  3. A Canvas element with either a 3D (WebGL) context or an accelerated 2D context

  4. Composited plug-ins (e.g. Flash)

  5. Elements with CSS animations on their own opacity or with an animated webkit transform

  6. Elements with accelerated CSS filters

  7. If element A has a sibling element B with a lower z-index, and B is a compositing layer (in other words, A is rendered on top of a compositing layer), then A is promoted to a compositing layer as well

The most common ones here are 1 and 7. Item 1 is the 3D transform property. For item 7, I'll explain how you can tell whether your page has such accelerated layers. See the picture below:

First of all:

Then observe these two layers:

So how do you make a 2D-transformed element render in its own layer and enable GPU acceleration? Just add a z-index to the 2D CSS; when you then trigger the animation, the yellow layer border appears. You can try it yourself on this site: www.w3school.com.cn/css3/css3_3…

Bus (understand)

  • The bus is not a key focus; the main thing is to understand how a bus works

Definition of bus

A bus is a set of common information transmission lines that can be shared by multiple components in a time-sharing manner

Why do you need a bus structure

1. It simplifies hardware design. We saw in the brief history of computers that early devices were each wired to the computer separately, with no unified interface to control them. A bus structure makes a modular design approach convenient: as long as the CPU board, memory board, and I/O boards of a bus-oriented microcomputer are designed to the bus specification, they can all be connected to the bus and work, without having to consider the detailed operation of the bus.

2. It gives the system good extensibility. One kind is scale expansion, which only requires plugging in more boards of the same type. The other is functional expansion, which only requires designing new boards to the bus standard; there is usually no strict restriction on which slot a board goes into.

This is a bit like webpack's plugin system: features can be added and removed in a pluggable way, which is more flexible than hard-coding everything.

The simple process by which a bus works

Take the picture above:

  • The CPU can send address information to main memory, printer, or hard disk through the address bus.
  • Similarly, the CPU can communicate with other hardware devices through the data bus and control bus or send control commands

The storage system

  • This chapter focuses on the basic principles of the cache (why a cache is needed, and what the principle of locality is) and on cache replacement algorithms (I've been asked how to write an LRU cache several times in interviews; a minimal sketch follows below).
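
Since the LRU replacement policy comes up so often, here is a minimal sketch of the classic interview answer, built on the fact that a JS Map remembers insertion order (the class and method names follow the usual LeetCode 146 signature):

class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return -1;
    const value = this.map.get(key);
    this.map.delete(key);       // re-insert so this key becomes the most recently used
    this.map.set(key, value);
    return value;
  }
  put(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // the first key in the Map is the least recently used one, evict it
      this.map.delete(this.map.keys().next().value);
    }
  }
}

const cache = new LRUCache(2);
cache.put(1, 1);
cache.put(2, 2);
cache.get(1);              // 1, key 1 becomes the most recently used
cache.put(3, 3);           // capacity exceeded, key 2 is evicted
console.log(cache.get(2)); // -1
console.log(cache.get(1)); // 1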

Multi-level Storage System (Understanding)

Why do you need a multilevel storage structure? As shown in the following figure, you can see why cache is introduced

Main memory is much slower than the CPU, which creates the problem that the CPU has to wait while main memory works. For example, if the CPU can process 10 instructions in 1 second but fetching those 10 instructions from memory takes 1 minute, CPU resources are wasted. To solve this, the cache-main memory structure is used: the cache is a high-speed memory whose speed is close to the CPU's.

Classification of memory by Access Mode (Understanding)

  • Random access memory: The time required to read and write to any storage location is the same, regardless of the physical location of the storage location. Memory sticks, for example.
  • Sequential access memory: The time required to read and write a storage unit depends on the location of the storage unit. For example, if you have finished playing the tape and want to listen to it from the beginning, you need to rewind it to the beginning.

Features of RAM and ROM (emphasis)

RAM

RAM, also known as “random access memory,” is the internal memory that exchanges data directly with the CPU, also known as main memory. It can be read and written all the time and is fast, often serving as a temporary data storage medium for operating systems or other running programs. If you want to save data, you must write it to a long-term storage device (such as a hard disk).

ROM

ROM stands for "read-only memory". The data in ROM is generally written in advance, before the machine is shipped; during operation the machine can only read it, and it cannot be rewritten quickly and conveniently the way RAM can. Data stored in ROM is stable: it does not change when the power is cut off

Locality principle (emphasis)

Look at the picture below

(Note that MDR and MAR are logically in main memory, but in circuit implementation, MDR and MAR are close to the CPU.)

The figure above shows a piece of code, which we can read as the JS for loop below:

const a = [1, 2, 3, 4, 5, 6, 7, 8];
for (let i = 0; i < a.length; i++) {
    a[i] = a[i] + 2;
}

We can see that

  • Array data is often stored contiguously in memory (the array a in the code corresponds to the block a[0] to a[7] in main memory)

  • Suppose fetching one element from memory takes 1000 ns (ns means nanoseconds); then fetching a[0] through a[7] one at a time takes 1000 * 8 = 8000 ns

  • If the CPU notices that we are reading array data, it can load the nearby block a[0] through a[7] into the cache in one go, so only one memory access is needed, costing 1000 ns

The cache is an application of the locality principle

  • Spatial locality: information to be used in the near future (instructions and data) is likely to be adjacent in storage to the information being used now (e.g. the data used in the for loop is stored next to each other in main memory)
  • Temporal locality: information to be used in the near future is likely to be information that is already in use right now

Note in the following figure that the CPU looks for data in the cache first, and only goes to main memory if it is not there

As you can see, the cache fetches a[0] through a[7] from main memory all at once, in 1000 ns. Since the cache is high-speed memory, the CPU can interact with it far faster than it can with main memory

Input/output system

  • For the I/O section, the focus is on the I/O control methods, i.e. understanding how I/O devices evolved

What is I/O?

Input /Output (I/O for short) refers to the transfer of data between any operation, program or device and a computer.

File reads and writes, for example, are typical I/O operations. Let’s take a look at the evolution of I/O devices

I/O device evolution process

Key point: the evolution of I/O devices is really the story of liberating the CPU. Why? Read the introduction below and you'll see!

  • The main function of early computers was computing, so the CPU was the core
  • Each peripheral needed its own set of dedicated lines to connect to the CPU, so adding or removing peripherals was very troublesome
  • The CPU and the peripherals worked serially, one at a time

In the early days, peripherals took time to prepare their data (reading from an external device, say), so how did the CPU know an I/O device had finished its task, for example that it had finished reading data from a file? It polled: the CPU constantly checked whether the I/O device was ready, and while doing so it sat in a wait state. So while the CPU was working the I/O system was idle, and while the I/O system was working the CPU was idle. On top of that, main memory and the peripherals all had to communicate through the CPU, leaving the CPU even less time for real work.

So the obvious problems at this stage:

  • Peripherals were connected individually and were troublesome to add or remove, so the bus structure was introduced
  • High-speed peripherals still had to communicate with the CPU very frequently

Let’s move on to phase two

  • After starting a peripheral, the CPU returns to its own work. When the peripheral has its data ready, it notifies the CPU with an interrupt request; the CPU only has to pause what it is doing to handle the actual data transfer, which cuts out the constant polling

  • To fix the phase-one problem of the CPU waiting on I/O devices and working serially with them, all the I/O devices now talk to the CPU through the I/O bus, and as soon as one of them finishes its task it tells the CPU over the bus, in the form of an interrupt request, "I'm ready"

  • To deal with the frequent communication from high-speed peripherals, they are connected to main memory by a direct data path, the DMA bus. Under DMA control the CPU only has to set up the initial task for the high-speed peripheral; the subsequent data exchange is handled by the DMA controller, which avoids interrupting the CPU all the time and frees the CPU up

Remaining problems:

  • The DMA controller's transfer tasks are still arranged by the CPU, and the type and number of peripherals a DMA controller can connect are not flexible, so processors dedicated to managing peripherals were introduced

Finally, let’s look at stage three

In the third stage, the CPU manages I/O devices through channel control units. The CPU no longer arranges tasks for them; it simply issues commands such as start and stop, and the channel unit arranges the corresponding I/O devices to do the work on its own. Why is this better than DMA? Commercial midrange machines and mainframes can have so many I/O devices attached that managing them all would exhaust the CPU.

A channel can be thought of as a "weak CPU". It recognizes its own channel instructions: the CPU tells the channel how much data to fetch and where in memory to put it, and the channel takes care of the rest, so the CPU no longer has to manage so many details

As shown in the figure above, the channel works in parallel with the CPU, so it can share the CPU's load; it has its own set of instructions, the channel instructions, which it executes itself.

Channels were later enhanced further, into I/O processors with the power of a CPU

Addendum: What are interrupts (emphasis, operating systems class will cover this concept as well)

Earlier we mentioned the term "interrupt". It is a very important concept, so let's study it here as a supplement

Concept of interruption

A program interrupt occurs when, during execution of the current program, some exceptional condition or special request urgently needs handling; the CPU temporarily suspends the current program, turns to deal with that condition or request, and when the handling is finished it automatically returns to the breakpoint and continues executing the program. Let's take an example.

Suppose the program has executed up to instruction K when a keystroke produces an interrupt (an I/O interrupt). The CPU then stops executing the current flow and turns to handle the interrupt; when the interrupt service routine completes, execution continues from instruction K+1.

Main references in this paper:

Tang Shuofei: Computer composition principle

Yuan Chunfeng: Computer composition principle

Principles of computer composition

Geek time: How computers are made