Bit operations are nothing new. Most programmers in this field have been exposed to them at some point, and probably wrote plenty of exercises back in the day.

Bit arithmetic itself is not difficult. The difficulty is that people never learn to use it in system design to improve performance and increase their own irreplaceability.

Without too much preamble, let's get straight to today's substance:

Where bit operations are used

Interview questions

For example, a question I was once asked in a Tencent interview:

How do you detect in O(1) time whether an integer n is a power of 2?
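A common answer relies on the fact that a power of 2 has exactly one 1-bit, so `n & (n - 1)` clears it to zero. A quick sketch in Python (the original doesn't spell out its solution, so this is my illustration):

```python
def is_power_of_two(n: int) -> bool:
    # A power of two has exactly one 1-bit, e.g. 8 = 0b1000.
    # Subtracting 1 flips that bit and sets every lower bit (0b0111),
    # so the AND of the two has no bits in common. O(1), no loop.
    return n > 0 and (n & (n - 1)) == 0

print(is_power_of_two(8))   # True
print(is_power_of_two(12))  # False
```

Note the `n > 0` guard: without it, 0 (and negative numbers) would slip through.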

And a Google interview question:

There are 64 bottles, 63 of which are non-toxic and one toxic. A mouse given the toxic drug dies three days later; other drugs do no harm, even several taken at once. With only 3 days left, what is the minimum number of mice needed to find the toxic bottle?

Uncle Long won't ramble here: this one, too, comes down squarely to bit operations.
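One way to see it: number the bottles 0 through 63 and let mouse k drink from every bottle whose number has bit k set. Three days later, the set of dead mice spells out the toxic bottle's number in binary, so 6 mice suffice (2^6 = 64). A sketch in Python (the bottle-numbering scheme is my own illustration):

```python
NUM_MICE = 6  # 2**6 = 64 distinct outcomes, one per bottle

def dose(toxic_bottle: int) -> set:
    # Mouse k drinks from every bottle whose number has bit k set,
    # so exactly the mice whose bit is set in toxic_bottle die.
    return {k for k in range(NUM_MICE) if toxic_bottle >> k & 1}

def identify(dead_mice: set) -> int:
    # Reassemble the bottle number from which mice died.
    return sum(1 << k for k in dead_mice)

# Every one of the 64 bottles produces a unique pattern of deaths.
for bottle in range(64):
    assert identify(dose(bottle)) == bottle
print("6 mice distinguish all 64 bottles")
```

Five mice give only 2^5 = 32 distinct outcomes, which cannot cover 64 bottles, so 6 is indeed the minimum.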

There are many interview questions like this; one lapse of attention and you're knocked out.

These questions aren't especially difficult overall. It isn't a grand knowledge point, but it is easy to overlook, which is why Uncle Long is pulling it out to talk through today. Do remember it, okay? Otherwise…

Often used in system design

Those of you who like reading source code will notice code like this all over the place.

The Lucene source

The Redis source

Uncle Long's own source

Notice how surprisingly similar the code is; good design always ends up looking alike.

Having read this far, you presumably see that this stuff has real effect, so it's worth understanding the principle well. Let's dig into it together.

How bit operations work

A bit is a bit, not a byte; a bit operation is a calculation performed directly on bits.

All CPU computation is binary computation. A high-performance service must squeeze the most out of CPU resources, that is, get the maximum benefit from the least resources.

Of course, as modern CPUs keep getting faster, many people design systems without considering these performance points at all, but truly high-concurrency systems demand extremely high performance.

It doesn't matter if the code is a pile of shit, as long as it doesn't involve high concurrency. Most people just CRUD on top of the pile, and it becomes a bigger pile.

The boss only cares about results, not what your code looks like. Oops, looks like Uncle Long is just another CRUD player himself.

When the machine count grows too large to sustain, the luckiest group of programmers is born: they get to refactor the system. Why the luckiest? Think about it. Chances like that don't come every day; it's a golden opportunity.

Ha, I've wandered a bit off topic.

In the computer world, everything is 0s and 1s; 0 and 1 give birth to everything. The process of turning everything into 0s and 1s is called encoding.

The binary representation of a number in a computer is called that number's machine number. Machine numbers are signed: the highest bit stores the sign, 0 for positive and 1 for negative.

Computers have three ways of encoding numbers: sign-magnitude, ones' complement, and two's complement:

Sign-magnitude: prepend a sign bit (the highest bit) to the number: 0 for positive, 1 for negative. For example, decimal 3 in eight bits is 00000011, and -3 is 10000011.

Ones' complement: the ones' complement of a positive number is itself. For a negative number, start from the sign-magnitude form, keep the sign bit, and invert all remaining bits.

Two's complement: the two's complement of a positive number is itself. For a negative number, start from the sign-magnitude form, keep the sign bit, invert the remaining bits, then add 1 (that is, ones' complement plus 1).

All three are encodings, but computer systems represent (store) values in two's complement.

Here’s an example:

     sign-magnitude   ones' complement   two's complement
 10  00001010     --> 00001010       --> 00001010
-15  10001111     --> 11110000       --> 11110001
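You can check these encodings yourself. In Python, masking a number with 0xFF keeps its low 8 bits, which for a negative number is exactly its 8-bit two's-complement pattern (a quick sketch, my own illustration):

```python
def twos_complement_8bit(n: int) -> str:
    # n & 0xFF keeps the low 8 bits; for a negative n this is
    # its 8-bit two's-complement bit pattern.
    return format(n & 0xFF, "08b")

print(twos_complement_8bit(10))   # 00001010
print(twos_complement_8bit(-15))  # 11110001
```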

Having covered data encoding, we now know how data is stored in the computer. Next, let's look at how operations on those bits are carried out.

Programming languages generally provide a way to operate directly on the binary bits of the two's-complement representation: the bitwise operators.

Symbol   Name          Rule
&        AND           A result bit is 1 only if both operands have a 1 in that bit; otherwise it is 0.
|        OR            A result bit is 1 if either operand has a 1 in that bit.
~        NOT           Flips every bit: 0 becomes 1 and 1 becomes 0.
^        XOR           A result bit is 1 if the two operands differ in that bit, 0 if they are the same.
<<       Left shift    a << b shifts a's binary form left by b bits, appending b zeros; equivalent to multiplying a by 2^b.
>>       Right shift   a >> b shifts right by b bits, dropping the last b bits; equivalent to dividing a by 2^b, rounded down.

A few examples:

10 & -15 = 00001010 & 11110001

A result bit is 1 only when both bits are 1, otherwise 0. The final result is 00000000, that is, 0.

10 & 15 = 00001010 & 00001111

A result bit is 1 only when both bits are 1, otherwise 0. The final result is 00001010, that is, 10.

10 | 15 = 00001010 | 00001111

A result bit is 1 as long as either bit is 1. The result is 00001111, that is, 15.

15 >> 2 = 00001111 >> 2

Shift the binary right by 2 bits: the left is filled with the sign bit and the rightmost bits are dropped, giving 00000011, that is, 3.

15 << 2 = 00001111 << 2

Shift the binary left by 2 bits: bits shifted off the left are discarded and the right is filled with 0s, giving 00111100, that is, 60.
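All the worked examples above can be reproduced directly in Python (a quick sanity check; I've also added an XOR line, which the examples above don't cover):

```python
print(10 & -15)  # 0  : 00001010 & 11110001 = 00000000
print(10 & 15)   # 10 : 00001010 & 00001111 = 00001010
print(10 | 15)   # 15 : 00001010 | 00001111 = 00001111
print(10 ^ 15)   # 5  : 00001010 ^ 00001111 = 00000101
print(15 >> 2)   # 3  : 00001111 >> 2       = 00000011
print(15 << 2)   # 60 : 00001111 << 2       = 00111100
```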

The principle is fairly simple: it's just logical operations applied bit by bit.

Why are bit operations so fast?

By this point most of you already understand why bit operations are fast, but warm-hearted Uncle Long will spell out the reasons anyway, icing on the cake (or gilding the lily).

  • Storage-friendly: data stored as bits needs no conversion after storage
  • CPU-friendly: the CPU operates on bits directly, with no conversion between machine numbers and bits
  • Cheaper arithmetic: shifting left by one bit multiplies by 2, with no multiply instruction needed
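The last point is easy to demonstrate: shifts stand in for multiplication and division by powers of two (a small sketch; modern compilers often do this rewrite for you automatically):

```python
n = 10
assert n << 1 == n * 2    # shift left 1  == multiply by 2
assert n << 3 == n * 8    # shift left 3  == multiply by 2**3
assert n >> 1 == n // 2   # shift right 1 == floor-divide by 2
print(n << 3, n >> 1)     # 80 5
```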

A performance win from bit operations in search

For example, if you search Baidu for "Guangdong rich woman", the segmenter splits it into two terms, "Guangdong" and "rich woman", and each is recalled separately from the inverted index. Suppose "Guangdong" recalls 1,000,000 docs and "rich woman" recalls 10,000,000 docs.

The two doc lists are then merged; the result of the merge is the set of docs that contain both "Guangdong" and "rich woman".

If each doc is marked by one bit, a 64-bit CPU can intersect 64 docs in a single instruction, whereas merging doc by doc handles only one comparison at a time.

That kind of performance gain is not something you can buy simply by adding machines.
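The merge described above can be sketched with bitmaps (my own illustration, not Baidu's actual code): record doc i as bit i, and one 64-bit AND intersects 64 docs at once:

```python
def to_bitmap(doc_ids, size=64):
    # Set bit i for every doc id i in the posting list.
    bm = 0
    for d in doc_ids:
        bm |= 1 << d
    return bm

# Hypothetical posting lists for the two terms.
guangdong  = to_bitmap([3, 7, 12, 40, 63])
rich_woman = to_bitmap([7, 12, 25, 63])

both = guangdong & rich_woman  # one AND intersects all 64 slots
result = [i for i in range(64) if both >> i & 1]
print(result)  # [7, 12, 63]
```

Real engines use the same idea at scale with compressed bitmap structures such as roaring bitmaps.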

Conclusion

None of this is difficult. I hope that when you do system design, performance thinking isn't simply "add machines", but truly maximizing the value of the CPU.

Small changes, big effects: small changes can make a large difference to performance. For my part, I've already switched some of my own calculations to bitwise operations.

I'm Uncle Long, a late bloomer on the Internet. See you next time. If you like my stuff, remember to follow me.