
In theory, theory and practice are in accord. In practice, hehe.

— Einstein

(Note: this article contains no formulas, so feel free to read on.)

Has AES 256 been cracked?

For TL;DR (too long; didn’t read) readers, here is the answer up front: yes, but not quite.

Here is what happened. A few days ago, a news item titled “AES 256 encryption broken in 5 minutes with 1,500 yuan of equipment” circulated on the Internet and drew attention from all quarters. It was picked up by major Chinese media outlets, and the popular comments included several highly upvoted but incorrect claims: some said it was a dictionary attack on wireless signals, which is different from cracking AES, while others said it was an attack based on radio characteristics and unrelated to AES. Some outlets eager for a big story went straight to saying that routers had been cracked, or even that any WiFi password could be cracked in 5 minutes, as if afraid the world were not chaotic enough.

In fact, this result comes from Fox-IT [1]. It really did attack AES (more precisely, a running implementation of it), using information leaked through electromagnetic radiation, so it can be carried out wirelessly (the walls have ears). This type of attack, known as a Side Channel Attack (SCA), has been studied in academia and industry for more than 20 years and is a systematic attack methodology. This time, AES 256 was attacked by applying differential power analysis (DPA) to the electromagnetic side-channel signal in order to recover the key. Judging from the write-up, it is very good work, but it is not the first time AES has been cracked this way: AES 128 has long been breakable with the same approach, and from DPA’s point of view AES 256 is not essentially different and had already been broken in the laboratory. And it certainly does not mean that any WiFi password can be cracked in five minutes. SCA requires certain physical conditions, and the AES algorithm itself is still secure, so there is no need to panic.

Background knowledge

AES stands for Advanced Encryption Standard. It is the block encryption standard adopted by the U.S. federal government and the de facto industry standard for block ciphers. AES is used in a wide range of fields (including, of course, WiFi encryption), and AES hardware accelerators are common in today’s mainstream processors, from the CRYP peripheral in an STM32 costing a few dollars [2] to AES-NI in Intel CPUs costing thousands [3]. For such a mature standard, the algorithm itself is very well designed: traditional methods such as differential and linear cryptanalysis cannot break it within any feasible complexity, so AES is secure in theory. But as the title of this article says, even with a perfectly secure algorithm there is no perfectly secure system. To borrow the relationship between the universal and the particular from materialist dialectics, the theoretical security of a modern cryptosystem’s design cannot stand in for the security of its implementation. Without interfering with the operation of the cryptographic chip, an attacker can observe side-channel leakage in timing, power consumption and electromagnetic radiation, and then combine it with knowledge of the algorithm’s implementation to recover the key. This is the so-called side-channel attack. Defending against it falls under the implementation security of cryptographic algorithms. The success of this attack on AES 256 shows that research on implementation security still has a long way to go.

Here is a brief introduction to the AES algorithm. AES consists of multiple rounds, and each round (except the last) has four steps [4][5]:

(1) AddRoundKey: every byte in the state matrix is XORed with the round key; each round key is generated by the key schedule.

(2) SubBytes: each byte is replaced with its corresponding byte through a lookup table that implements a nonlinear substitution function.

(3) ShiftRows: each row of the matrix is cyclically shifted.

(4) MixColumns: each column of the matrix is thoroughly mixed, using a linear transformation over the four bytes of the column. In the last round, the MixColumns step is omitted and replaced with another AddRoundKey.
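The round structure described above can be written out as a plain list of step names. This is only an illustration of the ordering as presented in this article; no encryption is performed, the function name `round_schedule` is made up for this sketch, and the round counts 10/12/14 are the standard values for 128/192/256-bit keys.

```python
# Illustrative only: lists the step names of one AES block encryption in the
# ordering used in this article. No real encryption is performed.
ROUNDS = {128: 10, 192: 12, 256: 14}  # standard round counts per key size


def round_schedule(key_bits):
    """Return the sequence of step names for one block encryption."""
    steps = []
    n_rounds = ROUNDS[key_bits]
    for rnd in range(1, n_rounds + 1):
        steps += ["AddRoundKey", "SubBytes", "ShiftRows"]
        if rnd < n_rounds:
            steps.append("MixColumns")   # every round except the last
        else:
            steps.append("AddRoundKey")  # last round: one more AddRoundKey instead
    return steps


schedule = round_schedule(256)
print(schedule.count("AddRoundKey"), schedule.count("MixColumns"))
```

For AES 256 this gives 14 table-lookup (SubBytes) steps, which matters later: the side-channel attack only needs to observe the lookups, not to undo the whole schedule.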

The flow chart of AES

Next, note two numbers: 2^256 and 8192. 2^256 is the size of the entire key space, a very large number: if you had to guess the key one candidate at a time, that is how many attempts you would need to be sure of hitting it. The number is far too large, so brute-force cracking is not recommended.

8,192 guesses, on the other hand, is a perfectly acceptable number for a computer. How do you hit the right key in only 8192 guesses? Information theory tells us there must be additional information coming in. In short, the idea is divide and conquer: guess one byte at a time. A 256-bit key contains 32 bytes. Guessing byte by byte, each byte has 256 possibilities, so 32 bytes require 256*32 = 8192 guesses.
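The two numbers can be checked with a few lines of arithmetic. The variable names are only for this illustration.

```python
KEY_BITS = 256
KEY_BYTES = KEY_BITS // 8        # 32 bytes

brute_force = 2 ** KEY_BITS      # guess the whole key at once
byte_by_byte = 256 * KEY_BYTES   # guess each byte independently

print("brute force : a number with", len(str(brute_force)), "decimal digits")
print("byte by byte:", byte_by_byte, "guesses")
```

The gap between the two is the whole point: the side channel supplies the extra information that lets each byte be confirmed on its own.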

Because of limits on bus width and on how the algorithm is implemented, the chip does not process the whole 256-bit key at once. Just as a meal is eaten one mouthful at a time, the data is processed byte by byte, which is what makes our byte-by-byte guessing feasible.

As usual, this side-channel analysis focuses on the nonlinear SubBytes step. The so-called nonlinear substitution function is implemented as a table lookup, and the S-box output of that lookup is the attack point. Optimized implementations merge several operations into the table to speed things up, and here the attacker can quietly rejoice: however much work the optimization takes, the end result is still a table lookup. Whether the table is larger or smaller makes no real difference to a side-channel attack. Likewise, AES 128 and AES 256 differ only in the number of rounds and the key length; the lookup operation itself is essentially unchanged (this is the key point).

Next, a diagram is used to illustrate the relationship between the elements.

The inputs to AES are the key and the plaintext. To the attacker, the plaintext is known and the key is unknown; the key is the target of the analysis.

An XOR operation (the first AddRoundKey) is then performed on the original key and the plaintext, and the result is used as the index into the lookup table. Note that the value read from the table depends on the XOR of the key and the plaintext. The plaintext is known, the lookup table itself is public, and XOR is simple and reversible, so the output of the lookup is directly tied to the key. In a modern computer architecture a table lookup is a memory fetch, so both the address and the data appear on the bus. If you know what data is on the bus, you can simply work backwards to the key. Now, what is a bus? At low frequencies it is a wire; at radio frequencies it is an antenna, and antennas are a fine thing for security analysts. The signal transitions of a high-speed digital circuit contain rich spectral components, which radiate outside the chip. In theory, this radiation could be measured precisely and the attack would be done.
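The relationship described above (known plaintext, secret key, public table) can be sketched for a single byte. The S-box is generated from its textbook definition rather than copied from any particular library, the helper names are made up for this sketch, and the plaintext and key values are arbitrary examples.

```python
def gf_mul(a, b):
    """Multiply two bytes in the AES finite field GF(2^8)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B  # reduce by the AES polynomial x^8 + x^4 + x^3 + x + 1
        b >>= 1
    return result


def build_sbox():
    """Build the AES S-box: field inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF  # rotate left
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for index, value in enumerate(SBOX):
    INV_SBOX[value] = index

plaintext_byte = 0x3A  # known to the attacker (arbitrary example value)
key_byte = 0xC5        # secret (arbitrary example value)

# What the device computes: AddRoundKey, then the SubBytes table lookup.
sbox_out = SBOX[plaintext_byte ^ key_byte]

# What an attacker who could read the bus exactly would do:
# invert the public table, then undo the XOR with the known plaintext.
recovered = INV_SBOX[sbox_out] ^ plaintext_byte
print(hex(recovered))
```

This is the idealized case where the bus value is read exactly. The rest of the article is about what to do when it can only be observed through a noisy, indirect measurement.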

Of course, in practice, hehe.

One problem is measurement accuracy. We cannot actually measure electromagnetic radiation that precisely. What we can measure is the relative level of the radiation. For example, there is a clear difference in signal strength between one wire flipping and eight wires flipping at the same time. The term for this is the leakage model, which describes how the processed data translates into leakage. This attack used the Hamming distance model, which assumes that what can be observed is how many bits toggle; it is a common modeling choice in electromagnetic side-channel analysis.
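The Hamming distance model is simple to state in code: the predicted leakage is the number of bit positions that change between two consecutive values on the same wires. The function names here are illustrative.

```python
def hamming_weight(value):
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(before, after):
    """Number of bit positions that toggle when `before` is replaced by `after`."""
    return hamming_weight(before ^ after)


print(hamming_distance(0b00000000, 0b00000001))  # one wire flips
print(hamming_distance(0b00000000, 0b11111111))  # eight wires flip
```

The model says nothing about absolute signal levels. It only claims that the second case radiates measurably more than the first, which is all the correlation analysis needs.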

The other problem is the signal-to-noise ratio. There is always a lot of interference in the environment (even when running on hydropower from the Brahmaputra), so the way to improve signal quality is to take many measurements and then pick out the strongest statistical correlation through correlation analysis.
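Why more measurements help can be seen in a small simulation. The signal level and noise level below are invented numbers chosen so that a single measurement is useless; nothing here comes from the Fox-IT setup.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_LEAK = 1.0         # hypothetical signal level
NOISE_SIGMA = 5.0       # hypothetical noise, much larger than the signal


def averaged_measurement(n):
    """Average n noisy measurements of the same underlying value."""
    samples = [TRUE_LEAK + rng.gauss(0.0, NOISE_SIGMA) for _ in range(n)]
    return sum(samples) / n


for n in (1, 100, 10_000):
    print(n, round(averaged_measurement(n), 3))
```

The error of the average shrinks roughly with the square root of the number of measurements, which is why the trace counts quoted later in the article grow so quickly as the signal gets weaker.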

There are a few other issues that there is no space to discuss here, but DPA handles them nicely anyway. The procedure is:

(1) Feed in a plaintext. During encryption, the plaintext is XORed with the i-th byte of the key, the result is used as the index into the lookup table, and the looked-up value appears on the bus and produces electromagnetic radiation. This process takes place physically, and the radiation is recorded with hardware.

(2) Simulate the computation of step (1) in software. Since we do not know the value of the i-th key byte, every possible value has to be tried: use the leakage model to compute 256 simulated relative radiation values, one per guess.

(3) Repeat (1) and (2) with different plaintexts, N times in total. This gives N measured values and N*256 computed values.

(4) Using correlation analysis, find which of the 256 guesses correlates most strongly with the measurements. That guess is the real value of the i-th byte of the key.

(5) Repeat (1) to (4) for each of the 32 key bytes to obtain the complete key.
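Steps (1) to (4) can be sketched end to end for one key byte on simulated data. Everything physical is faked: the "measurement" is the Hamming weight of the S-box output plus Gaussian noise, which is a simplifying assumption for this sketch (the attack itself used a Hamming distance model on real traces). The key byte, trace count, noise level and helper names are all made up.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in the AES finite field GF(2^8)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return result


def build_sbox():
    """Build the AES S-box: field inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def hamming_weight(value):
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


SBOX = build_sbox()
rng = random.Random(2017)
SECRET_KEY_BYTE = 0x2B  # what the simulated device hides
N_TRACES = 500

# Steps (1) and (3): one simulated noisy measurement per random plaintext.
plaintexts = [rng.randrange(256) for _ in range(N_TRACES)]
measured = [hamming_weight(SBOX[p ^ SECRET_KEY_BYTE]) + rng.gauss(0.0, 1.0)
            for p in plaintexts]


# Steps (2) and (4): predict the leakage for each guess, correlate, keep the best.
def score(guess):
    predicted = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
    return pearson(predicted, measured)


best_guess = max(range(256), key=score)
print(hex(best_guess))
```

Step (5) would simply repeat this for every byte position. On real hardware each "measurement" is a whole trace, so the correlation is computed at every sample point rather than on a single number.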

This is the main flow of differential electromagnetic/power analysis (the popular version). Because the cryptographic chip processes data byte by byte during encryption, and processing each byte leaks electromagnetic information, the attacker gets the chance to guess byte by byte, and the 8192 guesses mentioned above are enough to complete the crack. In practice there are many difficulties along the way, so let’s see how the Fox-IT experts pulled off the attack.

The attack in practice

Here is how the experts at Fox-IT describe the attack.

First, the electromagnetic radiation of the target chip is captured by radio-frequency acquisition equipment, mixed down, digitized, and stored on the analysis computer. The analysis computer preprocesses the captured signal and recovers the key using DPA.

The target hardware is a SmartFusion2 from Microsemi, an SoC combining an ARM core and an FPGA. The attack targeted the ARM part, a Cortex-M3 core. The target software is the AES 256 implementation from OpenSSL [6]. Although the SoC is a hybrid chip, only the ARM part is used here. The Cortex-M3 is a classic ARM core, and the software is the standard OpenSSL implementation, so the attack can be considered representative.

SmartFusion2 SoC FPGA architecture [7]

Now let’s look at the signal chain.

First comes the antenna. In theory, antenna design is serious and deep work; the picture below (from the Internet) shows only the tip of the iceberg.

Back in practice, here is the antenna used in this attack.

It is a loop antenna made “casually” from a piece of cable and some duct tape.

The attack scenario is as follows:

The green PCB is the target board, the loop antenna is suspended above the chip, and the signal passes through an external amplifier and bandpass filter, which are standard and inexpensive industrial devices.

The acquisition equipment is more interesting. Time-domain acquisition is normally done with an oscilloscope or a dedicated data recorder, or at the very least with a software-defined radio such as a USRP. The experts at Fox-IT naturally started out the same way.

On the left is a dedicated data recorder: big and clunky, and with a “beautiful” price tag. In the middle is a USRP board, which has plenty of performance at a price that an ordinary research institution or a well-off individual (a tuhao) can afford. The interesting part is on the right: a USB dongle labeled RTL-SDR, which will be familiar to anyone who plays with radio. These are also sold in China, for only tens to hundreds of RMB. This research shows that even such an entry-level device is fully capable of completing the attack.

Above is the AES pattern observed with this hardware. The AES encryption can be clearly seen between the I/O operations, including the key schedule and the 14 rounds.

Next comes the analysis, which reference [1] does not describe in detail. However, DPA is a fairly standard routine, and they used Riscure’s Inspector, a benchmark tool in the industry, so we can discuss it based on the author’s own experience.

The first stage is signal preprocessing, which mainly consists of digital filtering and conversion of the complex signal to a real one, and also includes steps such as resampling and cropping. Another important step is aligning the different traces with each other. With sliding windows and correlation analysis, it is straightforward to align all traces precisely.
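The sliding-window alignment mentioned above can be sketched on toy data: slide a short reference pattern along each trace, take the offset where the correlation peaks, and crop every trace at that offset. This is a generic illustration, not how Inspector implements it; the pattern, the traces and the function names are all invented.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def best_offset(pattern, trace):
    """Slide `pattern` along `trace`; return the offset with the highest correlation."""
    width = len(pattern)
    offsets = range(len(trace) - width + 1)
    return max(offsets, key=lambda o: pearson(pattern, trace[o:o + width]))


def align(traces, pattern):
    """Crop each trace to the window that best matches the reference pattern."""
    width = len(pattern)
    aligned = []
    for trace in traces:
        start = best_offset(pattern, trace)
        aligned.append(trace[start:start + width])
    return aligned


# Toy data: the same feature appears at different positions in two traces.
pattern = [0, 2, 9, 4, 1]
trace_a = [1, 0, 1] + pattern + [0, 1, 0, 1]      # feature starts at index 3
trace_b = [0] + pattern + [1, 0, 1, 0, 1, 0]      # feature starts at index 1

print(best_offset(pattern, trace_a), best_offset(pattern, trace_b))
print(align([trace_a, trace_b], pattern))
```

Once traces are cropped to a common starting point, sample k in one trace corresponds to the same operation as sample k in every other trace, which the later correlation step depends on.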

Next comes the actual DPA, which Inspector provides as standard modules, so there is no need to implement it yourself. There are some tricks, though, one of which is mentioned in reference [1]: to quickly verify that the captured electromagnetic signal is directly related to the power consumption of the device and that the probe position is right, run a correlation analysis between the input plaintext (or output ciphertext) and the traces. This also verifies that the leakage model is valid.

This correlation curve shows the points in time where the data and the signal are correlated, that is, where the data can be detected in the captured signal.
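A correlation curve of this kind can be reproduced on simulated traces. In the sketch below each trace is pure noise except for one sample, which also carries the Hamming weight of a known plaintext byte; correlating the known data against every sample position shows where the leak sits. The trace sizes, the leaking position and the use of Hamming weight are assumptions made for illustration only.

```python
import random


def hamming_weight(value):
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


rng = random.Random(7)
N_TRACES, N_SAMPLES, LEAK_SAMPLE = 400, 20, 12  # hypothetical sizes

plaintext_bytes = [rng.randrange(256) for _ in range(N_TRACES)]
traces = []
for p in plaintext_bytes:
    trace = [rng.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
    trace[LEAK_SAMPLE] += hamming_weight(p)  # the only data-dependent sample
    traces.append(trace)

# Correlate the known data with every sample position across all traces.
model = [hamming_weight(p) for p in plaintext_bytes]
curve = [pearson(model, [t[s] for t in traces]) for s in range(N_SAMPLES)]

peak = max(range(N_SAMPLES), key=lambda s: abs(curve[s]))
print(peak, round(curve[peak], 2))
```

Because this check uses only known data and no key guess, a clear peak confirms that the probe position and leakage model are sound before any effort is spent on key recovery.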

The next step is to guess the key; in the figure below, the guess with the highest correlation is the correct key. The experiments show that on the SmartFusion2 the leakage comes from the AHB bus, as expected: the AHB connects the Cortex-M3 to the on-chip RAM, and a table lookup is a RAM access by the M3 core. Compared with a simple MCU, the effect of caches also has to be considered. For the instruction cache, the Hamming distance model can be used. To keep data coherent with the FPGA fabric, the SmartFusion2 has no data cache, which also simplifies the attacker’s work.

The attack picks up the electromagnetic signal from within a few centimeters, and the hardware costs less than 200 euros (about 1,500 yuan). Bought in China, the same hardware might cost less than 1,000 yuan.

On the software side, Inspector is commercial software and requires a license fee. Fortunately, the core algorithms are already public, and you can write your own or use a cheaper solution, so it’s understandable that this is not included in the total price.

Limitations

Going back over the whole attack, we can summarize the prerequisites and limitations of this type of side-channel attack.

(1) The attacker must fully control the target device: be able to feed it different plaintexts and make it perform the encryption.

(2) The attacker must be able to get close to the target device, because the attack measures a physical property of the device (its electromagnetic emissions). How close depends on the electromagnetic environment at the site.

(3) The attacker must know the algorithm and implementation details used by the target device. The algorithm is relatively easy to determine. The implementation details of many devices are not open source, but cryptographic algorithms usually have only a few standard implementations, so they are not hard to guess.

Condition (1) is the basic requirement for the attack, so there is no need to worry that your neighbor can casually run a side-channel attack on your router.

Condition (2) mainly depends on how close the distance is. Here are some in-depth studies.

The hand-made loop antenna used above works only at a distance of a few centimetres; beyond that, the signal is drowned in noise.

So the antenna engineers built the PCB antenna shown below, which performs much better, but to reduce its size (read: price) it operates at 400 MHz, while the SmartFusion2 only runs at up to 142 MHz. Since this is research, the target device was simply swapped (yes, that willful) for a Xilinx Pynq board, which runs stably at 400 MHz. Practice showed that the RTL-SDR can still be used to complete the attack. It works at up to 30 cm, but do not forget that it needs 400K traces, and that this was done with some electromagnetic shielding.

PCB antenna

Use a first-aid blanket to wrap the attack environment

 

Finally, pushing the range to 1 meter requires ideal conditions.

First, the tests were carried out in a microwave anechoic chamber to eliminate as much interference as possible. A cone antenna was used, and the measurement subsystem was electrically isolated from the encryption subsystem. Even so, the 1-meter attack succeeded only with great effort, using 2.4 million traces. This idealized experiment proved that, under good enough conditions, an attack range of 1 meter is entirely feasible.

Defense

Side-channel attacks work because the side-channel information leaked by a cryptographic device is correlated with the data being processed. At the implementation level, this correlation can be removed through masking or hiding; the details are beyond the scope of this article. For ordinary application or system developers, reinventing the wheel is not recommended when it comes to cryptographic algorithms. In particular, do not assume that you understand a cryptographic algorithm well enough to modify it: one small change may destroy both its theoretical security and its implementation security. This is not a job for ordinary developers, and a mature wheel remains the most reliable choice.
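Only to give a flavor of what masking means, here is a toy sketch of a masked table lookup. It is not a protected implementation and, in line with the advice above, not something to deploy. The table is a stand-in byte permutation rather than the real AES S-box, the function name is made up, and a real design would draw fresh masks from a cryptographic random source for every encryption.

```python
import random

rng = random.Random(42)  # toy randomness; a real design needs a cryptographic source

# Stand-in for the AES S-box: any fixed, public byte permutation works for this sketch.
TABLE = list(range(256))
rng.shuffle(TABLE)


def masked_lookup(masked_index, mask_in, mask_out):
    """Return TABLE[masked_index ^ mask_in] ^ mask_out via a re-masked table.

    The re-masked copy is indexed directly by the masked value, so the
    unmasked index and the unmasked result never appear in the lookup.
    """
    masked_table = [0] * 256
    for x in range(256):
        masked_table[x ^ mask_in] = TABLE[x] ^ mask_out
    return masked_table[masked_index]


plaintext_byte, key_byte = 0x3A, 0xC5  # arbitrary example values
mask_in = rng.randrange(256)           # would be fresh for every encryption
mask_out = rng.randrange(256)

masked_index = (plaintext_byte ^ mask_in) ^ key_byte  # stays masked throughout
masked_result = masked_lookup(masked_index, mask_in, mask_out)

# Removing the output mask gives the same value as the unprotected lookup.
print(masked_result ^ mask_out == TABLE[plaintext_byte ^ key_byte])
```

Because the masks change on every run, the values on the bus no longer line up with what an attacker predicts from the plaintext and a key guess, which is the correlation the attack relies on.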

Taking the SmartFusion2 as an example, you can use an implementation with protection, such as the FPGA-based one provided by Microsemi, instead of the OpenSSL implementation. At the circuit level, techniques such as power balancing can mitigate these leaks to some extent, and dedicated hardware for cryptographic operations is a good defense against such attacks.

References

[1] https://www.fox-it.com/en/insights/blogs/blog/tempest-attacks-aes/

[2] www.st.com/zh/embedded…

[3] https://www.intel.com/content/www/us/en/architecture-and-technology/advanced-encryption-standard--aes-/data-protection-aes-general-technology.html

[4] en.wikipedia.org/wiki/Advanc…

[5] NIST. FIPS PUB 197: Advanced Encryption Standard (AES). Federal Information Processing Standards Publication, 2001.

[6] https://www.openssl.org/

[7] www.microsemi.com/products/fp…

[8] Mangard S, Oswald E, Popp T. Power analysis attacks: Revealing the secrets of smart cards[M]. Springer Science & Business Media, 2008.

* Author: Cyxu, from the official blog of Aliju Security