1. Basic principle of MD5

MD5 is message-digest Algorithm 5, which is used to ensure information consistency. The principle of MD5 is briefly described. Input information is processed in 512-bit packets, and each packet is divided into 16 32-bit packets. After a series of processing, the output of the algorithm consists of four 32-bit packets, which are cascaded to generate a 128-bit hash value.

2. Algorithm flow

The following is an example of an SO library that invokes the standard MD5 algorithm for encryption

F5 is converted to pseudo-C code as follows

A standard MD5 algorithm first evaluates MD5Init(), MD5Update(), MD5Final() in that order, and finally computs a 32-bit hexadecimal string. It then outputs the string in hexadecimal form if necessary (sprintf(str_tmp, “%02x”, decrypt[v1]), otherwise it looks like garbled.

  1. MD5Init()

This step is the initialization variable. The initial 128-bit value is the initial link variable. These parameters are used for the first round of operation, and are represented in big-enode order: A=0x01234567, B=0x89ABCDEF, C=0xFEDCBA98, D=0x76543210. Each variable gives the value high byte stored at the low memory address and low byte stored at the high memory address, i.e., big-endian byte order. In the program, the values of variables A, B, C and D are 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476 respectively)

F5 is converted to pseudo-C code as follows

Basically, in the general SO, we can see that the values are initialized to 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

  1. MD5Update (), MD5Final ()

The algorithm flow of each group is as follows:

The first group needs to copy the above four link variables into the other four variables: A to A, B to B, C to C, and D to D. The variables starting from the second group are the results of the previous group, i.e., A = A, B = B, C = C, D = D.

The main cycle has four rounds (MD4 only has three), and each cycle is similar. The first round of 16 operations. Each operation performs a nonlinear function operation on three of a, B, C, and D, and then adds the result to a fourth variable, a subgroup of the text, and a constant. Shift the result to the left by an indeterminate number and add one of a, B, C, or D. Finally replace one of A, B, C or D with the result.

Here are the four nonlinear functions used in each operation (one for each round).

F( X ,Y ,Z ) = ( X & Y ) | ( (~X) & Z )

G( X ,Y ,Z ) = ( X & Z ) | ( Y & (~Z) )

H( X ,Y ,Z ) =X ^ Y ^ Z

I( X ,Y ,Z ) =Y ^ ( X | (~Z) )

(& is And (And), | And (Or), ~ is (Not), ^ is exclusive Or (Xor))

Description of these four functions: If the corresponding bits of X, Y and Z are independent and uniform, then each bit of the result should also be independent and uniform.

F is a bitwise operation. That is, if X, then Y, otherwise Z. The function H is the bitwise parity operator.

Suppose Mj represents the JTH subgroup of the message (from 0 to 15), the constant Ti is the integer part of 4294967296*abs(sin(I)), and I takes the value from 1 to 64 in radians. (4294967296=232)

FF (a, b, c, d, Mj, s, ti) operation for a = b + ((a + F (b, c, d) + Mj + ti) < < s)

GG (a, b, c, d, Mj, s, ti) operation for a = b + ((a + G (b, c, d) + Mj + ti) < < s)

HH (a, b, c, d, Mj, s, ti) operation for a = b + ((a + H (b, c, d) + Mj + ti) < < s)

II (a, b, c, d, Mj, s, ti) operation for a = b + ((a + I (b, c, d) + Mj + ti) < < s)

Note: “<<” indicates left shift of the loop, not left shift.

The four rounds (64 steps in total) are:

The first round

FF(a ,b ,c ,d ,M0 ,7 ,0xd76aa478 )

FF(d ,a ,b ,c ,M1 ,12 ,0xe8c7b756 )

FF(c ,d ,a ,b ,M2 ,17 ,0x242070db )

FF(b ,c ,d ,a ,M3 ,22 ,0xc1bdceee )

FF(a ,b ,c ,d ,M4 ,7 ,0xf57c0faf )

FF(d ,a ,b ,c ,M5 ,12 ,0x4787c62a )

FF(c ,d ,a ,b ,M6 ,17 ,0xa8304613 )

FF(b ,c ,d ,a ,M7 ,22 ,0xfd469501)

FF(a ,b ,c ,d ,M8 ,7 ,0x698098d8 )

FF(d ,a ,b ,c ,M9 ,12 ,0x8b44f7af )

FF(c ,d ,a ,b ,M10 ,17 ,0xffff5bb1 )

FF(b ,c ,d ,a ,M11 ,22 ,0x895cd7be )

FF(a ,b ,c ,d ,M12 ,7 ,0x6b901122 )

FF(d ,a ,b ,c ,M13 ,12 ,0xfd987193 )

FF(c ,d ,a ,b ,M14 ,17 ,0xa679438e )

FF(b ,c ,d ,a ,M15 ,22 ,0x49b40821 )

The second round

GG(a ,b ,c ,d ,M1 ,5 ,0xf61e2562 )

GG(d ,a ,b ,c ,M6 ,9 ,0xc040b340 )

GG(c ,d ,a ,b ,M11 ,14 ,0x265e5a51 )

GG(b ,c ,d ,a ,M0 ,20 ,0xe9b6c7aa )

GG(a ,b ,c ,d ,M5 ,5 ,0xd62f105d )

GG(d ,a ,b ,c ,M10 ,9 ,0x02441453 )

GG(c ,d ,a ,b ,M15 ,14 ,0xd8a1e681 )

GG(b ,c ,d ,a ,M4 ,20 ,0xe7d3fbc8 )

GG(a ,b ,c ,d ,M9 ,5 ,0x21e1cde6 )

GG(d ,a ,b ,c ,M14 ,9 ,0xc33707d6 )

GG(c ,d ,a ,b ,M3 ,14 ,0xf4d50d87 )

GG(b ,c ,d ,a ,M8 ,20 ,0x455a14ed )

GG(a ,b ,c ,d ,M13 ,5 ,0xa9e3e905 )

GG(d ,a ,b ,c ,M2 ,9 ,0xfcefa3f8 )

GG(c ,d ,a ,b ,M7 ,14 ,0x676f02d9 )

GG(b ,c ,d ,a ,M12 ,20 ,0x8d2a4c8a )

In the third round

HH(a ,b ,c ,d ,M5 ,4 ,0xfffa3942 )

HH(d ,a ,b ,c ,M8 ,11 ,0x8771f681 )

HH(c ,d ,a ,b ,M11 ,16 ,0x6d9d6122 )

HH(b ,c ,d ,a ,M14 ,23 ,0xfde5380c )

HH(a ,b ,c ,d ,M1 ,4 ,0xa4beea44 )

HH(d ,a ,b ,c ,M4 ,11 ,0x4bdecfa9 )

HH(c ,d ,a ,b ,M7 ,16 ,0xf6bb4b60 )

HH(b ,c ,d ,a ,M10 ,23 ,0xbebfbc70 )

HH(a ,b ,c ,d ,M13 ,4 ,0x289b7ec6 )

HH(d ,a ,b ,c ,M0 ,11 ,0xeaa127fa )

HH(c ,d ,a ,b ,M3 ,16 ,0xd4ef3085 )

HH(b ,c ,d ,a ,M6 ,23 ,0x04881d05 )

HH(a ,b ,c ,d ,M9 ,4 ,0xd9d4d039 )

HH(d ,a ,b ,c ,M12 ,11 ,0xe6db99e5 )

HH(c ,d ,a ,b ,M15 ,16 ,0x1fa27cf8 )

HH(b ,c ,d ,a ,M2 ,23 ,0xc4ac5665 )

The fourth round of

II(a ,b ,c ,d ,M0 ,6 ,0xf4292244 )

II(d ,a ,b ,c ,M7 ,10 ,0x432aff97 )

II(c ,d ,a ,b ,M14 ,15 ,0xab9423a7 )

II(b ,c ,d ,a ,M5 ,21 ,0xfc93a039 )

II(a ,b ,c ,d ,M12 ,6 ,0x655b59c3 )

II(d ,a ,b ,c ,M3 ,10 ,0x8f0ccc92 )

II(c ,d ,a ,b ,M10 ,15 ,0xffeff47d )

II(b ,c ,d ,a ,M1 ,21 ,0x85845dd1 )

II(a ,b ,c ,d ,M8 ,6 ,0x6fa87e4f )

II(d ,a ,b ,c ,M15 ,10 ,0xfe2ce6e0 )

II(c ,d ,a ,b ,M6 ,15 ,0xa3014314 )

II(b ,c ,d ,a ,M13 ,21 ,0x4e0811a1 )

II(a ,b ,c ,d ,M4 ,6 ,0xf7537e82 )

II(d ,a ,b ,c ,M11 ,10 ,0xbd3af235 )

II(c ,d ,a ,b ,M2 ,15 ,0x2ad7d2bb )

II(b ,c ,d ,a ,M9 ,21 ,0xeb86d391 )

After all this is done, add a, B, C and D to the original, respectively. A = a + A, b = b + b, C = c + c, d = d + D

The key pseudocode is as follows:

void __fastcall MD5Transform(unsigned int *state, unsigned __int8 *block)
{
  unsigned int *v2; // r4@1
  unsigned int v3; // r5@1
  unsigned int v4; // r6@1
  unsigned int v5; // ST14_4@1
  unsigned int v6; // r7@1
  int v7; // r1@1
  int v8; // r1@1
  int v9; // r2@1
 ------skip--------
  int v132; // r0@1
  int v133; // r3@1
  unsigned int v134; // r5@1
  unsigned int v135; // r3@1
  unsigned int x[64]; // [sp+48h] [bp-118h]@1

  v2 = state;
  v3 = state[1];
  v4 = *state;
  v5 = state[2];
  v6 = state[3];
  MD5Decode(x, block, 0x40u);
  v7 = __ROR4__(x[0] - 680876936 + v4 + (v6 & ~v3 | v5 & v3), 25);
  v8 = v7 + v3;
  v9 = __ROR4__(x[1] - 389564586 + v6 + (v5 & ~v8 | v3 & v8), 20);
  v10 = v9 + v8;
  v11 = __ROR4__(x[2] + 606105819 + v5 + (v8 & v10 | v3 & ~v10), 15);
  v12 = v11 + v10;
  v13 = __ROR4__(v3 + x[3] - 1044525330 + (v10 & v12 | v8 & ~v12), 10);
  v14 = v13 + v12;
  v15 = __ROR4__(v8 + x[4] - 176418897 + (v12 & v14 | v10 & ~v14), 25);
  v16 = v15 + v14;
  v17 = __ROR4__(v10 + x[5] + 1200080426 + (v14 & v16 | v12 & ~v16), 20);
  v18 = v17 + v16;
  v19 = __ROR4__(v12 + x[6] - 1473231341 + (v16 & v18 | v14 & ~v18), 15);
  v20 = v19 + v18;
  v21 = __ROR4__(v14 + x[7] - 45705983 + (v18 & v20 | v16 & ~v20), 10);
  v22 = v21 + v20;
  v23 = __ROR4__(v16 + x[8] + 1770035416 + (v20 & v22 | v18 & ~v22), 25);
  v24 = v23 + v22;
  v25 = __ROR4__(v18 + x[9] - 1958414417 + (v22 & v24 | v20 & ~v24), 20);
  v26 = v25 + v24;
  v27 = __ROR4__(v20 + x[10] - 42063 + (v24 & v26 | v22 & ~v26), 15);
  v28 = v27 + v26;
  v29 = __ROR4__(v22 + x[11] - 1990404162 + (v26 & v28 | v24 & ~v28), 10);
  v30 = v29 + v28;
  v31 = __ROR4__(v24 + x[12] + 1804603682 + (v28 & v30 | v26 & ~v30), 25);
  v32 = v31 + v30;
  v33 = __ROR4__(x[13] - 40341101 + v26 + (v30 & v32 | v28 & ~v32), 20);
  v34 = v33 + v31 + v30;
  v35 = __ROR4__(x[14] - 1502002290 + v28 + ((v31 + v30) & v34 | ~v34 & v30), 15);
  v36 = v35 + v34;
  v37 = __ROR4__(x[15] + 1236535329 + v30 + (v34 & v36 | ~v36 & (v31 + v30)), 10);
  v38 = v37 + v36;
  v39 = __ROR4__(x[1] - 165796510 + v32 + (~v34 & v36 | v34 & v38), 27);
  v40 = v39 + v38;
  v41 = __ROR4__(x[6] - 1069501632 + v34 + (~v36 & v38 | v36 & v40), 23);
  v42 = v41 + v40;
  v43 = __ROR4__(v36 + x[11] + 643717713 + (v40 & ~v38 | v38 & v42), 18);
  v44 = v43 + v42;
  v45 = __ROR4__(v38 + x[0] - 373897302 + (v42 & ~v40 | v40 & v44), 12);
  v46 = v45 + v44;
  v47 = __ROR4__(v40 + x[5] - 701558691 + (v44 & ~v42 | v42 & v46), 27);
  v48 = v47 + v46;
  v49 = __ROR4__(v42 + x[10] + 38016083 + (v46 & ~v44 | v44 & v48), 23);
  v50 = v49 + v48;
  v51 = __ROR4__(v44 + x[15] - 660478335 + (v48 & ~v46 | v46 & v50), 18);
  v52 = v51 + v50;
  v53 = __ROR4__(v46 + x[4] - 405537848 + (v50 & ~v48 | v48 & v52), 12);
  v54 = v53 + v52;
  v55 = __ROR4__(v48 + x[9] + 568446438 + (v52 & ~v50 | v50 & v54), 27);
  v56 = v55 + v54;
  v57 = __ROR4__(v50 + x[14] - 1019803690 + (v54 & ~v52 | v52 & v56), 23);
  v58 = v57 + v56;
  v59 = __ROR4__(v52 + x[3] - 187363961 + (v56 & ~v54 | v54 & v58), 18);
  v60 = v59 + v58;
  v61 = __ROR4__(v54 + x[8] + 1163531501 + (v58 & ~v56 | v56 & v60), 12);
  v62 = v61 + v60;
  v63 = __ROR4__(v56 + x[13] - 1444681467 + (v60 & ~v58 | v58 & v62), 27);
  v64 = v63 + v62;
  v65 = __ROR4__(v58 + x[2] - 51403784 + (v62 & ~v60 | v60 & v64), 23);
  v66 = v65 + v64;
  v67 = __ROR4__(x[7] + 1735328473 + v60 + (v64 & ~v62 | v62 & v66), 18);
  v68 = v67 + v66;
  v69 = __ROR4__(x[12] - 1926607734 + v62 + (v66 & ~v64 | v64 & v68), 12);
  v70 = v69 + v68;
  v71 = __ROR4__(x[5] - 378558 + v64 + (v68 ^ v66 ^ v70), 28);
  v72 = v71 + v70;
  v73 = __ROR4__(x[8] - 2022574463 + v66 + (v70 ^ v68 ^ v72), 21);
  v74 = v73 + v72;
  v75 = __ROR4__(x[11] + 1839030562 + v68 + (v72 ^ v70 ^ v74), 16);
  v76 = v75 + v74;
  v77 = __ROR4__(x[14] - 35309556 + v70 + (v74 ^ v72 ^ v76), 9);
  v78 = v77 + v76;
  v79 = __ROR4__((v76 ^ v74 ^ v78) + x[1] - 1530992060 + v72, 28);
  v80 = v79 + v78;
  v81 = __ROR4__((v78 ^ v76 ^ v80) + x[4] + 1272893353 + v74, 21);
  v82 = v81 + v80;
  v83 = __ROR4__((v80 ^ v78 ^ v82) + x[7] - 155497632 + v76, 16);
  v84 = v83 + v82;
  v85 = __ROR4__((v82 ^ v80 ^ v84) + x[10] - 1094730640 + v78, 9);
  v86 = v85 + v84;
  v87 = __ROR4__((v84 ^ v82 ^ v86) + x[13] + 681279174 + v80, 28);
  v88 = v87 + v86;
  v89 = __ROR4__((v86 ^ v84 ^ v88) + x[0] - 358537222 + v82, 21);
  v90 = v89 + v88;
  v91 = __ROR4__((v88 ^ v86 ^ v90) + x[3] - 722521979 + v84, 16);
  v92 = v91 + v90;
  v93 = __ROR4__((v90 ^ v88 ^ v92) + x[6] + 76029189 + v86, 9);
  v94 = v93 + v92;
  v95 = __ROR4__((v92 ^ v90 ^ v94) + x[9] - 640364487 + v88, 28);
  v96 = v95 + v94;
  v97 = __ROR4__((v94 ^ v92 ^ v96) + x[12] - 421815835 + v90, 21);
  v98 = v97 + v96;
  v99 = __ROR4__(v92 + x[15] + 530742520 + (v96 ^ v94 ^ v98), 16);
  v100 = v99 + v98;
  v101 = __ROR4__(v94 + x[2] - 995338651 + (v98 ^ v96 ^ v100), 9);
  v102 = v101 + v100;
  v103 = __ROR4__(x[0] - 198630844 + v96 + ((~v98 | v102) ^ v100), 26);
  v104 = v103 + v102;
  v105 = __ROR4__(x[7] + 1126891415 + v98 + ((~v100 | v104) ^ v102), 22);
  v106 = v105 + v104;
  v107 = __ROR4__(x[14] - 1416354905 + v100 + ((~v102 | v106) ^ v104), 17);
  v108 = v107 + v106;
  v109 = __ROR4__(v102 + x[5] - 57434055 + ((~v104 | v108) ^ v106), 11);
  v110 = v109 + v108;
  v111 = __ROR4__(x[12] + 1700485571 + v104 + ((~v106 | v110) ^ v108), 26);
  v112 = v111 + v110;
  v113 = __ROR4__(x[3] - 1894986606 + v106 + ((~v108 | v112) ^ v110), 22);
  v114 = v113 + v112;
  v115 = __ROR4__(x[10] - 1051523 + v108 + ((~v110 | v114) ^ v112), 17);
  v116 = v115 + v114;
  v117 = __ROR4__(x[1] - 2054922799 + v110 + ((~v112 | v116) ^ v114), 11);
  v118 = v117 + v116;
  v119 = __ROR4__(x[8] + 1873313359 + v112 + ((~v114 | v118) ^ v116), 26);
  v120 = v119 + v118;
  v121 = __ROR4__(x[15] - 30611744 + v114 + ((~v116 | v120) ^ v118), 22);
  v122 = v121 + v120;
  v123 = __ROR4__(x[6] - 1560198380 + v116 + ((~v118 | v122) ^ v120), 17);
  v124 = v123 + v122;
  v125 = __ROR4__(x[13] + 1309151649 + v118 + ((~v120 | v124) ^ v122), 11);
  v126 = v125 + v124;
  v127 = __ROR4__(x[4] - 145523070 + v120 + ((~v122 | v126) ^ v124), 26);
  v128 = v127 + v126;
  v129 = __ROR4__(x[11] - 1120210379 + v122 + ((~v124 | v128) ^ v126), 22);
  v130 = v129 + v128;
  v131 = __ROR4__(x[2] + 718787259 + v124 + ((~v126 | v130) ^ v128), 17);
  v132 = v131 + v130;
  v133 = __ROR4__(x[9] - 343485551 + v126 + ((~v128 | v132) ^ v130), 11);
  *v2 += v128;
  v134 = v2[3];
  v2[1] += v132 + v133;
  v135 = v2[2];
  v2[3] = v134 + v130;
  v2[2] = v135 + v132;
}
Copy the code

The algorithm is then continued with the next block of data.

The final output is a cascade of A, B, C, and D

3. IDA dynamic debugging

At MD5Update(), the breakpoint is found before entering

MD5(“[hbzFZ9dew0]@YWROdW09MWJkdXNzPWJpcnRoZGF5PTIwMTgtMDUtMTFjaGFubmVsPWJhaWR1emh1c2hvdWN1aWQ9OEY3ODcxMzNERTdDOTQ1NUZEMk I5MTc3MTFDNzEwMzN8NDAzNTU2NzYwMDk0MzUzZGlnRmxhZz0xaG90PTBpc0FJPTBvdnVsYXRpb25UaW1lPTBwZXJpb2Q9MnBuPTEwcHJlZ1N0PTNxdWVyeT 3lrp3lrp3og47orrBybj0xMHRva2VuPTFfODA1ZmE3MWI2NzU3MDZjYjNmYmNiZjBjYTFjMzc3OTB0eXBlPTB2Yz0yMzZ2b2ljZT0w”) = d7166568860b42b3dcb13c695e0fc5b0

When I came out, it became

And then finally after sprintf formatting the output will become

4. MD5 encryption of Python

import hashlib

m = hashlib.md5()
m.update("123".encode("utf-8"))
result = m.hexdigest() # m.digest() returns a hexadecimal format similar to the one above, before sprintf formatting
print (result)

Copy the code

Note: MD5Update() executed twice is actually a splicing effect (such as md5.update(md5.update(“123″.encode(” utF-8 “))) is actually equal to md5.update(“123123″.encode(” utF-8 “)), So to avoid that we should re-instantiate every time, MD5Init()

5. Reveal the hash function in MD5 encryption & Inspeckage in Java layer

Here is an example of an APK decompilation:

In reverse exists in the use of Java. Java layer security. The MessageDigest class MD5 encryption, and can be used directly Inspeckage to check MD5 encrypted string before and after the difference

  • http://repo.xposed.info/module/mobi.acpm.inspeckage

The MD5 string hook is implemented in HashHook. Java

First hook Java. Security. MessageDigest getInstance, for what is a hash type

And then hook Java. Security. MessageDigest. Update, get the string before the update

Finally hook Java. Security. MessageDigest. Digest, a character before formatting output

conclusion

We can check whether there is a value that needs to be initialized in MD5Init, then check whether there is MD5 encryption, and then hook MD5Update() before and after the input and output string.

In addition, you can use the following parameters to quickly test whether it is MD5 algorithm:

MD5(“”) = d41d8cd98f00b204e9800998ecf8427e

MD5(“a”) = 0cc175b9c0f1b6a831c399e269772661

MD5(“abc”) = 900150983cd24fb0d6963f7d28e17f72

reference

  • https://baike.baidu.com/item/MD5/212708?fr=aladdin