Read MD5

Have sailed the seven seas for water, except wushan not cloud. – to yuan

MD5 summary

The MD5 message-digest Algorithm, a widely used password hash function, produces a 128-bit hash value to ensure complete and consistent transmission of messages. MD5 was designed by American cryptographer Ronald Linn Rivest.

MD5 Implementation Steps

The following description is based on rfc1321. The following description assumes that there is a b-bit message as the input, that is:

m = m_0 m_1 ... m _{b-1}
Copy the code

Step 1: Data Padding (Append Padding)

MD5 is processed according to the block length, the block length is 512 bits. In most cases, the data length is not exactly an integer multiple of 512, so the padding is required to the given length.

Padding rule: the b bit of the original plaintext message is followed by 100… PaddingLength % 512 = 448 if b % 512 is between [448, 512(0)], add a partition and fill it as before (because RFC says fill 1bit minimum).

Step 2: Length filling

If b + paddingLength % 512 = 448, then we have 512-448 = 64 bits for the last partition, which contains the length of the original message, b. This means that MD5 can process data with a plaintext length of up to 2^64 bits.

After the processing of the above two steps, the final processed data is shown as the figure below:

Step 3: Initialize the MD buffer

The MD Buffer is a set of four 32-bit vectors.

A four-word buffer (A,B,C,D) is used to compute the message digest. Here each of A, B, C, D is a 32-bit register. These registers are initialized to the following values in hexadecimal, low-order bytes first): word A: 01 23 45 67 word B: 89 ab cd ef word C: fe dc ba 98 word D: 76 54 32 10

Note that the implementation of the program needs to be handled in the following way:

let A = 0x67452301;
let B = 0xEFCDAB89;
let C = 0x98BADCFE;
let D = 0x10325476;
Copy the code

Step 4: Process each chunk

This step is the core of the whole MD5 algorithm, but also the most convoluted part, if I describe where is not very clear, then the big guys combined with the original RFC look.

The F function in the figure is actually the following four functions:

F(X,Y,Z) = XY v not(X) Z
G(X,Y,Z) = XZ v Y not(Z)
H(X,Y,Z) = X xor Y xor Z
I(X,Y,Z) = Y xor (X v not(Z))
Copy the code

This section uses a sine function table, as described in the RFC:

This step uses a 64-element table T[1 … 64] constructed from the sine function. Let T[i] denote the i-th element of the table, which is equal to the integer part of 4294967296 times abs(sin(i)), where i is in radians. The elements of the table are given in the appendix.

In simple terms, this is the absolute value of the sine of a constant (4294967296).

T = [int(abs(math.sin(i)) * 4294967296) for i in range(1.65)]
# output
# [3614090360, 3905402710, 606105819, 3250441966, 4118548399, 1200080426, 2821735955, 4249261313, 1770035416, 2336552879, 4294925233, 2304563134, 1804603682, 4254626195, 2792965006, 1236535329, 4129170786, 3225465664, 643717713, 3921069994, 3593408605, 38016083, 3634488961, 3889429448, 568446438, 3275163606, 4107603335, 1163531501, 2850285829, 4243563512, 1735328473, 2368359562, 4294588738, 2272392833, 1839030562, 4259657740, 2763975236, 1272893353, 4139469664, 3200236656, 681279174, 3936430074, 3572445317, 76029189, 3654602809, 3873151461, 530742520, 3299628645, 4096336452, 1126891415, 2878612391, 4237533241, 1700485571, 2399980690, 4293915773, 2240044497, 1873313359, 4264355552, 2734768916, 1309151649, 4149444226, 3174756917, 718787259, 3951481745]
Copy the code

Having this table, which is actually an important feature of MD5, allows you to search the table directly in memory.

This is followed by 16 rounds of calculations for each block, which I won’t post here but refer to the RFC description in Resources below.

Step 5: Output

Finally, the final state of A, B, C, and D is the output, and this step is very simple and I won’t explain it too much.

Code implementation

The rust implementation of MD5 is given below.

pub static T: [u32; 64] = [
    3614090360.3905402710.606105819.3250441966.4118548399.1200080426.2821735955.4249261313.1770035416.2336552879.4294925233.2304563134.1804603682.4254626195.2792965006.1236535329.4129170786.3225465664.643717713.3921069994.3593408605.38016083.3634488961.3889429448.568446438.3275163606.4107603335.1163531501.2850285829.4243563512.1735328473.2368359562.4294588738.2272392833.1839030562.4259657740.2763975236.1272893353.4139469664.3200236656.681279174.3936430074.3572445317.76029189.3654602809.3873151461.530742520.3299628645.4096336452.1126891415.2878612391.4237533241.1700485571.2399980690.4293915773.2240044497.1873313359.4264355552.2734768916.1309151649.4149444226.3174756917.718787259.3951481745
];

fn f(x: u32, y: u32, z: u32) - >u32{ (x & y) | (! x & z) }fn g(x: u32, y: u32, z: u32) - >u32{ (x & z) | (y & ! z) }fn h(x: u32, y: u32, z: u32) - >u32 {
    x ^ y ^ z
}

fn i(x: u32, y: u32, z: u32) - >u32{ y ^ (x | ! z) }fn ff(a: u32, b: u32, c: u32, d: u32, m: u32, s: u32) - >u32 {
    // b + ((a + F(b,c,d) + X[k] + T[i]) <<< s)
    a.wrapping_add(f(b, c, d)).wrapping_add(m).rotate_left(s).wrapping_add(b)
}

fn gg(a: u32, b: u32, c: u32, d: u32, m: u32, s: u32) - >u32 {
    // b + ((a + G(b,c,d) + X[k] + T[i]) <<< s)
    a.wrapping_add(g(b, c, d)).wrapping_add(m).rotate_left(s).wrapping_add(b)
}

fn hh(a: u32, b: u32, c: u32, d: u32, m: u32, s: u32) - >u32 {
    // b + ((a + H(b,c,d) + X[k] + T[i]) <<< s)
    a.wrapping_add(h(b, c, d)).wrapping_add(m).rotate_left(s).wrapping_add(b)
}

fn ii(a: u32, b: u32, c: u32, d: u32, m: u32, s: u32) - >u32 {
    // b + ((a + I(b,c,d) + X[k] + T[i]) <<< s)
    a.wrapping_add(i(b, c, d)).wrapping_add(m).rotate_left(s).wrapping_add(b)
}

pub struct MD5 {}

impl MD5 {
    fn padding(message: &[u8]) -> Vec<u8> {
        let message_length = message.len() as u64 * 8;
        let mut result = message.to_owned();
        // padding 1
        result.push(0x80);
        // padding 0
        while ((result.len() * 8) + 64) % 512! =0 {
            result.push(0b00000000);
        }
        // padding message length
        for b in 0.8 {
            result.push((message_length >> (b * 8)) as u8);
        }
        return result;
    }

    pub fn hash(message: &[u8]) -> String {
        let padding_message = MD5::padding(message);
        // init MD Buffer
        let mut a0 = 0x67452301u32;
        let mut b0 = 0xefcdab89u32;
        let mut c0 = 0x98badcfeu32;
        let mut d0 = 0x10325476u32;
        // Process Message in 16-Word Blocks
        for chunk in padding_message.chunks(64) {
            let m: Vec<u32> = chunk.chunks(4).map(|i| {
                ((i[3] as u32) < <24) | ((i[2] as u32) < <16) | ((i[1] as u32) < <8) | ((i[0] as u32) < <0)
            }).collect();

            let mut a = a0;
            let mut b = b0;
            let mut c = c0;
            let mut d = d0;

            // round 1
            for i in (0.16).step_by(4) {
                a = ff(a, b, c, d, m[i].wrapping_add(T[i]), 7);
                d = ff(d, a, b, c, m[i + 1].wrapping_add(T[i + 1]), 12);
                c = ff(c, d, a, b, m[i + 2].wrapping_add(T[i + 2]), 17);
                b = ff(b, c, d, a, m[i + 3].wrapping_add(T[i + 3]), 22);
            }

            // round 2
            let mut t = 1;
            for i in (0.16).step_by(4) {
                a = gg(a, b, c, d, m[t & 0x0f].wrapping_add(T[16 + i]), 5);
                d = gg(d, a, b, c, m[(t + 5) & 0x0f].wrapping_add(T[16 + i + 1]), 9);
                c = gg(c, d, a, b, m[(t + 10) & 0x0f].wrapping_add(T[16 + i + 2]), 14);
                b = gg(b, c, d, a, m[(t + 15) & 0x0f].wrapping_add(T[16 + i + 3]), 20);
                t += 20;
            }

            // round 3
            t = 5;
            for i in (0.16).step_by(4) {
                a = hh(a, b, c, d, m[t & 0x0f].wrapping_add(T[32 + i]), 4);
                d = hh(d, a, b, c, m[(t + 3) & 0x0f].wrapping_add(T[32 + i + 1]), 11);
                c = hh(c, d, a, b, m[(t + 6) & 0x0f].wrapping_add(T[32 + i + 2]), 16);
                b = hh(b, c, d, a, m[(t + 9) & 0x0f].wrapping_add(T[32 + i + 3]), 23);
                t += 12;
            }

            // round 4
            t = 0;
            for i in (0.16).step_by(4) {
                a = ii(a, b, c, d, m[t & 0x0f].wrapping_add(T[48 + i]), 6);
                d = ii(d, a, b, c, m[(t + 7) & 0x0f].wrapping_add(T[48 + i + 1]), 10);
                c = ii(c, d, a, b, m[(t + 14) & 0x0f].wrapping_add(T[48 + i + 2]), 15);
                b = ii(b, c, d, a, m[(t + 21) & 0x0f].wrapping_add(T[48 + i + 3]), 21);
                t += 28;
            }

            a0 = a0.wrapping_add(a);
            b0 = b0.wrapping_add(b);
            c0 = c0.wrapping_add(c);
            d0 = d0.wrapping_add(d);
        }

        // output
        let mut result = String::new();
        for v in &[a0, b0, c0, d0] {
            result.push_str(&format!(
                "{:02x}{:02x}{:02x}{:02x}",
                (v >> 0) as u8,
                (v >> 8) as u8,
                (v >> 16) as u8,
                (v >> 24) as u8)
            )
        }
        result
    }
}

#[cfg(test)]
mod test {
    use crate::md5::MD5;

    #[test]
    fn test() {
        println!("md5([empty string]) = {}", MD5::hash("".as_bytes()));
        println!("md5([The quick brown fox jumps over the lazy dog]) = {}", MD5::hash("The quick brown fox jumps over the lazy dog".as_bytes())); }}Copy the code

One day

Here is an MD5 algorithm I modified (I have not verified the collision resistance), can you restore the algorithm to calculate the MD5 of 1629547200?

Link: pan.baidu.com/s/1Y-M-TOel… Extraction code: 9PO4

Tip: this APP has no reverse debugging, do not hook the answer, try to restore the algorithm.

The resources

  • rfc1321
  • MD5 – Wikipedia