Series: A Python Module per Week


The hashlib module defines an API for accessing different cryptographic hash algorithms. To use a specific hash algorithm, first create a hash object with the appropriate constructor function or with new(). After that, the objects use the same API no matter which algorithm is used.
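As a minimal sketch of that shared API (the input bytes and the choice of algorithms are arbitrary illustrations, not the sample data used later in this article), the following creates hash objects both ways and calls the same methods on each:

```python
import hashlib

data = b'hello world'  # arbitrary illustrative input

# Create hash objects two ways: a named constructor and new().
by_constructor = hashlib.sha256()
by_name = hashlib.new('sha256')

# Both objects expose the same update()/hexdigest() methods.
by_constructor.update(data)
by_name.update(data)

same_digest = by_constructor.hexdigest() == by_name.hexdigest()
print(same_digest)

# Objects for other algorithms have the same interface;
# only the size of the resulting digest differs.
sizes = {}
for name in ('sha224', 'sha256', 'sha512'):
    h = hashlib.new(name)
    h.update(data)
    sizes[name] = h.digest_size
    print(name, h.digest_size, len(h.hexdigest()))
```

The hexadecimal string is always twice as long as digest_size, since each byte is rendered as two hex characters.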

Hash algorithm

Since hashlib is backed by OpenSSL, all of the algorithms provided by that library are available, including:

  • MD5
  • SHA1
  • SHA224
  • SHA256
  • SHA384
  • SHA512

Some algorithms are available on all platforms, and some depend on the underlying libraries. For the list of each, look at the algorithms_guaranteed and algorithms_available attributes, respectively.

import hashlib


print('Guaranteed:\n{}\n'.format(', '.join(sorted(hashlib.algorithms_guaranteed))))
print('Available:\n{}'.format(', '.join(sorted(hashlib.algorithms_available))))

# output
# Guaranteed:
# blake2b, blake2s, md5, sha1, sha224, sha256, sha384, sha3_224,
# sha3_256, sha3_384, sha3_512, sha512, shake_128, shake_256
# 
# Available:
# BLAKE2b512, BLAKE2s256, MD4, MD5, MD5-SHA1, RIPEMD160, SHA1,
# SHA224, SHA256, SHA384, SHA512, blake2b, blake2b512, blake2s,
# blake2s256, md4, md5, md5-sha1, ripemd160, sha1, sha224, sha256,
# sha384, sha3_224, sha3_256, sha3_384, sha3_512, sha512,
# shake_128, shake_256, whirlpool

Sample data

All examples in this section use the same sample data:

# hashlib_data.py 

import hashlib

lorem = '''Lorem ipsum dolor sit amet, consectetur adipisicing
elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis
aute irure dolor in reprehenderit in voluptate velit esse cillum
dolore eu fugiat nulla pariatur. Excepteur sint occaecat
cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.'''

MD5 sample

To compute an MD5 hash or digest of a block of data (in this case, a Unicode string converted to a byte string), first create a hash object, then add the data and call digest() or hexdigest().

import hashlib

from hashlib_data import lorem

h = hashlib.md5()
h.update(lorem.encode('utf-8'))
print(h.hexdigest())	# 3f2fd2c9e25d60fb0fa5d593b802b7a8

This example uses the hexdigest() method instead of digest() because it returns the digest as a string of hexadecimal digits that can be printed cleanly. If a binary digest value is acceptable, use digest().
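To make the difference between the two methods concrete, here is a small sketch (the input bytes are an arbitrary illustration) that compares the values they return for the same hash object:

```python
import hashlib

h = hashlib.md5()
h.update(b'sample bytes')  # arbitrary illustrative input

raw = h.digest()      # bytes object holding the binary digest
text = h.hexdigest()  # str holding the same digest as hex digits

print(type(raw).__name__, len(raw))
print(type(text).__name__, len(text))

# Converting the binary digest to hex gives the hexdigest() value.
matches = raw.hex() == text
print(matches)
```

For MD5 the binary digest is 16 bytes long, so the hexadecimal form has 32 characters.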

SHA1 sample

The SHA1 digest is calculated in the same way.

import hashlib

from hashlib_data import lorem

h = hashlib.sha1()
h.update(lorem.encode('utf-8'))
print(h.hexdigest())	# ea360b288b3dd178fe2625f55b2959bf1dba6eef

The digest value is different in this example because the algorithm is changed from MD5 to SHA1.

Create a hash by name

Sometimes it is more convenient to refer to an algorithm by name in a string than to use a constructor function directly. It is useful, for example, to be able to store the hash type in a configuration file. In those cases, use new() to create the hash object.

# hashlib_new.py 

import argparse
import hashlib
import sys

from hashlib_data import lorem


parser = argparse.ArgumentParser('hashlib demo')
parser.add_argument(
    'hash_name',
    choices=hashlib.algorithms_available,
    help='the name of the hash algorithm to use',
)
parser.add_argument(
    'data',
    nargs='?',
    default=lorem,
    help='the input data to hash, defaults to lorem ipsum',
)
args = parser.parse_args()

h = hashlib.new(args.hash_name)
h.update(args.data.encode('utf-8'))
print(h.hexdigest())

# output
# $ python3 hashlib_new.py sha1
# ea360b288b3dd178fe2625f55b2959bf1dba6eef
#
# $ python3 hashlib_new.py sha256
# 3c887cc71c67949df29568119cc646f46b9cd2c2b39d456065646bc2fc09ffd8
#
# $ python3 hashlib_new.py sha512
# a7e53384eb9bb4251a19571450465d51809e0b7046101b87c4faef96b9bc904cf7f90
# 035f444952dfd9f6084eeee2457433f3ade614712f42f80960b2fca43ff
#
# $ python3 hashlib_new.py md5
# 3f2fd2c9e25d60fb0fa5d593b802b7a8
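The configuration-file use case mentioned above might look roughly like the following sketch. The configuration text, the section name "hashing", and the option name "algorithm" are all hypothetical, chosen only for illustration:

```python
import configparser
import hashlib

# Hypothetical configuration; section and option names are illustrative.
CONFIG_TEXT = """
[hashing]
algorithm = sha256
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)
name = config.get('hashing', 'algorithm')

# Check the name before use so a typo in the configuration
# produces a clear error message.
if name not in hashlib.algorithms_available:
    raise ValueError('unsupported hash algorithm: {}'.format(name))

h = hashlib.new(name)
h.update(b'data to hash')  # arbitrary illustrative input
print(name, h.hexdigest())
```

Checking against algorithms_available is optional, since new() itself raises ValueError for an unknown name, but the explicit check allows a more descriptive message.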

Incremental updating

The update() method of a hash object can be called repeatedly. Each time, the digest is updated based on the additional data fed in. Updating incrementally is more efficient than reading an entire file into memory, and produces the same result.

import hashlib

from hashlib_data import lorem

h = hashlib.md5()
h.update(lorem.encode('utf-8'))
all_at_once = h.hexdigest()


def chunkize(size, text):
    """Return parts of the text in size-based increments."""
    start = 0
    while start < len(text):
        chunk = text[start:start + size]
        yield chunk
        start += size


h = hashlib.md5()
for chunk in chunkize(64, lorem.encode('utf-8')):
    h.update(chunk)
line_by_line = h.hexdigest()

print('All at once :', all_at_once)	# All at once : 3f2fd2c9e25d60fb0fa5d593b802b7a8
print('Line by line:', line_by_line)  # Line by line: 3f2fd2c9e25d60fb0fa5d593b802b7a8
print('Same :', (all_at_once == line_by_line))	# Same : True
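The same idea applies when the data comes from a file. The sketch below assumes a hypothetical helper named hash_file and an arbitrary 64 KiB chunk size. It writes illustrative data to a temporary file, hashes it chunk by chunk, and compares the result with hashing the whole payload at once:

```python
import hashlib
import os
import tempfile


def hash_file(path, algorithm='sha256', chunk_size=65536):
    """Hypothetical helper: hash a file without loading it all into memory."""
    h = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        # iter() with a sentinel calls f.read() until it returns b''.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()


payload = b'0123456789' * 100000  # about 1 MB of illustrative data
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    chunked = hash_file(tmp_path)
finally:
    os.remove(tmp_path)

all_at_once = hashlib.sha256(payload).hexdigest()
print('Same :', chunked == all_at_once)
```

Only one chunk is held in memory at a time, so memory use stays bounded regardless of the file size.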

Related documents:

Pymotw.com/3/hashlib/i…