
# Secure Hashing with Python Hashlib

This tutorial will teach you how to create secure hashes using built-in functionality from Python’s hashlib module.

Understanding the significance of hashing and how to programmatically compute secure hashes can be helpful—even if you do not work in application security. But why?

Well, when working on Python projects, you’ll likely come across instances where you need to store passwords and other sensitive info in databases or source code files. In such cases, it’s safer to run a hashing algorithm on the sensitive info and store the hash instead of the information itself.

In this guide, we’ll cover what hashing is and how it is different from encryption. We’ll also go over the properties of secure hash functions. Then, we’ll use common hashing algorithms to compute the hash of plaintext in Python. To do this, we’ll use the built-in hashlib module.

For all of this and more, let’s get started!

## What Is Hashing?

Hashing takes in a message string and gives a fixed-length output called the hash. This means the length of the output hash for a given hashing algorithm is fixed, regardless of the length of the input. But how is it different from encryption?

In encryption, the message or plain text is encrypted using an encryption algorithm that gives an encrypted output. We can then run the decryption algorithm on the encrypted output to get back the message string.

However, hashing works differently. We just learned that the process of encryption is invertible in that you can go from the encrypted message to the unencrypted message and vice versa.

Unlike encryption, hashing is not an invertible process, meaning we cannot go from the hash to the input message.
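To make the fixed-length property concrete, here is a minimal sketch that uses the hashlib module (covered later in this guide). The sample inputs are arbitrary and chosen only for illustration.

```python
import hashlib

# Inputs of very different lengths (arbitrary, for illustration only).
inputs = [b"", b"a", b"Geekflare is awesome!", b"x" * 10_000]

for data in inputs:
    digest = hashlib.sha256(data).hexdigest()
    # A SHA256 digest is always 32 bytes, i.e. 64 hexadecimal characters.
    print(f"input length {len(data):>6} -> digest length {len(digest)}")
```

Every line reports a digest length of 64 hexadecimal characters, no matter how long the input is.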

### Properties of Hash Functions

Let’s quickly go over some properties that hash functions should satisfy:

• Deterministic: Hash functions are deterministic. Given a message m, the hash of m is always the same.
• Preimage Resistant: We’ve already covered this when we said hashing is not an invertible operation. The preimage resistance property states that it’s infeasible to find the message `m` from the output hash.
• Collision Resistant: It should be difficult (or computationally infeasible) to find two different message strings `m1` and `m2` such that the hash of `m1` is equal to the hash of `m2`. This property is called collision resistance.
• Second Preimage Resistant: This means given a message `m1` and the corresponding hash `hash(m1)`, it’s infeasible to find a different message `m2` such that `hash(m1) = hash(m2)`.
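Of the properties above, only determinism can be checked directly by running code; the resistance properties are claims about how hard an attack is. The sketch below checks determinism with SHA256. The helper name `sha256_hex` is a hypothetical name introduced here for illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA256 hex digest of a string (hypothetical helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

m = "hello"
first = sha256_hex(m)
second = sha256_hex(m)

print(first == second)                # True: same message, same hash
print(first == sha256_hex("hello!"))  # False: a different message gives a different hash
```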

## Python’s hashlib Module

Python’s built-in hashlib module provides implementations of several hashing and message digest algorithms, including the SHA and MD5 algorithms.

To use the constructors and built-in functions from the Python hashlib module, you can import it into your working environment like so:

``import hashlib``

The hashlib module provides the `algorithms_available` and `algorithms_guaranteed` constants, which denote the set of algorithms whose implementations are available and are guaranteed on a platform, respectively.

Therefore, `algorithms_guaranteed` is a subset of `algorithms_available`.

Start a Python REPL, import hashlib and access the `algorithms_available` and `algorithms_guaranteed` constants:

``>>> hashlib.algorithms_available``
``````# Output
{'md5', 'md5-sha1', 'sha3_256', 'shake_128', 'sha384', 'sha512_256', 'sha512', 'md4',
'shake_256', 'whirlpool', 'sha1', 'sha3_512', 'sha3_384', 'sha256', 'ripemd160', 'mdc2',
'sha512_224', 'blake2s', 'blake2b', 'sha3_224', 'sm3', 'sha224'}``````
``>>> hashlib.algorithms_guaranteed``
``````# Output
{'md5', 'shake_256', 'sha3_256', 'shake_128', 'blake2b', 'sha3_224', 'sha3_384',
'sha384', 'sha256', 'sha1', 'sha3_512', 'sha512', 'blake2s', 'sha224'}``````

We see that `algorithms_guaranteed` is indeed a subset of `algorithms_available`.
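Rather than comparing the two sets by eye, we can let Python do the check. This is a small sketch; the exact contents of `algorithms_available` depend on your platform, so the second line of output will vary.

```python
import hashlib

guaranteed = hashlib.algorithms_guaranteed
available = hashlib.algorithms_available

# Set comparison: is every guaranteed algorithm also available?
print(guaranteed.issubset(available))

# Algorithms that are available here but not guaranteed on every platform.
print(sorted(available - guaranteed))
```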

## Create Hash Objects in Python

Next, let’s learn how to create hash objects in Python. We’ll compute the SHA256 hash of a message string using the following methods:

• The generic `new()` constructor
• Algorithm-Specific Constructors

### Using the new() Constructor

Let’s initialize the `message` string:

``>>> message = "Geekflare is awesome!"``

To instantiate the hash object, we can use the `new()` constructor and pass in the name of the algorithm as shown:

``>>> sha256_hash = hashlib.new("SHA256")``

We can now call the `update()` method on the hash object with the `message` string as the argument:

``>>> sha256_hash.update(message)``

If you do so, you’ll run into an error as hashing algorithms can only work with byte strings.

``````Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Unicode-objects must be encoded before hashing``````

To get the encoded string, you can call the `encode()` method on the message string, and then use it in the `update()` method call. After doing so, you can call the `hexdigest()` method to get the SHA256 hash corresponding to the message string.

``````sha256_hash.update(message.encode())
sha256_hash.hexdigest()``````

Instead of encoding the message string using the `encode()` method, you can also define it as a string of bytes by prefixing the string with `b` like so:

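``````message = b"Geekflare is awesome!"
sha256_hash = hashlib.new("SHA256")  # fresh object: update() on the old one would append to its data
sha256_hash.update(message)
sha256_hash.hexdigest()``````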

The obtained hash is the same as the previous hash, which confirms the deterministic nature of hash functions.
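One detail worth keeping in mind: repeated `update()` calls on the same hash object feed more data into it, so the result is the hash of everything passed so far. The sketch below shows this with the message split into two arbitrary parts.

```python
import hashlib

part1 = b"Geekflare is "
part2 = b"awesome!"

# Feed the message in two pieces.
incremental = hashlib.sha256()
incremental.update(part1)
incremental.update(part2)

# Feed the whole message at once.
whole = hashlib.sha256()
whole.update(part1 + part2)

print(incremental.hexdigest() == whole.hexdigest())  # True
```

So if you want the hash of a new message, create a new hash object instead of reusing one that has already been updated.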

In addition, a small change in the `message` string should cause the hash to change drastically (also known as “avalanche effect”).

To verify this, let’s change the ‘a’ in ‘awesome’ to ‘A’, and compute the hash:

``````message = "Geekflare is Awesome!"
h1 = hashlib.new("SHA256")
h1.update(message.encode())
h1.hexdigest()
# Output: '3c67f334cc598912dc66464f77acb71d88cfd6c8cba8e64a7b749d093c1a53ab'``````

We see that the hash changes completely.
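To put a rough number on "changes completely", we can count how many bits differ between the two digests. This is an illustrative sketch, not a formal test of the avalanche effect.

```python
import hashlib

d1 = hashlib.sha256(b"Geekflare is awesome!").digest()
d2 = hashlib.sha256(b"Geekflare is Awesome!").digest()

# XOR each pair of bytes and count the 1 bits, i.e. the positions that differ.
differing_bits = sum(bin(x ^ y).count("1") for x, y in zip(d1, d2))
total_bits = len(d1) * 8

print(f"{differing_bits} of {total_bits} bits differ")
```

For a well-behaved hash function, you would expect roughly half of the bits to differ, even though the two inputs differ in a single character.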

### Using the Algorithm-Specific Constructor

In the previous example, we used the generic `new()` constructor and passed in “SHA256” as the name of the algorithm to create the hash object.

Instead of doing so, we can also use the `sha256()` constructor as shown:

``````sha256_hash = hashlib.sha256()
message = "Geekflare is awesome!"
sha256_hash.update(message.encode())
sha256_hash.hexdigest()``````

The output hash is identical to the hash we obtained earlier for the `message` string “Geekflare is awesome!”.
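We can confirm this equivalence in code. The sketch below also uses the fact that hashlib constructors accept the initial data as an argument, which saves the separate `update()` call.

```python
import hashlib

data = "Geekflare is awesome!".encode()

# Generic constructor
via_new = hashlib.new("sha256")
via_new.update(data)

# Algorithm-specific constructor
via_constructor = hashlib.sha256()
via_constructor.update(data)

# Algorithm-specific constructor with the data passed in directly
one_shot = hashlib.sha256(data)

assert via_new.hexdigest() == via_constructor.hexdigest() == one_shot.hexdigest()
print(one_shot.hexdigest())
```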

## Exploring Attributes of Hash Objects

The hash objects have a few useful attributes:

• The `digest_size` attribute denotes the size of the digest in bytes. For example, the SHA256 algorithm returns a 256-bit hash, which is equivalent to 32 bytes.
• The `block_size` attribute refers to the block size used in the hashing algorithm.
• The `name` attribute is the name of the algorithm that we can use in the `new()` constructor. Looking up the value of this attribute can be helpful when the hash objects don’t have descriptive names.

We can check these attributes for the `sha256_hash` object we created earlier:

``````>>> sha256_hash.digest_size
32
>>> sha256_hash.block_size
64
>>> sha256_hash.name
'sha256'``````
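These attributes are also handy for comparing algorithms. Here is a short sketch that loops over a few of the guaranteed algorithms; the selection is arbitrary.

```python
import hashlib

for algo in ("sha1", "sha256", "sha512"):
    h = hashlib.new(algo)
    print(f"{h.name}: digest_size={h.digest_size} bytes, block_size={h.block_size} bytes")
```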

Next, let’s look at some interesting applications of hashing using Python’s hashlib module.

## Practical Examples of Hashing

### Verifying Integrity of Software and Files

As developers, we download and install software packages all the time. This is true regardless of whether you’re working on a Linux distro, Windows, or a Mac.

However, some mirrors for software packages may not be trustworthy. You can find the hash (or checksum) beside the download link. And you can verify the integrity of the downloaded software by computing the hash and comparing it with the official hash.

This can be applied to files on your machine as well. Because even the smallest change in file contents changes the hash drastically, you can check if a file has been modified by verifying the hash.

Here’s a simple example. Create a text file ‘my_file.txt’ in the working directory, and add some content to it.

``````$ cat my_file.txt
This is a sample text file.
We are  going to compute the SHA256 hash of this text file and also
check if the file has been modified by
recomputing the hash.``````

You can then open the file in read binary mode (`'rb'`), read in the contents of the file and compute the SHA256 hash as shown:

``````>>> import hashlib
>>> with open("my_file.txt","rb") as file:
...     file_contents = file.read()  # read the whole file as bytes
...     sha256_hash = hashlib.sha256()
...     sha256_hash.update(file_contents)
...     original_hash = sha256_hash.hexdigest()``````

Here, the variable `original_hash` is the hash of ‘my_file.txt’ in its current state.

``````>>> original_hash``````

Now modify the file ‘my_file.txt’. You can remove the extra leading whitespace before the word ‘going’. 🙂

Compute the hash yet again and store it in the `computed_hash` variable.

``````>>> import hashlib
>>> with open("my_file.txt","rb") as file:
...     file_contents = file.read()  # read the modified file as bytes
...     sha256_hash = hashlib.sha256()
...     sha256_hash.update(file_contents)
...     computed_hash = sha256_hash.hexdigest()``````

You can then add a simple assert statement that checks whether the `computed_hash` is equal to the `original_hash`.

``>>> assert computed_hash == original_hash``

If the file is modified (which is true in this case), you should get an AssertionError:

``````Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AssertionError``````
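The steps above can be wrapped in a reusable function. The sketch below is one possible approach: it reads the file in chunks so that large downloads do not have to fit in memory. The helper name `sha256_of_file`, the chunk size, and the temporary demo file are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Return the SHA256 hex digest of a file, read in chunks (hypothetical helper)."""
    sha256_hash = hashlib.sha256()
    with open(path, "rb") as file:
        # iter() with a sentinel stops once read() returns empty bytes at end of file.
        for chunk in iter(lambda: file.read(chunk_size), b""):
            sha256_hash.update(chunk)
    return sha256_hash.hexdigest()

# Demo on a throwaway temporary file so no real file is touched.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"We are  going to compute the SHA256 hash.\n")
    demo_path = tmp.name

original_hash = sha256_of_file(demo_path)

# Modify the file: remove the extra space before 'going'.
with open(demo_path, "wb") as file:
    file.write(b"We are going to compute the SHA256 hash.\n")

computed_hash = sha256_of_file(demo_path)
print("modified" if computed_hash != original_hash else "unchanged")

os.remove(demo_path)  # clean up the demo file
```

To verify a downloaded package, you could compare the value returned by such a function against the checksum published next to the download link.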

You can also use hashing when storing sensitive info, such as passwords, in databases: store the hash rather than the password itself. To authenticate a user, validate the hash of the entered password against the stored hash of the correct password.
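Here is a hedged sketch of that store-and-validate flow. Note that it swaps in `hashlib.pbkdf2_hmac()` with a random salt instead of a plain SHA256 hash, because a fast, unsalted hash is much easier to brute-force for passwords. The function names and the iteration count are assumptions chosen for illustration, not recommendations from this tutorial.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative value; tune for your own requirements

def hash_password(password, salt=None):
    """Return (salt, derived_key) for a password (hypothetical helper)."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, expected_key):
    """Recompute the key with the stored salt and compare (hypothetical helper)."""
    _, key = hash_password(password, salt)
    # compare_digest compares in a way that resists timing attacks.
    return hmac.compare_digest(key, expected_key)

# Store the salt and key, never the password itself.
salt, stored_key = hash_password("correct horse")

print(verify_password("correct horse", salt, stored_key))  # True
print(verify_password("wrong guess", salt, stored_key))    # False
```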

## Conclusion

I hope this tutorial helped you learn about generating secure hashes with Python. Here are the key takeaways:

• Python’s hashlib module provides ready-to-use implementations of several hashing algorithms. You can get the list of algorithms guaranteed on your platform using `hashlib.algorithms_guaranteed`.
• To create a hash object, you can use the generic `new()` constructor with the syntax: `hashlib.new("algo-name")`. Alternatively, you can use the constructors corresponding to the specific hashing algorithms, like so: `hashlib.sha256()` for the SHA256 hash.
• After initializing the message string to be hashed and the hash object, you can call the `update()` method on the hash object, followed by the `hexdigest()` method to get the hash.
• Hashing can come in handy when checking the integrity of software artifacts and files, storing sensitive info in databases, and more.
