Rob’s Toolbox: md5sum and sha256sum

Greetings everyone. In this post I’ll be discussing what hashing is and why it matters. Hashing is the process of turning input of any length, such as a string of characters or a whole file, into a fixed-length value. It is useful in digital forensics as well as for storing passwords on computer systems. When a user account is created, the operating system does not store the clear text password, i.e. the password exactly as the user typed it in. For example, if a user chooses “password” as his password, the operating system does not store the word “password”. Instead, the operating system runs the typed-in password through a hashing algorithm and stores the resulting hash. When the user later attempts to log in, the password he types is hashed, and that hash is compared to the stored hash that was made when the account was created. If the hashes match, the user is granted access.
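
The store-then-compare flow described above can be sketched in the shell with sha256sum. This is only an illustration: the variable names are made up for the example, and real operating systems typically add a salt and use a purpose-built password-hashing scheme rather than a single plain SHA-256.

```shell
# Account creation: hash the chosen password and keep only the hash.
# printf '%s' avoids hashing a trailing newline along with the password.
stored_hash=$(printf '%s' 'password' | sha256sum | cut -d ' ' -f 1)

# Login: hash whatever the user typed and compare it to the stored hash.
typed_password='password'   # hypothetical login attempt
typed_hash=$(printf '%s' "$typed_password" | sha256sum | cut -d ' ' -f 1)

if [ "$typed_hash" = "$stored_hash" ]; then
    login_result='access granted'
else
    login_result='access denied'
fi
echo "$login_result"
```

Changing typed_password to anything else makes the comparison fail, even though the clear text password is never stored anywhere.
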

Hashing is also used in digital forensics. When evidence is taken, a copy of the original evidence is made for examination. This “working” copy must be exactly the same as the original, and hashing is the way to confirm that it is. First the original evidence is hashed, then the copy is hashed. If the hashes match, the two can be treated as identical bit for bit. When a file is changed even slightly, the resulting hash is radically different from before. (I will show this in the demo later in this post.)
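
As a rough sketch of that check, the commands below hash an original and its working copy and compare the two values. The file names and contents are stand-ins invented for the example, not real evidence handling.

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Stand-in for the original evidence (hypothetical name and contents).
printf 'contents of the seized drive\n' > original.img

# Make the working copy that will actually be examined.
cp original.img working_copy.img

# Hash both files; cut keeps only the hash and drops the file name.
original_hash=$(sha256sum original.img | cut -d ' ' -f 1)
copy_hash=$(sha256sum working_copy.img | cut -d ' ' -f 1)

if [ "$original_hash" = "$copy_hash" ]; then
    verdict='hashes match: the working copy is the same as the original'
else
    verdict='hashes differ: the working copy is NOT the same as the original'
fi
echo "$verdict"
```
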

There are two main hashing algorithms being used in digital forensics:

  • MD5
  • SHA-256

MD stands for message digest. MD5 creates a 128-bit (16-byte) hash value, which is usually shown as a 32-digit hexadecimal number.

SHA stands for secure hash algorithm. SHA-256 creates a 256-bit (32-byte) hash value, which is usually shown as a 64-digit hexadecimal number.
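
The lengths are easy to check from the shell. Each hexadecimal digit encodes 4 bits, so 128 bits come out as 32 digits and 256 bits as 64. The sketch below hashes the three-character input abc (an arbitrary choice for the example) and counts the digits.

```shell
# Hash the same three-byte input with both algorithms.
md5_hex=$(printf '%s' 'abc' | md5sum | cut -d ' ' -f 1)
sha256_hex=$(printf '%s' 'abc' | sha256sum | cut -d ' ' -f 1)

# ${#var} expands to the number of characters stored in the variable.
echo "MD5:     $md5_hex (${#md5_hex} hex digits)"
echo "SHA-256: $sha256_hex (${#sha256_hex} hex digits)"
```
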

For more information on MD5 and SHA-256 visit these pages:

These hashing algorithms are not reversible, meaning that if only the hash is known, there is no way to turn it back into the file it was computed from.

Using an Ubuntu Linux system, I will demo how the tools are used.

The tools that I will use are called md5sum and sha256sum. Both are part of the GNU coreutils package and are usually installed by default.
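
A quick way to confirm that both tools are available is sketched below. It assumes the GNU versions of the tools; other implementations may not accept --version.

```shell
# command -v prints the path of each tool if it is found on the PATH.
command -v md5sum
command -v sha256sum

# --version reports which coreutils release each tool came from.
md5sum --version | head -n 1
sha256sum --version | head -n 1
```
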

First I create a test file.
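
Something along these lines creates the file. The sentence inside it is an assumption made for this sketch. The only details taken from the demo are the file name test.txt and that the text starts with a capital T.

```shell
# Work in a scratch directory (an assumption; any directory will do).
cd "$(mktemp -d)" || exit 1

# Create test.txt with made-up contents that start with a capital T.
printf 'This is a test file.\n' > test.txt

# Show what was written.
cat test.txt
```
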

Next I use the md5sum program to calculate the MD5 hash of the test file. This hash is the digital fingerprint of the test file and is also known as the checksum. The seemingly random string of letters and digits in the output is the checksum of the test.txt file.

Next I use the sha256sum program to calculate the SHA-256 hash of the test file. The output looks much the same as that of md5sum, except that the SHA-256 checksum is longer than the MD5 checksum.
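
The two commands look like this when run against the test file. The file is recreated here with assumed contents so the sketch runs on its own, which means the checksums it prints will not be the ones from the original demo.

```shell
# Recreate the test file in a scratch directory (contents are an assumption).
cd "$(mktemp -d)" || exit 1
printf 'This is a test file.\n' > test.txt

# Each tool prints the checksum, two spaces, then the file name.
md5sum test.txt
sha256sum test.txt
```
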

To show that the checksums change when the file’s content changes, I’m going to slightly change the content of the test file. I used the nano text editor to do this, changing the first letter in the test file from a capital T to a lowercase t.

Lastly I take the hashes of both the original file and the altered file. Notice how both hashes changed. This shows that the file’s content has been changed. In the screenshot, the blue rectangles are the original file’s hashes and the yellow rectangles are the altered file’s hashes.
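
The whole comparison can be reproduced with the sketch below. It makes a few assumptions: the file contents are invented, the original and altered versions are kept as two separate files named original.txt and altered.txt, and sed stands in for the manual edit in nano.

```shell
# Recreate the original file in a scratch directory (contents are an assumption).
cd "$(mktemp -d)" || exit 1
printf 'This is a test file.\n' > original.txt

# Make an altered copy in which only the first letter goes from T to t.
sed '1s/^T/t/' original.txt > altered.txt

# Hash both files with both tools; given several files, each tool prints
# one checksum line per file.
md5sum original.txt altered.txt
sha256sum original.txt altered.txt
```

Although only one character differs between the two files, the checksums printed for altered.txt do not match the ones printed for original.txt.
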

Because the hash changes whenever a file’s content changes, hashing is incredibly useful, if not vital, to digital forensics. Neither the original evidence nor the working copy that is examined can change at all. If either does, the case can be thrown out. Hashing is used to verify that no changes have happened to any of the evidence during the course of an investigation and a case. Thanks for reading.
