A cryptographic hash function is a one-way mathematical operation that takes a stream of data of any length and returns a fixed-size bit string known as the cryptographic hash value (also called a digest or, loosely, a checksum). For practical purposes this value is unique to the input: any small modification to the file will change it. For example, modifying a single pixel in a photograph will not be noticeable to the human eye, but a cryptographic hash of the picture will return a value that differs from the original.
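The effect of a tiny change can be tried with Python's standard hashlib module. This is only a minimal sketch: SHA-256 is picked as an example algorithm, the 1000 zero bytes are a stand-in for real image data, and the helper names sha256_hex and differing_bits are made up for this illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Stand-in for a photograph: 1000 zero bytes (an assumption for the demo).
original = bytes(1000)

# Flip one single bit in the middle of the data.
changed = bytearray(original)
changed[500] ^= 0x01
modified = bytes(changed)

print("original:", sha256_hex(original))
print("modified:", sha256_hex(modified))
print(
    differing_bits(
        hashlib.sha256(original).digest(),
        hashlib.sha256(modified).digest(),
    ),
    "of 256 digest bits differ",
)
```

Although the inputs differ by one bit, the two digests have nothing visibly in common; with a well-designed hash, roughly half of the output bits are expected to flip.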
Cryptographic hashing algorithms are widely used in computer forensics to show that files have not been tampered with; a hash can be compared to a digital fingerprint. Security-related software and Linux distributions normally come with a published hash value. The user is meant to use a special program to calculate the hash of the file he has just downloaded and make sure that it coincides with the string listed by the developer. If it doesn’t, the file has either been changed by someone or been corrupted accidentally during the download. When two files have the same cryptographic hash value under a strong algorithm it is, for all practical purposes, certain that they are identical, because finding two different files with the same hash is computationally infeasible.
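The download check described above could be sketched as follows. The function names, the choice of SHA-256 and the 1 MiB chunk size are assumptions for the example; use whichever algorithm the developer actually published.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large downloads need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare the file's hash with the string listed by the developer.

    Surrounding whitespace and letter case are ignored, because published
    hashes are often pasted in upper case or with a trailing newline.
    """
    return file_sha256(path) == published_hex.strip().lower()
```

A call such as matches_published("distro.iso", "<hash from the download page>") returns True only when the downloaded file is bit-for-bit the one the developer hashed; the file name here is a placeholder.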
Hashing a file does not mean encrypting it. The cryptographic algorithms used for encryption are entirely different from those used for hashing. Encryption software like TrueCrypt offers two algorithm choices, one for encrypting the data and another for hashing the user's keyfile or password. Another use of cryptographic hashes is password storage. Encryption software does not store user passwords in plain text; it stores a cryptographic hash of the password instead. When the user wants to decrypt the data, the software performs the same operation on the password entered, and if the cryptographic hashes coincide it then decrypts everything.
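A rough sketch of hash-based password checking is shown below. It goes a little beyond what the text describes: it assumes a random salt and the deliberately slow PBKDF2 function from Python's standard library, which is common practice but not necessarily what any particular product does. The function names and the iteration count are illustrative only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for this example


def hash_password(password, salt=None):
    """Return (salt, digest); only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Repeat the hashing with the stored salt and compare the results."""
    _, digest = hash_password(password, salt)
    # compare_digest takes the same time wherever the first mismatch occurs
    return hmac.compare_digest(digest, expected_digest)
```

The salt ensures that two users with the same password end up with different stored hashes, and the iteration count makes guessing passwords in bulk slower.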
SSL certificates are identified by a cryptographic hash (the fingerprint), and certification authorities use a hash algorithm when generating a certificate's signature. Hashing algorithms can also be used to compare text: if the values coincide, and the reference hash comes from a trusted source, the receiver can be confident that the message has not been tampered with. In addition, it is computationally infeasible to recreate the original message from a hash string.
Note: Flaws have been found in the MD5 algorithm. The United States Computer Emergency Readiness Team (US-CERT) considers MD5 broken and unsuitable for use, and it should not be used in SSL certificates or digital signatures. Most U.S. government applications require SHA-2 hash functions (SHA-224, SHA-256, SHA-384, SHA-512). SHA stands for Secure Hash Algorithm, and SHA-2 was designed by the National Security Agency (NSA).
Cryptographic hashes and law enforcement
Law enforcement agencies and RIAA-sponsored investigators use hashing algorithms to track down those sharing illegal files on P2P networks. When law enforcement seizes child pornography images, the photos and videos are hashed automatically and the hash strings are stored in a database. These values are then compared with the hashes of previously seized files to see whether any of them match.
There are USB thumb drives that can be plugged into a computer to scan its hard disk for files whose hash value matches that of a previously seized child pornography file. In a matter of minutes, and without visually inspecting the content, law enforcement personnel can detect this kind of material. The same software helps law enforcement to classify these images: when a new image that is not in the hash database is found, the software marks it for manual inspection.
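In general terms, scanning a disk against a database of known hashes amounts to a set lookup per file. The sketch below shows the idea only; the function names, the use of SHA-256 and the in-memory set are assumptions, and real forensic tools are far more elaborate.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Hash one file in chunks and return the hex digest."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def scan_tree(root, known_hashes):
    """Walk a directory tree and split files into matched and unknown.

    matched: files whose hash appears in known_hashes (a set of hex strings)
    unknown: files not in the database, i.e. candidates for manual review
    """
    matched, unknown = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                digest = file_sha256(path)
            except OSError:
                continue  # unreadable file: skipped in this sketch
            if digest in known_hashes:
                matched.append(path)
            else:
                unknown.append(path)
    return matched, unknown
```

Because looking up a hash in a set takes roughly constant time, the scan is limited by how fast the disk can be read, not by the size of the database.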
Law enforcement also has specialist software that analyses P2P networks, attempting to match the hash of a shared file to one of those in its database of banned child pornography images. With very little supervision it is possible to detect such material. Once a file has been flagged, it is brought to the attention of an officer, who starts the process of tracking down the IP address and gathering further evidence. The one flaw is that if someone modifies one of those photos with a graphics editor, giving it a little more or less brightness, the cryptographic hashes will no longer coincide. Software like ssdeep attempts to plug that gap with a technique known as fuzzy hashing, which can match very similar files. If someone changes a single bit in a file it will still be picked up, although extreme changes will not. The same technique can be used to detect similar malware files.
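To give a feel for why hashing a file in pieces tolerates small edits, here is a toy fixed-block comparison. It is not ssdeep's algorithm: ssdeep uses context-triggered piecewise hashing, which chooses block boundaries from the content itself so that inserted or deleted bytes do not shift every block. The function names and the 64-byte block size are made up for the example.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 64) -> list:
    """Hash each fixed-size block of the data separately."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def similarity(a: bytes, b: bytes, block_size: int = 64) -> float:
    """Return the fraction of block positions whose hashes match (0.0 to 1.0)."""
    hashes_a = block_hashes(a, block_size)
    hashes_b = block_hashes(b, block_size)
    if not hashes_a and not hashes_b:
        return 1.0  # two empty inputs are trivially identical
    same = sum(1 for x, y in zip(hashes_a, hashes_b) if x == y)
    return same / max(len(hashes_a), len(hashes_b))
```

Changing one byte of a 4096-byte input alters only one of its 64 blocks, so the similarity stays high even though the SHA-256 of the whole file is completely different. Inserting a byte near the start, on the other hand, shifts every block and defeats this toy version, which is the weakness that content-defined boundaries are meant to address.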
RIAA-sponsored companies can use cryptographic hashes to track down people sharing copyrighted material on P2P networks too. During their evidence gathering they record the hash value of the shared file. If the case ever goes to court, after the user’s computer has been seized, a match between that hash string and a file on the computer will be solid evidence. Computer forensics software like EnCase can create a cryptographic hash of a computer hard disk as proof that the data has not been tampered with by the time the hard disk reaches the court or the defence attorney.
To make it more difficult for intellectual property rights holders to prosecute violators, BitTorrent and eMule have implemented a peer-to-peer system based on a Distributed Hash Table (DHT), intended to defeat automatic tracking systems (using it requires changing the default settings). Instead of names, DHT uses hash values to index files. This makes it harder for the user to find the files he wants but adds an extra layer of privacy to file sharing. It is not enough to make tracking the infringer impossible, however: DHT does not hide an individual’s identity.
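How a DHT indexes content by hash instead of by name can be sketched with the XOR distance used by Kademlia, the design behind both the BitTorrent DHT and eMule's Kad network. The code below is a simplified, single-process illustration: the function names are hypothetical, node IDs are plain integers, and a real client would query other peers over the network and maintain a routing table.

```python
import hashlib


def key_for(data: bytes) -> int:
    """Derive a 160-bit key from some bytes using SHA-1, as an integer."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def closest_nodes(key: int, node_ids, k: int = 2) -> list:
    """Return the k node IDs closest to the key under the XOR metric.

    In a Kademlia-style DHT, these are the nodes responsible for storing
    the list of peers that hold the content identified by the key.
    """
    return sorted(node_ids, key=lambda node_id: node_id ^ key)[:k]


# Demo with made-up node names and a made-up content identifier.
nodes = [key_for(name) for name in (b"node-a", b"node-b", b"node-c", b"node-d")]
content_key = key_for(b"example content identifier")
for node_id in closest_nodes(content_key, nodes):
    print(format(node_id, "040x"))
```

Nothing in the lookup involves a file name, only the hash, which is why searching by name is harder. The nodes that answer still see the IP address of whoever asks, so the scheme does not provide anonymity.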
List of free hash and checksum calculators
To cryptographically hash a file you will need special software. Select what you would like to hash, anything from a tiny file up to a full hard disk, choose an algorithm and hash it. The same software can also verify that two hash values coincide (aka an integrity check). If you do not want to download software, websites like Hashemall allow you to compute hashes online.
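For readers comfortable with a little scripting, what these calculators do can be approximated with Python's standard library. The sketch below computes several digests plus a CRC32 in a single pass over the file; the function name multi_hash and the default list of algorithms are choices made for this example.

```python
import hashlib
import zlib


def multi_hash(path, algorithms=("md5", "sha1", "sha256", "sha512"),
               chunk_size=1 << 20):
    """Read the file once and feed every chunk to all hashers.

    Returns a dict mapping algorithm name to hex digest, plus a "crc32" entry.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # running CRC32 over the chunks
            for hasher in hashers.values():
                hasher.update(chunk)
    results = {name: hasher.hexdigest() for name, hasher in hashers.items()}
    results["crc32"] = format(crc & 0xFFFFFFFF, "08x")
    return results
```

CRC32 and MD5 are still handy for spotting accidental corruption, but only the SHA-2 values should be relied on when deliberate tampering is a concern.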
FeeBooti: This free cryptographic hash value generator can compute all the common hashing algorithms (CRC32, MD5, Whirlpool, RipeMD160, SHA512, etc). It has a simple interface, computes integrity checksums for files of unlimited size, can calculate checksums with several algorithms simultaneously, copies hash values to the Windows clipboard and integrates into the Windows property pages.
Multihasher: Portable hash value calculator supporting CRC32, MD5, SHA1, SHA256, SHA384 and SHA512. It can be used for hash file verification and can upload files to VirusTotal, querying its database to find out whether a file is malware. Multihasher integrates with the Windows Explorer context menu and supports Unicode characters, file drag and drop and much more.
HashGenerator: Beginner-friendly application that can be installed or used as a portable program. To generate the hash of a file you simply right-click on it and use the context menu options, or use the drag and drop feature. It computes 14 different types of checksums and can export a list of hashes to an HTML or .txt file.
MD5Deep: Open source command-line hashing tool for Windows that can also be compiled for other systems like Linux and BSD. MD5Deep can compute MD5, SHA-1, SHA256, Tiger and Whirlpool message digests, can process regular files or block devices, and can recursively dig through a directory structure. This tool is best avoided by beginners.