Hashing
In cryptography, hashing is the transformation
of input data of any length into a fixed-length string of numbers and characters
using a mathematical function. Unlike encryption, hash functions are one-way
functions: once data is transformed into its hash value, the hash
cannot be decrypted back to the original data. Hash functions are extremely
useful in computer science and in the field of cybersecurity.
Why Are Hash Values Irreversible?
f(x)=x%6
f(305)=305%6=5
f(245)=245%6=5
f(1211)=1211%6=5
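In this example, many values of x satisfy f(x) = 5, so the output alone does not reveal which input produced it. This many-to-one behaviour is what makes hashing different from encryption, where the original value must be recoverable. A real cryptographic hash function goes further than this toy example: here it is trivial to find some x with f(x) = 5, whereas for a cryptographic hash, finding any matching input at all must be computationally infeasible.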
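The toy function can be checked in a few lines of Python. This is only a sketch: the name `f` mirrors the example, and the search bound of 50 is an arbitrary assumption.

```python
def f(x: int) -> int:
    """Toy 'hash' from the example: the remainder of x divided by 6."""
    return x % 6

# The three inputs from the example all map to the same output.
outputs = [f(305), f(245), f(1211)]

# Many-to-one: collect every x below 50 (an arbitrary bound) with f(x) == 5.
preimages = [x for x in range(50) if f(x) == 5]

print(outputs)    # [5, 5, 5]
print(preimages)  # [5, 11, 17, 23, 29, 35, 41, 47]
```

Seeing the output 5 gives no way to tell which of these inputs was used.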
To be an effective cryptographic tool, a hash function is designed to have three fundamental security properties.
- Preimage resistant – Given H it should be hard to find M such that H = hash(M). This means it should be computationally hard and time-consuming for an attacker to find the original data from a hash value.
- Second preimage resistant – Given an input m1, it should be hard to find another input m2 (not equal to m1) such that hash(m1) = hash(m2). This means that when a message and its hash are given, it should be computationally infeasible to find another message with the same hash. This protects against an attacker who has a value and its hash and wants to substitute a different value for the legitimate one.
- Collision-resistant – It should be hard to find two different messages m1 and m2 such that hash(m1) = hash(m2). This means it is difficult to find two distinct inputs to the hash function that result in the same hash.
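The gap between second preimage resistance and collision resistance can be made visible with a deliberately weakened hash. The sketch below is an illustration under stated assumptions: `tiny_hash` (SHA-256 truncated to 16 bits) and both search functions are hypothetical helpers, and the truncation exists only so that the brute-force searches finish quickly.

```python
import hashlib
from itertools import count

def tiny_hash(message: bytes) -> str:
    """First 4 hex characters (16 bits) of SHA-256; deliberately weak."""
    return hashlib.sha256(message).hexdigest()[:4]

def find_collision():
    """Collision search: the attacker is free to choose both messages."""
    seen = {}
    for i in count():
        m = str(i).encode()
        h = tiny_hash(m)
        if h in seen:
            return seen[h], m
        seen[h] = m

def find_second_preimage(m1: bytes) -> bytes:
    """Second preimage search: m1 is fixed, only m2 can be chosen."""
    target = tiny_hash(m1)
    for i in count():
        m2 = str(i).encode()
        if m2 != m1 and tiny_hash(m2) == target:
            return m2

a, b = find_collision()
m2 = find_second_preimage(b"hello")
print(a, b, tiny_hash(a), tiny_hash(b))
print(m2, tiny_hash(m2), tiny_hash(b"hello"))
```

With a 16-bit output, the collision search typically succeeds after a few hundred attempts, while the second preimage search typically needs tens of thousands. With the full 256-bit output, both searches are far out of reach.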
Uses of Hashing
1. Data Integrity Check
Suppose that you have to store some files in the cloud or send some important files to another person, and you need to ensure that these files are not tampered with by any third party. You can use hashing to check whether any changes have been made to the original files. Before saving a file in the cloud, compute its hash value and keep that value on your local machine. When you later download the file, compute the hash value of the downloaded copy and compare the two hash values to find out whether the file has been tampered with. If you are sending important files to someone else, you can use a hashing algorithm to generate a checksum and send the checksum along so that the receiver can validate the integrity of the files. The receiver applies the same hash function to the files received and compares the two hash values.
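The integrity check described above can be sketched as follows. SHA-256 is an assumption here, since the text does not name an algorithm, and `file_sha256`, `is_unchanged` and the temporary file are hypothetical names used only for illustration.

```python
import hashlib
import tempfile
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_unchanged(path: Path, expected_hex: str) -> bool:
    """Compare the file's current hash with the one saved earlier."""
    return file_sha256(path) == expected_hex

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "report.txt"
    path.write_bytes(b"original contents")
    saved_hash = file_sha256(path)             # computed before uploading

    intact = is_unchanged(path, saved_hash)    # file as originally saved
    path.write_bytes(b"tampered contents")     # simulate third-party tampering
    tampered_detected = not is_unchanged(path, saved_hash)

print(intact, tampered_detected)
```

The check is only meaningful if the saved hash itself is kept somewhere the third party cannot modify.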
2. Password Verification
For user logins, we need passwords to authenticate. Instead of storing the passwords themselves, we can store their hash values. When a password is entered, its hash value is computed and compared with the stored hash. If an intruder breaks into the database, he or she can access only the hash values and cannot derive the passwords from them, since the hash function possesses the property of preimage resistance. Note that hashing the password on the client before sending it does not by itself protect against sniffing: anyone who captures the hash could replay it to log in, so the connection between client and server should also be encrypted.
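A minimal sketch of storing and verifying a password hash is shown below. It goes slightly beyond the text: the random salt, the use of PBKDF2 with SHA-256, and the iteration count are assumptions chosen for illustration, and `hash_password` and `verify_password` are hypothetical helper names.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune for your hardware

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is created at registration."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the entered password with the stored salt and compare."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(candidate, stored_digest)

# Registration: only the salt and digest are stored, never the password.
salt, stored = hash_password("correct horse")

# Login attempts.
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The server never needs to reverse the hash: it repeats the same computation on the entered password and compares the results.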
3. In Blockchain
In blockchains, Merkle trees are a fundamental component and are used by both Bitcoin and Ethereum. A Merkle tree is created by repeatedly hashing pairs of nodes until only one hash is left, called the root hash. The tree is constructed from the bottom up, starting from the hashes of individual transactions (known as transaction IDs). Each leaf node is the hash of a transaction's data, and each non-leaf node is the hash of its child nodes' hashes.
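The bottom-up construction can be sketched in a few lines. This is a simplified illustration, not the exact scheme of any particular blockchain: it assumes single SHA-256 hashing and duplicates the last node when a level has an odd number of nodes, and `merkle_root` is a hypothetical helper name.

```python
import hashlib
from typing import List

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(transactions: List[bytes]) -> bytes:
    """Hash pairs of nodes level by level until one root hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    # Leaf level: the hash of each transaction's data.
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd node out
        # Each parent is the hash of its two children's hashes, concatenated.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

txs = [b"alice pays bob 5", b"bob pays carol 2", b"carol pays dave 1"]
root = merkle_root(txs)
print(root.hex())
```

Changing any single transaction changes its leaf hash, and therefore every hash on the path up to the root.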
In cryptocurrency blockchains, hashing is used to record new transactions and timestamp them, and every block stores the hash of the previous block. If any data in a block is changed, the hash of that block changes and no longer matches the hash stored in the next block, so every later block is invalidated as well. Tampering with a transaction therefore means recomputing all subsequent blocks, which requires a huge amount of computing power and makes reversing a transaction practically infeasible. Hashing is thus one of the core foundations of blockchain immutability and is crucial to maintaining the integrity of the blockchain.
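The way a change in one block breaks the link to the next can be sketched with a toy chain. The block layout (`index`, `data`, `prev_hash`), the JSON encoding and the helper names are assumptions for illustration only; real blockchains use their own block formats and add proof-of-work or similar mechanisms on top.

```python
import hashlib
import json
from typing import Dict, List

def block_hash(block: Dict) -> str:
    """Hash a block's contents; sort_keys makes the encoding deterministic."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def build_chain(payloads: List[str]) -> List[Dict]:
    """Create blocks where each one stores the hash of the previous block."""
    chain = []
    prev = "0" * 64  # placeholder hash for the first block
    for index, data in enumerate(payloads):
        block = {"index": index, "data": data, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain

def first_broken_link(chain: List[Dict]) -> int:
    """Return the index of the first block whose prev_hash is wrong, or -1."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return -1

chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
valid_before = first_broken_link(chain)   # -1: every link matches
chain[0]["data"] = "alice pays bob 500"   # tamper with the first block
broken_at = first_broken_link(chain)      # 1: block 1 no longer links to block 0
print(valid_before, broken_at)
```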
Popular hashing algorithms
SHA-0
This is the first hash algorithm of the Secure Hash Algorithm (SHA) family. It is a 160-bit hash function published by the National Institute of Standards and Technology (NIST) of the U.S.A. in 1993. The algorithm was withdrawn shortly afterwards due to an undisclosed "significant flaw".
MD5
MD5, or message-digest algorithm 5, was designed by Ronald Rivest in 1991 and produces a 128-bit hash value. Although practical collisions are known for this algorithm, it is still widely used in the computing world.
As an example of an MD5 collision, two messages that differ in only 6 characters can both produce the MD5 hash 79054025255fb1a26e4bc422aef54eb4.
SHA-1
This algorithm produces a 160-bit hash and was released in 1995. SHA-1 is similar in design to the MD5 hashing algorithm, but it is slower than MD5. However, this algorithm also shows security vulnerabilities: in 2005 Xiaoyun Wang, Yiqun Lisa Yin and Hongbo Yu published a method for finding collisions in SHA-1 faster than by brute force.
SHA-2
SHA-2 consists of six hash functions (SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256). They are designed using the Merkle-Damgård construction, and Bitcoin currently uses SHA-256.
SHA-3
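This is the latest member of the SHA family and was released in 2015. Its internal structure is significantly different from the rest of the SHA family. Ethereum uses Keccak-256, the original Keccak version on which SHA-3 is based; it differs from the final SHA-3 standard in its padding, so the two produce different hash values.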
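The output sizes of the algorithms above can be checked with Python's standard `hashlib` module. This is a small sketch; the sample message is arbitrary, SHA-0 is left out because it is not among the algorithms `hashlib` guarantees, and MD5 may be unavailable on builds restricted to FIPS-approved algorithms.

```python
import hashlib

message = b"hello world"  # arbitrary sample input
algorithms = ["md5", "sha1", "sha256", "sha512", "sha3_256"]

# digest_size is reported in bytes; multiply by 8 to get bits.
digest_bits = {name: hashlib.new(name, message).digest_size * 8 for name in algorithms}

for name, bits in digest_bits.items():
    print(f"{name:>8}: {bits} bits")
```

The output length is fixed by the algorithm and does not depend on the length of the input message.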



Is there a difference between second preimage resistance and collision resistance? Both are talking about hash(m1) = hash(m2), right?
The difference is in the choice of m1.
• In the first case (second preimage resistance), the attacker is handed a fixed m1 for which he has to find a different m2 with an equal hash. In particular, he can't choose m1.
• In the second case (collision resistance), the attacker can freely choose both messages m1 and m2, with the only requirement that they are different (and hash to the same value).
However, if a hash function is collision-resistant, then it is also second preimage resistant.
Thanks
What is the difference between hashing and salting? Are they the same, or two different concepts?
They are two different concepts. Salting is essentially the addition of random data to an input before it is put through a hash function, and it is most commonly used with passwords. This adds a layer of security to the hashing process, specifically against attacks that rely on precomputed tables of hashes, and it makes identical passwords produce different hash values. The idea is that by adding a salt to a password and then hashing it, you complicate the password-cracking process.
Nicely written, Chamal. Just to clarify: you said we can use hashing to store passwords, and you said that hashing is a one-way function. So how do we check authentication details? I can't understand. Can you explain a bit?
When you hash the password the first time (when the user registers), you store the resulting hash value of the password in the database.
The second time (when the user tries to log in), you use the username to look up the stored details and compute the hash value of the password the user entered. Then you compare the two hash values, the one in the database and the one computed from the entered password, to authenticate the user.