Hashing

 



In cryptography, hashing is the transformation of an input of arbitrary length into a fixed-length string of numbers and characters using a mathematical function. Unlike encryption, hash functions are one-way functions: once data has been transformed into its hash value, the hash cannot be "decrypted" back into the original data. Hash functions are extremely useful in computer science and in the field of cybersecurity.
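To make the "fixed-length" point concrete, here is a minimal sketch using Python's standard hashlib module. SHA-256 is chosen only for illustration, and the helper name sha256_hex is hypothetical.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short_input = sha256_hex("a")            # 1-character input
long_input = sha256_hex("a" * 10_000)    # 10,000-character input
neighbour = sha256_hex("b")              # input that differs by one letter

# Both hashes have the same length, whatever the input size.
print(len(short_input), len(long_input))

# A small change in the input gives a completely different hash.
print(short_input)
print(neighbour)
```

Whatever the size of the input, the output always has the same length, and nothing in the output hints at how long the input was.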



Why Are Hash Values Irreversible?


Although hash functions are implemented differently, they are all built from mathematical operations that are easy to compute but hard to invert.
For example, the modulus operator is a very simple (non-cryptographic) hash function.

f(x) = x % 6
f(305) = 305 % 6 = 5
f(245) = 245 % 6 = 5
f(1211) = 1211 % 6 = 5

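In this example, many values of x satisfy f(x) = 5, so the output alone does not tell you which input produced it. In that sense the function is one-way. Modulus on its own is far too weak for cryptographic hashing, because it is trivial to construct some input with a given output, but it illustrates the idea. This is what makes hashing different from encryption: with encryption you must be able to get the original value back.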
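The modulus example can be tried directly. This is only a toy sketch of the many-to-one idea, not a secure hash; the function name f follows the formula in the text.

```python
def f(x: int) -> int:
    """Toy 'hash': the remainder of x divided by 6."""
    return x % 6

# The three inputs from the text all map to the same output.
outputs = [f(x) for x in (305, 245, 1211)]
print(outputs)  # [5, 5, 5]

# Many other inputs also give 5, so the output does not identify the input.
preimages = [x for x in range(100) if f(x) == 5]
print(preimages[:5])  # [5, 11, 17, 23, 29]
```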


To be an effective cryptographic tool, a hash function is designed to have three fundamental security properties.

  1. Preimage resistant – Given a hash value H, it should be hard to find a message M such that H = hash(M). This means it should be computationally infeasible for an attacker to recover the original data from a hash value.
  2. Second preimage resistant – Given an input m1, it should be hard to find another input m2 (not equal to m1) such that hash(m1) = hash(m2). This means that when a message and its hash are given, it should be computationally infeasible to find another message with the same hash. This protects against an attacker who has a value and its hash and wants to substitute a different value for the legitimate one.
  3. Collision resistant – It should be hard to find any two different messages m1 and m2 such that hash(m1) = hash(m2).
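The difference between properties 2 and 3 can be felt with a deliberately weak hash. The sketch below assumes a toy function, tiny_hash (a hypothetical name), that keeps only the first 2 bytes of SHA-256 so that brute force finishes quickly. A real hash function has far too many possible outputs for either search to be practical.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """Toy hash: SHA-256 truncated to n_bytes. Deliberately weak."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision():
    """Collision search: the attacker may choose BOTH messages."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1  # two messages, number of tries
        seen[h] = msg

def find_second_preimage(m1: bytes):
    """Second preimage search: m1 is fixed, only m2 may be chosen."""
    target = tiny_hash(m1)
    for i in count():
        msg = str(i).encode()
        if msg != m1 and tiny_hash(msg) == target:
            return msg, i + 1  # second message, number of tries

a, b, collision_tries = find_collision()
m2, second_preimage_tries = find_second_preimage(b"hello")
print("collision found after", collision_tries, "tries")
print("second preimage found after", second_preimage_tries, "tries")
```

With a 16-bit output, a collision typically turns up after a few hundred tries, while a second preimage typically needs tens of thousands. Finding a collision is the easier task, which is why collision resistance is the stronger requirement.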


Uses of Hashing

1. Data Integrity Check

Suppose that you have to store some files in the cloud or send important files to another person, and you need to be sure that the files have not been tampered with by a third party. You can use hashing to check whether the original files have been changed. Before saving a file in the cloud, compute its hash value and keep it on your local machine. When you later download the file, compute the hash of the downloaded copy and compare the two values: if they differ, the file has been altered. Similarly, if you are sending files to someone else, you can use a hashing algorithm to generate a checksum and send it to the receiver. The receiver applies the same hash function to the files received and compares the two hash values to validate their integrity.
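The check described above could look roughly like this in Python. It is a sketch that assumes SHA-256 as the hashing algorithm; the function names file_sha256 and is_unmodified are illustrative, not part of any library.

```python
import hashlib

def file_sha256(path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_unmodified(path, expected_hex: str) -> bool:
    """Compare the file's current hash with the one saved earlier."""
    return file_sha256(path) == expected_hex.lower()
```

Before uploading you would save the result of file_sha256 for the file, and after downloading you would call is_unmodified with the downloaded copy and the saved value.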

2. Password Verification

User logins need passwords for authentication. Instead of storing the passwords themselves, the server can store their hash values. When a user enters a password, it is sent to the server over an encrypted connection, the server computes its hash, and the result is compared with the stored hash. This way, if an intruder breaks into the database, he or she gets only the hash values. The intruder can neither log in with a hash (the server hashes whatever is submitted) nor easily derive the password from the hash value, since the hash function is preimage resistant.
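A minimal server-side sketch of this idea follows. Note that it goes a step beyond the plain hashing described above: it uses PBKDF2 with a random salt from Python's standard library, which is a common choice for passwords. The iteration count and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value, not a recommendation

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest). A fresh random salt is used if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the submitted password and compare with the stored digest."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)

# Registration: store only the salt and the digest, never the password.
salt, stored = hash_password("correct horse")

# Login: hash what the user typed and compare.
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```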

3. In Blockchain

Merkle trees are a fundamental part of blockchains and are used by both Bitcoin and Ethereum. A Merkle tree is created by repeatedly hashing pairs of nodes until only one hash is left, called the root hash. The tree is constructed from the bottom up, starting from the hashes of individual transactions (known as transaction IDs). Each leaf node is the hash of a transaction's data, and each non-leaf node is the hash of its child nodes' hashes.
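The bottom-up construction can be sketched in a few lines. This is a simplified illustration, not the exact scheme of any particular blockchain: it assumes a single SHA-256 per node and duplicates the last hash when a level has an odd number of nodes. The function names are hypothetical.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(transactions) -> bytes:
    """Compute the root hash of a list of transactions given as bytes."""
    if not transactions:
        raise ValueError("need at least one transaction")
    # Leaf level: the hash of each transaction.
    level = [sha256(tx) for tx in transactions]
    # Hash pairs of nodes until a single hash is left.
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd node
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]

root = merkle_root([b"tx1", b"tx2", b"tx3", b"tx4"])
tampered_root = merkle_root([b"tx1", b"tx2", b"tx3", b"tx4 (edited)"])
print(root.hex())
print(root == tampered_root)  # False
```

Changing any single transaction changes its leaf hash, and that change propagates up to the root, so the root hash alone is enough to detect tampering.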


In cryptocurrency blockchains, hashing is used to record new transactions and timestamp them, and every block stores the hash of the previous block. If any data in a block is changed, the hash of that block changes, which invalidates the next block because it stores the hash of the block before it, and so on along the rest of the chain. Because of the one-way nature of hashing and the huge computing power needed to recompute the chain, it is practically infeasible to reverse a transaction or otherwise tamper with the blockchain. Hashing is therefore one of the core foundations of blockchain immutability and is crucial for maintaining the integrity of the blockchain.
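The way each block commits to the previous one can be illustrated with a toy chain. This sketch leaves out mining, signatures and networking; the block fields (index, data, prev_hash) and the function names are assumptions chosen for the example.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents, serialized with a stable key order."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def build_chain(payloads):
    """Build a list of blocks, each storing the hash of the previous block."""
    chain = []
    prev = "0" * 64  # placeholder hash for the first block
    for index, data in enumerate(payloads):
        block = {"index": index, "data": data, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain

def is_valid(chain) -> bool:
    """Check that every block still matches the hash stored in its successor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = build_chain(["alice->bob: 5", "bob->carol: 2", "carol->dave: 1"])
print(is_valid(chain))  # True

chain[0]["data"] = "alice->bob: 500"  # tamper with an old block
print(is_valid(chain))  # False
```

To hide the change, an attacker would have to recompute the stored hash in every later block, which is what real blockchains make expensive. In this toy version the last block has no successor, so a real system also needs a trusted record of the latest block hash.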


Popular hashing algorithms

SHA-0

This is the first hash algorithm of the Secure Hash Algorithm (SHA) family. It is a 160-bit hash function published by the National Institute of Standards and Technology (NIST) of the U.S.A. in 1993. The algorithm was withdrawn shortly afterwards due to an undisclosed "significant flaw".

MD5

MD5 (Message-Digest algorithm 5) was designed by Ronald Rivest in 1991 and produces a 128-bit hash value. Although collisions can be found for this algorithm, it is still widely used in the computing world.

An example of an MD5 collision is the following pair of 128-byte messages, shown in hexadecimal.

Message 1:

d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89
55ad340609f4b302 83e488832571415a 085125e8f7cdc99f d91dbdf280373c5b
d8823e3156348f5b ae6dacd436c919c6 dd53e2b487da03fd 02396306d248cda0
e99f33420f577ee8 ce54b67080a80d1e c69821bcb6a88393 96f9652b6ff72a70

Message 2:

d131dd02c5e6eec4 693d9a0698aff95c 2fcab50712467eab 4004583eb8fb7f89
55ad340609f4b302 83e4888325f1415a 085125e8f7cdc99f d91dbd7280373c5b
d8823e3156348f5b ae6dacd436c919c6 dd53e23487da03fd 02396306d248cda0
e99f33420f577ee8 ce54b67080280d1e c69821bcb6a88393 96f965ab6ff72a70

Both messages produce the MD5 hash 79054025255fb1a26e4bc422aef54eb4, although they differ in 6 hexadecimal characters.
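The collision can be checked with a few lines of Python. The sketch below copies the two messages from the example above and compares their MD5 and SHA-256 hashes; the variable and function names are only illustrative.

```python
import hashlib

MESSAGE_1_HEX = """
d131dd02c5e6eec4 693d9a0698aff95c 2fcab58712467eab 4004583eb8fb7f89
55ad340609f4b302 83e488832571415a 085125e8f7cdc99f d91dbdf280373c5b
d8823e3156348f5b ae6dacd436c919c6 dd53e2b487da03fd 02396306d248cda0
e99f33420f577ee8 ce54b67080a80d1e c69821bcb6a88393 96f9652b6ff72a70
"""

MESSAGE_2_HEX = """
d131dd02c5e6eec4 693d9a0698aff95c 2fcab50712467eab 4004583eb8fb7f89
55ad340609f4b302 83e4888325f1415a 085125e8f7cdc99f d91dbd7280373c5b
d8823e3156348f5b ae6dacd436c919c6 dd53e23487da03fd 02396306d248cda0
e99f33420f577ee8 ce54b67080280d1e c69821bcb6a88393 96f965ab6ff72a70
"""

def from_hex(text: str) -> bytes:
    """Strip all whitespace and decode the hexadecimal text into bytes."""
    return bytes.fromhex("".join(text.split()))

m1 = from_hex(MESSAGE_1_HEX)
m2 = from_hex(MESSAGE_2_HEX)

md5_1 = hashlib.md5(m1).hexdigest()
md5_2 = hashlib.md5(m2).hexdigest()

print("messages differ:", m1 != m2)
print("MD5 hashes equal:", md5_1 == md5_2)
print("SHA-256 hashes equal:",
      hashlib.sha256(m1).digest() == hashlib.sha256(m2).digest())
```

The messages are different and SHA-256 tells them apart, yet MD5 gives the same value for both. That is exactly what a collision means.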

SHA-1

This algorithm produces a 160-bit hash and was published in 1995. SHA-1 is similar in design to the MD5 hashing algorithm, but it is slower than MD5. However, this algorithm also has security vulnerabilities: in 2005 Xiaoyun Wang, Yiqun Lisa Yin and Hongbo Yu published a method for finding collisions in SHA-1 faster than brute force.

SHA-2

SHA-2 consists of six hash functions (SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256). They are designed using the Merkle-Damgård construction, and Bitcoin currently uses SHA-256.

SHA-3

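This is the latest member of the SHA family and was released in 2015. Its internal structure (a sponge construction based on the Keccak algorithm) is significantly different from the rest of the SHA family. Ethereum uses Keccak-256, the original Keccak variant on which SHA-3 is based; it differs slightly from the final SHA-3 standard in its padding.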
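The output sizes of the algorithms listed above can be compared with Python's hashlib. This is a small sketch; the list of names is a selection for illustration, and SHA-0 is not included because hashlib does not provide it.

```python
import hashlib

ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512", "sha3_256"]

def digest_bits(name: str) -> int:
    """Return the output size of the named hash algorithm in bits."""
    return hashlib.new(name).digest_size * 8

for name in ALGORITHMS:
    print(f"{name:>8}: {digest_bits(name)} bits")
```

Note that sha3_256 here is the standardized SHA-3, which gives different values from the Keccak-256 used by Ethereum.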







Comments

  1. Is there a difference between second preimage resistance and collision resistance? Both are talking about hash(m1) = hash(m2), right?

    1. The difference is in the choice of m1.

      • In the first case (second preimage resistance), the attacker is handed a fixed m1 and has to find a different m2 with the same hash. In particular, the attacker can't choose m1.
      • In the second case (collision resistance), the attacker can freely choose both messages m1 and m2, with the only requirement that they are different (and hash to the same value).

      Note that if a hash function is collision resistant, then it is also second preimage resistant.

  2. What is the difference between hashing and salting? Are they the same, or two different concepts?

    1. They are two different concepts. A salt is random data that is added to the input before it is put through a hash function, and salts are most commonly used with passwords. This adds a layer of security to the hashing process, specifically against precomputed (rainbow table) attacks and against cracking many passwords at once, because identical passwords no longer produce identical hashes. The idea is that by adding a salt to a password and then hashing it, you complicate the password cracking process.

  3. Nicely written, Chamal. Just to clarify: you said we can use hashing to store passwords, and you also said that hashing is a one-way function. So how do we check authentication details? I can't understand. Can you explain a bit?

    1. When you hash the password the first time (when the user registers), you store the resulting hash value in the database.

      The second time (when the user tries to log in), you use the username to look up the stored details and compute the hash of the password the user entered. Then you compare the two hash values, the one in the database and the one just computed, to authenticate the user.
