First Look at Hash Functions - UNDERSTANDING THE BLOCKCHAIN; Episode 11

in Project HOPE4 years ago

Hey there! Welcome to another episode of Understanding the Blockchain.

You can check out previous episodes here:

1. Episode 1 | This Is How Money Works

2. Episode 2 | Is Digital Money the Answer?

3. Episode 3 | Before Bitcoin, Earlier Attempts at Crypto

4. Episode 4 | Finally, Bitcoin!

5. Episode 5 | How is Crypto Acquired?

6. Episode 6 | A Bitcoin Transaction in Depth

7. Episode 7 | An Overview of Mining

8. Episode 8 | An Overview of Mining

9. Episode 9 | Introduction to Cryptography | Earlier Attempts at Cryptography

10. Episode 10 | One Way Functions



One-Way Functions
Pixabay | Photographer: Jan Baborák


Hey! Welcome back

In the last episode, we talked about One Way Functions and what they mean. Today, we’ll be talking about another concept we need to understand to fully grasp how the blockchain works and that is Hashing Functions.

According to Wikipedia,

A hash function is any function that can be used to map data of arbitrary size to fixed-size values.



A hash function that maps names to integers from 0 to 15. There is a collision between keys "John Smith" and "Sandra Dee". By Jorge Stolf - Own work, Public Domain, Link


Technically, the output of a hash function is a fingerprint. In this series, however, we’ll use the term hashes and fingerprints interchangeably. It is quite safe to use them interchangeably unless of course, you are talking to a mathematician.

How do Hashing Functions work?

Remember we talked about how one-way functions work in the last episode?
Well, hashing functions work similarly but instead, they take the input that is being fed into the function and output a long string of alphanumeric characters referred to as a hash. This hash always corresponds to the inputted file or text or folder or whatever is inputted.

Although the output hash is long, it isn’t going to be as long as the input. The fingerprint hash is not informationally as large or dense as the actual input file. This hash fingerprint is much shorter than the actual input file; it may be 32 or 64 characters only, but it always corresponds to the input file.

You can imagine how hard it will be to write the algorithm or logic to do this. There are three famous or commonly known examples of these fingerprint functions.


  1. MD5

    The first is the MD5 function. It was written in 1992. It is very widely used and common. The problem with it is that in 2004, computer scientists found a way to cheat it; this of course makes it less valuable as a tool to determine whether or not something is unique or to verify its authenticity. We’ll talk about what it means to “cheat” or break these functions later on in the series.

  2. SHA-2 (Secure Hash Algorithm 2)

    The second function is the SHA-2 designed by the United States National Security Agency (NSA) and first published in 2001. SHA-2 fixed some of the problems MD5; a lot of the security issues were patched in SHA-2. SHA-2 is widely used in today’s world especially in internet security. SHA-2 is also the function that bitcoin uses. If SHA-2 were to be broken the same way MD5 was broken, virtually all of the internet application infrastructure will be broken. It would be a game-changer and you would probably… no, definitely hear about it in the news.

  3. KECCAK (SHA-3)

    This is another family of hash function developed in 2009 2002 by the National Institute of Standards and Technology and released on August 5, 2015. It is more secure and faster than SHA-2. A lot of new technologies or moving and making use of SHA-3; this doesn’t mean that SHA-2 is obsolete.
    A lot of newer cryptocurrencies after Bitcoin including Ethereum make use of SHA-3 for its internal hashing mechanism.

We’ll be focusing on SHA-2 because that is what Bitcoin uses. So, SHA-2 is actually a family of hash functions.

  • SHA-256
  • SHA-384
  • SHA-512

These specify the size of the output in bits. Most commonly, you are going to encounter SHA-256 but there is a more complex function that inputs twice as much information. In the case of Bitcoin, we’ll use the 256 Bit which means that every time you use SHA-256, the output is 256 zeros and ones.

So, imagine you have 1000 characters, that is a piece of data that is 1000 Bits (1 character in data storage is 1 Bit) and we run it through SHA-256, it is going to slim it down to 256 zeros and ones. If we, however, run it through SHA-512, it is going to run it down to 512 zeros and ones.

This is the principle of encryption that Bitcoin uses. In the case of Bitcoin, however, it displays it in a much better alphanumeric format by converting it to Base 10 or other bases.


Up Next

We’ll explore more properties of SHA-256 and other hash functions in the next episode.


Resources

  1. Knuth, D. 1973, The Art of Computer Science, Vol. 3, Sorting and Searching, p.527. Addison-Wesley, Reading, MA., United States
  2. Menezes, Alfred J.; van Oorschot, Paul C.; Vanstone, Scott A (1996). Handbook of Applied Cryptography


hr.png

You can check out My Blog for more amazing content.


Sort:  

Hello @gamsam
You have not only explained but made part of history. These terms can be understood, the point here is to visualize all this. It can be a bit abstract. But it's like that, you just can't simplify it to understand. The number of characters in those footprints I assume is what

Hello @josevas217
Have you read the last episode? It a stepping stone for this episode and it will help you understand how functions work first.

Coin Marketplace

STEEM 0.22
TRX 0.25
JST 0.039
BTC 105647.40
ETH 3331.47
SBD 4.08