Hashing

Reviewed by: Richard Becker
Last Updated: August 14, 2020

Definition - What does Hashing mean?

Hashing is the practice of taking an input, such as a string or key, and representing it with a hash value. The hash value is computed by an algorithm (a hash function) and is typically a fixed-length string that is much shorter than the original input.
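As a minimal sketch of this definition, the Python snippet below hashes two inputs of very different sizes. SHA-256 from the standard hashlib module is only one possible algorithm and is chosen here as an assumption. The helper name hash_value is hypothetical.

```python
import hashlib


def hash_value(text: str) -> str:
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = hash_value("abc")
long_digest = hash_value("a much longer input string " * 100)

# Both digests are 64 hexadecimal characters, whatever the input length.
print(len(short_digest), len(long_digest))
```

The same input always produces the same digest, which is what makes the value usable as a stand-in for the original.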

Hashing is also a method of indexing key values in a database table so that they can be located efficiently.

Techopedia explains Hashing

Think of a three-word phrase stored in a database or other memory location. It can be hashed into a short alphanumeric value composed of only a few letters and numbers. This can be very efficient at scale, and that's just one reason hashing is used.

Other top reasons have to do with stronger cybersecurity.

Hashing in Computer Science and Encryption

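Hashing has several key uses in computer science. The one that perhaps receives the most attention today, in a world where cybersecurity is key, is the use of hashing in cryptography. Hashing is often mentioned alongside encryption, but the two differ: encrypted data can be decrypted with a key, while a cryptographic hash is designed to be one-way.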

Because hashed strings and inputs are not in their original form, they cannot be read directly the way unhashed data can. If a hacker reaches into a database and finds an original string like "John's wallet ID 34567," they can simply take this information and use it to their advantage. If they instead find a hash value like "a67b2," that value is far less useful to them. A cryptographic hash has no key that deciphers it, so the attacker would have to guess candidate inputs and hash each one until a match turns up, which is feasible mainly when the input is short or easy to guess.
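A stored hash can still be checked without ever being reversed: the system hashes the candidate value and compares the results. The sketch below illustrates this with a salted, deliberately slow hash from Python's standard library. The function names hash_secret and verify_secret, the 16-byte salt and the iteration count are all assumptions made for illustration, not recommendations.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor only


def hash_secret(secret: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for secret using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each stored secret
    digest = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_secret(candidate: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare the digests."""
    _, digest = hash_secret(candidate, salt)
    return hmac.compare_digest(digest, expected)


# Only the salt and digest are stored, never the original string.
salt, stored = hash_secret("John's wallet ID 34567")
print(verify_secret("John's wallet ID 34567", salt, stored))
print(verify_secret("John's wallet ID 00000", salt, stored))
```

The salt means two identical secrets do not share a digest, and the iteration count makes each guess more expensive for an attacker.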

Is Hashing Used in Data Compression?

Hashing can also bring benefits that resemble data compression, although the two are not the same thing.

Hashing is not compression.

It's a different animal: compressed data can be expanded back into the original, while a hash value cannot. Still, hashing can operate much like file compression in that it takes a larger data set and represents it in a more manageable form. Suppose you had "John's wallet ID" written 40 or even 4000 times throughout a database.

By storing the full string once and referring to it everywhere else by a shorter hash value, you can save a good deal of memory space.
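One way to picture the space saving is a store that keeps each distinct value once, keyed by its hash. The DedupStore class below is a hypothetical sketch, not a description of any particular database. Note that a saving only appears when the repeated value is longer than its digest, so the example uses a longer record than the phrase in the text.

```python
import hashlib


class DedupStore:
    """Keep each distinct string once, addressed by its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}  # digest -> original text, stored a single time

    def put(self, text: str) -> str:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        self._blobs.setdefault(key, text)  # no-op if already stored
        return key

    def get(self, key: str) -> str:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = DedupStore()
record = "John's wallet ID, with a long block of repeated account details. " * 10
keys = [store.put(record) for _ in range(4000)]

# 4000 references, but only one stored copy of the record.
print(len(keys), len(store))
```

Because the original cannot be rebuilt from the digest, the store still has to hold one full copy of each value.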

Using Hashing in Database Retrieval

Then there's also the use of hashing in database retrieval.

Here's where another example comes in handy: many experts compare hashing to a key library innovation of the 19th century, the Dewey decimal system.

In a sense, what you get when you retrieve a hash value is like getting a Dewey decimal system number for a book. Instead of scanning every shelf for the book's title, you go straight to the location that the number identifies, then pick out the book using a few characters of its title or author.

We've seen how well the Dewey decimal system has worked in libraries, and the same idea works well in computer science. In short, by reducing original input strings and data assets to short alphanumeric hash keys, engineers can speed up retrieval, strengthen security and save space at the same time.
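The library analogy can be sketched as a small hash table: the hash of a key picks the shelf (a bucket), and only the items on that shelf are compared against the key. The HashTable class and its bucket count are assumptions for illustration. Real database indexes are considerably more elaborate.

```python
class HashTable:
    """A minimal hash table that resolves collisions with per-bucket lists."""

    def __init__(self, bucket_count: int = 8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash value selects one bucket, like a call number selects a shelf.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only the entries in one bucket are scanned, not the whole table.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


table = HashTable()
table.put("Moby-Dick", "shelf 813")
print(table.get("Moby-Dick"))
print(table.get("Unknown Title", "not found"))
```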

Hashing's Role in File Tampering

Hashing is also valuable in preventing or analyzing file tampering.

Here's how this works: the original file is run through a hash function, and the resulting hash is kept with the file data. The file and the hash are sent together, and the receiving party recomputes the hash from the file it received and compares it with the hash that was sent. If the file was changed in any way, the two hashes will almost certainly not match.
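A receiving party's check might look like the following sketch, which assumes SHA-256 and reads the file in chunks so that large files do not need to fit in memory. The function names file_digest and is_unmodified are hypothetical.

```python
import hashlib
import hmac


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of the file at path as a hex string."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_unmodified(path: str, expected_digest: str) -> bool:
    """Recompute the file's digest and compare it with the one supplied."""
    return hmac.compare_digest(file_digest(path), expected_digest)
```

A check like this detects changes only if the expected hash itself can be trusted. If an attacker can replace both the file and the hash sent with it, the comparison will still pass.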

All of this shows why hashing is such a popular part of database handling.

This definition was written in the context of Cybersecurity