Encryption, hashing and salting

Encryption, hashing and salting are related techniques, but each process has properties that suit it to a different purpose.

In short, encryption involves encoding data so that it can only be accessed by those who have the key. This protects it from unauthorized parties.

Cryptographic hashing involves calculations that cannot be reversed. These functions have some special properties that make them useful for digital signatures and other forms of authentication.

Salting involves adding random data to an input before it is put through a cryptographic hash function. It’s mostly used to keep passwords safe during storage, but it can also be used with other types of data.

What is encryption?

To put it simply, encryption is the process of using a code to stop other parties from accessing information. When data has been encrypted, only those who have the key can access it. As long as a sufficiently complicated system is used, and it’s used correctly, then attackers are prevented from seeing the data.

Data is encrypted with encryption algorithms, which are also known as ciphers. One of the most important distinctions between encryption and hashing (which we will get to later) is that encryption is designed to go both ways. This means that once something has been encrypted with a key, it can also be decrypted.

This makes encryption useful in a range of situations, such as for securely storing or transferring information. Once data has been encrypted properly, it is considered secure and can only be accessed by those who have the key. The most commonly known type is symmetric-key encryption, which involves using the same key in both the encryption and decryption processes.

Public-key encryption is a little bit more complicated because one publicly available key is used to encrypt data, while its matching private key is used to decrypt it. This feature allows people who have never met each other to communicate securely. Public-key encryption is also an important part of digital signatures, which are used to validate the authenticity and integrity of data and messages.
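
To make the split between the public and private key concrete, here is a minimal sketch of textbook RSA in Python. The tiny primes (61 and 53), the public exponent (17) and the function names are assumptions chosen for illustration. Real RSA uses keys of 2048 bits or more together with a padding scheme, so this sketch should never be used to protect real data.

```python
# Toy "textbook" RSA with tiny primes. For illustration only: there is no
# padding and the key is far too small to be secure.
p, q = 61, 53                 # secret primes (assumed example values)
n = p * q                     # public modulus: 3233
phi = (p - 1) * (q - 1)       # 3120, used to derive the private exponent
e = 17                        # public exponent, must be coprime with phi
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can encrypt using the public key (n, e)."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private exponent d can decrypt."""
    return pow(ciphertext, d, n)


message = 65                  # must be an integer smaller than n
ciphertext = encrypt(message)
print(ciphertext)             # 2790
print(decrypt(ciphertext))    # 65
```

The public values n and e can be handed to anyone, which is why two people who have never met can still exchange an encrypted message. Only d has to stay secret.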


Common encryption algorithms

  • Caesar cipher – This is a simple code that involves each letter being shifted a fixed number of places. If a Caesar cipher has a shift of three, every “a” will become a “d”, every “b” will become an “e”, every “c” will become an “f” and so on. It’s named after Julius Caesar, who was the first person recorded to use the scheme.
  • AES – The Advanced Encryption Standard is a symmetric-key algorithm that secures a significant part of our modern communications. It involves a number of sophisticated steps and is often used to encrypt data in TLS, in messaging apps and at rest, among many other situations.
  • 3DES – Triple DES is based on the DES algorithm. When growing computer power made DES insecure, 3DES was developed as a reinforced algorithm. In 3DES, data is run through the DES algorithm three times instead of just once, which makes it harder to crack. 3DES can be used for many of the same things as AES, but it is slower and only the variant with three independent keys offers a reasonable security margin. NIST has deprecated it in favor of AES.
  • RSA – The Rivest-Shamir-Adleman cipher was the first form of widely used public-key cryptography. It allows entities to communicate securely even if they haven’t met or had a chance to exchange keys. It can be used in a number of different security protocols, such as PGP and TLS.
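  • ECDSA – The Elliptic Curve Digital Signature Algorithm is a variant of DSA that uses elliptic curve cryptography. Strictly speaking, it is a public-key signature scheme rather than an encryption algorithm: it is used to sign and verify data, not to encrypt it. It fills the same signing role that RSA can, with much smaller keys, but it is easy to implement incorrectly, because every signature needs a fresh, unpredictable random value and reusing one can expose the private key.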

Encryption in action

To give you an idea of how encryption works in practice, we’ll use the Caesar cipher as an example. If we wanted to encrypt a message of “Let’s eat” with a shift of three, the “L” would become an “O”, the “e” would become an “h” and so on. This gives us an encrypted message of:

Ohw’v hdw

To decrypt the message, the recipient would have to know that the encryption algorithm involved a shift of three, then roll back each letter by three places. If we wanted, we could vary the code by shifting each letter by a different number. We could even use a far more sophisticated algorithm.

One example is AES. If we use 128-bit AES to encrypt “Let’s eat” with a key of “1234”, it gives us:

FeiUVFnIpb9d0cbXP/Ybrw==

This ciphertext can only be decrypted with the key of “1234”. A real 128-bit key is 16 random bytes, so “1234” here stands in for a much longer secret. If we were to use a properly random key and keep it private, then we could consider the data secure from attackers.
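
Returning to the Caesar cipher, the shift and its reversal can be sketched in a few lines of Python. The function name `caesar` is our own, and the sketch assumes that only the 26 unaccented English letters are shifted, with everything else (spaces, apostrophes) left untouched.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` places, wrapping around the alphabet.

    Characters that are not ASCII letters are left unchanged.
    """
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)
    return "".join(result)


encrypted = caesar("Let's eat", 3)
print(encrypted)               # Ohw'v hdw
print(caesar(encrypted, -3))   # Let's eat
```

Decryption is simply the same function with the shift negated, which illustrates the two-way nature of encryption: whoever knows the key (here, the number three) can reverse the process.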

What is hashing?

Cryptographic hash functions are a special type of one-way calculation. They take a string of data of any size and always give an output of a predetermined length. This output is called the hash, hash value or message digest. Since these functions don’t use keys, the result for a given input is always the same.

It doesn’t matter if your input is the entirety of War and Peace or simply two letters, the result of a hash function will always be the same length. Hash functions have several different properties that make them useful:

  • They are one-way functions – This means that there is no practical way to figure out what the original input was from a given hash value.
  • It’s unlikely for two inputs to have the same hash value – While it is possible for two different inputs to yield the same hash value, the chances of this happening are so small that we don’t really worry about it. For practical purposes, hash values can be considered unique.
  • The same input always delivers the same result – Every time you put the same information into a given hash function, it will always deliver the same output.
  • Even the slightest change gives a completely different result – If even a single character is altered, the hash value will be vastly different.
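
Some of these properties can be checked directly with Python’s standard `hashlib` module. This is a small sketch rather than a proof: the helper name `sha256_hex` and the sample inputs are assumptions made for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_input = b"hi"
long_input = b"x" * 1_000_000   # a one-megabyte stand-in for a very long text

# Fixed output size: both digests are 64 hex characters (256 bits).
print(len(sha256_hex(short_input)), len(sha256_hex(long_input)))   # 64 64

# Deterministic: hashing the same input twice gives the same digest.
print(sha256_hex(short_input) == sha256_hex(short_input))          # True

# A slightly different input gives a different digest.
print(sha256_hex(b"hi") == sha256_hex(b"hj"))                      # False
```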

What are hashes used for?

Hash functions may have some interesting properties, but what can we actually do with them? Being able to spit out a unique, fixed-sized output for an input of any length may seem like nothing more than an obscure party trick, but hash functions actually have a number of uses.

They are a core component of digital signatures, which are an important aspect of verifying authenticity and integrity on the internet. Hash-based message authentication codes (HMACs) also use hash functions to achieve similar results.
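
As a rough illustration of an HMAC, the sketch below uses Python’s standard `hmac` module with SHA-256. The key, the messages and the helper names `make_tag` and `verify_tag` are all made up for the example. The idea is that only someone who holds the shared key can produce a tag that verifies.

```python
import hashlib
import hmac

secret_key = b"shared-secret-key"          # assumed example key
message = b"Transfer 100 to account 42"    # assumed example message


def make_tag(key: bytes, msg: bytes) -> str:
    """Compute an HMAC-SHA256 tag for msg under key."""
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, msg: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, msg), tag)


tag = make_tag(secret_key, message)
print(verify_tag(secret_key, message, tag))                         # True
print(verify_tag(secret_key, b"Transfer 900 to account 42", tag))   # False
print(verify_tag(b"wrong-key", message, tag))                       # False
```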

Cryptographic hash functions can be used as normal hash functions as well. In these scenarios, they can act as checksums to verify data integrity, as fingerprinting algorithms that eliminate duplicate data, or to create hash tables to index data.
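
The checksum and fingerprinting uses can be sketched as follows. The sample data and the function names `fingerprint` and `deduplicate` are hypothetical, and a real system would read files from disk rather than use in-memory byte strings.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Use the SHA-256 digest as a compact fingerprint of the data."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(blobs: list[bytes]) -> list[bytes]:
    """Keep the first copy of each distinct blob, keyed by its digest."""
    seen: dict[str, bytes] = {}
    for blob in blobs:
        seen.setdefault(fingerprint(blob), blob)
    return list(seen.values())


files = [b"report v1", b"holiday photo", b"report v1"]
print(len(deduplicate(files)))                  # 2

# Checksum check: an altered copy no longer matches the published digest.
published = fingerprint(b"report v1")
print(fingerprint(b"report v1") == published)   # True
print(fingerprint(b"report v2") == published)   # False
```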

Common cryptographic hash functions

  • MD5 – This is a hash function that was designed in 1991 by Ron Rivest. It is now deemed insecure and should not be used for cryptographic purposes. Despite this, it can still be used to check data for accidental corruption, though not to detect deliberate tampering.
  • SHA-1 – Secure Hash Algorithm 1 has been in use since 1995, but hasn’t been considered secure since 2005, when researchers published collision attacks that were faster than brute force. A practical collision was publicly demonstrated in 2017. It is now recommended to implement either SHA-2 or SHA-3 instead.
  • SHA-2 – This is a family of hash functions that act as successors to SHA-1. These functions contain numerous improvements, which make them secure in a wide variety of applications. Despite this, SHA-256 and SHA-512 are vulnerable to length-extension attacks, so there are certain situations where it is best to implement SHA-3.
  • SHA-3 – SHA-3 is the newest member of the Secure Hash Algorithm family, but it is built quite differently from its predecessors. At this stage, it has not yet replaced SHA-2, but simply gives cryptographers another option that can provide improved security in certain situations.
  • RIPEMD – RIPEMD is another family of functions that was developed by the academic community. It is based on many of the ideas from MD4 (MD5’s predecessor) and is not restricted by any patents. RIPEMD-160 is still considered relatively secure, but it hasn’t seen much widespread adoption.
  • Whirlpool – Whirlpool is a hash function from the Square block cipher family. It’s based on a modification of AES and is not subject to any patents. It is considered secure, but somewhat slower than some of its alternatives, which has led to limited adoption.

Hashing functions in action

Now that you understand what hash functions are, it’s time to put them into practice. If we put the same text of “Let’s eat” into an SHA-256 online calculator, it gives us:

5c79ab8b36c4c0f8566cee2c8e47135f2536d4f715a22c99fa099a04edbbb6f2

If we change even one character by a single position, it changes the whole hash drastically. A typo like “Met’s eat” yields a completely different result:

4be9316a71efc7c152f4856261efb3836d09f611726783bd1fef085bc81b1342

Unlike with encryption, we can’t put this hash value through the function in reverse to get our input once again. Although these hash functions can’t be used in the same way as encryption, their properties make them a valuable part of digital signatures and many other applications.
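
To put a number on how different the two hashes are, the sketch below counts how many of the 256 output bits change between the two inputs. The helper names are our own, and the example assumes straight apostrophes. A curly apostrophe is a different character and would give different digests, so the values printed by an online calculator depend on exactly which characters were typed.

```python
import hashlib


def sha256_bits(text: str) -> int:
    """Return the SHA-256 digest of text as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def differing_bits(a: str, b: str) -> int:
    """Count the bit positions where the two digests disagree."""
    return bin(sha256_bits(a) ^ sha256_bits(b)).count("1")


# For unrelated-looking digests we would expect roughly half of the
# 256 bits to differ.
print(differing_bits("Let's eat", "Met's eat"))

# Identical inputs give identical digests, so no bits differ.
print(differing_bits("Let's eat", "Let's eat"))   # 0
```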

Hash functions and passwords

Hash functions have another common use that we haven’t discussed yet. They are also a key component of keeping our passwords safe during storage.

You probably have dozens of online accounts with passwords. For each of these accounts, your password needs to be stored somewhere. How could your login be verified if the website didn’t have its own copy of your password?

Companies like Facebook or Google store billions of user passwords. If these companies kept the passwords as plaintext, then any attacker who could work their way into the password database would be able to access every account they find.

This would be a serious security disaster, both for the company and its users. If every single password was exposed to attackers, then all of their accounts and user data would be in danger.

Unfortunately, this is what happened in 2017, when attackers were no doubt pleased to discover that Equifax, one of the three largest credit reporting agencies in the US, was storing customer passwords in plaintext files. This compounded the severity of an already serious data breach that affected more than 143 million people.

The best way to prevent this from happening is to not store the passwords themselves, but the hash values for the passwords instead. As we discussed in the previous section, cryptographic hash functions operate in one direction, producing a fixed-sized output that isn’t feasible to reverse.

If an organization stores the hash of a password instead of the password itself, it can verify that the two hashes match when a user logs in. Users enter their passwords, which are then hashed. This hash is then compared with the password hash that is stored in the database. If the two hashes match, then the correct password has been entered and the user is given access.

This setup means that the password never has to be stored. If an attacker makes their way into the database, then all they will find is the password hashes, rather than the passwords.
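
The login flow described above might look something like the sketch below. The in-memory dictionary `user_db` and the function names are hypothetical stand-ins for a real database and application code. The sketch deliberately shows the plain, unsalted scheme that this section describes, which is not sufficient on its own.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Hash a password with a single pass of SHA-256 (unsalted)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical in-memory "database": username -> stored password hash.
user_db: dict[str, str] = {}


def register(username: str, password: str) -> None:
    """Store only the hash; the password itself is never saved."""
    user_db[username] = hash_password(password)


def login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = user_db.get(username)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(password))


register("alice", "correct horse battery staple")
print(login("alice", "correct horse battery staple"))        # True
print(login("alice", "wrong guess"))                         # False
print("correct horse battery staple" in user_db.values())    # False
```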

While hashing passwords for storage doesn’t prevent attackers from using the hashes to figure out the passwords, it does make their work significantly more difficult and time-consuming. This brings up our final topic, salting.

What is salting?

Salting is essentially the addition of random data to an input before it is put through a hash function. Salts are most commonly used with passwords.

The best way to explain using salts is to discuss why we need them in the first place. You might have thought that storing the hashes of passwords would have solved all of our problems, but unfortunately, things are a little more complex than that.

Weak passwords

A lot of people have really bad passwords. Maybe you do too. The problem is that humans tend to think in predictable patterns and choose passwords that are easy to remember. These passwords are vulnerable to dictionary attacks, which cycle through thousands or millions of the most common password combinations each second, in an attempt to find the correct password for an account.

If password hashes are stored instead, things are a little bit different. When an attacker comes across a database of password hashes, they can use either hash tables or rainbow tables to look up matching hashes which they can use to find out the passwords.

A hash table is a pre-computed list of hashes for common passwords that is stored in a database. These tables require more work ahead of time, but once a table has been completed, it is much faster to look up a hash in the table than it is to compute the hash for each possible password. Another advantage is that these tables can be used repeatedly.

Rainbow tables are similar to hash tables, except they take up less space at the cost of more computing power.

Both of these attack methods become far more practical if weak passwords are used. If a user has a common password, then it is likely that the hash for the password will be in the hash table or rainbow table. If this is the case, then it’s only a matter of time before an attacker has access to a user’s password.

Users can help to foil these attacks by choosing longer and more complex passwords that are far less likely to be stored in the tables. In practice, this doesn’t happen anywhere near as much as it should, because users tend to choose passwords that are easy to remember. As a loose rule of thumb, things that are easy to remember are often easy for attackers to find.

If you struggle to remember multiple passwords (who doesn’t?), a password manager like Dashlane or Sticky Password will keep all of your passwords safe and accessible via a single master password.

Salts offer another way of getting around this issue. Adding a random string of data to a password before it is hashed essentially makes the password more complex, which reduces the chances of these attacks succeeding.

How salting works in practice

As an example, let’s say you have an email account and your password is “1234”. When we use an SHA-256 online calculator, we get the following as the hash value:

03ac674216f3e15c761ee1a5e255f067953623c8b388b4459e13f978d7c846f4

This hash is what would be stored in the database for your account. When you enter your password of “1234”, it is hashed and then the value is compared against the stored value. Since the two values are the same, you will be granted access.

If an attacker breaks into the database, they will have access to this value, as well as all of the other password hashes that were there. The attacker would then take this hash value and look it up in their pre-computed hash table or rainbow table. Since “1234” is one of the most common passwords, they would find the matching hash very quickly.

The hash table would tell them that:

03ac674216f3e15c761ee1a5e255f067953623c8b388b4459e13f978d7c846f4

Corresponds to:

1234

The attacker will then know that your password is “1234”. They can then use this password to log in to your account.
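
The lookup itself takes very little code. In the sketch below, the short word list is a made-up stand-in for the millions of entries a real attacker would use, and the names `common_passwords` and `lookup_table` are our own.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of a password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# A tiny stand-in for an attacker's word list (assumed for illustration).
common_passwords = ["123456", "password", "1234", "qwerty", "letmein"]

# Pre-compute once: hash -> password. This table can be reused against
# any number of stolen databases that use the same unsalted hash function.
lookup_table = {sha256_hex(pw): pw for pw in common_passwords}

stolen_hash = "03ac674216f3e15c761ee1a5e255f067953623c8b388b4459e13f978d7c846f4"
print(lookup_table.get(stolen_hash))   # 1234
```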

As you can see, this wasn’t a whole lot of work for the attacker. To make things more difficult, we add a salt of random data to the password before it is hashed. Salting significantly reduces the chances of hash tables and rainbow tables returning a positive result.

Let’s take a 16 character salt of random data:

H82BV63KG9SBD93B

We add it to our simple password of “1234” like so:

1234H82BV63KG9SBD93B

Only now that we have salted it do we put it through the same hash function that we did before, which returns:

91147f7666dc80ab5902bde8b426aecdb1cbebf8603a58d79182b750c10f1303

Sure, this hash value isn’t any longer or more complex than the previous one, but that’s not the point. While they are both the same length, “1234H82BV63KG9SBD93B” is a far less common password, so it’s much less likely that its hash will be stored in the hash table.

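The less likely a password is to be stored in a hash table, the less likely an attack is to succeed. This is how adding salts helps to make password hashes more secure. The salt does not need to be kept secret: it is stored alongside the hash so that the same calculation can be repeated when the user logs in.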
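
The effect of the salt on a pre-computed table can be sketched as follows. The word list is again a made-up stand-in, and the salt is generated with Python’s `secrets` module rather than being the fixed string used in the worked example, so the salted hash will differ on every run.

```python
import hashlib
import secrets


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The attacker's pre-computed table of unsalted hashes (assumed word list).
common_passwords = ["123456", "password", "1234", "qwerty", "letmein"]
lookup_table = {sha256_hex(pw): pw for pw in common_passwords}

password = "1234"
salt = secrets.token_hex(8)                  # 16 random hex characters
salted_hash = sha256_hex(password + salt)    # salt appended to the password

print(lookup_table.get(sha256_hex(password)))   # 1234 (unsalted hash is found)
print(lookup_table.get(salted_hash))            # None (salted hash is not in the table)
```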

Hacking entire databases

When an attacker has access to a whole database of password hashes, they don’t have to test every hash against each entry. Instead, they can search the entire database for matches that coincide with their hash table. If the database is big enough, an attacker can compromise a huge number of accounts, even if they only have a five percent success rate.

If the passwords are given unique salts before they are hashed, the process becomes far more complex. If the salts are sufficiently long, hash tables and rainbow tables would have to be prohibitively large to contain matching hashes, so the chances of success become much lower.

Another advantage of salts comes when multiple users in the same database have the same password, or if a single user has the same password for multiple accounts. If the password hashes aren’t salted beforehand, then attackers can compare the hashes and determine that any accounts with the same hash value also share the same password.

This makes it easier for hackers to target the most common hash values that will give them the largest rewards. If passwords are salted beforehand, then the hash values will be different even when the same passwords are used.
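
Here is a small sketch of per-user salts in action. The record layout (a salt paired with a hash) and the function names are assumptions for the example. A single pass of SHA-256 is used only to mirror the earlier examples, whereas a real system would use a dedicated password hashing function.

```python
import hashlib
import hmac
import secrets


def hash_with_salt(password: str, salt: bytes) -> bytes:
    """Hash the password together with its salt."""
    return hashlib.sha256(password.encode("utf-8") + salt).digest()


def make_record(password: str) -> tuple[bytes, bytes]:
    """Return (salt, hash). The salt is stored next to the hash, not kept secret."""
    salt = secrets.token_bytes(16)
    return salt, hash_with_salt(password, salt)


def verify(password: str, record: tuple[bytes, bytes]) -> bool:
    """Re-hash the submitted password with the stored salt and compare."""
    salt, stored_hash = record
    return hmac.compare_digest(stored_hash, hash_with_salt(password, salt))


alice = make_record("1234")
bob = make_record("1234")      # same password, different random salt

print(alice[1] == bob[1])                            # False: stored hashes differ
print(verify("1234", alice), verify("1234", bob))    # True True
print(verify("12345", alice))                        # False
```

Because each record carries its own salt, an attacker looking at the database cannot tell that Alice and Bob chose the same password.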

Potential salt weaknesses

Salting loses its effectiveness if it is done incorrectly. The two most common issues occur when salts are too short, or if they aren’t unique for each password. Shorter salts are still vulnerable to rainbow table attacks, because they don’t make the resulting hash sufficiently rare.

If the same salt is reused for every hashed password, and the salt is discovered, it becomes much simpler to figure out each password in the database. Using the same salt also means that anyone with the same password will have the same hash.

Common password hashing algorithms

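It isn’t recommended to use general-purpose hash functions such as SHA-256 for storing passwords, because they are designed to be fast, which lets attackers test guesses very quickly. Instead, a number of functions have been designed to take a salt and to be deliberately slow or memory-hungry, which helps to boost security. These include Argon2, scrypt, bcrypt and PBKDF2.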

Argon2

Argon2 was the winner of 2015’s Password Hashing Competition. It’s still relatively new as far as algorithms go, but it has quickly become one of the most trusted functions for hashing passwords.

Despite its youth, so far it has held its own in a number of research papers that have probed it for weaknesses. Argon2 is more flexible than the other password hashing algorithms and can be implemented in a number of different ways.

scrypt

Pronounced “ess crypt”, this is the second youngest password hashing algorithm that is in common use. Designed in 2009, scrypt uses a large, yet adjustable amount of memory in its computations. Its adjustable nature means that it can still be resistant to attacks even as computing power grows over time.
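
Python’s standard library exposes scrypt as `hashlib.scrypt` (available when Python is built against a recent OpenSSL). The sketch below shows the adjustable cost parameters. The specific values are assumed examples rather than a recommendation, and the helper name `scrypt_hash` is our own.

```python
import hashlib
import hmac
import secrets

# Cost parameters are assumed example values, not a recommendation.
N, R, P = 2**14, 8, 1          # CPU/memory cost, block size, parallelism
MAXMEM = 64 * 1024 * 1024      # allow up to 64 MiB (these settings need about 16 MiB)


def scrypt_hash(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte password hash with scrypt."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                          n=N, r=R, p=P, maxmem=MAXMEM, dklen=32)


salt = secrets.token_bytes(16)
stored = scrypt_hash("1234", salt)

print(len(stored))                                              # 32
print(hmac.compare_digest(stored, scrypt_hash("1234", salt)))   # True
print(hmac.compare_digest(stored, scrypt_hash("4321", salt)))   # False
```

Raising N increases both the time and the memory each guess costs, which is how the function can be tuned upwards as hardware improves.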

bcrypt

bcrypt was developed in 1999 and is based on the Blowfish cipher. It was one of the most commonly relied upon algorithms for password hashing for many years, but because it uses little memory, it is more vulnerable than newer designs to attacks run on field-programmable gate arrays (FPGAs). This is why Argon2 is often preferred in newer implementations.

PBKDF2

This key derivation function was developed to replace PBKDF1, which could only produce shorter, less secure keys. NIST guidelines from 2017 still recommend PBKDF2 for hashing passwords, but Argon2 addresses some of its security issues and can be a better option in many situations.
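
PBKDF2 is available in Python’s standard library as `hashlib.pbkdf2_hmac`. The sketch below is a minimal example of salting and stretching a password with it. The iteration count is an assumed example value that should be tuned to current guidance and your own hardware, and the helper name `pbkdf2_hash` is hypothetical.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000   # assumed example value; higher is slower for attackers too


def pbkdf2_hash(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a password hash by applying HMAC-SHA256 many times over."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = secrets.token_bytes(16)
stored = pbkdf2_hash("1234", salt)

print(len(stored))                                              # 32
print(hmac.compare_digest(stored, pbkdf2_hash("1234", salt)))   # True
print(hmac.compare_digest(stored, pbkdf2_hash("4321", salt)))   # False
```

The salt, the iteration count and the resulting hash would all be stored together, so that the same calculation can be repeated at login.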

Encryption, hashing and salting: a recap

Now that we’ve gone through the details of encryption, hashing and salting, it’s time to quickly go back over the key differences so that they sink in. While each of these processes is related, they each serve a different purpose.

Encryption is the process of encoding information to protect it. When data is encrypted, it can only be decrypted and accessed by those who have the right key. Encryption algorithms are reversible, which gives us a way to keep our data away from attackers, but still be able to access it when we need it. It is used extensively to keep us safe online, performing a crucial role in many of our security protocols that keep our data secure when it is stored and in transit.

In contrast, hashing is a one-way process. When we hash something, we don’t want to be able to get it back to its original form. Cryptographic hash functions have a number of unique properties that allow us to prove the authenticity and integrity of data, such as through digital signatures and message authentication codes.

Specific types of cryptographic hash functions are also used to store our passwords. Storing a password’s hash instead of the password itself provides an extra layer of security. It means that if an attacker gains entry to a database, they can’t immediately access the passwords.

While password hashing does make life more difficult for hackers, it can still be circumvented. This is where salting comes in. Salting adds extra data to passwords before they are hashed, which makes attacks more time-consuming and resource-heavy. If salts and passwords are used correctly, they make hash tables and rainbow tables impractical means of attack.

Together, encryption, hashing and salting are all important aspects of keeping us safe online. If these processes weren’t in place, attackers would have a free-for-all with your accounts and data, leaving you with no security on the internet.
