Key derivation functions

Key derivation functions (KDFs) are critical parts of cryptographic systems. As their name suggests, they can be used for deriving strong keys from other inputs. These include:

  • Passwords
  • Shared secrets
  • Other secure keys (when additional keys are needed)
  • Weaker keys

They can also be used for safely storing passwords.

But key derivation functions and their overlapping uses can be a little confusing. To truly understand them, it’s best if we go back to the fundamentals and explain what keys are, how they differ from passwords, and the best ways to store passwords.

What are keys?

Let’s start with the absolute basics. The National Institute of Standards and Technology (NIST) defines a key as:

“A parameter used with a cryptographic algorithm that determines its operation in such a way that an entity with knowledge of the key can reproduce or reverse the operation, while an entity without knowledge of the key cannot.”

This is a complex definition, so let’s break it down a little further. A key is basically an additional input that we feed into various types of cryptographic algorithms in order to produce a specific result. Keys take the form of really big numbers, usually represented as strings of bits.

Let’s give you a more concrete explanation of keys. In the context of a symmetric-key encryption algorithm like AES—which is what we use to encrypt most of our data—we take the data we want to encrypt and run it through the algorithm alongside a secret key in order to give us the encrypted ciphertext.

This ciphertext cannot be decrypted without the key (there are a few caveats here, but let’s not over-complicate things at this stage). When we want to decrypt the data so that we can access it, we take our ciphertext and the key, and essentially run them through the algorithm backwards to give us the original data.

Since keys allow us to keep data confidential, we often need to keep them a secret. When we are the only ones who know a key that protects data, this means that we are also the only ones who can access that data. This keeps attackers out of our sensitive information.

Asymmetric keys

Not all keys must be kept secret. Asymmetric cryptography, also known as public-key cryptography, is a vital part of our cybersecurity landscape that involves keeping some keys public.

It was invented as a solution to the problem of how to securely exchange keys. In a nutshell, asymmetric cryptography involves two sets of keys: one public and one private. When someone wants to send a secure message to another person, they obtain the recipient’s public key — which is freely available — and use it to encrypt the message. Once encrypted, only the recipient, who possesses the corresponding private key, can decrypt and read the message.

Asymmetric encryption algorithms like RSA allow us to do all kinds of things beyond what symmetric-key encryption is capable of. They enable us to securely communicate with people we have never met before, and bring trust into our communications via digital signatures.

Keys are complex and there are a range of different types that help us achieve a variety of important things. However, in this article we will be focusing on the types of keys that we need to keep secret.

The properties of keys

If our important data is secured by a secret key, we need to make sure that hackers cannot figure out the key. If they can find the key, they can decrypt the data just as easily as we can.

In addition to us needing to keep the key a secret, the key must also have certain properties in order for us to be confident that an attacker will not be able to figure it out:

Key length

The length of the key is an important factor in the overall security of a given cryptosystem. Key length essentially refers to the size of the key, with larger keys being generally more secure (there are exceptions to this). However, there is always a trade-off between security and speed.

The underlying concept is relatively simple: if someone is thinking of a number between one and ten, you will be able to guess it in just a few tries. If they are thinking of a number between one and a million, you will probably get bored before you figure it out.

Keys are basically just big numbers, so the larger the field of numbers to choose from, the harder it will be for an attacker to figure out your key. Keys need to be certain lengths or greater in order to be secure for a given purpose. This depends on the situation, but as an example, AES has three key lengths:

  • 128 bits
  • 192 bits
  • 256 bits

Other algorithms, like RSA, use much larger keys. It’s not uncommon for RSA to use 4,096-bit keys. However, these large RSA keys do not necessarily grant more security to a cryptosystem than the smaller AES keys. This is due to some complexities of the math underlying these two algorithms, but that’s beyond the scope of this article.
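
To make these sizes concrete, here is a small sketch using Python’s standard `secrets` module, which generates cryptographically strong random bytes. The loop and variable names are ours, purely for illustration:

```python
import secrets

# Generate a fresh random key at each of the three AES key lengths
for bits in (128, 192, 256):
    key = secrets.token_bytes(bits // 8)  # cryptographically strong randomness
    print(f"{bits}-bit key: {key.hex()}")
```

Each run prints three different keys, since every call draws fresh randomness.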

Randomness

A key needs to be random, or close enough to random that even sophisticated techniques can’t detect any patterns. If there are any patterns or restrictions, it can make it much easier for a hacker to figure out the key.

Let’s explain this further by returning to our example from above: If your friend chooses a truly random number between one and a million, then there is an equal chance that it could be any single number within that range. To figure it out, you would have to guess every single number in a systematic fashion until you got lucky and came across the right one. On average, it would take you 500,000 guesses to get it right. This would take you an exceptionally long time.

Let’s switch up the example and say that you’ve seen your friend play this game with others before. You noticed that the numbers they choose aren’t actually random. Whenever they play, the number always ends in 837. You aren’t sure why, but for some reason your friend has an obsession with this number. You have noticed a pattern, and you can use it to make your job a whole lot easier.

Instead of having to work your way through every single number, you could speed things up dramatically by only guessing numbers that end in 837:

837, 1837, 2837, 3837, 4837, etc.

Because you figured out the pattern, it should only take you an average of 500 guesses to figure out the number that your friend is thinking of.

The security of keys works exactly the same way. If all of the keys from a given system follow a predictable pattern, it is much easier to figure out what the key actually is.
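
The arithmetic of this pattern attack can be sketched in a few lines of Python (the numbers are just the example’s, nothing cryptographic):

```python
# Keys uniform in [1, 1_000_000]: an attacker expects ~500,000 guesses.
# If every number ends in 837, only 1,000 candidates remain.
candidates = list(range(837, 1_000_001, 1000))  # 837, 1837, 2837, ...
print(len(candidates))  # 1000 candidates instead of 1,000,000

# Guessing them in order, the average number of guesses needed is:
average_guesses = sum(range(1, len(candidates) + 1)) / len(candidates)
print(average_guesses)  # 500.5, in line with "about 500 guesses"
```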

Uniform distribution

It’s not just patterns that we have to worry about. We also have to worry about whether keys are from a uniform distribution. For example, let’s say you have a key generator that is supposed to give you random 128-bit keys. We will use decimal numbers to explain this, to avoid complicating things any further than necessary.

In decimal, the largest 128-bit number is:

340,282,366,920,938,463,463,374,607,431,768,211,455

If the key generator was truly random and had a uniform distribution, you would expect it to be equally likely for the key to be any single number between 0 and 340,282,366,920,938,463,463,374,607,431,768,211,455.

But let’s say there is a bug in the key generator. You run it a few times, and these are the keys that it outputs:

  • 8722
  • 9345
  • 0497
  • 1438
  • 7506
  • 4379
  • 3984

What’s happened? These numbers are seemingly random, but they certainly aren’t uniform and evenly spread out between 0 and 340,282,366,920,938,463,463,374,607,431,768,211,455. They are all clustering under 10,000 because of the bug.

If an attacker figured out that the key generator used in the cryptosystem was faulty and only output keys in such a limited range, it would make their job much easier. Due to the limited distribution of the keys, the 128-bit key would be trivial to guess.

It would take an average of only 5,000 guesses, instead of the trillions and trillions that a truly random and uniform 128-bit key would require. This is why uniformity is another important aspect of key generation.
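
The contrast between a healthy generator and the buggy one above can be sketched with the standard `secrets` module; the 100-sample sizes here are arbitrary, just enough to show the clustering:

```python
import secrets

MAX_128 = 2**128  # keys are drawn from [0, 2**128 - 1]

# A correct generator samples uniformly across the whole range
good_keys = [secrets.randbelow(MAX_128) for _ in range(100)]

# A buggy generator like the article's only ever outputs values below 10,000
buggy_keys = [secrets.randbelow(10_000) for _ in range(100)]

# Every buggy key sits inside a trivially searchable range...
assert all(k < 10_000 for k in buggy_keys)
# ...while a uniform 128-bit key essentially never will
assert any(k >= 10_000 for k in good_keys)
```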

There should be no observable relationship between different keys

One threat against cryptosystems that we need to consider is known as a related-key attack. In this scenario, an attacker is able to observe the operation of an encryption algorithm as it processes data using a number of different keys.

At the start of the observation, the attacker does not know the values for any of the keys, but they do know that there is a relationship between them.

One example is WEP, a flawed protocol you might have encountered when setting up a Wi-Fi router. It uses a single master key for all users of a WLAN. When it encrypts data with RC4, it uses this key plus a 24-bit initialization vector as the per-packet encryption key. This means that there is a relationship between the keys.

The reuse of the key, plus the relatively short initialization vector, means that an attacker can intercept data packets for a relatively short period of time and then exploit the known properties of the RC4 key schedule to recover the key.

The reuse of similar keys ultimately means that attackers can easily recover the plaintext, completely undoing any security purportedly provided by WEP. This is just one example of why our cryptosystems must never reuse keys that have observable relationships between them.

Keys vs passwords


Passwords are important parts of our security systems.

Keys can seem a lot like passwords at the surface level, and there are indeed a lot of similarities between the two. However, it is important to understand the differences, and the separate roles that each plays in our cryptosystems.

Like keys, passwords are also pieces of data that need to be kept secret from unauthorized parties. They play vital parts in keeping our data safe as well. However, passwords were originally intended to be memorized by humans, and they are mostly used as a way for users to prove their identities.

When logging in to a system, a user can claim any username, because usernames are not required to be private. Access control systems use passwords to sort legitimate users from attackers and impostors. In essence, if a person tries to log in as user123, the system asks them to prove it:

“If you’re really user123, then prove it! Tell me the secret piece of information that only user123 knows.”

The person then types in their password. If it matches the information that the system has on file for user123, then it assumes that the person must be the legitimate user123 and grants them access. It’s not a perfect system, but it more or less works.

If passwords are supposed to be secret information that we use to prove our identities, then we need a way to store them that limits the risk of attackers being able to uncover them. Traditionally, this was best done by keeping them in our heads, and maybe having a backup copy locked in a safe at home.

This brings us to the first major point of differentiation between keys and passwords:

Passwords needed to be memorable

Historically, people needed to be able to memorize their passwords in order to keep them secure, yet easily accessible. As humans, we are great at memorizing patterns and words, but we are also lazy. This leads to us creating passwords that often aren’t long enough, nor completely random strings of data with uniform distributions.

People tend to choose things like hunter2, or the name of their favorite sports team plus their birthdate (note that these are not good passwords) rather than Zq4t7w!z%C^F-J@N. These are far easier to remember than random strings.

The result is that most passwords do not have the length or randomness that are necessary for them to be appropriate keys, nor are they taken from a uniform distribution. If we were to use them as keys, it would make our systems vulnerable to attacks.

In recent years, things have changed a little bit with the rise of password managers. These allow users to remember just one master password, which then controls access to all of their other passwords. Since only one password needs to be remembered, this makes it possible for all of the others to be random and from uniform distributions—users don’t have to keep them locked away in their brains.

There are free open-source password managers available, such as KeePass, as well as numerous easy-to-use paid options.

Nevertheless, many people have still not adopted password managers, so this hasn’t caused any major changes to the overall structure of security systems.

How can we turn passwords into keys?

We have stated that we need keys to act as secure inputs to our cryptographic systems, but that people tend to remember and input passwords.

This brings us to a major problem: How can we turn human-memorable passwords into keys that our cryptosystems can use?

With key derivation functions (KDFs).

What is a key derivation function (KDF)?

In the most general sense, a key derivation function (KDF) takes an input, runs it through a special function, and then outputs secure keying material. The input may be a password, or other weak keying material.

What are key derivation functions (KDFs) used for?

Key derivation functions can actually do a range of things, including:

  • Turning passwords and other weak sources of keying material into strong keys. This is known as key stretching.
  • Generating numerous keys from a single source of keying material. This allows us to use separate keys for each aspect of a cryptosystem.
  • Securely storing passwords to keep them safe from hackers.
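
As a rough sketch of the second use, deriving several independent keys from one secret, a common pattern is to MAC distinct labels with the master secret. This is a simplified, HKDF-style illustration using only the standard library; the labels and the all-zero master secret are placeholders, not a full standard implementation:

```python
import hashlib
import hmac

master = b"\x00" * 32  # stand-in for strong master keying material

# Distinct labels yield independent-looking subkeys from the same secret
enc_key = hmac.new(master, b"encryption", hashlib.sha256).digest()
mac_key = hmac.new(master, b"authentication", hashlib.sha256).digest()

assert enc_key != mac_key   # separate keys for separate purposes
assert len(enc_key) == 32   # 256-bit subkeys
```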

How do password-based key derivation functions (PBKDFs) work?

There are a number of different password-based key derivation functions, but we will demonstrate how they work by discussing PBKDF2:

How does PBKDF2 work?

As you can probably guess, PBKDF2 stands for Password-Based Key Derivation Function 2. PBKDF2 was published in 2000 as part of RSA Laboratories’ series of Public-Key Cryptography Standards (PKCS); specifically, PKCS #5 lays out how password-based cryptography should be implemented. PBKDF2 replaced PBKDF1, which can only produce keys up to the length of its underlying hash (at most 160 bits), which isn’t large enough to be secure in the current environment.

One of the aims of a password-based key derivation function is to turn a password, a secret with low entropy, into a key of an appropriate length and sufficient randomness from a uniform distribution.

PBKDF2 does this by taking four inputs:

  • A password — This is the user’s password that needs to be turned into a strong key. Of course, the password needs to be kept secret.
  • A salt — A salt is essentially an additional random input of data that is processed by the KDF alongside the password. Salts do not have to be secret, and they are stored alongside password hashes.
  • An iteration count — The iteration count sets out how many times the values will be run through the function.
  • The derived key length — This is the length that the implementer wants the output, the strong key, to be.

PBKDF2 puts the password and the salt through a pseudo-random function a set number of times, according to the value for iteration count. The final output is a strong key of the length that was specified in the derived key length input. This key can then be used to secure various aspects of a cryptosystem.
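
Python’s standard library exposes exactly these four inputs (plus the choice of PRF) through `hashlib.pbkdf2_hmac`. The password and salt values here are illustrative placeholders:

```python
import hashlib

# The four PBKDF2 inputs: password, salt, iteration count, derived key length
key = hashlib.pbkdf2_hmac(
    "sha256",                          # the PRF: HMAC-SHA-256
    b"correct horse battery staple",   # the secret password
    b"a non-secret random salt",       # the salt
    310_000,                           # the iteration count
    dklen=32,                          # derived key length: 32 bytes = 256 bits
)
print(key.hex())
```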

PBKDF2 can use a range of different pseudo-random functions. These include:

  • HMAC-SHA-256
  • HMAC-SHA-512
  • HMAC-SHA-1, which can still be used for password hashing, although it needs a much higher iteration count in order to be secure. SHA-1 is no longer considered secure in most other applications.

Now that we have outlined the overall design of PBKDF2, let’s take a step back and examine some of its components in more detail.

What is a pseudo-random function (PRF)?

In a general sense, a pseudo-random function is a function that takes a secret random seed and a data variable as its inputs. It outputs a value that is computationally indistinguishable from a truly random output.

PBKDF2 uses hash-based message authentication code (HMAC) functions as its PRF. HMACs were originally designed for message authentication, but they can be easily adapted to fit the role of PRFs.

What are hash-based message authentication codes (HMACs)?

Hash-based message authentication code (HMAC) functions are built on top of cryptographic hash functions such as SHA-256, SHA-512, and SHA-1.

In more traditional applications, HMAC functions take a key as the random seed and the text of a message as the data variable. They output a message authentication code (MAC), which is used for authenticating messages, as you can probably guess from its name.

The initial key is first turned into a block-sized key, which is then used as input into the HMAC twice. The first time, it is XORed with the inner padding and concatenated to the text of the message (concatenation basically means that one value is appended to the other, forming a larger string composed of one value after the other). This string is then run through a hash, giving us the first output.

The key is then XORed with the outer padding, and the result is concatenated to the output from the prior operation. This string is then hashed, resulting in the final output, the message authentication code (MAC).

The hash function used in the overall HMAC function can vary, but in HMAC-SHA-256, it is SHA-256. In this case, SHA-256 is used for both of the HMAC function’s hashing operations.

In mathematical notation, it looks like this:

HMAC = H (K XOR opad ‖ H(K XOR ipad ‖ text))

Where:

  • HMAC is the resulting hash-based message authentication code.
  • H is the hash function. In HMAC-SHA-256, this would be SHA-256.
  • K is a block-sized version of the initial key.
  • XOR represents an XOR operation.
  • opad represents outer padding.
  • ‖ symbolizes concatenation.
  • ipad represents inner padding.
  • text represents the message text.
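
The construction above is short enough to write out directly. This sketch follows the formula step by step and checks itself against Python’s standard `hmac` module:

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, text: bytes) -> bytes:
    block_size = 64                                # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()         # overly long keys are hashed first
    key = key.ljust(block_size, b"\x00")           # K: block-sized version of the key
    ipad = bytes(b ^ 0x36 for b in key)            # K XOR ipad
    opad = bytes(b ^ 0x5C for b in key)            # K XOR opad
    inner = hashlib.sha256(ipad + text).digest()   # H(K XOR ipad ‖ text)
    return hashlib.sha256(opad + inner).digest()   # H(K XOR opad ‖ inner)

# Sanity check against the standard library
expected = hmac.new(b"key", b"message", hashlib.sha256).digest()
assert hmac_sha256(b"key", b"message") == expected
```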

How are hash-based message authentication code (HMAC) functions used in PBKDF2?

You may have noticed that a normal HMAC kind of operates in a loop. The key, XORed with one type of padding, is concatenated with the text, then this string is hashed:

H(K XOR ipad ‖ text)

The key, XORed with another type of padding, is then concatenated with the output from the above function, and hashed again:

H(K XOR opad ‖ First Output)

Normally, an HMAC function would stop there, and we would have a message authentication code, which we could use to authenticate the message.

But what if we just kept going in this loop (dropping the padding here to keep things simple)?

H(K ‖ text)


H(K ‖ First Output)


H(K ‖ Second Output)


H(K ‖ Third Output)


H(K ‖ Fourth Output)

And so on…

This is exactly what PBKDF2 does with the HMACs, and this is where the iteration count comes in: it specifies how many times we should go through this process. OWASP currently recommends 310,000 iterations of HMAC-SHA-256 for secure password hashing. This means that the outputs need to be hashed again and again in this loop 310,000 times.

The aim of this is to slow down hackers attempting to figure out passwords, but we will cover this in more depth later on.

In PBKDF2, the user’s password acts as the seed, while the salt functions as the text. So the first iteration of PBKDF2-HMAC-SHA-256 would look like this:

HMAC = H (Password XOR opad ‖ H(Password XOR ipad ‖ Salt))

We mentioned that PBKDF2 has four inputs. Let’s choose some values for them:

  • The password (P) — PASSWORD123
  • The salt (S) — 12345678
  • The iteration count (c) — 310,000
  • The derived key length (dkLen) — 256 bits, which is 32 bytes

The derived key length and the hash output have the same length (256 bits), which makes things easier for us. If the derived key needed to be longer than the hash length, we would have to repeat the entire process we are about to describe for each of the hash blocks, and then concatenate each of these individual outputs together to get our final key output from the function.

The PBKDF2 function can be written like this:

F (P, S, c, i) = U_1 XOR U_2 XOR … XOR U_c

where:

U_1 = PRF (P, S || INT (i)) ,

U_2 = PRF (P, U_1) ,

…

U_c = PRF (P, U_{c-1}) .

In our case:

  • U_x stands for the output of the underlying pseudorandom function at iteration x. U_1 is the result of PRF (P, S || INT (i)).
  • PRF is HMAC-SHA-256
  • P is PASSWORD123
  • S is 12345678
  • INT (i) is 1. It encodes the block index as a four-byte integer, and since the derived key length matches the hash length, there is only a single block. The precise details are in the RFC.
  • c is 310,000, because the iteration count is 310,000.


U_1 = HMAC-SHA-256 (PASSWORD123, 12345678 || 1)

U_2 = HMAC-SHA-256 (PASSWORD123, U_1)

U_3 = HMAC-SHA-256 (PASSWORD123, U_2)

…

U_310,000 = HMAC-SHA-256 (PASSWORD123, U_309,999)

Because there is only one block to process in our simple example, the final derived key is the XOR of all of these outputs, U_1 through U_310,000. If there were multiple blocks, the end results of each block would need to be concatenated together to get the final key.

The end result is a 256-bit key (no one wants to work through all 310,000 iterations by hand).
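
Fortunately, a computer can. The sketch below implements the single-block case directly from the formula (the U values XORed together), checks itself against the standard library, and then computes the actual key for our example inputs at the full iteration count:

```python
import hashlib
import hmac

def pbkdf2_one_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Single-block PBKDF2-HMAC-SHA-256: F = U_1 XOR U_2 XOR ... XOR U_c."""
    # U_1 = PRF(P, S || INT(1)): the block index 1 as a four-byte integer
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    result = u
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()  # U_i = PRF(P, U_{i-1})
        result = bytes(a ^ b for a, b in zip(result, u))    # XOR into the running result
    return result

# Our sketch agrees with the standard library (1,000 iterations to keep it quick)
assert pbkdf2_one_block(b"PASSWORD123", b"12345678", 1000) == \
       hashlib.pbkdf2_hmac("sha256", b"PASSWORD123", b"12345678", 1000)

# The example's actual 256-bit key, at the full 310,000 iterations
key = hashlib.pbkdf2_hmac("sha256", b"PASSWORD123", b"12345678", 310_000)
print(key.hex())
```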


How does turning passwords into keys strengthen our security?

We mentioned that one of the main uses of KDFs is to turn passwords and other weak sources of keying material into strong keys. This is known as key stretching. But how does key stretching make our systems more secure? What role do key derivation functions like PBKDF2 play in it?

To demonstrate how password-based key derivation functions can strengthen cryptosystems, let’s compare a system that doesn’t implement one against a system that does.

In both cases, we have a user, Alice, who has a weak password, Rabbits192. Although Rabbits192 is definitely a weak password, it’s not unusually weak. Many users would default to using passwords that are this simple unless they are forced not to. Our password tester says it would take about two weeks to crack this with a computer.

If PBKDF2 wasn’t being used, an attacker would probably use a dictionary attack to cycle through the most likely password combinations until they arrived at Rabbits192. After running the attack for two weeks, it’s likely that they would crack the password and be able to gain full control of Alice’s account.

Now, let’s run through the same thing with a system that implements PBKDF2 with HMAC-SHA-256 to stretch the password into a stronger key. In this system, Alice inputs her password, Rabbits192, and the system runs it through the many iterations of hashing, spitting out a 256-bit hash.


This time, the hacker has two options if they want to gain access to the system:

Attempting to crack the enhanced password

They could attempt to crack this new enhanced password. However, this new password is random and from a uniform distribution, which means that dictionary attacks would not be effective. Instead, the hacker would be stuck with brute forcing it, attempting every single character combination until they eventually arrived at the key. Since it’s 256 bits long, this would take them a very long time.

Our password strength testing tool estimates that it would take “3,760 septuagintillion years” to crack the password. This is long after the heat death of the universe, so our hacker is going to get very bored waiting around for that long. In this case, PBKDF2 would make it infeasible to crack the password.

Attempting to crack the original password

The attacker’s other option is to try and crack the original password. But now that PBKDF2 has been implemented, it’s a completely different game.

A terribly insecure system

Under a terribly insecure password system, when someone inputs a password attempt, the system would simply look in its database and see whether the attempt matches the password that it has on record. This would happen almost instantly.

SHA-256 only

A significant step up from this would be if the database only stored an SHA-256 hash of the password, rather than the password itself. In this case, when a person inputs the password, the system would first hash it, then check its database to see if this hash matched the hash that it had on record.

Because the system also has to hash the password before looking it up, this takes a tiny fraction of a second longer. Note that this is still considered insecure, but just not as insecure as the first password authentication system.

Implementing PBKDF2

Now, let’s compare these earlier two options with a system that implements PBKDF2. Under this system, when a person enters a password attempt, it doesn’t just get hashed through SHA-256 once, adding that tiny bit of extra time.

Instead, as we discussed earlier, PBKDF2 involves hashing the password with the salt, then hashing that result with the password, then hashing that result with the password and so on.

The exact number of times is dependent on the iteration count that has been chosen for the implementation. Each additional iteration takes a tiny bit longer. In our example from above, there were 310,000 iterations, meaning that the result was hashed along with the password again and again, 310,000 times in total.

Why is this significant?

Put yourself in the shoes of the attacker. They are trying to find the correct password by guessing possible combinations. Under the second password authentication system, each guess would only take a tiny fraction of a second because they would only have to compute an SHA-256 hash once.

With an implementation of PBKDF2 running 310,000 iterations of HMAC-SHA-256, you are essentially forcing them to do 310,000 times more work for every single password guess. This also drives up the cost by a multiple of 310,000.

The result of implementing PBKDF2 is that it becomes too expensive to crack a lot of passwords that were susceptible to being cracked without it. By making it substantially harder to crack the passwords, PBKDF2 can help to keep a lot of users safe.

All of these extra iterations of SHA-256 do have an impact on users, but it isn’t really noticeable. We log in relatively infrequently, and most of us only ever make a few attempts before we get our passwords right.

Even if PBKDF2 changes the login time for a user from near-instant to about a second, it has a negligible effect on them. But making a hacker’s work 310,000 times harder has a huge impact. This trade-off is well worthwhile because the security gains are huge.
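
You can observe this trade-off directly. This rough benchmark compares one bare SHA-256 hash against a full PBKDF2 run; absolute timings will vary by machine, but the gap is always enormous:

```python
import hashlib
import time

password = b"Rabbits192"

start = time.perf_counter()
hashlib.sha256(password).digest()      # one bare hash: microseconds
single = time.perf_counter() - start

start = time.perf_counter()
hashlib.pbkdf2_hmac("sha256", password, b"somesalt", 310_000)  # 310,000 iterations
stretched = time.perf_counter() - start

print(f"single SHA-256: {single:.6f}s, PBKDF2 x310,000: {stretched:.3f}s")
assert stretched > single  # every password guess costs the attacker this much more
```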

What is salting and what role does it play in key derivation functions (KDFs)?

Now that we have explained how pseudo-random functions like HMAC-SHA-256 work in PBKDF2, let’s examine the role of salting. A salt is a non-secret value that ultimately gets stored alongside the final password hash.

As you saw, the salt was added as an input in the first iteration of PBKDF2’s HMAC-SHA-256 function:

U_1 = HMAC-SHA-256 (PASSWORD123, 12345678 || 1)

In our example, the salt was 12345678.

To understand what the salt actually does for us, we need to discuss how passwords are stored and how they can be cracked.

How are passwords stored?

If you remember what we mentioned earlier, organizations should never store their users’ passwords in plaintext. Instead, the system should store a hash of the password. The system works a little like this:

  • A user types in their password to log in.
  • The system immediately hashes the password and never stores it.
  • The system then searches through its database to verify whether this password hash matches the password hash that it has on file for that user. If the two match, the user is granted entry.

The benefit of this system is that password hashes cannot be used to directly breach user accounts. If an organization suffers a data breach, it is much harder for hackers to compromise user accounts when they only have password hashes as opposed to passwords in plaintext.

Let’s give you a more concrete example: Alice has an account with a company, and her password is Rabbit9 (we have made her password even weaker than last time to make the hacker’s job easier).

When she sets up her account, the company hashes the password she enters with the SHA-256 hash function (this isn’t a good security practice, it’s just a simplified example) and keeps this value in its database so that it can verify her password in the future.

A simple SHA-256 hash for Rabbit9 would be:


So the company would have the above hash in its database, and whenever Alice goes to log in, she would type Rabbit9 into the password field. The company would immediately hash it, which would result in the same value. It would then compare this result with the hash it has in its database. Since the two match, it would grant her access.
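
This flow is short enough to sketch in full. Note that plain, unsalted SHA-256 appears here only because that is the simplified scheme in this example, not a recommendation:

```python
import hashlib

# What the company stores when Alice registers: the hash, never the password
stored_hash = hashlib.sha256(b"Rabbit9").hexdigest()

def login(attempt: bytes) -> bool:
    # Hash the attempt and compare it to the hash on file
    return hashlib.sha256(attempt).hexdigest() == stored_hash

assert login(b"Rabbit9")                # Alice gets in
assert not login(stored_hash.encode())  # the stolen hash itself does not work
```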

Now, let’s say that the company had a data breach, and a hacker found Alice’s username and password hash in the breach. If the attacker tried to log in with:


The company’s system would immediately hash the value, resulting in:


The company’s system would then take this value and compare it to the hash stored in its database. Since the two don’t match, the attacker would not be granted access.

Because authentication systems are set up like this, password hashes do not allow attackers to use them to gain direct access. Instead, attackers need to find some way to figure out the original password from the hash. But one of the key properties of hash functions is that they are supposed to be one-way. So if a strong hash function has been used, it’s not feasible to simply turn the hash back into the password.

But this doesn’t mean that password hashes are completely useless to attackers.

Introducing rainbow tables


Unfortunately, there aren’t a lot of rainbows involved in rainbow tables.

One option is to build rainbow tables. These tables are constructed by running common passwords through hash functions and then storing the results in a big table.

In relative terms, hashing can take a long time for computers to perform. Lookups can be done much faster. This means that if you already have a pre-computed rainbow table of matching passwords and password hashes, you can save yourself the time it takes to compute all of those hashes.

If an attacker comes across a data breach that includes password hashes of users, they could compare these hashes against the password hashes in their rainbow table.

If they find matches between the hashes from the data breach and their rainbow table, they could look up the corresponding password for these users in their rainbow table. This gives them an efficient way to find the passwords for users who use relatively common passwords.

Now, we said it’s only practical to compute rainbow tables for relatively common passwords. This is because although rainbow tables can save you a lot of time, there is a trade-off: they require a lot of storage space. This limits how big a rainbow table can be, and it’s why they are only effective against more common password combinations.

Salting protects users

Let’s go back to Alice and her weak password, Rabbit9. Rabbit9 is simple enough that many hackers would have it in their pre-computed rainbow tables. They may already know that the SHA-256 hash for Rabbit9 is:


This means that when the company suffers a data breach and its database of password hashes is compromised, Alice’s password hash will match up with the hash in the hacker’s rainbow table. The hacker can then look up the corresponding password in the rainbow table, and they can easily take over her account.

One way to protect users from rainbow table attacks is through salting. Let’s simplify things again, and say that instead of simply hashing Rabbit9 with SHA-256, the system first appends a salt to it, and then hashes the combination. The salt is a non-secret value. Let’s say that in this case, the salt is the eight-digit string 95056398. This gives us:

Rabbit995056398

When you put the password plus the salt through SHA-256, you get a completely different hash as the result:


How does this change things from the user’s perspective?

It doesn’t. The system simply saves the salts alongside the password hashes. Whenever a user types in their password to log in, the system retrieves the salt, appends it to their password and then hashes the two together. The system then compares the result to the password hash on file. If the two match, it grants the user access.
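
A minimal sketch of this salted flow, using the example’s salt of 95056398 (again, a single SHA-256 pass is the article’s simplification, not a recommended scheme):

```python
import hashlib

salt = b"95056398"  # non-secret, stored next to the hash

def hash_password(password: bytes) -> str:
    # Append the salt before hashing, as in the simplified example above
    return hashlib.sha256(password + salt).hexdigest()

stored_hash = hash_password(b"Rabbit9")

# The salted hash no longer matches a plain, pre-computed hash of Rabbit9,
# so a rainbow table entry for Rabbit9 is useless
assert stored_hash != hashlib.sha256(b"Rabbit9").hexdigest()

# The login check itself is unchanged from the user's perspective
assert hash_password(b"Rabbit9") == stored_hash
```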

It’s from the attacker’s perspective that we see a big difference. Why? Because it makes rainbow table attacks unfeasible. As we stated, rainbow tables speed things up at the expense of taking up space. The longer and more complex a password is, the larger the rainbow table would have to be in order for its hash to be included on the table.

Anthony Ferrara did some of the calculations on his blog:

  • A four character password — 35,153,041 unique combinations, which would require a 913MB rainbow table. Any of us could store this on our computers. It’s the size of a low quality movie.
  • A five character password — 2,706,784,157 unique combinations, which would require a 70GB rainbow table. Many phones have this much storage. Some have a lot more.
  • A six character password — 208,422,380,089 unique combinations, which would require a 5.4TB rainbow table. Depending on your country, you can probably buy a hard drive bigger than this for a few hundred dollars.
  • A seven character password — 16,048,523,266,853 unique combinations, which would require a 417TB rainbow table. At this stage, we are getting beyond the reach of a lone hacker, but many hacking groups or nation states would easily have this much available storage.
  • An eight character password — 1,235,736,291,547,681 unique combinations, which would require a 32PB rainbow table. One petabyte (PB) is 1,024 terabytes. This is a lot of storage. An adversary would need to have an exceptional amount of resources and incredible determination to build a rainbow table this large.
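
Incidentally, the figures above are internally consistent: each count is exactly 77 times the previous one, which suggests a 77-character alphabet (that alphabet size is our inference from the numbers, not something stated in the post). A couple of lines of Python reproduce them:

```python
# 77-character alphabet, inferred from the ratios between the counts above.
ALPHABET_SIZE = 77

for length in range(4, 9):
    print(f"{length} characters: {ALPHABET_SIZE ** length:,} combinations")
```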

Even at just seven characters, it’s unfeasible for most solo hackers to build a rainbow table covering every possible password of that length. That doesn’t mean every longer password is safe, though—longer passwords that are common may still show up in rainbow tables.

With these figures in mind, the hash for Rabbit9 is very likely to be in a hacker’s rainbow table. The hash for Rabbit995056398 is not, because this is a long and far less common password, and it would simply cost too much money to store uncommon passwords of this length.

The concept behind this is that adding a salt to a password radically expands the size of the rainbow table a hacker would need in order to match up the password hashes. It doesn’t even matter if the hacker knows the salt, so the salt doesn’t need to be a secret value.

In this way, the salt gives users with weak passwords much more protection, without users actually having to do anything. Note that this is not a recommendation to use weak passwords because the system will compensate. Instead, it’s a practice of defense-in-depth — a cybersecurity strategy and approach that involves implementing multiple layers of security measures.

What role does the iteration count play in key derivation functions (KDFs)?

The iteration count plays another vital role in the security of key derivation functions (KDFs). It allows an implementer to specify the number of times that a password and a salt will be hashed. Each additional hashing iteration makes the process of key derivation take slightly longer and requires slightly more computing power.

The genius of PBKDF2 is that the iterations can be set on a sliding scale. With just a few iterations, the algorithm runs incredibly fast. The more iterations an implementer sets, the longer the key takes to compute. OWASP currently recommends 310,000 iterations of SHA-256 when PBKDF2 is used for password hashing purposes.

We increase the iteration count to deliberately slow down the algorithm. Since legitimate users only ever need to run through the process a few times at most, this has a limited effect on them.

It’s hackers who are running billions and billions of password guesses that are impacted significantly. Since they need to make so many attempts in order to crack passwords, increasing the iteration count by a factor of 100 also increases their costs 100 times. Adjusting the iteration count upward can make it unfeasible to crack many passwords that would have been easy to crack at a lower count.

When PBKDF2 was first released, the recommended iteration count was just 1,000. However, this number has increased dramatically over the years as technology has become more powerful. Hackers can now access equipment that makes 1,000 iterations trivial.
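
PBKDF2 with a configurable iteration count is available directly in Python’s standard library; here is a minimal sketch using the OWASP figure quoted above:

```python
import hashlib
import os

password = b"Rabbit9"
salt = os.urandom(16)     # random, non-secret, stored alongside the hash
iterations = 310_000      # OWASP's recommendation for PBKDF2-HMAC-SHA256

# Each extra iteration adds a little work to a legitimate login,
# but multiplies an attacker's cost across billions of guesses.
key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
print(key.hex())
```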

What is the derived key length?

The derived key length input allows an implementer to specify how long their desired output will be. This is critical when key derivation functions are being used to turn passwords into strong keys, because a given cryptosystem will need a key of a specific size.

If the cryptosystem uses 128-bit AES for symmetric-key encryption, the key derivation function’s output needs to be 128 bits long. 256-bit AES would require a 256-bit key, and so on.
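
In Python’s `hashlib.pbkdf2_hmac`, the `dklen` parameter plays this role; the sketch below derives keys sized for AES-128 and AES-256 (the password and fixed salt are illustrative only):

```python
import hashlib

password = b"Rabbit9"
salt = b"95056398"  # fixed here only so the example is reproducible

# dklen sets the derived key length: 16 bytes for AES-128, 32 for AES-256.
aes128_key = hashlib.pbkdf2_hmac("sha256", password, salt, 310_000, dklen=16)
aes256_key = hashlib.pbkdf2_hmac("sha256", password, salt, 310_000, dklen=32)

print(len(aes128_key), len(aes256_key))  # 16 32
```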

The derived key length input of PBKDF2 and similar KDFs allows implementers to set the key length according to the needs of the cryptosystem. This is one reason why bcrypt is unsuitable for deriving keys from passwords. It’s a password hashing scheme that does not allow the implementer to select the output length.

When a key derivation function is being used for password hashing, having an adjustable output for the key length isn’t necessary. This is why some KDFs can be used for both key derivation and password hashing, but not all password hashing schemes are suitable for key derivation.

What is a hash-based key derivation function (HKDF)?

The hash-based key derivation function (HKDF) can take input keying material (IKM) and turn it into strong output keying material (OKM). It is based on hash-based message authentication codes (HMACs), and is designed to be a fast mechanism for producing strong keys.

The hash-based key derivation function (HKDF) accomplishes this through an extract-then-expand approach, which has two phases:

  • Extract — This phase takes imperfect sources of initial keying material and condenses them into a single strong, uniformly random key. Examples of imperfect keying material include sources that aren’t sufficiently random, and those that do not come from a uniform distribution.
  • Expand — The expansion module allows us to produce multiple strong keys from a single source of strong keying material.
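
The two phases are simple enough to sketch with Python’s standard hmac module, following the construction in RFC 5869 (the salt, input material, and info labels below are illustrative):

```python
import hashlib
import hmac

HASH, HASH_LEN = hashlib.sha256, 32

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # Extract: condense possibly non-uniform input keying material
    # into one fixed-length pseudorandom key (PRK).
    return hmac.new(salt or b"\x00" * HASH_LEN, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # Expand: stretch the PRK into `length` bytes of output keying
    # material, using `info` to bind each key to its purpose.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]

prk = hkdf_extract(b"some salt", b"imperfect input keying material")
enc_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"authentication", 32)
```

Because the `info` labels differ, the encryption and authentication keys are independent, even though both come from the same extracted key.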

These two modules allow us to:

  • Take a source of weak material and produce a strong key from it.
  • Take a strong key and produce multiple strong keys from it.
  • Take a weak source of keying material and produce multiple strong keys from it.

The ability to produce multiple strong keys is incredibly important for our cryptosystems, because keys should never be reused for different purposes. A tool like HKDF gives us an easy way to produce separate keys for processes such as:

  • Encryption
  • Authentication
  • Integrity
  • Key wrapping
  • Random bit generation

We need to use separate keys for each aspect of our cryptosystem, because it helps to limit the amount of damage that can be done if an attacker manages to compromise a key. If an attacker can compromise a key that controls everything, they can devastate the security of the communication channel. They could read all of the messages and even insert themselves in the middle of the conversation and produce fraudulent messages — known as a man-in-the-middle (MitM) attack.

When separate keys are used for each purpose, the potential fallout can be less severe. If an attacker compromises a key that only controls encryption, the confidentiality of the communication channel will be breached, but the integrity and the authenticity of the channel will remain intact. It’s still a big problem, but it’s not as bad as a completely compromised channel.

As we stated, the hash-based key derivation function is designed to be fast. This makes it unsuitable for deriving keys from passwords or for storing passwords.

If HKDF were implemented in these applications, its speed would mean that an attacker could cycle through many more password attempts per second than they could with a secure password hashing scheme. This would make it far easier to crack passwords, which is why HKDF should only be implemented for its intended uses.

The different requirements of key derivation functions and password hashing schemes

Now that we have gone through PBKDF2 as an example of a password-based key derivation function, and covered each of the important elements in detail, we can take a step back and summarize our requirements for each of the varied use cases.

For deriving strong keys from passwords:

  • We want an algorithm that can take passwords as inputs and produce random, uniform keys. Putting the input through a large number of iterations of an HMAC function helps to stretch keys in this way.
  • We want to be able to set the derived key length.

For storing passwords:

  • We want the algorithm to be slow, to make a hacker’s job much harder. Key stretching through large numbers of iterations helps to achieve this.
  • We want to make rainbow table attacks impractical. This is why salting is so important.

For deriving strong keys from weaker sources of keying material that aren’t passwords:

  • We want to produce strong keys with sufficient entropy from a uniform distribution.
  • We want to be able to set the desired key length.
  • These weaker sources of keying material still have much more entropy than many passwords do. In this scenario, it’s also unnecessary to have a deliberately slow function, like in PBKDFs. Instead, efficiency is more desirable.

For deriving multiple strong keys from a single source of strong keying material:

  • Each of these keys must be strong and independent from one another so that if an attacker compromises one, it does not lead to the compromise of all of the keys used in other aspects of the cryptosystem.
  • We want to be able to set the desired key length.
  • Because there are no passwords involved and the initial keys already have a strong source of entropy, there is no need for large numbers of iterations to deliberately slow down attackers. Once again, efficiency is desired.

Common functions and their uses

Here is a list of some of the most common functions used for key derivation or password hashing:

  • Argon2 — This function was the winner of the Password Hashing Competition. Argon2 is a KDF that is designed to be memory intensive. This makes it suitable for both password hashing and key derivation. It takes on a password and a salt as primary inputs, as well as eight separate secondary inputs. There are several different versions, each with their own specific advantages:
    • Argon2d
    • Argon2i
    • Argon2id
  • scrypt — scrypt is another key derivation function that can be used for both storing passwords and deriving keys. It was also designed to be memory intensive in order to make password cracking much more expensive for attackers. Its inputs include the password, salt, cost factor, block-size factor, parallelization factor, the desired key length, and a couple of other parameters.
  • PBKDF2 — We have already discussed PBKDF2 in depth throughout this article. Just to reiterate, its inputs include the password, salt, iteration count and derived key length. Although it can be used for both deriving keys and password hashing, these days it is mainly used for password hashing because it is quite an old function. OWASP recommends 310,000 iterations of SHA-256 in PBKDF2 in order for its password hashes to have a safe security margin.
  • bcrypt — bcrypt is solely a password hashing scheme, rather than a KDF that can be used in both applications. It incorporates a salt and has an iteration count that can be varied in order to slow it down and make it more resistant to attacks. However, an implementer cannot adjust the output key length, which is what makes it unsuitable for deriving keys.
  • HKDF — HKDF is a key derivation function. However, it should not be used for deriving keys from passwords or for password storage, because it is too fast. Instead, it should be used for deriving strong keys from weaker input keying materials (not passwords), deriving multiple keys from strong input material, or both.
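
Several of these functions are exposed directly in Python’s standard library; for instance, scrypt’s cost parameters map onto `hashlib.scrypt` (the parameter values below are common illustrative choices, not a tuning recommendation):

```python
import hashlib
import os

salt = os.urandom(16)  # random, non-secret, stored alongside the derived key

key = hashlib.scrypt(
    b"Rabbit9",
    salt=salt,
    n=2**14,   # cost factor (must be a power of two)
    r=8,       # block-size factor
    p=1,       # parallelization factor
    dklen=32,  # desired key length in bytes
)
print(len(key))  # 32
```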

Implementing key derivation functions

As you might expect, each programming language has its own method for implementing key derivation functions (KDFs).

In Python, one of the most commonly used libraries for creating scripts to perform key derivation is the “cryptography” library. In Rust, you can implement a KDF using the “ring” crate cryptographic library, while in Ruby you’ll need the OpenSSL library. In Perl, a PBKDF2 algorithm is provided by the Crypt::PBKDF2 module. Those using C++ will need to download and install the Crypto++ library from its official website.