Effective ways to hash phone numbers?

Question

Suppose a company wants to implement 2FA for its users using a phone-number OTP system, but does not want to store their phone numbers, since the database could be breached and phone numbers are considered private.

What they intend to do is store some kind of hash of the phone number. When a user logs in, they provide their original phone number, which is then verified against the hash, and an OTP is sent if it matches.

A simple hash is definitely out of the question because of how ridiculously easy it is to crack 10-digit phone numbers.

Two possible solutions seem to be a slow hashing algorithm and salting.

I found that ProtonMail does something similar using salt rotation.

How would one implement something like this? Is it feasible? Is it possible to have some sort of protection even after a complete breach (including the salt)?

In most cases, asking the user to enter a number to receive an OTP is not required. You just send the OTP to whatever number the user has registered. You can encrypt the number in the database. – defalt, Dec 01 '21 at 08:48
How would the server know which phone number is supposed to receive the OTP if only hashes are stored? – Beltway, Dec 01 '21 at 08:50
Well, using a modern hashing function seems to be what you're looking for (see Argon2 for instance). – Lou_is, Dec 01 '21 at 08:52
@Beltway The user has to enter their phone number; the server checks it against the hash to confirm it is in fact their number and then sends the OTP. The purpose is that we don't want to store raw phone numbers, in case of a breach. – Abhishek Choudhary, Dec 01 '21 at 09:39
@defalt That can serve the purpose of stopping attackers if we keep the encryption key secure. Another purpose of hashing was to make it impossible even for the company to recover the phone numbers, only to verify them when required. I guess that would be difficult, as either way we have to keep something safe, either the salt or the encryption key. – Abhishek Choudhary, Dec 01 '21 at 09:42
You could hash beg instead of 123 to avoid a digit-only basis. – dandavis, Dec 01 '21 at 18:33
HMAC is the key; see "Is it easy to crack a hashed phone number?" on Cryptography. – kelalaka, Dec 01 '21 at 19:07
@kelalaka Yes, that would work, but you still have to keep the key secure in case of a breach. – Abhishek Choudhary, Dec 01 '21 at 20:32
You are sending OTPs to the number anyway, so sooner or later you will always know which number is associated with which user. – defalt, Dec 03 '21 at 07:24
@defalt Yes, it's not that the company is untrustworthy; they just don't intend to store phone numbers. – Abhishek Choudhary, Dec 03 '21 at 07:44

score 3 · Answer 1 · answered Dec 01 '21 at 10:42

You could use a slow salted hashing algorithm (such as bcrypt or Argon2id), and tune the work factor/parameters to make it even slower. It's not a huge keyspace (especially if you have a smart attacker who can exclude invalid ranges), but if calculating a hash takes (for instance) 500ms on your server CPU, then it will still take a long time for an attacker to crunch through the keyspace (even if they can go much faster than you can).

It will always be possible to crack the hashes with enough time and compute power, but given that they're just hashes of phone numbers, there's probably not much motivation for an attacker to devote significant resources to this. If they've got access to your database, there are probably other (better) ways to attack your users.
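As a rough illustration of the store-and-verify flow described above, here is a minimal sketch using only Python's standard library. It uses `hashlib.scrypt` as a stand-in for bcrypt or Argon2id (which need third-party packages); the cost parameters, the 16-byte salt, and the function names `hash_number` and `verify_number` are assumptions for illustration, not recommendations from the answer.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Assumed scrypt cost parameters (about 16 MiB of memory per hash).
# Raise n until one hash takes as long as you can tolerate on your server.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)


def hash_number(number: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(number.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_number(candidate: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the number the user typed and compare in constant time."""
    _, digest = hash_number(candidate, salt)
    return hmac.compare_digest(digest, stored_digest)
```

At registration the server would store the returned salt and digest next to the account; at login it would call `verify_number` with the number the user typed and only send the SMS when the result is true.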

A few other things that you need to bear in mind:

- SMS is considered a very weak form of MFA and has all kinds of security issues. If you care this much about security, you should be using something stronger like TOTP.
- Make sure you normalise the numbers before you hash them (to avoid inconsistencies with spaces).
- Remember that not everyone has a phone number of the same length or format.
- Make sure that you have appropriate protection to prevent this slow hashing from being used to DoS your application.
- For an added layer of protection, you could encrypt these hashes with a key stored outside of the database, so an attacker would need to compromise both your database and something else to get the key. This is a similar concept to peppering, but if you do it you need to think about things like key rotation.
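The normalisation and pepper points can be sketched as follows. This is a simplified illustration, not a complete phone-number parser: the default country code `"31"`, the accepted separators, and the 6 to 15 digit length check are all assumptions, and a dedicated library such as libphonenumber would handle real-world formats far better. Note also that the sketch applies the key as an HMAC pepper before hashing, which is the related peppering variant rather than the encrypt-the-hash step itself.

```python
import hashlib
import hmac
import re


def normalise_number(raw: str, default_country_code: str = "31") -> str:
    """Reduce a user-typed number to '+<digits>' so equal numbers hash equally."""
    cleaned = re.sub(r"[\s\-().]", "", raw)        # drop spaces and common separators
    if cleaned.startswith("00"):                    # international prefix written as 00
        cleaned = "+" + cleaned[2:]
    elif cleaned.startswith("0"):                   # national format with a trunk zero
        cleaned = "+" + default_country_code + cleaned[1:]
    if not re.fullmatch(r"\+\d{6,15}", cleaned):
        raise ValueError("not a plausible phone number")
    return cleaned


def pepper_number(normalised: str, pepper: bytes) -> bytes:
    """Keyed pre-hash; `pepper` is a secret kept outside the database."""
    return hmac.new(pepper, normalised.encode("utf-8"), hashlib.sha256).digest()
```

The output of `pepper_number` would then go into the slow salted hash in place of the raw number, so a database dump alone is not enough to start guessing.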

Gh0stFish

No, don't. You are slowing down an attacker, but making it impossible for the company to use the phone number to send an SMS. It makes the process of sending a message so difficult and time-consuming that it isn't worth it. – ThoriumBR Dec 01 '21 at 10:48
@ThoriumBR OP explains exactly how they would do this in the second paragraph of their post. – Gh0stFish Dec 01 '21 at 10:54
Just imagine bruteforcing the hash of the phone for an account every time that account logs in... – ThoriumBR Dec 01 '21 at 10:55
@ThoriumBR Why on earth would you do that? As OP says, you prompt the user for their phone number during the login process, hash it, compare it to the hash you have stored, and then send them an SMS if they match. – Gh0stFish Dec 01 '21 at 10:59
You wouldn't prompt the user for their phone number during 2FA either. The idea of 2FA with password and SMS is to combine something you know with something you have. Prompting for the phone number essentially just adds to the knowledge factor, even though we can't expect a phone number to be a secret. This is especially poor from a usability perspective, as it forces a second prompt on top of the OTP and might cause confusion (also keep in mind that a lot of people do not memorize their phone number, especially when they use several). – Beltway Dec 01 '21 at 11:11
@Beltway I agree that it's not very nice UX, and as I said in my answer, you shouldn't really be using SMS at all for MFA. But OP's question wasn't "How should I implement MFA?" or "How can I improve the UX of my login process?", it was how they could hash phone numbers and still use them for MFA, which is the question I answered. It's not how I'd design a login, but I don't know what other constraints OP has that might mean they can't use something better like TOTP. If you do, feel free to share. – Gh0stFish Dec 01 '21 at 11:18
Slow hashing seems to be the only solution then, plus salting on the assumption that the salt won't get breached. I agree it won't be great UX, but logging in isn't very frequent anyway. ProtonMail simply stores hashes unassociated with any user; they are only used to check for duplicates. – Abhishek Choudhary Dec 01 '21 at 11:40

Luc · Answer 2 · 2023-11-14T12:10:15.417

A simple hash is definitely out of the question because of how ridiculously easy it is to crack 10-digit phone numbers.

Two possible solutions seem to be a slow hashing algorithm and salting.

Salting

If you have the salt, then it's trivial again.
If you don't, and
- the salt was secure (e.g. 128 bits or more from a good random source), then cracking is impossible for any attacker and for the data owner alike. The website can no longer check whether you entered the correct number, in your example, so salt rotation is effectively the same as deleting the hash.
- the salt is not secure (4 bytes alphanumeric seems to be common), then it is the same as slow hashing. Instead of having a fixed number of rounds, you have to guess the salt. This, again, applies both to the data's legitimate owner and to any attacker from whom the salted hash should protect the phone number. I don't see an advantage over slow hashing.
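To make the "trivial again" case concrete, here is a small sketch of brute-forcing a fast salted hash when the salt is known. The construction `SHA-256(salt || number)`, the function names, and the idea of fixing a known prefix so that only a few digits are searched are all assumptions chosen to keep the demo quick; searching the full number range only multiplies the loop count.

```python
import hashlib
from typing import Optional


def salted_sha256(number: str, salt: bytes) -> bytes:
    """Fast salted hash, i.e. the kind of scheme that is easy to crack."""
    return hashlib.sha256(salt + number.encode("ascii")).digest()


def crack(target: bytes, salt: bytes, prefix: str, unknown_digits: int) -> Optional[str]:
    """Try every number made of `prefix` followed by `unknown_digits` digits."""
    for i in range(10 ** unknown_digits):
        candidate = f"{prefix}{i:0{unknown_digits}d}"
        if salted_sha256(candidate, salt) == target:
            return candidate
    return None
```

Because the salt is stored alongside the hash, it adds no work for the attacker here; it only stops one precomputed table from being reused across all records.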

Slow hashing

How slow is reasonable depends on the scenario. In the example from your question:

When user logs in, they provide their original phone number, which would later be used to verify against hash, and an OTP will be sent if it matches.

It might be possible to go pretty slow. If you don't need to log in multiple times a day, waiting 5 seconds for the slow hash isn't so bad. (In practice, very few of our customers find it worth it to keep a server busy and make the user wait for more than half a second, but let's use 5 seconds as a best-case scenario.)

Based on my own tests, a cracking station with GPUs gets about a 17× speedup compared to a server that performs Bcrypt hashing on a CPU. I suspect this is due to the higher memory bandwidth of a GPU. (For PBKDF2, the speedup is 1000×, but let's once again assume a best-case scenario where a good hashing algorithm was chosen.)

An attacker therefore needs to spend 5/17 ≈ 0.3 seconds to check whether a given hash matches a given phone number. Give it four days and they can check about 1.2 million phone numbers¹. Add a second GPU and it goes twice as fast. I don't know about other countries, but there are only 60 million Dutch mobile numbers in total.

Conclusion: hashed phone numbers can always be cracked over time by a motivated attacker, such as law enforcement or anyone with a monetary incentive². If the attacker can narrow the number range down to a few area codes they're interested in, the odds get worse for the defender. If the target site didn't use Bcrypt but PBKDF2, or plain Blake2 or something, it also gets much easier. If the website didn't make users wait 5 seconds for the hashing to complete, but a more realistic 0.5 seconds instead, it also gets much faster.
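The arithmetic above can be replayed with a few lines of Python. The inputs are the figures quoted in this answer (5 s or 0.5 s per hash on the server, 17× or 1000× attacker speedup, 60 million Dutch mobile numbers); the function names and the single-cracking-rig assumption are mine, and the results are worst-case times to try every number against one salted hash.

```python
SECONDS_PER_DAY = 86_400


def guesses_in(duration_s: float, server_hash_s: float, speedup: float) -> float:
    """How many candidate numbers an attacker can test in `duration_s` seconds."""
    return duration_s * speedup / server_hash_s


def days_to_exhaust(keyspace: int, server_hash_s: float, speedup: float) -> float:
    """Days needed to try every number in `keyspace` against one hash."""
    return keyspace * server_hash_s / speedup / SECONDS_PER_DAY


if __name__ == "__main__":
    dutch_mobile_numbers = 60_000_000
    # Footnote 1: four days at 5 s per server hash and a 17x speedup.
    print(guesses_in(4 * SECONDS_PER_DAY, 5, 17))          # 1175040.0
    # Whole Dutch range, bcrypt-like speedup, 5 s versus 0.5 s server hash.
    print(days_to_exhaust(dutch_mobile_numbers, 5, 17))    # roughly 204 days
    print(days_to_exhaust(dutch_mobile_numbers, 0.5, 17))  # roughly 20 days
    # Same range with a PBKDF2-like 1000x speedup.
    print(days_to_exhaust(dutch_mobile_numbers, 5, 1000))  # roughly 3.5 days
```

On average a specific number is found after half the range has been tried, and each extra GPU divides the time further.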

Practically, phone number hashing adds only a little technical protection. The practical advantage I see is that, if you tell your users that their phone numbers go through a strong hashing function, the marketing department can't use them without publicly changing the privacy policy and having everyone accept the new terms first. It's also very clearly off-limits if you need to involve technical expertise to crack the numbers rather than just reading the plain numbers off of a sheet, so it's unlikely to be mishandled. The hash provides a psychological barrier more than technical security.

Secure computation

This is not my area of expertise; the text below should be correct in principle but may be incorrect on implementation details.

The basis for this idea comes from Signal: https://signal.org/blog/private-contact-discovery/#sgx-contact-discovery. They compare a list of contacts (can be hundreds of numbers) against their list of all registered users (might scale to billions of numbers) to find people to chat with. Your scenario, namely verifying whether one phone number belongs to your account (one specific database record), should be a lot easier than what Signal does.

Your client device would:

- Verify that the expected code runs in the secure execution environment. This involves some public key encryption, where the maker of the environment (in Signal's example: Intel, because they made SGX) provides a public key and the private key is baked into the CPU.
- Establish a communications channel that the server application can't read. Since you have the public half of a key which is baked into the trusted environment of the CPU, you can encrypt a secret for it and then only the trusted environment can decrypt that data.
- Encrypt the phone number for the trusted environment and send it over.

The code inside the environment can now do its thing, for example verifying that the number matches what is on record in the database for your account, and returning only 'yes' or 'no' to the application server. The application server can then send the SMS if the response was 'yes'.

In practice, vulnerabilities are constantly being found in SGX. A dedicated HSM from another vendor could probably be a good substitute, since the HSM's CPU is actually separate and you don't have all the side channels that SGX has. I don't know why Signal didn't consider that as an option; perhaps HSMs lack features that Signal needs for this mass data comparison.

Besides being the only actually secure option (unless a vulnerability is found in the environment, which is always a risk), this approach also does not have a time trade-off, like the slow hash where slower is more secure. The client only needs a few KB of RAM to do the asymmetric cryptographic operations.

¹ apt install qalc && qalc '4 days / (5/17) seconds' #shows 1175040
² Examples: not everyone wants it known that they're registered on (certain) dating sites. If the database of hashed phone numbers leaks, an attacker could send SMSes to cracked phone numbers, saying that they'll publish the information unless they pay up. Or one could do targeted phishing based on which website the phone numbers came from.

score 0 · Answer 3 · answered Dec 03 '21 at 06:02

From a security aspect, hashing numbers does not provide much protection other than being moderately difficult to crack.

Since phone numbers are used repeatedly for OTP-based transactions, you would have to change the salt every time, using something like a CSPRNG. Which salt and hash functions you use is up to you, but there are limits to how secure you can make a hash.

Depending on your security needs, you need to decide how often you want to change the salt and where these salts are stored.
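One way to read "change the salt every time" is to re-salt the record on each successful verification, since that is the only moment the server holds the plaintext number. The sketch below assumes that reading; `PhoneRecord`, `new_record`, `verify_and_rotate`, the scrypt parameters, and the 16-byte salt from `secrets` (a CSPRNG) are all hypothetical choices for illustration.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass
class PhoneRecord:
    salt: bytes
    digest: bytes


def _digest(number: str, salt: bytes) -> bytes:
    # Assumed scrypt parameters; any slow salted hash would do here.
    return hashlib.scrypt(number.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32)


def new_record(number: str) -> PhoneRecord:
    """Hash a number under a fresh salt drawn from the OS CSPRNG."""
    salt = secrets.token_bytes(16)
    return PhoneRecord(salt, _digest(number, salt))


def verify_and_rotate(record: PhoneRecord, candidate: str) -> bool:
    """Check the typed number; on success, replace salt and digest in place."""
    if not hmac.compare_digest(_digest(candidate, record.salt), record.digest):
        return False
    fresh = new_record(candidate)
    record.salt, record.digest = fresh.salt, fresh.digest
    return True
```

In this sketch the salt lives in the same record as the digest; storing it elsewhere, as the answer suggests considering, would only change where `record.salt` is loaded from.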
