clkhash - why is secret secret?

Question

The documentation for clkhash (https://clkhash.readthedocs.io/en/stable/tutorial_api.html) states that "knowledge of this secret is sufficient to reconstruct the PII information from a CLK". My question is: how is that possible, considering that hashes are supposed to be one-way transformations? Ideally, I would like to see some code that actually does it.

score 0 · Answer 1 · answered May 24 '23 at 15:49

It's true that the hash algorithms used here (MD5 and SHA-1) cannot be inverted, but if there were no secret, the CLK scheme would be highly susceptible to dictionary or brute-force attacks. That is, an attacker could simply try out potential inputs (e.g. different names and birth years), and whenever they find a matching CLK, they can be reasonably sure they've identified the person. Consumer GPUs like NVIDIA's RTX 4090 can calculate around 160 billion(!) MD5 hashes per second, so it won't take long to try out a large number of possible inputs.
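To make the attack concrete, here is a minimal sketch of a dictionary attack against an unkeyed encoding. It is a toy, not clkhash's actual algorithm: the function names, the filter length, the number of hash rounds and the bigram scheme are all assumptions chosen for illustration. The point is only that nothing needs to be "reversed"; the attacker re-encodes guesses and compares.

```python
import hashlib
from itertools import product

FILTER_BITS = 1024  # toy filter length (assumption, not clkhash's default)
NUM_HASHES = 10     # bit positions set per bigram (assumption)


def bigrams(text):
    """Split a padded string into overlapping two-character tokens."""
    padded = f" {text} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def unkeyed_encode(name, birth_year):
    """Toy Bloom-filter encoding built from plain SHA-1 and MD5, with no secret."""
    bits = 0
    for gram in bigrams(name.lower()) + bigrams(str(birth_year)):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hashlib.sha1(data).digest(), "big")
        h2 = int.from_bytes(hashlib.md5(data).digest(), "big")
        # Double hashing: derive NUM_HASHES bit positions from two digests.
        for i in range(NUM_HASHES):
            bits |= 1 << ((h1 + i * h2) % FILTER_BITS)
    return bits


def dictionary_attack(target, names, years):
    """Re-encode every candidate and keep the ones that match the target."""
    return [(n, y) for n, y in product(names, years)
            if unkeyed_encode(n, y) == target]


# The attacker only sees the encoded value, never the inputs.
leaked = unkeyed_encode("Alice", 1984)
matches = dictionary_attack(leaked, ["bob", "alice", "carol"], range(1950, 2001))
print(matches)
```

With only three names and 51 birth years this finishes instantly; a realistic name list and date range is larger, but still tiny compared to what a GPU can hash.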

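This is why the scheme uses keyed-hash message authentication codes (HMACs) instead of simple hashes. An HMAC can only be calculated if the key is known, so a brute-force attack over candidate inputs is no longer feasible (unless the secret is leaked, of course). That is also what the quoted sentence in the documentation refers to: nobody inverts the hash, but anyone who knows the secret can run the guessing attack again.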
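Here is a small sketch of the difference the key makes, using Python's standard `hmac` module. Again, this is a simplified illustration under my own assumptions (one HMAC-SHA1 token per record, made-up helper names and secret), not the way clkhash actually builds a CLK.

```python
import hashlib
import hmac


def keyed_token(value, secret):
    """HMAC-SHA1 of a record string; cannot be computed without the secret."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha1).hexdigest()


def guess(target, candidates, key):
    """Try every candidate under the given key; return the first match, if any."""
    for candidate in candidates:
        if hmac.compare_digest(keyed_token(candidate, key), target):
            return candidate
    return None


secret = b"correct horse battery staple"  # hypothetical shared secret
leaked = keyed_token("alice 1984", secret)

candidates = [f"{name} {year}"
              for name in ("alice", "bob", "carol")
              for year in range(1950, 2001)]

# Without the secret, the attacker has to guess the key as well as the input.
without_key = guess(leaked, candidates, b"wrong guess")
# With the secret, the plain dictionary attack works again.
with_key = guess(leaked, candidates, secret)

print(without_key, with_key)
```

The candidate list is identical in both calls; only the key differs. That is the whole reason the secret has to stay secret.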
