Linked question: AES encrypting multiple files
I have 100k files to encrypt with a password. (Maybe all 100k files today; or maybe 50k today, 10k tomorrow, and 40k the next week.)
Up to now, I did this (pseudo-code):
    for each file:
        plaintext = file.read()
        nonce = getrandom(bytes=16)
        key = KDF_PBKDF2(password, salt=nonce, count=1000000)  # very slow for each file!
        ciphertext, tag = AES_GCM_cipher(key, nonce=nonce).encrypt(plaintext)
        write to disk: nonce | ciphertext | tag

and to decrypt the encrypted file:

    nonce, ciphertext, tag = file.read()
    key = KDF_PBKDF2(password, salt=nonce, count=1000000)  # very slow for each file!
    plaintext = AES_GCM_cipher(key, nonce=nonce).decrypt(ciphertext)

Obviously, this is not optimal, since I run the KDF function for each file, and this is slow!
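To put a rough number on "slow", one can time a single derivation and multiply by the file count. This is only a minimal sketch using the Python standard library: PBKDF2-HMAC-SHA256 with a 32-byte key is an assumption (the pseudo-code does not say which hash is used), and the helper names `time_one_derivation` and `estimate_total_seconds` are made up for illustration.

```python
import hashlib
import os
import time


def time_one_derivation(password: bytes, iterations: int) -> float:
    """Return the wall-clock seconds taken by one PBKDF2 derivation."""
    salt = os.urandom(16)
    start = time.perf_counter()
    # Assumption: HMAC-SHA256 and a 32-byte (AES-256) key.
    hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
    return time.perf_counter() - start


def estimate_total_seconds(n_files: int, per_file_seconds: float) -> float:
    """Total KDF time when the derivation is repeated once per file."""
    return n_files * per_file_seconds


if __name__ == "__main__":
    per_file = time_one_derivation(b"example password", 1_000_000)
    total = estimate_total_seconds(100_000, per_file)
    print(f"one derivation: {per_file:.2f} s")
    print(f"100k files: about {total / 3600:.1f} h spent in the KDF alone")
```

The measured time depends on the machine, so no expected output is given here; the point is only that the cost grows linearly with the number of files.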
I thought about this solution:
    # do this ONLY ONCE for each encryption session:
    salt = getrandom(bytes=16)
    key = KDF_PBKDF2(password, salt=salt, count=1000000)  # run only once

    for each file:
        plaintext = file.read()
        nonce = getrandom(bytes=16)
        ciphertext, tag = AES_GCM_cipher(key, nonce=nonce).encrypt(plaintext)
        write to disk: salt | nonce | ciphertext | tag

but this has the drawback of having to prepend 16 more bytes (`salt`) at the beginning of each encrypted file. Is it a common practice?

And above all it has the following drawback when decrypting:
    for each encrypted file:
        salt, nonce, ciphertext, tag = file.read()
        # since salt may be different for each file
        # we have to run:
        key = KDF_PBKDF2(password, salt=salt, count=1000000)  # very slow for each file!
        ...

Since `salt` is at the beginning of each encrypted file, this means we have to run the KDF function ... for each encrypted file that we want to decrypt! This will be very slow.

We could put them in a cache, `cache[salt] = key`, such that if we find the same `salt` again, we already have the `key`, but I'm not sure if this is an elegant solution.
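The cache idea could look roughly like the following. It is a minimal sketch, not a vetted design: it covers only the key lookup (the AES-GCM step is left out), it assumes PBKDF2-HMAC-SHA256 with a 32-byte key and a 16-byte salt stored first in each file as in the layout above, and the names `KeyCache`, `key_for` and `key_for_file` are hypothetical.

```python
import hashlib

SALT_LEN = 16           # salt is the first 16 bytes of each encrypted file
ITERATIONS = 1_000_000  # same count as in the pseudo-code


class KeyCache:
    """Remember derived keys so each distinct salt costs one KDF run."""

    def __init__(self, password: bytes, iterations: int = ITERATIONS):
        self._password = password
        self._iterations = iterations
        self._keys = {}        # maps salt -> derived key
        self.derivations = 0   # how many times the slow KDF actually ran

    def key_for(self, salt: bytes) -> bytes:
        key = self._keys.get(salt)
        if key is None:
            # Assumption: HMAC-SHA256 and a 32-byte (AES-256) key.
            key = hashlib.pbkdf2_hmac(
                "sha256", self._password, salt, self._iterations, dklen=32
            )
            self._keys[salt] = key
            self.derivations += 1
        return key


def key_for_file(blob: bytes, cache: KeyCache) -> bytes:
    """Read the salt from the file header and return the matching key."""
    if len(blob) < SALT_LEN:
        raise ValueError("file too short to contain a salt")
    return cache.key_for(blob[:SALT_LEN])
```

With this, decrypting all files of one encryption session runs the KDF once, and files from three different sessions cost three KDF runs in total. The derived keys stay in memory for the lifetime of the cache, which may or may not be acceptable.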
Question: which scheme to use to encrypt 100k files (in one pass, or in multiple sessions) with a password with AES-GCM?
[...] `password` and derive a key from this password. Can you give more details about your solution? – Basj Nov 20 '20 at 00:05

[...] `password`. And do we have to store the `salt` used for KDF in each encrypted file? This `salt` is indeed mandatory if we want to be able to decrypt them: without `salt` we can't decrypt. – Basj Nov 20 '20 at 00:08

[...] `salt` once for all when we start the encryption job, in a `.encryptionsalt` metadata file. Then all subsequent files (today's 50k files, next week's 40k files, or even next year's 1 million files) will use the same salt for KDF, and only different `nonce`? – Basj Nov 20 '20 at 00:10
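The idea from the last comment, storing the salt once in a `.encryptionsalt` file and reusing it for every later session, could be sketched as below. This only illustrates the mechanics and says nothing about whether the scheme is advisable. PBKDF2-HMAC-SHA256 with a 32-byte key is an assumption, and `load_or_create_salt` and `session_key` are hypothetical helper names.

```python
import hashlib
import os
from pathlib import Path

SALT_FILE = ".encryptionsalt"  # file name taken from the comment above
SALT_LEN = 16


def load_or_create_salt(directory: Path) -> bytes:
    """Return the directory-wide salt, creating it on first use."""
    salt_path = directory / SALT_FILE
    if salt_path.exists():
        salt = salt_path.read_bytes()
        if len(salt) != SALT_LEN:
            raise ValueError(f"unexpected salt length in {salt_path}")
        return salt
    salt = os.urandom(SALT_LEN)
    salt_path.write_bytes(salt)
    return salt


def session_key(password: bytes, directory: Path,
                iterations: int = 1_000_000) -> bytes:
    """Derive the key once per session from the stored salt."""
    salt = load_or_create_salt(directory)
    # Assumption: HMAC-SHA256 and a 32-byte (AES-256) key.
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
```

Every session then derives the same key, so each file would only need `nonce | ciphertext | tag`. Because all files share one key, each file must get a nonce that is never repeated under that key.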