Is constructing IND-CCA2 public key encryption schemes particularly easy with the KEM/DEM approach?

Question

After the introduction of McBits, I was interested in what security notions are necessary for IND-CCA2 security of integrated encryption schemes (IES, following the key encapsulation mechanism / data encapsulation mechanism KEM/DEM approach). I've recently answered a different question and stated that the DEM and the KEM both need to be CCA secure in order for the whole scheme to be CCA secure.

As a consequence of this I asked myself how to generically construct an IND-CCA2 public key encryption scheme. My question now is:

Is the below ("simple") KEM/DEM public key encryption scheme IND-CCA2 under the assumption that $KDF(x)$ behaves like a random oracle and that $f_K(m):\mathcal K\times \mathcal M \rightarrow \mathcal C$ is an invertible one-way trapdoor function?

The scheme (formalized as per "Introduction to Modern Cryptography", second edition, by Katz and Lindell):

The KEM:
$Gen:$ the same as for $f_K(m)$
$Encaps:$ choose a random $k\in \mathcal M$, compute $c\gets f_K(k)$, convert both to a (suitable) binary representation and output $(c,KDF(k))$, with $KDF$ being a secure arbitrary-length hash function
$Decaps:$ convert $c$ from a binary string to an element of $\mathcal C$, compute $k\gets f^{-1}_K(c)$ and output $KDF(k)$, or $\bot$ if $k=\bot$, which is always the case whenever $c \notin \text{Range}(f_K)$.
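
A minimal Python sketch of the $Encaps$/$Decaps$ pair above, purely for illustration. It assumes textbook RSA with tiny, insecure parameters as a stand-in for the abstract $f_K$, SHAKE-256 as a stand-in for $KDF$, and a fixed two-byte encoding of $c$; none of these choices come from the question itself, and the names `encaps`, `decaps` and `kdf` are hypothetical.

```python
import hashlib
import secrets

# Toy RSA parameters (p = 61, q = 53): an insecure stand-in for f_K.
N, E, D = 3233, 17, 2753
C_LEN = 2  # assumed fixed-length binary representation of elements of Z_N


def kdf(k: int, out_len: int = 32) -> bytes:
    """Stand-in for KDF: hash the encoded k with SHAKE-256."""
    return hashlib.shake_256(k.to_bytes(C_LEN, "big")).digest(out_len)


def encaps():
    """Pick a random k, output (encoding of f_K(k), KDF(k))."""
    k = secrets.randbelow(N)
    c = pow(k, E, N)
    return c.to_bytes(C_LEN, "big"), kdf(k)


def decaps(c_bytes: bytes):
    """Return KDF(f_K^{-1}(c)), or None (playing the role of ⊥) if c is not in Range(f_K)."""
    if len(c_bytes) != C_LEN:
        return None
    c = int.from_bytes(c_bytes, "big")
    if c >= N:  # for this toy permutation, Range(f_K) is exactly {0, ..., N-1}
        return None
    return kdf(pow(c, D, N))
```

Calling `c, key = encaps()` and then `decaps(c)` should return the same `key`, while an encoding outside the range, such as `(3233).to_bytes(2, "big")`, is rejected.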

The encryption scheme is then constructed as in Construction 11.10: the returned $c$ is prepended to the ciphertext, and $KDF(k)$ is used to key an authenticated encryption with associated data (AEAD) scheme, which does the bulk encryption. To prevent ambiguities when parsing the inputs and outputs, a special encoding (a pairing function) must be used; any additional data introduced by it is fed into the AEAD scheme as associated data (see this comment by Ricky Demer for the details). Decryption applies $Decaps$ to the prepended $c$, then decrypts and verifies using the AEAD scheme; it fails (and returns $\bot$) if either $Decaps$ or the AEAD scheme returns $\bot$.

I hope I got the terminology right, if anything is wrong feel free to correct. — SEJPM, Aug 01 '15 at 12:55
Is $f_K^{-1}$ assumed to not output an element of $\operatorname{Dom}(f_K)$ whenever $f_K^{-1}$'s input is not an element of $\operatorname{Range}(f_K)$? — , Aug 01 '15 at 14:56
You mean something like $c \notin \mathcal C \rightarrow f_K(c)=\bot$? Yes I think it's reasonable to assume this, I'll add it. — SEJPM, Aug 01 '15 at 16:26
No, since you wrote $\mathcal{C}$ instead of $\operatorname{Range}(f_K)$ and wrote $f_K$ instead of $f_K^{-1}$. — , Aug 01 '15 at 16:37
@RickyDemer, sorry I just forgot the $^{-1}$ :( Well, it looks like I just assumed $Range(f_K)=\mathcal C$ and I'd still assume it's reasonable behavior to define $c\notin Range(f_K)\rightarrow f_K^{-1}(c)=\bot$ as it's impossible to map something back that couldn't have been output in the first place... — SEJPM, Aug 01 '15 at 16:46
That is reasonable, it's just something to make sure of, since otherwise the KEM can be malleable. — , Aug 01 '15 at 16:51
By the syntax above your post's grey line, $\operatorname{Range}(f_K) \subseteq \mathcal{C}$, so the left part of the disjunction can be removed. (Also, I've now convinced myself that without the assumption I mentioned, even the integrated scheme could be malleable, rather than just the KEM.) — , Aug 01 '15 at 17:05
Also, is it assumed that the private-key holder can recover length(c) from length(c) + length(the_ciphertext)? — , Aug 01 '15 at 18:49
@RickyDemer, yes, it is indeed assumed that any party can correctly decompose $(c,ciphertext)$ and thereby recover the length of $c$, for example by using meta-data about the message length and performing a simple subtraction. — SEJPM, Aug 01 '15 at 18:56
Is it also "assumed that any party can correctly decompose" c || ciphertext "and thereby recover the length of $c$"? (After all, if $c$ is just being prepended, then that's what would need to be decomposed.) — , Aug 01 '15 at 19:51
@RickyDemer Yes, it is assumed that any party can correctly parse / decompose c||ciphertext ($=(c,ciphertext)$) and thereby recover all related meta-data like the length of c and ciphertext. The concrete details on how the message length / the length of c is transmitted are considered "implementation details". If they are necessary for the security analysis they can of course be put into the definition (for example by prepending a tuple of 64-bit LE integers specifying the bit lengths of all objects and stating that this is considered "associated data" to the AEAD scheme). — SEJPM, Aug 01 '15 at 20:00
In general, $(x,y) = x\,||\,y$ does not work, since $1\,||\,11 = 111 = 11\,||\,1$ but one would need $(1,11) \neq (11,1)$. — , Aug 01 '15 at 20:10
@RickyDemer, thank you for clarifying this :) I wasn't aware that these aren't necessarily the same. Yes, of course I mean that $(1,11)\neq(11,1)$, and this is accomplished via associated data authentication (the AE scheme authenticates the lengths as well, and if they get tampered with, decryption will fail). I'll add this one to the spec, as I doubt this will render Yehuda's answer obsolete. — SEJPM, Aug 01 '15 at 20:21
Associated data authentication might work, but in general one should just use a(n efficiently-invertible) pairing function (that is efficiently computable), such as prefixfree(x) || y, or: if length(y) < length(x) then 1 || prefixfree(y) || x else 0 || prefixfree(x) || y. — , Aug 01 '15 at 20:30
One more time I have to thank you @RickyDemer. I wasn't aware of this option for combining data and have updated the spec again (still without rendering Yehuda's answer obsolete by this change), I've also added that the "additional data introduced" (the 0 and 1 you mentioned) are subject to AD authentication. (I hope this strengthens rather than weakens :) ) — SEJPM, Aug 01 '15 at 20:59
Comments are not for extended discussion; this conversation has been moved to chat. — e-sushi, Aug 01 '15 at 22:53
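
The pairing-function suggestion from the comments above, prefixfree(x) || y, can be sketched in Python as follows. This is only one possible instantiation: it assumes byte strings rather than bit strings and uses a fixed 4-byte big-endian length prefix as the prefix-free encoding, which is an assumption made here and not something specified in the thread. The names `prefix_free`, `pair` and `unpair` are hypothetical.

```python
import struct


def prefix_free(x: bytes) -> bytes:
    """Length-prefixed encoding: no valid encoding is a proper prefix of another."""
    return struct.pack(">I", len(x)) + x


def pair(x: bytes, y: bytes) -> bytes:
    """Pairing function prefixfree(x) || y."""
    return prefix_free(x) + y


def unpair(z: bytes):
    """Invert pair(); raise ValueError on malformed input."""
    if len(z) < 4:
        raise ValueError("missing length prefix")
    (n,) = struct.unpack(">I", z[:4])
    if len(z) < 4 + n:
        raise ValueError("truncated first component")
    return z[4:4 + n], z[4 + n:]


# Plain concatenation is ambiguous, the pairing function is not.
assert b"1" + b"11" == b"11" + b"1"
assert pair(b"1", b"11") != pair(b"11", b"1")
assert unpair(pair(b"1", b"11")) == (b"1", b"11")
```

With such an encoding the receiver recovers $c$ and the AEAD ciphertext unambiguously, without relying on out-of-band length information.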

score 2 · Answer 1 · answered Aug 01 '15 at 19:04

If I understand the question correctly, you are asking whether it's really needed to have the KEM be CCA secure, and maybe in the random oracle model it would suffice for it to just be an invertible one-way function.

This would not be CCA2-secure. Specifically, let $f$ be any invertible one-way function, and construct $f'$ so that $f'(x)=0f(x)$ for every $x$ (that is, $f(x)$ with a $0$ bit prepended). In addition, the inversion function for $f'$ works by ignoring the first bit of its input and then inverting the rest using $f^{-1}$. Note that $f'^{-1}(0y) = f'^{-1}(1y) = f^{-1}(y)$ for every $y$. Thus, it is possible to change a single bit in the ciphertext while the encapsulated value remains unchanged. Since the CCA2 experiment allows the adversary to query a decryption oracle on anything but the challenge ciphertext, the adversary can flip that bit in the challenge ciphertext and obtain its decryption. I know that this is a stupid counterexample, but this is exactly the point of such counterexamples: they demonstrate that additional properties are needed.
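
The counterexample can be made concrete with a small Python sketch. It assumes toy textbook RSA (insecure parameters) as a placeholder for $f$ and SHAKE-256 as the KDF; both are illustrative choices, and `f_prime`, `f_prime_inv` and `kdf` are hypothetical names. The prepended bit is modelled as the first entry of a tuple.

```python
import hashlib
import secrets

# Toy RSA (p = 61, q = 53) as a placeholder for the one-way function f.
N, E, D = 3233, 17, 2753


def f(x: int) -> int:
    return pow(x, E, N)


def f_inv(y: int) -> int:
    return pow(y, D, N)


def f_prime(x: int):
    """f'(x) = 0 || f(x)."""
    return (0, f(x))


def f_prime_inv(c):
    """Inversion ignores the first bit and inverts the rest with f^{-1}."""
    _ignored_bit, y = c
    return f_inv(y)


def kdf(k: int) -> bytes:
    return hashlib.shake_256(k.to_bytes(2, "big")).digest(32)


k = secrets.randbelow(N)
challenge = f_prime(k)        # KEM part of the challenge ciphertext
mauled = (1, challenge[1])    # flip the prepended bit

# A different ciphertext that decapsulates to the same key:
assert mauled != challenge
assert kdf(f_prime_inv(mauled)) == kdf(f_prime_inv(challenge))
```

Since `mauled` differs from the challenge, a CCA2 adversary may submit it, together with the unchanged DEM part, to the decryption oracle and learn the challenge plaintext.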

If we ask the same question for a trapdoor permutation then this seems more promising. The same KEM proof when using RSA (see Section 11.5.5) may go through in this case. I'm not guaranteeing it, since sometimes proofs fall on strange things, but it's my intuition. I suggest writing out a proof to make sure.

Yehuda Lindell

The OP fixed that issue well before you posted this answer. – Aug 01 '15 at 19:18
@RickyDemer, ahhh, now I understand why you asked me for the explicit check of whether the supplied $c$ is actually in the range of $f_K$! :) Just another example of how subtle details and implicit assumptions that are not written out can break otherwise secure systems :) – SEJPM Aug 01 '15 at 19:42
@SEJPM Note that in my silly counterexample, the range is defined to include 1y. You can of course come up with a more natural counterexample, but it won't make a difference. A range check won't help here, and neither will using authenticated encryption for the symmetric part. As I wrote, however, a permutation may suffice. – Yehuda Lindell Aug 02 '15 at 09:13
@YehudaLindell, just to be sure (because the term range has two meanings): You're referring to the range of a function $f$ as the set of all possible outputs of the function (not necessarily the co-domain)? For example, $1y$ is indeed part of the co-domain ($\{0,1\}\times \mathcal C$) but not part of the range (as was meant in the question, I couldn't find the proper notation for this :( ) as $\forall x\in \mathcal M:f'(x)\neq 1f(x)$. Either way +1 for the permutation suggestion and the answer for the unrevised question. – SEJPM Aug 02 '15 at 12:24
I just mean anything that the inversion algorithm will accept. Of course, it's possible to define the inversion function so that it won't accept $1f(x)$. However, that's not the point. – Yehuda Lindell Aug 02 '15 at 21:12
