
One of the definitions I learned for $E[X|Y=y]$ is the following: $$ E[X|Y=y]=\int_{\mathbb{R}} x\,P_{X|Y=y}(dx), $$ where $P_{X|Y=y}$ is a probability measure satisfying $$ P(X\in A, Y\in B)=\int_B P_{X|Y=y}(A)\,P_Y(dy), \;\;(*)$$ for all $A,B\in\mathcal{B}(\mathbb{R})$.

This probability $P_{X|Y=y}$ is unique in the following sense: if $Q_{X|Y=y}$ is another probability satisfying $(*)$, then there is a set $N$ with $P_Y(N)=0$ such that $P_{X|Y=y}(C)=Q_{X|Y=y}(C)$ for all $C\in\mathcal{B}(\mathbb{R})$ and all $y\notin N$.

Let $Y$ be a nonnegative absolutely continuous random variable with $E[X|Y]=Y/2$. Let $N=(-\infty,0)\cup \{2\}$. We have $P_Y(N)=0$, so in principle $P_{X|Y=y}$ may be any probability on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ for $y\in N$, right? Then $E[X|Y=y]$ could be any value for $y\in N$, right?

I am totally confused. Intuitively, $E[X|Y=2]$ should be $1$ and $E[X|Y=-1]$ should not be defined, since $Y$ is nonnegative.

As you can see in this question, in Lemma 5.22 of the book An Introduction to Computational Stochastic PDEs, it is stated that $E[W(t)|W(1)]=t\,W(1)$ implies that $E[W(t)|W(1)=0]=0$, where $W$ is a Brownian motion and $0\leq t\leq 1$. But $P(W(1)=0)=0$, so $E[W(t)|W(1)=0]$ is not uniquely defined; it may be any value. I mean, $y\mapsto E[W(t)|W(1)=y]$ is defined only $P_{W(1)}$-a.s., so at $y=0$ there is no unique value. It is like stating something about $f(0)$ when $f$ is a real function defined only a.e.

user39756
  • Not sure if relevant, but if you are familiar with the idea of a regular conditional probability, then you can actually define conditional probability measures $E[\cdot |Y=y]$ for every $y$ (not just up to a set of $P_Y$-measure zero), without any problem. – shalop Sep 16 '17 at 05:41
  • @Shalop I do not know about regular conditional probability. However, from the last example in Wikipedia, I understand that the regular conditional probability coincides in my case with $$f_{W(t),W(1)}(x,y)/f_{W(1)}(y)$$ for all $y\in\mathbb{R}$. Thus, from my point of view, the law $P_{W(t)|W(1)=y}$ is understood via the representative obtained from the density $$f_{W(t),W(1)}(x,y)/f_{W(1)}(y),$$ which is defined for all $y\in\mathbb{R}$. And also I think that the proof of Lemma 5.22 in the book I mentioned is not correct. – user39756 Sep 16 '17 at 12:03
  • Moreover, the definition of regular conditional probability is not unique; it is only $P_Y$-a.s. unique. Again, we have to choose representatives, so the same problem appears. – user39756 Sep 16 '17 at 14:14
  • Check this question: https://math.stackexchange.com/questions/62958/considering-brownian-bridge-as-conditioned-brownian-motion to see that, while not rigorous according to the usual definitions, the definition by conditioning can still be made rigorous through a limit of conditioning on smaller and smaller balls around 0 – Bananach Sep 16 '17 at 15:58
  • @user39756 Ok, it seems like your problem is with the fact that the measures $P(\cdot|W(1)=x)$ are not uniquely defined for every $x \in \Bbb R$, only for almost every $x$. One way to fix this is that you can add a technical condition to make sure that uniqueness does hold, for instance requiring that the family of measures $x \mapsto P(\cdot |W(1)=x)$ is weakly continuous. – shalop Sep 16 '17 at 22:20
  • Another way to remedy the situation is to adopt a completely different definition for Brownian bridge: $W^x(t):=W(t)-t(W(1)-x)$, and then proving that $W^x(t)$ is the distributional limit as $\epsilon \to 0$ of the law of $W$ conditioned on the event $|W(1)-x|<\epsilon$ (see the link by Bananach or the answer below). – shalop Sep 16 '17 at 22:21

1 Answer

I will state and prove a general theorem which formally justifies why a Brownian bridge can be thought of as the same object as a Brownian motion conditioned on its terminal value (maybe this answer will make a bit more sense if one is familiar with regular conditional probabilities).

So let $B=(B_t)_{t \in [0,1]}$ be a Brownian motion on $[0,1]$. For $x \in \Bbb R$ I will define the Brownian bridge ending at $x$ to be the process $\tilde B_t^x := B_t-t(B_1-x)$, and I will show that this naive definition coincides with your conditional probability definition.
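As a quick sanity check of this naive definition (not needed for anything below; the helper names are mine), one can simulate discretized paths and verify that $\tilde B^x$ ends at $x$ exactly and has mean $tx$ at interior times:

```python
import math
import random

def brownian_path(n, rng):
    """Discretized Brownian motion on [0,1]: n Gaussian increments from 0."""
    dt = 1.0 / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

def bridge_path(b, x):
    """The bridge tilde B^x_t = B_t - t*(B_1 - x) on the same grid."""
    n = len(b) - 1
    return [b[k] - (k / n) * (b[n] - x) for k in range(n + 1)]

rng = random.Random(0)
x, n, trials = 1.5, 100, 10000
mean_mid = 0.0
for _ in range(trials):
    tb = bridge_path(brownian_path(n, rng), x)
    assert abs(tb[-1] - x) < 1e-12  # the bridge ends at x by construction
    mean_mid += tb[n // 2]
mean_mid /= trials  # should be close to E[tilde B^x_{1/2}] = x/2 = 0.75
```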

Let $\mu$ denote the law of $B$, and let $\mu^x$ denote the law of $\tilde B^x$, considered as probability measures on $C[0,1]$ equipped with its Borel $\sigma$-algebra. In other words, $\mu(A):=P(B \in A)$ and $\mu^x(A) := P(\tilde B^x \in A)$, for Borel sets $A \subset C[0,1]$.

Theorem: All of the following are true:

  1. The law of $\tilde B^x$ is the law of $B$ conditional on $B_1=x$. Formally, this means that for any bounded measurable function $f:C[0,1] \to \Bbb R$, we have that $$\Bbb E[f(B) |B_1] = \int_{C[0,1]} f(\varphi) \; \mu^{B_1}(d\varphi), \;\;\;\;\; a.s.$$ where the RHS is meant to be interpreted as a random variable $\omega \mapsto \int f(\varphi) \;\mu^{B_1(\omega)}(d\varphi) $. Another way of putting this is that the map $(x,A) \mapsto \mu^x(A)$ is a regular conditional probability for $B$ with respect to $B_1$.
  2. $B$ can be decomposed as the independent sum $B = \tilde B^0 + tB_1$. Equivalently, for any measurable subset $A \subset C[0,1]$, $$\mu(A) = \frac{1}{\sqrt{2\pi}} \int_{\Bbb R} \mu^x(A) e^{-x^2/2} dx$$
  3. For any continuous function $f:C[0,1] \to \Bbb R$ and any $x \in \Bbb R$, $$\lim_{\epsilon \to 0} \Bbb E\big[f(B) \;\big| \; |B_1-x| < \epsilon \big] = \Bbb E[f( \tilde B^x) ] =\int_{C[0,1]} f(\varphi) \; \mu^x(d\varphi)$$ If $f$ is merely assumed to be bounded and measurable, the same conclusion still holds for almost every $x \in \Bbb R$.
  4. In terms of density functions, we have that $$f_{\tilde B^x_{t_1},...,\tilde B_{t_n}^x}(y_1,...,y_n) = \frac{f_{B_1,B_{t_1},..., B_{t_n}}(x,y_1,...,y_n)}{f_{B_1}(x)}$$

Proof: We will first prove 2, then 1, then 3, then 4.

Proof of 2: We will use the following fact: If $(X_t,Y_t)_{t \in [0,1]}$ is a jointly Gaussian process such that $cov(X_t,Y_s)=0$ for all $s,t$, then $X$ and $Y$ are independent. Indeed, this would imply independence of the finite-dimensional distributions, which (via the $\pi$-$\lambda$ theorem) implies independence with respect to the product $\sigma$-algebra on $\Bbb R^{[0,1]}$, which coincides with the Borel $\sigma$-algebra when restricted to $C[0,1]$. Using this fact together with the fact that $cov(B_t-tB_1,B_1)=cov(B_t,B_1)-t\,var(B_1)=t-t=0$ for all $t\in [0,1]$, we see that $(B_t-tB_1)_{t\in[0,1]}= \tilde B^0$ is independent of $B_1$. This proves the first part of claim 2, and for the second part, we use independence of $\tilde B^0$ and $B_1$ to say $$\mu(A) = P( B \in A) = P\big((\tilde B^0_t+tB_1)_{t\in[0,1]} \in A\big) = \int_{\Bbb R} P\big((\tilde B^0_t+tx)_{t\in[0,1]} \in A\big) \; \Bbb P^{B_1}(dx) $$$$=\int_{\Bbb R} P(\tilde B^x \in A) \; \Bbb P^{B_1}(dx)=\frac{1}{\sqrt{2\pi}}\int_{\Bbb R} \mu^x(A)e^{-x^2/2}dx$$which proves the formula in 2.
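The zero-covariance step is easy to sanity-check numerically. This sketch (not part of the proof; all names are mine) samples $(B_t, B_1)$ directly from the Gaussian transition and estimates $cov(B_t-tB_1, B_1)$:

```python
import math
import random

rng = random.Random(1)
t, trials = 0.3, 100000

s_u = s_v = s_uv = 0.0
for _ in range(trials):
    b_t = rng.gauss(0.0, math.sqrt(t))             # B_t ~ N(0, t)
    b_1 = b_t + rng.gauss(0.0, math.sqrt(1 - t))   # add the independent increment on [t, 1]
    u = b_t - t * b_1                              # tilde B^0 evaluated at time t
    s_u += u
    s_v += b_1
    s_uv += u * b_1

cov_est = s_uv / trials - (s_u / trials) * (s_v / trials)
# cov(B_t - t*B_1, B_1) = t - t = 0, so cov_est should be near zero
```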

Proof of 1: As a corollary of the formula in 2, if $f: C[0,1] \to \Bbb R$ is any bounded measurable function, then $$\int_{C[0,1]} f(\phi) \mu(d\phi) = \frac{1}{\sqrt{2\pi}}\int_{\Bbb R} \bigg[\int_{C[0,1]} f(\phi) \mu^x(d\phi)\bigg] e^{-x^2/2}dx$$ Consequently, if $A \subset \Bbb R$ is any Borel measurable set, then $$\Bbb E[f(B)1_{\{B_1 \in A\}}] = \int_{C[0,1]} f(\phi) 1_A(\phi(1)) \;\mu(d\phi)$$ $$=\frac{1}{\sqrt{2\pi}}\int_{\Bbb R} \bigg[\int_{C[0,1]} f(\phi)1_A(x) \mu^x(d\phi)\bigg] e^{-x^2/2}dx $$$$= \frac{1}{\sqrt{2\pi}}\int_{A} \bigg[\int_{C[0,1]} f(\phi) \mu^x(d\phi) \bigg]e^{-x^2/2}dx$$ $$=\Bbb E \bigg[ \bigg(\int_{C[0,1]} f(\phi) \; \mu^{B_1}(d\phi) \bigg) \cdot 1_{\{B_1 \in A\}} \bigg]$$ which proves the formula in 1.

Proof of 3: Let $f:C[0,1] \to \Bbb R$ be continuous, and fix some $x \in \Bbb R$. Mimicking the proof of 1, we find that $$ \Bbb E[f(B) \cdot 1_{\{|B_1-x|< \epsilon\}}] = \frac{1}{\sqrt{2\pi}} \int_{x-\epsilon}^{x+\epsilon} \bigg[ \int_{C[0,1]} f(\phi) \mu^u(d\phi)\bigg] e^{-u^2/2}du$$ Therefore, $$\Bbb E\big[f(B) \;\big|\;|B_1-x|<\epsilon\big] := \frac{\Bbb E[f(B) \cdot 1_{\{|B_1 -x|<\epsilon\}}]}{P(|B_1-x|<\epsilon)}$$$$= \frac{\int_{x-\epsilon}^{x+\epsilon} \bigg[ \int_{C[0,1]} f(\phi) \mu^u(d\phi)\bigg] e^{-u^2/2}du}{\int_{x-\epsilon}^{x+\epsilon} e^{-u^2/2}du}$$Now letting $\epsilon \to 0$ and using continuity of $f$ (together with weak continuity of $u \mapsto \mu^u$, which holds because $\tilde B^u = \tilde B^0 + tu$ depends continuously on $u$) we get that $$\lim_{\epsilon \to 0}\Bbb E\big[f(B) \;\big|\;|B_1-x|<\epsilon\big]=\int_{C[0,1]}f(\phi)\mu^x(d\phi) = \Bbb E[f(\tilde B^x)]$$ This proves the claim for continuous $f$. If $f$ is merely bounded and measurable, the claim follows similarly but we apply Lebesgue's differentiation theorem at the last step.
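The limit in 3 can also be illustrated by simulation. The following sketch (illustrative only; the parameter names are mine) estimates $\Bbb E\big[B_{1/2}\,\big|\,|B_1-x|<\epsilon\big]$ by rejection sampling and compares it with $\Bbb E[\tilde B^x_{1/2}]=x/2$:

```python
import math
import random

rng = random.Random(2)
x, eps, trials = 1.0, 0.1, 400000   # illustrative endpoint, ball radius, sample size

total = 0.0
count = 0
for _ in range(trials):
    b_half = rng.gauss(0.0, math.sqrt(0.5))          # B_{1/2} ~ N(0, 1/2)
    b_one = b_half + rng.gauss(0.0, math.sqrt(0.5))  # independent increment on [1/2, 1]
    if abs(b_one - x) < eps:                         # keep paths with B_1 near x
        total += b_half
        count += 1

cond_mean = total / count
# cond_mean approximates E[B_{1/2} | |B_1 - x| < eps], which tends to x/2 = 0.5
```

Shrinking `eps` (with correspondingly more trials) moves the estimate closer to $x/2$, matching the limit in the theorem.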

Proof of 4: We wish to show that $f_{B_1}(x)f_{\tilde B^x_{t_1},...,\tilde B_{t_n}^x}(y_1,...,y_n)=f_{B_1,B_{t_1},..., B_{t_n}}(x,y_1,...,y_n)$, i.e., that for bounded measurable $g:\Bbb R^n \to \Bbb R$ and Borel sets $A \subset \Bbb R$, $$\int_A\bigg[\int_{\Bbb R^{n}} f_{\tilde B^x_{t_1},...,\tilde B_{t_n}^x}(y_1,...,y_n)g(y_1,...,y_n)dy_1 \cdots dy_n\bigg]f_{B_1}(x)dx = \Bbb E[g(B_{t_1},...,B_{t_n})\cdot 1_{\{B_1 \in A\}}]$$ But the LHS may be rewritten as $$\frac{1}{\sqrt{2\pi}} \int_{A} \Bbb E \big[ g(\tilde B^x_{t_1},...,\tilde B^x_{t_n})\big] e^{-x^2/2}dx$$ $$=\frac{1}{\sqrt{2\pi}} \int_{A}\bigg[ \int_{C[0,1]} g(\phi(t_1), \dots, \phi(t_n)) \mu^x(d\phi) \bigg] e^{-x^2/2}dx$$ $$=\int_{C[0,1]} g(\phi(t_1),...,\phi(t_n))\cdot 1_A(\phi(1)) \mu(d\phi)$$ $$=\Bbb E[g(B_{t_1},...,B_{t_n})\cdot 1_{\{B_1 \in A\}}]$$ where we used the formula from 2 in the third equality. This proves the claim.
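For $n=1$ the identity in 4 can even be checked in closed form: $(B_1,B_t)$ is centered Gaussian with covariance matrix $\begin{pmatrix}1 & t\\ t & t\end{pmatrix}$, while $\tilde B^x_t \sim N(tx,\,t(1-t))$. A short check (function names are mine) comparing the two sides at an arbitrary point:

```python
import math

def gauss_pdf(z, mean, var):
    """One-dimensional Gaussian density."""
    return math.exp(-(z - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def joint_pdf(x, y, t):
    """Density of (B_1, B_t) at (x, y): centered Gaussian, cov [[1, t], [t, t]]."""
    det = t * (1 - t)
    quad = (t * x * x - 2 * t * x * y + y * y) / det
    return math.exp(-quad / 2) / (2 * math.pi * math.sqrt(det))

t, x, y = 0.3, 1.2, -0.4                            # arbitrary test point
lhs = joint_pdf(x, y, t) / gauss_pdf(x, 0.0, 1.0)   # f_{B_1,B_t}(x,y) / f_{B_1}(x)
rhs = gauss_pdf(y, t * x, t * (1 - t))              # density of tilde B^x_t ~ N(tx, t(1-t))
# lhs and rhs agree, confirming the n = 1 case of the identity
```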

shalop
  • I'm not sure that this answer actually addresses the issue outlined in the problem, which is that the definition of conditional measures is only well-defined up to a set of measure zero. However, this answer is more for my own personal reference than to address that issue. – shalop Sep 16 '17 at 06:47
  • Hi Shalop, could you help me answer this question: https://math.stackexchange.com/questions/3719515/brownian-bridge-for-w-w-t-t-in-t-0-1? Many thanks! – Hermi Jun 15 '20 at 04:09