
One of the definitions I learned for $E[X|Y=y]$ is the following: $$ E[X|Y=y]=\int_{\mathbb{R}} x\,P_{X|Y=y}(dx), $$ where $P_{X|Y=y}$ is a probability measure satisfying $$ P(X\in A, Y\in B)=\int_B P_{X|Y=y}(A)\,P_Y(dy), \;\;(*)$$ for all $A,B\in\mathcal{B}(\mathbb{R})$.

This probability $P_{X|Y=y}$ is unique in the following sense: if $Q_{X|Y=y}$ is another probability satisfying $(*)$, then there is a set $N$ with $P_Y(N)=0$ such that $P_{X|Y=y}(C)=Q_{X|Y=y}(C)$ for all $C\in\mathcal{B}(\mathbb{R})$ and all $y\notin N$.

Let $Y$ be a nonnegative absolutely continuous random variable with $E[X|Y]=Y/2$. Let $N=(-\infty,0)\cup \{2\}$. We have $P_Y(N)=0$, so in principle $P_{X|Y=y}$ may be any probability on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ for $y\in N$, right? Then $E[X|Y=y]$ could be any value for $y\in N$, right?

I am totally confused. Intuitively, $E[X|Y=2]$ should be $1$ and $E[X|Y=-1]$ should not be defined, since $Y$ is nonnegative.

As you can see in this question, in Lemma 5.22 of the book An Introduction to Computational Stochastic PDEs, it is stated that $E[W(t)|W(1)]=t\,W(1)$ implies that $E[W(t)|W(1)=0]=0$, where $W$ is a Brownian motion and $0\leq t\leq 1$. But $P(W(1)=0)=0$, so $E[W(t)|W(1)=0]$ is not uniquely defined; it may be any value. I mean, $y\mapsto E[W(t)|W(1)=y]$ is defined only $P_{W(1)}$-a.s., so at $y=0$ there is no unique value. It is like stating something about $f(0)$ when $f$ is a real function defined only a.e.

user39756
  • Not sure if relevant, but if you are familiar with the idea of a regular conditional probability, then you can actually define conditional probability measures $E[\cdot |Y=y]$ for every $y$ (not just up to a set of $P_Y$-measure zero), without any problem. – shalop Sep 16 '17 at 05:41
  • @Shalop I do not know about regular conditional probability. However, from the last example in Wikipedia, I understand that the regular conditional probability coincides in my case with $$f_{W(t),W(1)}(x,y)/f_{W(1)}(y)$$ for all $y\in\mathbb{R}$. Thus, from my point of view, the law $P_{W(t)|W(1)=y}$ is understood via the representative obtained from the density $$f_{W(t),W(1)}(x,y)/f_{W(1)}(y),$$ which is defined for all $y\in\mathbb{R}$. And also I think that the proof of Lemma 5.22 in the book I mentioned is not correct. – user39756 Sep 16 '17 at 12:03
  • Moreover, the definition of regular conditional probability is not unique; it is only $P_Y$-a.s. unique. Again, we have to choose representatives, so the same problem appears. – user39756 Sep 16 '17 at 14:14
  • Check this question: https://math.stackexchange.com/questions/62958/considering-brownian-bridge-as-conditioned-brownian-motion to see that, while not rigorous according to the usual definitions, the definition by conditioning can still be made rigorous through a limit of conditioning on smaller and smaller balls around 0 – Bananach Sep 16 '17 at 15:58
  • @user39756 Ok, it seems like your problem is with the fact that the measures $P(\cdot|W(1)=x)$ are not uniquely defined for every $x \in \Bbb R$, only for almost every $x$. One way to fix this is that you can add a technical condition to make sure that uniqueness does hold, for instance requiring that the family of measures $x \mapsto P(\cdot |W(1)=x)$ is weakly continuous. – shalop Sep 16 '17 at 22:20
  • Another way to remedy the situation is to adopt a completely different definition for Brownian bridge: $W^x(t):=W(t)-t(W(1)-x)$, and then proving that $W^x(t)$ is the distributional limit as $\epsilon \to 0$ of the law of $W$ conditioned on the event $|W(1)-x|<\epsilon$ (see the link by Bananach or the answer below). – shalop Sep 16 '17 at 22:21

1 Answer

I will state and prove a general theorem which formally justifies why a Brownian bridge can be thought of as the same object as a Brownian motion conditioned on its terminal value (maybe this answer will make a bit more sense if one is familiar with regular conditional probabilities).

So let $B=(B_t)_{t \in [0,1]}$ be a Brownian motion on $[0,1]$. For $x \in \Bbb R$ I will define the Brownian bridge ending at $x$ to be the process $\tilde B_t^x := B_t-t(B_1-x)$, and I will show that this naive definition coincides with your conditional probability definition.
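As a quick sanity check of this naive definition (not needed for anything below; the helper names are mine), one can simulate discretized paths and verify that $\tilde B^x$ ends at $x$ exactly and has mean $tx$ at interior times:

```python
import math
import random

def brownian_path(n, rng):
    """Discretized Brownian motion on [0,1]: n Gaussian increments from 0."""
    dt = 1.0 / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

def bridge_path(b, x):
    """The bridge tilde B^x_t = B_t - t*(B_1 - x) on the same grid."""
    n = len(b) - 1
    return [b[k] - (k / n) * (b[n] - x) for k in range(n + 1)]

rng = random.Random(0)
x, n, trials = 1.5, 100, 10000
mean_mid = 0.0
for _ in range(trials):
    tb = bridge_path(brownian_path(n, rng), x)
    assert abs(tb[-1] - x) < 1e-12  # the bridge ends at x by construction
    mean_mid += tb[n // 2]
mean_mid /= trials  # should be close to E[tilde B^x_{1/2}] = x/2 = 0.75
```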

Let $\mu$ denote the law of $B$, and let $\mu^x$ denote the law of $\tilde B^x$, considered as probability measures on $C[0,1]$ equipped with its Borel $\sigma$-algebra. In other words, $\mu(A):=P(B \in A)$ and $\mu^x(A) := P(\tilde B^x \in A)$, for Borel sets $A \subset C[0,1]$.

Theorem: All of the following are true:

  1. The law of $\tilde B^x$ is the law of $B$ conditional on $B_1=x$. Formally, this means that for any bounded measurable function $f:C[0,1] \to \Bbb R$, we have that $$\Bbb E[f(B) |B_1] = \int_{C[0,1]} f(\varphi) \; \mu^{B_1}(d\varphi), \;\;\;\;\; a.s.$$ where the RHS is meant to be interpreted as a random variable $\omega \mapsto \int f(\varphi) \;\mu^{B_1(\omega)}(d\varphi) $. Another way of putting this is that the map $(x,A) \mapsto \mu^x(A)$ is a regular conditional probability for $B$ with respect to $B_1$.
  2. $B$ can be decomposed as the independent sum $B = \tilde B^0 + tB_1$. Equivalently, for any measurable subset $A \subset C[0,1]$, $$\mu(A) = \frac{1}{\sqrt{2\pi}} \int_{\Bbb R} \mu^x(A) e^{-x^2/2} dx$$
  3. For any continuous function $f:C[0,1] \to \Bbb R$ and any $x \in \Bbb R$, $$\lim_{\epsilon \to 0} \Bbb E\big[f(B) \;\big| \; |B_1-x| < \epsilon \big] = \Bbb E[f( \tilde B^x) ] =\int_{C[0,1]} f(\varphi) \; \mu^x(d\varphi)$$ If $f$ is merely assumed to be bounded and measurable, the same conclusion still holds for almost every $x \in \Bbb R$.
  4. In terms of density functions, we have that $$f_{\tilde B^x_{t_1},...,\tilde B_{t_n}^x}(y_1,...,y_n) = \frac{f_{B_1,B_{t_1},..., B_{t_n}}(x,y_1,...,y_n)}{f_{B_1}(x)}$$

Proof: We will first prove 2, then 1, then 3, then 4.

Proof of 2: We will use the following fact: If $(X_t,Y_t)_{t \in [0,1]}$ is a jointly Gaussian process such that $cov(X_t,Y_s)=0$ for all $s,t$, then $X$ and $Y$ are independent. Indeed, this would imply independence of the finite-dimensional distributions, which (via the $\pi$-$\lambda$ theorem) implies independence with respect to the product $\sigma$-algebra on $\Bbb R^{[0,1]}$, which coincides with the Borel $\sigma$-algebra when restricted to $C[0,1]$. Using this fact together with the fact that $cov(B_t-tB_1,B_1)=cov(B_t,B_1)-t\,var(B_1)=t-t=0$ for all $t\in [0,1]$, we see that $(B_t-tB_1)_{t\in[0,1]}= \tilde B^0$ is independent of $B_1$. This proves the first part of claim 2, and for the second part, we use independence of $\tilde B^0$ and $B_1$ to say $$\mu(A) = P( B \in A) = P\big((\tilde B^0_t+tB_1)_{t\in[0,1]} \in A\big) = \int_{\Bbb R} P\big((\tilde B^0_t+tx)_{t\in[0,1]} \in A\big) \; \Bbb P^{B_1}(dx) $$$$=\int_{\Bbb R} P(\tilde B^x \in A) \; \Bbb P^{B_1}(dx)=\frac{1}{\sqrt{2\pi}}\int_{\Bbb R} \mu^x(A)e^{-x^2/2}dx$$which proves the formula in 2.
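The zero-covariance step is easy to sanity-check numerically. This sketch (not part of the proof; all names are mine) samples $(B_t, B_1)$ directly from the Gaussian transition and estimates $cov(B_t-tB_1, B_1)$:

```python
import math
import random

rng = random.Random(1)
t, trials = 0.3, 100000

s_u = s_v = s_uv = 0.0
for _ in range(trials):
    b_t = rng.gauss(0.0, math.sqrt(t))             # B_t ~ N(0, t)
    b_1 = b_t + rng.gauss(0.0, math.sqrt(1 - t))   # add the independent increment on [t, 1]
    u = b_t - t * b_1                              # tilde B^0 evaluated at time t
    s_u += u
    s_v += b_1
    s_uv += u * b_1

cov_est = s_uv / trials - (s_u / trials) * (s_v / trials)
# cov(B_t - t*B_1, B_1) = t - t = 0, so cov_est should be near zero
```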

Proof of 1: As a corollary of the formula in 2, if $f: C[0,1] \to \Bbb R$ is any bounded measurable function, then $$\int_{C[0,1]} f(\phi) \mu(d\phi) = \frac{1}{\sqrt{2\pi}}\int_{\Bbb R} \bigg[\int_{C[0,1]} f(\phi) \mu^x(d\phi)\bigg] e^{-x^2/2}dx$$ Consequently, if $A \subset \Bbb R$ is any Borel measurable set, then $$\Bbb E[f(B)1_{\{B_1 \in A\}}] = \int_{C[0,1]} f(\phi) 1_A(\phi(1)) \;\mu(d\phi)$$ $$=\frac{1}{\sqrt{2\pi}}\int_{\Bbb R} \bigg[\int_{C[0,1]} f(\phi)1_A(x) \mu^x(d\phi)\bigg] e^{-x^2/2}dx $$$$= \frac{1}{\sqrt{2\pi}}\int_{A} \bigg[\int_{C[0,1]} f(\phi) \mu^x(d\phi) \bigg]e^{-x^2/2}dx$$ $$=\Bbb E \bigg[ \bigg(\int_{C[0,1]} f(\phi) \; \mu^{B_1}(d\phi) \bigg) \cdot 1_{\{B_1 \in A\}} \bigg]$$ which proves the formula in 1.

Proof of 3: Let $f:C[0,1] \to \Bbb R$ be continuous, and fix some $x \in \Bbb R$. Mimicking the proof of 1, we find that $$ \Bbb E[f(B) \cdot 1_{\{|B_1-x|< \epsilon\}}] = \frac{1}{\sqrt{2\pi}} \int_{x-\epsilon}^{x+\epsilon} \bigg[ \int_{C[0,1]} f(\phi) \mu^u(d\phi)\bigg] e^{-u^2/2}du$$ Therefore, $$\Bbb E\big[f(B) \;\big|\;|B_1-x|<\epsilon\big] := \frac{\Bbb E[f(B) \cdot 1_{\{|B_1 -x|<\epsilon\}}]}{P(|B_1-x|<\epsilon)}$$$$= \frac{\int_{x-\epsilon}^{x+\epsilon} \bigg[ \int_{C[0,1]} f(\phi) \mu^u(d\phi)\bigg] e^{-u^2/2}du}{\int_{x-\epsilon}^{x+\epsilon} e^{-u^2/2}du}$$Now letting $\epsilon \to 0$ and using continuity of $f$ (together with weak continuity of $u \mapsto \mu^u$, which holds because $\tilde B^u = \tilde B^0 + tu$ depends continuously on $u$) we get that $$\lim_{\epsilon \to 0}\Bbb E\big[f(B) \;\big|\;|B_1-x|<\epsilon\big]=\int_{C[0,1]}f(\phi)\mu^x(d\phi) = \Bbb E[f(\tilde B^x)]$$ This proves the claim for continuous $f$. If $f$ is merely bounded and measurable, the claim follows similarly but we apply Lebesgue's differentiation theorem at the last step.
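The limit in 3 can also be illustrated by simulation. The following sketch (illustrative only; the parameter names are mine) estimates $\Bbb E\big[B_{1/2}\,\big|\,|B_1-x|<\epsilon\big]$ by rejection sampling and compares it with $\Bbb E[\tilde B^x_{1/2}]=x/2$:

```python
import math
import random

rng = random.Random(2)
x, eps, trials = 1.0, 0.1, 400000   # illustrative endpoint, ball radius, sample size

total = 0.0
count = 0
for _ in range(trials):
    b_half = rng.gauss(0.0, math.sqrt(0.5))          # B_{1/2} ~ N(0, 1/2)
    b_one = b_half + rng.gauss(0.0, math.sqrt(0.5))  # independent increment on [1/2, 1]
    if abs(b_one - x) < eps:                         # keep paths with B_1 near x
        total += b_half
        count += 1

cond_mean = total / count
# cond_mean approximates E[B_{1/2} | |B_1 - x| < eps], which tends to x/2 = 0.5
```

Shrinking `eps` (with correspondingly more trials) moves the estimate closer to $x/2$, matching the limit in the theorem.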

Proof of 4: We wish to show that $f_{B_1}(x)f_{\tilde B^x_{t_1},...,\tilde B_{t_n}^x}(y_1,...,y_n)=f_{B_1,B_{t_1},..., B_{t_n}}(x,y_1,...,y_n)$, i.e., that for bounded measurable $g:\Bbb R^n \to \Bbb R$ and Borel sets $A \subset \Bbb R$, $$\int_A\bigg[\int_{\Bbb R^{n}} f_{\tilde B^x_{t_1},...,\tilde B_{t_n}^x}(y_1,...,y_n)g(y_1,...,y_n)dy_1 \cdots dy_n\bigg]f_{B_1}(x)dx = \Bbb E[g(B_{t_1},...,B_{t_n})\cdot 1_{\{B_1 \in A\}}]$$ But the LHS may be rewritten as $$\frac{1}{\sqrt{2\pi}} \int_{A} \Bbb E \big[ g(\tilde B^x_{t_1},...,\tilde B^x_{t_n})\big] e^{-x^2/2}dx$$ $$=\frac{1}{\sqrt{2\pi}} \int_{A}\bigg[ \int_{C[0,1]} g(\phi(t_1), \dots, \phi(t_n)) \mu^x(d\phi) \bigg] e^{-x^2/2}dx$$ $$=\int_{C[0,1]} g(\phi(t_1),...,\phi(t_n))\cdot 1_A(\phi(1)) \mu(d\phi)$$ $$=\Bbb E[g(B_{t_1},...,B_{t_n})\cdot 1_{\{B_1 \in A\}}]$$ where we used the formula from 2 in the third equality. This proves the claim.
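For $n=1$ the identity in 4 can even be checked in closed form: $(B_1,B_t)$ is centered Gaussian with covariance matrix $\begin{pmatrix}1 & t\\ t & t\end{pmatrix}$, while $\tilde B^x_t \sim N(tx,\,t(1-t))$. A short check (function names are mine) comparing the two sides at an arbitrary point:

```python
import math

def gauss_pdf(z, mean, var):
    """One-dimensional Gaussian density."""
    return math.exp(-(z - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def joint_pdf(x, y, t):
    """Density of (B_1, B_t) at (x, y): centered Gaussian, cov [[1, t], [t, t]]."""
    det = t * (1 - t)
    quad = (t * x * x - 2 * t * x * y + y * y) / det
    return math.exp(-quad / 2) / (2 * math.pi * math.sqrt(det))

t, x, y = 0.3, 1.2, -0.4                            # arbitrary test point
lhs = joint_pdf(x, y, t) / gauss_pdf(x, 0.0, 1.0)   # f_{B_1,B_t}(x,y) / f_{B_1}(x)
rhs = gauss_pdf(y, t * x, t * (1 - t))              # density of tilde B^x_t ~ N(tx, t(1-t))
# lhs and rhs agree, confirming the n = 1 case of the identity
```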

shalop
  • I'm not sure that this answer actually addresses the issue outlined in the problem, which is that the definition of conditional measures is only well-defined up to a set of measure zero. However, this answer is more for my own personal reference than to address that issue. – shalop Sep 16 '17 at 06:47
  • Hi Shalop, could you help me answer this question: https://math.stackexchange.com/questions/3719515/brownian-bridge-for-w-w-t-t-in-t-0-1? Many thanks! – Hermi Jun 15 '20 at 04:09