7

Does anyone know of a definition, or a reference, for conditional expectation on a measure space with a $\sigma$-finite measure?

I think it should be as follows:

Let $(X,\mathcal{B},\nu)$ be a measure space and let $\mathcal{F}\subset\mathcal{B}$ be a sub-$\sigma$-algebra such that $\nu$ is $\sigma$-finite on $\mathcal{F}$. Then for every $f\in L^1(X,\mathcal{B},\nu)$ there exists $g\in L^1(X,\mathcal{F},\nu|_{\mathcal{F}})$ such that $$\int_{E}f\,d\nu=\int_E g\,d\nu|_{\mathcal{F}},\qquad\forall E\in\mathcal{F};$$ then $g=:\mathbb{E}_{\nu}[f|\mathcal{F}]$ is called the conditional expectation of $f$ given $\mathcal{F}$.

Is this the correct way to define conditional expectation? Is there another way to define it that does not require $\nu$ to be $\sigma$-finite on $\mathcal{F}$?

Iosif Pinelis
  • 116,648
Rusbert
  • 173
  • 1
    The hard part is making sense of what a conditional probability distribution means https://math.stackexchange.com/questions/496608/formal-definition-of-conditional-probability?noredirect=1&lq=1 – Christian Chapman Mar 10 '18 at 02:01

2 Answers

6

One can define a reasonable notion of conditional expectation for arbitrary localizable measurable spaces, not necessarily σ-finite. This is explained in great detail in the answer to “Is there an introduction to probability theory from a structuralist/categorical perspective?”

The “pushforward for L_1-spaces” mentioned there is precisely the conditional expectation. Let me offer a few comments and expand this a little bit.

First, one doesn't really need a measure μ to talk about conditional expectations, only a measure class [μ], or equivalently, a σ-ideal N of negligible sets (alias sets of measure 0). Given a set X with a σ-algebra M of measurable subsets and a σ-ideal N of negligible subsets, one can define the set of (finite complex-valued) measures on (X,M,N) as the set of countably additive functions M→C that vanish on N. Infinite measures can be defined using [0,∞] instead of C. Furthermore, given a faithful finite measure μ on (X,M,N), one can identify the set of μ-integrable functions f with the set of finite measures ν via the isomorphism f↦fμ supplied by the Radon-Nikodym theorem.

Given a morphism (X,M,N)→(X',M',N'), one can pushforward a finite measure on (X,M,N) and get a finite measure on (X',M',N'), by taking the preimage of a measurable subset of X' and computing its measure as a subset of X. This is the conditional expectation. In particular, in the notation of the original post, one uses the morphism (X,B,N)→(X,F,N), where N is the σ-ideal of sets with ν-measure zero. Although both the domain and codomain have the same underlying set X, it's best to think of them as different spaces, and the pushforward can then be thought of as the fiberwise integration. (For instance, take X=[0,1]×[0,1] with the Borel σ-algebra and the σ-algebra of “vertical” Borel sets. The resulting morphism will be isomorphic to the projection [0,1]×[0,1]→[0,1], and the pushforward will be the fiberwise integration map.)
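
The unit-square example can be checked numerically. The following is only a minimal sketch under stated assumptions: the measure is taken to be Lebesgue measure (the text above names only the σ-algebras), the integrand `f(x, y) = x + y` and all function names are hypothetical, and integrals are approximated by the midpoint rule. It compares the integral of `f` over a vertical strip E×[0,1] with the integral over E of the fiberwise integral g(x) = ∫ f(x,y) dy, which is the defining property of the pushforward density.

```python
def fiber_integral(f, x, n=200):
    """Midpoint-rule approximation of g(x) = integral of f(x, y) dy over [0, 1]."""
    h = 1.0 / n
    return sum(f(x, (j + 0.5) * h) for j in range(n)) * h


def integral_over_strip(f, a, b, n=200):
    """Approximate the integral of f over the vertical strip [a, b] x [0, 1]."""
    hx = (b - a) / n
    hy = 1.0 / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            total += f(x, (j + 0.5) * hy)
    return total * hx * hy


def integral_of_fiber_density(f, a, b, n=200):
    """Approximate the integral over [a, b] of the fiberwise integral g."""
    hx = (b - a) / n
    return sum(fiber_integral(f, a + (i + 0.5) * hx, n) for i in range(n)) * hx


def f(x, y):
    # Hypothetical integrand; its exact fiberwise integral is g(x) = x + 1/2.
    return x + y


if __name__ == "__main__":
    print(fiber_integral(f, 0.25))                 # close to 0.75
    print(integral_over_strip(f, 0.0, 0.5))        # close to 0.375
    print(integral_of_fiber_density(f, 0.0, 0.5))  # close to 0.375
```

Here the vertical strip plays the role of a set in the smaller σ-algebra, and `fiber_integral` is the function on [0,1] that one obtains after applying Radon-Nikodym to the pushforward measure.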

Let's illustrate the above construction on the example by Iosif Pinelis. The product of the infinite measure ν and the function f is the finite measure fν. Its pushforward along the map (X,B,∅)→(X,F,∅) can be computed as follows. The codomain is isomorphic to the measurable space consisting of a single point (which is how one should think about it geometrically). By definition, measures on (X,F,∅) can be identified with complex numbers, and the pushforward of a finite measure on (X,B,∅) simply computes the measure of X. Thus, in this example the conditional expectation is the measure on (X,F,∅) that assigns the number 1 (the sum of 1/2^x over x=1,2,…, in agreement with the value E_ν f=1 stated in that example) to X.

Of course, most analysts are more comfortable with functions rather than measures and prefer to use the Radon-Nikodym theorem with respect to the pushforward of ν to convert the pushforward of fν to an integrable function. However, this is only possible if the pushforward of ν is a faithful semifinite measure, whereas in this example it is a purely infinite, nonsemifinite measure.

However, the failure of the Radon-Nikodym theorem doesn't mean that the conditional expectation doesn't exist, but rather that it exists only as a finite measure that cannot be converted to an integrable function.

Dmitri Pavlov
  • 36,141
  • 1
    It appears that you suggest to define the conditional expectation just as the push-forward measure, with some additional care taken of the null sets. This way, you can of course avoid division by $\infty$ (and any other division!) and thus avoid the need for the sigma-finiteness condition. However, I don't think many people in probability will agree that you may call push-forward measures conditional expectations, especially because the people already have and use the separate notion of the push-forward measure, and that use is quite different from the actual use of conditional expectations. – Iosif Pinelis Mar 11 '18 at 04:02
  • My view is that the expectation is an average; accordingly, the conditional expectation is a conditional average -- so that some kind of division is needed. To me, saying that the conditional expectation is a push-forward measure sounds like a call to completely remove the useful and very broadly used notion of the actual conditional expectation. Even if we purchase non-sigma-finite measures this way, I don't think the price is right. – Iosif Pinelis Mar 11 '18 at 04:30
  • @IosifPinelis: This construction recovers the traditional definition of conditional expectation once we apply the Radon-Nikodym theorem (in the semifinite case). I don't see any problem switching between two equivalent languages for conditional expectations, especially if one of these languages removes the rather artificial restrictions imposed by the other. As for the price, can one prove something with the traditional definition that one can't prove with the new definition? I'd say that having multiple (equivalent) points of view on the same object is a blessing, not a curse. – Dmitri Pavlov Mar 11 '18 at 04:39
  • 1
    Of course, one can apply the Radon--Nikodym (RN) theorem, with null sets taken care of. In particular, one can apply the RN theorem to the restriction of the measure to a sub-sigma-algebra. So then, what is your suggestion? To expropriate the term "conditional expectation" from its current usage and give it to the push-forward measure? But why do that? We already have the established notion of a push-forward measure, and we know that one does not need sigma-finiteness to have such a measure. – Iosif Pinelis Mar 11 '18 at 04:56
  • Also, I of course agree that it is good to have multiple equivalent points of view of the same object, but I don't see what this has to do with what appears to be your suggestion -- to take an established term ("conditional expectation") and give it to another object that already has an established name ("push-forward measure"). – Iosif Pinelis Mar 11 '18 at 05:06
  • @IosifPinelis: You seem to incorrectly think that my answer has something to do with terminology. I am not proposing any new terminology in my answer. Rather, the OP asked for an extension (with similar properties) of a certain construction from the class of σ-finite measurable spaces to non-σ-finite measurable spaces, and I provided such an extension. The choice of terminology is irrelevant for my answer. – Dmitri Pavlov Mar 11 '18 at 05:20
  • @IosifPinelis: In other words, there are two questions here. Question 1: Does the construction of a conditional expectation admit an extension to non-σ-finite spaces, with similar properties? (Answer: yes.) Question 2: What name should we use for this construction? Nothing in my answer or my comments ever purports to answer the second question. – Dmitri Pavlov Mar 11 '18 at 05:30
  • 1
    I cannot see why what you proposed, "pushforward a finite measure on (X,M,N) and get a finite measure on (X',M',N') [...]. This is the conditional expectation", is an extension of the accepted notion of the conditional expectation to non-finite-measures. Indeed, in the sigma-finite case, what you proposed is not at all the same as the accepted notion of the conditional expectation. As you wrote, your "conditional expectation" is a push-forward measure. So, how can I see this as anything but a suggestion about terminology, to assign the name of a known object to another known object? – Iosif Pinelis Mar 11 '18 at 05:52
  • @IosifPinelis: My construction defines a functor from the category of localizable measurable spaces to the category of sets. The restriction of this functor to the category of σ-finite measurable spaces is naturally isomorphic (via Radon-Nikodym) to the functor defined by the OP. Accordingly, my construction is an extension of the OP's construction. – Dmitri Pavlov Mar 11 '18 at 06:03
  • I think that for people in probability (at least) it is irrelevant that these two categories are isomorphic; this is not at all enough to claim that the corresponding notions -- of the push-forward measures and the Radon--Nikodym (RN) densities -- are the same. Here is an example: the set of all prime numbers and the set of all natural numbers are isomorphic to each other as, say, the lattices with the natural order on them. Will number theorists now accept that these two notions are the same, of prime numbers and natural numbers? – Iosif Pinelis Mar 11 '18 at 13:01
  • Previous comment, continued: Actually, I think that in your case the situation is only worse than in the above example: in your case, the only nontrivial part is the isomorphism already provided by Radon--Nikodym while the notion of push-forward measure is rather trivial, whereas in the above example the lattice isomorphism can be explained in elementary school, maybe. Of course, I realize that this discussion is rather meta-mathematical. – Iosif Pinelis Mar 11 '18 at 13:01
  • @IosifPinelis: It's not the categories that are isomorphic, it's the functors. The structure used, e.g., in the prime number theorem is not just the lattice structure, but the embedding of the lattice of prime numbers into integers. The embeddings of prime and natural numbers into integers are not isomorphic. With respect to this, if you could cite a specific theorem involving conditional expectations that becomes false for this generalization, this would be very helpful. Otherwise, it is exceptionally difficult to respond to such vague criticism. – Dmitri Pavlov Mar 11 '18 at 17:28
  • Sorry for misreading what you said about the isomorphism: it is, not one between categories, but one between functors, one functor defined by you (how?) and one functor you said defined by the OP (how?). I obviously know next to nothing in category theory; however, I suspect that your natural isomorphism, translated from the category theory language, boils down to the linear isomorphism between the densities $f$ and the corresponding measures $A\mapsto\int_A f\,d\nu$ -- am I wrong here? – Iosif Pinelis Mar 11 '18 at 19:05
  • Previous comment, continued: If so, I can see no substantial difference between your suggestion and my example with prime and natural numbers. Of course, that example has no relevance in number theory, and I still see no relevance of your suggestion to probability theory (say). I still see the essence of your suggestion as hardly more than the suggestion to call push-forward measures conditional expectations: "pushforward a finite measure [...]. This is the conditional expectation", in your words. – Iosif Pinelis Mar 11 '18 at 19:05
  • Previous comment, continued: I have never said that accepting this suggestion would make any theorem false. I have just questioned if this suggestion is relevant and useful. So, can you present at least one theorem about conditional expectations (translated from the category theory language into what most people in probability use, please) that your suggestion would help discover or prove? Then I think people in probability may get interested in learning category theory. – Iosif Pinelis Mar 11 '18 at 19:06
  • @IosifPinelis: The first functor sends a finite measure space (X,B,ν) to the set of equivalence classes of ν-integrable functions and a morphism (i.e., equivalence class of measurable maps) of measure spaces to the induced conditional expectation map between integrable functions. The second functor sends a finite measure space (X,B,ν) to the set of finite measures on (X,B) that vanish on sets of ν-measure 0. – Dmitri Pavlov Mar 11 '18 at 23:59
  • @IosifPinelis: These two functors are isomorphic because for any finite measure space (X,B,ν) the Radon-Nikodym theorem supplies an isomorphism from integrable functions to finite measures and this isomorphism is natural, i.e., respects morphisms of measurable spaces. – Dmitri Pavlov Mar 12 '18 at 00:00
  • @IosifPinelis: Concerning the example with prime numbers: expressed in the language of category theory, the two objects in the slice category Lat/Z (i.e., lattices equipped with an order-preserving map to Z) given by prime numbers and natural numbers (both embedded into Z) are not isomorphic. Thus this example is not analogous to the generalization of conditional expectation discussed here. – Dmitri Pavlov Mar 12 '18 at 00:02
  • @IosifPinelis: As for examples of theorems about conditional expectations, the law of total expectation is still valid in the new generality. – Dmitri Pavlov Mar 12 '18 at 00:03
  • What do you mean by the law of total expectation (LTE), when no division is allowed? The usual LTE is that the expectation of the conditional expectation (with implicit division allowed) is the unconditional expectation. But your "conditional expectation" is a measure. So, do you have the expectation of a measure? – Iosif Pinelis Mar 12 '18 at 01:05
  • @IosifPinelis: The law of total expectation states that given a probability space (X,B,ν) and two σ-algebras F⊂G⊂B, then for any random variable V on (X,B,ν) we have E[E[V|G]|F] = E[V|F]. Exactly the same statement is true if E[−|−] is interpreted using the definition given above (i.e., using the Radon-Nikodym isomorphism). – Dmitri Pavlov Mar 12 '18 at 02:00
  • So, here you only seem to have one probability space, $(X,B,\nu)$. Then your $E(V|G)$ is just the restriction $\mu|_G$ of the measure $d\mu:=V d\nu$ to a sub-sigma-algebra $G$ of $B$, and then $E(E(V|G)|F)$ is the further restriction of $\mu|_G$ to a sub-sigma-algebra $F$ of the sub-sigma-algebra $G$ of $B$. Right? If so, then your "law of total expectation" states that, given $F\subset G\subset B$, the restriction to $F$ of the restriction to $G$ of a measure $\mu$ on $B$ is the same as the restriction of $\mu$ to $F$, that is, $(\mu|_G)|_F=\mu|_F$. – Iosif Pinelis Mar 12 '18 at 02:59
  • Previous comment continued: But this is trivially true even much more generally: when $F, G, B$ are not even sigma-algebras but any sets such that $F\subset G\subset B$ and $\mu$ is not a measure (of the special form $d\mu=V d\nu$ or at all) but any function on the set $B$. So, if my understanding here is correct, I am afraid people will not be impressed with this theorem, and I don't think this will motivate them to learn category theory. – Iosif Pinelis Mar 12 '18 at 02:59
  • @IosifPinelis: Yes, that's exactly what the law of total expectation says. This formulation, in particular, gives the usual law of total expectation, via the Radon-Nikodym isomorphism. As for being impressed with this theorem, well, it is exactly as impressive as the law of total expectation (which is to say, not that much). If you know a deeper result about conditional expectations that you would like to see generalized, then by all means, please mention it here. Finally, absolutely nothing in my answer or comments purports to motivate anybody to learn category theory. – Dmitri Pavlov Mar 12 '18 at 03:18
  • It is true that you never said directly: "Study category theory!" However, your comments about a natural isomorphism of functors and a reference in your answer to a previous answer made it look as though to achieve better understanding of conditional expectation one may want to learn at least basics of category theory. I doubted that throughout, but still had a spark of positive hope in that regard. So, to find out how useful and relevant your suggestion is, I asked you to "present at least one theorem about conditional expectations [...] that your suggestion would help discover or prove". – Iosif Pinelis Mar 12 '18 at 13:49
  • Previous comment, continued: You then presented what you call "the law of total expectation", which is actually a "law" of iterated restriction: $(\mu|_G)|_F=\mu|_F$, which actually holds for any sets $F, G, B$ such that $F\subset G\subset B$ and for any function $\mu$ on $B$. Of course, this "law" is an utter triviality, devoid of any probability content or relevance. Is this the best you could come up with in this regard? – Iosif Pinelis Mar 12 '18 at 13:49
  • Previous comment, continued: I think what is deep and impressive about the accepted notion of conditional expectation is its existence, which is the main content of the Radon--Nikodym (RN) theorem. The actual law of total expectation, the way it is most commonly used, is that $EE(f|F)=Ef$ for any (say nonnegative) $B$-measurable function $f$ and any sub-sigma-algebra $F$ of $B$; of course, this follows immediately from the RN theorem. – Iosif Pinelis Mar 12 '18 at 13:50
  • Previous comment, continued: With your suggestion, you don't seem to have any analogue of $EE(f|F)=Ef$, with the unconditional expectation; indeed, your "conditional expectation" is actually not any kind of expectation, not any kind of average. – Iosif Pinelis Mar 12 '18 at 13:51
  • 1
    Previous comment, continued: I am wondering if your notion of "conditional expectation" (which is actually a restriction or, more generally, a push-forward of a measure) has ever been used (under this name, "conditional expectation") in any publication. If not, do you know anyone except for yourself who believes that a push-forward of a measure is a useful extension of the accepted notion of conditional expectation? – Iosif Pinelis Mar 12 '18 at 13:52
  • @IosifPinelis: As I pointed out previously, I am not interested in terminological or sociological discussions (e.g., do I know anybody who...), so there will be no response to this part of your comments. I will respond to the only mathematical claim in your comments, which seems to incorrectly claim that there is no analog of E(E(f|F))=Ef. As a matter of fact, the statement E(E(f|F))=Ef continues to be valid in the generalized setting: the unconditional expectation is defined as the pushforward to a single point, and measures on a single point are identified with numbers. – Dmitri Pavlov Mar 12 '18 at 17:06
  • @IosifPinelis: I reiterate my suggestion for you to provide a result about conditional expectations that you consider to be interesting. I already provided the law of total expectation, which apparently is not interesting enough. I cannot go through every single statement in probability theory (of which there are hundreds) simply to discover what statements you consider to be interesting. The burden is on you to provide an example. – Dmitri Pavlov Mar 12 '18 at 17:11
  • I did say: "I think what is deep and impressive [and hence interesting] about the accepted notion of conditional expectation is its existence, which is the main content of the Radon--Nikodym (RN) theorem." As for the character of this discussion, it has always been rather meta-mathematical. My expressed concern has been, from the beginning, about the relevance and usefulness of your suggestion. So, questions on whether your suggestion has actually been used/accepted seem appropriate here. – Iosif Pinelis Mar 12 '18 at 17:38
  • @IosifPinelis: This answer does not attempt to commence a revolution in probability theory, but rather to answer a question by the OP who inquired about the possibility of defining conditional expectations for non-σ-finite spaces. I proposed such a generalization and demonstrated that some results about conditional expectations are still valid in this setting, e.g., the law of total expectation. You seem to continuously insinuate (but not claim it directly) that this generalization will fail in other usages, but refuse to provide even a single example. – Dmitri Pavlov Mar 12 '18 at 17:53
  • I have never even thought of your suggestion in other usages (and I don't even know what you mean by that). What I have questioned is the relevance and usefulness of your suggestion regarding precisely what was the subject of the OP's question: the notion of conditional expectation. – Iosif Pinelis Mar 12 '18 at 18:35
  • @IosifPinelis: Other usages (i.e., other than the law of total expectation) of conditional expectations, of course. I do not think it is appropriate to question anything without providing explicit mathematical examples that would substantiate your claims. Since you refuse to provide such specific examples, I suggest to terminate this discussion. – Dmitri Pavlov Mar 12 '18 at 21:11
  • 2
    I don't know what you mean by my claims. I don't think I have made any claims; once again, I have only questioned the relevance and usefulness of your suggestion. I also don't know what you mean by "specific examples". You asked me for an example of what I consider impressive regarding conditional expectation. To that, I answered: "I think what is deep and impressive about the accepted notion of conditional expectation is its existence, which is the main content of the Radon--Nikodym (RN) theorem." I agree to terminate this discussion now, as I have no further questions to you on the matter. – Iosif Pinelis Mar 13 '18 at 02:28
3

$\newcommand{\N}{\mathbb N} \newcommand{\R}{\mathbb R} \newcommand{\B}{\mathcal B} \newcommand{\F}{\mathcal F} \newcommand{\la}{\lambda} \newcommand{\si}{\sigma} \newcommand{\Si}{\Sigma} \renewcommand{\c}{\circ} \newcommand{\tr}{\operatorname{tr}}$

The definition you quoted is correct.
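
In the simplest case, where $X$ is finite and $\F$ is generated by a partition into blocks of positive finite measure, the defining identity forces $g$ to be the $\nu$-weighted average of $f$ on each block. The sketch below illustrates this; the point masses, the function `f`, the partition, and the helper names are all hypothetical choices made for illustration only.

```python
from fractions import Fraction


def integral(h, nu, E):
    """Integral of h over the finite set E with respect to the point masses nu."""
    return sum(h[x] * nu[x] for x in E)


def conditional_expectation(f, nu, partition):
    """Return g, constant on each block, with matching integrals over every block."""
    g = {}
    for block in partition:
        mass = sum(nu[x] for x in block)
        if mass == 0:
            value = Fraction(0)  # on a null block any value works
        else:
            value = integral(f, nu, block) / mass  # nu-average of f on the block
        for x in block:
            g[x] = value
    return g


# Hypothetical data: four points, two blocks.
nu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1), 3: Fraction(4)}
f = {0: Fraction(3), 1: Fraction(0), 2: Fraction(5), 3: Fraction(10)}
partition = [[0, 1], [2, 3]]

g = conditional_expectation(f, nu, partition)

if __name__ == "__main__":
    for block in partition:
        # The two integrals agree on every block, hence on every set in F.
        print(block, integral(f, nu, block), integral(g, nu, block))
```

The division by `mass` is where finiteness of the measure on each block is used; it is exactly this step that has no analogue on an atom of infinite measure.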

However, there can be no reasonable notion of the conditional expectation without the sigma-finiteness condition, even in the discrete setting. E.g., let $X=\N$, $\B=2^\N$, and let $\F$ be any sigma-algebra over $\N$ containing an infinite atom $A\subseteq\N$; for instance, one may take $\F=\{\emptyset,\N\}$, with $A=\N$. Let $\nu$ be the counting measure on $\B=2^\N$, and let $f(x)=1/2^x$ for $x\in\N$. Then $E_\nu f=1\in\R$.

However, on the atom $A$ one cannot reasonably ascribe any value to the conditional expectation $E_\nu(f|\F)$, because such a value (say $v$) could reasonably be only the $\nu$-average of $f$ on $A$. Indeed, if you take $v=0$, this would imply $\int_A f\,d\nu=0$, which is false; if you take $v\ne0$, this would imply $|\int_A f\,d\nu|=|v|\nu(A)=\infty$, which is also false.
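
The dichotomy can be seen numerically by truncating to $\{1,\dots,n\}$. This is only an illustrative sketch, assuming $\N=\{1,2,\dots\}$ (consistent with $E_\nu f=1$) and the case $A=\N$; the function names are hypothetical. The integral of $f$ over the truncation stays below $1$, the integral of any nonzero constant $v$ grows without bound, and the truncated averages of $f$ shrink to $0$.

```python
from fractions import Fraction


def integral_f(n):
    """Integral of f(x) = 1/2**x over {1, ..., n} w.r.t. counting measure."""
    return sum(Fraction(1, 2 ** x) for x in range(1, n + 1))


def integral_const(v, n):
    """Integral of the constant v over {1, ..., n} w.r.t. counting measure."""
    return v * n


def truncated_average(n):
    """Counting-measure average of f over {1, ..., n}."""
    return integral_f(n) / n


if __name__ == "__main__":
    v = Fraction(1, 1000)  # an arbitrary small nonzero candidate value
    for n in (10, 100, 1000, 10000):
        # integral_f(n) stays below 1, while integral_const(v, n) keeps growing
        print(n, float(integral_f(n)), float(integral_const(v, n)),
              float(truncated_average(n)))
```

So the only candidate consistent with averaging is the limit of the truncated averages, namely $0$, and that value fails the defining identity because $\int_A f\,d\nu=1\ne0$.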

The problem here is that, while the measure $\nu$ is sigma-finite, its restriction $\nu|_\F$ to $\F$ is not.

Iosif Pinelis
  • 116,648