In the book Rationality: From AI to Zombies, in the chapter Absence of Evidence Is Evidence of Absence, Eliezer Yudkowsky (the author) wrote the following:

If $E$ is a binary event and $P(H|E) > P(H)$, i.e., seeing $E$ increases the probability of $H$, then $P(H|¬E) < P(H)$, i.e., failure to observe $E$ decreases the probability of $H$. The probability $P(H)$ is a weighted mix of $P(H|E)$ and $P(H|¬E)$, and necessarily lies between the two.

(By "$\neg E$" Eliezer means "$E^{\complement}$", and by "failure to observe $E$" he means "observing that $E$ is false". Similar terminology was discussed in another question, whose subject is very similar to the subject of this chapter.)


According to the law of total probability, $P(H)=P(H|E)P(E)+P(H|\neg E)P(\neg E)$, so indeed, as Eliezer (and Wikipedia) noted, $P(H)$ is a weighted average of $P(H|E)$ and $P(H|\neg E)$, with nonnegative weights $P(E)$ and $P(\neg E)$ that sum to $1$. Along with the assumption that $P(H|E) > P(H)$, we can deduce that $P(H|\neg E)\le P(H)$.
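
As a small illustration of the weighted-average identity, here is a minimal sketch using a hypothetical example that is not part of the original question: one roll of a fair die, with $E$ = "the roll is even" and $H$ = "the roll is at least 4". The names `omega`, `prob` and `cond` are chosen only for this sketch, and `cond` uses the ratio definition, so it assumes the conditioning event has positive probability.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}   # the roll is even
H = {4, 5, 6}   # the roll is at least 4

def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event), len(omega))

def cond(a, b):
    """P(a | b) via the ratio definition; requires P(b) > 0."""
    return prob(a & b) / prob(b)

not_E = omega - E
p_h = prob(H)
p_h_given_e = cond(H, E)
p_h_given_not_e = cond(H, not_E)

# Law of total probability: P(H) as a weighted mix of the two conditionals.
total = p_h_given_e * prob(E) + p_h_given_not_e * prob(not_E)

print(p_h_given_not_e, p_h, p_h_given_e)  # 1/3 1/2 2/3
```

In this example both $P(E)$ and $P(\neg E)$ are positive, so the sketch says nothing about the degenerate case $P(E)=0$.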

So I see why $P(H|E)>P(H)$ implies $P(H|\neg E)\le P(H)$.
But does $P(H|E)>P(H)$ imply $P(H|\neg E)<P(H)$?

Is the following a valid counterexample, or am I missing something?

Let's say a real number in $[0,1]$ is chosen uniformly at random, let $H$ be the event that $1$ was chosen, and let $E$ be the event that an integer was chosen.
If I understand correctly:

  • $E$ is a binary event.
  • $P(H|E)=0.5$
    (here is a question in which a similar calculation of a conditional probability is discussed)
  • $P(H)=0=P(H|\neg E)$
    ($P(H)$ is indeed a weighted average of $P(H|E)$ and $P(H|¬E)$, as $P(E)=0$)
  • Thus, $P(H|E)>P(H)$ is true, but $P(H|\neg E)<P(H)$ is false.
Oren Milman
  • I think when you write $P(H|E)$ you assume $P(E)>0$, since you use the definition $P(H \cap E) / P(E)$. – GEdgar Sep 19 '18 at 13:29
  • I guess you are right, and that this is also what Eliezer assumed (assuming that the extension of conditional probability that drhab explained about is not the default definition of conditional probability), so my "counterexample" is irrelevant, as the claim in the title assumes $P(E)>0$. – Oren Milman Sep 19 '18 at 15:54
  • By the way, I think that the definition that drhab explained is the second definition in Wikipedia. The only thing I still wonder about is whether the claim in the title can also be proved for cases in which $P(E)=0$ or $P(E^C)=0$ (in case we are assuming this definition, and not the default one). Or is my example actually a counterexample in such a case? – Oren Milman Sep 19 '18 at 16:50

1 Answer

The characteristic property of the conditional probability $P(A\mid B)$ is the equality:$$P(A\mid B)P(B)=P(A\cap B)$$


Applying that here (i.e. multiplying both sides by $P(E)$), the statement $P(H\mid E)>P(H)$ can be converted into:$$P(H\cap E)>P(H)P(E)\tag1$$Similarly, multiplying both sides by $P(E^{\complement})$, the statement $P(H\mid E^{\complement})<P(H)$ can be converted into:$$P(H\cap E^{\complement})<P(H)P(E^{\complement})$$or equivalently, since $P(H\cap E^{\complement})=P(H)-P(H\cap E)$ and $P(E^{\complement})=1-P(E)$: $$P(H)-P(H\cap E)<P(H)-P(H)P(E)\tag2$$

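Subtracting $P(H)$ from both sides of $(2)$ and multiplying by $-1$ gives exactly $(1)$, so $(1)$ and $(2)$ are equivalent statements. Note that the conversions above need $P(E)>0$ and $P(E^{\complement})>0$.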
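
The equivalence can also be checked by brute force on a small finite space. The following is a minimal sketch, assuming a hypothetical uniform space with four outcomes (the names `omega`, `events` and `prob` are chosen only for this illustration). It enumerates every pair of events $(H, E)$ with $0<P(E)<1$ and counts the pairs for which "$E$ raises $H$" and "$E^{\complement}$ lowers $H$" disagree.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite sample space with the uniform measure.
omega = frozenset(range(4))

def events(space):
    """All subsets of the sample space."""
    items = sorted(space)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def prob(event):
    return Fraction(len(event), len(omega))

checked = 0
mismatches = 0
for E in events(omega):
    not_E = omega - E
    if prob(E) == 0 or prob(not_E) == 0:
        continue  # ratio definition of the conditionals does not apply
    for H in events(omega):
        raises = prob(H & E) / prob(E) > prob(H)              # P(H|E) > P(H)
        lowers = prob(H & not_E) / prob(not_E) < prob(H)      # P(H|E^c) < P(H)
        checked += 1
        if raises != lowers:
            mismatches += 1

print(checked, mismatches)  # 224 0
```

A finite check like this is only a sanity test of the algebra, not a proof, and it deliberately skips the cases $P(E)=0$ and $P(E^{\complement})=0$.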


Your counterexample makes use of an event $E$ that satisfies $P(E)=0$, which makes it tricky to speak of conditionals like $P(H\mid E)$.

What characterizes $P(H\mid E)$ is actually the equation $$P(E)P(H\mid E)=P(H\cap E)$$ and nothing more than that.

But in your case $P(E)=P(H\cap E)=0$ so that for any value for $P(H\mid E)$ this will be satisfied.

So there is no ground for setting $P(H\mid E)=0.5$ as you did.
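
To make the point concrete, here is a minimal sketch of how the characteristic equation behaves in the two situations. The candidate values and the positive-probability comparison case ($P(E)=1/2$, $P(H\cap E)=1/3$) are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction

candidates = [Fraction(0), Fraction(1, 2), Fraction(2, 3), Fraction(1)]

def solutions(p_e, p_h_and_e):
    """Candidate values q satisfying P(E) * q == P(H ∩ E)."""
    return [q for q in candidates if p_e * q == p_h_and_e]

# The question's example: P(E) = 0, hence P(H ∩ E) = 0 as well.
degenerate = solutions(Fraction(0), Fraction(0))

# Hypothetical comparison case with P(E) > 0.
regular = solutions(Fraction(1, 2), Fraction(1, 3))

print(degenerate == candidates)  # True: every candidate satisfies the equation
print(regular)                   # [Fraction(2, 3)]
```

When $P(E)>0$ the equation pins down a single value; when $P(E)=0$ it excludes nothing, so any particular choice has to be justified by something other than the equation.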

drhab
  • I think I understand what you mean by "characteristic", but can you point me to a resource that explains about that? It sounds quite useful :) – Oren Milman Sep 19 '18 at 14:35
  • Also, because $P(E)=0$, it holds that $P(\neg E)=1$, and so for the law of total probability to hold, it must be that $P(H|\neg E)=0$ (as expected). Now, IIUC, you proved the title of the question, so for the claim in the title to hold, it must be that $P(H|E)=0$, right? – Oren Milman Sep 19 '18 at 14:57
  • If $P(E)=0$ then the law of total probability $P(H)=P(H\mid E)P(E)+P(H\mid E^{\complement})P(E^{\complement})$ holds if $P(H)=P(H\mid E)\cdot0+P(H\mid E^{\complement})\cdot1=P(H\mid E^{\complement})$. Why do you expect $P(H\mid E^{\complement})=0$ in that case? – drhab Sep 19 '18 at 15:10
  • Because $P(H)=0$. Isn't it? – Oren Milman Sep 19 '18 at 15:12
  • The claim in the title (and also the converse of it) holds whenever $P(E)>0$. If you say "because $P(H)=0$.." are you then speaking about the special case of your counterexample? – drhab Sep 19 '18 at 15:25
  • Yes, sorry for the confusion. I interpreted your explanation (that uses characteristics) as meaning that the claim is always valid, and not only in case $P(E)>0$. So are you saying that I can't use $P(H|E)$ at all if $P(E)=0$? – Oren Milman Sep 19 '18 at 15:28
  • That goes too far. If we define $P(A\mid B)=P(A\cap B)/P(B)$ (as usual) then there is no definition of $P(A\mid B)$ if $P(B)=0$. But we could also “define”: $P(A\mid B)$ is a number in $[0,1]$ such that $P(A\mid B)P(B)=P(A\cap B)$. If $P(B)>0$ then both definitions are the same (so nothing is lost). Further it leaves $P\left(A\mid B\right)$ undetermined if $P\left(B\right)=0$. Then in certain cases we can make our own choice. – drhab Sep 19 '18 at 15:37
  • If e.g. $X,Y$ are independent random variables and $P\left(Y=y\right)=0$ then we can still go for $P\left(X\in A\mid Y=y\right)=P\left(X\in A\right)$. This choice agrees with our intuition (so something is gained). I have no source for this; it is purely my own thinking. – drhab Sep 19 '18 at 15:37
  • $P(X\in A|Y=y)=P(X\in A)$ makes perfect sense to me if $X,Y$ are independent, but $P(H|E)=0.5$ also makes sense to me in my "counterexample", just as $P(A|B)=0.5$ makes sense to me in this answer (or do you think this answer is wrong?). How do these latter two differ from your example of independent variables? (I am sorry for my endless inquiries, but I am still quite confused.) – Oren Milman Sep 19 '18 at 19:41
  • To keep things as general as possible I think it's best to state that $P(H\mid E)>P(H)\wedge P(E)\neq0\implies P(H\mid E^{\complement})<P(H)$ together with $P(H\mid E)>P(H)\wedge P(E)=0\implies P(H)=P(H\mid E^{\complement})$. So actually I am saying that your doubts about the implication in the title of your question are well-founded. – drhab Sep 20 '18 at 09:39
  • IIUC, my doubts will be well-founded only if there exists a case such that $P(H∣E)>P(H)∧P(E)=0$, but in such case it must be that $P(E)=P(H\cap E)=0$, and you wrote in your answer: "But in your case $P(E)=P(H\cap E)=0$ so that for any value for $P(H\mid E)$ this will be satisfied", so how could there be a case such that $P(H∣E)>P(H)∧P(E)=0$? In each such case I will have no ground for setting $P(H∣E)$ to any value, won't I? – Oren Milman Sep 20 '18 at 13:25
  • What I mean is: if, in spite of $P(E)=0$, we still allow ourselves to "define" $P(H\mid E)$ (in the way you referred to as the "second definition on wikipedia"), then the implication in the title of your question is not true in general and your own counterexample works. Then we could "define" $P(H\mid E):=0.5$ on the (weak) basis of intuition. – drhab Sep 20 '18 at 17:47