
Let $V =$ {Person $A$ is affected by a specific virus} and $+ =$ {Person $A$ has tested positive for the virus}. Let also $T_1$ and $T_2$ be two distinct occurrences of the same medical test (positive or negative). We are given that $P(V|+) = p,\ P(+|V) = q$ and $P(V) = v$.

Can we prove, using only the information specified above, that $T_1$ and $T_2$ are independent given $V$, namely

$$P(T_1 \cap T_2|V) = P(T_1|V)P(T_2|V)$$

For instance if $T_1$ and $T_2$ are both positive we would like to prove that $P(T_1 \cap T_2|V) = q^2$.

If not, under what conditions can it be reasonably assumed?

Edit: To give some more context, my question arises from an exercise that asked for the probability that a person gets a positive first test and a negative second, given that he has a specific virus. The author solved the exercise assuming conditional independence of the two tests, so I wondered whether there is a proof of that.

Nick
  • Suppose that $T_1$ and $T_2$ are the same test, so they are correct on the same patients and make a mistake on the same patients. Then obviously they are not independent, conditional or otherwise. But I'd say it is clear that nothing you wrote in the first paragraph contradicts the scenario of $T_1$ and $T_2$ being identical. After all, you are saying that they come with the same values of $p$ and $q$. That is VERY compatible with them being the same. So conversely, the info you wrote is not enough to say that they are NOT the same, let alone independent. – Vincent Jan 23 '23 at 14:21
  • $T_1$ and $T_2$ are the same test, performed two distinct times. How do I prove that they are independent given V? – Nick Jan 23 '23 at 14:38

2 Answers


You really need to know something about the biology of the disease and the inner workings of the test.

Following the comments on the other answer, we work with a sensitivity (i.e. $P(+|V)$) of 90%.

Here are two somewhat unrealistic scenarios:

  1. Suppose that 90% of people, when infected with the virus, produce a special antigen that the test picks up with 100% sensitivity, but 10% of people have a genetic disorder that prevents them from producing the antigen. Then, from the manufacturer's perspective, the sensitivity of the test is 90%: giving the test to a random person from the set of all virus carriers yields a 90% chance of a positive result. But the probability of getting two positive tests when testing the same infected person twice is still 90%, because we are only testing whether this person belongs to the 90% of the population that produces the antigen when infected.
  2. Suppose that 90% of tests are 100% sensitive to the virus, and 10% of tests are defective and 0% sensitive. Now, from a consumer's perspective, the sensitivity is 90%: if I have the virus and take a random test, I have a 90% chance of testing positive. But the probability of testing positive twice when taking two tests of this type is only 81%, because here we can assume conditional independence: buying a test is like drawing a marble from a vase in one of those math problems.

Most realities will be somewhere in between these extremes. There is also dependence on how far you stick the test into your nose and a lot of other factors. The more you know, the better you can model the part that you don't know as the result of a random process. But what is the right model and how much independence it contains depends on what you do know about the underlying mechanisms.
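The two scenarios can be checked with exact arithmetic. The following is a minimal sketch, not part of the original argument: the function names and the mixing weight `w` are hypothetical, introduced only to illustrate the "somewhere in between" case as a simple blend of the two extremes.

```python
from fractions import Fraction

q = Fraction(9, 10)  # sensitivity P(+|V), the 90% figure used above


def p_both_positive_person_level(q):
    """Scenario 1: the randomness sits in the patient.

    With probability q the infected person produces the antigen and every
    test is positive; otherwise every test is negative.
    """
    return q * 1 * 1 + (1 - q) * 0 * 0


def p_both_positive_kit_level(q):
    """Scenario 2: the randomness sits in each test kit, drawn independently."""
    p_single = q * 1 + (1 - q) * 0  # chance that one randomly bought kit is positive
    return p_single * p_single


def p_both_positive_mixture(q, w):
    """Hypothetical blend: weight w on scenario 1, weight 1 - w on scenario 2."""
    return (w * p_both_positive_person_level(q)
            + (1 - w) * p_both_positive_kit_level(q))


print(p_both_positive_person_level(q))               # 9/10
print(p_both_positive_kit_level(q))                  # 81/100
print(p_both_positive_mixture(q, Fraction(1, 2)))    # 171/200
```

In every case a single test is positive with probability $q$, so the single-test sensitivity alone cannot distinguish between these models.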

Vincent
  • There is a separate issue (in the comments to the other answer) that when testing twice the probability of at least one positive test is a more useful quantity than the probability of exactly two positive tests, but I leave that aside here as the question seems to be about if and when we compute that latter quantity. – Vincent Jan 23 '23 at 16:50
  • You seem to overanalyze. The question is not about medical testing, but rather medical testing is used to formulate an example about probabilities. – Nick Jan 23 '23 at 17:00
  • Yes, but the point is that the givens are not enough to know if there is conditional independence. There are multiple scenarios compatible with the givens in which you can deduce independence or perfect correlation because of the extra information in the scenario, and it follows that there are also scenarios where you don't have enough information and can only assume that there is likely 'some' correlation, but we don't know how much. – Vincent Jan 23 '23 at 17:03
  • If you want to give this problem to someone and ask them to compute under the assumption of conditional independence, then you need to either state explicitly that the tests are conditionally independent or give enough details on the background for the person to be able to deduce it from there. If that is the setting (e.g. if you are a math teacher) I think explicitly stating the assumptions is the safer option. – Vincent Jan 23 '23 at 17:06
  • But for your question 'under what conditions it can be reasonably assumed' it depends on the context: if your example about probabilities tries to model some real life phenomenon then the extra conditions can come from there; if it is pure mathematics then the conditional independence is itself the clearest and strongest formulation of this assumption. – Vincent Jan 23 '23 at 17:10
  • Alright. My verdict is that it can only be reasonably assumed and not proven. My question arose from an exercise that was asking to find the probability that a person gets a positive first test and a negative second, given that he has a specific virus. The author solved the exercise assuming conditional independence of the two tests, so I wondered if there is a proof for that. – Nick Jan 23 '23 at 18:01
  • Ah no, it should be stated, or something else should. But you will see this often. Independence (conditional independence is a special, rarer case) is an assumption that people make tacitly and perhaps unknowingly all the time because they are used to exercises where it holds or because it makes computation easier. It is good that you are aware that there was an unspoken assumption here. – Vincent Jan 23 '23 at 21:02
  • My answer conditional independence of diagnostic tests is pertinent to your question, @Nick. $\quad$ In particular, its points #3 & 4 show that the comment above this one is incorrect to suggest that conditional independence is a special case of (i.e., implies) independence. – ryang Feb 28 '23 at 20:40

"For instance if $T_1$ and $T_2$ are both positive we would like to prove that $P(T_1 \cap T_2|V) = q^2$"

In general, we certainly can't agree to assertions of this type.

I would say we could only make that assertion if both the sensitivity and specificity of the two tests are identical or near identical.

[ Sensitivity $= \frac{\text{True positive}}{\text{True positive} + \text{False negative}}$,
Specificity $= \frac{\text{True negative}}{\text{True negative} + \text{False positive}}$ ]

Such situations are quite unlikely; that is why so-called "gold standards" exist for different tests.


P.S.

  1. Please don't change your question after answer(s) have been received as it can invalidate posted answer(s)

  2. $P(T_1\cap T_2 |V)$ is definitely not equal to $q^2$ as it implies that if the repeat test returns +, you are less sure that you have the disease.

  3. It is not clear to me what you are driving at.

PPS:

It is best to take a simpler model where the test has an "accuracy" of, say, $90\%$, i.e. the fraction of diseased people testing positive is $0.9$ and the fraction of healthy people testing negative is also $0.9$.
Then, if two tests return positive,
it doesn't mean that P(diseased|positive twice) $= 0.9\cdot0.9$,
rather the probability is $1 -(1-0.9)^2$.
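For comparison, here is a sketch of how P(diseased|positive twice) and P(positive twice|diseased) would each be computed by Bayes' rule. It rests on two assumptions that are not given in the question: that the two results are conditionally independent given disease status (both for diseased and for healthy people), and a hypothetical prevalence of $1/2$. The function name is likewise illustrative.

```python
from fractions import Fraction


def posterior_after_two_positives(prior, sensitivity, specificity):
    """P(V | +,+) by Bayes' rule.

    ASSUMPTION: the two results are conditionally independent given V and
    also given not-V. This is not implied by the givens in the question.
    """
    p_pp_given_v = sensitivity ** 2            # P(+,+ | V) under the assumption
    p_pp_given_not_v = (1 - specificity) ** 2  # P(+,+ | not V) under the assumption
    numerator = prior * p_pp_given_v
    return numerator / (numerator + (1 - prior) * p_pp_given_not_v)


acc = Fraction(9, 10)    # the 90% "accuracy" from the simpler model
prior = Fraction(1, 2)   # hypothetical prevalence, chosen only for illustration

p_two_pos_given_diseased = acc ** 2
p_diseased_given_two_pos = posterior_after_two_positives(prior, acc, acc)

print(p_two_pos_given_diseased)   # 81/100
print(p_diseased_given_two_pos)   # 81/82
```

Under these assumptions the two quantities differ, and the posterior changes with the chosen prior, so it is worth keeping the direction of conditioning explicit.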

In all the cross-correspondence, I had lost track of your original query, viz. if two (same) tests give values of $p_1$ and $p_2$ for P(+|diseased), would the value be $p_1\cdot p_2$? No, it would be the simple average $\large\frac{p_1+p_2}{2}$.

  • I clarified my question. $T_1$ and $T_2$ are two distinct occurrences of the same test. – Nick Jan 23 '23 at 14:41
  • Is it safe to say $$\mathbb{P}(T_2|V\cap T_1)=\mathbb{P}(T_2|V) $$ since any additional information about the outcome of test #1 is irrelevant when it's known that the patient is infected with the specific virus? – Matthew H. Jan 23 '23 at 15:14
  • Matthew H. are you asking me? That's basically what I asked. But I wanted a proof. – Nick Jan 23 '23 at 15:23
  • I'm asking @trueblueanvil. If you can make this assumption (which seems reasonable) then from the chain rule $$\begin{eqnarray}\mathbb{P}(T_1\cap T_2|V)&=&\mathbb{P}(T_2|T_1\cap V)\mathbb{P}(T_1|V)\\&=&\mathbb{P}(T_2|V)\mathbb{P}(T_1|V)\end{eqnarray}$$ – Matthew H. Jan 23 '23 at 15:31
  • Matthew H. conditional independence of two events $A$ and $B$ given $C$ is defined in two different ways. One is $P(A \cap B|C) = P(A|C)P(B|C)$ and the other is $P(A|B \cap C) = P(A|C)$. So my question is the same as yours. – Nick Jan 23 '23 at 15:38
  • @MatthewH: I am adding a PPS shortly. – true blue anil Jan 23 '23 at 15:42
  • true blue anil $P(V|+_1 \cap +_2 \cap \cdots)$ should be increasing as it says the more positive tests you get, the more likely it is to have the virus but $P(+_1 \cap +_2 \cap \cdots | V)$ should be decreasing as it says, if you have the virus, the more positive tests you get, the more likely it is for one of them to err. Am I wrong? – Nick Jan 23 '23 at 15:52
  • @Nick: Please see the PPS – true blue anil Jan 23 '23 at 15:57
  • true blue anil: I'm not asking whether $P(\text{diseased}|\text{positive twice}) = 0.9\cdot0.9$. I'm asking whether $P(\text{positive twice}|\text{diseased}) = 0.9\cdot0.9$. – Nick Jan 23 '23 at 16:18
  • @Nick: Common sense would dictate a simple average, thus $\large\frac{0.9+0.9}{2} = 0.9$ – true blue anil Jan 23 '23 at 16:22
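
The two formulations of conditional independence quoted in the comments agree whenever $P(T_1 \cap V) > 0$. A short derivation, using only the definition of conditional probability:

$$P(T_1\cap T_2|V)=\frac{P(T_1\cap T_2\cap V)}{P(V)}=\frac{P(T_1\cap T_2\cap V)}{P(T_1\cap V)}\cdot\frac{P(T_1\cap V)}{P(V)}=P(T_2|T_1\cap V)\,P(T_1|V)$$

Hence $P(T_2|T_1\cap V)=P(T_2|V)$ holds if and only if $P(T_1\cap T_2|V)=P(T_1|V)P(T_2|V)$. This identity only shows that the two forms are equivalent; neither of them follows from $p$, $q$ and $v$ alone.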