It is well known that the set of density operators $\{\rho\}$ for a quantum theory forms a convex set. As I have seen them defined, we simply say that a state corresponds to some linear operator $\rho$ which is self-adjoint, positive semidefinite, and of unit trace.
Now books often take it as obvious that, in a situation where there is some "classical" uncertainty about the state of a quantum system, which may be in the pure state $|\psi_n\rangle \langle \psi_n|$ with probability $p_n$, we must represent the state of the system by a weighted sum of these pure states. (I'm struggling to make this situation precise, which is part of the question I suppose.) More precisely, we take $$\rho = \sum_n p_n |\psi_n\rangle \langle \psi_n|, \qquad \sum_n p_n = 1, \qquad p_n \in [0,1]. \tag{A}$$
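As a sanity check on the construction in (A), here is a minimal sketch in plain Python that builds $\rho$ for a two-level system and verifies the three defining properties listed above. The ensemble, the restriction to real amplitudes, and the helper names `outer` and `mix` are assumptions chosen purely for illustration.

```python
import math

def outer(v):
    """Return the projector |v><v| for a real 2-component vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def mix(probs, states):
    """Return rho = sum_n p_n |psi_n><psi_n| as a 2x2 nested list."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for p, psi in zip(probs, states):
        proj = outer(psi)
        for i in range(2):
            for j in range(2):
                rho[i][j] += p * proj[i][j]
    return rho

# Hypothetical ensemble (an assumption, not from any experiment):
# |0> with probability 0.25, (|0> + |1>)/sqrt(2) with probability 0.75.
s = 1 / math.sqrt(2)
probs = [0.25, 0.75]
states = [[1.0, 0.0], [s, s]]
rho = mix(probs, states)

trace = rho[0][0] + rho[1][1]
det = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]

# For real amplitudes, self-adjoint reduces to symmetric.
is_symmetric = abs(rho[0][1] - rho[1][0]) < 1e-12
# A real symmetric 2x2 matrix is positive semidefinite iff trace >= 0 and det >= 0.
is_psd = trace >= 0 and det >= -1e-12

print(rho, trace, is_symmetric, is_psd)
```

This only confirms that (A) yields a valid state; it says nothing about whether it is the right state to assign.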
Is there any reason other than "experiment confirms this choice" for making this choice? Obviously this also depends on exactly which physical contexts imply this choice for $\rho$. For instance, if I enter a laboratory and am unsure whether a spin has been prepared up or down, but I know it is in one or the other, is it correct for me to assign $\rho = \frac{1}{2}|+\rangle \langle +|+\frac{1}{2}|-\rangle \langle -|$? This seems absurd, since it cannot reproduce the measurement statistics of the actual pure state.
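To make the worry concrete, here is a minimal sketch in plain Python comparing the Born-rule probability $\mathrm{Tr}(\rho P_+)$ of finding the spin up for the equal mixture and for the pure state $|+\rangle\langle +|$. The basis convention (taking $|\pm\rangle$ to be the eigenstates of the measured spin component), the supposition that the spin was in fact prepared up, and the helper names `outer` and `trace_product` are assumptions made for illustration.

```python
def outer(v):
    """Return the projector |v><v| for a real 2-component vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def trace_product(a, b):
    """Return Tr(a b) for 2x2 matrices given as nested lists."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

# Assumed convention: |+> = (1, 0) and |-> = (0, 1).
up, down = [1.0, 0.0], [0.0, 1.0]
P_up = outer(up)
P_down = outer(down)

# The equal mixture from the laboratory example.
rho_mixed = [[0.5 * P_up[i][j] + 0.5 * P_down[i][j] for j in range(2)]
             for i in range(2)]

# The state actually prepared, supposing the preparer chose "up".
rho_pure = outer(up)

prob_up_mixed = trace_product(rho_mixed, P_up)
prob_up_pure = trace_product(rho_pure, P_up)

print(prob_up_mixed, prob_up_pure)  # prints 0.5 1.0
```

The mixture predicts spin up half the time, while the prepared pure state predicts it every time, which is the mismatch I am describing.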
Thus to rephrase and try to be more precise, my questions are:
(1) When is (A) the correct choice for $\rho$?
(2) Can we prove that it is the correct choice without appealing to experiment? Note that I understand that this $\rho$ is a valid state, since $\{\rho\}$ forms a convex set; thus my question here is not "why is it a valid state?" but rather "why is it the valid state for this physical situation?"