13

I posted this question on Math, but there has been silence there since. So, I wonder if anyone here can get any closer to the answer to my question using Mathematica. Here is the question:

Suppose I draw $N$ random variables from independent but identical uniform distributions, where $N$ is an even integer. I now sort the drawn values and find the two middlemost of these. Finally, I calculate a simple average of these two middlemost values.

Is there a closed-form description of the progression of distributions that arise as $N$ increases from $N=2$ to $N=∞$ ? The first distribution is easily found to be Triangular, but what about the rest? Plots from simulations in MATLAB, with a uniform distribution on the range 0 to 1, provide the following illustrations:

enter image description here

JimB
  • 41,653
  • 3
  • 48
  • 106
user120911
  • 2,655
  • 9
  • 18

3 Answers3

21

Mathematica does make this pretty easy. The statistic of interest is the typical estimator of the median when the sample size is even. When the sample size is odd the sample median has a beta distribution:

OrderDistribution[{UniformDistribution[{0, 1}], n}, (n + 1)/2]
(* BetaDistribution[(1 + n)/2, 1 + 1/2 (-1 - n) + n] *)

Now for the case when $n$ is even. First find the joint distribution of the middle two order statistics. Then find the distribution of the mean of those two statistics.

n = 6;
od = OrderDistribution[{UniformDistribution[{0, 1}], n}, {n/2, n/2 + 1}];

md = TransformedDistribution[(x1 + x2)/2, {x1, x2} \[Distributed] od];

PDF[md, x]

PDF of distribution

Plot[Evaluate[PDF[md, x]], {x, 0, 1}]

Density function

To obtain the distribution for general $n$ when $n$ is even we have to use some other than TransformedDistribution. We need to integrate the joint density function and treat $0<x<1/2$, $x=1/2$, and $1/2<x<1$ separately.

fltOneHalf = 2 Integrate[(x1^(-1 + n/2) (1 - x2)^(-1 + n/2) n!)/((-1 + n/2)!)^2 /. 
  x2 -> 2 x - x1, {x1, 0, x}, Assumptions -> n > 1 && 0 < x < 1/2]
(* -((4 ((1 - 2 x) x)^(n/2) Gamma[n]*
  Hypergeometric2F1[1 - n/2, n/2, (2 + n)/2, x/(-1 + 2 x)])/((-1 + 2 x)*
  Gamma[n/2]^2)) *) 

fOneHalf = 2 Integrate[(x1^(-1 + n/2) (1 - x2)^(-1 + n/2) n!)/((-1 + n/2)!)^2 /.
  x2 -> 1 - x1, {x1, 0, 1/2}, Assumptions -> n > 1]
(* (2^(2 - n) n!)/((-1 + n) ((-1 + n/2)!)^2) *)

(* Because the density is symmetric, we'll take advantage of that *)
fgtOneHalf = FullSimplify[fltOneHalf /. x -> y /. y -> 1 - x]
(* (4 (-1 + (3 - 2 x) x)^(n/2) Gamma[n]*
  Hypergeometric2F1[1 - n/2, n/2, (2 + n)/2, (-1 + x)/(-1 + 2 x)])/((-1 + 2 x) Gamma[n/2]^2) *)

Putting this together in a single function:

pdf[n_, x_] := 
 Piecewise[{{-((4 ((1 - 2 x) x)^(n/2)*Gamma[n] Hypergeometric2F1[1 - n/2, n/2, (2 + n)/2, 
       x/(-1 + 2 x)])/((-1 + 2 x) Gamma[n/2]^2)), 0 < x < 1/2},
   {(2^(2 - n) n!)/((-1 + n) ((-1 + n/2)!)^2), x == 1/2},
   {(4 (-1 + (3 - 2 x) x)^(n/2) * Gamma[n]*
 Hypergeometric2F1[1 - n/2, n/2, (2 + n)/2, (-1 + x)/(-1 + 2 x)])/((-1 + 2 x) Gamma[n/2]^2), 
    1/2 < x < 1}}, 0]
JimB
  • 41,653
  • 3
  • 48
  • 106
  • Wow that is pretty slick. One interesting thing is that Mathematica only seems to like to compute the ones with even n, perhaps because OrderDistribution doesn't like fractional values or perhaps because those are the only ones where the polynomials work out. – b3m2a1 Jun 27 '19 at 03:57
  • @b3m2a1 I'm in the process of adding in the code for odd n and the formulas for general n. Should be able to get to that in about an hour. – JimB Jun 27 '19 at 04:20
  • I'm interested in what the general form looks like... I took a bit of time to try to find a pattern for these even n but still no dice. – b3m2a1 Jun 27 '19 at 04:44
  • If you plot this for n = 1...50 you get quite the distribution: https://i.stack.imgur.com/6t4HQ.png Looks as if the thing will never level out, which suggests that in the limit of n->Infinity this'd go to a delta distribution centered at x=1/2 – b3m2a1 Jun 27 '19 at 07:31
  • That would make quite some T-shirt! – user120911 Jun 27 '19 at 07:51
  • 1
    For general n, you can still get the result with TransformedDistribution: PDF[TransformedDistribution[(x1 + x2)/2, {x1, x2} \[Distributed] OrderDistribution[{UniformDistribution[{0, 1}], 2 n}, {n, n + 1}]], x]. Then just replace n -> n/2. You'll have to wait a bit for the result, though. – Sjoerd Smit Jun 27 '19 at 08:58
  • @SjoerdSmit Thanks. That took 2 minutes on my laptop. (I certainly wasn't patient enough when trying that before. How things have changed since the punch card days!) However, it doesn't deal with x = 1/2 as it gives an Indeterminate in that case. But I'll add that into the answer. Thanks, again! – JimB Jun 27 '19 at 14:32
  • Oh well, a single point on the x axis doesn't matter in a continuous PDF anyway (as long as there's no delta spike there) :P. You can always ask for the CDF instead, which does cover the point x = 1/2. – Sjoerd Smit Jun 27 '19 at 15:15
  • @SjoerdSmit Just saw one more issue with the use of TransformedDistribution for the general case. For 1/2<x<1 there is the term $B_{2-\frac{1}{x}}(-2 n,n+1)$. One obtains ComplexInfinity for this term and therefore ComplexInfinity when 1/2<x<1. So only the density between 0 and 1/2 gets the correct density. I'll have to look into that more closely. – JimB Jun 27 '19 at 16:25
  • @SjoerdSmit For whatever it's worth, I've submitted a question to Wolfram Support about the symbolic approach resulting in the wrong answer for $1/2 < x < 1$. – JimB Jun 28 '19 at 04:44
2

Here's an interesting counter-example for a discrete uniform distribution does not tend to your shape as $N$ grows.

Let your r.v. $x$ be distributed as per a coin toss, taking value ${0,1}$ with equal probability. Then you have three possible outcomes after $N$ coin tosses; $0, 1/2$ , or $1$

The middle outcome's probability is the probability that with $N$ coin tosses you get exactly $N/2$ zeros or ones. This probability is

$$ 2^{-n} \binom{n}{\frac{n}{2}} $$

Which decreases in $N$, resulting in the plot below, with the distribution of probabilities of values with $N$

enter image description here

EDIT

You can see the same effect with the discrete distribution for $x$ expanded to take values $x=\{1, 2, ... , 50\}$ with equal probability. The non-integer values are much less likely, since the odds of the two middle points hitting the boundary is low. The integers values do tend to a Guassian.

middleMean[n_, range_] := Module[{res}, 
     res = Sort@Table[RandomChoice[Range[range]], {n}];
     Mean[{res[[n/2]], res[[n/2 + 1]]}]
  ]

Histogram[Table[middleMean[500, 50], {1000}], 50]

enter image description here

MikeY
  • 7,153
  • 18
  • 27
-2

According to "Median." Wikipedia, The Free Encyclopedia. 11 Jun. 2019. Web. 3 Jul. 2019, the median of a uniformly distributed variable is (b-a)/2. In general, for the Uniform, Normal, and Exponential distributions, Mathematica reports the same medians as the Wikipedia article. I did not try the other distributions for which the Wikipedia article gave medians.

CElliott
  • 560
  • 2
  • 7