27

What's the right way to generate a random probability vector $p={p_1,\ldots,p_n} \in {(0,1)}^n$ where $\sum_i p_i=1$, uniformly distributed over the $(n-1)$-dimensional simplex?

What I have is

Intervals = Table[{0, 1}, {i, n}]
RandomPoint := Block[{a},
    a = RandomVariate[UniformDistribution[Intervals]];
    a/Total[a]];

But I am unsure that this is correct. In particular, I'm unsure that it's any different from:

RandomPoint := Block[{a},
    a = Table[Random[], {i, n}];
    a/Total[a]];

And the latter clearly will not distribute vectors uniformly. Is the first code the right one?

Michael E2
  • 235,386
  • 17
  • 334
  • 747
Schiphol
  • 375
  • 3
  • 8

3 Answers3

26

#/Total[#,{2}]&@Log@RandomReal[{0,1},{m,n}] will give you a sample of m points from a uniform distribution over an n-1-dimensional regular simplex. (An equilateral triangle is a 2-dimensional regular simplex.) Here's what m = 2000, n = 3 should look like, where {x,y} = {p[[2]]-p[[1]], Sqrt@3*p[[3]]} are the barycentric coordinates of the 3-element probability vector p:

enter image description here

Here's what you get if you omit the Log@ and normalize Uniform(0,1) variables, which is what both of the OP's examples do:

enter image description here

Ray Koopman
  • 3,306
  • 14
  • 13
  • Thanks a lot. Could you please explain in what respects does this behave differently from RandomVariate[UniformDistribution[]]? – Schiphol Oct 09 '13 at 08:10
  • See for yourself. Try it with n = 2 and make a histogram of p[[1]]. Or use n = 3 and ListPlot the barycentric coordinates: {x,y} = {p[[2]]-p[[1]],Sqrt@3*p[[3]]}. – Ray Koopman Oct 09 '13 at 18:41
  • Yes, the difference is clear -- see my answer below. Actually, I meant for you to explain the difference in algorithmic terms, or perhaps provide pointers to a textbook explanation of why your method is doing what it's doing. – Schiphol Oct 11 '13 at 10:39
  • I generate a Dirichlet distribution in which all the concentration parameters are 1. See the link that Jacob provided, then scroll down to this section and remember that the log of a Uniform(0,1) variable is proportional to a Gamma variable with shape parameter 1. – Ray Koopman Oct 11 '13 at 13:55
  • 5
    You can also use Mathematica's built-in DirichletDistribution: points = RandomVariate[DirichletDistribution[{1, 1, 1}], 2000] /. v_?VectorQ :> {v[[2]] - v[[1]], Sqrt[3] (1 - Total[v])}; and then ListPlot[points]. – chuy Oct 11 '13 at 18:28
  • Amazing! Does anyone have a prove that this distribution is uniform on the simplex? It passes every stat sampling test I tried, but I still can't prove it's right. I posted this question on the Math page too. https://math.stackexchange.com/questions/4235192/prove-this-method-generates-uniform-points-on-the-simplex – Jerry Guern Aug 28 '21 at 17:22
11

Starting in M10.2, you can just use RandomPoint:

pts=RandomPoint[Simplex[{{0,0,1},{0,1,0},{1,0,0}}], 1000];
Graphics3D[Point[pts]]

enter image description here

Carl Woll
  • 130,679
  • 6
  • 243
  • 355
10

Old question, but I didn't see this method. Generates $n$ points uniformly randomly distributed on a simplex embedded in $d$ dimensions.

    genSimplex[n_, d_] := 
       Table[Differences[Sort[Flatten[{0, RandomReal[1, d – 1], 1}]]], {n}];

The algorithm generates points that are randomly distributed on an outer face of a simplex. The way to generate them is, for a d-dimensional problem…

  1. Generate d-1 uniformly distributed random values in the range [0,1]
  2. Add a 0 and a 1 to the list
  3. Sort the list
  4. Extract a list of the differences between the elements

You now have a list of random values that sum to 1 (so they are on a plane that is defined by points that sum to one) and that are otherwise independent of each other, so their dispersion is uniform.

Updating the answer with a picture of example data with 1,000 points.

enter image description here

This topic is well-covered here... https://stackoverflow.com/questions/3010837/sample-uniformly-at-random-from-an-n-dimensional-unit-simplex

MikeY
  • 7,153
  • 18
  • 27
  • 1
    This is more or less what I came up with on my own, but then I found this Question page while looking for a more efficient way! – Jerry Guern Aug 14 '21 at 06:13