Pattern matching with PDF, Distributed and multivariate distributions

Question

I have defined a function to compute the Shannon entropy of a probability distribution:

ent[p_] := Expectation[-Log[PDF[p, x]], x \[Distributed] p]

This works fine for univarite distributions:

nd = NormalDistribution[]
ent[nd]

Out[9]= 1/2 (1 + Log[2 \[Pi]])

I would have expected ent to work with multivariate distributions as well, since x \[Distributed] p in ent should cause x to be bound to vectors of the appropriate length to make PDF[p, x] work, but I get

nd2 = ProductDistribution[nd, nd]
ent[nd2] // FullSimplify

Out[10]= Expectation[-Log[
   PDF[ProductDistribution[NormalDistribution[0, 1], 
     NormalDistribution[0, 1]], x]], 
 x \[Distributed] 
  ProductDistribution[NormalDistribution[0, 1], 
   NormalDistribution[0, 1]]]

I can write a specific function for each arity:

ent2[p_] := Expectation[-Log[PDF[p, {x, y}]], {x, y} \[Distributed] p]
ent2[nd2]

Out[11]= 1 + Log[2 \[Pi]]

Is there a way to write ent to be generic wrt to the dimention of the distribution? I would also like to better understand why ent does not work for multivariate distributions as is. The definition seems mathematically correct for any dimention.

You'll need to parse out the distribution, arriving at the number of vars needed for say the PDF, then generate these (say using With and Unique), and inject these. Your expectation this will happen "automagically" with x~p is incorrect. — ciao, Apr 17 '14 at 08:55

score 2 · Accepted Answer · answered Apr 17 '14 at 08:58

2

Per my comment (this is not a fleshed out answer, just an example ):

entX[p_] := With[{vars = Unique[] & /@ Range@Length@p},
  Expectation[-Log[PDF[p, vars]], vars \[Distributed] p]]

entX[nd2]

(* 1+Log[2 π] *)

Note, you'll want to use more sophistacated means to detrmine needed number, Length works here for your example, and is probably OK for some generic combination distributions, but will fail in many cases (e.g., if you feed it just nd from your example, it sees the default values for NormalDistribution and poops two variables instead of one.)

answered Apr 17 '14 at 08:58

ciao

25,774
2
58
139

Combined with the pointer to to StatisticsLibraryDistributionDimensionality give in the answer to http://mathematica.stackexchange.com/questions/46309/determining-the-dimension-of-a-probability-distribution I think this provides a complete solution. I would still like to better understand why the original ent does not work though – Daniel Mahler Apr 17 '14 at 20:49
@DanielMahler: Ooh! Thanks for that link! Even though undocumented, that's going to be useful. – ciao Apr 17 '14 at 21:14

Pattern matching with PDF, Distributed and multivariate distributions

1 Answers1