
I am trying to specify a bivariate probability density function in Mathematica. As a check, I would like to confirm that it integrates to one. Here is the function:

f[x1_, x2_, u1_, u2_, v11_, v22_, v12_] := 
 Det[2*Pi*{{v11, v12}, {v12, v22}}]^(-0.5) *
  (x1*x2*(1 - x1 - x2))^(-1) *
  Exp[-0.5*(Log[{x1, x2}/(1 - x1 - x2)] - {u1, u2}).Inverse[{{v11, v12}, {v12, v22}}].(Log[{x1, x2}/(1 - x1 - x2)] - {u1, u2})]

Here, $X_1$ and $X_2$ are the random variables, where $X_1 > 0$, $X_2 > 0$ and $X_1 + X_2 < 1$, and $U_1 \in \mathbb{R}$, $U_2 \in \mathbb{R}$, $V_{11}>0$, $V_{22}>0$ and $V_{12}\in \mathbb{R}$ are parameters.
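
Written out, the density coded above is

$$f(x_1, x_2) = \frac{1}{\sqrt{\det(2\pi V)}\, x_1 x_2 (1 - x_1 - x_2)} \exp\!\left(-\tfrac{1}{2} (y - u)^\top V^{-1} (y - u)\right), \qquad y = \left(\log\frac{x_1}{1 - x_1 - x_2},\ \log\frac{x_2}{1 - x_1 - x_2}\right),$$

where $u = (U_1, U_2)$ and $V$ is the symmetric matrix with entries $V_{11}$, $V_{12}$, $V_{22}$.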

Using NIntegrate to integrate over the $(X_1,X_2)$ space with non-zero density, with $U_1=U_2=V_{12}=0$ and $V_{11}=V_{22}=1$, I get:

NIntegrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}]
(* 0.364031 *)

The answer should be 1 (not 0.364 as above) for any choice of the parameters. I have specified the same function in the R language with only syntax changes. The function in R gives exactly the same values as in Mathematica when supplied with the same arguments (confirming that it is not a programming error). However, I get an integral of 1 in R using adaptive quadrature (i.e. a different answer to Mathematica!).

Any suggestions?

Miguel

  • If all you want is to specify a PDF, consider ProbabilityDistribution[] and its Method -> "Normalize" setting. – J. M.'s missing motivation Jul 06 '16 at 13:40
  • 1
    @J.M. That's not what I'm interested in. I want to know why NIntegrate is giving me an answer that is incorrect, both in theory and when checked against another package. – Miguel Jul 06 '16 at 13:44
  • 2
    This seems to be a failure of the default integration method, then. Try adding Method -> "UnitCubeRescaling". Also, it is advisable to use 1/2 instead of 0.5 in your PDF. – J. M.'s missing motivation Jul 06 '16 at 13:58
  • Hiding the symbolic form using _?NumericQ causes it to return 1. – Szabolcs Jul 06 '16 at 14:03
  • 1
    @Miguel You could report this problem to Wolfram Support. – Szabolcs Jul 06 '16 at 14:04
  • @J.M. Wouldn't MultidimensionalRule provide a more flexible solution? (both work) – Feyre Jul 06 '16 at 14:15
  • @Feyre, it could, but that's a "rule" and not a "strategy"; that is, you're not actually doing adaptive cubature when using a rule, so it is possible to miss features in the integrand. But you can certainly use an integration rule if you know what you're doing, as that evaluates much faster. – J. M.'s missing motivation Jul 06 '16 at 14:17
  • Thanks for your suggestions! It seems to be a problem with the symbolic pre-processing that is fixed with either _?NumericQ or Method -> "UnitCubeRescaling". – Miguel Jul 06 '16 at 14:23
  • What I couldn't figure out though is how to explicitly turn off symbolic preprocessing, e.g. Method -> {Automatic, "SymbolicProcessing" -> False} causes it to fail in other ways. – Szabolcs Jul 06 '16 at 14:34
  • @Szabolcs, apparently the default method evaluates at the corners of the triangle when "SymbolicProcessing" -> 0; Method -> {"DoubleExponential", "SymbolicProcessing" -> 0} works. – J. M.'s missing motivation Jul 06 '16 at 14:42
  • NIntegrate seems to be using "LevinRule" but I don't know why. And I don't know why it fails so badly. Pretty much all the suggestions that work are just one way or another to prevent "LevinRule" from being applied. – Michael E2 Jul 07 '16 at 04:05
  • Interesting: N@ Integrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}] yields 0.5000000002523569`. I also get 0.5 if I Rationalize[] the integrand and use N[integral, prec]. Integrate[] factors out the -1/(2 Pi), and for whatever reason, that changes the value returned by NIntegrate[] when N is applied. Seems the integral is a pathological case for the Levin Rule (without the setting MinRecursion higher as Anton showed). – Michael E2 Jul 07 '16 at 15:02
  • @Michael, David Levin designed his method specifically for oscillatories, so no surprise that it does poorly on something manifestly nonoscillatory. From your analysis, I guess the internal method-choosing code could do much better, since it has chosen poorly here. – J. M.'s missing motivation Jul 08 '16 at 04:09
  • @J.M. Yes, that is my thinking, too. I don't understand the internals of the Levin rule very well, nor why the rule fails by default but succeeds with MinRecursion -> 1. After all, it can work on nonoscillatory integrals, although, as you say, it's a poor choice and shouldn't be expected to work well. (To a certain extent I can see what it did wrong. Something tricks it into computing small integrals & errors on subregions.) – Michael E2 Jul 08 '16 at 18:33
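
For reference, here is how the workarounds suggested in the comments above look when written out (a sketch; fN is a hypothetical numeric-only wrapper whose sole purpose is to hide the symbolic form of f from NIntegrate's preprocessor):

(* evaluate only for explicitly numeric arguments *)
fN[x1_?NumericQ, x2_?NumericQ] := f[x1, x2, 0, 0, 1, 1, 0];
NIntegrate[fN[x1, x2], {x1, 0, 1}, {x2, 0, 1 - x1}]

(* rescale the triangular region to the unit square before integrating *)
NIntegrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}, 
 Method -> "UnitCubeRescaling"]

(* double-exponential quadrature with symbolic preprocessing turned off *)
NIntegrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}, 
 Method -> {"DoubleExponential", "SymbolicProcessing" -> 0}]

Each of these returns a value close to 1, per the comments above.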

2 Answers


The first rule applications sample the integrand at too few points. Forcing more subdivision, say with MinRecursion, produces the expected result:

NIntegrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}, 
 MinRecursion -> 1]
(* 1. *)

An alternative is to use Cartesian rules:

NIntegrate[
 f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}, 
 Method -> {"GlobalAdaptive", Method -> "ClenshawCurtisRule"}]
(* 1. *)

Changing the integration strategy to the simpler "Trapezoidal" or "MonteCarlo" also gives results close to 1.
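
For instance, reusing the settings from the sampling-points comparison below:

NIntegrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}, 
 Method -> "Trapezoidal", MaxRecursion -> 20]

NIntegrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}, 
 Method -> "MonteCarlo"]

(The Monte Carlo result is, of course, only accurate to a few digits.)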

Here are some sampling points plots for comparison:

Needs["Integration`NIntegrateUtilities`"] (* provides NIntegrateSamplingPoints *)

Grid[Partition[
  Append[NIntegrateSamplingPoints[
      NIntegrate[
       f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1}, 
       Evaluate[Sequence @@ #]]], {PlotLabel -> #[[1]], 
      ImageSize -> 400}] & /@ {{Method -> Automatic}, {MinRecursion ->
       1}, {MinRecursion -> 
      2}, {Method -> {"GlobalAdaptive", 
       Method -> "ClenshawCurtisRule"}}, {Method -> "Trapezoidal", 
     MaxRecursion -> 20}, {Method -> "MonteCarlo"}}, 3], 
 Dividers -> All, Alignment -> {Right, Left}]

[Image: sampling-points plots for the six method settings above]

Compare the plots above with the following plots of the integrand:

Grid[{{Plot3D[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1}, 
    MaxRecursion -> 4(*,Mesh\[Rule]All*), ImageSize -> 350], 
   DensityPlot[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1}, 
    MaxRecursion -> 4, Mesh -> All, ImageSize -> 350]}}]

[Image: 3D plot and density plot of the integrand]

Anton Antonov

Aside from getting around the apparent weakness in the "LevinRule"* as others have suggested, here is another way to verify the total probability is 1, namely, by changing variables.

{transformation} = 
 Solve[{u1, u2} == {Log[x1/(1 - x1 - x2)], Log[x2/(1 - x1 - x2)]}, {x1, x2}, Reals]
(*
  {{x1 -> E^u1/(1 + E^u1 + E^u2), 
    x2 -> -E^-u1 (-E^u1 + E^u1/(1 + E^u1 + E^u2) + E^(2 u1)/(1 + E^u1 + E^u2))}}
*)

jacobian = Det@D[{x1, x2} /. transformation, {{u1, u2}}] // Simplify
(*  E^(u1 + u2)/(1 + E^u1 + E^u2)^3  *)

(* new limits of integration *)
Reduce[{u1, u2} == {Log[x1/(1 - x1 - x2)], Log[x2/(1 - x1 - x2)]} &&
  0 < x1 < 1 && 0 < x2 < 1 - x1, {u1, u2}, {x1, x2}]
(*  (u1 | u2) ∈ Reals  *)

Integrate[
 f[x1, x2, 0, 0, 1, 1, 0] * jacobian /. transformation,
 {u1, -∞, ∞}, {u2, -∞, ∞}]    (* implied by (u1 | u2) ∈ Reals *)
(*  1  *)
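
The same change of variables also works numerically; the transformed integrand is just the standard bivariate normal density here, which NIntegrate handles without trouble:

NIntegrate[f[x1, x2, 0, 0, 1, 1, 0]*jacobian /. transformation,
 {u1, -∞, ∞}, {u2, -∞, ∞}]
(*  1.  *)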

*That "LevinRule" and "MultidimensionalRule" are the automatically chosen methods for the OP's integral can be seen using the approach presented in Determining which rule NIntegrate selects automatically, or by inspecting the calls to the integration rules, collected via

{calls} = Last@Reap@NIntegrate[f[x1, x2, 0, 0, 1, 1, 0], {x1, 0, 1}, {x2, 0, 1 - x1},
 IntegrationMonitor :> (Sow[#1] &)]

The output may be viewed in this image. I would hesitate to call it a bug if it is an edge case that is hard to detect; numerical routines often have limitations. There are two issues: why "LevinRule" is chosen at all, and why "LevinRule" gives the wrong approximations 0.36, or 0.5 if 1/(2 Pi) is factored out of the integral sign, as noted here.

Further remarks on the choice of "LevinRule", and a strange workaround. NIntegrate seems to conclude that the integral is oscillatory, for the following reason. The domain of integration is first mapped to the unit square, which transforms the integrand into

Exp[1/2 (-Log[x1/(1 - x1 - (1 - x1) x2)]^2 - 
    Log[((1 - x1) x2)/(1 - x1 - (1 - x1) x2)]^2)] /
 (2 π x1 x2 (1 - x1 - (1 - x1) x2))

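One way to reproduce this form (a sketch: substitute x2 -> (1 - x1) x2 to map the triangle onto the unit square, and multiply by the Jacobian (1 - x1) of that substitution; because f was defined with the exponent -0.5, the output carries approximate-number coefficients, but it is algebraically the same):

Simplify[(1 - x1)*(f[x1, x2, 0, 0, 1, 1, 0] /. x2 -> (1 - x1) x2)]
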
Whether the integrand is treated as oscillatory depends on whether the argument of Exp stays real, which in turn depends on whether the arguments of the logarithms

x1/(1 - x1 - (1 - x1) x2), ((1 - x1) x2)/(1 - x1 - (1 - x1) x2)

can be negative over the domain 0 <= x1 <= 1, 0 <= x2 <= 1. This is checked by substituting Interval[{0, 1.}] for each of x1 and x2. Interval arithmetic is only guaranteed to compute an interval that contains the exact range, and here the computed intervals contain negative numbers:

{x1/(1 - x1 - (1 - x1) x2), ((1 - x1) x2)/(1 - x1 - (1 - x1) x2)} /. 
 Thread[{x1, x2} -> Interval[{0, 1.}]]
(*  {Interval[{-∞, ∞}], Interval[{-∞, ∞}]}  *)

However, the arguments are nonnegative:

Minimize[{#, 0 <= x1 <= 1 && 0 <= x2 <= 1}, {x1, x2}] & /@
 {x1/(1 - x1 - (1 - x1) x2), ((1 - x1) x2)/(1 - x1 - (1 - x1) x2)}
(*  {{0, {x1 -> 0, x2 -> 1/2}}, {0, {x2 -> 0, x1 -> 0}}}  *)

Before we conclude it is a bug, consider what's going on in the denominator of the arguments:

{1 - x1 - (1 - x1) x2, 1 - x1 - (1 - x1) x2 // Factor} /. 
 Thread[{x1, x2} -> Interval[{0, 1}]]
(*  {Interval[{-1, 1}], Interval[{0, 1}]}  *)

The first evaluates as 1 + Interval[{-1, 0}] + Interval[{-1, 0}]; each summand's interval is exactly right, but the two intervals are treated as independent quantities, which they originally are not, and so the sum comes out as Interval[{-1, 1}]. The factored expression yields a sharper result.
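
This is the classic dependency problem of interval arithmetic, visible in a single line (the two intervals below stand for the same quantity, yet are subtracted as if they were independent):

Interval[{0, 1}] - Interval[{0, 1}]
(*  Interval[{-1, 1}]  *)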

It gets more complicated with approximate machine reals such as 1.: when computing with them, Interval widens the endpoints by an epsilon, so that 1 - Interval[{0, 1.}] equals Interval[{-4.44089*10^-16, 1}]. This has been discussed a little here and remarked on here.

I'll leave off the analysis here, except to note that it suggested the following as a potential fix:

f2 = 1/Expand[1/f[x1, x2, 0, 0, 1, 1, 0]];

f2 == f[x1, x2, 0, 0, 1, 1, 0] // Simplify  (* check algebraic equivalence *)
(*  True  *)

NIntegrate[f2, {x1, 0, 1}, {x2, 0, 1 - x1}]
(*  1.  *)
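
Presumably the expanded form changes the intermediate expressions enough that the interval-based oscillation test above no longer misfires, so a different rule gets selected; I have not traced the mechanism in detail.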
Michael E2
  • Note that the OP's f[x1, x2, y1, y2, v11, v22, v12] is equivalent to the PDF[] of xfdist = TransformedDistribution[{x1, x2} /. transformation, {u1, u2} \[Distributed] MultinormalDistribution[{y1, y2}, {{v11, v12}, {v12, v22}}]] (sketched after these comments). – Michael E2 Jul 07 '16 at 22:28
  • 1
    Very good analysis! – Anton Antonov Jul 08 '16 at 00:27
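
The equivalence noted in the first comment above can be sketched as follows; whether Simplify confirms it directly may depend on supplying domain and positive-definiteness assumptions, and the 0.5 exponents in f may need to be replaced by 1/2 (as J.M. advised) for an exact comparison:

xfdist = TransformedDistribution[{x1, x2} /. transformation, 
   {u1, u2} \[Distributed] MultinormalDistribution[{y1, y2}, {{v11, v12}, {v12, v22}}]];
Simplify[PDF[xfdist, {x1, x2}] == f[x1, x2, y1, y2, v11, v22, v12], 
 0 < x1 && 0 < x2 && x1 + x2 < 1 && v11 > 0 && v22 > 0 && v11 v22 - v12^2 > 0]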