Calculate the mean value from point inside a 2D contour

Question

I saw a question about creating contour lines over a smoothdensityhistogram plot. It really help me to create this kind of graphs.

I want to ask if is possible to extract only the points inside a certain contour to obtain, for example, mean values or desviation.

kind regards

What do you mean by "mean values" of the points inside a certain contour. Mean location? Mean height of the density curve? Or are that additional attributes associated with each data point such as percent foliage cover, elevation, tree density, etc., for which you want a mean or standard deviation? — JimB, Oct 10 '23 at 17:10
Yes mean location, or just extract these points in a array to do other calculations. — Renato QE, Oct 12 '23 at 02:30
@ydd 's splitUp list created by using Table and Select shows how you can extract the points either between contours or inside or outside a particular contour. — JimB, Oct 12 '23 at 02:45
Also note that a "mean location" could still be calculated from the bivariate density even if there were no data points within a particular contour. This would be analogous to the mean of a truncated distribution. I would think that after the trouble estimating a bivariate distribution that summary calculations would be done based on the estimated bivariate density. The data points have done their job and it would be relying on the entity that you've estimated for future calculations. — JimB, Oct 12 '23 at 02:51

ydd · Answer 1 · 2023-10-10T22:31:12.693

Using @Ian 's example from your linked question:

Ian's part (except I added explicit SeedRandom)

SeedRandom[123];
two dimensional data) data = 
   RandomVariate[BinormalDistribution[.75], 200];
(nonparametric density estimate)
d = SmoothKernelDistribution[data];
(logged density estimate at data)
p = PDF[d, data];
(quantile for countour height)
q = 1 - {0.68, 0.95, 0.99};
c = Quantile[p, q];

Then create your quantile bounds by padding a 1 and a 0 to either end of q

cPad = PadLeft[c, Length@c + 1, 1];
cPad = PadRight[cPad, Length@cPad + 1, 0];
quantileBounds = Partition[cPad, 2, 1]
(*{{{1, 0.0759093}, {0.0759093, 0.0200241}, {0.0200241, 
  0.00849196}, {0.00849196, 0}}*)

And split up data based on which quantiles their PDF values fall between:

splitUp = 
Table[Select[data, Last[i] < PDF[d, #] <= First[i] &], {i, 
   quantileBounds}];

(If you don't want to calculate the PDF again of all the data points, you could use Position instead but I feel this looks uglier than the Select implementation):

splitUp = 
  Table[Part[data, 
    Flatten@Position[p, n_ /; Last[i] < n <= First[i]]], {i, 
    quantileBounds}];

Comparing mean and variances. As expected, we see the variance grow as you go out to more extreme quantiles:

Mean /@ splitUp
({{0.0116993, -0.147756}, {0.214602, 
  0.222228}, {-0.361543, -0.202958}, {-0.741956, -1.677}})
Variance /@ splitUp
({{0.382458, 0.346718}, {1.4757, 1.59431}, {2.1283, 3.82937}, {8.91892,
   2.63842}})

And plotting different point colors depending on quantiles:

nonEmptyContours = Select[splitUp, Length[#] > 0 &];
lpAll = ListPlot[nonEmptyContours, 
   PlotStyle -> {Black, Green, Orange, Red}];
Show[DensityPlot[PDF[d, {x, y}], ##], 
   ContourPlot[PDF[d, {x, y}], ##, Contours -> c, 
    ContourShading -> None], lpAll] &[{x, -4, 4}, {y, -4, 4}, 
 PlotRange -> All]

Mathematica graphics

Edit: I think you were asking actually how to do this using JimB's answer. The methodology is basically the same:

JimB's part (except adding explicit SeedRandom and set c to contours):

SeedRandom[123];
data = RandomVariate[BinormalDistribution[.75], 100];
(*Calculate a nonparametric density estimate*)
d = SmoothKernelDistribution[data];
(Evaluate the estimated density function over a grid of points and 

sort by the density values from high to low)
pdf = Reverse[
   Sort[Flatten[
     Table[PDF[d, {x, y}], {x, -3, 3, 0.05}, {y, -3, 3, 0.05}]]]];
(Create a table of cumulative pdf values that correspond to the 

volumes of interest)
cdf = Accumulate[pdf]/Total[pdf];
contours = 
  pdf[[Flatten[
     Table[FirstPosition[cdf, 
       p_ /; p >= alpha], {alpha, {0.68, 0.95, 0.99}}]]]];
c = contours;

Then do everything the same as before. We can color the points based on their contours:

cPad = PadLeft[c, Length@c + 1, 1];
cPad = PadRight[cPad, Length@cPad + 1, 0];
quantileBounds = Partition[cPad, 2, 1];
splitUp = 
  Table[Select[data, Last[i] < PDF[d, #] <= First[i] &], {i, 
    quantileBounds}];
sdh = SmoothDensityHistogram[data, PlotRange -> All];
cp = ContourPlot[PDF[d, {x, y}], {x, -4, 4}, {y, -4, 4}, 
   Contours -> contours, ContourShading -> None, PlotRange -> All];
nonEmptyContours = Select[splitUp, Length[#] > 0 &];
coloredByContourPoints = 
  ListPlot[nonEmptyContours, PlotStyle -> {Black, Green, Orange, Red}];
Show[{sdh, cp, coloredByContourPoints}, PlotRange -> All]

Mathematica graphics

This a question for the Graphics folks…is there a way to decide colors based on how “extreme” they seem? The green really seems to stand out here and so I’d want it to be the outermost color. Orange should really be the second color because it doesn’t stand out so much — ydd, Oct 12 '23 at 04:06

Calculate the mean value from point inside a 2D contour

1 Answers1