I have two vectors: one contains sampled energies e(i), the other the corresponding weight w(i) for each energy.

I bin the energies e(i). Now I want to compute the sum of weights w in each bin separately.

I can bin the energies using BinLists and make a plot with Histogram. How do I perform the sum over weights in each bin? I would obviously need access to the indices i in each bin.

Any elegant solution to this problem would be welcome.

e = {0.129454, 0.160294, 0.140456, 0.152359, 0.174006, 0.186969};
w = {2.12373*10^-6, 0.00029488, 5.68648*10^-6, 0.000100244, 2.28126*10^-6, 6.74131*10^-6}; 
Kuba
user3584513
  • e = {0.129454, 0.160294, 0.140456, 0.152359, 0.174006, 0.186969}; w = {2.12373*10^-6, 0.00029488, 5.68648*10^-6, 0.000100244, 2.28126*10^-6, 6.74131*10^-6} – user3584513 Jun 26 '14 at 20:51

3 Answers

Very similar to @Sjoerd's, but relaxing the uniqueness assumption:

SeedRandom[42];
n = 20;
bins = {1, 30, 50, 80, 100};
e = RandomInteger[{1, 100}, n];
w = RandomInteger[{1, 3}, n];
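(* total weight per distinct energy: {energy, summed weight} *)
k = {#[[1, 1]], Tr[#[[All, 2]]]} & /@ GatherBy[Transpose[{e, w}], First];
(* replace each distinct energy in a bin by its summed weight, then add up per bin *)
wTotal = Tr /@ (Union /@ BinLists[e, {bins}] /. Rule @@@ k)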
(*
  {9, 12, 5, 10}
*)

Show[Histogram[e, {bins}], 
     ListLinePlot[Transpose[{MovingAverage[bins, 2], wTotal}]]]

[Figure: histogram of e over the given bins, with the per-bin weight totals wTotal overlaid as a line plot]

Dr. belisarius

A combination of WeightedData and HistogramDistribution:

@belisarius's example:

 SeedRandom[42];
 n = 20;
 e = RandomInteger[{1, 100}, n];
 w = RandomInteger[{1, 3}, n];
 wd = WeightedData[e, w];
 binlimits = {1, 30, 50, 80, 100};
 dist = HistogramDistribution[wd, {binlimits}];
 Total[w] Normalize[dist["PDFValues"], Total]
 (* {9,12,5,10} *)

OP and Sjoerd's example:

 e2 = {0.129454, 0.160294, 0.140456, 0.152359, 0.174006, 0.186969};
 w2 = {2.12373*10^-6, 0.00029488, 5.68648*10^-6, 0.000100244, 2.28126*10^-6, 6.74131*10^-6};
 wd2 = WeightedData[e2, w2];
 dist2 = HistogramDistribution[wd2, {0.12, .22, 0.03}];
 Total[w2] Normalize[dist2["PDFValues"], Total]
 (* {7.81021*10^-6, 0.000397405, 6.74131*10^-6} *)

Update: a function to get the total weights of bins:

 bwF = Total[#2] Normalize[HistogramDistribution[WeightedData[#1, #2], #3]["PDFValues"], 
            Total] &

 bwF[e, w, {binlimits}]
 (* {9, 12, 5, 10} *)
 bwF[e2, w2,  {0.12, .22, 0.03}]
 (*  {7.81021*10^-6, 0.000397405, 6.74131*10^-6} *)
kglr
  • Nice! How could I have forgotten WeightedData? – Sjoerd C. de Vries Jun 27 '14 at 08:03
  • Same here. +1, of course – Dr. belisarius Jun 27 '14 at 11:48
  • @Sjoerd & belisarius thank you both. – kglr Jun 27 '14 at 11:57
  • What does 'PDFValues' in Normalize[dist2["PDFValues"], Total] do? Cannot find anything on this in the documentation on Normalize. Where can I find anything on this PDFValues? – user3584513 Jun 28 '14 at 11:07
  • @user3584513, dist["Properties"] gives the available properties of the object dist, and dist["PDFValues"] gives the density for each of the bins. For the first example, dist["PDFValues"] returns {9/851,12/851,5/851,10/851}; that is, the density of the bin [1,30] is 9/851, etc. This value is the probability of the interval [1,30] divided by the bin length (i.e. Probability[1<=x<=30, x\[Distributed] dist]/(30-1)), and similarly for the other bins. Normalize[xx, Total] is xx/Total[xx], which transforms the PDFValues to proportions that add up to 1. ... – kglr Jun 28 '14 at 11:30
  • ... Then we multiply these proportions by the total weight of all observations (Total[w]) to get the weight for each bin. Hope this answers your question(s). – kglr Jun 28 '14 at 11:30
  • @kugler: many thanks for your explanations. I love that function bwF. It surely works, even though I do not fully understand yet why. – user3584513 Jun 28 '14 at 15:41
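
To make the arithmetic in the explanation above concrete, here is a small sketch that starts from the density values quoted in the comment for the first example. The names pdfValues, proportions and totalWeight are only illustrative, and totalWeight = 36 is an assumption inferred from the reported bin totals {9, 12, 5, 10} rather than computed from w.

```mathematica
(* densities quoted in the comment for the first example *)
pdfValues = {9/851, 12/851, 5/851, 10/851};

(* Normalize[v, Total] is v/Total[v]: rescale so the entries sum to 1 *)
proportions = Normalize[pdfValues, Total]
(* {1/4, 1/3, 5/36, 5/18} *)

(* assumed total weight, inferred from 9 + 12 + 5 + 10 *)
totalWeight = 36;
totalWeight proportions
(* {9, 12, 5, 10} *)
```

The common denominator 851 cancels in the normalization, which is why only the relative sizes of the quoted values matter for the final bin weights.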
e = {0.129454, 0.160294, 0.140456, 0.152359, 0.174006, 0.186969};
w = {2.12373*10^-6, 0.00029488, 5.68648*10^-6, 0.000100244, 2.28126*10^-6, 6.74131*10^-6};

Total /@ (BinLists[e, {0.12, .22, 0.03}] /. Thread[e -> w])
(* {7.81021*10^-6, 0.00039740526, 6.74131*10^-6} *)

So, what happens here? BinLists divides e into bins. I then replace every e value with the corresponding weight. Thread makes sure that the rule -> is applied element-wise. One assumption is used here: the e values should be unique or, more precisely, the weights of identical elements in e should be identical.
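
The caveat about repeated e values can be seen in a small made-up example. The data e3, w3 and the single bin {0, 3, 3} are invented for illustration: two samples share the energy 1 but carry different weights. The last expression is one possible workaround, not part of the approach above: it selects weights by position with Pick instead of replacing values.

```mathematica
(* made-up data: the energy 1 occurs twice with different weights *)
e3 = {1, 1, 2};
w3 = {10, 20, 5};

(* replacement approach: both 1s pick up the first matching rule, 1 -> 10 *)
Total /@ (BinLists[e3, {0, 3, 3}] /. Thread[e3 -> w3])
(* {25}, although the weights sum to 35 *)

(* position-based sketch: keep the weights whose energy lies in [0, 3) *)
Total[Pick[w3, (0 <= # < 3 &) /@ e3]]
(* 35 *)
```

For several bins, the same Pick expression could be evaluated once per pair of bin edges, at the cost of scanning e once for every bin.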

Sjoerd C. de Vries
  • Thanks to all of you! This really helped. I only wished that Mathematica had a special command for this operation. It does not seem to be so rare! – user3584513 Jun 27 '14 at 12:42
  • What does 'PDFValues' in Normalize[dist2["PDFValues"], Total] do? Cannot find anything on this in the documentation on Normalize. Where can I find anything on this PDFValues? – user3584513 Jun 28 '14 at 11:06