8

My situation is as follows. I have a whole bunch of data that I need to fit, which tends to look like this picture: enter image description here

As can be seen, there's three dips (resonances) and a whole bunch of noise around amplitude 1. Now, in order to fit this, it will be much easier if I only fit the resonances themselves; this means a whole lot less points will have to be evaluated. So what I'm trying to do is cut away the parts that I don't care about.

The first thing to do of course is find the resonances. I think this should be doable with FindPeaks (although it does seem as if this literally only works for peaks, so I suppose I'd have to use 1 - the data points). The problem is here that I'm not sure how to go from there. What I do is the following:

I use FindPeaks for 1-Abs[s2]] (Abs[s2]] is the data), with gaussian blurring 0, minimum sharpness 0, and minimum peak height equal to X times the standard deviation of the dataset. (Or, well it's better to just take a piece from the beginning of the data and calculate the standard deviation here, but that's a detail). In the example here, if I use, say, X = 10, then it does find all three resonances.

However, it doesn't just find a single point, it finds a whole bunch of them around each actual resonance (they are all peaks to the function). So here's my first question; how do I adjust this to just pick out the actual lowest point for each of these resonances, instead of a whole bunch of points around each resonance? The function returns a list of x and y coordinates, so maybe I should somehow find some local minima's here again, but I don't see how really.

The next part of my question brings us closer to the title of the post, and that is cutting away the other data. So say that I've now found my three resonances; what I want to do is cut away all data around each one of them, leaving only N points to the left and right. What this N is I already know, but this depends on the specifics of the experiment from which I took the data, so that's not too important. Perhaps one could use the Table command? Say the first resonance is at x = 7.6, which is the 100th element of the vector. Then I could say that the new list is now Table[{f1[[n]],s2[[n]]},{n,100-N,100+N}]. This would find it for the first resonance, and then maybe I can append this table with the lists from the other two resonances? But perhaps there is a much smarter way of doing this, in which case I'd be very happy to hear so.

user129412
  • 1,339
  • 8
  • 19

2 Answers2

10

Define dataset that looks like yours:

fr = 15;
interval = {7.4, 8.2};
nPoints = 10000;
deltaX = (interval[[2]] - interval[[1]])/nPoints;
blur = 0.01;
p1 = {0.2, 7.8, 0.002};
p2 = {0.1, 7.65, 0.001};
p4 = {0.11, 7.653, 0.001};
p3 = {0.15, 8.0, 0.002};
f[t_, p_List] := p[[1]] Exp[-(t - p[[2]])^2/(2 p[[3]]^2 )];

s[t_] := 1 - (0.002 Sin[2 Pi fr t] + f[t, p1] + f[t, p2] + f[t, p3] + 
     f[t, p4]) + RandomReal[0.01];
xx = Range[interval[[1]], interval[[2]], deltaX];
yy = s /@ xx;

p2 and p4 give a really close peaks. Depending on your blur parameter they can be counted as one or two.

fData = GaussianFilter[yy, blur/deltaX];
peaks = FindPeaks[-fData, blur/deltaX, 0, -0.97];
points = Point@{xx[[#[[1]]]], yy[[#[[1]]]]} & /@ peaks;

Edit: It should be peaks = FindPeaks[-yy, blur/deltaX, 0, -0.97]; so we don't blur twice.

for blur = 0.01 we will have following:

ListPlot[Transpose[{xx, #}] & /@ {yy, fData}, 
 PlotRange -> {Full, Full}, Joined -> {False, True}, 
 Epilog -> {Red, PointSize[Large], points}]
ListPlot[Transpose[{xx, #}] & /@ {yy, fData}, 
 PlotRange -> {{7.6, 7.7}, Full}, Joined -> {False, True}, 
 Epilog -> {Red, PointSize[Large], points}]

enter image description here

Zoom in on coupled peak.

enter image description here

And for blur=0.0005; we will get additional peak. So you control your resolution. enter image description here enter image description here

And for completeness, since you asked for data in vicinity of peak here is code that gives you 101 points around your peak:

 peakSeries = (Transpose[{xx, #}] &@
               yy)[[# - 50 ;; # + 50]] & /@ (#[[1]] & /@ peaks);
    ListPlot[{Transpose[{xx, yy}]}~Join~peakSeries, PlotRange -> Full]

enter image description here

BlacKow
  • 6,428
  • 18
  • 32
  • Awesome! The splitting of peaks might turn out to be very useful in datasets that I don't have yet, as I expect two degenerate resonances to split a little. I figured I'd tackle that when I actually saw it, but I guess I don't have to. One question; why apply a Gaussian filter if this is already what FindPeaks does? Isn't it doing the same thing twice? – user129412 Jun 12 '15 at 19:06
  • Added a piece of code to show vicinity of peaks in different colors – BlacKow Jun 12 '15 at 19:10
  • @user129412 I applied Gaussian filter so I can draw the plot of smoothed curve. It's not used in peak detection routine. – BlacKow Jun 12 '15 at 19:12
  • Well, you do actually use it in peaks. But substituting yy there works well. – user129412 Jun 12 '15 at 19:22
  • @user129412 You are correct! I'm using it in peaks and let blurring to be zero in FindPeaks – BlacKow Jun 12 '15 at 19:23
  • Hm, I might just misunderstand the documentation, but you write 'peaks = FindPeaks[-fData, blur/deltaX, 0, -0.97];'. The way I see that is that it again uses blur/deltaX for blurring (which might do nothing? I'm not sure, it seems to me like it applies the filter a second time), then requires a minimum sharpness of 0, and a minimum height of -0.97. How are you letting the blurring be zero here? – user129412 Jun 12 '15 at 19:27
  • 1
    @user129412 My bad again.. the sharpness is 0, blur is the same. So you really need to use it one place, should be FindPeaks[-yy, blur/deltaX, 0, -0.97]; – BlacKow Jun 12 '15 at 19:30
  • I'll probably end up opening a new question for this if some thinking doesn't lead me to the answer, but it turns out I'll also need to determine the full width half maximum of the peaks. I'm pretty sure this code will still be a good starting point though. – user129412 Jun 14 '15 at 21:18
6

A completely manual method (requires peak separation).

Generate a signal:

interval = {7, 8};
sigma = 1/100;    
peaks = RandomReal[interval, 3];

baseSignal[t_] := 1/20 Sin[100 t] + 1
noisyS[t_, sigma_] := baseSignal[t] + RandomVariate[NormalDistribution[0, sigma]]
peakFun[interval_, peaks_, x_] := -sigma/Sqrt@ Pi 
          Tr[PDF[NormalDistribution[#, -Subtract @@ interval/300], x] & /@ peaks]
fullFun[interval_, peaks_, sigma_, t_] := noisyS[t, sigma] + peakFun[interval, peaks, t]
signal = Table[{t, fullFun[interval, peaks, sigma, t]}, 
         Evaluate[{t, Sequence @@ #, -Subtract @@ #/10000} &@interval]];
signalValues = signal[[All, 2]];
ListPlot[signal, AxesOrigin -> {7, 0}, MaxPlotPoints -> 5000]

Mathematica graphics

Now the mins:

pps = Select[signal, Abs[#[[2]]-Mean@signalValues] > 3 StandardDeviation@signalValues &];
clus = pps[[#]] & /@ FindClusters[pps[[All, 1]] -> Range@Length@pps, 
                                  Method -> {"Agglomerate"}];
mins = First@SortBy[#, Last] & /@ clus

 ListPlot[signal, AxesOrigin -> {7, 0}, MaxPlotPoints -> 5000, 
          Epilog -> {Red, PointSize[Large], Point@mins}]

Mathematica graphics

Dr. belisarius
  • 115,881
  • 13
  • 203
  • 453
  • Interesting… So you rely on FindClusters for peak separation. How do you control peak resolution? OP uses GaussianFilter (at least I think FindPeaks uses it) to set blur parameter to specify resolution. – BlacKow Jun 12 '15 at 18:00
  • I'm still trying to understand what each part does, after which I'll get back to you. But it looks promising, as the peaks will in general be quite separated. Specifically, I'm having some trouble understanding how the definition of mins works, and why we need -> Range@Length@pps in clus. – user129412 Jun 12 '15 at 18:03
  • Oh, and I was also wondering, in the current way it finds the actual coordinates of the peaks. Is there an easy way to figure out which element of the original vector this is? Like, number 100, or 172, etcetera. – user129412 Jun 12 '15 at 18:22
  • @user129412 Position[signal,#]&/@mins – Dr. belisarius Jun 12 '15 at 18:30
  • The pps[[All, 1]] ->Range@Length@pps form is the way to force FindClusters to return the indices in the original list instead of the values. I need that because I'm clustering by the first coordinate only and in the following step I need the ordered pairs for the mins – Dr. belisarius Jun 12 '15 at 18:34
  • @belisarius I see, that makes sense. As for the Position thing, that does indeed work! I continuously run into new types of functions and stuff doing these things, which is nice. One day I won't have to ask these simple things. Interesting is that the documentation on Position lists that the output is different than the one used in Part; it however doesn't say how one converts it to that form.. Kind of annoying; they'd be more useful as actual numbers instead of this double angular bracket type. – user129412 Jun 12 '15 at 18:43
  • @user129412 That is because Position[ ] works with Extract[ ] – Dr. belisarius Jun 12 '15 at 18:47
  • @belisarius Anyway… +1 for showing FindClusters – BlacKow Jun 12 '15 at 18:49
  • 1
    @BlacKow I haven't seen your first comment. Sorry. Peak resolution is not the strongest point of this method. But the OP confirmed that it isn't a problem in his current setup. I don't have v10, so I can't test FindPeaks[ ] – Dr. belisarius Jun 12 '15 at 19:26