27

How can I compute and plot the spectrogram of a signal/time series/data in Mathematica? I have a WAV file, sampled at 44100 samples/second and I want to generate a spectrogram of that data. Something like this:

http://matplotlib.sourceforge.net/_images/specgram_demo.png

rm -rf
  • 88,781
  • 21
  • 293
  • 472
Eiyrioü von Kauyf
  • 1,735
  • 1
  • 16
  • 20

4 Answers4

26

Get a sample sound:

snd = ExampleData[{"Sound", "SopranoSaxophone"}];

This gives us a Sound data structure with a SampledSoundList as first element. Extracting the data from it:

sndData = snd[[1, 1]];
sndSampleRate = snd[[1, 2]];

Plotting the data:

ListPlot[sndData, DataRange -> {0, Length[sndData]/sndSampleRate }, 
 ImageSize -> 600, Frame -> True, 
 FrameLabel -> {"Time (s)", "Amplitude", "", ""}, 
 BaseStyle -> {FontFamily -> "Arial", FontWeight -> Bold, 14}]

Mathematica graphics

Find the lowest amplitude level (used as reference for dB calculations):

min = Min[Abs[Fourier[sndData]]];

A spectrogram is made by making a DFT of partitions of the sample. The partitions usually have some overlap.

partSize = 2500;
offset = 250;
spectroGramData = 
  Take[20*Log10[Abs[Fourier[#]]/min], {2, partSize/2 // Floor}] & /@ 
   Partition[sndData, partSize, offset];

Note that I skip the first element of the DFT. This is the mean level. I also show only half of the frequency data. Because of the finite sampling only half of the returned coefficient list contains useful frequency information (up to the Nyquist frequency).

MatrixPlot[
  Reverse[spectroGramData\[Transpose]], 
  ColorFunction -> "Rainbow", 
  DataRange -> 
    Round[
     {{0, Length[sndData]/sndSampleRate }, 
     {sndSampleRate/partSize, sndSampleRate/2 }}, 
     0.1
    ], 
  AspectRatio -> 1/2,  ImageSize -> 800, 
  Frame -> True, FrameLabel -> {"Frequency (Hz)", "Time (s)", "", ""}, 
  BaseStyle -> {FontFamily -> "Arial", FontWeight -> Bold, 12}
]

Mathematica graphics

A 3D spectrogram (note the different offset value):

partSize = 2500;
offset = 2500;
spectroGramData = 
  Take[20*Log10[Abs[Fourier[#]]/min], {2, partSize/2 // Floor}] & /@ 
   Partition[sndData, partSize, offset];

ListPlot3D[spectroGramData\[Transpose], ColorFunction -> "Rainbow", 
 DataRange -> 
  Round[{{0, Length[sndData]/sndSampleRate }, {sndSampleRate/partSize,
      sndSampleRate/2}}, 0.1], ImageSize -> 800, 
 BaseStyle -> {FontFamily -> "Arial", FontWeight -> Bold, 12}]

Mathematica graphics

Anjan Kumar
  • 4,979
  • 1
  • 15
  • 28
Sjoerd C. de Vries
  • 65,815
  • 14
  • 188
  • 323
  • Brilliant! Thank you. – Eiyrioü von Kauyf Apr 07 '12 at 21:27
  • how about a 3d surface spectrogram? http://en.wikipedia.org/wiki/File:Spectrogram.png additionally, what is [Transpose] – Eiyrioü von Kauyf Apr 07 '12 at 21:29
  • \[Transpose] is a FullForm version of the transpose symbol that does the same as the function Transpose. You'll find all about Transpose in Mathematica's electronic manual. – Sjoerd C. de Vries Apr 07 '12 at 21:40
  • Ok. Thanks, I haven't seen a function in brackets before so I was uncertain where to look, as things such as @ are hard to find. Addtionally, what is ~Join~ ? I saw it on an answer somewhere. – Eiyrioü von Kauyf Apr 07 '12 at 21:42
  • @EiyrioüvonKauyf It's really not that hard. Just select the /@ and hit F1, it'll take you to the Map page, for which /@ is the infix form shorthand. ~ can be used to convert a function taking two arguments to an infix function: Func[arg1,arg2] ==> arg1~Func~arg2. More in many places in the manual, e.g., here. By the way, it's better to ask these kinds of things in a comment directly below the answer/question that provoked your question. – Sjoerd C. de Vries Apr 07 '12 at 22:16
  • What does the offset mean? Does it mean the step size of the moving window? For example, if I have a list of 8 numbers and the bin size is 4, offset=2 means every time I advance the 4-number wide window, 2 numbers ahead. Is my understanding correct? Thanks :) – xslittlegrass Aug 12 '14 at 14:21
  • @xslittlegrass Right. BTW version 9 has more tools for making these kinds of plots (see Spectrogram and Periodigram). – Sjoerd C. de Vries Aug 12 '14 at 15:08
22

The following is a more detailed version of a Spectrogram implementation in Mathematica and is part of a larger package that I'm writing in my spare time. I've tried to keep the functionality and output structure similar to MATLAB's (but not entirely). The code is provided at the end of this answer, and I'll jump to the examples now:

First load the package and some example data:

<<SignalProcessing`
{data, fs} = {#[[1, 1, 1]], #[[1, 2]]} &@ ExampleData[{"Sound", "Apollo11SmallStep"}];

Spectrogram takes several options such as:

  • WindowFunction — Hamming, Hanning and rectangular (default) windows are pre-defined and you can extend it to other windows or supply one of your own.
  • WindowLength — Samples in each segment (default is Floor[Length[data]])
  • OverlapLength — Samples overlapping with the previous segment (default 50%)
  • FFTLength (default: Max[256, 2^pow] where pow is the next power of 2 larger than your segment length) and SamplingFrequency are self explanatory

A sample call to Spectrogram with the above data would be:

{s, f, t} = Spectrogram[data, WindowFunction -> Hamming, WindowLength -> 200, 
   SamplingFrequency -> fs];

You can then plot this like in Sjoerd's example (or you can implement a default plot style within Spectrogram) to get something like:

enter image description here

A couple of things that are on my to-do list are to implement the Goertzel algorithm so that you can easily supply a list of frequencies at which the spectrogram needs to be calculated and different plotting schemes for 2D (something like the plot above as default) and waterfall and similar for 3D plots. I'll update this post when I get around to doing that.


Spectrogram code

BeginPackage["SignalProcessing`"];
Unprotect[Spectrogram,Hamming,Hanning,Rectangular];

Begin["Private"]; (* Input checks *) DoubleQ[data_List] := And @@ Function[x, MachineNumberQ[x], Listable][data];

CheckInput::sparse = "The input data must not be sparse!"; CheckInput::nodbl = "The input data must be machine precision!"; CheckInput[x_] := Which[ Head[x] === SparseArray, Message[CheckInput::sparse], ! DoubleQ[x], Message[CheckInput::nodbl], True, True ]

(* Window functions *) Rectangular = UnitStep[Range[#]] &; Hanning = 0.5 (1 - Cos[2 Pi Range[0, #-1]/(#-1)]) &; Hamming = 0.54 - 0.46 Cos[2 Pi Range[0, #-1]/(#-1)] &;

(* Spectrogram code ) Spectrogram[data_List, opts : OptionsPattern[{ WindowFunction -> Rectangular, WindowLength :> Floor[Length[data]/8], OverlapLength -> {}, FFTLength -> {}, SamplingFrequency -> 1}]] := Module[{ x = Developer`ToPackedArray[data], len = Length[data], win, seg, olap, nfft, fs, spectra, time, freq}, ( parameters ) seg = OptionValue[WindowLength]; olap = If[# === {}, Floor[seg/2], #] &@OptionValue[OverlapLength]; nfft = If[# === {}, Max[2^Ceiling[Log2[seg]], 256], #] &@ OptionValue[FFTLength]; win = OptionValue[WindowFunction]; fs = OptionValue[SamplingFrequency]; ( output ) time = Range[0, len]/fs; freq = fs Range[0, nfft/2]/nfft ; spectra = Take[Abs[Fourier[PadRight[#win[seg], nfft], FourierParameters -> {1, -1}]] & /@ Partition[x, seg, seg - olap, {1, -1}] // Transpose, Floor[nfft/2] + 1];

    {spectra, freq, time}
] /; CheckInput[data]

End[];

Protect[Spectrogram,Hamming,Hanning,Rectangular]; EndPackage[];

rm -rf
  • 88,781
  • 21
  • 293
  • 472
  • 2
    You have the frequency start at 0 Hz, but that can't be true. The lowest frequency is determined by the length of your window. – Sjoerd C. de Vries Apr 08 '12 at 06:52
  • @SjoerdC.deVries The first DFT bin (and the frequencies it spans) contributes to the "mean". You chose to discard it, I didn't. AFAIK it's not customary to throw it out when plotting. – rm -rf Apr 08 '12 at 15:01
  • This is not about the first bin per se. It's about making your data range fit correctly. If your window length is short the lowest assignable frequency is not negligible and the frequency values for all the bins except the last one experience a shift wrt the values you assign them. The second bin the most of course. – Sjoerd C. de Vries Apr 08 '12 at 15:51
  • BTW there's no need to provide Fourier with a data list whose length is a power of 2. DFT implementations are usually faster when this is the case (I haven't tested this on MMA), but speed doesn't seem to be an issue here. – Sjoerd C. de Vries Apr 08 '12 at 16:04
  • @Sjoerd That's a labeling convention — depends on whether you choose to place the tick mark at the bin center or at the ends. Bin center is also very common and in that case, it would start at half the bin width. I'm aware that Fourier doesn't need a list that's a power of 2, but you'd be hard pressed to find a signal processing application/paper that doesn't use a power of 2 — probably relics from the past, but is convention. Perhaps I could add another option that uses the actual length instead of padding. – rm -rf Apr 08 '12 at 16:10
  • 2
    I'm not talking about bin centres. If you're examining your sound data in pieces of 1000 samples at a time and with a sample rate of 20 kHz the lowest representable frequency will be 20 Hz. That's what should be the lowest part of the plot (or the second lowest in your case). If you let it start at 0 every step in your spectrogram will be slightly mislabeled. – Sjoerd C. de Vries Apr 08 '12 at 16:26
  • 1
    @Sjoerd Aha! I now get what you're saying. The frequencies in my output f are correct (2nd bin is about 21.5ish), but since I set the range from Min[f] to Max[f], it stretches the labeling from the left end of the first bin (if you consider it a rectangle) to the right end of the last bin instead of the left end of the last bin. So while I don't think that starting from 20 (or whatever it is) will fix this (it'll have the same subtle misalignment), I agree that it needs to be fixed. I hadn't gotten around to implementing a custom plot of my own and just quickly used yours :) – rm -rf Apr 08 '12 at 17:41
  • How do you know things like "sndSampleRate = snd[[1, 2]];"? WAV have 10 different possible codecs, all with different conventions. How do you figure out imaginary components? – Tyler Durden Mar 11 '16 at 17:17
  • 1
    @TylerDurden This is not a library to support every codec on earth. I was using example data which has a specific format. – rm -rf Mar 11 '16 at 17:37
14

Version 9 has introduced several signal processing functions and Spectrogram is now a built-in function. You can simply do:

{data, fs} = {#[[1, 1, 1]], #[[1, 2]]} &@ExampleData[{"Sound", "Apollo11SmallStep"}];
Spectrogram[data, SampleRate -> fs, ColorFunction -> "Rainbow", 
    FrameLabel -> {"Frequency (Hz)", "Time (s)"}]

enter image description here

rm -rf
  • 88,781
  • 21
  • 293
  • 472
  • 1
    Thank you for this answer. I think you inverted "Frequency" and "Time" in the FrameLabel (the capture is correct). – anderstood Feb 27 '16 at 18:29
5

Mathematica can plot spectrograms using the Spectrogram function or the SpectrogramArray function, or the Fourier function -- it is very general and powerful. Here are some examples of audio spectrograms from the Wolfram demonstration site and here is a discussion on Mathematica stackexchange of the process of taking spectrograms of audio. To use mp3s you will need to convert them to .wav files, which can be done using a program like Audacity or Quicktime (or many others). If your final goal is to plot spectrograms, Audacity (or other free software) might be more appropriate. What you would gain in Mathematica is the ability to control the analysis in complete detail -- but this would not be automatic or trivial, it would take some knowledge of how the spectral functions work.

bill s
  • 68,936
  • 4
  • 101
  • 191