I'm processing audio data for voice input from a mic. The data arrives as 32-bit floats in the range [-1 ~ +1].

My first filter is to remove DC:

// x = new input value, y = filtered output value
m_x += ( 0.01 * ( x - m_x ) );
y = ( x - m_x );

When I feed it audio that is close to clipping (but not actually hitting -1 or +1), I'll get values back that actually do go above the [-1 ~ +1] limit - sometimes way above. I find this behavior curious.

Can anyone explain why this happens?

Also, what's the best way to "fix" this? Do a simple clamp for the returned y value? Pre-scale the input down via (x * 0.7071) first?

Thanks!

  • Can you plot the waveform before and after the filter? – endolith May 14 '13 at 19:49
  • Have a look at the behavior of m_x. More smoothing could help, i.e. decrease the current value of 0.01. What is the actual mean of the input signal? – Matt L. May 14 '13 at 19:54
  • The waveform before does show some near-clipping (say, 0.989, but not 1.0). The waveform after has some point above 1.0. I can see certain sections creeping back to 0, so the filter is actually working like it should. – SoftwareSamurai May 14 '13 at 20:00
  • The mean of the input signal appears to hover very near zero. – SoftwareSamurai May 14 '13 at 20:11
  • OK, if the actual mean is close to zero, then $y$ should be very close to $x$. Check to see what m_x is doing. – Matt L. May 14 '13 at 20:18
  • m_x stays within the range [-0.6 ~ +0.6]. – SoftwareSamurai May 14 '13 at 20:26
  • If the mean of $x$ is very close to zero, as you said, how come that m_x has maxima/minima of $\pm 0.6$, which is 60% of the input range? Try more smoothing on m_x, i.e. try a smoothing constant of e.g. 0.001. I guess you initialize m_x with zero, right? – Matt L. May 14 '13 at 20:36
  • The more I reduce the smoothing constant, the more the output matches the input. If I increase it to, say, 0.1, that kills off more of the lower frequencies, but it doesn't stop the occasional spikes outside of [-1 ~ +1]. (Yes, m_x is initialized to 0.) – SoftwareSamurai May 14 '13 at 20:51
  • That's what I meant, apply more smoothing means reducing the smoothing constant, because then the influence of the current input value is decreased (= more smoothing). So I guess the spikes are gone if the smoothing constant is appropriately reduced? – Matt L. May 14 '13 at 21:13

1 Answer

Your DC blocker transfer function is

$H(z) = (1-a)\frac{1-z^{-1}}{1-(1-a)z^{-1}}$

and an alternative (equivalent) difference equation is

$y_n = (1-a)(x_n - x_{n-1}+y_{n-1})$
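For reference, the step from the code in the question to this form can be sketched as follows. Here $m_n$ is notation introduced only for this sketch (the value of m_x after sample $n$), and $a$ is the smoothing constant, 0.01 in the question:

$m_n = (1-a)\,m_{n-1} + a\,x_n$

$y_n = x_n - m_n = (1-a)(x_n - m_{n-1})$

Since $m_{n-1} = x_{n-1} - y_{n-1}$, substituting gives $y_n = (1-a)(x_n - x_{n-1} + y_{n-1})$.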

Although this filter provides no gain at any single frequency, it is possible to find inputs bounded by ±1 whose output exceeds +1. For instance, the input [1 -1 -1 -1 1] does this.
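To check that sequence by hand, here is a minimal sketch of the filter exactly as written in the question, assuming a = 0.01 and m_x starting at 0. The names `DcBlocker` and `run_dc_blocker` are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// The DC blocker from the question; `a` is the smoothing constant (0.01 there).
struct DcBlocker {
    double a;
    double m_x;  // running mean estimate

    explicit DcBlocker(double a_) : a(a_), m_x(0.0) {}

    double process(double x) {
        m_x += a * (x - m_x);
        return x - m_x;
    }
};

// Hypothetical helper: runs `input` through a fresh filter and returns every output sample.
std::vector<double> run_dc_blocker(const std::vector<double>& input, double a = 0.01) {
    DcBlocker filter(a);
    std::vector<double> output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        output.push_back(filter.process(input[i]));
    }
    return output;
}
```

Calling `run_dc_blocker({1, -1, -1, -1, 1})` should give a last sample of roughly 1.0098, even though no input sample is outside [-1, +1].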

Because you seem to work with a floating point realization you should not need to scale down before filtering.

Edit: The output of your filter is bounded this way: $|y_n|<2(1-a)$. So an input gain of 0.5 guarantees that the output of the filter stays within ±1 if the input is bounded between -1 and +1. Can you come up with an input that produces an output value that exceeds +1.5, or an input that produces an output value that is less than -1.5?
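One way to get a feel for how close the output can come to that bound: hold the input at -1 so that m_x drifts toward -1, then jump to +1. A rough sketch, assuming a = 0.01 and m_x starting at 0 (the function name `peak_after_hold` is invented for illustration):

```cpp
// Returns the filter output for a single +1 sample that follows `hold` samples of -1.
double peak_after_hold(int hold, double a = 0.01) {
    double m_x = 0.0;
    for (int i = 0; i < hold; ++i) {
        m_x += a * (-1.0 - m_x);  // m_x drifts toward -1 while the input is held
    }
    m_x += a * (1.0 - m_x);       // the single +1 sample
    return 1.0 - m_x;             // y = x - m_x
}
```

Under these assumptions the result works out to $(1-a)(2-(1-a)^N)$ for a hold of $N$ samples, so it creeps up toward $2(1-a) = 1.98$ as the hold gets longer. A hold of 73 samples should already be enough to pass +1.5, and the mirrored input gives the negative case.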

Edit2: Filters can provide amplification as well as attenuation; otherwise they wouldn't be that interesting. Your filter does not provide amplification in the sense that a sinusoid passing through it will not come out with an increased amplitude. You can see that by plotting the amplitude/frequency response of your filter. However, for more complex inputs the non-linear phase response of your filter can cause 'overshoots' in the output. An allpass filter, for instance, has a completely flat amplitude response and can still produce 'overshoots' for full-scale inputs.
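If plotting is inconvenient, the "no gain for sinusoids" claim can also be checked numerically by evaluating $H(z)$ on the unit circle. A small sketch, assuming a = 0.01; `dc_blocker_gain` is an illustrative name, not a library function:

```cpp
#include <cmath>
#include <complex>

// |H(e^{jw})| for H(z) = (1-a)(1 - z^-1) / (1 - (1-a) z^-1), with w in radians per sample.
double dc_blocker_gain(double w, double a = 0.01) {
    const std::complex<double> zinv = std::polar(1.0, -w);  // z^-1 = e^{-jw}
    const std::complex<double> h =
        (1.0 - a) * (1.0 - zinv) / (1.0 - (1.0 - a) * zinv);
    return std::abs(h);
}
```

The gain is 0 at DC and rises to $2(1-a)/(2-a)$, about 0.995, at Nyquist ($w = \pi$), so the magnitude response never reaches 0 dB. The overshoot comes from how the components of a complex input line up after filtering, not from gain at any one frequency.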

How to deal with the spikes depends very much on what the other modules in your signal path are doing. Maybe some of the modules create headroom, maybe some of them consume headroom; I don't know. Considering your DC blocker in isolation, you can apply a headroom gain of, say, 0.7 or 0.8 and then saturate your output. This setting will likely still saturate occasionally, but my guess is that those saturations will be completely inaudible. You will have to confirm this by experiment.
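A minimal sketch of that headroom-then-saturate idea, for the DC blocker in isolation. The struct name and the default gain of 0.8 are assumptions made for illustration, not tuned values:

```cpp
#include <algorithm>

// Sketch only: DC blocker with a fixed headroom gain on the input and a hard
// saturation on the output. The 0.8 default is a ballpark figure, not a tuned value.
struct HeadroomDcBlocker {
    double a;     // smoothing constant (0.01 in the question)
    double gain;  // headroom gain applied before filtering
    double m_x;   // running mean estimate

    HeadroomDcBlocker(double a_ = 0.01, double gain_ = 0.8)
        : a(a_), gain(gain_), m_x(0.0) {}

    double process(double x) {
        const double xs = gain * x;
        m_x += a * (xs - m_x);
        const double y = xs - m_x;
        // Hard saturation to [-1, +1]; std::clamp would also work from C++17 on.
        return std::max(-1.0, std::min(1.0, y));
    }
};
```

With a gain of 0.8 the unsaturated peak can still approach 0.8 × 1.98 ≈ 1.58, so the clamp is what actually enforces the limit. A gain of 0.5 would make it unnecessary, at the cost of about 6 dB of level.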

niaren
  • I agree that technically I shouldn't need to scale down before filtering, but in practice, it does have spikes that go over the [-1 ~ +1] limits, so I've got to do something about it. – SoftwareSamurai May 15 '13 at 00:47
  • Yes, and scaling down on the input is one way of handling it. There is no best way to handle it. It really depends. Is scaling down on the input not a viable solution in your case? – niaren May 15 '13 at 05:13
  • Yes. Apparently a full rail-to-rail input may cause the returned value to peak above +1. e.g. x ≈ 0.98, and m_x ≈ -0.07 when y peaks above +1. I'm guessing this happens due to very high frequencies. (The sample rate is 48kHz.)

    As for scaling down, I did a test where I multiplied the incoming x by 0.7071, ran it through the filter, and then divided the result by 0.7071. So far I haven't seen any peaks outside the [-1 ~ +1] range. And since I'm working with 32 bit floats, this method doesn't appear to negatively affect the results AFAICT. I'll continue testing with this method.

    – SoftwareSamurai May 15 '13 at 13:46
  • The output can certainly peak above +1. Can you list the shortest input sequence that will provide an output value above +1.5 (given that $y_{-1} = 0$)? I think scaling down by a factor on the input and scaling up by the same factor on the output will give you the same output as no scaling at all, because it's floating point. – niaren May 15 '13 at 13:51
  • You're right, down-scaling then up-scaling doesn't change anything. (Had my mic input levels too low during my last test.) I've created an example .wav file. Left channel is the input to the filter. Right channel is the output from the filter. In the filter, if the returned value went outside the [-1 ~ +1] range, I set it to zero. Look at roughly the 0.009 s mark and you'll see where the filter's output exceeded the limit and was set to zero. Then you can see what the input data looked like at that moment. – SoftwareSamurai May 15 '13 at 15:22
  • To answer your first question: this .wav file is similar to the other one I posted, except that the input was scaled by 0.5 before filtering. At ≈ 0.233 s you can see the right channel (output) dipped below -0.8, which scales to just below -1.6. So it appears that at least this specific filter can indeed produce in excess of ± 1.5 given a ± 1.0 range input. (Albeit just slightly in excess - could be due to floating point rounding perhaps?) – SoftwareSamurai May 15 '13 at 18:30
  • Your filter can create output values arbitrarily close to $\pm$1.98, it has nothing to do with rounding. Can you elaborate on what exactly your problem is? What variable is not allowed to exceed +1 and why is it a problem? Why can't you just use a fixed headroom gain on the input of your filter? – niaren May 15 '13 at 18:56
  • I thought that IIR filters (or any filter I guess) would never cause amplification of the input signal, and I am surprised to discover that isn't true.

    I'm trying to apply a series of filters to improve voice input from a mic. I do not want the filters to amplify the signal, nor do I want to have to arbitrarily reduce the signal just to maintain the ± 1.0 range through each filter.

    I could scale the mic input by 0.5 to help prevent the filters from spiking, but if each filter I apply has the possibility of spiking like this, the overall quality could easily degrade after a few filters.

    – SoftwareSamurai May 15 '13 at 19:09
  • Okay, I accept your "Edit2" explanation as a good answer to my initial question.

    So technically, yes, this filter can (and does) create "overshoots" given certain complex input waveforms, and exactly how to handle it will greatly depend on each specific usage of this filter. (I suspect this will also be true for any IIR, FIR, or BiQuad filter even when their response curves are always below 0 dB.)

    Thanks!

    – SoftwareSamurai May 16 '13 at 13:56