Why Zero Padding in the Center of the DFT Interpolates / Upsamples the Signal (Sinc Interpolation / DFT Interpolation / Periodic Interpolation)

Question

I'm experimenting with the Inverse Discrete Fourier Transform. Starting from the two-cycles continuous $x(t)$ signal below:

I have the discrete signal $x(n) = \{ 1, 0, -1, 0, 1, 0, -1, 0 \}$ leading to the 8 points DFT $X_0(n) = \{ 0, 0, 4, 0, 0, 0, 4, 0 \}$

Now, if I use the IDFT on $X_0$, I obtain $x_0$ looking like that (the blue curve is the real part of the IDFT $Re[x_0(t)]$):

Here $x_0(n) = x(n)$ for integer values ($n = 0, 1, 2, \dots 7$). I understand I do not get back the continuous $x(t)$ function because of aliasing. I would explain that by saying on $x_0(t)$, there is a second signal, above the Nyquist frequency, and "riding" the "carrier"¹

I read about zero padding, so I tried to add $k$ extra zeros in the middle of $X_0(t)$ and now, if I perform the IDFT, I obtain the following results:

1. With 8 extra values

I have $X_{k=8}(t) = \{ 0, 0, 4, 0, \underbrace{\mathbf{0, 0, 0, 0, 0, 0, 0, 0}}_\text{8 extra bins}, 0, 0, 4, 0 \}$, leading to $x_{k=8}(t)$:

2. With 24 extra values

I have $X_{k=24}(t) = \{ 0, 0, 4, 0, \underbrace{\mathbf{0, 0, 0, 0, \dots, 0, 0, 0}}_\text{24 extra bins}, 0, 0, 4, 0 \}$, leading to $x_{k=24}(t)$:

3. The problem

Adding more bins proportionally decreased the amplitude of the rebuild signal and increase the frequency of the signal riding the carrier. I think I understand both phenomenons.

However, I can't find an intuitive way of explaining why, at integer positions (the orange dots), $x_k(n)$ reproduces more and more accurately the original shape of the $x(t)$ signal.

_{¹Do not hesitate to edit the question if I don't use the correct vocabulary here.}

Hi. What is that blue curve in your 2nd plot? How did you compute that curve's values so that you could plot it? Your original x(n) sequence is two cycles of a cosine sequence whose frequency is Fs/2 Hz. Why do you think your original x(n) sequence contained multiple sinusoidal signals. — Richard Lyons, Dec 17 '19 at 03:16
Thanks for the comment @Richard. I failed to mention it in the question but this is strongly inspired from exercise 3.21 in your book "Understanding Digital Signal Processing". The blue curve is the real part of the IDFT. Here is how I understand it: (1) the original sequence $x(n)$ is two cycles of a cosine. (2) Since this is a real signal, that leads to a DFT where $X(0, \dots , N/2-1) = X(N/2, \dots , N-1)$ So each bin is replicated at $n+N/2$. (3) When I build a new $x_0$ signal from that, it's made not of one, but two cosines. The original one and its replica. Or am I wrong? — Sylvain Leroux, Dec 17 '19 at 11:09
Hi Sylvain. Send me a private e-mail to: R.Lyons@ieee.org and I will send you the Solution to my Homework Problem 3.21. After reading my Solution 3.21, if you have any additional questions you can send them to me. — Richard Lyons, Dec 17 '19 at 11:48
@SylvainLeroux, I think a full derivation is given at https://dsp.stackexchange.com/questions/72433. — Mark, Nov 29 '21 at 09:11

Fat32 · Answer 1 · 2021-12-22T12:25:59.250

What you are experiencing is technically called interpolation by DFT; i.e., interpolating a time-domain sequence $x[n]$ by properly zero filling the middle portion of it's DFT $X[k]$ (and taking the inverse DFT to get the time domain interpolated sequence).

Typically, interpolation is described and performed in the time-domain, but equivalently possible in the frequency-domain, as a consequence of Fourier theorems.

Interpolation described in the time-domain:

$$ x[n] \rightarrow ({\uparrow L}) \rightarrow w[n] \rightarrow \boxed{LPF} \rightarrow y[n]$$

Input sequence $x[n]$ is of length $N$, expanded sequence $w[n]$ is of length $M = N \times L$, and the LPF has a Gain = L and discrete-time cutoff frequency of $\omega_c = \pi/L$ radians per sample.

The relation in the frequency-domain is such that if $X[k]$ is the N-point DFT of $x[n]$, then $W[k] = X[k]$ is the $M = L \times N$ point DFT of the sequence $w[n]$. Inded, $W[k]$ is an L-fold copy of $X[k]$. Let the DFT of the LPF be $H[k]$, which is also $M = L \times N$ points.

$$ H[k] = \begin{cases}{ L ~~~,~~~-N/2 \leq k \leq N/2 \\ 0 ~~~~, ~~~~ \text{otherwise} }\end{cases} $$

Then the DFT of the interpolation output $y[n]$ is $$Y[k] = H[k] W[k] $$

After the multiplication $Y[k]$ becomes : $$ Y[k] = \begin{cases}{ L \cdot X[k] ~~~,~~~-N/2 \leq k \leq N/2 \\ ~~~~ ~0 ~~~~~~~~~~, ~~~~ \text{otherwise} }\end{cases} $$

This is what you are implementing in your zero filled DFT of $X[k]$: You try to obtain $Y[k]$ by filling the middle portion of $X[k]$ to make it length $M$. And you can also see why your magnitude is missing by $1/L$; as you did not multiply it by $L$.

The lowpass filter is implicitly implemented while zero filling $X[k]$ into length $M$ to obtain $Y[k]$.

The reduction in the output magnitude, can be explained either due to the increase in the length of the interpolated sequence (hence inverse DFT scaling), or due to the (missing) gain of the implicit interpolation lowpass filter.

To correct the amplitude mismatch, simply multiply the interpolated sequence by the interpolation factor L.

You are exactly correct in your "interpolation" explanation. As for the amplitude loss in Sylvain's 3rd plot, there's no lowpass filter involved. The original analog signal's peak amplitude is P = 1 and Sylvain captured an integer number of cycles with N = 8 samples. The largest computed DFT magnitudes, M, will be M = PN/2 = 18/2 = 4. Thus P = 2M/N. His zero stuffing doubled N, and left M = 4 unchanged, so the IFFT of the N = 16 spectral sequence will have P = 24/16 = 0.5 as a peak amplitude value. — Richard Lyons, Dec 17 '19 at 12:18
@Richard " The largest computed DFT magnitudes, M, will be M = PN/2 = 18/2 = 4. Thus P = 2M/N. His zero stuffing doubled N, and left M = 4 unchanged, so the IFFT of the N = 16 spectral sequence will have P = 24/16 = 0.5 as a peak amplitude value." That's how I understood it. I would be currious to learn more about the(emph. mine) "implicit interpolation lowpass filter" mentioned by Fat32 though. — Sylvain Leroux, Dec 17 '19 at 16:30
@Sylvain. The time-domain interpolation in my Problem 3.21 works well because the original x(n) signal was an integer number of cycles. But for real-world information carrying-signals the freq-domain "zero stuffing and IFFT" method of time-domain interpolation doesn't work very well. In practice, time-domain interpolation is most often performed as described in Chapter 10 of my "Understanding DSP" book. In that Chapter 10 time-domain interpolation processing a lowpass filter is required. — Richard Lyons, Dec 17 '19 at 20:12

Matt L. · Answer 2 · 2019-12-18T08:32:04.403

2

One easy way to understand interpolation in the time domain by zero-padding in the frequency domain is to realize that all interpolated sequences can be derived from sampling a single periodic continuous-time function, defined by the DFT coefficients $X[k]$, which are interpreted as (scaled) Fourier coefficients of that periodic continuous-time function $x_c(t)$. For odd $N$ we have

$$x_c(t)=\frac{1}{N}\sum_{k=-(N-1)/2}^{(N-1)/2}X[k]e^{j2\pi kt/N}\tag{1}$$

and for even $N$ (your example) you get

$$x_c(t)=\frac{1}{N}\sum_{k=-N/2}^{N/2}\tilde{X}[k]e^{j2\pi kt/N}\tag{2}$$

where $\tilde{X}[k]$ is obtained from $X[k]$ by splitting the bin at Nyquist (index $N/2$):

$$\tilde{X}[k]=\big[X[0],\ldots,X[N/2-1],0.5X[N/2],\\0.5X[N/2],X[N/2+1],\ldots,X[N-1]\big]$$

where we assume periodicity with period $N+1$ (due to splitting of the Nyquist bin): $\tilde{X}[k]=X[k+N+1]$, so $\tilde{X}[-N/2]=\tilde{X}[N/2+1]=0.5X[N/2]$.

Note that for real-valued $x[n]$, $x_c(t)$ defined by $(1)$ or $(2)$ is real-valued. Also note that regardless of the interpolation factor, all interpolated discrete-time sequences are samples of $x_c(t)$. So the blue curves in your question do not make much sense, or they at least don't help with understanding what's going on.

For a given length $M$ of the desired interpolated sequence ($M>N$), the interpolated sequence obtained by IDFT from zero-padding in the frequency domain can be written in terms of a sampled version of $x_c(t)$:

$$\hat{x}[m]=x_c\left(\frac{mN}{M}\right)=\frac{M}{N}\textrm{IDFT}_M\{X_{ZP}[k]\}\tag{3}$$

where $X_{ZP}[k]$ is a zero-padded version of $X[k]$ ($N$ odd) or $\tilde{X}[k]$ ($N$ even), respectively. The amplitude scaling in your plots is due to the factor $M/N$ in $(3)$ that you probably forgot to include in your computations.

edited Dec 18 '19 at 08:32

answered Dec 17 '19 at 13:30

Matt L.

89,963
9
79
179

1

I think the interesting part is why to divide the Nyquist element by 2. I don't see the reason in the answer. – Royi Dec 30 '20 at 10:51
@Royi , first you have the inherent periodicity of the DFT. $X[k+N]=X[k] \quad \forall k \in \mathbb{Z}$, $x[n+N]=x[n] \quad \forall n \in \mathbb{Z}$ and for a real signal $x[n]$ (that is $\Im{x[n]} = 0 \quad \forall n \in \mathbb{Z}$ that means that complex conjugate symmetry $X[-k] = X[k]^* \quad \forall k$. Now what are you gonna do with $X[\pm\tfrac{N}2]$? – robert bristow-johnson Dec 30 '20 at 19:24
@Royi , this answer is related to this issue, but it is not a derivation. – robert bristow-johnson Dec 30 '20 at 19:30
@Royi , here is a more complete answer that deals with what to do with Nyquist element. – robert bristow-johnson Dec 30 '20 at 19:38
My remark was about the answer not being full. We all know we need to factor it by $ 0.5 $. We can also show "it works". But the reasoning is missing. By the way, my answer to that is composed of 2 reasons: 1. Solving estimation problem with 2 constraints (Parseval and Conjugate Symmetry) will lead to this. 2. DFT is basically DFS. What's the value of the ideal LPF on the jumping point in DFS analysis? Half the jump, there you go :-). – Royi Dec 30 '20 at 21:05
@robertbristow-johnson, See my comment above. I will lurk for another time someone asks it to answer my reasoning. But all I meant to say is, I have mine. I wonder what's Matt's reasoning. You links don't derive the $ 0.5 $ factor. – Royi Dec 30 '20 at 21:06
1

@Royi: For even $N$, $x_c(t)$ would become complex-valued. One option to solve this is to split the bin at $N/2$ such that we sum over an odd number of elements. There are other options as well. One other possibility would be to just zero pad either to the right or to the left of $X[N/2]$, and then take the real part of the IDFT. – Matt L. Dec 30 '20 at 21:26
@MattL., I am looking for Mathematical reasoning. Derivation. Not what works. Read my comment above, I have my answers (Either the solution of constrained optimization or DFS point of view). I'd be happy for more :-). – Royi Dec 30 '20 at 21:30
1

@Royi: My argument wasn't that "it just works", it was that we require $x_c(t)$ to be real-valued, and one way to achieve that is to split the bin at $N/2$. – Matt L. Dec 30 '20 at 21:34
Another way would be factor it by $ 0.25 $. It will also generate real value signal. The reasoning is both real value (Symmetry) and preserve energy (Parseval). Integrating both as constraints will yield the result. Anyhow, Once we'll have new question about this I will write the full answer which derives it. – Royi Dec 30 '20 at 22:58
1

@Royi: Well, you need the same factors to keep it real-valued, and obviously you don't want to change the original signal before interpolation, otherwise interpolation doesn't really make sense. So you can't just throw away half of the bin, that's why clearly the factors need to add up to 1. You wouldn't deliberately change the other DFT coefficients by scaling them, would you? The factors $1/2$ make $x_c(t)$ real-valued AND don't change the original DFT. These are the two (very simple) reasons why we use them. I agree that I haven't made this explicit in my answer. – Matt L. Dec 31 '20 at 09:39
Of course you scale the other coefficients when doing interpolation. You scale them to match Parseval of higher sampled signal. That's what I'm saying the real math reasoning. Parseval + Symmetry yield the $ 0.5 $ factor. Any of them on itself doesn't. People forget to mention this and only say you need to split. – Royi Dec 31 '20 at 10:05
@Royi: Yes, after zero-padding you scale all coefficients by $M/N$, but you don't scale individual coefficients by some factor, which wouldn't make any sense. And you'd do exactly that if you used $0.25$ or some other factor different from $0.5$ when splitting the bin at $N/2$. – Matt L. Dec 31 '20 at 11:44
@MattL., It seems you're trying to tell me things I know. Mathematically, your reasoning for Symmetry isn't enough. As there are many options to keep symmetry besides splitting in half. I am telling you the right derivation comes from doing constrained optimization with 2 constraints: 1. Keep Conjugate Symmetry (For this element it is just symmetry). 2. Keep Parseval in place. I am adding information and correct derivation to your answer. No need to tell me why. I just gave a feedback it is missing in your answer :-). – Royi Dec 31 '20 at 12:26
1

@Royi: There are two things at play here. I agree that I probably didn't sufficiently motivate in my answer the way the bin at $N/2$ is split. The other thing is that you keep repeating that there's only one (your) way of understanding this, and that all others blindly use some kind of recipe. I tried to show that the problem is actually simple, and that splitting the bin at $N/2$ in any other way than scaled by $1/2$ either destroys real-valuedness, or changes the actual value of that DFT coefficient, which is non-sensical. I don't need constrained optimization to see or show this. – Matt L. Dec 31 '20 at 12:42
What do you mean change? Of course we change it. We also change it by scaling it by $ 0.5 $. The point I made the remark is I know the Math to derive the splitting into half. I was asking you to either give your reasoning or, as you say, give other options (If you say there is more than 1). – Royi Dec 31 '20 at 12:45
@Royi: $0.5+0.5=1$, i.e., no change. I give up. – Matt L. Dec 31 '20 at 12:47
1

@MattL., We're talking about the output values. They are different from the input. All of them are different. So why would 2 of them need to be added? Why not split into 4 and say $ 0.25 + 0.25 + 0.25 + 0.25 = 1 $? This is just not good enough reasoning. The process changes the values of the DFT bins. The answer should derive, formally, why the factor $ 0.5 $ should be used. Derive, not say why it makes sense to have it split (Or at least define how this derivation looks like). – Royi Dec 31 '20 at 13:57
@MattL., I posted a full derivation for the reasoning behind the $ 0.5 $ factor. Have a look at https://dsp.stackexchange.com/questions/72433. – Royi Jan 04 '21 at 15:00
@Royi: Ok, I'll have a look ... – Matt L. Jan 04 '21 at 15:31
@Royi I think what MatL. says is sufficient in his comments. The problem of symmetry (for real output) occurs only at the index N/2 (for N even). And is solved by splitting it into two and multiplying by 0.5. By the way, I may prefer to skip the symmetry and put that sample at $N/2$ at one side, and I have observerd that it can produce better interpolation than symetric case. Furthermore, and more interestingly, in my answer, and Matt's answer, and in your new explanative answer, there's a hidden (silent) problem that's probably unnoticed. Can you see what it's? – Fat32 Jan 04 '21 at 17:06
1

@Fat32, That's the problem. It's not a symmetry to solve. It is the Filter applied which split the values. This is where it comes from, not the other way around. Regarding your question, there are many implicit assumptions: Samples are sampled in a uniform grid, the data is band limited, etc... – Royi Jan 04 '21 at 18:53
@Royi yes you have a point about the filter related symmetry condition, and indeed my question is also related with that filter. In particular, the DFT based interpolation can be considered to be a frequency domain implementation of the time-domain expander-LPfilter operation isn't it? Yet there's problem with the implementation length, and why that problem does not corrupt the ouput result is interesting ;-) – Fat32 Jan 04 '21 at 19:45
1

If you mean expander as in adding zeros between samples, the derivation I wrote doesn't require that. You can interpolate to any grid you'd like. Regarding the length of the filter, could you clarify what you mean? What's the issue with the filter? – Royi Jan 04 '21 at 21:12

Why Zero Padding in the Center of the DFT Interpolates / Upsamples the Signal (Sinc Interpolation / DFT Interpolation / Periodic Interpolation)

1. With 8 extra values

2. With 24 extra values

3. The problem

2 Answers2