Discrepancies in 2D cross correlation results using FFT

Question

I'm trying to implement 2D cross-correlation to acquire the displacement of these two small images (interrogation windows). The issue is that the result from Cross-Correlation (CC) is different compared to the FFT approach; therefore, the displacement/fitting would be wrong. The ffwtools package in R uses the C library: Fastest Fourier Transform in the West (FFTW) to do the conversions. The R code and the issue:

# Assume Window1 at t1 is:
Window1 <- structure(c(
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
  0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0
), dim = c(8L, 8L))
Assume Window2 at t2 is:
Window2 <- structure(c(
  0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0,
  1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
  0, 0, 0, 1, 1
), dim = c(8L, 8L))
Getting the cross-correlation using gsignal package
CC_gsignal <- gsignal::xcorr2(Window1, Window2)

Now using FFT approach (i.e. fftwtools in R), I have generally followed the example/function here.

# dimensions of the CC_fft matrix
cc_row <- nrow(Window1) + nrow(Window2) - 1
cc_col <- ncol(Window1) + ncol(Window2) - 1
padding arrays
padded_Window1 <- matrix(0, nrow = cc_row, ncol = cc_col)
padded_Window1[1:nrow(Window1), 1:ncol(Window1)] <- Window1
For padding Window2, I have tried multiple approaches;
option 3 produces the correct values but not the locations
padded_Window2 <- matrix(0, nrow = cc_row, ncol = cc_col)
# Option 1
padded_Window2[1:nrow(Window2),
1:ncol(Window2)] <- Window2
# Option 2
padded_Window2[1:nrow(Window2),
1:ncol(Window2)] <- Window2 [nrow(Window2):1,
ncol(Window2):1]
option 3
padded_Window2[nrow(Window2):cc_row,
               ncol(Window2):cc_col] <- Window2
# Option 4
padded_Window2[nrow(Window2):cc_row,
ncol(Window2):cc_col] <- Window2 [nrow(Window2):1,
ncol(Window2):1]
fft
fftWindow1 <- matrix(fftwtools::fftw2d(padded_Window1), nrow(padded_Window1))
fftWindow2 <- matrix(fftwtools::fftw2d(padded_Window2), nrow(padded_Window2))
CC_fftwtools
CC_fftwtools <- round(
  matrix(
    Re(fftwtools::fftw2d(Conj(fftWindow1) * fftWindow2, inverse = 1)),
    nrow(padded_Window1)
  ) / length(padded_Window1),
  digits = 3
)
plotting
library(plot.matrix)
par(mfrow = c(2, 2))
plot(Window1, col = topo.colors, main = "Window1")
plot(Window2, col = topo.colors, main = "Window2")
plot(CC_gsignal, col = topo.colors, main = "CC_gsignal")
plot(CC_fftwtools, col = topo.colors, main = "CC_fftwtools")

From the two bottom images, you can see the difference between CC from gsginal and fftwtools. I have tried to solve the issue by changing how Window2 is padded (see code above). I have also tried to shift the CC_fftwtools array using gsignal::fftshift without success. One dirty solution would be to flip the CC_fftwtools result vertically & horizontaly. However, I believe I'm doing something wrong, and there is a better explanation for my error. Any hints or tips are much appreciated!

New Info: When exchanging Window1 with Window2, the results of CC exchange are as well. Meaning the left image CC_gsignal will look like the current cc_fftwtools and vice versa.

@OverLordGoldDragon, thanks for the links. I have tried the approaches here and here. My first guess of the issue is the flipping/placements of Window2 in padded_Window2 array. Yes, I'm checking each function's documentation. — ahmathelte, May 03 '23 at 15:14
If you open a mirror question in Python, vs. e.g. scipy.signal.correlate2d, I may take a look, then port it to R if scipy's result is identical, or you could take it from there. — OverLordGoldDragon, May 04 '23 at 17:08
@OverLordGoldDragon, I will add a Python version to the question. After a small discussion with a colleague, it looks like both results are acceptable in the PIV context; the only difference is how you define the displacement indices, k & l, e.g., from -7 to 7 or the opposite. See here — ahmathelte, May 06 '23 at 07:58
That makes the question unfocused and close worthy. It should be asked separately. I don't know what your context is but CC and convolutions have standard output modes for good reasons, so if you're after something else, that should be in the question. You could base it off of this question while using your example. — OverLordGoldDragon, May 06 '23 at 08:40
@OverLordGoldDragon, thanks for your reply. I'm not sure now. Should I modify the question to make it focused, or should it be closed and open a new one? I appreciate any help you can provide :) — ahmathelte, May 06 '23 at 11:13
The original question was fine. Python should be asked separately, so remove it from this one. Once that's answered, I or yourself can answer the one on R. — OverLordGoldDragon, May 06 '23 at 11:28

score 2 · Answer 1 · answered May 08 '23 at 10:34

My solution revises the approach as follows:

Getting the dimension of the Cross-correlation (CC) from the input matrices Window1 and Window2:

#dimensions of the CC matrix   
cc_row <- nrow(Window1) + nrow(Window2) - 1    
cc_col <- ncol(Window1) + ncol(Window2) - 1

Padding zeros and placing Window1 & Window2 in the padded matrices. According to here, no need to flip the second matrix in CC application, only in convolution.

# padding Window1
padded_Window1 <- matrix(0, nrow = cc_row, ncol = cc_col)
padded_Window1[1:nrow(Window1), 1:ncol(Window1)] <- Window1
padding Window2
padded_Window2 <- matrix(0, nrow = cc_row, ncol = cc_col)
padded_Window2[1:nrow(Window2), 1:ncol(Window2)] <- Window2

From Fig. 5. in Willert and Gharib (1991), CC should be calculated using the conjugate of FFT of Window2:

# fft
fftWindow1 <- matrix(fftwtools::fftw2d(padded_Window1), nrow(padded_Window1))
fftWindow2 <- matrix(fftwtools::fftw2d(padded_Window2), nrow(padded_Window2))
CC_notfftshifted
CC_notfftshifted <- round(
matrix(
    Re(fftwtools::fftw2d(fftWindow1 * Conj(fftWindow2), inverse = 1)),
    nrow(padded_Window1)
  ) / length(padded_Window1),
  digits = 3
)

Dividing by the count of matrix elements because the inverse FFT applied in the FFTW library returns a matrix scaled by the array size. See here

FFT shift on rows and columns. See here

# CC_fftshifted<- gsignal::fftshift(CC_notfftshifted, c(1,2))

The plotting below shows the steps explained above.

Good that you figured it out. When padding is involved we no longer need fftshift, just correct padding and unpadding is more performant. — OverLordGoldDragon, May 08 '23 at 10:45

score 1 · Answer 2 · answered May 08 '23 at 10:44

2D cross-correlation, replicating scipy's correlate2d, is covered in this answer. It appears xcorr2 does the same, at least for the given use case.

You have the right idea, but when using FFT methods, any minute detail can make night-day difference. Your approach is fixed as follows:

Remove conj(fft(x)); the Fourier property this is trying to apply works out incorrectly here since x isn't centered - refer to this post
Flip Window2, no conjugation needed since it's real-valued
Option 1 padding is correct with 1 and 2 done

# Assume Window1 at t1 is:
Window1 <- structure(c(
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
  0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0
), dim = c(8L, 8L))
Assume Window2 at t2 is:
Window2 <- structure(c(
  0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0,
  1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
  0, 0, 0, 1, 1
), dim = c(8L, 8L))
Getting the cross-correlation using gsignal package
CC_gsignal <- gsignal::xcorr2(Window1, Window2)
dimensions of the CC_fft matrix
cc_row <- nrow(Window1) + nrow(Window2) - 1
cc_col <- ncol(Window1) + ncol(Window2) - 1
padding arrays
padded_Window1 <- matrix(0, nrow = cc_row, ncol = cc_col)
padded_Window1[1:nrow(Window1), 1:ncol(Window1)] <- Window1
padded_Window2 <- matrix(0, nrow = cc_row, ncol = cc_col)
padded_Window2[1:nrow(Window2), 
               1:ncol(Window2)] <- Window2[nrow(Window2):1, ncol(Window2):1]
fft
fftWindow1 <- matrix(fftwtools::fftw2d(padded_Window1), nrow(padded_Window1))
fftWindow2 <- matrix(fftwtools::fftw2d(padded_Window2), nrow(padded_Window2))
CC_fftwtools
CC_fftwtools <- round(
  matrix(
    Re(fftwtools::fftw2d(fftWindow1 * fftWindow2, inverse = 1)),
    nrow(padded_Window1)
  ) / length(padded_Window1),
  digits = 3
)
plotting
library(plot.matrix)
par(mfrow = c(2, 2))
plot(Window1, col = topo.colors, main = "Window1")
plot(Window2, col = topo.colors, main = "Window2")
plot(CC_gsignal, col = topo.colors, main = "CC_gsignal")
plot(CC_fftwtools, col = topo.colors, main = "CC_fftwtools")

Thanks for your answer! I think both solutions are right. Because the convolution theorem requires flipping the kernel, while in the CC theorem, the complex conjugate is required. Refer to this link — ahmathelte, May 08 '23 at 12:46
Both are correct yes, I'm just noting speed. The fftshift variant is also harder to generalize to 'same' and 'valid' and can't benefit from caching. Anyway +1, good effort. — OverLordGoldDragon, May 08 '23 at 15:07
Thanks, this is a very valid point I did not consider because my application requires the full CC plane. I will try to keep it in mind :-) — ahmathelte, May 08 '23 at 15:18
Conjugation is also much more expensive than flipping and unnecessary for real-valued inputs, but it's needed with your approach of conj(fft). Remember to upvote what you found helpful! — OverLordGoldDragon, May 08 '23 at 16:37
The production will be in Fortran using FFTW; so far I have performed ~22k CC operations in 25 seconds. Would be interesting to see what is the effect of shifting the FFT :-) — ahmathelte, May 08 '23 at 16:49
Conjugation is much slower than I thought, and it's not even due to conjugation per se. I'll edit the other answer to operate in-place. I think flipping has no overhead, just changes array metadata - a (5000, 5000) complex128 array flips in 300 nanoseconds in NumPy. As for Fortran or production-intensive environments, my full answer offers significant optimizations. — OverLordGoldDragon, May 08 '23 at 19:55

Discrepancies in 2D cross correlation results using FFT

Assume Window2 at t2 is:

Getting the cross-correlation using gsignal package

padding arrays

For padding Window2, I have tried multiple approaches;

option 3 produces the correct values but not the locations

# Option 1

padded_Window2[1:nrow(Window2),

1:ncol(Window2)] <- Window2

# Option 2

padded_Window2[1:nrow(Window2),

1:ncol(Window2)] <- Window2 [nrow(Window2):1,

ncol(Window2):1]

option 3

# Option 4

padded_Window2[nrow(Window2):cc_row,

ncol(Window2):cc_col] <- Window2 [nrow(Window2):1,

ncol(Window2):1]

fft

CC_fftwtools

plotting

2 Answers2

padding Window2

CC_notfftshifted

CC_notfftshifted <- round(

Assume Window2 at t2 is:

Getting the cross-correlation using gsignal package

dimensions of the CC_fft matrix

padding arrays

fft

CC_fftwtools

plotting