2

I'm trying to implement 2D cross-correlation to acquire the displacement of these two small images (interrogation windows). The issue is that the result from Cross-Correlation (CC) is different compared to the FFT approach; therefore, the displacement/fitting would be wrong. The ffwtools package in R uses the C library: Fastest Fourier Transform in the West (FFTW) to do the conversions. The R code and the issue:

# Assume Window1 at t1 is:
Window1 <- structure(c(
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
  0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0
), dim = c(8L, 8L))

Assume Window2 at t2 is:

Window2 <- structure(c( 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1 ), dim = c(8L, 8L))

Getting the cross-correlation using gsignal package

CC_gsignal <- gsignal::xcorr2(Window1, Window2)

Now using FFT approach (i.e. fftwtools in R), I have generally followed the example/function here.

# dimensions of the CC_fft matrix
cc_row <- nrow(Window1) + nrow(Window2) - 1
cc_col <- ncol(Window1) + ncol(Window2) - 1

padding arrays

padded_Window1 <- matrix(0, nrow = cc_row, ncol = cc_col) padded_Window1[1:nrow(Window1), 1:ncol(Window1)] <- Window1

For padding Window2, I have tried multiple approaches;

option 3 produces the correct values but not the locations

padded_Window2 <- matrix(0, nrow = cc_row, ncol = cc_col)

# Option 1

padded_Window2[1:nrow(Window2),

1:ncol(Window2)] <- Window2

# Option 2

padded_Window2[1:nrow(Window2),

1:ncol(Window2)] <- Window2 [nrow(Window2):1,

ncol(Window2):1]

option 3

padded_Window2[nrow(Window2):cc_row, ncol(Window2):cc_col] <- Window2

# Option 4

padded_Window2[nrow(Window2):cc_row,

ncol(Window2):cc_col] <- Window2 [nrow(Window2):1,

ncol(Window2):1]

fft

fftWindow1 <- matrix(fftwtools::fftw2d(padded_Window1), nrow(padded_Window1)) fftWindow2 <- matrix(fftwtools::fftw2d(padded_Window2), nrow(padded_Window2))

CC_fftwtools

CC_fftwtools <- round( matrix( Re(fftwtools::fftw2d(Conj(fftWindow1) * fftWindow2, inverse = 1)), nrow(padded_Window1) ) / length(padded_Window1), digits = 3 )

plotting

library(plot.matrix) par(mfrow = c(2, 2)) plot(Window1, col = topo.colors, main = "Window1") plot(Window2, col = topo.colors, main = "Window2") plot(CC_gsignal, col = topo.colors, main = "CC_gsignal") plot(CC_fftwtools, col = topo.colors, main = "CC_fftwtools")

enter image description here

From the two bottom images, you can see the difference between CC from gsginal and fftwtools. I have tried to solve the issue by changing how Window2 is padded (see code above). I have also tried to shift the CC_fftwtools array using gsignal::fftshift without success. One dirty solution would be to flip the CC_fftwtools result vertically & horizontaly. However, I believe I'm doing something wrong, and there is a better explanation for my error. Any hints or tips are much appreciated!

New Info: When exchanging Window1 with Window2, the results of CC exchange are as well. Meaning the left image CC_gsignal will look like the current cc_fftwtools and vice versa.

ahmathelte
  • 141
  • 4
  • Combine this with this. Also mind library conventions. – OverLordGoldDragon May 03 '23 at 14:42
  • @OverLordGoldDragon, thanks for the links. I have tried the approaches here and here. My first guess of the issue is the flipping/placements of Window2 in padded_Window2 array. Yes, I'm checking each function's documentation. – ahmathelte May 03 '23 at 15:14
  • If you open a mirror question in Python, vs. e.g. scipy.signal.correlate2d, I may take a look, then port it to R if scipy's result is identical, or you could take it from there. – OverLordGoldDragon May 04 '23 at 17:08
  • @OverLordGoldDragon, I will add a Python version to the question. After a small discussion with a colleague, it looks like both results are acceptable in the PIV context; the only difference is how you define the displacement indices, k & l, e.g., from -7 to 7 or the opposite. See here – ahmathelte May 06 '23 at 07:58
  • That makes the question unfocused and close worthy. It should be asked separately. I don't know what your context is but CC and convolutions have standard output modes for good reasons, so if you're after something else, that should be in the question. You could base it off of this question while using your example. – OverLordGoldDragon May 06 '23 at 08:40
  • @OverLordGoldDragon, thanks for your reply. I'm not sure now. Should I modify the question to make it focused, or should it be closed and open a new one? I appreciate any help you can provide :) – ahmathelte May 06 '23 at 11:13
  • 1
    The original question was fine. Python should be asked separately, so remove it from this one. Once that's answered, I or yourself can answer the one on R. – OverLordGoldDragon May 06 '23 at 11:28

2 Answers2

2

My solution revises the approach as follows:

  1. Getting the dimension of the Cross-correlation (CC) from the input matrices Window1 and Window2:
#dimensions of the CC matrix   
cc_row <- nrow(Window1) + nrow(Window2) - 1    
cc_col <- ncol(Window1) + ncol(Window2) - 1   
  1. Padding zeros and placing Window1 & Window2 in the padded matrices. According to here, no need to flip the second matrix in CC application, only in convolution.
# padding Window1
padded_Window1 <- matrix(0, nrow = cc_row, ncol = cc_col)
padded_Window1[1:nrow(Window1), 1:ncol(Window1)] <- Window1

padding Window2

padded_Window2 <- matrix(0, nrow = cc_row, ncol = cc_col) padded_Window2[1:nrow(Window2), 1:ncol(Window2)] <- Window2

  1. From Fig. 5. in Willert and Gharib (1991), CC should be calculated using the conjugate of FFT of Window2:
# fft
fftWindow1 <- matrix(fftwtools::fftw2d(padded_Window1), nrow(padded_Window1))
fftWindow2 <- matrix(fftwtools::fftw2d(padded_Window2), nrow(padded_Window2))

CC_notfftshifted

CC_notfftshifted <- round(

matrix( Re(fftwtools::fftw2d(fftWindow1 * Conj(fftWindow2), inverse = 1)), nrow(padded_Window1) ) / length(padded_Window1), digits = 3 )

Dividing by the count of matrix elements because the inverse FFT applied in the FFTW library returns a matrix scaled by the array size. See here

  1. FFT shift on rows and columns. See here
# CC_fftshifted<- gsignal::fftshift(CC_notfftshifted, c(1,2))

The plotting below shows the steps explained above. Results

ahmathelte
  • 141
  • 4
  • 1
    Good that you figured it out. When padding is involved we no longer need fftshift, just correct padding and unpadding is more performant. – OverLordGoldDragon May 08 '23 at 10:45
1

2D cross-correlation, replicating scipy's correlate2d, is covered in this answer. It appears xcorr2 does the same, at least for the given use case.

You have the right idea, but when using FFT methods, any minute detail can make night-day difference. Your approach is fixed as follows:

  1. Remove conj(fft(x)); the Fourier property this is trying to apply works out incorrectly here since x isn't centered - refer to this post
  2. Flip Window2, no conjugation needed since it's real-valued
  3. Option 1 padding is correct with 1 and 2 done

enter image description here

# Assume Window1 at t1 is:
Window1 <- structure(c(
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0,
  0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0
), dim = c(8L, 8L))

Assume Window2 at t2 is:

Window2 <- structure(c( 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1 ), dim = c(8L, 8L))

Getting the cross-correlation using gsignal package

CC_gsignal <- gsignal::xcorr2(Window1, Window2)

dimensions of the CC_fft matrix

cc_row <- nrow(Window1) + nrow(Window2) - 1 cc_col <- ncol(Window1) + ncol(Window2) - 1

padding arrays

padded_Window1 <- matrix(0, nrow = cc_row, ncol = cc_col) padded_Window1[1:nrow(Window1), 1:ncol(Window1)] <- Window1 padded_Window2 <- matrix(0, nrow = cc_row, ncol = cc_col) padded_Window2[1:nrow(Window2), 1:ncol(Window2)] <- Window2[nrow(Window2):1, ncol(Window2):1]

fft

fftWindow1 <- matrix(fftwtools::fftw2d(padded_Window1), nrow(padded_Window1)) fftWindow2 <- matrix(fftwtools::fftw2d(padded_Window2), nrow(padded_Window2))

CC_fftwtools

CC_fftwtools <- round( matrix( Re(fftwtools::fftw2d(fftWindow1 * fftWindow2, inverse = 1)), nrow(padded_Window1) ) / length(padded_Window1), digits = 3 )

plotting

library(plot.matrix) par(mfrow = c(2, 2)) plot(Window1, col = topo.colors, main = "Window1") plot(Window2, col = topo.colors, main = "Window2") plot(CC_gsignal, col = topo.colors, main = "CC_gsignal") plot(CC_fftwtools, col = topo.colors, main = "CC_fftwtools")

OverLordGoldDragon
  • 8,912
  • 5
  • 23
  • 74
  • Thanks for your answer! I think both solutions are right. Because the convolution theorem requires flipping the kernel, while in the CC theorem, the complex conjugate is required. Refer to this link – ahmathelte May 08 '23 at 12:46
  • Both are correct yes, I'm just noting speed. The fftshift variant is also harder to generalize to 'same' and 'valid' and can't benefit from caching. Anyway +1, good effort. – OverLordGoldDragon May 08 '23 at 15:07
  • Thanks, this is a very valid point I did not consider because my application requires the full CC plane. I will try to keep it in mind :-) – ahmathelte May 08 '23 at 15:18
  • Conjugation is also much more expensive than flipping and unnecessary for real-valued inputs, but it's needed with your approach of conj(fft). Remember to upvote what you found helpful! – OverLordGoldDragon May 08 '23 at 16:37
  • The production will be in Fortran using FFTW; so far I have performed ~22k CC operations in 25 seconds. Would be interesting to see what is the effect of shifting the FFT :-) – ahmathelte May 08 '23 at 16:49
  • 1
    Conjugation is much slower than I thought, and it's not even due to conjugation per se. I'll edit the other answer to operate in-place. I think flipping has no overhead, just changes array metadata - a (5000, 5000) complex128 array flips in 300 nanoseconds in NumPy. As for Fortran or production-intensive environments, my full answer offers significant optimizations. – OverLordGoldDragon May 08 '23 at 19:55