
I hate this kind of question, but I'm really stuck. I'm trying to implement the Lucas-Kanade algorithm as described in this paper (see pages 4 and 5). Unlike most explanations I've seen, the authors don't assume plain optical flow; instead they use an affine transformation for most examples and in general refer to the warp function simply as $W(x, p)$, where $x$ is the point and $p = (p_1, p_2, p_3, p_4, p_5, p_6)^T$ is the parameter vector of the affine transformation.

At the beginning of page 4 the authors outline their version of the Lucas-Kanade algorithm. I'm stuck at steps (4) and (5), namely evaluating the Jacobian $\frac{\partial W}{\partial p}$ and computing the steepest descent images $\nabla I\frac{\partial W}{\partial p}$.

(As far as I understand, $\nabla I$ is just a pair of matrices $(\frac{\partial I}{\partial x}, \frac{\partial I}{\partial y})$ in this context. Please correct me if I'm wrong.)

The authors carefully describe the affine warp and even provide a formula for its Jacobian (Equation 8):

$$\frac{\partial W}{\partial p} = \pmatrix{x & 0 & y & 0 & 1 & 0 \\ 0 & x & 0 & y & 0 & 1}$$

However, the Jacobian of the warp is (not surprisingly) defined only for a single pixel, not for the entire image. But in step (5) we multiply the image gradient by this Jacobian, $\nabla I \frac{\partial W}{\partial p}$, and as far as I can see from the context (see, for example, Figure 2 on page 5), this is done for the whole image and not per pixel.

My question is, how should I interpret this multiplication and what are the real sizes/formats of $\nabla I$ and $\frac{\partial W}{\partial p}$?

I understand that this question may require reading a lot of that paper, so I will be glad to explain any point that is not clear enough. Also feel free to refine the question title or contents if you have an idea for better wording.

Glorfindel
ffriend

1 Answer


You're right that all the quantities are computed for a single pixel, so in the product

$$\nabla I\cdot \frac{\partial W}{\partial p}\cdot\Delta p\tag{1}$$

the sizes of the vectors and matrices are

$$\nabla I\quad\;\;\; 1\times 2\\ \frac{\partial W}{\partial p}\quad 2\times n\\ \Delta p\quad\;\;\;\; n\times 1$$

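where $n$ is the number of parameters ($n = 6$ for the affine warp). So the expression in (1) is a scalar, computed separately at each pixel. Collecting the $1\times n$ product $\nabla I\frac{\partial W}{\partial p}$ from every pixel gives $n$ images, one per parameter; these are the steepest descent images. The total error measure given by Equation (6) in the paper sums, over all pixels, the square of the linearized residual, in which this scalar is the term that depends on $\Delta p$.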
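To make the per-pixel sizes concrete, here is a minimal NumPy sketch for the affine case. It is not the paper's code: the function names `affine_jacobian` and `steepest_descent_images` are made up for illustration, the toy image is random, and the gradient arrays `Ix` and `Iy` are assumed to be already sampled on the template's pixel grid (in the full algorithm they would be the gradient of $I$ warped with the current $p$).

```python
import numpy as np

def affine_jacobian(x, y):
    """2x6 Jacobian dW/dp of the affine warp at one pixel (x, y), as in Equation 8."""
    return np.array([[x, 0, y, 0, 1, 0],
                     [0, x, 0, y, 0, 1]], dtype=float)

def steepest_descent_images(Ix, Iy):
    """Stack the per-pixel 1x6 products grad(I) * dW/dp into an (H, W, 6) array."""
    h, w = Ix.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)  # pixel coordinates of every position
    # [Ix, Iy] times the 2x6 Jacobian, written out column by column
    return np.stack([Ix * xs, Iy * xs, Ix * ys, Iy * ys, Ix, Iy], axis=-1)

# Toy data (assumption: a random 4x5 "image" stands in for the warped input)
rng = np.random.default_rng(0)
I = rng.random((4, 5))
Iy, Ix = np.gradient(I)  # np.gradient returns the axis-0 (row, y) derivative first

sd = steepest_descent_images(Ix, Iy)  # six steepest descent images, shape (4, 5, 6)

# The vectorized result agrees with the explicit 1x2 times 2x6 product at one pixel
x, y = 3, 2
grad = np.array([[Ix[y, x], Iy[y, x]]])   # 1x2
per_pixel = grad @ affine_jacobian(x, y)  # 1x6
assert np.allclose(per_pixel[0], sd[y, x])

# Multiplying by a 6-vector dp gives one scalar per pixel, i.e. expression (1)
dp = rng.random(6)
scalar_map = sd @ dp  # shape (4, 5)
```

Each slice `sd[:, :, k]` is one image of the same size as the input, which is presumably what Figure 2 of the paper displays: nothing is multiplied "for the whole image" beyond repeating the same small product at every pixel.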

Matt L.
  • Thanks! In addition, I found their source code for that paper, where they implement it in a vectorized form, but basic operations over each pixel seem to be the same. – ffriend Jun 15 '13 at 15:10
  • I joined this community to upvote this question and this answer. However, the source code link of https://www.ri.cmu.edu/research_project_detail.html?project_id=515&menu_id=261 is broken – Prasad Raghavendra Mar 03 '20 at 17:25
  • @PrasadRaghavendra: I think this is the source code referred to in the other comment. – Matt L. Mar 03 '20 at 17:54