
In the Compressed Sensing context, assume there is a signal $ x \in {\mathbb{R}}^{n} $ which is $ k $ sparse. Namely, its $ {\ell}_{0} $ pseudo norm is $ {\left\| x \right\|}_{0} = k $ (the signal has only $ k $ non-vanishing elements), where $ k \ll n $.

Given a model matrix $ A \in {\mathbb{R}}^{m \times n} $, the measurements are given by (this is the model):

$$ y = A x $$

The recovery problem is given by:

$$ \arg \min_{x} {\left\| A x - y \right\|}_{2}^{2} \; \text{s. t.} \; {\left\| x \right\|}_{0} = k $$

Since the exact solution of the problem above is hard to find (the $ {\ell}_{0} $ constrained problem is combinatorial), the recovery (estimation) of the signal $ x $ from the measurements $ y $ is usually done using the Orthogonal Matching Pursuit (OMP) algorithm.

Basically, OMP iteratively selects the atoms (columns of $ A $) with the highest correlation to the current residual.
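
For concreteness, here is a minimal MATLAB-style sketch of OMP as just described (a rough illustration only, not a reference implementation; the function and variable names are mine):

function [ vX, vSupport ] = OmpSketch( A, vY, k )
    vRes     = vY;                              % start with the full measurement vector as the residual
    vSupport = [];
    for ii = 1:k
        [~, maxIdx] = max(abs(A' * vRes));      % atom most correlated with the current residual
        vSupport    = [vSupport, maxIdx];       % grow the support by one index
        vXs  = A(:, vSupport) \ vY;             % least squares over the selected atoms
        vRes = vY - A(:, vSupport) * vXs;       % new residual, orthogonal to the selected atoms
    end
    vX = zeros(size(A, 2), 1);
    vX(vSupport) = vXs;                         % embed the coefficients into the sparse estimate
end

Note that each iteration correlates with the updated residual rather than with $ y $ itself; this is the only place it differs from the one-step selection asked about below.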

The question is: since the measure used to select indices is based on correlation, why can't one just find the support in a single step, as follows:

[vCorrVal, vCorrIdx] = sort(abs(A' * vY), 'descend');   % sort correlations with y by magnitude
vSignalSupport = vCorrIdx(1:k);                          % take the k most correlated atoms

The result is almost the same as that of normal OMP.

Royi
Digi1
  • I can't fully follow you; I'm not sure which definition of "iterative decoding" you're using. "Select the sequence with the highest correlation" does sound like "decoding by finding the symbol with the highest correlation", which would make iterative decoding the same as matching pursuit if what are symbols in decoding are atoms in MP. – Marcus Müller Feb 07 '17 at 12:28
  • Thanks for your reply. I am asking about the OMP algorithm, in which we find the support by selecting the column of the sensing matrix (A) which has maximum correlation with the received signal (Y), e.g. Res = Y; [val, ind] = max(abs(A' * Res)). We select the support as ind: Support = [Support, ind]. Then we update the residual (Res) and run the algorithm K (sparsity) times or until some other stopping criterion. My question is why we don't select the maximum K indices in one go, like [val, ind] = sort(abs(A' * Y), 'descend'); and then take Support = ind(1:K); ... – Digi1 Feb 07 '17 at 13:10
  • Edit your question to include that info. Use proper LaTeX/MathJax to express your formulas! – Marcus Müller Feb 07 '17 at 16:31
  • So, what do you think "use the $K$ maximum correlation columns" is, rather than multiple iterations of the "find the maximum correlation" algorithms?! – Marcus Müller Feb 07 '17 at 17:06
  • I mean that in normal OMP, we have to first find the maximum correlated column of $A$, save the index of that column, and then update the residual for the next iteration. All these steps need extra computation, like a pseudoinverse of $A$ at each iteration. Why is it not done in one step? I have updated the question too. – Digi1 Feb 08 '17 at 08:12
  • I see no update in the last 16 h; no, in OMP you don't have to update the residual. That's the point of the "O" in OMP. – Marcus Müller Feb 08 '17 at 09:20
  • @MarcusMüller, Please reopen the question. It is really a nice question and I rephrased it to be clear. – Royi Feb 08 '17 at 15:05
  • @Royi I agree, the way you changed it, it's now a really nice question. I'm voting to reopen it; still would love Awais to comment on whether the question you are asking is the same question Awais wanted to ask. – Marcus Müller Feb 08 '17 at 15:31
  • I am really sorry. I couldn't make it so clear. Yeah, that was my question. Thanks @Royi. Now it's looking good. – Digi1 Feb 08 '17 at 15:38
  • @MarcusMüller, the question is how to get Peter K to see this. – Royi Feb 08 '17 at 17:41
  • Could anyone please reopen this question? Thank you. For others: just click on reopen at the bottom of the question. – Royi Feb 09 '17 at 12:32
  • I believe he's asking why you can't grab the highest correlated vectors all at once, rather - why do you have to iterate. Consider the case when you have 2 basis vectors that are exactly the same (it is a pathological case). Assume they also have a high correlation as per the algorithm. Taking both vectors offers no new information - you only need one of them. That's why you remove the contribution of the selected vector. – David Feb 09 '17 at 13:29
  • @David, could you click on reopen above? It is really crazy what's going on here with putting questions on hold. – Royi Feb 10 '17 at 12:41
  • I would, but I can't find that option anywhere. I guess I don't have enough points. – David Feb 10 '17 at 14:42
  • Could you correct the mention of the pseudo-norm $\ell_0$? "A count measure" or "a sparsity index" would be fine. IMHO, although used a lot, it is not a norm of any kind. – Laurent Duval Feb 11 '17 at 21:27

2 Answers


The main advantage of OMP is that, after each iteration, the residual is orthogonal to the atoms selected so far (the current solution).

Let's say you select all $k$ columns of $A$ (also called atoms) at once, and let us also presume that $A$ is an overcomplete dictionary (this is more or less the standard in the OMP literature).

Now, with your method, if the atom that correlates the most with your measurements $y$ is linearly dependent with $p < k$ other atoms in $A$, you will end up with a $(k-p)$-sparse signal, because $p$ entries will be more or less redundant. The same argument can of course be extended to less correlated atoms. You might also be lucky and never see the phenomenon.
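
To make this concrete, here is a small toy construction of my own (not from the answer): column 2 duplicates column 1 and the true support is $\{1, 5\}$. With a random $A$, the one-shot top-$k$ selection then typically (though not provably) returns $\{1, 2\}$ and misses atom 5, whereas OMP's residual update avoids this:

rng(0);                                  % fixed seed, for repeatability only
m = 20; n = 50; k = 2;
A = randn(m, n);
A = A ./ sqrt(sum(A .^ 2));              % normalize columns (implicit expansion, R2016b+)
A(:, 2) = A(:, 1);                       % pathological case: a duplicated (linearly dependent) atom
vX = zeros(n, 1);
vX([1, 5]) = [1; 0.5];                   % true 2-sparse signal supported on atoms 1 and 5
vY = A * vX;
[~, vCorrIdx] = sort(abs(A' * vY), 'descend');
vCorrIdx(1:k)                            % one-shot support: typically [1; 2], so atom 5 is missed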

Let's take the same example but with OMP this time. During the first iteration you would select the atom that correlates the most with the measurements $y$. After that you compute the coefficients in $x$ by least squares, such that the new residual is orthogonal to the atoms selected so far. In other words, you have extracted all of the information provided by the currently selected atoms, so during the next iteration you are very likely to pick an atom that contains fresh information (ask yourself what would happen with linearly dependent atoms in this case).
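
A short sketch of that update step (the sizes and the selected index set here are hypothetical, chosen just to show the orthogonality property):

m = 20; n = 50;
A  = randn(m, n);
vY = randn(m, 1);
vSupport = [3, 17];                      % pretend these atoms were selected so far
vXs  = A(:, vSupport) \ vY;              % least-squares coefficients over the selected atoms
vRes = vY - A(:, vSupport) * vXs;        % updated residual
norm(A(:, vSupport)' * vRes)             % ~ 0: the selected atoms carry no residual energy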

Here is a list of atom selection look-ahead strategies based on OMP and OLS that you might find interesting to read: POMP, LAOLS and POLS.

Paul Irofti

What you propose is actually used in other algorithms. Your proposal corresponds to the first step of iterative hard thresholding (IHT). After the first step, the residual is updated, the correlation is repeated, and the correlation result is added to the signal estimate before thresholding again. This is repeated until convergence is reached in some sense (Iterative Hard Thresholding for Compressed Sensing). One can also update the estimate in a way more similar to OMP; then it corresponds to hard thresholding pursuit (Hard Thresholding Pursuit: An Algorithm for Compressive Sensing).
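
A rough IHT sketch (after Blumensath & Davies), assuming for simplicity that $A$ is scaled so that a plain gradient step is stable; the function and variable names are mine. Note that with a zero initial estimate, the first iteration is exactly your thresholded $A^T y$:

function vX = IhtSketch( A, vY, k, numIterations )
    vX = zeros(size(A, 2), 1);
    for ii = 1:numIterations
        vX = vX + A' * (vY - A * vX);        % gradient step: correlate the residual with the model
        [~, vIdx] = sort(abs(vX), 'descend');
        vX(vIdx(k + 1:end)) = 0;             % hard threshold: keep only the k largest entries
    end
end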

The principle (of identifying many support index candidates in each iteration) is also seen in the two-step thresholding algorithms such as CoSaMP (CoSaMP: Iterative Signal Recovery From Incomplete and Inaccurate Samples) and subspace pursuit (Subspace Pursuit for Compressive Sensing Signal Reconstruction).
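
For reference, a rough CoSaMP-style sketch (after Needell & Tropp; the $2k$ candidate count follows the paper, but the code itself and its names are only my illustration):

function vX = CoSampSketch( A, vY, k, numIterations )
    n  = size(A, 2);
    vX = zeros(n, 1);
    for ii = 1:numIterations
        vRes = vY - A * vX;                           % current residual
        [~, vIdx] = sort(abs(A' * vRes), 'descend');  % correlate the residual with all atoms
        vT = union(find(vX), vIdx(1:2 * k));          % merge 2k candidates with the current support
        vB = zeros(n, 1);
        vB(vT) = A(:, vT) \ vY;                       % least squares on the merged support
        [~, vIdxB] = sort(abs(vB), 'descend');
        vX = zeros(n, 1);
        vX(vIdxB(1:k)) = vB(vIdxB(1:k));              % prune back to the k largest entries
    end
end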

All of these algorithms have in common that they run several iterations of "correlating" the residual with the "model" and updating the residual accordingly, because it improves the estimate.

Thomas Arildsen