When we use one list to select items from another with something like Pick, code such as the following loses the 1-to-1 correspondence between items in the two lists:

dataArray = {"A","B","C","D","E","F","G"};
testArray = {0.223,0.3,1.2,0.44,4,0.24449,1.01};
dataArray = Pick[dataArray, #>= 1 &/@ testArray];

output = {"C", "E", "G"}

Without making a copy of anything, how do we safely prune the corresponding items from the other list (here testArray) to restore the previous 1-to-1 correspondence between elements of testArray and elements of dataArray? For example, if B in dataArray corresponds to 0.3 in testArray (based on its index), the same index-based correspondence should hold for the remaining elements after the Pick pruning step.

Mr.Wizard
CA30
    I'm not sure what your goal is in the end, but maybe you can just work on pairs? Select[Transpose[{dataArray, testArray}], #[[2]] > 1 &] – Kuba Apr 04 '14 at 20:06

1 Answer

There are surely many ways to approach this problem. Which is best likely (again) depends on your data. I will illustrate three variants.

Paired data (a la decorate-and-sort)

We can do as Kuba did and merge the two lists into one to keep the elements together:

Select[{dataArray, testArray}\[Transpose], #[[2]] >= 1 &]
{{"C", 1.2}, {"E", 4}, {"G", 1.01}}

You can finish with a second Transpose to separate the data into two lists.
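
As a minimal sketch of that final step, the following repeats the question's lists and splits the filtered pairs back into two aligned lists. The names pairs, keptData and keptTest are hypothetical, introduced only for illustration.

```mathematica
(* The two parallel lists from the question *)
dataArray = {"A", "B", "C", "D", "E", "F", "G"};
testArray = {0.223, 0.3, 1.2, 0.44, 4, 0.24449, 1.01};

(* Pair the elements up, then keep pairs whose second element is >= 1 *)
pairs = Select[Transpose[{dataArray, testArray}], #[[2]] >= 1 &];

(* A second Transpose separates the pairs into two aligned lists *)
{keptData, keptTest} = Transpose[pairs]
(* {{"C", "E", "G"}, {1.2, 4, 1.01}} *)
```

If no element passes the test, pairs is empty and the destructuring assignment on the last line has nothing to split, so real code would probably want to check for that case first.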

Reused mask

A typically faster method is to simply construct the mask once and then reuse it in Pick as needed:

mask = UnitStep[testArray - 1];

Pick[#, mask, 1] & /@ {dataArray, testArray}
{{"C", "E", "G"}, {1.2, 4, 1.01}}

Note that I converted your test to a vectorized numeric form for better performance.
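
To make the conversion concrete, here is a small sketch comparing the numeric mask with the original element-wise test on the question's data. UnitStep[x] is 1 for x >= 0 and 0 otherwise, so UnitStep[testArray - 1] marks exactly the elements that are >= 1. The names sameMask and strictMask are hypothetical, used only for illustration.

```mathematica
testArray = {0.223, 0.3, 1.2, 0.44, 4, 0.24449, 1.01};

(* Vectorized numeric mask: 1 where the element is >= 1, 0 elsewhere *)
mask = UnitStep[testArray - 1]
(* {0, 0, 1, 0, 1, 0, 1} *)

(* The original test, mapped element by element and converted to 0/1 *)
sameMask = Boole[# >= 1 & /@ testArray] === mask
(* True *)

(* A strict test (> 1) can be written the same way by flipping the sign *)
strictMask = 1 - UnitStep[1 - testArray]
(* {0, 0, 1, 0, 1, 0, 1} *)
```

For this data the two masks coincide because no element is exactly 1. They would differ only at such an element.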

Index-based filtering

Perhaps the top performing method for filter reapplication (especially in version 7 before Pick was better optimized) is to create a list of positions you wish to keep, then extract them using Part or Extract. Faster than Position, when applicable, is SparseArray, using the undocumented Properties method:

fastpos = SparseArray[#]["AdjacencyLists"] &;

idx = fastpos @ UnitStep[testArray - 1]
{3, 5, 7}
#[[idx]] & /@ {dataArray, testArray}
{{"C", "E", "G"}, {1.2, 4, 1.01}}

You can also process multiple lists at once with the help of All, like this:

{dataArray, testArray}[[All, idx]]
{{"C", "E", "G"}, {1.2, 4, 1.01}}
Mr.Wizard