4

I'd like to select (offset) indexes of a number of array points, that match a selection criteria. I can do this by creating a Table of all the coordinates, and then selecting all the values:

data = Array[ RandomInteger[1] &, {4, 4}]
{wx, wy} = data // Dimensions ;
t = Flatten[
  Table[ {x - wx/2, y - wy/2, data[[x]][[y]]}, {x, 1, wx}, {y, 1,
    wy}], 1]
Select[ t, #[[3]] == 1 &]

This resulted in the respective values:

{{1, 1, 1, 1}, {1, 0, 0, 1}, {1, 0, 1, 0}, {1, 0, 0, 1}}
{{-1, -1, 1}, {-1, 0, 1}, {-1, 1, 1}, {-1, 2, 1}, {0, -1, 1}, {0, 0,
  0}, {0, 1, 0}, {0, 2, 1}, {1, -1, 1}, {1, 0, 0}, {1, 1, 1}, {1, 2,
  0}, {2, -1, 1}, {2, 0, 0}, {2, 1, 0}, {2, 2, 1}}
{{-1, -1, 1}, {-1, 0, 1}, {-1, 1, 1}, {-1, 2, 1}, {0, -1, 1}, {0, 2,
  1}, {1, -1, 1}, {1, 1, 1}, {2, -1, 1}, {2, 2, 1}}

However, my real data is big (800x800) and sparse. This method creates a big data set and then throws most of it away. What is a more efficient way to do this?

Peeter Joot
  • 6,398
  • 4
  • 36
  • 55

4 Answers4

3

Here's some sparse sample data of size 800 by 800:

am = AdjacencyMatrix@RandomGraph[{800, 1000}];
data = am.am;
data -= DiagonalMatrix@Diagonal[data];

It contains 1s and 2s (other than zeros).

Let's find the positions of 2s:

Cases[Most@ArrayRules[data], HoldPattern[position_ -> 2] :> position]

These positions can be easily offset after they're found.

You're probably looking something more like this variation:

Cases[Most@ArrayRules[data], HoldPattern[position_ -> value_ /; value == 2] :> Append[position, value]]

This approach is best when you have sparse arrays to work with.

Szabolcs
  • 234,956
  • 30
  • 623
  • 1,263
2

Please consider:

{wx, wy} = {4, 4};

SeedRandom[1];
data = RandomInteger[1, {wx, wy}]

{#1 - wx/2, #2 - wy/2, data[[##]]} & @@@ Position[data, 1]
{{-1, -1, 1}, {-1, 0, 1}, {-1, 2, 1}, {0, 2, 1}, {1, 0, 1}}

For better performance you can use SparseArray to directly compute the list of positions, then use the "NonzeroPositions" Property to extract them. For example:

SeedRandom[1]
m = RandomInteger[5, {10, 10}];

Optimized code to find the positions of all elements equal to four:

sa = SparseArray[Unitize @ Subtract[m, 4], Automatic, 1];

pos = sa["NonzeroPositions"]
{{1, 1}, {1, 3}, {2, 6}, {2, 7}, {3, 1}, {3, 3}, {3, 7},
 {3, 9}, {5, 5}, {6, 8}, {7, 4}, {8, 5}, {9, 6}, {10, 3}}

(The long form of Subtract is used because peculiarly it is faster.)

You could then operate on these positions as I did above. Better still, set all non-selection elements in the array to the same value, and use that as the background of the SparseArray, then also use the "NonzeroValues" Property as well. For example let's select all elements in the interval $[2, 4]$:

{wx, wy} = {14, 20};

SeedRandom[1]
m = RandomInteger[99, {wx, wy}];

sa = SparseArray[Clip[m, {2, 4}, {0, 0}]];

pos = sa["NonzeroPositions"]
val = sa["NonzeroValues"]
{{1, 5}, {1, 13}, {3, 5}, {3, 9}, {4, 14}, {9, 5}, {9, 20}, {11, 10}, {14, 16}}

{3, 4, 4, 2, 3, 2, 2, 2, 2}

We can then transform positions using fast vector operations:

pos2 = (pos\[Transpose] - {wx/2, wy/2})\[Transpose]  (* this looks much nicer in a Notebook *)
{{-6, -5}, {-6, 3}, {-4, -5}, {-4, -1}, {-3, 4}, {2, -5}, {2, 10}, {4, 0}, {7, 6}}

And combine with the values using Join, which does not unpack:

Join[pos2, val ~Partition~ 1, 2]

% // Developer`PackedArrayQ
{{-6, -5, 3}, {-6, 3, 4}, {-4, -5, 4}, {-4, -1, 2}, {-3, 4, 3},
 {2, -5, 2}, {2, 10, 2}, {4, 0, 2}, {7, 6, 2}}

True

Note that using SparseArray properties with packed/packable data is much faster than using ArrayRules:

{wx, wy} = {14000, 20000};

SeedRandom[1]
m = RandomInteger[99, {wx, wy}];

sa = SparseArray[Clip[m, {2, 4}, {0, 0}]];

Timing[
 pos = sa["NonzeroPositions"];
 val = sa["NonzeroValues"];
]

Timing[
 rules = Most @ ArrayRules[sa];
]
{0.0088, Null}

{4.649, Null}

Mr.Wizard
  • 271,378
  • 34
  • 587
  • 1,371
1

It seems that, depending on how dense the input matrix is, Position may perform better than the Cases/ ArrayRules combination:

am = AdjacencyMatrix@RandomGraph[{800, 1000}];
data = am.am;
data -= DiagonalMatrix@Diagonal[data];

ClearSystemCache[];
out1 = Cases[Most@ArrayRules[data], HoldPattern[position_ -> 2] :> position]; // Timing
(* {0.015625,Null} *)
ClearSystemCache[];
out2 = Position[Normal@data, 2, {2}]; // Timing
(* {0.03125,Null} *)
Length@out1/(800  800) // N
(* 0.0000375 *)
Sort@out1 == Sort@out2
(* True *)

For a denser matrix

am = AdjacencyMatrix@RandomGraph[{800, 10000}];
data = am.am;
data -= DiagonalMatrix@Diagonal[data];
ClearSystemCache[];
out1 = Cases[Most@ArrayRules[data], HoldPattern[position_ -> 2] :> position]; // Timing
(* {0.3125,Null} *)
ClearSystemCache[];
out2 = Position[Normal@data, 2, {2}]; // Timing
(* {0.046875,Null} *)
Length@out1/(800  800) // N
(* 0.140584375` *)
Sort@out1 == Sort@out2 
(* True *)
kglr
  • 394,356
  • 18
  • 477
  • 896
0

Use functions Sow and Reap

t = Reap[
  Do[
   If[data[[x]][[y]] == 1, Sow[{x - wx/2, y - wy/2, data[[x]][[y]]}]],
   {x, 1, wx}, {y, 1, wy}
   ]
  ]
t = t[[2,1]]
molekyla777
  • 2,888
  • 1
  • 14
  • 28