Multi-dimensional PositionIndex

Question

I've noticed that PositionIndex does not take a level spec like most related functions. So while it's very useful on 1D lists, I can't really use it to build an index of a matrix, for example. Now I'm wondering what an efficient replacement would be.

I'm aware of this post which contains two handwritten alternatives to PositionIndex that are faster in the 1D case.

However, since I know that the values in my matrix are small consecutive integers (in fact it's a label matrix obtained from MorphologicalComponents), I don't really need the result to be an association, and I've found a simple Array over Position calls to be faster than a GroupBy over all index pairs:

matrix = RandomInteger[{1, 10}, {1000, 1000}];
Array[Position[matrix, #] &, Max@matrix]; // AbsoluteTiming
GroupBy[Tuples[Range@Length@matrix, 2], matrix[[## & @@ #]] &]; // AbsoluteTiming

{0.910396, Null}
{2.73593, Null}

The Array approach seems wasteful, though, because it scans the whole matrix once per label. Then again, I can speed this one up a bit more in my own case, because I don't care about the positions of the 0s in the label matrix.

Is there a better way to generalise the GroupBy approach to multiple dimensions? Or is there some third approach that would be a useful replacement for PositionIndex when I'm dealing with data that has more structure than a 1D list?

@AlexeyGolyshev Wouldn't that only give me the second index of each index pair? I guess I could do that both on matrix and its transpose and then zip the two results together. I might try that, but I'm not sure it will be particularly efficient (although then I can swap in the faster replacement for PositionIndex again). — Martin Ender, Mar 30 '17 at 16:42
Merge[Table[Thread[{i,#}]&/@PositionIndex[matrix[[i]]],{i,Length@matrix}],Join] — Alexey Golyshev, Mar 30 '17 at 16:49
@AlexeyGolyshev It needs to be Join@@#&, but that's already faster, thanks. — Martin Ender, Mar 30 '17 at 16:52
Interestingly, using cleanPosIdx or myPosIdx from the other question is slower than PositionIndex in this case. — Martin Ender, Mar 30 '17 at 16:54

score 4 · Answer 1 · answered Dec 30 '19 at 03:22

Here is a function based on the ResourceFunction GroupByList:

pIndex2[m_] := ResourceFunction["GroupByList"][
    Tuples @ Range @ Dimensions @ m,
    Flatten @ m
]

Comparison with posInd (the Pick-based function from another answer):

r1 = posInd[matrix]; //RepeatedTiming
r2 = pIndex2[matrix]; //RepeatedTiming

r1 === Values @ KeySort @ r2

{0.046, Null}

{0.0394, Null}

True

The GroupByList version scales better with the number of keys:

matrix = RandomInteger[{1, 10^3}, {1000, 1000}];
r1 = posInd[matrix]; //RepeatedTiming
r2 = pIndex2[matrix]; //RepeatedTiming

r1 === Values @ KeySort @ r2

{0.68, Null}

{0.050, Null}

True
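
For readers who cannot reach the Wolfram Function Repository, roughly the same grouping can be sketched with the built-in GroupBy acting on a list of value -> position rules. This is only a sketch: pIndexBuiltin is a hypothetical name, not part of the answer above, and it has not been timed against posInd or pIndex2.

```mathematica
(* Hypothetical built-in-only variant of pIndex2. *)
(* Pairs every flattened entry with its index tuple, then groups the tuples by entry. *)
pIndexBuiltin[m_] := GroupBy[
  Thread[Flatten[m] -> Tuples[Range /@ Dimensions[m]]],
  First -> Last
]

pIndexBuiltin[{{1, 2}, {2, 1}}]
(* <|1 -> {{1, 1}, {2, 2}}, 2 -> {{1, 2}, {2, 1}}|> *)
```

Because the index tuples come from Dimensions rather than Length, the same definition should also apply to rectangular matrices and to full arrays of higher rank.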

score 3 · Answer 2 · answered Mar 30 '17 at 20:39

This is the fastest I can think of right now:

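posInd[matrix_] := Block[{inds, flatmat},
  (* all {row, column} pairs in row-major order; assumes a square matrix *)
  inds = Flatten[Outer[List, #, #] &@Range[Length@matrix], 1];
  (* matrix entries in the same row-major order *)
  flatmat = Flatten[matrix, 1];
  (* for each label 1..Max, pick the index pairs whose entry equals it *)
  Pick[inds, flatmat, #] & /@ Range[Max@matrix]
  ]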

On my MacBook Pro from 2013 with Mathematica 10.0.2.0:

matrix = RandomInteger[{1, 10}, {1000, 1000}];
m1 = Array[Position[matrix, #] &, Max@matrix]; // AbsoluteTiming
m2 = GroupBy[Tuples[Range@Length@matrix, 2], matrix[[## & @@ #]] &]; // AbsoluteTiming
m3 = Merge[Table[
    Thread[{i, #}] & /@ PositionIndex[matrix[[i]]], {i, 
     Length@matrix}], Join @@ # &]; // AbsoluteTiming
m4 = posInd[matrix]; // AbsoluteTiming

{0.866475, Null}

{2.824167, Null}

{0.297523, Null}

{0.088725, Null}

m1 === m4

True

Nice. This can be sped up a little more by constructing the indices with Tuples[Range@Length@matrix, 2]. — Martin Ender, Mar 30 '17 at 20:52
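
A minimal sketch of the variant suggested in the comment above, with the index list built by Tuples. The name posIndTuples is hypothetical, and using Dimensions instead of Length is my own assumption so that non-square and higher-rank arrays are covered as well; no timings are claimed for it.

```mathematica
(* Hypothetical variant of posInd: index tuples built with Tuples. *)
posIndTuples[array_] := Module[{inds, flat},
  (* index tuples in the same row-major order that Flatten uses *)
  inds = Tuples[Range /@ Dimensions[array]];
  flat = Flatten[array];
  (* one list of positions per label 1..Max *)
  Pick[inds, flat, #] & /@ Range[Max[array]]
]

posIndTuples[{{1, 2}, {2, 1}}]
(* {{{1, 1}, {2, 2}}, {{1, 2}, {2, 1}}} *)
```

As with posInd, the result is a plain list indexed by label, so it relies on the labels being positive integers; a label in 1..Max that never occurs yields an empty list.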

score 2 · Answer 3 · answered Mar 30 '17 at 17:04

Merge[Table[Thread[{i,#}]&/@PositionIndex[matrix[[i]]],{i,Length@matrix}],Join@@#&]
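
To make the one-liner easier to follow, here is a small worked example on a 2×2 matrix. The names small and perRow are hypothetical and only serve the illustration. Each row is indexed separately with PositionIndex, the row number is attached to every column index, and Merge then concatenates the per-row lists for each value.

```mathematica
small = {{1, 2}, {2, 1}};

(* one association per row: value -> list of {row, column} pairs *)
perRow = Table[
   Thread[{i, #}] & /@ PositionIndex[small[[i]]],
   {i, Length@small}];
(* {<|1 -> {{1, 1}}, 2 -> {{1, 2}}|>, <|2 -> {{2, 1}}, 1 -> {{2, 2}}|>} *)

(* Merge hands each key a list of per-row lists, so they are spliced with Join @@ # & *)
Merge[perRow, Join @@ # &]
(* <|1 -> {{1, 1}, {2, 2}}, 2 -> {{1, 2}, {2, 1}}|> *)
```

With plain Join as the merging function, each value would stay a list of per-row lists, because Merge applies the function to a single list argument.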

Alexey Golyshev
