10

Here is how to Generate data

the function is below with two images to generate data

img1

enter image description here

img2 enter image description here

segmentImage[binarizedMask_?ImageQ] := 
Module[{seg, areas, indexMaxarea, maxArea},
seg = MorphologicalComponents@*ColorNegate@Dilation[binarizedMask, 1];
areas = ComponentMeasurements[seg, "Area"];
{indexMaxarea, maxArea} = First@MaximalBy[areas, Last] /. Rule -> List;
If[maxArea > 20000, ArrayComponents[seg, Length@areas, indexMaxarea -> 0], seg] ~
Dilation~1(* 20000 is just a threshold *)
];

generating data

{seg1, seg2} = segmentImage /@ {img1, img2};

Now the question

I have two 2-D arrays with dimensions {651,823}. Each array has labelled components i.e. at each position the value is either zero or an integer label. I have 51 objects in the first seg1 and 50 objects in the second array seg2.

I wish to calculate the number of positions where a given object from the first array intersects the objects in the second array.

I have concocted this strategy but it seems terribly slow:

(pos = Table[
 Length@Intersection[Position[seg1, i], Position[seg2, j]], {i, 
  51}, {j, 50}];) // AbsoluteTiming
(* {143.607, Null} *)

Since these arrays need to be modified and consequently the intersections need to recalculated in an iteration later, I find that this is not the best way to do it.

Any ideas of speeding the process up will be welcomed.

the output i have looks something like this (each row represents an integer label in seg1 and each column represents a label in seg2 and the values represent the number of positions the labelled objects in both arrays intersect):

enter image description here

Note: In the image attached the values 267 and 564 in the first line delineates that the object (label 1) in seg1 intersects an object labeled 5 and 7 in seg2 at 267 and 564 positions respectively

Ali Hashmi
  • 8,950
  • 4
  • 22
  • 42
  • In the image attached the value 267 in the first line delineates that the object (label 1) in seg1 intersects an object labeled 5 in seg2 at 267 positions – Ali Hashmi May 21 '17 at 22:36
  • So are the two 2d arrays seg1 and seg2? In your pos printout, if each column reflects a single integer label and each row also reflects a unique integer label, how can you have two nonzero values in the first column, 204 and 118? – MikeY May 21 '17 at 22:37
  • @MikeY sorry i read your comment late. The thing is the labeled data in seg1 is 1,2,3....51 and seg2 is labeled with 1,2,3....50 so the entries being non-zero means that the 4th and 6th labels in seg1 intersect with the label 1 in seg2 – Ali Hashmi May 21 '17 at 22:56
  • maybe SparseArray[SparseArray@Normal[Rest@Counts[Flatten[(1 - Unitize[seg1 - #]) seg2]]]&/@Range[51]]// Normal? – kglr May 21 '17 at 23:24
  • @kglr let me add the raw images and the function from which the arrays are created. because currently your suggestion is giving a few errors. – Ali Hashmi May 21 '17 at 23:28
  • @kglr please see the question with the input images and the only function needed to generate the two arrays – Ali Hashmi May 21 '17 at 23:35
  • @Ali, could you try Normal[SparseArray[SparseArray[#, {50}]&@DeleteCases[Normal[Counts[Flatten[(1 - Unitize[seg1 - #]) seg2]]],0->_]&/@Range[51]]]? – kglr May 21 '17 at 23:39
  • So by intersect, you mean the corresponding positions in seg1 and seg2 are both nonzero, but they need not be the same? – MikeY May 21 '17 at 23:47
  • @kglr can you post this as an answer. It works and is blazingly fast :) – Ali Hashmi May 21 '17 at 23:48
  • @kglr 0.52 seconds !! – Ali Hashmi May 21 '17 at 23:49
  • @MikeY the labels can be any number in both the arrays. If the same position in both arrays is occupied then that means an intersection. We are computing the number of positions where objects i = 1,2,...51 in seg1 intersect with objects j = 1,2,3...50 in seg2 – Ali Hashmi May 21 '17 at 23:57

7 Answers7

9
AbsoluteTiming[
 counts = Normal[SparseArray[SparseArray[#, {50}] &@ DeleteCases[
         Normal[Counts[Flatten[(1 - Unitize[seg1 - #]) seg2]]], 0 -> _] & /@ Range[51]]];]

{0.30279, Null}

(pos = Table[Length@Intersection[Position[seg1, i], Position[seg2, j]], {i, 
        51}, {j, 50}];) // AbsoluteTiming

{122.747, Null}

counts == pos

True

Ali Hashmi
  • 8,950
  • 4
  • 22
  • 42
kglr
  • 394,356
  • 18
  • 477
  • 896
7

Something like the following would be pretty fast:

intersections[r1_, r2_] := Module[{t},
  t = DeleteCases[Tally@Flatten[Unitize[r1 r2] (100 r1 + r2)], {0, _}];
  t[[All, 1]] = IntegerDigits[t[[All, 1]], 100];
  Normal[SparseArray[Rule @@@ t]]
]

This assumes that the labels are all less than 100. For the OP example:

res=intersections[seg1, seg2]; //AbsoluteTiming
res[[;;5, ;;5]]

{0.006349, Null}

{{0, 0, 0, 0, 267}, {0, 0, 6888, 0, 0}, {0, 6511, 0, 0, 0}, {204, 0, 0, 7249, 70}, {0, 0, 0, 110, 3971}}

Carl Woll
  • 130,679
  • 6
  • 243
  • 355
  • @AliHashmi It works fine for me. Perhaps you need to restart the kernel or clear some variables? – Carl Woll May 22 '17 at 14:35
  • your method works like a charm !! plus one'd it already :) – Ali Hashmi May 22 '17 at 16:27
  • @AliHashmi IMHO you really should change the Accept to this post. I somehow overlooked it completely before starting on an answer myself, and it seems to be far and away the best method provided. – Mr.Wizard May 22 '17 at 16:32
  • @CarlWoll there is a slight problem with your approach. Your method here gives a dimension of {50,50} whereas it should have been {51,50} – Ali Hashmi Jan 22 '18 at 11:11
  • 1
    @AliHasmi The dimensions are {50, 50} because there is no intersection of the 51 from seg1 with any of the integers from seg2. This could be fixed by adding a dimension specification to the SparseArray code. That is, using SparseArray[Rule @@@ t, {Max[r1], Max[r2]}] instead of SparseArray[Rule @@@ t]. – Carl Woll Jan 22 '18 at 19:19
  • @CarlWoll thanks checked your comment late. Btw can your approach be modified to work with gaps in labels. lets say the first mask had labels such as {1,2,3 ... 35, 39,40,42} (36, 37, 38 and 41 are not there). This is only an example. The gaps can exist anywhere. The second mask has regular labeling scheme with no gaps {1,2,3.. n} where n is the maximum value. – Ali Hashmi Apr 05 '18 at 16:15
  • @CarlWoll i modified your function as: overlapMatrix[segPrev_, segCurr_] := Module[{m,p1,p2,val,rules1,rules2}, {p1,p2} = Values@ComponentMeasurements[segPrev,"Label"]&/@{segPrev,segCurr}; val = Max[Max@p1,Max@p2]+1; m = DeleteCases[Tally@Flatten[Unitize[segPrev segCurr] (val segPrev + segCurr)], {0, _}]; m[[All, 1]] = IntegerDigits[m[[All, 1]], val]; m =SortBy[m,First]; {rules1,rules2}=Thread[#-> Range@Length@#]&/@{p1,p2}; m = MapAt[ind\[Function]ind/.rules1,m,{All,1,1}]; m = MapAt[ind\[Function]ind/.rules2,m,{All,1,2}]; Normal@SparseArray[Rule@@@m,{Length@p1,Length@p2}] ]; – Ali Hashmi Apr 05 '18 at 16:16
  • @CarlWoll this passes in some cases (when i use different masks) and fails in other cases. I am not sure where i am going wrong to compute the overlaps. – Ali Hashmi Apr 05 '18 at 16:17
5

A simpler solution using PositionIndex

a1 = Flatten[seg1] // PositionIndex;
a2 = Flatten[seg2] // PositionIndex;
a3 = Table[Length@Intersection[a1[i], a2[j]], {i, 51}, {j, 50}]; // AbsoluteTiming

{0.930523, Null}

counts == pos == a3

True

Can't beat kglr's solution, though. It takes 0.41 s.

Anjan Kumar
  • 4,979
  • 1
  • 15
  • 28
5

Pick is optimized to work on packed arrays. Let's try it. As a self-contained function:

fn[x_, y_] :=
 Module[{a, b, m, n, toRow},
   {m, n} = Max /@ {x, y};
   {a, b} = Flatten /@ {x, y};
   toRow = SparseArray[Normal @ KeyDrop[0] @ Counts @ #, {n}] &;
   Array[toRow @ Pick[b, a, #] &, m]
 ]

Output matches kglr's:

img1 = Import["https://i.stack.imgur.com/oKpri.png"];
img2 = Import["https://i.stack.imgur.com/yzvfv.png"];

{seg1, seg2} = segmentImage /@ {img1, img2};

fn[seg1, seg2] == counts  (* True *)

Performance is superior by more than an order of magnitude:

fn[seg1, seg2] // RepeatedTiming // First

Normal[SparseArray[
    SparseArray[#, {50}] &@
       DeleteCases[Normal[Counts[Flatten[(1 - Unitize[seg1 - #]) seg2]]], 
        0 -> _] & /@ Range[51]]] // RepeatedTiming // First
0.026

0.488


The code above still scans the data fifty times. That's usually a bad idea. With more segments this might become an issue. As an attempt to avoid the problem I'll use Szabolcs's function from How to efficiently find positions of duplicates? (as pos) as a substitute for Pick:

fn2[x_, y_] :=
  Module[{a, b, n, pos, toRow},
    n = Max[y];
    {a, b} = Flatten /@ {x, y};
    pos[lst_] := GatherBy[Range @ Length @ lst, lst[[#]] &];
    toRow = SparseArray[Normal @ KeyDrop[0] @ Counts @ #, {n}] &;
    toRow @ b[[#]] & /@ Rest @ pos @ a
  ]

Test:

fn[seg1, seg2] == fn2[seg1, seg2]

fn2[seg1, seg2] // RepeatedTiming // First
True

0.024

Only a slight edge over fn here but I expect it to pull ahead as the number of segments rises.

Mr.Wizard
  • 271,378
  • 34
  • 587
  • 1,371
  • This is still about 3-4 times slower than my method. Also, I would pull the definition of pos outside of fn2, as there is no reason to define pos every time you run fn2. – Carl Woll May 22 '17 at 16:21
  • @Mr.Wizard this is great ! +1 – Ali Hashmi May 22 '17 at 16:24
  • @CarlWoll That's weird, I completely overlooked your post. Sorry! I agree about externalizing pos but I wanted self-contained functions for the sake of this post. – Mr.Wizard May 22 '17 at 16:29
4

OK, as a learning exercise in associations I worked through this, and ended up with a faster result than the @kglr one. Count me surprised!

RepeatedTiming[
  cr = {seg1 // Flatten, seg2 // Flatten} // Transpose // Counts; 
  sa = SparseArray[ KeySelect[cr, FreeQ[0]] // Normal] // Normal;
]

(* out[11]={0.153, Null} *)

@kglr's

AbsoluteTiming[counts = Normal[SparseArray[SparseArray[#, {50}] &@DeleteCases[
   Normal[Counts[Flatten[(1 - Unitize[seg1 - #]) seg2]]], 0 -> _] & /@ Range[51]]];]

(* out[11]={0.478285, Null} *)

The matrices match (minus the fact that I did not pad the sa matrix... the last row is all zeros)

sa == Counts // Most
(* true *)
Mr.Wizard
  • 271,378
  • 34
  • 587
  • 1,371
MikeY
  • 7,153
  • 18
  • 27
1

This approach works for the posted question as well as for cases when there are gaps in labels. What i mean by gaps is that the first or the second segmented mask may have a structure as follows: {1,2,3 ... 10,13,14,20} (11,12 as well as 16-19 are the missing labels).

overlapMatrix[seg1_,seg2_]:=Block[{keys1,keys2,mask,map,rules1,rules2},
keys1 = Keys@ComponentMeasurements[seg1, "Label"];
keys2 = Keys@ComponentMeasurements[seg2, "Label"];
mask= Unitize[seg1*seg2];
map = Normal@Counts@Thread[{SparseArray[mask*seg1]["NonzeroValues"],SparseArray[mask*seg2]["NonzeroValues"]}];
{rules1,rules2}= Dispatch@Thread[# -> Range@Length@#]&/@{keys1,keys2};
map[[All,1,1]]=map[[All,1,1]]/.rules1;
map[[All,1,2]]=map[[All,1,2]]/.rules2;
Normal@SparseArray[map,{Length@keys1,Length@keys2}]
];

Applying it over the above images

overlapMatrix[seg1,seg2]

(* {0.167852, Null} *)

Ali Hashmi
  • 8,950
  • 4
  • 22
  • 42
0

Transpose accepts a second parameter, which you can use to shuffle the values around a bit. I have used it to merge the value of the two arrays together in a list.

Normal[SparseArray[
   DeleteCases[Normal[
       Counts[Flatten[Transpose[{seg1, seg2}, {3, 2, 1}], {1, 2}]]],
       Rule[{0,0}|{0,_}|{_,0}, _]
   ], 
   {Max[seg1], Max[seg2]} 
]]