
I need to compute medoids (based on SquaredEuclideanDistance) for large lists of k-vectors (k = 14). My implementations so far are impractically slow.

Here is a simple implementation, where the aim is clarity of the code rather than performance.[1] The function medoidSimple takes one argument, cluster (a matrix of reals), and returns a pair consisting of cluster's medoid and its index in cluster:

medoidSimple[cluster_List/;MatrixQ[cluster, (NumberQ[#] && Im[#] == 0)&]] :=
Module[
  {
     totalDistance
   , distances
   , indexOfMin
  }
  , totalDistance =
        Function[element,
                 Total[SquaredEuclideanDistance[element, #]& /@ cluster]]
  ; distances = totalDistance /@ cluster  (* BOTTLENECK *)
  ; indexOfMin = First[Ordering[distances, 1]]
  ; {cluster[[indexOfMin]], indexOfMin}
];

The bottleneck is the $O(N^2)$ computation of all pairwise distances.
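(As an editorial aside, not part of the original post: for squared Euclidean distance specifically, the per-point totals can be computed without forming any pairwise distances at all, using the expansion $\sum_j \|x_i-x_j\|^2 = n\|x_i\|^2 - 2\,x_i\cdot s + \sum_j \|x_j\|^2$, where $s=\sum_j x_j$. A sketch, with medoidAlgebraic a made-up name:)

```mathematica
(* Sketch: O(n k) per-point distance totals via the algebraic identity,
   instead of the O(n^2 k) pairwise computation. *)
medoidAlgebraic[cluster_] :=
  Module[{n = Length[cluster], norms, s, distances, indexOfMin},
   norms = Total[cluster^2, {2}];   (* |x_i|^2 for every row *)
   s = Total[cluster];              (* vector sum of all rows *)
   distances = n norms - 2 cluster.s + Total[norms];
   indexOfMin = First[Ordering[distances, 1]];
   {cluster[[indexOfMin]], indexOfMin}]
```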

I tried to optimize the code above by aborting the computation of the pairwise sums for any element once the cumulative total for that element exceeded the best minimum value found so far. To my surprise, this "optimization" made my code much slower. I give the code for this version (medoid) below, but here are the timing results:

SeedRandom[0];
cluster = RandomReal[{0, 1}, {5000, 14}];

Timing[medoidSimple[cluster];]
(* {34.428, Null} *)

Timing[medoid[cluster];]
(* {69.628, Null} *)

Next I tried compilation, but all my attempts failed for one reason or another. For example:

compiledTotalDistance = Compile[
    {{element, _Real, 1}, {cluster, _Real, 2}}
  , Total[SquaredEuclideanDistance[element, #]& /@ cluster]
];

compiledTotalDistance[First[cluster], cluster];

[image: error messages produced by the evaluation above]

I've run out of ideas. Any suggestions would be appreciated.


FWIW, here's the code for my failed optimization.

The main function (medoid) uses the helper function totalDistance. The latter takes as input a point (base), a list of points (peers), and an upper bound (limit). The function returns Infinity if the sum of all the pairwise SquaredEuclideanDistance values between base and the points in peers exceeds limit; otherwise it returns this sum. (The computation of the sum is aborted as soon as its value exceeds limit.)

totalDistance[base_, peers_, limit_] := Module[{accumulator}
  , accumulator[partialSum_, element_] := Module[{nextSum}
      , nextSum = partialSum + SquaredEuclideanDistance[base, element]
      ; If[nextSum > limit, Throw[Infinity], nextSum]
    ]
  ; Catch[Fold[accumulator, 0, peers]]
];

medoid[cluster_] := Module[
  {
     accumulator
   , best
   , medoid
   , indexOfMedoid
  }
  , accumulator[bestSoFar_, index_] := Module[
      {
         limit = First[bestSoFar]
       , element = cluster[[index]]
       , candidate
      }
      , candidate = totalDistance[element, cluster, limit]
      ; If[candidate < limit, {candidate, index}, bestSoFar]
    ]
  ; best = Fold[accumulator, {Infinity, Null}, Range[Length[cluster]]]
  ; indexOfMedoid = best[[2]]
  ; medoid = cluster[[indexOfMedoid]]
  ; {medoid, indexOfMedoid}
];

UPDATE

Summary of (absolute) timings:

medoidSimple     -> 27.8218
medoidMeyer1     ->  3.66165
medoidMeyer2     ->  1.99197
medoidCiao       ->  0.67927
medoidBlochwave1 ->  0.697672
medoidBlochwave2 ->  3.94929
medoidBlochwave3 ->  0.609076

The definitions corresponding to the names above follow.

medoidSimple[cluster_List /; MatrixQ[cluster, (NumberQ[#] && Im[#] == 0) &]] :=
  Module[{totalDistance, distances, indexOfMedoid},
   totalDistance =
    Function[element,
     Total[SquaredEuclideanDistance[element, #] & /@ cluster]];
   distances = totalDistance /@ cluster;
   indexOfMedoid = First[Ordering[distances, 1]];
   {cluster[[indexOfMedoid]], indexOfMedoid}];

(* distMatCompiled is defined in Marius Ladegård Meyer's answer below *)
medoidMeyer1[cluster_] :=
  Block[{distances, indexOfMin},
   distances = Total@distMatCompiled[cluster];
   indexOfMin = First[Ordering[distances, 1]];
   {cluster[[indexOfMin]], indexOfMin}];

medoidMeyer2 = Compile[{{cluster, _Real, 2}},
  Block[{len = Length[cluster], bestInd = 1, bestVal, partsum, j},
   bestVal = 
    Total[Table[
      Function[x, x.x][cluster[[1]] - cluster[[i]]], {i, len}]];
   Do[
    partsum = 0.;
    j = 1;
    While[j <= len && partsum < bestVal,
     partsum += Function[x, x.x][cluster[[i]] - cluster[[j]]];
     j++
     ];
    If[j == len + 1 && partsum < bestVal, bestVal = partsum; 
     bestInd = i]
    , {i, 2, len}
    ];
   bestInd
   ]
  , CompilationTarget -> "C"
  ];

medoidCiao =
  With[{m = #,
     d = Tr /@ DistanceMatrix[#, DistanceFunction -> SquaredEuclideanDistance]},
    {First@m[[#]], First@#} &@Pick[Range@Length@d, d, Min@d]] &;

medoidBlochwave1[cluster_] := 
  Module[{distances, indexOfMin}, 
   distances = 
    Tr /@ DistanceMatrix[cluster, 
      DistanceFunction -> SquaredEuclideanDistance];
   indexOfMin = First[Ordering[distances, 1]];
   {cluster[[indexOfMin]], indexOfMin}];

medoidBlochwave2[cluster_] := 
  Module[{distances, indexOfMin}, 
   distances = 
    Total@With[{tr = Transpose[cluster]}, 
      Function[point, Total@Abs[(point - tr)^2]] /@ cluster];
   indexOfMin = First[Ordering[distances, 1]];
   {cluster[[indexOfMin]], indexOfMin}];


myDistMatrix = Compile[{{point, _Real, 1}, {tr, _Real, 2}}, 
                 Total[(point - tr)^2], 
                 RuntimeOptions -> "Speed", 
                 RuntimeAttributes -> {Listable}, 
                 Parallelization -> True];
medoidBlochwave3[cluster_] := 
  Module[{distances, indexOfMin}, 
   distances = Tr /@ myDistMatrix[cluster, Transpose[cluster]];
   indexOfMin = First[Ordering[distances, 1]];
   {cluster[[indexOfMin]], indexOfMin}];

TableForm[
 (# -> First[AbsoluteTiming[ToExpression[#][cluster];]]) & /@
  {"medoidSimple", "medoidMeyer1", "medoidMeyer2", "medoidCiao",
   "medoidBlochwave1", "medoidBlochwave2", "medoidBlochwave3"}]

[1] Well, not exactly. The implementation of medoidSimple does do one thing for the sake of performance: it uses the sum of all pairwise distances, rather than their mean, as the statistic to minimize. The two are equivalent here, since every candidate's set of pairwise distances has the same number of elements.

kjo

3 Answers


If you're on 10.3+, this should be faster (it's two orders of magnitude faster than your first example on my loungebook):

medoid = With[{m = #,
     d = Tr /@ DistanceMatrix[#, DistanceFunction -> SquaredEuclideanDistance]},
    {First@m[[#]], First@#} &@Pick[Range@Length@d, d, Min@d]] &;
ciao

Update: You can go even faster with Compile, and exploit the Listable and Parallelization attributes to great effect, if you have a multi-core machine:

SeedRandom[0];
cluster = RandomReal[{0, 1}, {5000, 14}];

myDistMatrix = Compile[{{point, _Real, 1}, {tr, _Real, 2}}, 
                 Total[(point - tr)^2], 
                 RuntimeOptions -> "Speed", 
                 RuntimeAttributes -> {Listable}, 
                 Parallelization -> True,
                 CompilationTarget -> "C"];
medoidCompile[cluster_] :=
  Module[{distances, indexOfMin},
   distances = Tr /@ myDistMatrix[cluster, Transpose[cluster]];
   indexOfMin = First[Ordering[distances]];
   {cluster[[indexOfMin]], indexOfMin}];

AbsoluteTiming[medoidCompile[cluster];]
(* 0.35 seconds *)
(* Original: 29.05 seconds *)

If you have Mathematica 10.3, you can use the experimental function DistanceMatrix for a speed-up of 50x:

medoidFast[cluster_] :=
  Module[{distances, indexOfMin},
   distances = Tr /@ DistanceMatrix[cluster, DistanceFunction -> SquaredEuclideanDistance];
   indexOfMin = First[Ordering[distances]];
   {cluster[[indexOfMin]], indexOfMin}];

AbsoluteTiming[medoidFast[cluster];]
(* 0.521 seconds *)
(* Original: 29.05 seconds *)

(* Also, for fun: *)
medoidFast[cluster]; // RepeatedTiming (* 0.522 seconds *)
medoidCiao[cluster]; // RepeatedTiming (* 0.522 seconds *)

For versions prior to 10.3, or if you don't want to use an experimental function, then the following solution isn't bad. Taking inspiration from Leonid's answer here, you can achieve a speed-up of 12x:

medoid[cluster_] := Module[{distances, indexOfMin},
   distances = Tr /@ With[{tr = Transpose[cluster]},
      Function[point, Total[(point - tr)^2]] /@ cluster];
   indexOfMin = First[Ordering[distances, 1]];
   {cluster[[indexOfMin]], indexOfMin}];

AbsoluteTiming[medoid[cluster];]
(* 2.44 seconds *)
(* Original: 29.05 seconds *)
dr.blochwave

This is much faster:

distMatCompiled = Compile[{{cluster, _Real, 2}},
  Outer[Function[diff, diff.diff][#1 - #2] &, cluster, cluster, 1, 1]
  , CompilationTarget -> "C"
  ]

medoidCompiled[cluster_] := Block[{distances, indexOfMin},
  distances = Total@distMatCompiled[cluster];
  indexOfMin = First[Ordering[distances, 1]];
  {cluster[[indexOfMin]], indexOfMin}
  ]

On my machine:

SeedRandom[0];
cluster = RandomReal[{0, 1}, {5000, 14}];

AbsoluteTiming[m1 = medoidSimple[cluster]]
(* 48.116032 *)

AbsoluteTiming[m2 = medoidCompiled[cluster]]
(* 6.779268 *)

m1 == m2
(* True *)

The reason you couldn't compile efficiently in your example is that SquaredEuclideanDistance is not in the list of compilable functions.
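(Editor's note, a diagnostic sketch: you can confirm this with CompilePrint from the standard CompiledFunctionTools` package. Any MainEvaluate instruction in the printout marks a callback into the ordinary evaluator, which is exactly what makes a "compiled" function slow.)

```mathematica
Needs["CompiledFunctionTools`"]

cf = Compile[{{element, _Real, 1}, {cluster, _Real, 2}},
   Total[SquaredEuclideanDistance[element, #] & /@ cluster]];

(* Lines containing MainEvaluate mark calls back into the ordinary
   evaluator; a fully compiled body has none. *)
CompilePrint[cf]
```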

UPDATE:

Just for fun I wanted to implement your optimization idea:

medoidCompiled2 = Compile[{{cluster, _Real, 2}},
  Block[{len = Length[cluster], bestInd = 1, bestVal, partsum, j},
   bestVal = 
    Total[Table[
      Function[x, x.x][cluster[[1]] - cluster[[i]]], {i, len}]];
   Do[
    partsum = 0.;
    j = 1;
    While[j <= len && partsum < bestVal,
     partsum += Function[x, x.x][cluster[[i]] - cluster[[j]]];
     j++
     ];
    If[j == len + 1 && partsum < bestVal, bestVal = partsum; 
     bestInd = i]
    , {i, 2, len}
    ];
   bestInd
   ]
  , CompilationTarget -> "C"
  ]

It performs a bit better than my first attempt:

AbsoluteTiming[m3 = medoidCompiled2[cluster]]
(* 4.286030 *)

Last[m1] == m3
(* True *)

Of course this cannot beat the insane timings ciao and blochwave have come up with ;)

Marius Ladegård Meyer