Bug introduced in 8.0 or earlier and fixed in 11.2.0
CorrelationDistance[{-53.0, 4.3}, {-23.0, 5.4}] is a negative number. It is close to zero, to be sure, but negative. Shouldn't all distance functions guarantee a non-negative result?
This is again a situation where one needs to use a stable formula for computations.
Let's look at the result of CorrelationDistance[] again for reference:
v1 = {-53.0, 4.3}; v2 = {-23.0, 5.4};
CorrelationDistance[v1, v2] // InputForm
-2.220446049250313*^-16
Note that there is a simple relationship between CorrelationDistance[] and CosineDistance[]:
CosineDistance[v1 - Mean[v1], v2 - Mean[v2]] // InputForm
-2.220446049250313*^-16
It's the same result. Let's look at the result of using explicit formulae:
c1 = v1 - Mean[v1]; c2 = v2 - Mean[v2];
1 - c1.c2/(Norm[c1] Norm[c2])
-2.220446049250313*^-16
1 - Normalize[v1 - Mean[v1]].Normalize[v2 - Mean[v2]]
0.
The second explicit formula gives the correct result.
Still, having to subtract two nearly equal quantities should give one pause, since that is where cancellation error creeps in. Here is a stable algorithm, derived from work by Velvel Kahan:
cosDistance[v1_?VectorQ, v2_?VectorQ] :=
Module[{n1 = Normalize[v1], n2 = Normalize[v2], y},
y = Norm[n1 - n2]^2; 2 y/(Norm[n1 + n2]^2 + y)]
correlationDistance[v1_?VectorQ, v2_?VectorQ] :=
cosDistance[v1 - Mean[v1], v2 - Mean[v2]]
and thus
correlationDistance[v1, v2]
0.
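To see why cosDistance[] agrees with the usual definition and cannot go negative, here is a short derivation. It is a sketch in exact arithmetic, with $n_1$, $n_2$ the normalized vectors and $c = n_1 \cdot n_2$ (the symbol $c$ is introduced here only for illustration):

$$
\begin{aligned}
y &= \lVert n_1 - n_2 \rVert^2 = 2 - 2c, \\
\lVert n_1 + n_2 \rVert^2 &= 2 + 2c, \\
\frac{2y}{\lVert n_1 + n_2 \rVert^2 + y} &= \frac{2(2 - 2c)}{4} = 1 - c .
\end{aligned}
$$

The numerator is a squared norm and the denominator is a sum of squared norms, so the quotient is non-negative by construction, even in floating point. The small quantity $y$ comes from the componentwise difference $n_1 - n_2$ rather than from subtracting a number close to 1 from 1.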
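The expected result is exactly zero. After centering, any two-element vector is a multiple of {-1, 1}, so two such vectors whose elements are ordered the same way have correlation exactly 1: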
Simplify[CorrelationDistance[{x, y} , {p, q}],
Assumptions -> {Element[{x, y, p, q}, Reals], x < y, p < q}]
0
The result, -2.22045*10^-16, is zero to within machine precision. Use Chop if you like.
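A minimal sketch of the Chop suggestion, reusing the vectors from the question. The Max variant is an assumption added for comparison and is not part of the answer: it clamps only negative round-off, whereas Chop with its default tolerance of 10^-10 also zeroes tiny positive values.

```mathematica
v1 = {-53.0, 4.3}; v2 = {-23.0, 5.4};

(* Chop replaces approximate numbers smaller than 10^-10 in magnitude by exact 0 *)
Chop[CorrelationDistance[v1, v2]]

(* assumed alternative: clamp at zero so only negative round-off is removed *)
Max[0., CorrelationDistance[v1, v2]]
```

Either form may be enough when the value is only inspected or compared. Code that needs a distance that is non-negative by construction would still want a stable formula.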
FindClusters) that expect a strictly non-negative value though. For this reason I would consider it a bug, or at least worth reporting.
– Szabolcs
Nov 07 '17 at 16:23
Complex result with complex input. The imaginary part is 0., but the head is Complex, which may again cause trouble.
– Szabolcs
Nov 07 '17 at 16:26
0.0) would likely take less time than a C-language function call (!), not to mention just initiating a Mathematica evaluation.
– Szabolcs
Nov 07 '17 at 16:58
CorrelationDistance[{-53.0, 4.3},{-23.0, 5.4}] Does that help?
– Scott Guthery
Nov 08 '17 at 00:05
CorrelationDistance @@ Rationalize[{{-53.0, 4.3}, {-23.0, 5.4}}, 0]
– Bob Hanlon
Nov 08 '17 at 13:06