I am trying to understand NNMF (Non-Negative Matrix Factorization). This is not a built-in function in Mathematica, but there is a package that implements it, which is referred to in this post. The package is loaded by:

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/NonNegativeMatrixFactorization.m"]

The problem that NNMF tries to solve is this: given a matrix $X$, factor it as $W.H$, where $W$ and $H$ both have only non-negative entries.

But when I try to apply this using the package, I cannot figure out what is happening. First, I construct a matrix $x$. I build it randomly, but with low rank (rank 5):

xKer = RandomInteger[{0, 10}, {5, 5}];
xL = RandomInteger[{0, 10}, {50, 5}];
xR = RandomInteger[{0, 10}, {5, 100}];
x = xL.xKer.xR;
Dimensions[x]
MatrixRank[x]

So you can see $x$ is 50 by 100, but is of rank only 5. Applying the NNMF command from the package:

{w, h} = GDCLS[x, 5, "MaxSteps" -> 1000];
Dimensions[w]
Dimensions[h]

So we can see that $w.h$ has the same dimensions as $x$. But

Norm[w.h - x]

is very large, so $w.h$ is not a good approximation to $x$.

Thus my questions: why doesn't NNMF seem to work? Am I expecting the wrong thing?

bill s
  • Maybe x simply cannot be factored this way? Moreover, it is more realistic to consider a relative error measure. E.g., Norm[w.h - x, "Frobenius"]/Norm[x, "Frobenius"] returns 0.00326206, which is not that bad... With MaxSteps -> 10000, one can get down to 0.00075928 or so. – Henrik Schumacher Mar 06 '19 at 20:41
  • If you create x = xL.xR then it for sure can be expressed as w.h, and there is still significant error in the Norm. But maybe you are right, the error is small compared to the size of x. – bill s Mar 06 '19 at 20:48
  • @HenrikSchumacher beat me to it! (BTW, the automatic precision goal is 4.) – Anton Antonov Mar 06 '19 at 20:58
  • "This is not a built-in function in Mathematica, but there is a package that implements it [...]" -- see the implementation and documentation "NonNegativeMatrixFactorization" published 12 days ago at Wolfram Function Repository. – Anton Antonov Jan 01 '20 at 16:20

1 Answer

Thank you for using that package!

The stopping criterion is based on relative precision. Find the lines:

 ....
 normV = Norm[V, "Frobenius"]; diffNorm = 10 normV;
 If[ pgoal === Automatic, pgoal = 4 ];      
 While[nSteps < maxSteps && TrueQ[! NumberQ[pgoal] || NumberQ[pgoal] && (normV > 0) && diffNorm/normV > 10^(-pgoal)],
   nSteps++;
   ...

in the implementation code. Note the condition diffNorm/normV > 10^(-pgoal).
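
Written out as a formula, the loop condition in that excerpt reads roughly as follows. This is only a restatement of the visible `While` test; the symbols $n$, $n_{\max}$, $d$ and $p$ are shorthand introduced here for nSteps, maxSteps, diffNorm and pgoal, and how diffNorm is updated inside the loop is elided in the excerpt, so it is not spelled out.

```latex
$$
n < n_{\max}
\quad\text{and}\quad
\Bigl( p \text{ is not a number}
\;\;\text{or}\;\;
\bigl( \lVert V \rVert_F > 0 \;\text{ and }\; \frac{d}{\lVert V \rVert_F} > 10^{-p} \bigr) \Bigr)
$$
```

With the automatic setting $p = 4$, the iteration stops once $d / \lVert V \rVert_F \le 10^{-4}$ or the step limit is reached, whichever comes first. The quantity being tested is relative to $\lVert V \rVert_F$, so nothing in this test forces an absolute norm such as Norm[w.h - x] to be small. The initial value diffNorm = 10 normV makes the ratio equal to 10 at the start, so the loop is entered for any non-negative numeric precision goal when normV > 0.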

Here is an example based on the question's code:

SeedRandom[2343]
xKer = RandomInteger[{0, 10}, {5, 5}];
xL = RandomInteger[{0, 10}, {50, 5}];
xR = RandomInteger[{0, 10}, {5, 100}];
x = xL.xKer.xR;
Dimensions[x]
MatrixRank[x]

(* {50, 100} *)

(* 5 *)

Options[GDCLS]

(* {"MaxSteps" -> 200, "NonNegative" -> True, 
 "Epsilon" -> 1.*10^-9, "RegularizationParameter" -> 0.01, 
 PrecisionGoal -> Automatic, "PrintProfilingInfo" -> False} *)

AbsoluteTiming[
 {w, h} = GDCLS[x, 5, PrecisionGoal -> 3, "MaxSteps" -> 100000];
 {Dimensions[w], Dimensions[h]}
]

(* {19.759, {{50, 5}, {5, 100}}} *)

Norm[w.h - x]/Norm[x]

(* 0.000939317 *)
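
To see why the absolute norm in the question looks large even when the relative error is small, here is a minimal sketch that does not call the package at all. The matrix approx is a hypothetical stand-in for w.h (it is not produced by GDCLS): it is x with each entry perturbed by at most 0.1%.

```mathematica
SeedRandom[2343];
(* a rank-5 matrix built the same way as in the question *)
x = RandomInteger[{0, 10}, {50, 5}].RandomInteger[{0, 10}, {5, 5}].RandomInteger[{0, 10}, {5, 100}];

(* hypothetical stand-in for w.h: entrywise relative noise of at most 10^-3 *)
approx = x*(1 + 10^-3*RandomReal[{-1, 1}, Dimensions[x]]);

(* absolute and relative Frobenius errors *)
absErr = Norm[approx - x, "Frobenius"];
relErr = absErr/Norm[x, "Frobenius"];
{absErr, relErr}
```

By construction relErr cannot exceed 10^-3, while absErr scales with the size of the entries of x, which are large here because each entry is a sum of products of three integers between 0 and 10. An absolute error that looks big can therefore go together with a good relative fit.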
Anton Antonov