21

The Documentation Center seems to explain only how to use these functions, and only very briefly. I know Mathematica is not open source, so we cannot expect to see the code of the functions, but are there any materials that describe the internal algorithms in more detail? For example, how does Mathematica decide whether to use a logistic model or a Markov model for classification or prediction?

J. M.'s missing motivation
m00nlight

2 Answers

24

If you want a description of the method used by a given ClassifierFunction, you can do:

ClassifierInformation[myclassifier, "MethodDescription"]
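As a minimal sketch of how that call might be used end to end (the toy training data below is made up for illustration, and the method Classify picks for it will depend on the version):

```wolfram
(* Hypothetical toy data: numbers labelled "small" or "large" *)
trainingData = {1 -> "small", 2 -> "small", 3 -> "small",
   8 -> "large", 9 -> "large", 10 -> "large"};

(* Let Classify choose the method automatically *)
myclassifier = Classify[trainingData];

(* Short name of the chosen method, then its longer description *)
ClassifierInformation[myclassifier, "Method"]
ClassifierInformation[myclassifier, "MethodDescription"]
```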

Also, the methods used are quite classic, so you can easily find documentation on the web.

If you want to know why Classify uses a given model, there is a simple answer: Classify tries to find the model that has the highest likelihood on unseen data (that is, on test sets). In a nutshell, Classify first selects possible candidates using heuristics that depend on the characteristics of the data. The candidate models then compete against each other using cross-validation techniques, and the best model is selected. There are subtleties in the automation, though (for speed reasons, not every model gets all the data, etc.), and we intend to make it smarter in the future, which is why we did not give a precise description in the documentation.
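To get a feel for this kind of competition, one can imitate it by hand. The following is only a rough sketch and not Classify's actual procedure: the synthetic data, the candidate list, and the single hold-out split (used here in place of real cross-validation) are all assumptions made for illustration.

```wolfram
(* Hypothetical synthetic data: two noisy one-dimensional clusters *)
SeedRandom[1];
data = Join[
   Thread[RandomVariate[NormalDistribution[0, 1], 100] -> "a"],
   Thread[RandomVariate[NormalDistribution[2, 1], 100] -> "b"]];

(* Shuffle, then hold out 50 examples as unseen data *)
{train, test} = TakeDrop[RandomSample[data], 150];

(* Assumed candidate list; the candidates Classify really considers are not documented *)
candidates = {"LogisticRegression", "NearestNeighbors", "RandomForest", "NaiveBayes"};

(* Train each candidate and score it by log-likelihood on the held-out set *)
scores = Association @ Table[
    m -> ClassifierMeasurements[Classify[train, Method -> m], test, "LogLikelihood"],
    {m, candidates}];

(* Method with the highest held-out log-likelihood *)
Keys[TakeLargest[scores, 1]]
```

The winner of this manual comparison need not match what Classify[train] selects on its own, since the automatic procedure uses its own heuristics and data subsampling.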

rcollyer
Etienne Bernard
-2

Simply run DownValues[Classify]

EDIT: Strangely, DownValues does not work. Anyway, I have a dump of Classify (version 10.4) made using this tool. You can find it here.

RE-EDIT: the link was broken, here's a working one

JavierG
  • 1
    Hmm, I get {} -- Means nothing to me. Mean anything to you?? – Michael E2 Sep 06 '16 at 02:33
  • 2
    I don't think this is what he was looking for... – ktm Sep 06 '16 at 03:53
  • 1
    @MichaelE2 I just edited my post – JavierG Sep 06 '16 at 09:55
  • @Piruzzolo Spelunk[Classify] still returns output, although the algorithms must be in sub-functions, not just the down-values of Classify. Maybe saying more about how you generated the dump would be helpful. My lame attempt to download your file yielded an error. -- BTW, the DV is not mine. I usually only downvote non-spam answers if they're dangerously wrong/misleading and the author doesn't respond to feedback. I thought maybe you'd forgotten something like Unprotect@Classify; ClearAttributes[Classify, ReadProtected]; Protect@Classify;. This still doesn't get the internal algorithms. – Michael E2 Sep 06 '16 at 11:38
  • @MichaelE2 fixed link – JavierG Sep 06 '16 at 12:16
  • @Piruzzolo I should have been clearer -- I get a file (from both links), but StuffItExpander (on Mac OSX) complains it's damaged. (It's a reason link-only answers tend to be discouraged, what should be perfectly portable isn't always.) -- I tried updating StuffIt but now it itself seems to be broken...Sorry – Michael E2 Sep 06 '16 at 12:54
  • You can try 7Zip on desktop – JavierG Sep 06 '16 at 13:09
  • @MichaelE2 I use The Unarchiver. The free version can only extract, and it won't show the contents of the archive before extractions (an annoying misfeature of many OS X archive extractors IMO). But it has proven dependable so far which I cannot say for the alternatives I tried. – Szabolcs Sep 07 '16 at 09:58
  • @Szabolcs Thanks. It worked. – Michael E2 Sep 07 '16 at 10:27
  • Anyway, I remember I used Spelunk... – JavierG Sep 07 '16 at 10:53