37

In version 10.2 there is a new experimental function: FindFormula[].

I suspect that a genetic programming algorithm (symbolic regression) is behind this new feature, but I can't find any references.

Question

  • What is behind this new function?
xyz
vonjd
  • 1
    It was inevitable that someone would come along and ask this… – J. M.'s missing motivation Jul 17 '15 at 17:04
  • 6
    FWIW I used this until it became paid – Dr. belisarius Jul 17 '15 at 17:07
  • @Guesswhoitis.: The thing is that I have been working in this area for quite some time and I was quite delighted when I saw this new feature. – vonjd Jul 17 '15 at 17:07
  • @belisarius: Yeah, this is quite a good program! – vonjd Jul 17 '15 at 17:08
  • Another question is whether this can be used in WolframAlpha (Pro)? I didn't find it there. – vonjd Jul 17 '15 at 17:18
  • 2
    Quite unlikely; I haven't ever seen them add functions to Alpha at the same time as in a new version of Mathematica. – J. M.'s missing motivation Jul 17 '15 at 17:28
  • 12
    I think it builds ill-formed questions from a cursory scan, posts them here, scrapes any answers, and returns the result.... :-) – ciao Jul 17 '15 at 19:12
  • @belisarius Let me guess... now Eureqa costs an arm and a leg in individual use? It's never a great sign when you have trouble finding pricing from a seemingly simple web site. – kirma Jul 18 '15 at 05:39
  • 1
    @kirma It was free for many years while they were developing the product. It is (was) an incredible piece of software. I threw in clouds of (almost) nonsensical points and received back unbelievable correlations. Now they aim high and sell a mixture of product and cloud (sounds familiar?) to Fortune 500 companies and big labs. A pity. No "individual" license the last time I checked – Dr. belisarius Jul 18 '15 at 05:56
  • 1
    @belisarius: I even helped them in the start-up phase and received an academic license. This has expired now and I emailed them to renew it - no answer so far. I am really disappointed. – vonjd Jul 18 '15 at 06:40
  • 2
    @vonjd Hope you'll get it! It shouldn't be that difficult to be generous – Dr. belisarius Jul 18 '15 at 06:43
  • This related product does a search over 3665 built-in equations and returns the best – Gustavo Delfino Sep 11 '15 at 19:39

3 Answers

19

The experimental function FindFormula[] currently uses a combination of methods: it combines nonlinear regression with Markov chain Monte Carlo methods (e.g. the Metropolis–Hastings algorithm). In the future (possibly in V10.3) there will be an option allowing the user to choose which method to use.

Giorgia
  • 1
    Welcome to Mathematica.SE! – Michael E2 Jul 22 '15 at 18:25
  • 3
    Thank you. How do you know that? Are you in the development team? – vonjd Jul 22 '15 at 18:36
  • 1
    Just found this post. Interesting, Monte Carlo seems to be involved. I have a set of data (not very large, 121 data points) which delivers distinct results every time FindFormula is invoked. I first thought of a bug, but now I see it's a feature ;-) – mgamer Apr 27 '17 at 16:29
9

I doubt that this is very robust. Consider a simple change in the DE example in the Documentation:

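(* Solve y' = y Cos[x], y(0) = 2 numerically *)
sol = y /. NDSolve[{y'[x] == y[x] Cos[x], y[0] == 2}, y, {x, -5, 300}][[1]];
(* Sample points from -5/9 to 600/9 *)
times = N[Range[-5, 600]/9];
(* Add uniform noise in [0, 0.05] to the sampled solution *)
data = Transpose[{times, sol[times] + RandomReal[0.05, Length[times]]}];
lp = ListPlot[data, PlotRange -> All]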

Now

FindFormula[data, x, 1, TargetFunctions -> {Exp, Sin, Cos}]

thinks the best solution is 2.27414 Sin[x] + 2.5479, whereas a much better solution, clearly compatible with the selected TargetFunctions, is 2 Exp[Sin[x]].
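To see why 2 Exp[Sin[x]] is the natural target: the differential equation is separable, so the exact solution follows in a few lines. The noise added to the data, uniform on [0, 0.05], only shifts the values up by about 0.025 on average.

```latex
\begin{align*}
y'(x) &= y(x)\cos x \\
\frac{dy}{y} &= \cos x \, dx \\
\ln y &= \sin x + C \\
y(x) &= A\,e^{\sin x}, \qquad y(0) = A = 2 \\
y(x) &= 2\,e^{\sin x}
\end{align*}
```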

TheDoctor
  • On the other hand, is FindFormula[] even smart enough to consider composing TargetFunctions? – J. M.'s missing motivation Jul 27 '15 at 13:47
  • 5
    You don't need to change the DE to get this. The behaviour is the same with the original. Every time I run it I get a different result, and that result is sometimes $e^{\sin{x}}$. The same with the modified one: put it in a Table[..., {20}] and very likely at least one of the results will be $2 e^{\sin x}$. (But requesting 20 functions within the FindFormula doesn't work nearly as well.) – Szabolcs Jul 28 '15 at 07:12
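The repeated-run suggestion in the last comment can be sketched as follows. This is only an illustration, assuming `data` from the answer above is still defined and `x` has no value; `results` and `sse` are names introduced here, not part of FindFormula. Since the results are random, the output will differ between runs.

```wolfram
(* Call FindFormula 20 times on the same data; each call may return a different formula. *)
(* Flatten covers the case where each call returns a one-element list. *)
results = Flatten[
   Table[FindFormula[data, x, 1, TargetFunctions -> {Exp, Sin, Cos}], {20}]];

(* Hypothetical helper: sum of squared residuals of a candidate formula against the data *)
sse[f_] := Total[(data[[All, 2]] - (f /. x -> data[[All, 1]]))^2];

(* Distinct candidates, best fit first *)
SortBy[DeleteDuplicates[results], sse]
```

Ranking by squared residuals is just one possible criterion; it makes it easy to check whether a candidate close to 2 Exp[Sin[x]] shows up among the runs.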
8

The following reveals the definitions:

<< GeneralUtilities`
PrintDefinitions@FindFormula

As usual one can click the symbols to find definitions of functions "further down". It should also be noted that FindFormula is listed in the Machine Learning guide, which corresponds to symbol names like SymbolicMachineLearning`PackageScope`ImputArgumentsTestFindFormula shown further down by PrintDefinitions.

Jacob Akkerboom