Does anyone know the asymptotic (big-O) time and space complexities of the model types supported by Classify[] and Predict[], for both training and evaluation?
Here is a list of the supported model types:
models = {"LogisticRegression", "Markov", "RandomForest",
"SupportVectorMachine", "NearestNeighbors", "NeuralNetwork",
"NaiveBayes"};
I tried using BenchmarkPlot[], but it is currently broken:
trainingData = ExampleData[{"MachineLearning", "MNIST"}, "TrainingData"];
Needs["GeneralUtilities`"]; Clear[g, f];
g[n_] := Classify[RandomSample[trainingData, n], Method -> "RandomForest"]
f[c_] := ClassifierInformation[c, "TrainingTime"]
BenchmarkPlot[f, g, {10, 100, 200, 400, 800, 1600, 3200, 6400}, "IncludeFits" -> True]
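A rough way to get timings without BenchmarkPlot is to time training directly. The sketch below assumes trainingData is the MNIST list loaded above; the size list and the names sizes and times are illustrative choices, not part of the original attempt. AbsoluteTiming measures wall-clock time for the whole Classify call, so it may differ from the "TrainingTime" property.

```wolfram
(* Assumption: trainingData is the list of examples loaded above *)
sizes = {100, 200, 400, 800, 1600};

(* Wall-clock seconds spent in Classify for each training-set size *)
times = Table[
   First @ AbsoluteTiming[
     Classify[RandomSample[trainingData, n], Method -> "RandomForest"]],
   {n, sizes}];

(* On log-log axes, a power law t ~ n^k appears as a straight line of slope k *)
ListLogLogPlot[Transpose[{sizes, times}], Joined -> True,
  AxesLabel -> {"training examples", "seconds"}]
```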
Update:
Although BenchmarkPlot doesn't always work, it does work on some inputs:
<<GeneralUtilities`
trainingData = ExampleData[{"MachineLearning", "MNIST"}, "TrainingData"];
plotComplexity[m_, trainingData_] := Module[{n, c, time, space, tdata, sdata},
  (* training-set sizes: 100, 300, ..., 4900 *)
  n = Table[100*k, {k, 1, 50, 2}];
  (* train one classifier per size with method m *)
  c = Classify[RandomSample[trainingData, #], Method -> m] & /@ n;
  time = QuantityMagnitude[ClassifierInformation[#, "TrainingTime"]] & /@ c;
  (* ByteCount of the trained classifier, used as a proxy for space *)
  space = ByteCount /@ c;
  tdata = Thread[{n, time}]; sdata = Thread[{n, space}];
  TextGrid @ {{BenchmarkPlot[sdata, "IncludeFits" -> True, PlotLabel -> m <> " Training Space"],
     BenchmarkPlot[tdata, "IncludeFits" -> True, PlotLabel -> m <> " Training Time"]}}
]
Here's what the above code gives for numeric features:
trainingData = ExampleData[{"MachineLearning", "UCILetter"}, "TrainingData"];
models = {"LogisticRegression", "Markov", "RandomForest",
"SupportVectorMachine", "NearestNeighbors", "NeuralNetwork", "NaiveBayes"};
Column[plotComplexity[#, trainingData] & /@ models]
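To turn {size, measurement} pairs like tdata or sdata into a single number, one option is a linear fit on log-log data. This is only a sketch: scalingExponent is a hypothetical helper, it assumes the measurements follow a power law c n^k, and it cannot tell n from n log n.

```wolfram
(* Hypothetical helper: estimate k in y ~ c n^k from a list of {n, y} pairs *)
scalingExponent[data_] := Module[{logData, fit, x},
  logData = Log[N[data]];           (* Log is Listable, so this gives {Log[n], Log[y]} pairs *)
  fit = Fit[logData, {1, x}, x];    (* straight line a + k x in log-log space *)
  Coefficient[fit, x]               (* the slope k is the estimated exponent *)
]

(* Sanity check on synthetic quadratic data; the result should be close to 2 *)
scalingExponent[Table[{n, 3. n^2}, {n, 100, 1000, 100}]]
```

With small training sets, fixed overhead can dominate the timings, so the estimate is more meaningful when restricted to the larger sizes.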


BenchmarkPlot in the GeneralUtilities package may be helpful here. – Sjoerd C. de Vries Feb 01 '16 at 18:50
BenchmarkPlot is broken. – Karsten7 Feb 02 '16 at 06:13