I have been working with a dataset of predictor functions, mainly so I can gain more insight into what the functions are doing and how they behave.
Where I am struggling is using PredictorMeasurements, ideally to add its result as a new column in the dataset. The end result I am looking for is:
<|pFunction -> somePFunction, pMeasure -> PredictorMeasurements[somePFunction, someTestData]|>
This is where I have got to so far. What am I missing?
header = {"Input", "Output"};
trainingset = {{1, 2}, {3, 4.5}, {5, 6}, {7, 8.5}};
testset = {{1.5, 2}, {4, 5}, {6, 5.5}};
methods = {"NearestNeighbors", "LinearRegression", "NeuralNetwork", "RandomForest"};
samps = {2, 4};
headers = {"sampsize", "method", "pFunction"};
(* Create Datasets of test and training sets *)
trainingsetDS =
  Dataset[Flatten[AssociationThread[header -> #] & /@ trainingset]];
testsetDS =
  Dataset[Flatten[AssociationThread[header -> #] & /@ testset]];
(* Create a list of associations of the form <|"sampsize" -> n, "method" -> m|> *)
predictorInputs =
  Flatten[AssociationThread[headers[[1 ;; 2]] -> #] & /@ Tuples[{samps, methods}]];
(* Create a list of predictor functions *)
p =
  Predict[
      RandomSample[trainingsetDS, #"sampsize"] -> "Output",
      Method -> #"method"] & /@ predictorInputs;
(* Create a dataset of predictor functions with some useful keys for querying *)
pDS = Dataset[
  Flatten[AssociationThread[{"samplesize", "method", "pFunction"} -> #] & /@
    Partition[Riffle[Flatten[Tuples[{samps, methods}]], p, 3], 3]]];
(* I can select results based on the predictor function information rather
   than using the keys! *)
result =
  pDS[
    Select[
      PredictorInformation[#pFunction, "Method"] == "NearestNeighbors" ||
        PredictorInformation[#pFunction, "ExampleNumber"] == 2 &],
    {"samplesize", "pFunction"}]
(* I've figured out how to get a predictor measurement for one result, using the dataset key "Output" *)
PredictorMeasurements[result[[1, 2]], testsetDS -> "Output", "StandardDeviation"]
Ideally, I would like to be able to filter the dataset of functions on one or more properties of PredictorMeasurements, just as I can with PredictorInformation.
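To make the goal concrete, here is a minimal sketch of the shape I am after. It rests on several assumptions: it uses a cut-down, rule-based copy of the data rather than the datasets above, the names train, test, rows, withSD and the key "pSD" are made up for illustration, and it stores a single numeric property instead of the full measurements object. I do not know whether this is the idiomatic approach.

```mathematica
(* Hypothetical cut-down data, given as rules instead of Datasets *)
train = {1 -> 2, 3 -> 4.5, 5 -> 6, 7 -> 8.5};
test = {1.5 -> 2, 4 -> 5, 6 -> 5.5};

(* One association per predictor, keyed by method *)
rows = Table[
   <|"method" -> m, "pFunction" -> Predict[train, Method -> m]|>,
   {m, {"LinearRegression", "NearestNeighbors"}}];

(* Append one measurement per row; "pSD" is an illustrative key name *)
withSD = Map[
   Append[#,
     "pSD" -> PredictorMeasurements[#pFunction, test, "StandardDeviation"]] &,
   rows];

(* Filter on the measurement; the threshold 1 is arbitrary *)
Dataset[withSD][Select[#pSD < 1 &], {"method", "pSD"}]
```

Presumably the same Append could be applied row by row to pDS itself, using testsetDS -> "Output" as the test set, but that is the step I have not managed to get working.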