
I'm trying to reproduce the result given by a Predict function with a Neural Network as the chosen method.

So, I'm training from this set:

trainingset = {1 -> 2, 2 -> 3, 3 -> 6, 4 -> 8};

and train a simple NN with just one hidden layer containing a single linear neuron:

p = Predict[trainingset, Method -> {"NeuralNetwork", "HiddenLayers" -> {{1, "Linear"}}}]

Using Options, I can access the resulting weights:

Normal[Options[p][[1]]][[7]]

"Models" -> {<|"Method" -> "NeuralNetwork", "NeuralNetwork" -> 
<|"NeuronTypes" -> {"Linear", "Linear"}, "CostFunction" -> "SquaredCost", 
 "NumberOfNodes" -> {1, 1, 1}, "L1Regularization" -> 0, "L2Regularization" -> 0.1, 
 "DropOut" -> False, "Weights" -> {{{-0.9752752696739045}}, 
   {{-0.9752752696739043}}}, "Biases" -> {{-5.738324013344273*^-17}, 
   {-1.5516692498423558*^-17}}, "TrainingCostHistory" -> 
  {1.8283207510222368, 0.4788249850114622, 0.04910194009988826, 
   0.042738345120368906, 0.041599875787272395, 0.0410427527972165, 
   0.03861566085385098, 0.03761310711018569, 0.03599253089639499, 
   0.035775402536093924, 0.03573690274038513, 0.03573478075597261, 
   0.035734174870756356, 0.03573417449687247, 0.035734174496067674, 
   0.035734174496065876, 0.03573417449606588, 0.03573417449606589, 
   0.03573417449606589, 0.03573417449606589, 0.03573417449606589, 
   0.03573417449606589}, "TestCostHistory" -> {}|>, "EarlyStopping" -> False, 
"MaxIterations" -> 2250, "FeatureIndices" -> All, "DistributionData" -> 
{NormalDistribution, 0.1785506703441404}, "FeaturePreprocessor" -> 
MachineLearning`PackageScope`Preprocessor["Standardize", {{2.5}, {Sqrt[5/3]}}], 
"ExtractedFeatureNumber" -> 1|>}

They are both -0.9752. The predictor p gives me the result:

p[2] = 3.73555

Now I'm trying to obtain the same result by calculating it by hand. Since it is a linear NN, and because the biases are nearly zero, we should have

output = (x * w1) * w2 = (2 * (-0.9752)) * (-0.9752) = 1.90203

So, what am I doing wrong?

Miguel
    Are you sure you can just shove 2 in there directly? From your code it looks like features are standardized first. Practically all these options are undocumented too, I'd be interested in the solution as well. – Histograms May 28 '15 at 01:12
  • Not an answer, but another related question: Where can I find the documentation about predict using ANN? That is, what options are available? How do I specify the number and size of the hidden layers? How to specify the activation functions for the different layers?, etc. Thanks in advance. – Juan Flores Jan 11 '18 at 14:28
  • In version 12, this code does not seem to work anymore. Is there a way to obtain the weights from a NN in version 12? – Whelp Apr 29 '19 at 08:43

1 Answer


As @Histograms mentioned in the comment, the features are standardized. Actually, both the input and output are standardized.

In the example, the mean and standard deviation of the input are:

input = {1, 2, 3, 4};
μ1 = Mean[input]
σ1 = StandardDeviation[input]
(* 5/2 *)
(* Sqrt[5/3] *)

so the linear function that standardizes the input is

f1 = (# - μ1)/σ1 &

while the function that inverts this process is

g1 = μ1 + σ1 # &
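
As a quick check, g1 really does invert f1, and the standardized inputs have zero mean and unit standard deviation:

g1[f1[2]]                       (* 2 *)
Mean[f1 /@ input]               (* 0 *)
StandardDeviation[f1 /@ input]  (* 1 *)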

The same for output:

output = {2, 3, 6, 8};
μ2 = Mean[output];
σ2 = StandardDeviation[output];  
f2 = (# - μ2)/σ2 &;
g2 = μ2 + σ2 # &;

So the prediction can be calculated as:

g2[(-0.975275)*(-0.975275)*f1[2]]
(* 3.73555 *)

This matches the value from Predict. More about the preprocessors can be found here.
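
Putting the pieces together, the prediction can be reproduced end to end. This is just an illustrative sketch (predictByHand is not a built-in name); it uses the weights and the near-zero biases printed in the question:

w1 = -0.9752752696739045; w2 = -0.9752752696739043;
b1 = -5.738324013344273*^-17; b2 = -1.5516692498423558*^-17;
predictByHand[x_] := g2[w2*(w1*f1[x] + b1) + b2]  (* standardize, apply both linear layers, un-standardize *)
predictByHand[2]
(* 3.73555 *)

The biases are so small (of order 10^-17) that dropping them does not change the printed result.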

xslittlegrass