
Is it possible to improve the current ClassifierFunction with more training data without running the previous training again?

That way I would be able to process very large datasets in chunks.

user13892

1 Answer


Here is an example of continuing the training of a neural network on new data, modified from the documentation example for the MNIST dataset.

First, define the neural net:

lenet = NetChain[{
   ConvolutionLayer[20, {5, 5}],
   ElementwiseLayer[Ramp],
   PoolingLayer[{2, 2}, {2, 2}],
   ConvolutionLayer[50, {5, 5}],
   ElementwiseLayer[Ramp],
   PoolingLayer[{2, 2}, {2, 2}],
   FlattenLayer[],
   LinearLayer[500], (* named DotPlusLayer in version 11.0 *)
   ElementwiseLayer[Ramp],
   LinearLayer[10],
   SoftmaxLayer[]},
  "Output" -> NetDecoder[{"Class", Range[0, 9]}],
  "Input" -> NetEncoder[{"Image", {28, 28}, "Grayscale"}]
  ]

We take 10,000 training examples and split them into two training sets of 5,000 each:

resource = ResourceObject["MNIST"];
{trainingData1, trainingData2} = 
  Partition[
   RandomSample[ResourceData[resource, "TrainingData"], 10000], 5000];
testData = RandomSample[ResourceData[resource, "TestData"], 1000];

Train on the first group

trained = NetTrain[lenet, trainingData1, MaxTrainingRounds -> 3];

Measure the accuracy

cm = ClassifierMeasurements[trained, testData];
cm["Accuracy"]
(* 0.964 *)

Now export the trained net to a .wlnet file and clear the variable from the Mathematica session

Export["~/Downloads/trained.wlnet", trained]
(* "~/Downloads/trained.wlnet" *)
Clear[trained]

Load the trained net from the file

trained = Import["~/Downloads/trained.wlnet"]

and continue the training on the second training set

trained2 = NetTrain[trained, trainingData2, MaxTrainingRounds -> 3];

We now see an improved accuracy

cm2 = ClassifierMeasurements[trained2, testData];
cm2["Accuracy"]
(* 0.978 *)
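
The same pattern should extend to more than two chunks. Below is a minimal sketch, assuming the definitions above (lenet, trainingData1, trainingData2) have already been evaluated; the names trainOnChunk, chunks and trainedAll are hypothetical and only for illustration. It relies on the behaviour shown above, namely that NetTrain continues from the weights of the net it is given.

```wolfram
(* Hypothetical helper: continue training an existing net on one more chunk *)
trainOnChunk[net_, chunk_] := NetTrain[net, chunk, MaxTrainingRounds -> 3];

(* Illustrative chunking: 10000 examples split into 4 chunks of 2500 *)
chunks = Partition[Join[trainingData1, trainingData2], 2500];

(* Fold threads the net through the chunks, so each NetTrain call
   starts from the result of the previous one *)
trainedAll = Fold[trainOnChunk, lenet, chunks];
```

For data that does not fit in memory, each chunk could instead be loaded from disk inside the helper. Since every call sees only its own chunk, the result may differ from training on all the data at once, so it is worth checking the accuracy on testData after each chunk.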
xslittlegrass