Is it possible to improve an existing ClassifierFunction with more training data without rerunning the previous training?
That way I would be able to process very large datasets in chunks.
Here is an example of continuing the training of a neural network on new data, adapted from the documentation example for the MNIST dataset.
First, define the neural net
lenet = NetChain[{
ConvolutionLayer[20, {5, 5}],
ElementwiseLayer[Ramp],
PoolingLayer[{2, 2}, {2, 2}],
ConvolutionLayer[50, {5, 5}],
ElementwiseLayer[Ramp],
PoolingLayer[{2, 2}, {2, 2}],
FlattenLayer[],
(* LinearLayer supersedes DotPlusLayer as of version 11.1 *)
LinearLayer[500],
ElementwiseLayer[Ramp],
LinearLayer[10],
SoftmaxLayer[]},
"Output" -> NetDecoder[{"Class", Range[0, 9]}],
"Input" -> NetEncoder[{"Image", {28, 28}, "Grayscale"}]
]
Take 10000 training examples and split them into two training sets of 5000 each
resource = ResourceObject["MNIST"];
{trainingData1, trainingData2} =
Partition[
RandomSample[ResourceData[resource, "TrainingData"], 10000], 5000];
testData = RandomSample[ResourceData[resource, "TestData"], 1000];
Train on the first training set
trained = NetTrain[lenet, trainingData1, MaxTrainingRounds -> 3];
Measure the accuracy
cm = ClassifierMeasurements[trained, testData];
cm["Accuracy"]
(* 0.964 *)
Now export the trained net to a .wlnet file and clear the variable in Mathematica
Export["~/Downloads/trained.wlnet", trained]
(* "~/Downloads/trained.wlnet" *)
Clear[trained]
Load the trained net from the file
trained = Import["~/Downloads/trained.wlnet"]
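Then continue training on the second training set. NetTrain does not reinitialize weights that are already trained, so it picks up from where the first run stopped.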
trained2 = NetTrain[trained, trainingData2, MaxTrainingRounds -> 3];
The accuracy has now improved
cm2 = ClassifierMeasurements[trained2, testData];
cm2["Accuracy"]
(* 0.978 *)
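The same idea extends to any number of chunks by feeding the result of each NetTrain call into the next one. Below is a minimal sketch, assuming lenet, trainingData1, trainingData2 and testData are defined as above; trainInChunks is a hypothetical helper name, not a built-in function, and the chunk size of 2500 is arbitrary.

```wolfram
(* Hypothetical helper, not a built-in: train a net on a list of chunks,
   one after another, starting each step from the previously trained net *)
trainInChunks[net_, chunks_List, rounds_Integer] :=
  Fold[NetTrain[#1, #2, MaxTrainingRounds -> rounds] &, net, chunks];

(* Assumption for illustration: reuse the same 10000 examples,
   split into four chunks of 2500 *)
chunks = Partition[Join[trainingData1, trainingData2], 2500];

trainedAll = trainInChunks[lenet, chunks, 3];

(* Evaluate on the held-out test set *)
ClassifierMeasurements[trainedAll, testData]["Accuracy"]
```

For a dataset that does not fit in memory, each chunk could be loaded from disk inside the loop instead of being built up front, and the net could be exported to a .wlnet file after each chunk as a checkpoint. Note that training chunk by chunk is not necessarily equivalent to a single run over all the data: as far as I know each NetTrain call starts its optimizer afresh, and later chunks can pull the net away from what it learned on earlier ones.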