
I was hoping there was some way to generate the .params and .json files needed to define an MXNet model from a network trained with NetTrain[] in Mathematica. I was hopeful because I found these functions in the NeuralNetworks` package:

NeuralNetworks`ToMXNetSymbol
NeuralNetworks`ToMXNetJSON

You can use them on a net like this:

<< NeuralNetworks`;
net = NetInitialize@NetChain[{
     ConvolutionLayer[20, {5, 5}],
     ElementwiseLayer[Ramp],
     PoolingLayer[{2, 2}, {2, 2}],
     FlattenLayer[],
     DotPlusLayer[500],
     ElementwiseLayer[Ramp]
     }, "Input" -> NetEncoder[{"Image", {32, 32}}]];
ImportString[Normal@NeuralNetworks`ToMXNetJSON[net][[1]], "RawJSON"]
NeuralNetworks`ToMXNetSymbol[net]

Now, ToMXNetJSON[] returns a tuple, and the first element looks like the JSON for the symbol file. But I don't know what the second element is, and I have no idea what ToMXNetSymbol[] returns.

Motivation: A solution to this would enable one to take any net from MMA and run it in C or Python with GPU inference!

user5601

1 Answer


An MXNet model checkpoint is defined by two files: a ".json" file and a ".params" file. The json file contains the definition of the network graph, and the params file contains the trained weights and biases of each layer. The params file is in the binary format of MXNet's NDArray representation.

Thus, to export a network from Mathematica to MXNet, we need to generate these two files. The json file can be generated easily with NeuralNetworks`ToMXNetJSON. The params file can be generated using MXNetLink`NDArrayExport. Here is an example of this process, using the MNIST example from the documentation.
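For orientation, the symbol file is ordinary JSON: a "nodes" list describing parameter placeholders and ops, plus "arg_nodes" and "heads" indices. The sketch below parses a hand-written fragment in that shape (the fragment is illustrative only, not an actual export from Mathematica):

```python
import json

# Hand-written fragment in the shape of an MXNet symbol file
# (illustrative only; node names and shapes are made up).
symbol_json = """
{
  "nodes": [
    {"op": "null",        "name": "Input",        "inputs": []},
    {"op": "null",        "name": "arg:1_weight", "inputs": []},
    {"op": "null",        "name": "arg:1_bias",   "inputs": []},
    {"op": "Convolution", "name": "1",            "inputs": [[0, 0], [1, 0], [2, 0]]},
    {"op": "Activation",  "name": "2",            "inputs": [[3, 0]]}
  ],
  "arg_nodes": [0, 1, 2],
  "heads": [[4, 0]]
}
"""

sym = json.loads(symbol_json)
# "null" ops are inputs/parameters; the rest are the actual layers.
ops = [n["op"] for n in sym["nodes"] if n["op"] != "null"]
print(ops)  # ['Convolution', 'Activation']
```

The parameter placeholders ("null" ops) are exactly the entries that the ".params" file must supply, which is why their names have to follow MXNet's naming convention.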

We first load the packages

<< MXNetLink`;
<< NeuralNetworks`;
<< GeneralUtilities`;

and define the network

net = NetChain[{
   ConvolutionLayer[20, {5, 5}],
   ElementwiseLayer[Ramp],
   PoolingLayer[{2, 2}, {2, 2}],
   ConvolutionLayer[50, {5, 5}],
   ElementwiseLayer[Ramp],
   PoolingLayer[{2, 2}, {2, 2}],
   FlattenLayer[],
   DotPlusLayer[500],
   ElementwiseLayer[Ramp],
   DotPlusLayer[10],
   SoftmaxLayer[]},
  "Output" -> NetDecoder[{"Class", Range[0, 9]}],
  "Input" -> NetEncoder[{"Image", {28, 28}, "Grayscale"}]
  ]


Then we train the network

resource = ResourceObject["MNIST"];
trainingData = ResourceData[resource, "TrainingData"];
testData = ResourceData[resource, "TestData"];
trained = 
 NetTrain[net, trainingData, ValidationSet -> testData, 
  MaxTrainingRounds -> 3]

We will now export the trained network into the MXNet's model files.

The json file can be exported using ToMXNetJSON

jsonPath = "~/Downloads/MNIST-symbol.json";
Export[jsonPath, ToMXNetJSON[trained][[1]], "String"]
(* "~/Downloads/MNIST-symbol.json" *)

The second element of ToMXNetJSON[trained] contains the weights of our network. However, the weights are in Mathematica's RawArray format, so we need to convert them into MXNet's NDArray format. We will also drop the encoder layer and rename the keys to comply with MXNet's naming convention:

paraPath = "~/Downloads/MNIST-0000.params";
ass = KeyDrop[ToMXNetJSON[trained][[2]], ".Inputs.Input"];

f[str_] := 
 If[StringFreeQ[str, "Arrays"], str, 
  StringReplace[
   StringSplit[str, ".Arrays."] /. {a_, b_} :> 
     StringJoin[{"arg:", a, "_", b}], {"Weights" -> "weight", 
    "Biases" -> "bias"}]]

NDArrayExport[paraPath, NDArrayCreate /@ KeyMap[f, ass]]
(* "~/Downloads/MNIST-0000.params" *)
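The renaming done by f above maps Mathematica's internal keys such as "1.Arrays.Weights" to MXNet's "arg:1_weight" convention. As a cross-check, here is the same transformation sketched in Python (a hypothetical helper mirroring f, not part of any MXNet API):

```python
def rename_key(key: str) -> str:
    # Mirror of the Mathematica helper f: keys without ".Arrays."
    # pass through unchanged; "n.Arrays.Weights" -> "arg:n_weight",
    # "n.Arrays.Biases" -> "arg:n_bias".
    if ".Arrays." not in key:
        return key
    layer, array = key.split(".Arrays.")
    array = array.replace("Weights", "weight").replace("Biases", "bias")
    return f"arg:{layer}_{array}"

print(rename_key("1.Arrays.Weights"))  # arg:1_weight
print(rename_key("8.Arrays.Biases"))   # arg:8_bias
```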

Now the two files "MNIST-symbol.json" and "MNIST-0000.params" can be used to load the network in MXNet.

To verify that the files are correct, we can use ImportMXNetModel to import the MXNet model files we just generated.

trained2 = 
 NetGraph[{ImportMXNetModel[jsonPath, paraPath]}, {}, 
  "Input" -> NetEncoder[{"Image", {28, 28}, "Grayscale"}], 
  "Output" -> NetDecoder[{"Class", Range[0, 9]}]]


And we see that the network from the MXNet's model files produces the same predictions as our original network:

testsample = RandomSample[testData, 100];

(trained[Keys[#], "Probabilities"] & /@ 
   testsample) == (trained2[Keys[#], "Probabilities"] & /@ testsample)
(* True *)

Edit for version 11.1

The internal implementation of the neural network framework has been updated in 11.1. The trained weights are no longer returned by NeuralNetworks`ToMXNetJSON; they can instead be accessed using NeuralNetworks`ToNetPlan.

plan = ToNetPlan[trained]


So to export the weights, we can do

NDArrayExport[paraPath, NDArrayCreate /@ KeyMap[f, plan["WeightArrays"]]]
xslittlegrass
  • This no longer works in 11.1; would you be able to update your answer, please? I believe ToMXJSON replaces ToMXNetJSON, and ToMXNetJSON[trained][[2]] should be ToMXJSON[trained][[3]], but ImportMXNetModel[] is still failing for me... – user5601 Mar 16 '17 at 20:52
  • The import call gives the warning: ''Inferred inconsistent shapes for array "Weights" (a length-4 vector versus a rank-4 tensor).'' – user5601 Mar 16 '17 at 20:57
  • 1
    Thanks for the update! Would you be able to give a working example for either NetModel["SqueezeNet V1.1 Trained on ImageNet Competition Data", "EvaluationNet"] and NetModel["GloVe 300-Dimensional Word Vectors Trained on Wikipedia and Gigaword-5 Data", "EvaluationNet"]? – user5601 Mar 17 '17 at 01:17
  • For some nets (like GloVe) we need to encode the SpecialArrays and FixedArrays into the mxnet files; how might that be done? – user5601 Mar 17 '17 at 01:21
  • @user5601 I think you can just use the pretrained network for MXNet at https://github.com/dmlc/mxnet-model-gallery. I think this pretrained network is also what Mathematica uses in the NetModel too. For GloVe, the trained network is an embedding layer, and the weights can be extracted directly using NetExtract[net, "Weights"] – xslittlegrass Mar 17 '17 at 01:55
  • Your original answer was great, but I'd really appreciate another few written out examples for those nets with more complicated layer types and special and fixed arrays. The 500 points was a lot for me to offer! Thanks – user5601 Mar 17 '17 at 14:30
  • I'll go ahead and accept your solution but I still would greatly appreciate an extra example or two! – user5601 Mar 17 '17 at 17:29
  • @user5601 Thanks! I will try to add more examples once I get a better understanding of the internal implementation of neural network framework in Mathematica. – xslittlegrass Mar 17 '17 at 17:47
  • I sent the bounty your way, thanks for following up on this – user5601 Mar 18 '17 at 06:10
  • I take it you weren't able to figure this out yet – user5601 May 19 '17 at 21:11
  • But you haven't tested the net in MXNet. Actually, it leads to a problem; see https://mathematica.stackexchange.com/questions/155069/example-of-netchain-running-in-mxnet-cause-error – partida Sep 10 '17 at 02:44
  • @partida I suggest waiting until 11.2 to test. The underlying implementation has changed quite a lot, and it's not worth spending time now only to find out the code breaks in the new release. – xslittlegrass Sep 10 '17 at 03:13
  • @xslittlegrass Thank you for your suggestion; I'll do something else and wait for the new release... – partida Sep 10 '17 at 03:31
  • @xslittlegrass 11.2 is released; the problem remains – partida Sep 17 '17 at 02:02
  • @partida I think it should be regarded as a bug then. Maybe you should report it to Wolfram. – xslittlegrass Sep 18 '17 at 16:25
  • Supposedly Export now supports this... however, I have an encoder on my net, and even though the doc says it will automatically drop this, it doesn't, and the export fails – SumNeuron Sep 22 '17 at 18:19
  • Also, the "update" no longer works... the json file is just NetChain[<len_net_chain>] – SumNeuron Sep 22 '17 at 18:26
  • Hello, ToMxPlan has been renamed to ToNetPlan, and plan["ArgumentArrays"] to plan["WeightArrays"] – partida Apr 13 '18 at 03:17
  • @partida I updated the post. Thanks! – xslittlegrass Apr 15 '18 at 00:49