Before we get into the weeds, here's an example of doing exactly this, just to see how it works:
$testNB =
URLDownload[
"https://raw.githubusercontent.com/b3m2a1/mathematica-tools/master/\
NotebookToPackage.nb",
FileNameJoin@{
$TemporaryDirectory,
"NotebookToPackage.nb"
}
];
Import["https://raw.githubusercontent.com/b3m2a1/mathematica-tools/\
master/NotebookToPackage.m"
];
NotebookToPackage[NotebookOpen[$testNB]] // CreateDocument;
So happily we can fill in most of the package boiler plate with just the code in the notebook, under a few assumptions:
- All code in the
"Input" and "Code" cells should be exported to the package
- Every symbol defined in the
"Input" and "Code" cells that starts with a capital letter should be exported to the package at top-level
- The
Context for the new package should be based on the file-name of the notebook
Of course, we could always add Options and customizations to get around any of these restrictions, but maybe that's for another day.
The hardest part here will actually be to pull the usage statements. In our code we may have things like:
`Pkg`mFunc[Verbatim[HoldPattern][f_], a_+b_]:=
Module[{blahblah},
...
]
And how do we make a good usage message out of that?
I'd argue we'd want something like
`Pkg`mFunc::usage="mFunc[f] ..."`
So how do we get there. First off, we'll need a way to extract definitions from an expression. Since we don't want to capture things like:
Module[{tempFunc},
tempFunc[]:=...
]
We'll need to make sure to do it only at the first level. But first we'll need to remove any first-level CompoundExpression calls, as they'd prevent us from accessing the definitions. A second issue is that you can define things like
HoldPattern[f[x_]]:=x
So we have to strip things like HoldPattern too
Altogether we get a function, exprFindDefinitions, which looks something like this:
definitionSanitize[expr_] :=
HoldComplete[expr] //. {
Verbatim[Verbatim][e_] :> e,
Verbatim[HoldPattern][e_] :> e
} // Apply[HoldPattern];
definitionSanitize~SetAttributes~HoldAllComplete
exprFindDefinitions[expr_] :=
With[{eheld =
Replace[HoldComplete[expr],
HoldComplete[CompoundExpression[e___]] :>
HoldComplete[e]
]},
Flatten@{
Cases[eheld,
(SetDelayed | Set)[lhs_, rhs_] :>
(definitionSanitize[lhs] :> rhs)
],
Cases[eheld,
(TagSetDelayed | TagSet)[tag_, lhs_, rhs_] :>
({definitionSanitize[tag], definitionSanitize[lhs]} :> rhs)
],
Cases[eheld,
(UpSetDelayed | UpSet)[lhs : _[e__], rhs_] :>
Replace[
{(subPatternPullHead /@ HoldComplete[e] // ReleaseHold)},
HoldComplete[s_Symbol] :>
({definitionSanitize[s], definitionSanitize[lhs]} :> rhs),
1
]
]
}
];
exprFindDefinitions~SetAttributes~HoldAllComplete
This will pull out definitions from an expression and return them in a held format.
We'll then want a function that will figure out what the symbol in the definition is, depending on whether it was defined in an OwnValues, DownValues, SubValues, or UpValues like pattern:
Clear[ownPatternToUsage, downPatternToUsage, subPatternToUsage,
upPatternToUsage];
ownPatternToUsage[sym_Symbol :> u_] :=
makeUsageRule[sym, sym];
ownPatternToUsage~SetAttributes~HoldAllComplete;
downPatternToUsage[(sym_Symbol)[args___] :> u_] :=
makeUsageRule[sym, sym[args]];
downPatternToUsage~SetAttributes~HoldAllComplete;
subPatternPullHead[pat_] :=
NestWhile[
Extract[#, {1, 0}, HoldComplete] &,
HoldComplete[pat],
! MatchQ[#, HoldComplete[_Symbol]] &
];
subPatternPullHead~SetAttributes~HoldAllComplete;
subPatternToUsage[subPat_ :> u_] :=
With[{sym = Extract[subPatternPullHead[subPat], 1, Unevaluated]},
makeUsageRule[sym, subPat]
];
subPatternToUsage~SetAttributes~HoldAllComplete;
upPatternToUsage[{sym_, p_} :> u_] :=
makeUsageRule[sym, p];
upPatternToUsage~SetAttributes~HoldAllComplete;
Clear[exprDefinitionToUsage];
exprDefinitionToUsage[
({Verbatim[HoldPattern][sym_], Verbatim[HoldPattern][p_]} :> v_)
] :=
upPatternToUsage[{sym, p} :> v];
exprDefinitionToUsage[
(Verbatim[HoldPattern][own_Symbol] :> v_)
] :=
ownPatternToUsage[own :> v];
exprDefinitionToUsage[
(Verbatim[HoldPattern][down : (_Symbol)[___]] :> v_)
] :=
downPatternToUsage[down :> v];
exprDefinitionToUsage[
(Verbatim[HoldPattern][sub_] :> v_)
] :=
subPatternToUsage[sub :> v];
This is then calling a function makeUsageRule which formats a symbol and usage pair into a Rule that will be interpreted as a usage message later. The function itself looks like this:
Clear[makeUsageRule, patternSanitize];
makeUsageRule[sym_Symbol, usage_] :=
StringReplace[
ToString[Unevaluated[sym], InputForm],
$hiddenPackageExportContext -> ""
] ->
Replace[patternSanitize[usage],
HoldComplete[s_] :>
StringReplace[
ToString[Unevaluated[s], InputForm],
$hiddenPackageExportContext -> ""
]
];
makeUsageRule~SetAttributes~HoldAllComplete;
patternSanitize[usage_] :=
ReplaceAll[
ReplaceRepeated[
HoldComplete[usage], {
(Verbatim[Pattern] |
Verbatim[Optional] |
Verbatim[PatternTest] |
Verbatim[Condition]
)[p_, _] :> p
}],
(s_Symbol?(Function[Null, Quiet[Context[#]] =!= "System`",
HoldAllComplete]) :> RuleCondition[
ToExpression[$hiddenPackageExportContext <>
SymbolName[Unevaluated@s]],
True
])
];
patternSanitize~SetAttributes~HoldAllComplete
Here's an example of what it's doing:
makeUsageRule[asd,
asd[b_, {c_, g : _ : 1}, d_] /; $true
]
"asd" -> "asd[b, {c, g}, d]"
The right-hand side of that is a good start for a usage message.
So now that we can turn the notebook expressions into usages, we'll want to turn the notebook cells into usages, collect these, and format a new package notebook. We'll start with the cell conversion:
cellsBuildPackageCore[c : {___Cell}] :=
(
Begin[$hiddenPackageExportContext];
(End[]; #) &@Replace[c,
{
cell : Cell[BoxData[b_], $codeExportStyles, ___] :>
(
Sow@
ToExpression[b, StandardForm, exprFindDefinitions];
Cell[
BoxData@
FrontEndExecute@FrontEnd`ReparseBoxStructurePacket[
First@FrontEndExecute@
FrontEnd`ExportPacket[cell, "InputText"]
],
"Code"
]
),
Cell[CellGroupData[cells_, state1___], state2___] :>
Cell[CellGroupData[cellsBuildPackageCore[cells], state1],
state2]
},
1
]
);
This goes through a list of Cell expressions and if they match our expectation for the form of a code cell ($codeExportStyles is and alias for "Input"|"Code") we convert them into an expression, pull the definitions from that (in held form, of course), and sow these. Then we reformat the boxes in the cell so they'll appear with the appropriate indentation in the package.
Then we write our core function, NotebookToPackage to pull this all together (note that we have a function defsUsagesCell to format our usages into their cell and a function notebookExtractPackageName that pulls the appropriate package name from a NotebookObject):
Clear[makeUsageBoxes];
makeUsageBoxes[name_, strings : {__}] :=
RowBox[{
RowBox[{
RowBox[{name, "::", "usage"}],
"=",
"\"" <> StringRiffle[strings, "\n"] <> "\""
}],
";"
}];
defsUsagesCell[defList_] :=
With[{chunks =
GroupBy[
exprDefinitionToUsage /@ defList,
First -> Last
]
},
If[Length[#] > 0,
Cell[
BoxData@
RowBox@
Riffle[
Prepend[
RowBox@{"(*", RowBox@{"Package", " ", "Declarations"},
"*)"}]@
KeyValueMap[makeUsageBoxes, #],
"\n"
],
"Code"
],
{}
] & /@ {
KeySelect[chunks,
And @@ Through[{Not@*LowerCaseQ, LetterQ}@StringTake[#, 1]] &],
KeySelect[chunks,
Or @@ Through[{LowerCaseQ, Not@*LetterQ}@StringTake[#, 1]] &]
}
]
notebookExtractPackageName[nb_NotebookObject] :=
StringReplace[
Replace[
Quiet[NotebookFileName[nb]],
{
s_String :>
FileBaseName[s],
$Failed :>
AbsoluteCurrentValue[nb, WindowTitle]
}
],
Except[WordCharacter] -> ""
];
NotebookToPackage[nb_NotebookObject] :=
With[{cells =
Replace[NotebookRead[nb], {
c_Cell :> {c},
{} :> First@NotebookGet[nb]
}]},
With[{data = Reap[cellsBuildPackageCore@cells]},
With[{usagecells = defsUsagesCell[Flatten@Last@data]},
Notebook[
Flatten@{
Cell[notebookExtractPackageName[nb], "Section"],
Cell[
BoxData@
RowBox[{
RowBox[{"BeginPackage", "[",
"\"" <> notebookExtractPackageName[nb] <> "`\"",
"]"}],
";"}],
"Code"
],
If[Length@usagecells[[1]] > 0,
usagecells[[1]],
{}
],
Cell[
BoxData@RowBox[{RowBox[{"Begin", "[", "\"`Private`\"", "]"}],
";"}],
"Code"
],
If[Length@usagecells[[2]] > 0,
Cell[
CellGroupData[
Flatten@{
Cell["Private Declarations", "Subsubsection"],
usagecells[[2]]
},
Closed
]
],
Nothing
],
Cell[
CellGroupData[
Flatten@{
Cell["Implementation", "Subsection"],
Cell[BoxData@
RowBox@{"(*",
RowBox@{"Package", " ", "Implementation"},
"*)"},
"Code"
],
First@data
}
]
],
Cell[
CellGroupData[{
Cell["End", "Subsection"],
Cell[BoxData@RowBox[{RowBox[{"End", "[", "]"}], ";"}],
"Code"
],
Cell[BoxData@
RowBox[{RowBox[{"EndPackage", "[", "]"}], ";"}],
"Code"
]
}]
]
},
StyleDefinitions -> "Package.nb"
]
]
]
];
And this'll spit out a package Notebook which open with CreateDocument we can tweak (mostly just extend the usage messages) and save naturally as a .m file (although for unknown reasons the opened notebook does not take on its new file name and that .m file needs to be opened separately). Alternatively we can open this new notebook with Visible->False and use the "SaveRename" token to immediately go to a package file.
It's also worth noting that this whole process could be pretty easily extended to turn a notebook or collection of notebooks directly into a full application with documentation, resources, PacletInfo.m, etc. (since code for autogenerating docs and the PacletInfo.m already exist).