The PageWidth option controls how a Cell's content is displayed, not how it is written to the .nb file. I would like to get a "one Cell - one line" representation (i.e., with no line breaking, a sort of PageWidth -> Infinity for the file). Is that possible?

This question arises from the need to search a large collection of notebooks with a string search tool.

Alexey Popkov
mitochondrial
  • Interesting question. I wish I had an answer. – Mr.Wizard May 09 '16 at 08:48
  • Hello! An alternative could be to import the .nb file as plain text, replace "\\\n" -> "" and then perform the search. Do you think that is trustworthy? A direct search by means of regular expressions looks much more frightening... – mitochondrial May 09 '16 at 08:58
  • Frankly I am not in the habit of reading Notebook code directly so I cannot recall if there are traps for using that simple replacement. – Mr.Wizard May 09 '16 at 09:01
  • I don't think it is possible to force the FrontEnd to write the .NB files with PageWidth->Infinity. Even if you were able to do this, your old Notebooks still wouldn't be searchable without re-saving them with the new option. So the correct approach to your problem is to solve it at the level of your string search tool without changing the default behavior of the FrontEnd. – Alexey Popkov May 09 '16 at 11:37
  • Theoretically the Wolfram Notebook Indexer should help with this, but only theoretically... – Alexey Popkov May 09 '16 at 11:39
  • An alternative is to use Mathematica for searching instead of your string search tool. It should be a much easier and much more fail-safe approach than writing your own parser of .NB files using regular expressions. – Alexey Popkov May 09 '16 at 11:49
  • I suggest you start from Import["ExampleData/document.nb", {"Cells", All}] if you wish to follow the Mathematica route for searching. It returns the list of the Cells not in the order in which they are present in the file (which is incorrect from my point of view), but at least it gives them as WL expressions, which are convenient for searching. – Alexey Popkov May 09 '16 at 12:09
  • Thanks for the replies! The use of Import[fileIn,"Text"] looks good indeed, but when it comes to regular expressions I still don't go much further than the basics. Please, can you point to a source for a regular expression or an algorithm to detect the matching square brackets delimiting a Cell? The command Import["ExampleData/document.nb", {"Cells", All}] gives me the error Import::noelem: The Import element "Cell" is not present when importing as NB; maybe it's because I'm using Mathematica 8.0.1? – mitochondrial May 09 '16 at 12:24
  • With Mathematica version 8 the command Import["ExampleData/document.nb", {"Cells", All}] doesn't work and you should instead use Import["ExampleData/document.nb", "Notebook"], which may be even more convenient (depending on your actual goals). – Alexey Popkov May 09 '16 at 19:49
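
The Mathematica-side search suggested in the comments above could be sketched roughly as follows. This is only an illustration and not code from the thread: the helper name findCells and the search string "Plot" are hypothetical, and the notebook is read with Get as a Notebook expression. StringFreeQ is used instead of newer string predicates so that the sketch should also be usable in older versions.

```mathematica
(* Hypothetical helper: return the non-group Cells of a notebook file
   that contain the given substring inside any single string *)
findCells[file_String, needle_String] :=
 Cases[
  Get[file],
  c : Cell[Except[_CellGroupData], ___] /;
    ! FreeQ[c, s_String /; ! StringFreeQ[s, needle]],
  Infinity]

(* Example call; "Plot" is an arbitrary search string *)
findCells["ExampleData/document.nb", "Plot"]
```

Note that this matches only within individual strings of the Cell expression, so text that the FrontEnd has split across several boxes would not be found, and inline Cells nested inside a matching Cell are returned as separate matches.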

1 Answer

To produce an NB file without the line breaks, you can Get it as a Notebook expression and then Export it with PageWidth -> Infinity as "Package" (the other possible options do not work correctly: Export ignores PageWidth -> Infinity when exporting as "NB" and corrupts the code when exporting as "Text"):

Export["document.nb", Get["ExampleData/document.nb"], "Package", 
 PageWidth -> Infinity, "Comments" -> None]

This method is safe: Mathematica opens the generated file directly as a valid Notebook identical to the original. The only significant difference is that non-printable ASCII characters (except \b\t\n\r\f) are written "as is" when exporting as "Package" but in FullForm when exporting as "NB" (this difference doesn't affect the actual contents of the file when it is opened by the FrontEnd).
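
If you want to check this for your own files, one possible kernel-level test is to read the exported file back and compare it with the original expression. This is a minimal sketch: the output location in $TemporaryDirectory and the file name document-oneline.nb are my assumptions, chosen so that nothing is overwritten. It compares expressions only, so it would not catch FrontEnd-only parsing problems such as the one described in the update at the end of this answer.

```mathematica
original = Get["ExampleData/document.nb"];

(* Assumed output path, chosen to avoid overwriting any existing file *)
out = FileNameJoin[{$TemporaryDirectory, "document-oneline.nb"}];

(* Write the version without line breaks *)
Export[out, original, "Package",
  PageWidth -> Infinity, "Comments" -> None];

(* Read the exported file back and compare it with the original expression *)
Get[out] === original
```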

If you really need the "one Cell - one line" formatting you can simply add a line break before each Cell[ in the obtained string:

Export["document.nb", 
 StringReplace[
  ExportString[Get["ExampleData/document.nb"], "Package", PageWidth -> Infinity, 
   "Comments" -> None], c : "Cell[" :> "\n" <> c], "Text"]

A special note: this affects all Cells, including inline Cells (which may be undesirable but is safe). However, the above code isn't safe in the sense that if your file contains textual strings with a verbatim Cell[, they will also be modified. You can avoid this corruption by replacing the Head Cell in the Notebook expression with your own uniquely named Head before exporting:

Export["document.nb", 
 StringReplace[
  ExportString[Get["ExampleData/document.nb"] /. Cell -> $$$MyUniqueCellHead$$$, 
   "Package", PageWidth -> Infinity, "Comments" -> None], 
  "$$$MyUniqueCellHead$$$" -> "\nCell"], "Text"]

You can ensure that your Head is really unique by appending a random number to it:

With[{uniqName = "$$$MyUniqueCellHead$$$" <> ToString[RandomInteger[{10^8, 10^10}]]},
 Export["document.nb", 
  StringReplace[
   ExportString[Get["ExampleData/document.nb"] /. Cell -> Symbol[uniqName], "Package", 
    PageWidth -> Infinity, "Comments" -> None], uniqName -> "\nCell"], "Text"]]

This method is based on well-documented and widely used functionality and hence should work reliably.

Here is how the obtained file looks in Notepad with word wrapping turned off:

screenshot


04/19/2022 IMPORTANT UPDATE

In recent versions of Mathematica (I tested 12.3.1 and 13.0.1) the Export[..., "Package"]/Put compatibility is partially broken for the "NB" format. The problem is that Export as "Package" and Put now write Pattern[sym,obj] as sym:obj instead of Pattern[sym,obj], as they did in earlier versions. The FrontEnd treats the syntax sym:obj as a syntax error when it opens a file as a Notebook. I consider this a bug. As a workaround, we can "protect" the Pattern head the same way we do above for Cell:

With[{uniqCellHeadName = 
   "$$$MyUniqueCellHead$$$" <> ToString[RandomInteger[{10^8, 10^10}]], 
  uniqPatternHeadName = 
   "$$$MyUniquePatternHead$$$" <> ToString[RandomInteger[{10^8, 10^10}]]}, 
 Export["document.nb", 
  StringReplace[
   ExportString[
    Get["ExampleData/document.nb"] /. {Cell -> Symbol[uniqCellHeadName], 
      Pattern -> Symbol[uniqPatternHeadName]}, "Package", PageWidth -> Infinity, 
    "Comments" -> None], {uniqCellHeadName -> "\nCell", 
    uniqPatternHeadName -> "Pattern"}], "Text"]]
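
Since the original motivation was a large collection of notebooks, the same code can be wrapped in a function and applied to every .nb file in a directory. The following is only a sketch under stated assumptions: the function name exportOneCellPerLine and the directory names "notebooks" and "notebooks-oneline" are hypothetical, and the converted files are written to a separate directory so that the originals stay untouched.

```mathematica
(* Hypothetical wrapper around the code above: rewrite one notebook file
   as "one Cell per line" text, protecting the Cell and Pattern heads *)
exportOneCellPerLine[in_String, out_String] :=
 With[{cellName =
    "$$$MyUniqueCellHead$$$" <> ToString[RandomInteger[{10^8, 10^10}]],
   patternName =
    "$$$MyUniquePatternHead$$$" <> ToString[RandomInteger[{10^8, 10^10}]]},
  Export[out,
   StringReplace[
    ExportString[
     Get[in] /. {Cell -> Symbol[cellName], Pattern -> Symbol[patternName]},
     "Package", PageWidth -> Infinity, "Comments" -> None],
    {cellName -> "\nCell", patternName -> "Pattern"}],
   "Text"]]

(* Assumed directory names; adjust them to your own layout *)
inDir = "notebooks";
outDir = "notebooks-oneline";
If[! DirectoryQ[outDir], CreateDirectory[outDir]];

(* Convert every .nb file found directly in inDir *)
Scan[
 exportOneCellPerLine[#, FileNameJoin[{outDir, FileNameTake[#]}]] &,
 FileNames["*.nb", inDir]]
```

The string search tool can then be pointed at the output directory instead of the original notebooks.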
Alexey Popkov