
Bug fixed in 12.3 or earlier


This question is about parsing files that contain nulls (character code zero) as field separators.

To try answering it, I began by entering a string containing a null, e.g. by typing:

str = "123\.00456"

which is how nulls are displayed in the linked question. The Front End shows this as "123456" as it is typed, but evaluating it gives the following error (using Mathematica 10.3.1 on Mac OS X 10.11):

Syntax::tsntxi: "str=123\.00456" is incomplete; more input is needed.

So the null appears to be interpreted as a string termination character. (Note that the message above is the result of copy and paste; in the Front End it is displayed as "str = 123456".)

Using the Unicode form \:0000 gives the same error. I also tried \[Null] in my string, but it has a character code other than zero (62368), so it is not quite what I'm after.

I can generate the string using e.g.

(str = FromCharacterCode[{49, 50, 51, 0, 52, 53, 54}]) // FullForm

"123\.00456"

But is there any way of typing a string containing nulls directly, i.e. in a quoted string?

Also, should it be considered a bug that the FullForm of a string containing a null cannot be used as its InputForm?

Alexey Popkov
MikeLimaOscar

1 Answer


It turns out the answer is yes: use the octal escape \000, which is how the "PrintableASCII" character encoding writes a null, rather than the \.00 of the FullForm/InputForm representation:

str = "123\000456"
ToCharacterCode[str]

{49, 50, 51, 0, 52, 53, 54}

I discovered this using:

ToString[FromCharacterCode[{49, 50, 51, 0, 52, 53, 54}], InputForm, 
 CharacterEncoding -> "PrintableASCII"]

This is essentially taken from the Special Characters and Strings tutorial in the Documentation.

So I am inclined to agree (with myself) that not being able to use the \.00 or \:0000 forms in a string is a bug.
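
Coming back to the original motivation of parsing null-separated fields, here is a minimal sketch of how the typed string might be checked and split. The names typed and built are just illustrative, and I am assuming the octal escape behaves as shown above:

```mathematica
(* Build the same string two ways: octal escape vs. explicit character codes *)
typed = "123\000456";
built = FromCharacterCode[{49, 50, 51, 0, 52, 53, 54}];

(* Compare the two strings; they should match if \000 really yields character code 0 *)
typed === built

(* Split on the null character, as one might when parsing null-separated fields *)
StringSplit[typed, FromCharacterCode[0]]
```

Using FromCharacterCode[0] as the delimiter avoids having to type the null a second time, so the only escape that needs to be entered by hand is the one in the data itself.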

MikeLimaOscar