1

I have a text file (ANSI encoding) with two columns of real numbers separated, unfortunately, by a "Null Character" ("\.00").

How can I import this data easily?

The problem:

Take[Import["17Dec15450K00.txt", "Table"], 3]
{{"29.89128\.000.0119872"}, {"30.37339\.000.0120442"}, {"30.85551\.000.0123593"}}

But applying ToExpression keeps only the first column:

ToExpression[Take[Import["17Dec15450K00.txt", "Table"], 3]]
{{29.8913}, {30.3734}, {30.8555}}

Other functions also fail if the string contains the null character:

ToCharacterCode["999\.00999"]

StringReplace["999\.00999", "\.00" -> "X"]

Neither gives any output. Is that a bug?

File available via Dropbox here

UUE ZIP via Pastebin here

Alexey Popkov
rhermans

4 Answers

3

I would not consider this an EASY solution, but it does the job.

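file = OpenRead["17Dec15450K00.txt"];
(* Read {number, separator byte, number} records until EndOfFile, collecting each with Sow *)
data = Reap[
    While[Sow[Read[file, {Number, Byte, Number}]] =!= EndOfFile]
  ][[2, 1, ;; -2, {1, 3}]]; (* drop the final EndOfFile entry; keep columns 1 and 3 *)
Close[file];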
rhermans
2

To be robust, I suspect you need to read the file in binary mode. I would probably convert the nulls to spaces in a single pass, something like this:

in = OpenRead["17Dec15450K00.txt", BinaryFormat -> True];
out = OpenWrite["tmp.out", BinaryFormat -> True];
BinaryWrite[out, BinaryReadList[in, "Byte"] /. 0 -> 32]; (* replace each null byte (0) with a space (32) *)
Close /@ {in, out};

Then you should be able to use Import.
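If a temporary file is not wanted, the same byte-level replacement could probably be done in memory. The following is only a sketch under that assumption: the helper name readNullSeparated is made up for illustration and does not appear in any of the answers, and, like the code above, it has not been tried on the original file.

```mathematica
(* Hypothetical helper, not from the original answers:
   read raw bytes, turn each null byte (0) into a space (32),
   then parse the resulting text as a whitespace-separated table. *)
readNullSeparated[path_String] := Module[{bytes},
  bytes = BinaryReadList[path, "Byte"];  (* raw bytes, no text decoding *)
  ImportString[FromCharacterCode[bytes /. 0 -> 32], "Table"]
]

(* usage: data = readNullSeparated["17Dec15450K00.txt"] *)
```

Replacing the bytes before any string is built avoids the string functions that the question reports as failing on the null character.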

You could also use a command-line tool such as tr on Linux to repair the file.

This is totally untested as I don't have an example file to work with.

george2079
1

Another workaround:

ReadList["17Dec15450K00.txt", {Number, Byte, Number}][[All, {1, 3}]]
Alexey Popkov
-1

Why wouldn't you use this?

Import["17Dec15450K00.txt", "Data"]; // AbsoluteTiming
{0.187201, Null}

Importing the text file as "Data" means there is no need to convert anything. I'm still using MMA 9, and somehow it works for me.

Bob Brooks
    That actually does not work: you get a list of strings, and the null character makes them hard to parse. – rhermans Dec 18 '15 at 17:27