21

When importing a data file, what are the comment symbols for Mathematica? That is, given a file like this:

blabla
bulbul

1 2 6 54 7 ...
..

what symbol do I have to put in front of the header lines so that Mathematica skips them and starts reading at the line 1 2 6 54 7 ...? I tried #, which works in gnuplot, but it did not work here.

I know that I could just tell Mathematica to skip the lines, but since I control the file output, it would be nicer to use some kind of tag.

m_goldberg
MaxJ

4 Answers

19

I like to just import the file and then filter it afterwards:

data = Cases[Import["file", "Table"], {_?NumberQ, ___}];

data then contains only the rows that start with a number.
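
To see the filter in isolation, here is a minimal sketch that runs on an in-memory string instead of a file. The sample text is a made-up stand-in for the question's file (the elided rows are replaced by an invented second row), and ImportString is used only so that the snippet is self-contained.

```mathematica
(* Hypothetical sample text standing in for the file from the question *)
sample = "blabla\nbulbul\n\n1 2 6 54 7\n3 4 5 6 7";

(* Import as a table: each line becomes a list of words or numbers *)
rows = ImportString[sample, "Table"];

(* Keep only the rows whose first element is a number *)
data = Cases[rows, {_?NumberQ, ___}]
```

A header line whose first word is itself a number would survive this filter, so the approach assumes that headers start with text.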

  • I find this far more readable than george2079, upvote it! +1. Can someone explain why the other answer has got so many upvotes and this only 2? – hhh Nov 11 '14 at 03:58
18

Here is an approach that handles interspersed comments in addition to "headers":

 FilePrint["test.txt"]

#comment
#comment
#comment
1 2 3
#c2
4 5 6
7 8 9

 ImportString[
     StringReplace[Import["test.txt", "Text"],
         StartOfLine ~~ "#" ~~ Shortest[___] ~~ EndOfLine ~~ "\n" -> ""], "Table"]

{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}

Of course, you can invent whatever convention you like, or even mix several, e.g.:

 StartOfLine ~~ {"#", "!", "%"} ~~ ...

Another variant:

 ImportString[StringJoin@Riffle[
      Select[StringSplit[Import["test.txt", "Text"], "\n"],
            (* keep lines that do not start with the comment marker; empty lines are safe *)
            ! StringMatchQ[#, "#" ~~ ___] &], "\n"], "Table"]

This even handles end-of-line comments:

#comment
1 2 3
#c2
4 5 6  #note 1
7 8 9

 ImportString[StringReplace[Import["test.txt", "Text"], {
      StartOfLine ~~ "#" ~~ Shortest[___] ~~ EndOfLine ~~ "\n" -> "",
      "#" ~~ Shortest[___] ~~ EndOfLine -> ""}], "Table"]

{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}

Tested on Windows, by the way; this might need some tweaking to handle different line endings on other systems.
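
One way to sidestep the line-ending issue is to split the text on all three common line terminators before stripping comments. The following is a sketch only: stripComments is a hypothetical helper name, and the sample string is invented to mix Windows and Unix line endings.

```mathematica
(* Hypothetical helper: removes comments introduced by the hash sign,
   whatever the line-ending convention of the input *)
stripComments[text_String] := Module[{lines},
  (* Split on Windows, Unix, and old Mac line endings; the longest is listed first *)
  lines = StringSplit[text, {"\r\n", "\n", "\r"}];
  (* Cut everything from the first hash sign to the end of each line *)
  lines = StringTrim /@ StringReplace[lines, "#" ~~ ___ -> ""];
  (* Drop lines that are now empty, then rejoin with plain newlines *)
  StringJoin[Riffle[Select[lines, # =!= "" &], "\n"]]
  ];

(* Invented sample mixing both kinds of line endings *)
sample = "#comment\r\n1 2 3\r\n#c2\n4 5 6  #note 1\n7 8 9  ";

data = ImportString[stripComments[sample], "Table"]
```

Note that a literal # inside a data field would also be cut off, so this assumes the marker never appears in the data itself.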

george2079
  • Strictly speaking your last example doesn't handle in-line comments, but only handles comments that are terminated by a line ending. In-line comments need a beginning and an end delimiter. – m_goldberg Jun 13 '14 at 15:48
  • ok.. not sure what to call that but I'll fix.. – george2079 Jun 13 '14 at 16:45
  • @george2079, does your approach require ImportString or, in any case, can it be adapted to import CSV files, skipping blocks of rows (of varying length) consisting entirely of NullWords? – alancalvitti Nov 11 '14 at 02:31
  • Are you able to answer this question here? I am not sure whether it is related: I have header with four lines where unique identifier is specified by two lines. – hhh Nov 11 '14 at 03:16
17

The Import command supports an option to ignore header lines. In many cases this is the easiest solution. For example:

dataStats = Import["C:/data/stats.csv", "CSV", HeaderLines -> 4];
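
The same option is also documented for the "Table" format, which fits the whitespace-separated file in the question. A minimal sketch, assuming a made-up sample string with exactly two header lines; the option name is written here as the string "HeaderLines", the form listed on the format's documentation page.

```mathematica
(* Hypothetical sample: two header lines followed by numeric rows *)
sample = "blabla\nbulbul\n1 2 6 54 7\n3 4 5 6 7";

(* Skip the first two lines, then read the rest as a table *)
data = ImportString[sample, "Table", "HeaderLines" -> 2]
```

The number of header lines has to be known in advance, which is the drawback the question hopes to avoid.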
Tyler Durden
5

Here's one possibility. I made a CSV file that looks like this when imported:

data = Import["temp.csv"]
(* {{"header 1", ""}, {"header 2", ""}, {"#", ""}, {1, 1}, {2, 4}, {3, 9}, {4, 16}} *)

Search for your flag (I used #) using Position, and then select all rows that follow the match:

data[[Position[data, "#"][[1, 1]] + 1 ;;]]
(* {{1, 1}, {2, 4}, {3, 9}, {4, 16}} *)
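
If the flag might be missing, the Part expression above would fail, because Position then returns an empty list. A small guarded variant, sketched on a literal list that mirrors the imported data shown above (rowsAfterFlag is a hypothetical name):

```mathematica
(* Literal stand-in for the imported CSV data *)
data = {{"header 1", ""}, {"header 2", ""}, {"#", ""}, {1, 1}, {2, 4}, {3, 9}, {4, 16}};

(* Hypothetical helper: return the rows after the first row containing flag,
   or all rows unchanged if the flag does not occur *)
rowsAfterFlag[rows_List, flag_String] := Module[{pos = Position[rows, flag]},
  If[pos === {}, rows, rows[[pos[[1, 1]] + 1 ;;]]]
  ];

rowsAfterFlag[data, "#"]
```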
bobthechemist