Suppose I have a string str that contains only whitespace-delimited real numbers:

str = " 0\t1.46604\t1.44829\t12.0546\t1.57075\t1.64044\t12.0489\t1.58142";

I would like to convert str to a list of real numbers. One way to accomplish this is to split the string using StringSplit and then Map the function ToExpression across the resulting list:

Map[ToExpression, StringSplit[str]]

which gives the correct output:

{0, 1.46604, 1.44829, 12.0546, 1.57075, 1.64044, 12.0489, 1.58142}

But is there a cleaner or more efficient way to do this? I will be converting about 100,000 strings str, each approximately the same length as the example above, which will be read as records from a text file (i.e., using OpenRead and Read), so it would be nice if the method were relatively efficient. Thank you for your time!

Andrew
  • Although, I'm not sure why, ToExpression["{" <> StringReplace[str, "\t" -> ", "] <> "}"] seems to be a tad faster than both StringSplit and StringToDouble methods. – kale Sep 02 '12 at 23:06
  • @kale probably because you are only searching for \t, and not for the other possible string separators. Whitespace is taken to include spaces, tabs and newlines. – Dr. belisarius Sep 02 '12 at 23:09
  • @kale, not clearly faster than StringToDouble in my test. Also probably because there's only one string to expression conversion there – Rojo Sep 02 '12 at 23:10
  • @Rojo, Yeah, we're not talking much, but consistently a little faster on my machine. – kale Sep 02 '12 at 23:11

5 Answers

str = " 0\t1.46604\t1.44829\t12.0546\t1.57075\t1.64044\t12.0489\t1.58142";
ToExpression[StringSplit[str, Whitespace]]

(* {0, 1.46604, 1.44829, 12.0546, 1.57075, 1.64044, 12.0489, 1.58142} *)
rm -rf
Fred Daniel Kline

You could also try

Internal`StringToDouble /@ StringSplit[str]

But I suspect your bottleneck will not be the conversion but the repeated reading of the records. Just guessing.

If you had some long string perhaps you could also try Flatten@ImportString[str, "Table"]

As of 12.3, the internal function name is Internal`StringToMReal

John
Rojo
  • Careful with this Internal`StringToDouble. I would always use StringTrim before it. I just found that Internal`StringToDouble["-1"] returns -1.0 while Internal`StringToDouble[" -1"], with an extra leading space, returns positive 1.0. – Gustavo Delfino Jun 08 '20 at 17:17

Alternatively, since your data is separated by tabs:

data = "0\t1.46604\t1.44829\t12.0546\t1.57075\t1.64044\t12.0489\t1.58142";
ImportString[data, "TSV"] // First
   {0, 1.46604, 1.44829, 12.0546, 1.57075, 1.64044, 12.0489, 1.58142}
J. M.'s missing motivation

Using ReadList:

str = " 0\t1.46604\t1.44829\t12.0546\t1.57075\t1.64044\t12.0489\t1.58142";
ReadList[StringToStream[str], Number]

(* {0, 1.46604, 1.44829, 12.0546, 1.57075, 1.64044, 12.0489, 1.58142} *)
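
Since the strings ultimately come from a text file, the same idea might be applied to the file itself rather than to one string at a time. What follows is only a sketch under assumptions: the file name is hypothetical, a small temporary file stands in for the real data, and every line is assumed to contain nothing but whitespace-separated numbers. It relies on the documented RecordLists option of ReadList.

```mathematica
(* Hypothetical temporary file standing in for the real data file *)
file = FileNameJoin[{$TemporaryDirectory, "records-sketch.txt"}];

(* Two tab-separated records, one per line (assumed file layout) *)
Export[file,
  "0\t1.46604\t1.44829\t12.0546\n1.57075\t1.64044\t12.0489\t1.58142",
  "Text"];

(* RecordLists -> True returns one list of numbers per line of the file *)
data = ReadList[file, Number, RecordLists -> True]

(* Remove the temporary file again *)
DeleteFile[file];
```

Whether this beats a loop over OpenRead and Read would have to be timed on the real file.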

vindobona

Using StringCases:

str = " 0\t1.46604\t1.44829\t12.0546\t1.57075\t1.64044\t12.0489\t1.58142";
StringCases[str, NumberString] // ToExpression

{0, 1.46604, 1.44829, 12.0546, 1.57075, 1.64044, 12.0489, 1.58142}

Syed