3

When working with big files, I have to convert numbers stored as strings into actual Mathematica numbers. I know that I can use ToExpression, but it's slow compared to other approaches.

For instance, in the Integer case we can compare:

dataInt = ToString /@ RandomInteger[1000, {10^5}];
d1 = FromDigits /@ dataInt; // AbsoluteTiming
d2 = ToExpression /@ dataInt; // AbsoluteTiming

{0.050298, Null}
{0.475669, Null}

It's almost 10 times faster to use FromDigits. The question is: how can I get an equivalent for the Real case? Something like:

dataReal = ToString /@ RandomReal[1000, {10^5}];
d1 = someFunction /@ dataReal; // AbsoluteTiming
d2 = ToExpression /@ dataReal; // AbsoluteTiming

I haven't found a Mathematica function that could play the role of someFunction. I'm also missing a way to force ToExpression to interpret the string as a specific type.

Murta

2 Answers

9

In this answer I found the function Internal`StringToDouble, which does exactly what I was looking for:

dataReal = ToString /@ RandomReal[1000, {10^5}];
d1 = Internal`StringToDouble /@ dataReal; // AbsoluteTiming
d2 = ToExpression /@ dataReal; // AbsoluteTiming
d1 == d2
{0.045391, Null}
{0.544008, Null}
True

As you can see, it's more than 10 times faster than ToExpression.

As of version 12.3, the internal function is named Internal`StringToMReal.
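
If the same code has to run on both sides of that rename, a small guard can pick the function by version. This is only a sketch: toReal is a name made up here, the 12.3 threshold is taken from the note above, and both Internal` functions are undocumented, so their behavior may change between releases.

(* toReal is a hypothetical helper name; the 12.3 cutoff comes from the note above *)
toReal = If[$VersionNumber >= 12.3, Internal`StringToMReal, Internal`StringToDouble];
d1 = toReal /@ dataReal; // AbsoluteTiming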

John
Murta
1

Well, yours is better and faster, but I'll post this anyway. This is the best I can do for the moment. For 10^6 values it's about 15% faster than ToExpression:

First@ImportString[
    StringReplace[StringDrop[StringDrop[ToString[dataReal], 1], -1], 
     "," -> ""], "Table"]; // AbsoluteTiming
s0rce