1

The following code generates a dataset from a table within a webpage:

ClearAll["Global`*"];
resultsByTurbineTypeRaw = Import["https://www.vestas.com/en/products/track_record#!results-by-turbine-type", "Data"];
position = FirstPosition[resultsByTurbineTypeRaw, "Wind Turbine"]
resultsByTurbineTypeRaw[[2, 6, 1, 1]]
resultsByTurbineType1 = resultsByTurbineTypeRaw[[2, 6, Range[20]]];
first = First[resultsByTurbineType1]
rest1 = Rest[resultsByTurbineType1];
listB = Join[{first}, rest1];
vestasOrdersByTurbineType = Dataset[AssociationThread[First@listB, #] & /@ Rest@listB]

Running the code shows that the values in the "Quantity" and "Total MW" columns sometimes contain commas (as thousands separators), but not always.

How can I write a StringReplace call that deletes the commas and yields Numbers rather than Strings, producing a new version of the dataset without commas, without failing on Strings that contain no commas?

Stuart Poss

2 Answers

3

Based on the clarification that only commas need to be removed, here is one way:

vestasOrdersByTurbineType[All,
 <|#,
   "Quantity" -> Interpreter["Number"][#"Quantity"],
   "Total MW" -> Interpreter["Number"][#"Total MW"]|> &]

InputForm@Normal@First@% (* <|"Wind Turbine" -> "Other", "Quantity" -> 34652, "Total MW" -> 23716|> *)
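
Since the imported page may change or become unreachable, the same idea can be tried on a small hand-made dataset. The following is only a sketch: the rows in sample are made up for illustration, and toNumber is a hypothetical helper that uses StringReplace (as the question asked) in place of Interpreter["Number"].

```mathematica
(* Made-up rows standing in for the imported table *)
sample = Dataset[{
    <|"Wind Turbine" -> "Other", "Quantity" -> "34,652", "Total MW" -> "23,716"|>,
    <|"Wind Turbine" -> "V80-1.8/2.0 MW®", "Quantity" -> "732", "Total MW" -> "1,574"|>}];

(* Hypothetical helper: delete commas, then read the digits as a number. *)
(* A string without commas passes through StringReplace unchanged.       *)
toNumber[s_String] := ToExpression[StringReplace[s, "," -> ""]];

(* Rebuild each row, overwriting only the two numeric columns *)
cleaned = sample[All,
   <|#,
     "Quantity" -> toNumber[#["Quantity"]],
     "Total MW" -> toNumber[#["Total MW"]]|> &];

Normal[cleaned]
(* {<|"Wind Turbine" -> "Other", "Quantity" -> 34652, "Total MW" -> 23716|>,
    <|"Wind Turbine" -> "V80-1.8/2.0 MW®", "Quantity" -> 732, "Total MW" -> 1574|>} *)
```

Because toNumber is applied only to the two named columns, the "Wind Turbine" strings are never passed to ToExpression.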

Rohit Namjoshi
  • You can also use vestasOrdersByTurbineType[All, {"Quantity" -> Interpreter["Number"], "Total MW" -> Interpreter["Number"]}] (+1) – kglr Aug 09 '21 at 07:46
  • @kglr Thanks. I did not know this nice shortcut. – Rohit Namjoshi Aug 09 '21 at 14:32
  • Works like a charm. Thanks for providing a new way to approach dataset manipulation. Can kglr explain the (+1) syntax? I haven't seen this construction before. Where is it documented? – Stuart Poss Aug 09 '21 at 17:04
  • The (+1) is just the way @kglr indicated that he upvoted the answer. It is used frequently on MSE. – Rohit Namjoshi Aug 09 '21 at 17:09
2
f[x_] := StringJoin[Select[Characters[x], Or[DigitQ[#], # == "."] &]]

ToExpression[f /@ {"1,234.56", "£5,432"}]

{1234.56, 5432}

Chris Degnen
  • Works nicely when the {row, column} element of the dataset is a number, but converts to Null when the element contains no digits, and returns 0. when the element is a string that only partially contains a number (e.g. the turbine type name "V90-3.0"). Not sure how to structure conditionals. – Stuart Poss Aug 07 '21 at 16:58
  • Perhaps I didn't make my question clear. I am dealing with columns of a dataset, so the function(s) need to operate over a list. I assume I can Flatten to put all entries in a single list, apply the function, and then ArrayReshape to reconstruct the new dataset. With f[x_] := StringJoin[Select[Characters[x], Or[DigitQ[#], # == "."] &]], operating on the 4 types of data here, ToExpression[f /@ {"Other", "1,574", "732", "V80-1.8/2.0 MW®"}] returns {Null, 1574, 732, 0.}. What I am looking for is a function that will return {"Other", 1574, 732, "V80-1.8/2.0 MW®"}. – Stuart Poss Aug 08 '21 at 01:17
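
A minimal sketch of a function with the behaviour requested in the last comment, offered as an illustration rather than as part of the original answer: safeNumber is a hypothetical name, and the sketch assumes that a string counts as numeric only if, once its commas are removed, the whole string matches NumberString.

```mathematica
(* Hypothetical helper: convert a string only when it is entirely numeric *)
(* after its commas are removed; otherwise return it unchanged.          *)
safeNumber[s_String] := With[{stripped = StringReplace[s, "," -> ""]},
   If[StringMatchQ[stripped, NumberString],
    ToExpression[stripped],
    s]];

(* Anything that is not a string (e.g. already a number) passes through *)
safeNumber[x_] := x;

safeNumber /@ {"Other", "1,574", "732", "V80-1.8/2.0 MW®"}
(* {"Other", 1574, 732, "V80-1.8/2.0 MW®"} *)
```

Because the test is made on the whole string, a name such as "V80-1.8/2.0 MW®" is left alone even though it contains digits, so no Flatten and ArrayReshape round trip is needed. Something like vestasOrdersByTurbineType[All, All, safeNumber] should then apply it to every cell of the dataset. Note that the sketch does not check comma placement, so a string such as "1,2" would become 12.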