This is a follow-up to the question "Mathematica lexer: Symbols and UTF-8", this time about numbers. Mathematica offers a large number of ways to input numbers: a literal can carry a notation for the base, the precision, the accuracy, a scientific form such as *^12, or a combination of these.

I crafted a simple StringExpression that should catch most cases:

number = {DigitCharacter .., "." ~~ DigitCharacter .., 
   DigitCharacter .. ~~ "." ~~ DigitCharacter ...};
baseNumber = {HexadecimalCharacter .., "." ~~ HexadecimalCharacter ..,
    HexadecimalCharacter .. ~~ "." ~~ HexadecimalCharacter ...};
base = DigitCharacter .. ~~ "^^";
precision = "`" ~~ RepeatedNull[RepeatedNull["`", 1] ~~ number, 1];
scientific = "*^" ~~ RepeatedNull["+" | "-", 1] ~~ DigitCharacter ..;
final = {number, base ~~ baseNumber} ~~ RepeatedNull[precision, 1] ~~ 
   RepeatedNull[scientific, 1];

testMe[str_String] := StringMatchQ[str, final]

We can test this with the most common forms:

testMe /@ {"123", ".123", "123.123", "16^^aa", "16^^.aa", 
  "16^^.aa``30*^+10", "16^^0.*^3"}
(* {True, True, True, True, True, True, True} *)

Question 1: Can you find valid Mathematica numbers that return False? There are some restrictions:

  • A leading minus sign is not allowed; it will be caught as an operator later and is of no concern here.
  • I implemented only up to base 16, so funny examples like

    32^^ListLinePlot
    (* 777888725646235421 *)
    

    are unfortunately not allowed.
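
For reference, a minimal sketch of how the digit class could be widened if higher bases were ever wanted. It assumes that the digits 0-9 plus the ASCII letters are enough for bases up to 36; the names digit36, wideBaseNumber and wideBase are made up for this illustration and are not part of the patterns I actually use.

```mathematica
(* Assumption: one character out of 0-9, a-z, A-Z is a valid digit for bases up to 36 *)
digit36 = Alternatives @@ Join[CharacterRange["0", "9"],
    CharacterRange["a", "z"], CharacterRange["A", "Z"]];

(* Same three shapes as baseNumber, only with the wider digit class *)
wideBaseNumber = {digit36 .., "." ~~ digit36 ..,
   digit36 .. ~~ "." ~~ digit36 ...};
wideBase = DigitCharacter .. ~~ "^^";

StringMatchQ["32^^ListLinePlot", wideBase ~~ wideBaseNumber]
(* True *)
```

The price is that far more identifier-like text after ^^ is accepted as a number, which is one reason to stay with base 16 for now.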

Question 2: Can you find invalid number forms that return True? Restriction:

  • While 2^^abc is an invalid number, the lexer has no way to know this: when it matches the abc, it has no knowledge of the context, namely that the base is only 2.
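
To illustrate why this needs context, here is a rough sketch of a check that could run after lexing, once the whole token is available. It is not part of the lexer; digitValue and validBaseDigitsQ are hypothetical names, and the sketch assumes the string was already accepted by the pattern above and uses a base of at most 16.

```mathematica
(* Hypothetical helper: numeric value of one digit character (0-9, a-f, A-F) *)
digitValue[c_String] :=
  StringPosition["0123456789abcdef", ToLowerCase[c]][[1, 1]] - 1

(* Hypothetical check; assumes str already matched the lexer pattern *)
validBaseDigitsQ[str_String] :=
  With[{parts = StringSplit[str, "^^"]},
   If[Length[parts] == 2,
    With[{base = FromDigits[parts[[1]]],
      (* keep only the mantissa, dropping precision and exponent suffixes *)
      mantissa = First[StringCases[parts[[2]],
         StartOfString ~~ Repeated[HexadecimalCharacter | "."]]]},
     (* every digit must be smaller than the base *)
     AllTrue[Characters[StringDelete[mantissa, "."]],
      digitValue[#] < base &]],
    (* no explicit base: nothing to check *)
    True]]

validBaseDigitsQ /@ {"2^^abc", "2^^101", "16^^.aa``30*^+10", "123.123"}
(* {False, True, True, True} *)
```

The check has to see the base and the digits together, which is exactly the information a single character-class match does not have.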
halirutan