Bug introduced in 5 or earlier and persisting through 12.0


Does anyone know why

StringMatchQ["x", Except[{"1", "2"}]]

StringMatchQ["x", Except[{"*"}]]

StringMatchQ["x", {"1", "*"}]   (* without Except *) 

all work as expected, but

StringMatchQ["x", Except[{"1", "*"}]]

throws the following error:

StringExpression::invld: Element Except[{1,*}] is not a valid string 
   or pattern element in StringExpression[Except[{1,*}]]. >>
Alexey Popkov
Sjoerd C. de Vries

1 Answer

I don't think that StringMatchQ["x", Except[{"*"}]] works as expected, and neither does StringMatchQ["x", Except["*"]]:

StringMatchQ["x", Except[{"*"}]]
StringMatchQ["x", Except["*"]]
True

True

The string pattern "*" is an abbreviated string pattern consisting of the single metacharacter *, which stands for zero or more characters according to the first item under the "Details and Options" section of the StringMatchQ documentation. So by definition the pattern Except["*"] should never give a match, because it negates literally everything. Also, to match a verbatim "*" one must escape this metacharacter with a double backslash, "\\*" (same reference). So the current behavior is a bug.

Note also that escaping with a single backslash is meant for Mathematica's internal metacharacters such as < and > (and, obviously, for the more common \n, \t, \r and so on, although StringLength counts those differently), as described in this answer by John Fultz:

The modern Mathematica notebook format (introduced in 1996) was always made to be interpreted properly as a Mathematica expression should you call Get[] on it from the kernel. So this syntax was standardized, and is still used today. Now, the kernel simply ignores the \< \> delimiters as you can see below:

In[1]:= StringLength["\<x\>"]
Out[1]= 1

The following demonstrates a possible bug: the length of the string "\*" should be 2, because * is not an internal metacharacter of Mathematica and does not need to be escaped inside an ordinary string:

StringLength["*"]
StringLength["\*"]
1

1

StringLength["\x"]

Syntax::stresc: Unknown string escape \x.

2

The following is a set of examples of the expected behavior (observed with version 10.4.1):

StringMatchQ["", "*"]
StringMatchQ["", Except["*"]]
True

False

StringMatchQ["xy", {"*"}]
StringMatchQ["xy", "*"]
True

True

StringMatchQ["*", Except[{"*"}]]
StringMatchQ["*", Except["*"]]
StringMatchQ["xy", Except["*"]]
False

False

False

StringMatchQ["x", "@"]
True
StringMatchQ["x", {"\\*"}]
StringMatchQ["x", "\\*"]    
StringMatchQ["*", "\\*"]
False

False

True

The following is a set of examples of wrong behavior of Except with metacharacters:

StringMatchQ["x", Except[{"\\*"}]] (* metacharacter is correctly escaped *)
StringMatchQ["x", Except["\\*"]] (* metacharacter is correctly escaped *)

StringExpression::invld: Element Except[{\*}] is not a valid string or pattern element in Except[{\*}]. >>

StringMatchQ["x", Except[{"\\*"}]]

StringExpression::invld: Element Except[\*] is not a valid string or pattern element in Except[\*]. >>

StringMatchQ["x", Except["\\*"]]
StringMatchQ["x", Except["\*"]] (* metacharacter is wrongly escaped *)
StringMatchQ["*", Except["\*"]] (* metacharacter is wrongly escaped *)
True

True

StringMatchQ["x", Except["@"]]

StringExpression::invld: Element Except[@] is not a valid string or pattern element in Except[@]. >>

StringMatchQ["x", Except["@"]]

And here is an example of wrong (but consistent) behavior of StringMatchQ, both with Except and without it:

StringMatchQ["\*", Except["\*"]] (* metacharacter is wrongly escaped *)
StringMatchQ["\*", "\*"] (* metacharacter is wrongly escaped *)
False

True

In addition:

StringMatchQ["*", Except["\*"]]
True

Update

I've found an old MathGroup discussion on this topic, from which I learned that StringCases and StringPosition don't support abbreviated string patterns, and that Verbatim forces StringMatchQ to match two strings literally: StringMatchQ["\\*", Verbatim["\\*"]] returns True.
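As a rough sketch of the three ways of writing an asterisk pattern discussed in this answer, the following puts the abbreviated, escaped, and Verbatim forms side by side. The variable name subject is only illustrative, and no outputs are listed because the results may differ between versions; evaluate it in your own kernel to compare.

```mathematica
(* Sketch only: "subject" is a hypothetical name, and no outputs are asserted. *)
subject = "*";

(* Abbreviated string pattern: a bare "*" stands for zero or more characters. *)
StringMatchQ[subject, "*"]

(* Escaped metacharacter: "\\*" is meant to stand for a literal asterisk. *)
StringMatchQ[subject, "\\*"]

(* Verbatim: the pattern string is meant to be compared literally. *)
StringMatchQ[subject, Verbatim["*"]]
StringMatchQ["x", Verbatim["*"]]
```

If Verbatim behaves as described in the MathGroup discussion, the last two lines should tell a literal asterisk apart from an arbitrary string without any backslash escaping.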

Alexey Popkov
  • It's a bit long ago, so I don't recall what made me write "work as expected". Perhaps I was just looking at the generation of an error message. Anyway, it indeed looks like there is a bug in Mathematica's treatment of meta-characters. Nice treatment of the issue (+1). – Sjoerd C. de Vries Jun 15 '16 at 21:11