I need a string pattern that matches a substring which does not contain a specified string, while still matching only the shortest possible sequences.
For example, strings bracketed by [] but not containing "o":
string = "[the][quick][brown][fox][jumps][over][the][lazy][dog]";
StringCases[string, x : Shortest["[" ~~ __ ~~ "]"] /; StringFreeQ[x, "o"]]
{"[the]", "[quick]", "[jumps]", "[the]", "[lazy]"}
However, this is a poor solution: when the condition fails, the pattern matcher keeps lengthening the candidate string rather than moving on to the next possible match, which causes a tremendous slowdown on longer strings. To illustrate:
Needs["GeneralUtilities`"]
StringCases[string, x : Shortest["[" ~~ __ ~~ "]"] /; StringFreeQ[Echo@x, "o"]]
[the]
[quick]
[brown]
[brown][fox]
[brown][fox][jumps]
[brown][fox][jumps][over]
[brown][fox][jumps][over][the]
[brown][fox][jumps][over][the][lazy]
. . .
What I actually want can be accomplished with a second filtering operation:
StringCases[string, "[" ~~ Shortest[__] ~~ "]"];
Pick[%, StringFreeQ[%, "o"]]
{"[the]", "[quick]", "[jumps]", "[the]", "[lazy]"}
I would prefer to do this in a single pass.
When the prohibited string is a single character, Except provides a solution:
StringCases[string, "[" ~~ Shortest[Except["o"] ..] ~~ "]"]
{"[the]", "[quick]", "[jumps]", "[the]", "[lazy]"}
However, at least in Mathematica 10.1, Except does not work with a string of more than one character:
StringCases[string, "[" ~~ Shortest[Except["ox"] ..] ~~ "]"]
StringExpression::invld: Element Shortest[Except[ox]..] is not a valid string or pattern element in [~~Shortest[Except[ox]..]~~]. >>
In practice my delimiters are multi-character as well, so I cannot work around this with something like Except["]"] .. either.
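To make the multi-character case concrete, here is a minimal sketch using the two-pass approach from above. The delimiters "<<" and ">>", the prohibited string "ox", and the names string2 and matches are stand-ins chosen only for illustration.

```mathematica
(* Hypothetical test string with multi-character delimiters "<<" and ">>" *)
string2 = "<<the>><<quick>><<brown>><<fox>>";

(* First pass: shortest delimited substrings *)
matches = StringCases[string2, "<<" ~~ Shortest[__] ~~ ">>"];

(* Second pass: keep only the matches that do not contain "ox" *)
Pick[matches, StringFreeQ[matches, "ox"]]
(* {"<<the>>", "<<quick>>", "<<brown>>"} *)
```

This gives the result I am after, but it still takes two passes, and neither delimiter can be excluded with a single-character Except.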
Is there another approach that is fast and doesn't require a second filtering pass in one form or another?