1

I'm trying to understand Mathematica's pattern matching in strings, but I have some doubts.

99% of the time I need to a) check if a match occurs and b) retrieve the matched patterns.

I come from Perl, where it is easy to do both in two lines:

"big bad wolf" =~ / (.*) /;
print "$1\n";
bad

In Mathematica is it possible to do it with just one function call? For now I'm able to do

a = "big bad wolf"
StringMatchQ[a, RegularExpression["big .* wolf"]]
True
StringCases[a, RegularExpression[" (.*) "] -> "$1"]
{"bad"}

As a 2nd question -- how can I retrieve more than one pattern, e.g. from this call:

StringCases[a, RegularExpression["(.*) .* (.*)"]]
m_goldberg
alessandro
  • Quite likely I am misunderstanding your first question, but since you know RegularExpression[] and StringCases[] already, why do they not suit your needs? If nothing is matched, you end up with an empty list. – J. M.'s missing motivation Jun 04 '15 at 08:33
  • Sorry, I wasn't clear... I just wanted to know if Mathematica has a single function that returns False on no match, and {$1, $2, ...} in case of a match. – alessandro Jun 04 '15 at 08:38
  • So, something like With[{s = StringCases[(* stuff *)]}, If[s =!= {}, s, False]]? – J. M.'s missing motivation Jun 04 '15 at 08:44
  • Mathematica's regular expression matching capabilities are not up to Perl's, but it does handle match groups. You need to make a careful reading of the full documentation article on RegularExpression. Frankly, for heavy-duty reggae work, I prefer Ruby to Mathematica. – m_goldberg Jun 04 '15 at 13:35
  • @m_goldberg reggae work? – Mr.Wizard Jun 04 '15 at 13:59
  • @Mr. Wizard -- that's when m_goldberg is programming for Bob Marley. – bill s Jun 04 '15 at 14:33
  • @Mr.Wizard, et al. Hate to spoil the fun, but -- alas -- I intended to write 'regex work' (you probably already guessed). At least my conscious mind so intended. The real control center evidently had other things in mind. Never programmed for Marley -- probably missed a good gig :-) – m_goldberg Jun 05 '15 at 02:32
  • @m_goldberg My actual guess was industry jargon of some sort. As for "real control center" I find that amusing; at one point I noticed that nearly all my typos were actual words. Fingers seem to have a mind of their own at times. – Mr.Wizard Jun 05 '15 at 04:18

2 Answers

5

This is a minor variation on the last example in the Examples > Generalizations & Extensions section of RegularExpression,

StringCases["big bad wolf", RegularExpression["(.*) .* (.*)"] -> {"$1", "$2"}]
{{"big", "wolf"}}

Does this work for you?
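
If you want a single call that returns False on no match and the captured groups otherwise, the approach above can be wrapped in a small helper. This is only a sketch: matchGroups is a hypothetical name, not a built-in, and it assumes you only care about the first match.

```mathematica
(* matchGroups is a hypothetical helper, not part of Mathematica. *)
(* It returns False when nothing matches, otherwise the captured *)
(* groups of the first match as a flat list. *)
matchGroups[s_String, re_String, groups_List] :=
  With[{m = StringCases[s, RegularExpression[re] -> groups, 1]},
    If[m === {}, False, First[m]]]

matchGroups["big bad wolf", "(.*) .* (.*)", {"$1", "$2"}]
(* {"big", "wolf"} *)

matchGroups["big bad wolf", "(\\d+)", {"$1"}]
(* False *)
```

The third argument of StringCases limits the result to the first match, and First strips the outer list, so a successful call gives a flat list of groups.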

m_goldberg
  • Perfectly, thanks! I'm just surprised it doesn't return {"big", "wolf"}: why is it one level deeper? – alessandro Jun 05 '15 at 08:37
  • @alessandro That is because of the right hand side of the Rule; if you use RegularExpression["(.*) .* (.*)"] -> foo["$1", "$2"] you will see foo instead. If you want a flat list you could use RegularExpression["(.*) .* (.*)"] -> Sequence["$1", "$2"] or Flatten @ StringCases[ . . . ]. – Mr.Wizard Jun 05 '15 at 13:50
3

I think you might be looking for a pattern structure like this:

str = "big bad wolf";
StringCases[str, "big " ~~ x__ ~~ " wolf" -> x]

which returns {"bad"}, or whatever happens to lie between "big " and " wolf". If there is no match, you get an empty list {}.
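
The same idea extends to more than one captured piece, which covers the second question. A minimal sketch, assuming the words are separated by single spaces; it uses :> rather than -> so that any global values of x or y do not interfere:

```mathematica
str = "big bad wolf";

(* Name each piece you want to keep and return the names on the right-hand side. *)
StringCases[str, x__ ~~ " " ~~ __ ~~ " " ~~ y__ :> {x, y}]
(* {{"big", "wolf"}} *)
```

As with the RegularExpression version, each match produces one inner list, so the result is one level deeper than a flat list of pieces.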

bill s
  • Yes, I simply use RegularExpression because, frankly, Wolfram's syntax for RE's is so ugly! (personal opinion :-) – alessandro Jun 04 '15 at 13:23
  • @alessandro The Mathematica string pattern syntax is an extension of the native pattern syntax; its strength is that it is familiar and very flexible (e.g. extensible with Condition), but like most aspects of Mathematica it is a bit clumsy with Strings. To the extent possible Mathematica actually converts string patterns into regular expressions, so if you are comfortable writing them directly you might as well do that, with one caveat; see "Why is StringExpression faster than RegularExpression?" – Mr.Wizard Jun 04 '15 at 15:49
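
To illustrate the "extensible with Condition" remark in the last comment, here is a small sketch. The length test is an arbitrary example chosen for illustration; it is not something from the thread.

```mathematica
(* Only accept the match when the captured middle part is exactly 3 characters long. *)
StringCases["big bad wolf",
  ("big " ~~ x__ ~~ " wolf") /; StringLength[x] == 3 :> x]
(* {"bad"} *)

(* Here the middle part is 4 characters long, so the condition rejects the match. *)
StringCases["big ugly wolf",
  ("big " ~~ x__ ~~ " wolf") /; StringLength[x] == 3 :> x]
(* {} *)
```

The test after /; is ordinary Mathematica code, which is the kind of flexibility a plain regular expression does not offer directly.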