
For example,

string = "ads, 32, dv, \"sdf\"
  sd, 213, as, \"asd
  asd\"
  asd, 123 sd, \"asd\"";

The string contains three newline characters (\n). I want to match only the second one, since it is the only one inside a pair of quotes ("").

StringCases[string,RegularExpression["put regular expression here"]];

I have tried the following regex, which would work if only Mathematica allowed variable-length lookbehind assertions:

RegularExpression["(?<=,\"[^\"]*)\\n(?=[^\"]*\",)"]
C. E.
user13892

2 Answers


You can refer to the parenthesised groups of your RegularExpression on the right-hand side of a RuleDelayed (:>) as "$1", "$2", and so on.

With

string = "ads, 32, dv, \"sdf\"
  sd, 213, as, \"asd
  asd\"
  asd, 123 sd, \"asd\"";

Then

StringCases[
 string,
 RegularExpression["\"([^\"]+?\n[^\"]+?)\""] :> "$1"
 ]
{"asd
  asd"}
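
As a small illustration of referring to more than one group, here is a sketch on a made-up input (the string and the name `sample` are hypothetical, not taken from the question). The text before and after the newline is captured in separate groups:

```mathematica
(* Hypothetical input: two quoted fields, only the first contains a newline. *)
sample = "k1=\"a\nb\", k2=\"c\"";

(* Group 1 is the text before the newline, group 2 the text after it. *)
StringCases[sample,
 RegularExpression["\"([^\"]+?)\n([^\"]+?)\""] :> {"$1", "$2"}]
(* {{"a", "b"}} *)
```

The second quoted field contains no newline, so it produces no match.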

Hope this helps.

Edmund

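Here is a solution that uses lookaround assertions, so the outer quotes are not included in the matched string. Because the quotes are not consumed, each quote can act both as the closing quote of one match and as the opening quote of the next: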

StringCases["a\"b\nc\"d\ne\"", RegularExpression["(?<=\")[^\"]+?\n[^\"]+?(?=\")"]]
{"b\nc", "d\ne"} 

Compare to Edmund's solution:

StringCases["a\"b\nc\"d\ne\"", 
 RegularExpression["\"([^\"]+?\n[^\"]+?)\""] :> "$1"]
{"b\nc"}

Note that both solutions use + instead of *, since you wish to find only the second \n, which contradicts the title of the question: "regex to find \n within quotes and not ones outside quotes". Going by the title alone, the solution would use * in order to allow matching a lone newline character between quotes:

StringCases[string, RegularExpression["(?<=\")[^\"]*?\n[^\"]*?(?=\")"]]
{"\n  sd, 213, as, ", "asd\n  asd", "\n  asd, 123 sd, "} 
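
To see the difference between + and * in isolation, here is a minimal sketch on a made-up string (the name `lonely` is hypothetical) in which the quoted part consists of a single newline:

```mathematica
(* Hypothetical input: x, a quote, a newline, a quote, y. *)
lonely = "x\"\n\"y";

(* The + quantifier demands at least one non-quote character on each side of the newline. *)
StringCases[lonely, RegularExpression["(?<=\")[^\"]+?\n[^\"]+?(?=\")"]]
(* {} *)

(* The * quantifier lets both sides be empty, so the lone newline is matched. *)
StringCases[lonely, RegularExpression["(?<=\")[^\"]*?\n[^\"]*?(?=\")"]]
(* {"\n"} *)
```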

It is also worth adding that both my solution and Edmund's completely ignore the usual rules for pairing quotes, as WReach correctly notes in the comment. If proper quote matching is required, one should apply the approaches shown in the following threads:
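
Independently of those threads, here is a minimal sketch of what pairing the quotes first could look like. It assumes that the quotes are balanced and that quoted fields never contain escaped quotes. The name `quoted` is introduced only for this illustration:

```mathematica
string = "ads, 32, dv, \"sdf\"\n  sd, 213, as, \"asd\n  asd\"\n  asd, 123 sd, \"asd\"";

(* Each match consumes an opening and a closing quote, so the text between *)
(* two separate quoted fields is never mistaken for quoted content. *)
quoted = StringCases[string, RegularExpression["\"([^\"]*)\""] :> "$1"]
(* {"sdf", "asd\n  asd", "asd"} *)

(* Keep only the quoted fields that contain a newline. *)
Select[quoted, ! StringFreeQ[#, "\n"] &]
(* {"asd\n  asd"} *)
```

Unlike the lookaround version, this does not report the text between the first and second quoted fields, because every quote is used by exactly one match.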

Alexey Popkov