Let's say I have a piece of code:

Hold[{code1,
      "asdad " <> ToString[testa] <> " adsd " <> ToString[testb],
      code2}] (*MWE ofc*)

which I want to convert. Each StringJoin[...] should be replaced so I will get:

Hold[{code1, StringForm["asdad `` adsd ``", testa, testb], code2}]
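
As a quick sanity check that the two forms are meant to render the same text, here is a minimal sketch. The values 1 and 2 are hypothetical stand-ins for testa and testb and are not part of the original code:

```mathematica
(* hypothetical stand-in values, for illustration only *)
joined = "asdad " <> ToString[1] <> " adsd " <> ToString[2];
formed = ToString[StringForm["asdad `` adsd ``", 1, 2]];

joined === formed  (* both should be the string "asdad 1 adsd 2" *)
```

Each `` placeholder in the StringForm template is filled by the next argument in order, which is why the ToString[...] pieces can be moved out of the string and into the argument list.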

I have an answer but maybe one may show shorter approach:

Hold[{code1, "asdad " <> ToString[testa] <> " adsd " <> ToString[testb], code2}
    ] /. HoldPattern[StringJoin[x__]
                    ] :> RuleCondition@(
          StringForm[StringJoin @@ (Hold[x] /. _ToString :> "``"), 
                     ##] & @@ Cases[Hold[x], HoldPattern[ToString[z_]] :> z]
          ) // InputForm
 Hold[{code1, StringForm["asdad `` adsd ``", testa, testb], code2}]

Edit: This solution is not perfect. It evaluates testa and testb. That does not matter in my case, but for generality let's assume they must not be evaluated. Also, referring to the first of Mr.Wizard's suggestions: StringJoin expressions may appear at different levels too.

Kuba
  • Should all StringJoin objects be replaced, or only those at level 2, or only certain ones by position? – Mr.Wizard Jan 17 '14 at 13:54
  • @Mr.Wizard Let's say we don't know on which level they may appear. But if you have a neat solution for the simpler case, I'm looking forward to seeing it too. – Kuba Jan 17 '14 at 13:57
  • I was trying to solve this on my own, without reading your solution, but I struggled to get evaluation correct so I looked at your method to see how you had solved it. I discovered that your code does not work properly: testa and testb get evaluated. Is that acceptable? – Mr.Wizard Jan 17 '14 at 14:22
  • I think I'm remembering something. Please tell me, what are the Attributes of StringForm on your system? – Mr.Wizard Jan 17 '14 at 14:34
  • @Mr.Wizard only Protected. Hmm, it seems they are, I've missed that because it does not make a difference for my purposes. – Kuba Jan 17 '14 at 14:36
  • Okay. Would you make clear in the question whether it does or does not matter? I'll adjust my code to match. – Mr.Wizard Jan 17 '14 at 14:37
  • @Mr.Wizard ok, done. let me eat the dinner and I will check answers :) – Kuba Jan 17 '14 at 16:05

3 Answers

I think that in general, for tasks like this one, tricks like the Trott-Strzebonski technique are not the best way, and one really needs expression parsers, which may not be shorter, but are more readable and more extensible. Here is a possible one for your problem:

ClearAll[convert];
SetAttributes[convert, {HoldAll}];
convert[x_List] := Map[convert, Unevaluated[x]];
convert[Hold[{pieces___}]] :=
   (Hold[#] &[convert[{pieces}]]) /. convert[x_] :> x;
convert[s_StringJoin] := convertSJ[s];

where the specific converter for StringJoin is:

ClearAll[convertSJ];
SetAttributes[convertSJ, HoldAll];
convertSJ[s_StringJoin] := convertSJ[s, {}];
convertSJ[StringJoin[prev__String, ToString[x_], rest___], {accum___}] :=
   convertSJ[StringJoin[prev, " `` ", rest], {accum, x}];
convertSJ[s_StringJoin, {accum___}] := 
   With[{st = s}, convert[StringForm[st, accum]]];

So that

convert[
  Hold[{code1, "asdad " <> ToString[testa] <> " adsd " <> ToString[testb], code2}]
]

(* Hold[{code1, StringForm["asdad  ``  adsd  `` ", testa, testb], code2}] *)
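
The Hold[#] &[...] step in the rule for Hold relies on ordinary evaluation order: the argument of the pure function is evaluated first and only then wrapped in Hold. A minimal sketch of that idiom, using 1 + 1 as a stand-in for convert[{pieces}] (the names h1, h2, h3 are only for illustration):

```mathematica
h1 = Hold[1 + 1];        (* Hold[1 + 1]: the argument is never evaluated *)
h2 = Hold[#] &[1 + 1];   (* Hold[2]: 1 + 1 evaluates, then the result is wrapped *)
h3 = Hold @@ {1 + 1};    (* Hold[2]: the list evaluates first, then List becomes Hold *)

{h1, h2, h3}
```

In convert, this is what lets convert[{pieces}] run before the result is sealed inside Hold again.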
Leonid Shifrin
  • I was about to ask "why" but then I remembered we've been over this before. Frankly I still don't agree, as that convert with all its cases seems more confusing to me than a single replacement rule that carries out a specific action. "Agree to disagree" I guess. – Mr.Wizard Jan 17 '14 at 15:53
  • @Mr.Wizard The big advantage of convert is that its rules are composable, which means that, for more complex expressions, it will automatically dispatch to the right rule for a given part, and then combine them all together again. The difference in power between a plain rule-based approach and one based on parsers is much like the difference between a parser and a regexp for string parsing (with the caveat that patterns in M are more powerful than regexps). The more complex the original expression and the transformation rules, the more this difference will show. – Leonid Shifrin Jan 17 '14 at 16:02
  • @Mr.Wizard Another reason why I prefer this approach is that in every single rule in convert, I can focus on only what this particular rule should do. In a single rule a-la your solution or others, all these transformations are coupled together by the evaluation control constructs, while in my approach evaluation control does not couple the transformation together. If I want to change that rule later, for example, I don't have to care what else such a change might break. – Leonid Shifrin Jan 17 '14 at 16:05
  • The first point I think I understand, but I'll think about it (again). If one knows from the start that this operation will not transmogrify into something more complicated do you still feel that a parser is the best approach? How do you decide when it is appropriate to leave behind simple rules and start using a parser? Do you have to write out your parser code by hand every time, or do you have some meta-code for this? – Mr.Wizard Jan 17 '14 at 16:12
  • Would you please amplify "In a single rule a-la your solution or others, all these transformations are coupled together by the evaluation control constructs, while in my approach evaluation control does not couple the transformation together." Again, I think I understand what you're saying, and I think I start to see the superiority of the parser, but I need a lot of help I guess. :^) – Mr.Wizard Jan 17 '14 at 16:12
  • @Mr.Wizard Yes, I think the parser approach is superior even in that case, because I find it easier to understand. In most (but not all) cases, I started to view the TZ trick as a hack, and when you need something more complicated than what TZ gives out of the box (in terms of evaluation), you need more hacks to make it all work without evaluation leaks. The parser approach is straightforward: you descend down the expression, transform, and then collect the expression back, with evaluation control taken care of explicitly in each case. – Leonid Shifrin Jan 17 '14 at 16:16
  • @Mr.Wizard And if there is an evaluation leak, it is localized to a single rule, which you can fix. And again, it allows you to focus on a single rule at a time, and then the problem of evaluation control does not get worse when you deal with more complex rules / expressions. – Leonid Shifrin Jan 17 '14 at 16:18
  • You have almost convinced me but I still have trouble finding this easy to read. Would you consider posting a self Q&A with a graphical illustration of a simple parser? I think I lack a mental image that allows me to "see" this operation. I hope you understand what I am trying to say. – Mr.Wizard Jan 17 '14 at 16:27
  • Also, any reason for (Hold[#] &[convert[{pieces}]]) rather than the shorter Hold @@ {convert[{pieces}]}? (Or even Hold @@ convert[{pieces}]?) Is the single argument (use of Slot) significant? – Mr.Wizard Jan 17 '14 at 16:28
  • @Mr.Wizard Re: self-answer - would like to, but right now my time here is over, need to finish something. Hope to remember to do this later. Re: Hold: your first version is a possible alternative, I am just used to Hold[#]& idiom, and I like it because it makes it explicit that we want to leak evaluation here. Your last option (Hold @@ ...) will leak evaluation. – Leonid Shifrin Jan 17 '14 at 16:37
  • @Mr.Wizard Ok, one last thing I did for now was to factor out the conversion for StringJoin pieces, since it is in fact not really coupled with the rest of convert. This seems to make the code a bit easier to read. – Leonid Shifrin Jan 17 '14 at 16:51
  • Thanks Leonid. I do hope you have time for the Q&A later. By the way, if Hold @@ convert[{pieces}] leaks doesn't the following rule /. convert[x_] :> x cause the same problem? – Mr.Wizard Jan 17 '14 at 16:52
  • @Mr.Wizard I will really try to remember posting this, since this seems to be an important topic, and I feel that expression parsers are a powerful technique. Re: convert - the rule convert[x_]:>x does not leak because by the time it is applied, the expression is dressed back in Hold. It is implicitly assumed here that Hold is the top-level wrapper of the entire expression, and the starting (and ending) point of the application of convert. – Leonid Shifrin Jan 17 '14 at 16:56
  • btw, congrats on 50k! Half way there to a 100k :) – rm -rf Jan 23 '14 at 05:49
  • @rm-rf Thanks, man :). Now I joined the 50K club :) – Leonid Shifrin Jan 23 '14 at 10:47
  • @LeonidShifrin May I ask additional questions? Let's say there is "1/2" in the row, but in 2D form. Is there any way to avoid conversion to "(!(...", while preserving input form of possible Style["1/2", Bold]? I can ask a separate question later if you want. – Kuba Apr 30 '14 at 12:25
  • @Kuba Yes, please ask a separate one. I am currently out of context for this problem, and the cost of context switching is too high for me at the moment. – Leonid Shifrin Apr 30 '14 at 13:51
  • @LeonidShifrin Of course :) I will do this in near future. Good luck with your current duties :) – Kuba Apr 30 '14 at 14:07

An approach with an embedded Trott-Strzebonski method. rep[expr, held, from -> to, f] works by:

  • holding held symbol (like ToString);
  • replacing symbol from with to (like StringJoin -> StringForm), partially evaluating arguments that are not held...
  • ...applying function f to arguments not held in from.

It leaves the code... and test... parts unevaluated, as expected.

rep[expr_, held_, from_ -> to_, f_] := Block[{z}, 
   expr //. {held -> HoldForm, from -> z} /. (z[y__] :> Block[{},
     to @@ MapThread[#1 @@ (#2 /@ #3) &,
       {{from, Sequence}, {f, #&}, GatherBy[{y}, Head]}] /; True]) //. HoldForm->held];

{code1, code2} = {11, 22};
testa := (Print@"A"; 2);
testb := (Print@"B"; 4);

expr = Hold[{code1, "asdad " <> ToString[testa] <> " adsd " <> ToString[testb], code2}];
rep[expr, ToString, StringJoin -> StringForm, #<>"``" &] // InputForm
Hold[{code1, StringForm["asdad `` adsd ``",
     ToString[testa], ToString[testb]], code2}]

It is general enough to deal with other types of replacements:

expr = Hold[{code1, 1 + N@testa + 3 + N@testb, code2}];
rep[expr, N, Plus -> g, f] // InputForm
Hold[{code1, g[f[1] + f[3], N[testa], N[testb]], code2}]
István Zachar
  • @Mr.Wizard, Kuba Please see new version. – István Zachar Jan 17 '14 at 15:11
  • The new version does not prevent evaluation of testa and testb, but neither does Kuba's original. My rule code seems a good deal simpler in this case. Also your code has ToString[testa] rather than bare testa, which does not behave the same; consider for example if testa is a "2D" expression. This could be an advantage or a disadvantage depending on what Kuba wants. – Mr.Wizard Jan 17 '14 at 15:25
  • @Mr.Wizard Yeah, I've realized that too. For the moment, I don't have time to figure out something clever. If I cannot do better tomorrow, I'll delete it. – István Zachar Jan 17 '14 at 16:19
  • @Mr.Wizard Figured out something. – István Zachar Jan 17 '14 at 16:24

This is what I came up with. Better? I don't know.

convert =
 Block[{StringForm},
   SetAttributes[StringForm, HoldRest];
   # /. sj_StringJoin :> RuleCondition[
      Reap[Unevaluated[sj] /. (t : ToString)[x_] :> (Sow[Hold@x]; " `` "), _, 
        Join @@ #2 &] /. {s_, {_[ex__]}} :> StringForm[s, ex]]
 ] &;

Test:

start = Hold[{code1, "asdad " <> ToString[testa] <> " adsd " <> ToString[testb], code2}];
{code1, code2} = {0, 0};
testa := 2 + 2
testb := Print["!"]

convert @ start // InputForm
Hold[{code1, StringForm["asdad  ``  adsd  `` ", testa, testb], code2}]

If we don't need to prevent evaluation of testa and testb, which Kuba's code does not do, we can simplify this considerably:

rule = sj_StringJoin :>
  RuleCondition[
   StringForm[#, Sequence @@ #2[[1]]] & @@
    Reap[Unevaluated[sj] /. ToString :> ((Sow[#]; " `` ") &)]
  ];

start /. rule // InputForm

During evaluation of In[]:= !

Hold[{code1, StringForm["asdad  ``  adsd  `` ", 4, Null], code2}]
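
The difference between the two versions comes down to whether the head built inside RuleCondition holds its arguments. A minimal sketch with the hypothetical symbols f, g and t (none of them appear in the code above), relying on the same undocumented RuleCondition used there:

```mathematica
ClearAll[f, g, t];
t := (Print["t evaluated"]; 2);

(* g has no hold attributes, so t is evaluated while g[t] is built *)
leaky = Hold[f[t]] /. f[x_] :> RuleCondition[g[x]];

(* with HoldAll on g, the injected argument stays unevaluated *)
SetAttributes[g, HoldAll];
held = Hold[f[t]] /. f[x_] :> RuleCondition[g[x]];

{leaky, held}  (* expected: {Hold[g[2]], Hold[g[t]]} *)
```

This mirrors the Block[{StringForm}, SetAttributes[StringForm, HoldRest]; ...] step in the first version, where the temporary attribute keeps testa and testb from evaluating when StringForm[s, ex] is constructed.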
Mr.Wizard
  • This attribute is quite nice. I'm struggling with injecting an unevaluated sequence in a different way, but I got stuck now. :p p.s. too many whitespaces: " " -> "". Thanks again. – Kuba Jan 18 '14 at 00:24