
I'm trying to get a list of all the variables in a symbolic expression. Some are named with Symbols, some are named with Subscript-ed symbols.

I can get a list of the Subscript-ed symbols with:

DeleteDuplicates@Cases[expr, Subscript[_Symbol, _Symbol], Infinity]

I can get a list of Symbols with:

DeleteDuplicates@Cases[expr, _Symbol, Infinity]

The problem with that second Cases statement is that it returns subscripted symbols in two parts: the "base" and the "subscript". That is, Subscript[a,b] yields two symbols, {a,b}.

How do I construct a pattern such that I get back a list of all the symbolic variables of an expression, whether or not they're represented as subscripted? For example, which pattern makes

DeleteDuplicates@Cases[a+Subscript[b,c]*Subscript[d,e]^f,pattern,Infinity]

return {a,Subscript[b,c],Subscript[d,e],f}?

Omegaman
  • What should it return on (for example) Subscript[f[t + t1], e + k]? – Dr. belisarius Aug 10 '15 at 04:45
  • @belisarius, I'm not so concerned with forms like that as I don't expect them in the input. The only place I have Subscript heads is when it's a subscripted variable such as Subscript[r,bat] where r and bat are unassigned symbols. – Omegaman Aug 10 '15 at 04:51

3 Answers


This is the very definition of a kluged-together solution (although I like the idea of replacement rules with side-effects), and it's not well-tested, but here's something that works for your example.

expr = a + Subscript[b, c]*Subscript[d, e]^f;

DeleteDuplicates@Flatten@Reap@Cases[
  expr /. Subscript[a_Symbol, b_Symbol] :> (Sow[Subscript[a, b]]; temp[])
  , _Symbol
  , Infinity
 ]
(* {a, f, Subscript[b, c], Subscript[d, e]} *)

The replacement rule first swaps each subscripted symbol for temp[] (which will not match _Symbol) while Sowing the original Subscript expression. Cases then finds the remaining plain symbols, Reap collects the sown Subscripts, Flatten merges everything into one list, and finally DeleteDuplicates removes repeats (Union works as well, as in belisarius's answer).
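To look at the two halves separately, the one-liner can be unrolled into steps. This is only an illustrative sketch: the names replaced, sown, and plain, and the pattern names x and y, are introduced here and are not part of the original answer, and temp is assumed to have no definition.

```mathematica
expr = a + Subscript[b, c]*Subscript[d, e]^f;

(* Step 1: swap each subscripted variable for temp[] and collect it with Sow. *)
(* Reap returns {replaced expression, {list of sown items}}. *)
{replaced, sown} = Reap[
   expr /. Subscript[x_Symbol, y_Symbol] :> (Sow[Subscript[x, y]]; temp[])];

(* Step 2: only the plain symbols are left for Cases to find. *)
plain = Cases[replaced, _Symbol, Infinity];

(* Step 3: merge both lists and drop repeats. *)
DeleteDuplicates@Flatten[{plain, sown}]
```

For the example expression this should give the same four variables as the one-liner above.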

Just as belisarius said about his answer, this is probably not the best solution, but matching a Symbol while not matching the Symbol when it has a particular Head is something I don't know how to do easily.

Update

We could also use the mysterious vanishing function:

DeleteDuplicates@Flatten@Reap@Cases[
  expr /. Subscript[a_Symbol, b_Symbol] :> (Sow[Subscript[a, b]]; ##&[])
  , _Symbol
  , Infinity
 ]
(* {a, f, Subscript[b, c], Subscript[d, e]} *)

The interesting bit here is

expr /. Subscript[a_Symbol, b_Symbol] :> (Sow[Subscript[a, b]]; ##&[])
(* a + f *)

The mysterious ##&[] evaluates to Sequence[], which removes all mention of the Subscripts. With its other factors gone, the product collapses, and f takes the place of that entire term.
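A small sketch of why the term collapses: ##&[] is a pure function that returns its (empty) argument sequence, so it disappears when it sits inside an ordinary head. The head h below is a hypothetical undefined symbol used only for illustration.

```mathematica
(* ## &[] evaluates to Sequence[], which is spliced out of an ordinary head. *)
h[1, ## &[], 2]          (* h[1, 2] *)

(* Power and Times with a single remaining argument reduce to that argument, *)
(* which is how Subscript[b, c]*Subscript[d, e]^f ends up as just f. *)
Power[## &[], f]         (* f *)
Times[## &[], f]         (* f *)
```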

march
  • I like this best as it appears to make only one pass through the expression. I don't know if that's actually more efficient, but it feels like it would be. I borrowed from @belisarius to replace your use of temp[] with Unique[][] because it feels less likely to step on a common function name. The vanishing function gives me something to study up on... – Omegaman Aug 11 '15 at 06:41
  • @Omegaman. That's why I like the vanishing function idea, as instead of replacing with some arbitrary expression, it removes the symbols completely (and actually it's not all that mysterious). I also have a standard Protected head temporaryPlaceHolder that I use for things like this so as to avoid clashing with already-defined names. Thanks for the accept! – march Aug 11 '15 at 14:14
  • …and in the latest version, the "vanishing function" is now called Nothing. – J. M.'s missing motivation Aug 14 '15 at 02:30
  • Both solutions may lead to incorrect answers depending on the value of expr. For example, the one that uses temp[] will fail for expr = (Subscript[a, c] - Subscript[b, c])*f while the one that uses the vanishing function will fail for expr = (1 - Subscript[b, c])*f. Using Omegaman's comment and replacing temp[] by Unique[][] succeeds in both cases by preventing (Subscript[a, c] - Subscript[b, c]) from being replaced by zero and -Subscript[b, c] from being replaced by -1. – Mauricio de Oliveira Oct 04 '17 at 19:47
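
The Unique[][] variant discussed in these comments might look like the following sketch. The function name subscriptAwareSymbols and the pattern names x and y are made up for this illustration; nobody in the thread posted this exact code.

```mathematica
(* Hypothetical helper: each subscripted variable is replaced by a fresh, *)
(* distinct placeholder (a new symbol applied to no arguments), so two *)
(* placeholders can never cancel or merge with each other. *)
subscriptAwareSymbols[expr_] := DeleteDuplicates@Flatten@Reap@Cases[
     expr /. Subscript[x_Symbol, y_Symbol] :> (Sow[Subscript[x, y]]; Unique[][]),
     _Symbol, Infinity];

(* The two problem cases from the comment above. *)
r1 = subscriptAwareSymbols[(Subscript[a, c] - Subscript[b, c])*f]
r2 = subscriptAwareSymbols[(1 - Subscript[b, c])*f]
```

Both results should contain f together with the subscripted variables of the input. The order of the elements is not something this sketch tries to control.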

Surely not the best way:

expr = a+Subscript[b,c]*Subscript[d,e]^f
pat = Subscript[_Symbol, _Symbol];
Union @@ {Cases[expr, pat, Infinity], 
          Cases[expr /. pat :> #, Except[#, _Symbol], Infinity] &@Unique[]}

(* {a, f, Subscript[b, c], Subscript[d, e]} *)
Dr. belisarius

Depending on the complexity of the actual expression, this very simple solution may work better than Cases:

expr = a + Subscript[b, c]*Subscript[d, e]^f;

Variables[expr /. x : Except[_Plus | _Times | _Subscript] :> Times @@ x]

(* ==> {a, f, Subscript[b, c], Subscript[d, e]} *)

It uses the built-in function Variables, which is designed to identify variables in polynomials. It can identify subscripted variables, too (as a side effect of its ability to return anything that isn't simple enough to be decomposed as part of the variable list). To apply it here, all you have to do is convert the expression so that it contains only addition and multiplication operations (in addition to the atomic and subscripted parts). This is what the replacement rule does: it turns anything more complicated into Times, which in this case allows the exponent in Power to be counted as a variable, too. Then the variables are identified correctly.
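
To see what Variables actually receives, the replacement rule can be applied on its own. This is just an inspection sketch; the name flattened is introduced here for illustration.

```mathematica
expr = a + Subscript[b, c]*Subscript[d, e]^f;

(* Apply only the replacement rule, without Variables. *)
(* Times @@ turns Power[Subscript[d, e], f] into Times[Subscript[d, e], f] *)
(* and leaves atoms such as a unchanged. *)
flattened = expr /. x : Except[_Plus | _Times | _Subscript] :> Times @@ x
```

The Power is gone, and the second term is now a plain product of Subscript[b, c], Subscript[d, e], and f, which is a form Variables can decompose.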

Jens
  • +1 for a practical solution that works on my example expression, but the actual expressions I'm processing are not polynomial and seem to choke Variables. – Omegaman Aug 11 '15 at 06:36