Simplification depends on the names of variables

Question

I am sure this issue has been discussed here, but it has never been asked whether this behavior is a bug. Therefore my direct question to the community: do you think this behavior is a bug?

poly = a1 b1 e1 + a2 b1 e1 + a3 b1 e1 + a1 b2 e1 + a2 b2 e1 + a3 b2 e1 + b1 c1 x + c1 d1 x + c1 x z;
FullSimplify[poly]
Out[1]= a3 b1 e1 + a3 b2 e1 + a1 (b1 + b2) e1 + a2 (b1 + b2) e1 + b1 c1 x + c1 d1 x + c1 x z
FullSimplify[poly /. x -> a]
Out[2]= (a1 + a2 + a3) (b1 + b2) e1 + a c1 (b1 + d1 + z)

The LeafCount is 33 and 17, respectively. How can one achieve a consistent simplification that is independent of the names of the variables?

Notably, it doesn't simplify if you do x->t or x->s or x -> F, but does simplify if you choose x -> ω, x -> ℵ, x -> C[1], for example. — flinty, Jul 28 '20 at 13:38
I do not consider it a bug. Expressions must be put in canonical forms for manipulation. To remove all dependencies due to variable name ordering would require at least some calculations always be done with all name orderings. This would be a huge performance cost with little overall benefit. On a case-by-case basis, each user can determine if they wish to pay this cost. If so, they can do variable name substitutions to find the optimum ordering, then reverse substitute for the final result. — Bob Hanlon, Jul 28 '20 at 14:27
If I had to hazard a guess, there's probably an internal hash table that pseudo-randomly influences the order in which Mathematica chooses which reductions to make. The usual ASCII alphanumeric symbols, and extended characters such as î ô û ø, seem to produce more variation, while Unicode characters above this range show almost no variation at all as far as I can tell. — flinty, Jul 28 '20 at 14:30
@BobHanlon This goes well beyond the ordering issue. Notice the LeafCount. — yarchik, Jul 28 '20 at 14:31
@BobHanlon Given the huge difference in the LeafCount, I would even claim that we are talking about simplification vs. no simplification at all, depending on the variables' names. Lack of consistency is a bug, isn't it? — yarchik, Jul 28 '20 at 14:37
@yarchik the final LeafCount or more generally the ComplexityFunction is the criterion by which the optimal ordering of the variables is determined. More clearly, the actual variables are substituted with a set of ordered replacements which effectively reorders the original variables. Each set of substitutions is used to measure the ComplexityFunction of the corresponding result. The substitution (ordering of the original variables) that produces the optimum result is used. That result then has the reverse substitution applied to express the result in the original variables. — Bob Hanlon, Jul 28 '20 at 14:53
@yarchik - results differ based on variable names. Difference is due to ordering of the variable names which influences the canonical form of expressions which influences recognition of forms in the rule-based substitutions that drive calculations. To get the optimal result, optimal ordering of the names resulting in optimal canonical forms of expressions is required. Reordering of the variables can only be accomplished by temporary substitution until the optimum result is found. The ComplexityFunction of final result before and after the return to the original variables will not change. — Bob Hanlon, Jul 28 '20 at 15:12
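The substitute, simplify, and reverse-substitute procedure described in the comments above could be sketched roughly as follows. This is only an illustration, not a built-in feature: the name `bestOrderSimplify`, the placeholder symbol `s`, and the `tries` parameter are all hypothetical. Because trying every ordering of n variables costs n! calls to FullSimplify, the sketch samples a limited number of random orderings instead.

```mathematica
(* Hypothetical helper, not part of Mathematica: try several variable orderings *)
ClearAll[bestOrderSimplify];
bestOrderSimplify[expr_, vars_List, tries_Integer : 20] :=
 Module[{s, slots, orderings, candidates},
  (* placeholders s[1], ..., s[n] whose canonical order is known *)
  slots = Array[s, Length[vars]];
  (* the identity ordering plus `tries` random orderings *)
  orderings = Prepend[Table[RandomSample[slots], tries], slots];
  (* substitute, simplify, then substitute back for each ordering *)
  candidates = Table[
    FullSimplify[expr /. Thread[vars -> perm]] /. Thread[perm -> vars],
    {perm, orderings}];
  (* keep the candidate with the smallest LeafCount *)
  First[MinimalBy[candidates, LeafCount]]
 ]
```

It would be called as, for example, `bestOrderSimplify[poly, {a1, a2, a3, b1, b2, c1, d1, e1, x, z}]`. The result can vary between runs because the orderings are sampled at random, and a different measure could be passed to `MinimalBy` in place of `LeafCount`.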
It would be nice if Mathematica picked the ordering based on the order the symbols first appear in the expression rather than the lexicographic ordering. That way it would be consistent under a replacement as above. Maybe you could emulate this in a function OrderedFullSimplify that replaces all symbols with s[1],s[2],s[3],...,s[n], performs the FullSimplify, then puts back the original symbols. — flinty, Jul 28 '20 at 16:21
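A minimal sketch of the OrderedFullSimplify idea suggested above might look like this. It assumes the caller supplies the variables explicitly in the desired order: Plus and Times are Orderless, so an evaluated expression is already sorted canonically and the order in which the symbols were typed cannot be recovered from it. The function name `orderedFullSimplify` and the local placeholder `s` are hypothetical.

```mathematica
(* Hypothetical helper, not part of Mathematica *)
ClearAll[orderedFullSimplify];
orderedFullSimplify[expr_, vars_List, opts : OptionsPattern[]] :=
 Module[{s, slots, forward, backward},
  (* s[1], s[2], ... sort in the order given by vars *)
  slots = Array[s, Length[vars]];
  forward = Thread[vars -> slots];
  backward = Thread[slots -> vars];
  (* simplify in terms of the placeholders, then restore the original names *)
  FullSimplify[expr /. forward, opts] /. backward
 ]
```

With this sketch, `orderedFullSimplify[poly, {a1, a2, a3, b1, b2, c1, d1, e1, x, z}]` and `orderedFullSimplify[poly /. x -> a, {a1, a2, a3, b1, b2, c1, d1, e1, a, z}]` pass the same placeholder expression to FullSimplify, so the two results should differ only by the renaming. Whether the result is the shorter form still depends on the ordering chosen.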
