5

One of the many attractions of Mathematica is that integers are generally treated as exact symbolic entities, and not just as numbers. I was just playing with Sqrt, and was puzzled by what appeared to me to be an inconsistency that must have been there since day 1, and yet which I have somehow not noticed before.

This works as expected:

Sqrt[x^2]
Sqrt[x^2]

so that this evaluates correctly:

Solve[y == Sqrt[x^2], x]
{{x -> -y}, {x -> y}}

But, if I enter exact integers, only the positive branch solution is returned:

Sqrt[9]
3

and so:

Sqrt[(-3)^2]
3

In summary:

The various comments below make it clear that:

  • Sqrt only ever returns the positive (principal) branch, by design.

  • Sqrt[x^2] is left unevaluated, so that the appropriate branch can be selected with suitable assumptions, as in: Simplify[Sqrt[x^2], x < 0] -> -x
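
As a minimal sketch of how assumptions pick the branch (only built-in functions are used; x is an arbitrary symbol with no value assigned):

```mathematica
(* Sqrt[x^2] stays unevaluated until something is known about x *)
Simplify[Sqrt[x^2], x < 0]              (* -x *)
Simplify[Sqrt[x^2], x > 0]              (* x *)
Simplify[Sqrt[x^2], Element[x, Reals]]  (* Abs[x] *)

(* PowerExpand ignores branch cuts, so its result is only valid for x >= 0 *)
PowerExpand[Sqrt[x^2]]                  (* x *)
```

In every case the result is still a single value; the assumptions only decide which closed form represents it.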

This leaves the prevailing issue: Is there perhaps a variant of Sqrt (or option) that gives both solutions that I have missed?

Why does there not seem to be a function in Mathematica that returns the mathematically rigorous solution set {negative branch, positive branch} ... rather than just returning the positive branch that seems to operate more like a 20thC pocket calculator?

For example:

Sqrt[  Interval[{-2, 0}]^2 ]
-> Interval[{0, 2}]

... is correct given how the Sqrt function is defined (to always return the positive branch), but most confusing given that we actually seek the set of possible solutions.

Many thanks for the suggestions below! It would be nice to see something like this added to the Sqrt functionality in mma.

wolfies
  • 3
    Actually it is the definition of the absolute value: $\sqrt{x^2}=\lvert x\rvert$. – Spawn1701D Apr 21 '13 at 14:17
  • 1
    But Solve[9==x^2,x] returns both solutions, so I don't think there's an inconsistency. – Cassini Apr 21 '13 at 14:49
  • FullForm[Sqrt[z]] gives Power[z, Rational[1, 2]], and in general according to the documentation for Power, the value of x^y is the principal value of Exp[y Log[x]]. Hence for a specific number z, the value of Sqrt[z] is always a single number, i.e., the function Sqrt is single-valued. – murray Apr 21 '13 at 14:59
  • 3
    Sqrt is a (single-valued) function. It returns the principal square root, which is a (complex) number with a nonnegative real part. Note that Sqrt[x^2] remains unevaluated. But, for instance, Simplify[Sqrt[x^2], Re[x] < 0] returns -x (only). – Michael E2 Apr 21 '13 at 15:00
  • 3
    "Is there perhaps a variant of Sqrt (or option) that gives both solutions that I have missed?" - no, because Sqrt[] was always intended to be a function. One input, one output. Mathematica conveniently happens to use the principal branch. – J. M.'s missing motivation Apr 21 '13 at 15:04
  • What would you expect Sqrt[x^2] to return to be consistent? It can't be Abs[x], because that is incorrect for Complex numbers. E.g. Sqrt[(-1+2I)^2] == 1 - 2I. (As I said before, Sqrt[x^2] does not technically return anything; it is left unevaluated.) – Michael E2 Apr 21 '13 at 15:27
  • 2
    "...why does Sqrt[x^2] not return Abs[x]?" - wild guess, but maybe, just maybe, because of things like Sqrt[(3 + 4 I)^2] not being equal to Abs[3 + 4 I]? – J. M.'s missing motivation Apr 21 '13 at 15:30
  • @MichaelE2 I think the default setting should be as is, but there should be an option Sqrt[blah, Reals] or some such thing that produces both solutions. – wolfies Apr 21 '13 at 15:32
  • "...there should be an option... that produces both solutions." - then it isn't really a function anymore, no? – J. M.'s missing motivation Apr 21 '13 at 15:56
  • 1
    @J.M. Does the absence of one-to-one solutions trouble you when using Solve? It seems to me that the way that Sqrt is implemented in modern computing packages has more to do with the way the old sqrt button was historically implemented on 20thC pocket calculators (which only presented the positive branch) than with the underlying mathematics which presents multiple solutions. It also causes a lot of confusion when people use the term in its general meaning, and get back the positive root, for instance: Sqrt[ Interval[{-2, 0}]^2 ] returning Interval[{0, 2}] – wolfies Apr 21 '13 at 16:20
  • Well, I know I'm doing different things when I'm solving a polynomial equation, and when I'm using a square root as part of a formula, for starters. The principal square root function is just one of the two possible solutions, conveniently chosen, of a certain quadratic equation. Would you also insist that, say, arcsine should return the infinitely many possible values it can be? – J. M.'s missing motivation Apr 21 '13 at 16:23
  • BTW, there has been a lot of discussion on this matter in math.SE. See e.g. this and this. – J. M.'s missing motivation Apr 21 '13 at 16:29
  • @J.M. Well, mathematical rigour is certainly the path that mma appears to be taking as it evolves. Compare, for instance, the solution to Solve[y == Cos[x], x], which under v8 returns {{x -> -ArcCos[y]}, {x -> ArcCos[y]}}, with what you now get under v9: {{x -> ConditionalExpression[-ArcCos[y] + 2 \[Pi] C[1], C[1] \[Element] Integers]}, {x -> ConditionalExpression[ArcCos[y] + 2 \[Pi] C[1], C[1] \[Element] Integers]}} – wolfies Apr 21 '13 at 16:30
  • 5
    There was a "debate" about this sort of thing in the late 1800s and early 1900s. Prior to that, the square root sign did represent all roots. Since then, the convention has been that functions should be single-valued and in particular that the square root should represent a single number. One problem: If $\sqrt{x^2} + \sqrt{y^2}$ can take on four forms, $\pm x \pm y$, then $\sqrt{x^2} + \sqrt{x^2}$ equals $\pm 2x$ or $0$, but $2 \sqrt{x^2}$ equals only $\pm 2x$. Thus $\sqrt{x^2} + \sqrt{x^2}$ would not be equivalent to $2 \sqrt{x^2}$! – Michael E2 Apr 21 '13 at 16:42
  • Your edited question asks why Sqrt[x^2] doesn't return Abs[x]. The simple reason is that the identity doesn't hold unless x is real. But the default assumption for symbols without numerical value is that they are complex. – Jens Apr 21 '13 at 17:04

3 Answers

3

While Mathematica follows the conventional philosophy about inverses of functions that are not one-to-one, it does provide tools for dealing with other approaches. If Solve is too cumbersome, one can write one's own versions:

root[n_Integer, expr_] := expr^(1/n) ((-1)^(2/n))^Range[0, n - 1];
root[n_Integer, Power[expr_, m_Integer]] /; Divisible[m, n] := 
  expr^(m/n) ((-1)^(2/n))^Range[0, n - 1];
sqrt[expr_] := root[2, expr];    

sqrt[9]
  (* {3, -3}  *)

sqrt[3 + 4 I]
  (* {2 + I, -2 - I} *)

sqrt[x^2]
  (* {x, -x} *)

root[3, 27]
   (* {3, 3 (-1)^(2/3), -3 (-1)^(1/3)} *)

Solve[z^3 == 27, z]
  (* {{z -> 3}, {z -> -3 (-1)^(1/3)}, {z -> 3 (-1)^(2/3)}} *)

root[3, Sin[x]^6]
  (* {Sin[x]^2, (-1)^(2/3) Sin[x]^2, -(-1)^(1/3) Sin[x]^2} *)
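
For comparison, the Solve-based route mentioned at the start can be wrapped in a small helper. This is only a sketch; allRootsViaSolve is a hypothetical name introduced here, not a built-in:

```mathematica
(* allRootsViaSolve is a hypothetical helper: every z with z^n == expr *)
allRootsViaSolve[n_Integer?Positive, expr_] :=
  Module[{z}, z /. Solve[z^n == expr, z]]

allRootsViaSolve[2, 9]
  (* {-3, 3} *)

allRootsViaSolve[2, x^2]
  (* {-x, x} *)
```

The results come back in whatever order Solve chooses, so the position of an element in the list should not be read as a branch number.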

However, there are inconsistent behaviors that would have to be dealt with to implement a fully working algebraic system.

sqrt[(x)^2] + sqrt[(-x)^2]
  (* {2 x, -2 x} *)

sqrt[(x - 1)^2] + sqrt[(-(x - 1))^2]
  (* {0, 0} *)
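
One way to see why this is awkward: the list-based sqrt above pairs branches element by element, whereas treating each square root as an independent choice of sign produces a different set of values. A rough sketch, where signChoices is a hypothetical helper introduced only for illustration:

```mathematica
(* each term may independently take either sign; collect the distinct totals *)
signChoices[terms_List] := Union[Total /@ Tuples[{#, -#} & /@ terms]]

signChoices[{x, x}]   (* three distinct values: 2 x, -2 x and 0 *)
signChoices[{2 x}]    (* only two values: 2 x and -2 x *)
```

Under this reading, "sqrt of x^2 plus sqrt of x^2" and "twice sqrt of x^2" no longer agree, which is the kind of inconsistency a multi-valued algebra would have to resolve.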

The upshot for me is that the foundation is, and should be, built from single-valued functions.

Michael E2
3

For completeness, I'll just add the textbook definition of the $n$-th root of a complex variable:

root[x_, n_, branch_: 1] := 
 Simplify[Power[Abs[x], 1/n] Exp[I (Arg[x]/n + 2 Pi (branch - 1)/n)]]

root[1, 2, 2]

(* ==> -1 *)

Here, x is an arbitrary complex number, n is the power in the equation $z^n = x$ we're trying to solve, and branch is the number of the branch (with 1 being the default, first branch). So the normal square root would be

root[1, 2, 1]

(* ==> 1 *)

And a cube root would look like this:

root[1, 3, 2]

(* ==> (-1)^(2/3) *)

Everything is consistent with the other built-in functions (e.g., you can confirm that Arg[%] yields the correct result).
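
To list every branch at once, the branch index can simply be tabulated. The sketch below restates the definition under the hypothetical name nthRoot so that it is self-contained; allBranches is likewise an illustrative name:

```mathematica
(* nthRoot mirrors the branch-indexed definition above under a hypothetical name *)
nthRoot[x_, n_Integer?Positive, branch_Integer : 1] :=
  Simplify[Abs[x]^(1/n) Exp[I (Arg[x] + 2 Pi (branch - 1))/n]]

(* allBranches (also hypothetical) collects branches 1 .. n in a list *)
allBranches[x_, n_Integer?Positive] := Table[nthRoot[x, n, k], {k, 1, n}]

allBranches[1, 2]    (* {1, -1} *)
allBranches[16, 4]   (* {2, 2 I, -2, -2 I} *)
```

Because the list is ordered by branch number, the first element is always the principal root that Sqrt and Power would return.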

Jens
1

The observations I made I got from messing around with Unevaluated, Trace, Hold and FullForm.

My intuition was that we could get the same behavior from Sqrt for numbers as for symbols by using Unevaluated. However, there is a rule for Sqrt[anything] that must look like this

HoldPattern[Sqrt[anything_]]:> Power[anything, Rational[1,2]]

So an Unevaluated on an argument of Sqrt will just get cleared by this rule. So we have to look at Power instead.

Now, this observation seems crucial. It appears there is a rule that looks like this

Clear[power]
power[power[x_, y_] /; 
   And[IntegerQ[x], IntegerQ[y]], d_] := 
 power["power"[x, y], d]

Here I use the string "power" so that the results of the examples do not match this rule again. If you are used to working with Unevaluated, you will know that Unevaluated only gets stripped if a rule is applied. Now we can test whether there really is such a rule (you can also verify this using Trace and inspecting the FullForm of intermediate expressions wrapped in Hold, but it is easier to use Unevaluated). This shows the rule in action:

Power[Unevaluated[Power[3, 4]], d]

-> 81^d

whereas

Power[Unevaluated[Power[3, a]], d]

-> Power[Unevaluated[Power[3, a]], d]

So in the first case, a rule was applied, whereas in the second case there was no rule applied. To see that the test really involves IntegerQ and not NumberQ (or something), note that

Power[Unevaluated[Power[3., 4]], d]

-> Power[Unevaluated[Power[3., 4]], d]

We can see that my symbol power behaves in the same way for the three examples

power[Unevaluated[power[3, 4]], d]

-> power[power[3, 4], d]

(where we have to imagine that "power"[3,4] just evaluates to 81).

power[Unevaluated[power[3., 4]], d]

-> power[Unevaluated[power[3., 4]], d]

and

power[Unevaluated[power[3, a]], d]

-> power[Unevaluated[power[3, a]], d]

Now, let's get back to your problem. In your example, as Sqrt does not have Hold attributes, (-3)^2 immediately gets evaluated to 9 and there is no fun. However, using all this knowledge about Unevaluated, we can now do the following

Unprotect[Power]
Power[Power[a_, 2] /; a < 0, Rational[1, 2]] := a

So that

Sqrt[Unevaluated[Unevaluated[(-3)^2]]]

-> -3

Which I think is very nice :).

With respect to consistency... Sqrt could have been given the attribute HoldAll. It could have been made so that it would pass its argument to Power with a wrapper Unevaluated. And a rule could have been implemented like the one for Power I defined above. But I guess it would just be confusing, as none of the other "basic functions" (Times, Plus, Divide, etc.) have such attributes and they never pass on the head Unevaluated.
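
The HoldAll idea can be tried on a separate symbol without touching Power. This is only a sketch under that assumption; heldSqrt is a hypothetical name, and it returns the base exactly as written rather than selecting a branch on mathematical grounds:

```mathematica
(* heldSqrt is a hypothetical name; HoldAll keeps (-3)^2 from becoming 9 first *)
ClearAll[heldSqrt];
SetAttributes[heldSqrt, HoldAll];
heldSqrt[Power[a_, 2]] := a;      (* literal square: return the base as written *)
heldSqrt[expr_] := Sqrt[expr];    (* anything else: ordinary principal root *)

heldSqrt[(-3)^2]   (* -3 *)
heldSqrt[3^2]      (* 3 *)
heldSqrt[9]        (* 3 *)
```

The result depends on how the argument is typed, not on its value, so heldSqrt[(-3)^2] and heldSqrt[9] differ even though both arguments equal 9.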

Remarks: Related Mathematica rules

I also found a rule in MMA which may lead to more insight into how MMA handles this kind of expression. We have

Trace[Power[Unevaluated[Power[Times[-1, q], 4]], d], 
   TraceOriginal -> True][[4]] // FullForm

-> HoldForm[Power[Power[Times[-1, q], 4], d]]

But

Power[Unevaluated[Power[Times[-1, a], 3]], d]

-> Power[Unevaluated[Power[Times[-1, a], 3]], d]

This appears to be the result of a rule like this

power[power[times[x_, a_], y_] /; And[x == -1, EvenQ[y]], d_] := 
 power["power"[times[x, a], y], d]

I also found some other related rules, but I cannot say more about those yet.

Actually, I now find the subject extremely interesting, as it seems that MMA is using rules of a form that I haven't seen yet. MMA seems to have some very abstract rule for handling cases like the following

Power[Unevaluated[
  Power[Power[Power[Power[Power[Times[1, a1], a2], a3], a4], a5], 
   a6]], a7]

-> Power[Power[Power[Power[Power[Power[a1,a2],a3],a4],a5],a6],a7]

It would seem that there is a rule at work here that can deal with very deep nesting. If you are interested, please look at my question here

Jacob Akkerboom
  • To get the negative branch correctly, you'd have to make it work for complex numbers, too. Without trying it (I don't like to modify the built-in function Power), it seems that your method with /; a < 0 can't do this. – Jens Apr 21 '13 at 17:08
  • @Jens Yeah... but I suppose you would want to have it for all even powers too... But if you want to be able to really do things like these in general, the only good way I can see is to give all the basic functions Hold attributes. The attribute Flat is also something to consider. Consider FullSimplify[Times[Unevaluated[a], Unevaluated[b]]], where Unevaluated gets stripped because of the rule Times[a_]:=a and the attribute Flat. But I guess you can make your own functions to start off with, that refer to the usual functions if there is nothing more to do. Or something. – Jacob Akkerboom Apr 21 '13 at 17:16
  • @Jens Basically my conclusion is that you can do very little for numbers without hold attributes. But for symbols, anything is possible as these do not evaluate. But now I should edit the answer a bit – Jacob Akkerboom Apr 21 '13 at 17:27