
It is conventional wisdom that programmers should use std::hypot whenever computing an expression of the form $r = \sqrt{x^2 + y^2}$.

In my case, the expression is $r = \sqrt{1 + c^2}$, where $0 \le c \le 1$, $c \in \mathbb{R}$. My first instinct is to reach for hypot() to implement it; I am interested in both double and float variants. In my problem, $c \ll 1$ for most instances.

However, a quick read of Wikipedia suggests I might be wasting my (CPU) time, since the suggested implementation for $x > y$ is:

$$ r = \sqrt{x^2 + y^2} = \sqrt{x^2 \left(1 + (y/x)^2\right)} = |x| \sqrt{1 + (y/x)^2} $$

In my instance, $x = 1$, so this is algebraically identical to the naive implementation $r = \sqrt{1 + c^2}$.

What would be the impact of not using hypot() in this problem, and are there any other numerical pitfalls to watch out for?

Damien

1 Answer


Summary: it is cheapest and most accurate to use sqrt(fma(c, c, 1)) if you have FMA, and sqrt(1+c*c) otherwise. In my testing, though, the difference is extremely marginal: of the 1065353216 32-bit floating point numbers $0\leq c\leq 1$, the first formula is better 532509 times (0.05%), and worse 382159 times (0.035%), and neither formula has error worse than 1 ulp, so the obvious sqrt(1+c*c) is good enough.
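For reference, the exhaustive comparison can be reproduced with a short C++ program along the following lines (my own reconstruction of the test, not the original harness; it uses double as a sufficiently accurate reference, and the names are mine):

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Error of v in ulps, measured against the correctly rounded float reference.
static float ulp_error(float v, double exact) {
    float ref = static_cast<float>(exact);
    return std::fabs(v - ref) / (std::nextafterf(ref, INFINITY) - ref);
}

int main() {
    long better = 0, worse = 0;
    for (uint32_t bits = 0; ; ++bits) {
        float c;
        std::memcpy(&c, &bits, sizeof c);      // walks every float in [0, 1]
        double exact = std::sqrt(1.0 + (double)c * (double)c);
        float e_fma   = ulp_error(std::sqrt(std::fma(c, c, 1.0f)), exact);
        float e_naive = ulp_error(std::sqrt(1.0f + c * c), exact);
        better += (e_fma < e_naive);
        worse  += (e_fma > e_naive);
        if (c == 1.0f) break;                  // bit pattern 0x3F800000
    }
    std::printf("fma better: %ld times, worse: %ld times\n", better, worse);
}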

This is a bit of a fine point, and Wikipedia can be unreliable on such details. One good way to settle these types of questions is to go to a mature library that already implements hypot, such as openlibm (https://github.com/JuliaLang/openlibm).

Quoting the source code comment that explains it (https://github.com/JuliaLang/openlibm/blob/master/src/e_hypot.c):

/* __ieee754_hypot(x,y)
 *
 * Method :                  
 *  If (assume round-to-nearest) z=x*x+y*y 
 *  has error less than sqrt(2)/2 ulp, than 
 *  sqrt(z) has error less than 1 ulp (exercise).
 *
 *  So, compute sqrt(x*x+y*y) with some care as 
 *  follows to get the error below 1 ulp:
 *
 *  Assume x>y>0;
 *  (if possible, set rounding to round-to-nearest)
 *  1. if x > 2y  use
 *      x1*x1+(y*y+(x2*(x+x1))) for x*x+y*y
 *  where x1 = x with lower 32 bits cleared, x2 = x-x1; else
 *  2. if x <= 2y use
 *      t1*y1+((x-y)*(x-y)+(t1*y2+t2*y))
 *  where t1 = 2x with lower 32 bits cleared, t2 = 2x-t1, 
 *  y1= y with lower 32 bits chopped, y2 = y-y1.
 *      
 *  NOTE: scaling may be necessary if some argument is too 
 *        large or too tiny
 *
 * Special cases:
 *  hypot(x,y) is INF if x or y is +INF or -INF; else
 *  hypot(x,y) is NAN if x or y is NAN.
 *
 * Accuracy:
 *  hypot(x,y) returns sqrt(x^2+y^2) with error less 
 *  than 1 ulps (units in the last place) 
 */

So if you can compute $1+c^2$ sufficiently accurately (and in your case with $|c|\leq 1$ you will not have over-/underflow), the final result will be accurate.

If you have a modern CPU with a fused multiply-add (FMA) instruction, this is trivial using fma, which most languages' standard math libraries provide: use sqrt(fma(c, c, 1)) (cost: just 1 flop plus the cost of a square root), which is even marginally cheaper than what hypot does. The error in fma(c, c, 1) is at most $\frac12$ ulp with round-to-nearest, so you'll get the same accuracy as hypot, with error $<1$ ulp; this is the best option.
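In C++, both variants are one-liners. A minimal sketch (std::fma is C++11; whether it maps to a hardware instruction depends on the target, see the comment thread below):

#include <cmath>

// sqrt(1 + c*c) with the multiply and add fused into a single rounding.
// Prefer this when the target has a hardware FMA; without one, std::fma
// may fall back to slow software emulation.
float r_fma(float c)   { return std::sqrt(std::fma(c, c, 1.0f)); }

// The naive form: per the analysis below, still accurate to < 1 ulp
// for 0 <= c <= 1.
float r_naive(float c) { return std::sqrt(1.0f + c * c); }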

Regarding the formulas that hypot actually uses, I don't fully understand what they are doing. For your case it chooses between $1+c^2$ and $2c+(c-1)^2$ depending on whether $c\leq\frac12$. It looks almost like a special case of double-double arithmetic, where you compute a product accurately by writing it as a sum of two numbers, but I'm not sure. I imagine the formulas would be a lot simpler if they could use fma. In my testing they are at most as accurate as the plain $1+c^2$.
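For what it's worth, here is my reading of those branches specialized to your case $x=1$, $y=c$ (a sketch for illustration only, not openlibm's actual code, which additionally splits the operands into high and low halves):

#include <cmath>

// openlibm's two branches specialized to hypot(1, c), 0 <= c <= 1.
float hypot1(float c) {
    if (c <= 0.5f) {
        // x > 2y branch: the split of x = 1 is trivial (x1 = 1, x2 = 0),
        // so the formula collapses to 1 + c^2.
        return std::sqrt(1.0f + c * c);
    } else {
        // x <= 2y branch: reduces to 2c + (1-c)^2. Here 2c is exact, and
        // 1-c is exact by Sterbenz's lemma since 1/2 <= c <= 1, so only
        // the squaring and the addition incur rounding.
        float d = 1.0f - c;
        return std::sqrt(2.0f * c + d * d);
    }
}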

What would happen if you tried $1+c^2$ directly? The relative error of evaluating $\hat w = \mathrm{fl}(1+\mathrm{fl}(c^2))$ is at most $\epsilon_1\leq \frac34\,\mathrm{ulp}$, which gives the error in $\mathrm{fl}(\sqrt{\hat w})$ as $\sqrt{1+\epsilon_1}-1+\epsilon_2 \leq \tfrac78\,\mathrm{ulp}$. That is less than a single ulp, so it is accurate too, and as good as hypot.
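Spelling out that last step (the square root halves the relative error to first order, and $\epsilon_2 \leq \frac12\,\mathrm{ulp}$ is the rounding of the square root itself):

$$ \mathrm{fl}\big(\sqrt{\hat w}\big) = \sqrt{(1+\epsilon_1)(1+c^2)}\,(1+\epsilon_2), \qquad \sqrt{1+\epsilon_1} - 1 \leq \tfrac{\epsilon_1}{2} \leq \tfrac38\,\mathrm{ulp}, \qquad \tfrac38 + \tfrac12 = \tfrac78\,\mathrm{ulp}. $$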

Kirill
  • @Damien You're welcome. Incidentally, it's not obvious from your question if you want this, and it might be overkill, but you could probably increase accuracy all the way to half an ulp by using compensated arithmetic. That's probably close to something like what openlibm's hypot is trying to do. – Kirill Aug 31 '17 at 21:48
  • @Kirill, I have clarified the final question a little, if nothing else to make it fit your excellent answer.

    My ultimate motivation is that I need 5 decimal digits of precision when using 32-bit float. However, $\sqrt{1+c^2}$ is in the inner loop of an $O(n^3)$ operation running on an embedded system, so cycles matter, and there will be other contributors to the error budget.

    – Damien Aug 31 '17 at 22:47
  • So sqrt(fma(c,c,1)) or sqrt(1+c*c) (depending on whether you have FMA; sometimes your compiler can rewrite 1+c*c into fma(c,c,1) on its own, but that can depend on compiler flags) is about as good as it can get. By comparison, hypot does a lot of work to handle the harder cases that you don't have. Also, it usually doesn't get much better than 1 ulp error; most common arithmetic expressions are not even as accurate as that. – Kirill Aug 31 '17 at 22:55
  • Better yet, C++11 defines std::fma(), which should delegate to the hardware instruction if available (I just checked that it is for 32-bit float on my platform). – Damien Aug 31 '17 at 22:57
  • @Damien Right, but bear in mind that if FMA is not available, then std::fma may fall back to a slower software implementation. (For Float32, it might look like Float32(Float64(a)*Float64(b)+Float64(c)), but it depends.) Here's openlibm's: https://github.com/JuliaLang/openlibm/blob/master/src/s_fmaf.c (compare with the Float64 version: https://github.com/JuliaLang/openlibm/blob/master/src/s_fma.c), so normally you have to check for CPU support before relying on it too much. – Kirill Aug 31 '17 at 22:58
  • @Kirill Since fma requires a result with a single rounding, one cannot in general emulate it for binary32 by simply going through binary64 in the manner indicated in your example. It's a bit more complicated than that. – njuffa Sep 03 '17 at 01:56
  • @njuffa Sure, but going through binary64 is really close: the openlibm code for it, which I believe is correct, only has one special case when the binary64 rounding had a tie. I didn't really want to say that the expression I gave was correct, that was my mistake, just that it was the idea. – Kirill Sep 03 '17 at 04:01
  • @Kirill This may be a compiler issue (I compiled for C99 and strict IEEE-754 compliance), but the openlibm fmaf emulation seems to produce incorrect results occasionally (in the following, res is the openlibm code): a= 0x1.e511a0p-94 b= 0x1.f234a0p+71 c=-0x1.f22d80p-3 res=-0x1.f22d44p-3 ref=-0x1.f22d46p-3; a=-0x1.e26cd8p-124 b= 0x1.978a7ap-1 c= 0x1.9fd604p-101 res= 0x1.9fd600p-101 ref= 0x1.9fd602p-101 – njuffa Sep 03 '17 at 07:24
  • @Kirill I took a closer look at the failing cases of openlibm's fmaf emulation (compared both to hardware and my own emulation), and it does not look like a compilation issue but rather like a design issue in the source code, specifically the way preliminary results that are both "exact" and "half-way" are treated. – njuffa Sep 03 '17 at 15:34
  • @njuffa That's pretty cool, I don't think that's a compiler issue, looks like a bug. I checked also with Z3 and exact arithmetic, the "exact" test in the fmaf code returns true, but doesn't imply that double rounding is correct. I submitted it here: https://github.com/JuliaLang/openlibm/issues/160 – Kirill Sep 03 '17 at 15:46
  • @njuffa Can I ask how you found the counterexamples? I tried Z3 because it usually works on such problems, but it takes too long. – Kirill Sep 03 '17 at 15:57
  • @Kirill I used a totally unsophisticated brute-force search using purely random operands and looked for mismatches against reference #1, then double-checked the mismatches found against reference #2. There was about one mismatch every 10 billion trials. – njuffa Sep 03 '17 at 16:00
  • @njuffa I think you could replace result - xy == z in openlibm's fmaf with a correct test for exactness of the addition, and it might then work correctly. – Kirill Sep 03 '17 at 20:20