Summary: it is cheapest and most accurate to use sqrt(fma(c, c, 1)) if you have FMA, and sqrt(1+c*c) otherwise. In my testing, though, the difference is extremely marginal: of the 1065353216 32-bit floating-point numbers with $0\leq c\leq 1$, the first formula is better 532509 times (0.05%) and worse 382159 times (0.035%), and neither formula ever has error worse than 1 ulp, so the obvious sqrt(1+c*c) is good enough.
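For reference, here is roughly how such an exhaustive comparison can be run (a minimal sketch of my own, not the exact test I used): it walks every float in $[0,1]$ and compares both formulas against a double-precision reference, which is accurate to far below a single-precision ulp.

    #include <math.h>
    #include <stdio.h>

    /* Exhaustively compare sqrt(fma(c,c,1)) and sqrt(1+c*c) over [0, 1].
       The double-precision reference is accurate to ~1e-16, far below a
       single-precision ulp, so it serves as "exact" here. */
    int main(void) {
        long fma_better = 0, fma_worse = 0;
        for (float c = 0.0f; c <= 1.0f; c = nextafterf(c, 2.0f)) {
            double ref = sqrt(1.0 + (double)c * (double)c);
            double e1 = fabs((double)sqrtf(fmaf(c, c, 1.0f)) - ref);
            double e2 = fabs((double)sqrtf(1.0f + c * c) - ref);
            if (e1 < e2) fma_better++;
            else if (e1 > e2) fma_worse++;
        }
        printf("fma better: %ld, worse: %ld\n", fma_better, fma_worse);
        return 0;
    }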
This is a bit of a fine point, and Wikipedia can be a little unreliable on such things. One good way to settle questions like this is to look at a mature library that already implements hypot, such as openlibm (https://github.com/JuliaLang/openlibm).
Quoting the source code comment that explains it (https://github.com/JuliaLang/openlibm/blob/master/src/e_hypot.c):
/* __ieee754_hypot(x,y)
*
* Method :
* If (assume round-to-nearest) z=x*x+y*y
* has error less than sqrt(2)/2 ulp, then
* sqrt(z) has error less than 1 ulp (exercise).
*
* So, compute sqrt(x*x+y*y) with some care as
* follows to get the error below 1 ulp:
*
* Assume x>y>0;
* (if possible, set rounding to round-to-nearest)
* 1. if x > 2y use
* x1*x1+(y*y+(x2*(x+x1))) for x*x+y*y
* where x1 = x with lower 32 bits cleared, x2 = x-x1; else
* 2. if x <= 2y use
* t1*y1+((x-y)*(x-y)+(t1*y2+t2*y))
* where t1 = 2x with lower 32 bits cleared, t2 = 2x-t1,
* y1= y with lower 32 bits chopped, y2 = y-y1.
*
* NOTE: scaling may be necessary if some argument is too
* large or too tiny
*
* Special cases:
* hypot(x,y) is INF if x or y is +INF or -INF; else
* hypot(x,y) is NAN if x or y is NAN.
*
* Accuracy:
* hypot(x,y) returns sqrt(x^2+y^2) with error less
* than 1 ulps (units in the last place)
*/
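The splitting step in method 1 can be written out concretely; here is a sketch (my reconstruction for illustration, not openlibm's actual code) of how clearing the low 32 bits of a double turns x*x into an exactly representable head plus a small correction:

    #include <stdint.h>
    #include <string.h>

    /* x1 = x with the low 32 bits cleared keeps only ~21 significand
       bits, so x1*x1 (at most 42 bits) is exact in double. With
       x2 = x - x1 we have, in exact arithmetic,
           x*x = x1*x1 + x2*(x + x1),
       which is the identity the comment above uses. */
    static double clear_low32(double x) {
        uint64_t u;
        memcpy(&u, &x, sizeof u);
        u &= 0xffffffff00000000ULL;
        memcpy(&x, &u, sizeof u);
        return x;
    }

    static double square_split(double x, double *lo) {
        double x1 = clear_low32(x);
        double x2 = x - x1;          /* exact: x2 is the discarded tail */
        *lo = x2 * (x + x1);         /* small correction term */
        return x1 * x1;              /* exact head */
    }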
So if you can compute $1+c^2$ sufficiently accurately (and in your case with $|c|\leq 1$ you will not have over-/underflow), the final result will be accurate.
If you have a modern CPU with a fused multiply-add (FMA) instruction, this is trivial: use fma, which most languages' standard math libraries expose, and compute sqrt(fma(c, c, 1)) (cost: just one flop plus the square root). This is even marginally cheaper than what hypot does. The error in fma(c, c, 1) is at most $\frac12$ ulp with round-to-nearest, so you get the same accuracy as hypot, with error $<1$ ulp; this is the best option.
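In C, for instance, that is the whole function (the helper name is mine; note that on a target without hardware FMA, fmaf may be emulated in slow software, which is what the comment thread below gets into):

    #include <math.h>

    /* 1 + c*c computed with a single rounding, then a correctly
       rounded square root: total error below 1 ulp. Build for a
       target with hardware FMA (e.g. -mfma on x86), otherwise fmaf
       may fall back to slow software emulation. */
    static float sqrt_one_plus_c2(float c) {
        return sqrtf(fmaf(c, c, 1.0f));
    }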
Regarding the formulas that hypot actually uses, I don't fully understand what they're doing. Specialized to $x=1$, $y=c$, it chooses $1+c^2$ when $c<\frac12$ and $2c+(c-1)^2$ otherwise. It almost looks like a special case of double-double arithmetic, where you compute a product accurately by writing it as a sum of two numbers, but I'm not sure. I imagine the formulas would be a lot simpler if they could use fma. In my testing they're at most as accurate as the plain $1+c^2$.
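Written out (my specialization of the two branches to $x=1$, $y=c$, for illustration only, without openlibm's scaling and bit-splitting), that choice looks like this:

    #include <math.h>

    /* Both branches equal 1 + c^2 in exact arithmetic:
       2c + (c-1)^2 = 1 + c^2. For 1/2 <= c <= 1 the subtraction
       c - 1 is exact by Sterbenz's lemma, which may be the point
       of the second branch. */
    static float sqrt_one_plus_c2_branchy(float c) {
        if (c < 0.5f)                 /* x > 2y with x = 1, y = c */
            return sqrtf(1.0f + c * c);
        return sqrtf(2.0f * c + (c - 1.0f) * (c - 1.0f));
    }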
What would happen if you evaluated $1+c^2$ directly? The relative error of $\hat w = \mathrm{fl}(1+\mathrm{fl}(c^2))$ is at most $\epsilon_1\leq \frac34$ ulp, which bounds the error in $\mathrm{fl}(\sqrt{\hat w}\,)$ by $\sqrt{1+\epsilon_1}-1+\epsilon_2 \leq \tfrac78$ ulp, still below a single ulp, so it's accurate too, and as good as hypot.
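Spelling that bound out, with $w = 1+c^2$ exact and $\epsilon_2\leq\frac12$ ulp the rounding error of the square root itself:

$$\mathrm{fl}\bigl(\sqrt{\hat w}\bigr)=\sqrt{w(1+\epsilon_1)}\,(1+\epsilon_2)\approx\sqrt{w}\left(1+\tfrac{\epsilon_1}{2}+\epsilon_2\right),\qquad \tfrac{\epsilon_1}{2}+\epsilon_2\leq\tfrac38+\tfrac12=\tfrac78\ \mathrm{ulp}.$$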
Comments:

My ultimate motivation is that I need 5 decimal digits of precision when using 32-bit floats. However, the $\sqrt{1+c^2}$ is in the inner loop of an $O(n^3)$ operation running on an embedded system, so cycles matter, and there will be other contributors to the error budget. – Damien Aug 31 '17 at 22:47

sqrt(fma(c,c,1)) or sqrt(1+c*c) (depending on having FMA; sometimes your compiler can rewrite 1+c*c into fma(c,c,1) on its own, but it can depend on compiler flags) is about as good as it can get. By comparison, hypot does a lot of work to handle the harder cases that you don't have. Also, it usually doesn't get much better than 1 ulp error; most common arithmetic expressions you'd normally have are not even as accurate as that. – Kirill Aug 31 '17 at 22:55

[…] Float32(Float64(a)*Float64(b)+Float64(c)), but it depends.) Here's openlibm's: https://github.com/JuliaLang/openlibm/blob/master/src/s_fmaf.c (compare with the Float64 version: https://github.com/JuliaLang/openlibm/blob/master/src/s_fma.c), so normally you have to check for CPU support before relying on it too much. – Kirill Aug 31 '17 at 22:58

fma requires a result with a single rounding; one cannot in general emulate it for binary32 by simply going through binary64 in the manner indicated in your example. It's a bit more complicated than that. – njuffa Sep 03 '17 at 01:56

[…] fmaf emulation seems to produce incorrect results occasionally (in the following, res is the openlibm code): a= 0x1.e511a0p-94 b= 0x1.f234a0p+71 c=-0x1.f22d80p-3 res=-0x1.f22d44p-3 ref=-0x1.f22d46p-3; a=-0x1.e26cd8p-124 b= 0x1.978a7ap-1 c= 0x1.9fd604p-101 res= 0x1.9fd600p-101 ref= 0x1.9fd602p-101 – njuffa Sep 03 '17 at 07:24

[…] fmaf emulation (compared both to hardware and my own emulation), and it does not look like a compilation issue but rather like a design issue in the source code, specifically the way preliminary results that are both "exact" and "half-way" are treated. – njuffa Sep 03 '17 at 15:34

[…] result - xy == z in openlibm's fmaf with a correct test for exactness of the addition, and it might then work correctly. – Kirill Sep 03 '17 at 20:20
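For what it's worth, the two counterexamples are easy to re-check on a machine whose fmaf is backed by a hardware FMA instruction (and therefore correctly rounded); a minimal harness of my own, using only the values quoted above:

    #include <math.h>
    #include <stdio.h>

    /* njuffa's counterexamples; "ref" is the correctly rounded result. */
    int main(void) {
        struct { float a, b, c, ref; } t[] = {
            {  0x1.e511a0p-94f,  0x1.f234a0p+71f, -0x1.f22d80p-3f,   -0x1.f22d46p-3f   },
            { -0x1.e26cd8p-124f, 0x1.978a7ap-1f,   0x1.9fd604p-101f,  0x1.9fd602p-101f },
        };
        for (int i = 0; i < 2; i++) {
            float r = fmaf(t[i].a, t[i].b, t[i].c);
            printf("fmaf = %a  ref = %a  %s\n", r, t[i].ref,
                   r == t[i].ref ? "match" : "MISMATCH");
        }
        return 0;
    }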