What is the fastest possible square number test in Mathematica, both for machine size and big integers?
I presume starting in version 8 the fastest will be a dedicated C LibraryLink function.
What is the fastest possible square number test in Mathematica, both for machine size and big integers?
I presume starting in version 8 the fastest will be a dedicated C LibraryLink function.
Here's an idea similar to Carl Woll's that's a little faster:
sQ[n_] := FractionalPart@Sqrt[n + 0``1] == 0;
sQa = FractionalPart@Sqrt[# + 0``1] == 0 &; (* @Roman's suggestion *)
@Roman reports the pure function is 10% faster. I find on several runs of timeRun[] below, the variation in the timings cause them to overlap, with sQa sometimes timed slower than sQ. The median for sQa is around 5–6% faster. If I change AbsoluteTiming to Timing in timeRun[], sQ and sQa finish in a dead heat, ±2% of each other. Theoretically, I would expect pure functions to have less overhead, but it would be a small difference compared to the time Sqrt[n + 0``1] will take. Maybe %5 is about right. It's difficult to time computations in a multiprocess environment like my laptop. The upshot is that sQa appears to be a bit faster.
Here are some timing runs similar to @fgrieu's:
timeRun[f_] := Module[{a, m},
a = (2^1024 - 3^644)^2;
m = (2^1024 - 3^644)^2 + 9;
First@ AbsoluteTiming@ Do[f[n], {n, m - 200000, m}]
]
timeRun2[f_] :=
First@ AbsoluteTiming[
Do[
f /@ (n^2 + {-2, -1, 0, 1, 2}),
{n, 2^1357, 0, -Floor[2^1357/99]}]
];
Tests of a long sequence of consecutive integers about single large square number:
timeRun[sQ]
timeRun[SqQ]
timeRun[sqQ1]
timeRun[SquareQ2]
timeRun[SquareQ08]
(*
0.626601 sQ
0.789668 SqQ (@fgrieu)
1.11774 sqQ1 (@CarlWoll)
1.63489 SquareQ2 (@Mr.Wizard)
3.39258 SquareQ08 (@KennyColnago)
*)
Tests of short sequences of consecutive integers about many small to large square numbers:
timeRun2[sQ]
timeRun2[SqQ]
timeRun2[sqQ1]
timeRun2[SquareQ2]
timeRun2[SquareQ08]
(*
0.002639 sQ
0.003289 SqQ
0.0039 sqQ1
0.005791 SquareQ2
0.01749 SquareQ08
*)
A test of just smaller numbers:
aa = 1; bb = 10^6;
AbsoluteTiming@Do[sQ@(n), {n, aa, bb}]
AbsoluteTiming@Do[SqQ@(n), {n, aa, bb}]
AbsoluteTiming@Do[sqQ1@(n), {n, aa, bb}]
AbsoluteTiming@Do[SquareQ2@(n), {n, aa, bb}]
AbsoluteTiming@Do[SquareQ08@(n), {n, aa, bb}]
(*
{2.34658, Null}
{3.2571, Null}
{3.18561, Null}
{3.42899, Null}
{3.25997, Null}
*)
If you want to verify its accuracy, you can test it against other solutions like this:
aa = 10^20 - 100; bb = aa + 10^3;
Table[sQ[n], {n, aa, bb}] === Table[IntegerQ@Sqrt[n], {n, aa, bb}]
(* True *)
aa = 1; bb = 10^6;
Table[sQ[n], {n, aa, bb}] === Table[IntegerQ@Sqrt[n], {n, aa, bb}]
(* True *)
sQ's correctness varies with Mathematica version. On M4.0 PowerPC, sQ[9] is False. On M5.2 x86, sQ[2] is True. On M7.0.1 x64 it passes my tests (negative of squares give True but that's a matter of convention). Same on M12.0 x64, except for a hiccup at -1. Inexact arithmetic scares me. Also: I tuned my SqQ by removing n>=0 && and it wins more benchmarks (negatives are still never squares). Incorporating your idea is next (imitation is the sincerest form of flattery).
– fgrieu
Nov 01 '19 at 14:52
sQa = FractionalPart@Sqrt[# + 0``1] == 0 &
– Roman
Mar 04 '21 at 08:27
sQa to the slowest sQ on my machine. But I've found in the past that pure functions are faster. Usually the difference (~10^–8 sec in a simple test) is not significant unless the function body is executed in very little time. I'm not sure why there would be a big difference in this case, which for me is about 5-6% on average. (That still seems a significant improvement.) FractionalPart@Sqrt[m + 0``1] takes around 2x10^–6 sec, so I'd expect around a 1% improvement, more or less.
– Michael E2
Mar 04 '21 at 16:38
Map on a long list of very large numbers.
– Roman
Mar 04 '21 at 20:29
Sorry for my ignorance not taking into account that the question specifically asked for a Mathematica 7 solution. I updated the complete post.
In Mathematica 7 we don't have the option the compile code into a C-library which includes the thread parallelization which can be turned on when using RuntimeAttributes->Listable and Parallelization->True. Therefore, acl's solution will not run in Mathematica 7 because the RuntimeAttributes option for Compile was introduced in version 8.
This leaves the possibility to not compile the used function and make it a normal Mathematica function where you can set the attribute Listable. I tried this, but it was horrible slow.
After a bit of research I found a nice solution which uses some number-properties in base 16. Since (at least in V7) it seems somewhat hard to return lists of True|False, I use 0 and 1 where 0 means no square.
fPat = Compile[{{numbers, _Integer, 1}},
With[{l = Length[numbers]},
Module[{n = 0, i = 0, h = 0, test = 0.0, result = Table[0, {l}]},
For[i = 1, i <= l, ++i,
n = numbers[[i]];
h = BitAnd[15, n];
If[h > 9, Continue[]];
If[h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8,
test = Sqrt[n];
result[[i]] = test == Floor[test];
];
];
result
]
]
];
Comparing this with the almost one-liner of Sal gives
data = Table[i, {i, 1, 10^6}];
fSal = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test]];
BarChart[{Timing[fSal /@ data][[1]], Timing[fPat[data]][[1]]
}, ChartLabels -> {"Sal Mangano", "Patrick V7"},
ChartStyle -> {Gray, Green}]
I leave it to you to decide whether such a C-like programming style is worth the small speed up.

The fastest way (using Mathematica only) I know is to compile a C-library and process all data in parallel. Since most computers these days have at least 2 cores, this gives a boost. In Mathematica 8 the compilation to a C-library does not copy the data when it is called.
To make the computation parallel you have to use the Parallization option and the compiled function must be Listable. If you are sure of your input-data, you can additionally switch off most of the data-checks by using RuntimeOptions set to "Speed".
Update I include here the parallelized version of the Mathematica 7 code above:
data = Table[i, {i, 1, 10^6}];
fSal = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test]];
fAcl = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test],
RuntimeAttributes -> {Listable}];
fPat = Compile[{{n, _Integer}},
With[{test = Sqrt[n]}, Floor[test] == test],
CompilationTarget -> "C", RuntimeAttributes -> {Listable},
Parallelization -> True, RuntimeOptions -> "Speed"];
fPat2 = Compile[{{numbers, _Integer, 1}},
With[{l = Length[numbers]},
Module[{n = 0, i = 0, h = 0, test = 0.0, result = Table[0, {l}]},
For[i = 1, i <= l, ++i,
n = numbers[[i]];
h = BitAnd[15, n];
If[h > 9, Continue[]];
If[h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8,
test = Sqrt[n];
result[[i]] = test == Floor[test];
];
];
result
]
], CompilationTarget -> "C", RuntimeAttributes -> {Listable},
Parallelization -> True, RuntimeOptions -> "Speed"
];
BarChart[{Timing[fSal /@ data][[1]], Timing[fAcl[data]][[1]],
Timing[fPat[data]][[1]],
Timing[fPat2[data]][[1]]},
ChartLabels -> {"Sal Mangano", "acl", "Patrick",
"Patrick V7 parallel"},
ChartStyle -> {Gray, Gray, Darker[Green], Green}]
The results here come from my MacBook in battery-save mode which has 2 Intel cores. The disadvantage is that you need a C-compiler installed on your system which is most likely not true for the majority of Mathematica users.

fPat and fPat2 for others. I was almost done implementing it, but I see you beat me to it :) I believe that will be the fastest of all approaches
– rm -rf
Jan 21 '12 at 21:14
result[[i]] = test == Floor[test]; works as expected for integers and gives a speed-up compared to its if-then-else counterpart.
– halirutan
Jan 21 '12 at 21:53
Compile solutions are all much faster than IntegerQ@Sqrt@# & but what about big integers?
– Mr.Wizard
Jan 22 '12 at 04:46
I voted for all three previous answer because they all taught me something. However they, being Compile solutions, are not helpful with big integers.
At least on my system, Sal Mangano's code appears reducible to this without loss of speed:
isSq2 = Compile[n, Floor@# == # & @ Sqrt @ n];
For big integers between about 2*10^9 and 2*10^11 I am currently using this code from Sasha:
SquareQ =
JacobiSymbol[#, 13] =!= -1 &&
JacobiSymbol[#, 19] =!= -1 &&
JacobiSymbol[#, 17] =!= -1 &&
JacobiSymbol[#, 23] =!= -1 &&
IntegerQ@Sqrt@# &;
For integers larger than that I am using code (modified) from Daniel Lichtblau:
SquareQ2 = # == Round@# & @ Sqrt @ N[#, Log[10`, #] + $MachinePrecision] &;
JacobiSymbol comes into play here. I don't know the underlying workings of this and I'd like to know...
– rm -rf
Feb 20 '12 at 14:21
JacobiSymbol[n,p]=1 for any odd prime $p$ when $n$ is square. JacobiSymbol[n,p]=1 for some odd primes $p$ when $n$ is not square. JacobiSymbol[n,p]=0 for any odd prime $p$ that divides $n$, where $n$ may or may not be square. JacobiSymbol[n,p]=-1 otherwise. Hence, JacobiSymbol[n,p]=-1 means $n$ is not square. Evaluating JacobiSymbol[n,p] is much faster than Sqrt[n]. Use JacobiSymbol to filter candidate $n$ before passing to Sqrt[n]. I tested with 8 $p$ near 541 on $n>10^{11}$ and found it faster than SquareQ by @Sasha and SquareQ2 by @Daniel Lichtblau.
– KennyColnago
Oct 18 '12 at 16:41
NumberTheory\PowersRepresentationsDump`ProbablePerfectSquareQ[]` sometime...
– J. M.'s missing motivation
Oct 20 '12 at 15:46
PowersRepresentations[]; if you have that, you're supposed to have the Jacobi symbol-based test as well... (unless the implementation changed a bit between version seven and version eight)
– J. M.'s missing motivation
Oct 20 '12 at 16:08
PowersRepresentations once to load it. Quite similar indeed...
– Mr.Wizard
Oct 20 '12 at 16:45
SquareQ2 = FractionalPart@Sqrt@N[#, 2 Log[10, #]] == 0 &; By removing the extra step you get marginal improvement. Of course, we both know that there is a big assumption there, which is probably not true for sufficiently large $N$.
– mhp
Nov 06 '14 at 08:17
PowerMod be of any use here?
– Mr.Wizard
Nov 06 '14 at 08:38
I don't think there are any built-in functions for this but the following is probably fast enough for most purposes.
isSq = Compile[{{n, _Integer}}, With[{test = Sqrt[n]},
Floor[test] == test]];
Does 1 million integers in under a second.
Timing[Table[isSq[i], {i, 1, 1000000}]][[1]]
(*
0.76195
*)
This is under 2 orders of magnitude faster than the un-compiled equivalent, by the way.
Sqrt operation is comparatively expensive; I believe one should filter out numbers that are destined to fail with a less expensive test before progressing to Sqrt.
– Mr.Wizard
Jan 21 '12 at 15:08
More info as requested by @Mr.Wizard.
For $n$ below the $\approx 2*10^9$ limit, Compile gives the fastest solutions. For larger $n$, Sasha used JacobiSymbol with four primes 13, 19, 17, and 23, before resorting to the expensive IntegerQ[Sqrt[n]]. The number of ambiguous cases where JacobiSymbol[n,p]=0 decreases as the size of the prime $p$ increases. So using larger $p$ helps filter out more candidates before Sqrt must be called. Similarly, using more primes filters more candidates. However, the computation of JacobiSymbol slows as the number of and size of $p$ increases (no free lunch). As a rough balance, I used SquareQ08.
SquareQ08[n_] :=
JacobiSymbol[n, 541] =!= -1 && JacobiSymbol[n, 547] =!= -1 &&
JacobiSymbol[n, 557] =!= -1 && JacobiSymbol[n, 563] =!= -1 &&
JacobiSymbol[n, 569] =!= -1 && JacobiSymbol[n, 647] =!= -1 &&
JacobiSymbol[n, 653] =!= -1 && JacobiSymbol[n, 659] =!= -1 &&
IntegerQ[Sqrt[n]]
SetAttributes[SquareQ08, Listable]
JacobiSymbol test, and that this percentage was independent of the size of $n$. Your mileage may vary...
– KennyColnago
Oct 24 '12 at 23:55
isSq2 as well as SquareQ08 on sets of 100 random integers in various ranges (10^8 to 10^9, 10^9 to 10^10, 10^10 to 10^11, and 10^11 to 10^12) with RepeatedTiming. In each case, the results were remarkably consistent - isSq2 was about 7.4 x 10^{-5} seconds and SquareQ08 was about 4.5 x 10^{-4} seconds. This is on Mathematica 10.3; I wonder what version you were using and if the performance of isQq2 was somehow improved.
– rogerl
Dec 24 '15 at 17:09
isSq2 about 6 to 10 times faster than SquareQ08, which is written for integers of any size. I do not understand how isSq2, which is compiled, works with integers larger than 2 billion. Can you explain?
– KennyColnago
Dec 25 '15 at 01:09
Compile will work with 64-bit integers (or about $10^18$). Is that right? And, just to be clear, your result regarding isQq1 and SquareQ08 that you give in your comment above is at variance with your original answer, right? Do you know what's going on, or am I just confused?
– rogerl
Dec 25 '15 at 01:44
Compile on integers greater than 2 billion, my original answer never tested isSq2 on such large numbers. Therefore, the result in the comment is new, and not at odds with the original answer. However, before I trust any compiled result on large integers, I must see a documented source which explains that Compile on 64 bit machines works on integers above 2 billion.
– KennyColnago
Dec 25 '15 at 01:57
This is a variation of Daniel Lichtblau's contribution that avoids the need to compute logarithms:
sqQ1[i_Integer] := Floor[Sqrt[i + If[i>10^16, .1`1, .1]]]^2 == i
It seems to be a bit faster than SquareQ2. For example:
n = 432^2;
sqQ1[n] //RepeatedTiming
SquareQ2[n]//RepeatedTiming
{2.42*10^-6, True}
{3.2*10^-6, True}
and:
n = 43212113212231231241334^2;
sqQ1[n] //RepeatedTiming
SquareQ2[n]//RepeatedTiming
{3.61*10^-6, True}
{5.3*10^-6, True}
But not always:
n = 432121231231241334^2;
sqQ1[n] //RepeatedTiming
SquareQ2[n]//RepeatedTiming
{7.8*10^-6, True}
{5.26*10^-6, True}
A "listable" version appears to be faster than the compiled versions (at least when the maximum value is less than 10^16):
sqQ2[x:{__Integer}] := With[{add = If[Max[x]>10^16, .1`1, .1]},
UnitStep[Floor[Sqrt[x+add]]^2 - x]
]
Comparison with fPat2:
data = RandomInteger[10^15, 10^6];
r1 = sqQ2[data]; //RepeatedTiming
r2 = fPat2[data]; //RepeatedTiming
r1 === r2
{0.0075, Null}
{0.023, Null}
True
Of course, sqQ2 works for any size integers, while the compile solutions only work for integers less than Developer`$MaxMachineInteger.
The following is optimized for large values. The main idea is to reduce the integer tested modulo a product of small primes less than 64-bit, so that the cost is low and linear with the bit size of the argument, and filter the remainder using precomputed Jacobi tables to weed out all except very few (1/11595) of the non-squares.
SqQ::usage =
"SqQ[n] is True when n is an exact square, and False otherwise.";
(* We reduce n modulo a product of small primes and use *)
(* pre-computed tables of Jacobi symbols to filters out *)
(* most non-squares with a single multi-precision operation. *)
(* We use IntegerQ[Sqrt[n]] on less than 1/11595 integers. *)
(* Pre-computed variables starting in SqQ$ are for internal use; *)
SqQ$m = (SqQ$0 = 59*13*7*5*3)*(SqQ$1 = 23*19*17*11)*
(SqQ$2 = 47*37*31) *(SqQ$3 = 43*41*29);
SqQ$u = SqQ$v = SqQ$w = SqQ$x = 0;
Block[{j},
For[j = SqQ$0, j-- > 0, SqQ$u += SqQ$u + If[
JacobiSymbol[j, 59] < 0 || JacobiSymbol[j, 13] < 0 ||
JacobiSymbol[j, 7] < 0 || JacobiSymbol[j, 5] < 0 ||
JacobiSymbol[j, 3] < 0, 1, 0]];
For[j = SqQ$1, j-- > 0, SqQ$v += SqQ$v + If[
JacobiSymbol[j, 23] < 0 || JacobiSymbol[j, 19] < 0 ||
JacobiSymbol[j, 17] < 0 || JacobiSymbol[j, 11] < 0, 1, 0]];
For[j = SqQ$2, j-- > 0, SqQ$w += SqQ$w + If[
JacobiSymbol[j, 47] < 0 || JacobiSymbol[j, 37] < 0 ||
JacobiSymbol[j, 31] < 0, 1, 0]];
For[j = SqQ$3, j-- > 0, SqQ$x += SqQ$x + If[
JacobiSymbol[j, 43] < 0 || JacobiSymbol[j, 41] < 0 ||
JacobiSymbol[j, 29] < 0, 1, 0]]
];
(* The function itself starts here *)
SqQ[n_Integer] := Block[{m = Mod[n, SqQ$m]},
BitGet[SqQ$u, Mod[m, SqQ$0]] == 0 &&
BitGet[SqQ$v, Mod[m, SqQ$1]] == 0 &&
BitGet[SqQ$w, Mod[m, SqQ$2]] == 0 &&
BitGet[SqQ$x, Mod[m, SqQ$3]] == 0 &&
IntegerQ[Sqrt[n]]]
(* Automatically thread over lists *)
SetAttributes[SqQ, Listable];
It comfortably beats sqQ1, SquareQ2 and SqareQ08 when benchmarked with large non-squares
m = (2^1024 - 3^644)^2 + 9;
Timing[s = 0;
For[n = m - 200000, n < m, ++n, If[SqQ[n], ++s]];
s == 1]
and more narrowly so when benchmarked/validated as
Timing[For[n = 2^1357,
n > 0 && SqQ[s = n^2] && ! SqQ[s + 1] && ! SqQ[s + 2], --n,
n -= Floor[n/99]]; n == 0]
I am not sure how to speed up each comparison (as in, I spent half an hour trying different things and didn't manage to), but making the compiled function listable speeds things up quite a bit.
If isSq is the direct implementation that Sal gave, simply make it listable and compare:
isSqL = Compile[
{{n, _Integer}}, With[{test = Sqrt[n]}, Floor[test] == test],
RuntimeAttributes -> {Listable}
];
and then compare:
Timing[Table[isSq[i], {i, 1, 10^6}]]; // Timing
isSq /@ Range[1, 10^6]; // Timing
isSqL[Range[1, 10^6]]; // Timing
(*
{0.697799, Null}
{0.545856, Null}
{0.150171, Null}
*)
ie, it's 3-4 times faster.
What makes you say Sqrt is expensive? (ie, compared to what?).
CompilePrint[isSq], it's hard to see anything more costly thatn Sqrt.
– acl
Jan 21 '12 at 16:34