Monday, 25 February 2013

Testing whether a number is a perfect square

Introduction

A simple way to test whether a whole number is a perfect square is to take its square root, square the integer part and compare the result to the original number. In Python this might be coded as follows (the +0.5 is to avoid rounding errors)
from math import sqrt
def issquare(n):
  return int(sqrt(n) + 0.5) ** 2 == n
However, a preliminary test can avoid the need to calculate the square root for some numbers. Perfect squares always end in 0,1,4,5,6 or 9, so numbers ending in  2,3,7 or 8 could immediately be eliminated by looking at the remainder after dividing by 10.

Similarly, a perfect square divided by 16 always leaves a remainder of 0,1,4 or 9. Numbers with remainders 2,3,5,6,7,8,10,11,12,13,14,15 (75% of numbers) can be eliminated. As 16 is a power of 2, $16=2^4$, the remainder can be obtained in two ways: by division or by finding the last 4 bits by masking. In Python these could be implemented as:

Obtaining remainder by division
from math import sqrt
SR16= frozenset((0, 1, 4, 9))
def issquare(n):
  if n%16 in SR16 : 
    return int(sqrt(n) + 0.5) ** 2 == n
  return False
Obtaining remainder by masking
from math import sqrt
SR16= frozenset((0, 1, 4, 9))
def issquare(n):
  if n & 15 in SR16 : 
    return int(sqrt(n) + 0.5) ** 2 == n
  return False


Which divisor produces the most efficient pre-test?

Defining $s(n)$ to be the number of possible residues of square numbers modulo n, a formula for $s(n)$ is derived by Stangl, W. D. "Counting Squares in ℤn, Math. Mag. 69, 285-289, 1996. The formula is summarised in Wolfram Mathworld - Square Number

Defining                                                       $r(n) = $ $\frac{s(n)}{n}$, 

we are interested in numbers with small values of $r(n)$. In the examples above, s(10)=6, r(10)=0.6,  s(16)=4, r(16)=0.25.


Stangl shows that $s(n)$ is multiplicative, i.e $s(ab)=s(a)s(b)$ if $a$ and $b$ are co-prime, so $r(n)$ is also multiplicative, and is therefore defined by its values for $p^n$ where $p$ is prime. The following graph shows the value of $r(p^n)$ for primes up to 29


Values of $r(2^n)$ look much more promising than powers of other primes, but Stangl's formula shows that $r(2^n)$ tends towards $\frac{1}{6}$ as n increases, so there are diminishing returns in choosing larger values of $2^n$ as a pre-test divisor.

Can other composite divisors produce smaller values of $r(n)$?

The above graph shows that for primes greater than 2, there is no benefit in including a factor of $p$ higher than $p^2$ in $n$ to reduce $r(n)$. A few powers of 2 may have a benefit in reducing $r(n)$.



This graph shows $r(n)$ for all values of n below 10,000. We are interested in the values of n at the lower edge of the graph, highlighted with a blue line. These are values of $n$ for which

$r(n) < r(m)$ for all $m < n$





 
Defining the set 

 $Opt=\{n: r(n) < r(m)$ for all $m < n \}$

this graph shows $r(n)$ values of $n$ in $Opt$ up to 100,000

Values of $r(n)$
below the general trend for $n$ in $Opt$ are highlighted. Their prime factorisation shows the trend of the numbers:

16 = 24
144 = 24.32
720 = 24.32.5
5,040 = 24.32.5.7
55,440 = 24.32.5.7.11

Extending this trend, this graph shows values of r(n) for the following values of n:

16 = 24
144 = 24.32
720 = 24.32.5
5,040 = 24.32.5.7
55,440 = 24.32.5.7.11
720,720 =  24.32.5.7.11.13
12,252,240 =  24.32.5.7.11.13.17
232,792,560 =  24.32.5.7.11.13.17.19
5,354,228,880 =  24.32.5.7.11.13.17.19.23

$r(n)$ reaches a tantalisingly small 0.16%, but for n=5,354,228,880 (larger than 232), and for $s(n)=$8,709,120. 

Testing different divisors

The following graph tests the divisors listed above and divisors of the form $2^n$, by calculating the time to test all numbers in the range 1 to 10,000,000. For divisors of the form $2^n$ , two forms of Python function are tested, using division and masking to obtain the remainder.

For comparison, the time taken by the simplest issquare function is shown in yellow



The graph shows that the performance improves for divisors of the form $2^m$ up to $2^7$, and then levels out. The graph also shows that there is no significant advantage in using masking to calculate a number modulo $2^m$ (blue points) compared to straightforward division (green points).

The performance of the best composites (red points) is noticably better than the performance of  powers of 2 of similar magnitude. The best performance is for D= 55,440 = 24.32.5.7.11, after which the performance degrades.

Pre calculating the residues of squares modulo D

The overhead of pre-calculating the residues of square numbers modulo D needs to considered when choosing the pre-test divisor.

Using the following algorithm, the table shows the time taken to generate the set of residues of squares modulo n. The table shows that the set of square residues mod 55,440 can be calculated in 2ms


SR={0}
D=1
for b in [2**4,3**2,5,7,11]:
  squares_mod_b = set([(x*x)%b for x in range(b)])
  SR =  set([ r + n*D for r in SR for n in range(b)
              if (r+n*D)%b in squares_mod_b ])
  D*=b
Factor introduced D r(D)s(D)Time to generate residues of squares
 (ms)
16  1625.00% 4 0.01
914411.11% 16 0.12
57206.67% 48 0.23
7 5,0403.81% 192 0.53
11  55,4402.08% 1,152 2.01

We can now define algorithms with the best pre-test. Two options are presented, one with a hard coded set of square residues, and one with a calculated set.

$s(720)=48$, which is a reasonable number of square residues to hard-code into an algorithm. An implementation of issquare using D=720 is:

from math import sqrt
SR = frozenset([0,1,4,385,324,649,256,144,145,121,580,409,576,25,640,265,\
                  544,289,180,676,369,169,304,49,436,9,441,36,64,160,196,481,\
                  585,484,81,340,625,601,225,100,16,529,361,496,241,400,244,505])
def issquare(n):
  if n%720 in SR : 
    return int(sqrt(n) + 0.5) ** 2 == n
  return False

An implementation using D=55,440, with the square residues being constructed dynamically:
from math import sqrt

SR={0}
D=1
for b in [2**4,3**2,5,7,11]:
  squares_mod_b = set([(x*x)%b for x in range(b)])
  SR =  set([ r + n*D for r in SR for n in range(b)
              if (r+n*D)%b in squares_mod_b ])
  D*=b


def issquare(n):
  if n%D in SR: 
    return int(sqrt(n) + 0.5) ** 2 == n
  return False
For a further increase in speed of approximately 33%, at the expense of readability, function calls could be replaced with in-line code, for example:

Function calls
count=0
for n in xrange(10000000):
 if issquare(n): 
   count+=1
In-line code
count=0
for n in xrange(10000000):
 if n%D in SR and int(sqrt(n) + 0.5) ** 2 == n: 
   count+=1


No comments:

Post a Comment