Archive for January, 2008

336 ARITHMETIC 4.5.2 This completes our analysis of (Web hosting services)

Thursday, January 31st, 2008

336 ARITHMETIC 4.5.2 This completes our analysis of the average values of C and D. The other three quantities appearing in the running time of Algorithm B are rather easily analyzed; see exercises 6, 7, and 8. Thus we know approximately how Algorithm B behaves on the average. Let us now consider a worst case analysis: What values of u and v are in some sense the hardest to handle? If we assume as before that LlguJ = m and 1kvJ = n, let us try to find (u, v) that make the algorithm run most slowly. In view of the fact that the subtractions take somewhat longer than the shifts, when the auxiliary bookkeeping is considered, this question may be rephrased by asking for the inputs u and v that require most subtractions. The answer is somewhat surprising; the maximum value of C is exactly max(m, n) + 1, (46) although the lattice-point model would predict that substantially higher values of C are possible (see exercise 26). The derivation of the worst case (46) is quite interesting, so it has been left as an amusing problem for the reader to work out by himself (see exercises 27 and 28). EXERCISES 1. [A&21] How can (8), (9), (lo), (11) and (12) be derived easily from (6) and (7)? 2. [A4.86] Given that u divides zlizl~ . . . v,, prove that u divides gcd(u, VI) gcd(u, ~2). . . gcd(u, v,). 3. [Mz%] Show that the number of ordered pairs of positive integers (u, v) such that lcm(u, v) = n is the number of divisors of n2. 4. [A&Z] Given positive integers u and v, show that there are divisors u of u and v of v such that gcd(u , v ) = 1 and u v = lcm(u, v). b 5. [AL%] Invent an algorithm (analogous to Algorithm B) for calculating the greatest common divisor of two integers based on their balanced ternary representation. Dem- onstrate your algorithm by applying it to the calculation of gcd(40902,24140). 6. [MZ?] Given that u and v are random positive integers, find the mean and the standard deviation of the quantity A that enters into the timing of Program B. (This is the number of right shifts applied to both u and v during the preparatory phase.) 7. [A4,20] Analyze the quantity B that enters into the timing of Program B. b 8. [MEY] Show that in Program B, the average value of E is approximately equal to +C,,,, where C,,, is the average value of C. 9. [18] Using Algorithm B and hand calculation, find gcd(31408,2718). Also find integers m and n such that 31408m + 2718n = gcd(31408,2718), using Algorithm X.

Adult web hosting - 4.5.2 THE GREATEST COMMON DIVISOR 335 These experiments

Wednesday, January 30th, 2008

4.5.2 THE GREATEST COMMON DIVISOR 335 These experiments showed a rather small standard deviation from the observed average values. The coefficients $ and 1 of m in (42) and (43) can be verified rigorously without using the lattice-point approximation (see exercise 21); so the error in the lattice-point model is apparently in the coefficient of n, which is too high. The above calculations have been made under the assumption that u and v are odd and in the ranges 2m 5 u < 2m+1 and 2n 2 v < 2n+1. If we assume instead that 1~ and v are to be any integers, independently and uniformly distributed over the ranges 12 IL< 2N, 1 2 v < 2N, then we can calculate the average values of C and D from the data already given; in fact, if C,, denotes the average value of C under our earlier assumptions, exercise 22 shows that we have (zN -1)2C = N2Coo + 2N c (N -n)2n-1C,0 l 1, or by 2d/(X-1 -1) if X < 1. Th e random variable X has a limiting distribution that makes it possible to analyze the average value of the ratio by which max(u, v) decreases at each iteration; see exercise 25. Numerical calculations show that this maximum decreases by p = 0.705971246102 bits per iteration; the agreement with experiment is so good that Brent s constant p must be the true value of the number 0.70 in (45), and we should replace 0.203 by 0.206 in (43). [See Algorithms and Complexity, ed. by J. F. Daub (New York: Academic Press, 1976), 321-355.1

334 ARITHMETIC 4.5.2 Table 1 NUMBER OF SUBTRACTIONS

Wednesday, January 30th, 2008

334 ARITHMETIC 4.5.2 Table 1 NUMBER OF SUBTRACTIONS IN ALGORITHM B n n 0 12 3 4 5 0 12 3 4 5 0 1.00 2.00 2.50 3.00 3.50 4.00 1.00 2.00 2.50 3.00 3.50 4.00 0 1 2.00 1.00 2.50 3.00 3.50 4.00 2.00 1.00 3.00 2.75 3.63 3.94 1 2 2.50 2.50 2.25 3.38 3.88 4.38 2.50 3.00 2.00 3.50 3.88 4.25 2 3 3.00 3.00 3.38 3.25 4.22 4.72 3.00 2.75 3.50 2.88 4.13 4.34 3 4 3.50 3.50 3.88 4.22 4.25 5.10 3.50 3.63 3.88 4.13 3.94 4.80 4 5 4.00 4.00 4.38 4.72 5.10 5.19 4.00 3.94 4.25 4.34 4.80 4.60 5 m Predicted by model Actual average values m execution of Algorithm B, with u and v both odd, the corresponding path in the lattice model must satisfy the relation number of shifts + number of diagonal points f 2Llg gcd(u, v)] = m + n, since we are assuming that T in (33) is always either 0 or 1. The average value of [lggcd(u, w)J predicted by the lattice-point model is approximately 2 (see exercise 20). Therefore we have, for the total number of shifts, D=m+n-(in+&++&,)–* (41) =m+jn–g-Zj3Smn, when m 2 n, plus terms that decrease rapidly to zero for large n. To summarize the most important facts we have derived from the lattice- point model, we have shown that if u and w are odd and if [lg u] = m, [lg v] = n, then the quantities C and D that are the critical factors in the running time of Program B will have average values given by C = $m + i$ + O(l), D = m -I- i$ -I- O(l), m >_ 72. (42) But the model that we have used to derive (42) is only a pessimistic approxima- tion to the true behavior; Table 1 compares the true average values of C, com- puted by actually running Algorithm B with all possible inputs, to the values predicted by the lattice-point model, for small m and n. The lattice model is completely accurate when m or n is zero, but it tends to be less accurate when (m - n( is small and min(m, n) is large. When m = n = 9, the lattice-point model gives C = 8.78, compared to the true value C = 7.58. Empirical tests of Algorithm B with several million random inputs and with various values of m, n in the range 29 5 m,n 5 37 indicate that the actual average behavior of the algorithm is given by CR5 irn + 0.203n+ 1.9 - 0.4(0.6)m-n, m 2 n. (43) DZ m + 0.41n -0.5 - 0.7(0.6)m-n,

4.5.2 THE GREATEST COMMON DIVISOR 333 Thus the (Apache web server for windows)

Tuesday, January 29th, 2008

4.5.2 THE GREATEST COMMON DIVISOR 333 Thus the double recurrence (34) can be transformed into the single recurrence relation in (37). Use of the generating function G(z) = & + Alz + A2z2 + . . . now transforms (37) into the equation (1-$z-&z2)G(z) = ao+aiz+a2z2+~ 1–+$1–z/2 A?-+ (l-z/2)2 7 (38) where a~, al, and a2 are constants that can be determined by the values of Ao, Al, and Aa. Since 1-$z + az2 = (1+ az)(l -z), we can express G(z) by the method of partial fractions in the form Tedious manipulations produce the values of these constants bo, . . . , bs, and thus the coefficients of G(z) are determined. We finally obtain the solution + (-:,Y-& -+@ + &c + ++bo) + &c&o; A,, = +mc + n($a + +A + (&a + $b + & + @oo) + 2- (44 + (-$,+&a -@ + $c + @bo), m > n. (39) With these elaborate calculations done, we can readily determine the be- havior of the lattice-point model. Assume that the inputs u and v to the algo- rithm are odd, and let m = LlguJ, n = [lg vJ. The average number of subtrac- tion cycles, namely the quantity C in the analysis of Program B, is obtained by setting a = 1, b = 0, c = 1, A00 = 1 in the recurrence (34). By (39) we see that (for m 2: n) the lattice model predicts subtraction cycles, plus terms that rapidly go to zero as n approaches infinity. The average number of times that gcd(u, v) = 1 is obtained by setting a = b = c = 0, A00 = 1; this gives the probability that u and v are relatively prime, approximately 2. Actually, since u and v are assumed to be odd, they should be relatively prime with probability 8/.rr2 (see exercise 13), so this reflects the degree of inaccuracy of the lattice-point model. The average number of times that a path from (m, n) goes through one of the diagonal points (m , m ) for some m 2 1 is obtained by setting a = 1, b = c = &O = 0 in (34); so we find that this quantity is approximately m n. *n + & + &L,, when 2 Now we can determine the average number of shifts, i.e., the number of times that step B3 is performed. (This is the quantity D in Program B.) In any

Best web design - 332 ARITHMETIC 4.5.2 An analysis of the lattice-point

Monday, January 28th, 2008

332 ARITHMETIC 4.5.2 An analysis of the lattice-point model can be carried out by solving the following rather complicated set of double recurrence relations: 1 &, = a + 5&qm–2) + … -I LA,o &, ifm 2 1; 2m-l + A mn = c + iAcrn-~jn + … + &&AI, + &hn, if m > 72 2 0; A mn - &V2, if n > m 2 0. (34) The problem is to solve for A,, in terms of m, n, and the parameters a, b, c, and Aoo. This is an interesting set of recurrence equations, which have an interesting solution. First we observe that if 0 5 rz < m, A(,+,),= c+ c 2-kA(m+,-k), + 2-m&n l 0. (35) Now let A,,, = A,,. If m > 0, we have A(m+qm= c+ c 2-k-+n+wc)m+ 2-m-1&m l_ 2, (37)

Geocities web hosting - 4.5.2 THE GREATEST COMMON DIVISOR 331 not on

Monday, January 28th, 2008

4.5.2 THE GREATEST COMMON DIVISOR 331 not on the actual values of u and v. Let us call this approximation a lattice-point model, since we will say that we are at the point (m, n) when Llg ~1 = m and [lg v] = n. From point (m, n) the algorithm takes us to (m , n) if u > U, or to (m,n ) if u < U, or terminates if u = v. For example, the calculation starting with u = 20451 and u = 6035 begins at the point (14,12), then goes to (9,12), P!ll), (%9), (4,9), (4,5), (4,4), and terminates. In line with the comments of the preceding paragraph, we will make the following assumptions about the probability that we reach a given point just after point (m, n): Case 1, m > n. Case 2, m < 12. Next point Probability Next point Probability Cm - Ln) % (m,n-1) a (m -2, n) a (m,n-2) a (rn,l) w-1. . h 0) w -1 Case 3, m = n > 0. Next point Probability Cm - 2, n), Cm, n -2) a, a Cm - 3, n), (3 n -3) QJ R (0, ni im, 0) c:,i,~id,m terminate w-1 For example, from points (5,3) the lattice-point model would go to points (4,3), (3,3), (2,3), (1,3), (0,3) with the respective probabilities 4, a, i, &, &; from (4,4) it would go to (2,4), (1,4), (0,4), (4,2), (4, I), (4,0), or would terminate, with the respective probabilities a, A, &, $, 4, &, i. When m and n are both 0, the formulas above do not apply; the algorithm always terminates in such a case, since m = n = 0 implies that u = v = 1. The detailed calculations of exercise 18 show that this lattice-point model is somewhat pessimistic. In fact, when m > 3 the actual probability that Algorithm B goes from (m, m) to one of the two points (m -2, m) or (m, m -2) is equal to i, although we have assumed the value 3; the algorithm actually goes from (m, m) to (m-3, m) or (m, m-3) with probability $, not a; and it actually goes from (m + 1, m) to (m, m) with probability 4, not 4. The probabilities in the model are nearly correct when Irn -n( is large, but when Irn -nl is small the model predicts slower convergence than is actually obtained. In spite of the fact that our model is not a completely faithful representation of Algorithm B, it has one great virtue: It can be completely analyzed! Furthermore, empirical experiments with Algorithm B show that the behavior predicted by the lattice- point model is analogous to the true behavior.

330 ARITHMETIC (Web host server) 4.5.2 L3. [Emulate Euclid.] Set T

Sunday, January 27th, 2008

330 ARITHMETIC 4.5.2 L3. [Emulate Euclid.] Set T t A -qC, A t C, C t T, T + B -qD, B c D, D c T, T c C -q6, ii c 6, 6 t T, and go back to step L2. (These single-precision calculations are the equivalent of multiple-precision operations, as in (28), under the conventions of (29).) L4. [Multiprecision step.] If B = 0, set t t umod v, u t v, v t t, using multiple-precision division. (This happens only if the single-precision opera- tions cannot simulate any of the multiple-precision ones. It implies that Euclid s algorithm requires a very large quotient, and this is an extremely rare occurrence.) Otherwise, set t c Au, t t t+Bv, w e Cu, w t w+Dv, u t t, 21 t w (using straightforward multiple-precision operations). Go back to step Ll. 1 The values of A, B, C, D remain as single-precision numbers throughout this calculation, because of (31). Algorithm L requires a somewhat more complicated program than Algo- rithm B, but with large numbers it will be faster on many computers. The binary technique of Algorithm B can, however, be speeded up in a similar way (see exercise 34), to the point where it continues to win. Algorithm L has the advantage that it can readily be extended, as in Algorithm X (see exercise 17); furthermore, it determines the sequence of quotients obtained in Euclid s algorithm, and this yields the regular continued fraction expansion of a real number (see exercise 4.5.3-18). Analysis of the binary algorithm. Let us conclude this section by studying the running time of Algorithm B, in order to justify the formulas stated earlier. An exact determination of the behavior of Algorithm B appears to be ex- ceedingly difficult to derive, but we can begin to study it by means of an ap- proximate model of its behavior. Suppose that u and v are odd numbers, with u > v and 1kuJ = m, [lgvl = 72. (32) (Thus, u is an (m + 1)-bit number, and v is an (n + 1)-bit number.) Algorithm B forms u -v and shifts this quantity right until obtaining an odd number u that replaces u. Under random conditions, we would expect to have u = (u -v)/2 about one-half of the time, u = (u -v)/4 about one-fourth of the time, u = (u -v)/8 about one-eighth of the time, and so on. We have ]lg ~ 1 = m -k -T, (33) where k is the number of places that u -v is shifted right, and where r is [lgu] -]lg(u -v)l, the number of bits lost at the left during the subtraction of v from u. Note that r 5 1 when m 2 n + 2, and r 2 1 when m = n. For simplicity, we will assume that r = 0 when m # n and that r = 1 when m = n, although this lower bound tends to make u seem larger than it usually is. The approximate model we shall use to study Algorithm B is based solely on the values m = Llgu] and n = [lgvJ throughout the course of the algorithm,

4.5.2 THE GREATEST COMMON DIVISOR 329 so long (Net web server)

Saturday, January 26th, 2008

4.5.2 THE GREATEST COMMON DIVISOR 329 so long as (27) holds. We can therefore avoid the multiple-precision operations of the first five steps, and replace them all by a multiple-precision calculation of -4u0 + 11~ and 7~ -19vo. In this case we obtain u = 1268728, v = 279726; the calculation can now proceed in a similar manner with u = 1268, U = 280, U = 1269, II = 279, etc. If we had a larger accumulator, more steps could be done by single-precision calculations; our example showed that only five cycles of Euclid s algorithm were combined into one multiple step, but with (say) a word size of 10 digits we could do about twelve cycles at a time. (Results proved in Section 4.5.3 imply that the number of multiple-precision cycles that can be replaced at each iteration is essentially proportional to the number of digits used in the single-precision calculations.) Lehmer s method can be formulated as follows: Algorithm L (Euclid s algorithm for large numbers). Let u,v be nonnegative integers, with u 2 v, represented in multiple precision. This algorithm computes the greatest common divisor of u and v, making use of auxiliary single-precision pdigit variables ti, 6, A, B, C, D, T, g, and auxiliary multiple-precision variables t and w. Ll. [Initialize.] If v is small enough to be represented as a single-precision value, calculate gcd(zl, v) by Algorithm A and terminate the computation. Otherwise, let C be the p leading digits of U, and let 6 be the corresponding digits of v; in other words, if radix-b notation is being used, Q t [u/b ] and 6 +– [v/b ], where k is as small as possible consistent with the condition 6 < bp. Set A t 1, B +-0, C +-0, D e 1. (These variables represent the coefficients in (28), where u=AuO+BvO, v = Cue + Dv,,, (29) in the equivalent actions of Algorithm A on multiple-precision numbers. We also have u =C+B, v = 6 + D, u = 6 + A, v = 6 + c (30) in terms of the notation in the example worked above.) L2. [Test quotient.] Set q + Kc + A)l(fi + C)l. If q # NC + B)/(fj + D)J, go to step L4. (This step tests if q # q , in the notation of the above example. Note that single-precision overflow can occur in special circumstances during the computation in this step, but only when Q = bp -1 and A = 1 or when 6 = bP -1 and D = 1; the conditions O

Hosting web - 328 ARITHMETIC 4.5.2 we don t really mind having

Friday, January 25th, 2008

328 ARITHMETIC 4.5.2 we don t really mind having a large quotient 4, since Euclid s algorithm makes a great deal of progress when it replaces u by u mod v in such a case. A significant improvement in the speed of Euclid s algorithm when high- precision numbers are involved can be achieved by using a method due to D. H. Lehmer [AMM45 (1938), 227-2331. Working only with the leading digits of large numbers, it is possible to do most of the calculations with single-precision arith- metic, and to make a substantial reduction in the number of multiple-precision operations involved. We save a lot of time by doing a virtual calculation instead of the actual one. For example, let us consider the pair of eight-digit numbers u = 27182818, v = 10000000, assuming that we are using a machine with only four-digit words. Let U = 2718, v = 1001, u = 2719, v = 1000; then u//v and u /v are approximations to u/v, with u /v < u/v < U /V . (27) The ratio U/V determines the sequence of quotients obtained in Euclid s algo- rithm. If we carry out Euclid s algorithm simultaneously on the single-precision values (u , v ) and (u , v ) until we get a different quotient, it is not difficult to see that the same sequence of quotients would have appeared to this point if we had worked with the multiple-precision numbers (u, v). Thus, consider what happens when Euclid s algorithm is applied to (u , v ) and to (u , v ): 21′ 21′ q’ U” 21″ q’ 2718 1001 2 2719 1000 2 1001 716 1 1000 719 1 716 285 2 719 281 2 285 146 1 281 157 1 146 139 1 157 124 1 139 7 19 124 33 3 The first five quotients are the same in both cases, so they must be the true ones. But on the sixth step we find that q’ # q”, so the single-precision calculations are suspended. We have gained the knowledge that the calculation would have proceeded as follows if we had been working with the original multiple-precision numbers: 21 V q uo vo 2 vo uo - 2vc 1 uo - 2vc -uo + 3vo 2 (28) -uo + 3vo 3uo - 8vo 1 3uo - 8vo -4uc + llvc 1 -4210 + llvc 7uc - 19vo ? (The next quotient lies somewhere between 3 and 19.) No matter how many digits are in u and V, the first five steps of Euclid s algorithm would be the same as (28)

Web hosting comparison - 4.5.2 THE GREATEST COMMON DIVISOR 327 In other

Friday, January 25th, 2008

4.5.2 THE GREATEST COMMON DIVISOR 327 In other words, all integer solutions (w, Z, y, Z) to the original equations (li ), (18) are obtained from (23) by letting ti and t3 independently run through all integers. The general method that has just been illustrated is based on the following procedure: Find a nonzero coefficient c of smallest absolute value in the system of equations. Suppose that this coefficient appears in an equation having the form cx,, + cl51 + + Ckxk = d; (24 assume for simplicity that c > 0. If c = 1, use this equation to eliminate the variable x0 from the other equations remaining in the system; then repeat the procedure on the remaining equations. (If no more equations remain, the computation stops, and a general solution in terms of the variables not yet eliminated has essentially been obtained.) If c > 1, then if cl mod c = . . . = ck mod c = 0 check that d mod c = 0, otherwise there is no integer solution; then divide both sides of (24) by c and eliminate x0 as in the case c = 1. Finally, if c > 1 and not all of cimodc, . . . . ck mod c are zero, then introduce a new variable lc/cjxO + Lcl/cjxl + + Lck/cjxk = t; (25) eliminate the variable x0 from the other equations, in favor of t, and replace the original equation (24) by ct + (cl mod c)5i + . . . + (ck mod c)rk = d. (2 4 (Cf. (19) and (21) in the above example.) This process must terminate, since each step reduces either the number of equations or the size of the smallest nonzero coefficient in the system. A study of the above procedure will reveal its intimate connection with Euclid s algorithm. The method is a comparatively simple means of solving linear equations when the variables are required to take on integer values only. It isn t the best available method for this problem, however; substantial refinements are possible, but beyond the scope of this book. High-precision calculation. If u and v are very large integers, requiring a multiple- precision representation, the binary method (Algorithm B) is a simple and fairly efficient means of calculating their greatest common divisor, since it involves only subtractions and shifting. By contrast, Euclid s algorithm seems much less attractive, since step A2 requires a multiple-precision division of u by v. But this difficulty is not really as bad as it seems, since we will prove in Section 4.5.3 that the quotient Lu/vJ is almost always very small; for example, assuming random inputs, the quotient l~/vj will be less th an 1000 approximately 99.856 percent of the time. Therefore it is almost always possible to find lu/vJ and (U modv) using single-precision calculations, together with the comparatively simple operation of calculating u -gv where q is a single-precision number. Furthermore, if it does turn out that u is much larger than 2, (e.g., the initial input data might have this form),