Archive for November, 2007

248 ARITHMETIC 4.2.4 EXERCISES 1. [19] Given that (Web design portfolio)

Friday, November 30th, 2007

248 ARITHMETIC 4.2.4 EXERCISES 1. [19] Given that u and v are nonzero floating decimal numbers with the same sign, what is the approximate probability that fraction overflow occurs during the calculation of u $ V, according to Tables 1 and 2? 2. [42] Make further tests of floating point addition and subtraction, to confirm or improve on the accuracy of Tables 1 and 2. 3. [IS] What is the probability that the two leading digits of a floating decimal number are 23 , according to the logarithmic law? 4. [MS?] The text points out that the front pages of a well-used table of logarithms get dirtier than the back pages do. What if we had an antilogarithm table instead, i.e., a table giving the value of z when log,, x is given; which pages of such a table would be the dirtiest? b 5. [A&%] Let U be a random real number that is uniformly distributed in the interval 0 < U < 1. What is the distribution of the leading digits of U? 6. [%?I If we have binary computer words containing n + 1 bits, we might use p bits for the fraction part of floating binary numbers, one bit for the sign, and n -p bits for the exponent. This means that the range of values representable, i.e., the ratio of the largest positive normalized value to the smallest, is essentially 22 -p. The same computer word could be used to represent floating hexadecimal numbers, i.e., floating point numbers with radix 16, with p + 2 bits for the fraction part ((p + 2)/4 hexadecimal digits) and n -p -2 bits for the exponent; then the range of values would be 162n–p–2 = s2 - , the same as before, and with more bits in the fraction part. This may sound as if we are getting something for nothing, but the normalization condition for base 16 is weaker in that there may be up to three leading zero bits in the fraction part; thus not all of the p + 2 bits are significant. On the basis of the logarithmic law, what are the probabilities that the fraction part of a positive normalized radix 16 floating point number has exactly 0, 1, 2, and 3 leading zero bits? Discuss the desirability of hexadecimal versus binary. 7. [HA&B] Prove that there is no distribution function F(u) that satisfies (5) for each integer b 2 2, and for all real values r in the range 1 < r 5 b. 8. [HA&31 Does (10) hold when m = 0 for suitable NO(E)? 9. [HM24] (P. Diaconis.) Let S(n), Pz(n), . . , be any sequence of functions defined by repeatedly averaging a given function P,(n) according to Eq. (9). Prove that lim,,, P,(n) = PO(~) for all fixed n. b 10. [HA&%] The text shows that cm = log,, r -1 + cm, where em approaches zero as m + 00. Obtain the next term in the asymptotic expansion of cm. 11. [MIS] Given that U is a random variable distributed according to the logarithmic law, prove that l/U is also. 12. [HA& s] (R. W. Hamming.) The purpose of this exercise is to show that the result of floating point multiplication tends to obey the logarithmic law more perfectly than the operands do. Let U and V be random, normalized, positive floating point numbers, whose fraction parts are independently distributed with the respective density functions f(x) and g(x). Thus, fU 2 r and fV 2 s with probability Sllb J:,* f(z)g(y) dy dx, for l/b 2 r, s 2 1. Let h(x) be the density function of the fraction part of U X V (unrounded). Define the abnormality A(f) of a density function f to be the maximum

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 247 is (Database web hosting)

Friday, November 30th, 2007

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 247 is the coefficient of .P+ in the function ~C(z)lOZ 10 (24) +rlO( J(&)-ST This condition holds for all values of m, so (24) must equal C(Z), and we obtain the explicit formula –z (lo/r)*–1 -1 C(z) = G 102-l -1 . (25) ( > We want to study asymptotic properties of the coefficients of C(Z), to complete our analysis. The large parenthesized factor in (25) approaches ln(lO/r)/ In 10 = 1 - log,, r as z + 1, so we see that 1 - log,, r C(z) + 1 _ z = R(z) is an analytic function of the complex variable z in the circle I.4 < 11+ g/- In particular, R(z) converges for z = 1, so its coefficients approach zero. This proves that the coefficients of C(z) behave like those of (log,, T - l)/(l -z), that is, lim c, = log,, r -1. WI-CX Finally, we may combine this with (22), to show that Qm(s) approaches 1+ logloi - (1 + Ins + $(ln 3) + . . .) = l%o f uniformly for 1 5 s 2 10. 1 Therefore we have established the logarithmic law for integers by direct calculation, at the same time seeing that it is an extremely good approximation to the average behavior although it is never precisely achieved. The above proofs of Lemma Q and Theorem F are slight simplifications and amplifications of methods due to B. J. Flehinger, AMM 73 (1966), 1056-1061. Many authors have written about the distribution of initial digits, showing that the logarithmic law is a good approximation for many underlying distributions; see the survey by Ralph A. Raimi, -83 (1976), 521-538, for a comprehensive review of the literature. Another interesting (and different) treatment of floating point distribution has been given by Alan G. Konheim, Math. Comp. 19 (1965), 143-144. Exercise 17 discusses an approach to the definition of probability under which the logarithmic law holds exactly, over the integers. Furthermore, ex- ercise 18 demonstrates that any reasonable definition of probability over the integers must lead to the logarithmic law, if it assigns a value to the probability of leading digits.

246 ARITHMETIC 4.2.4 It is not difficult to (Remote web server)

Thursday, November 29th, 2007

246 ARITHMETIC 4.2.4 It is not difficult to solve the recurrence formula (11) for R, : We have Z&(s) = -1, RI(s) = -1 + T/S, Rz(s) = -1 + (r/s)(l + ln(s/r)), and in general For the stated range of s, this converges uniformly to -1 + (T/s) exp(ln(s/r)) = 0. The recurrence (11) for Qm takes the form &m(s) = ;(cm + 1 + 1 &m-l(t) ,t), (20) where cm=-1 9(l QmAl(t) dt + I &-1(t) dt) - -(21) The solution to recurrence (20) is easily found by trying out the first few cases and guessing at a formula that can be proved by induction; we find that c,+Ac,-ihis+…+ . (22) > It remains for us to calculate the coefficients c,, which by (19), (21), and (22) satisfy the relations Cl = (T -10)/9; 1 9( In 10 + $c,–i(ln 1O)2 + . . . + -ci(ln 10) (23) Gn+1=-Gn +~~~+~ln~+…+~~n~)..)-~O~. This sequence appears at first to be very complicated, but actually we can analyze it without difficulty with the help of generating functions. Let C(z) = Cl2 + c2.2 + c3z3 + . . . ; then since 10 = 1 + z In 10 + (1/2!)(2 In 1O)2 + . . . , we deduce that 1 9 Gn+1= yp+1+-c??%+1 10 = h cn+i +c,lnlO+~~~+~cl(lnlO)m ( m! > +k l+…+-$ lny m -1 ( .( >>

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 245 O.S,- (Web domain)

Wednesday, November 28th, 2007

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 245 O.S,- 1 2 I 3 I 4 I 5 I 6 I I I 9I I 7 8 10 s Fig. 5. The probability that the leading digit is 1. The gist of Lemma Q is that we have the limiting relationship lim Pm(lOns) = Sm(s). (16) n-+oo Also, since Sm(s) is not constant as s varies, the limit lim P,(n) n–r00 (which would be our desired probability ) does not exist for any m. The situation is shown in Fig. 5, which shows the values of S*(s) when m is small and r = 2. Even though S,(s) is not a constant, so that we do not have a definite limit for P,(n), note that already for m = 3 in Fig. 5 the value of Z&(s) stays very close to log,, 2 = 0.30103.. . . Therefore we have good reason to suspect that S&s) is very close to log,, r for all large m, and, in fact, that the sequence of functions (Sm(s)) converges uniformly to the constant function log,, V-. It is interesting to prove this conjecture by explicitly calculating Q,(s) and R,,Js) for all m, as in the proof of the following theorem: Theorem F. Let Sm(s) be the limit defined in (16). For dl E > 0, there exists a number N(e) such that ISm(s) - lw&, 4 < 6, for 1 5 s 5 10, (17) whenever m > N(E). Proof In view of Lemma Q, we can prove this result if we can show that there is a number M depending on E such that, for 1 < s 5 10 and for all m > M, we have IQm(s) -loglo 4 < E and I%dS)I < E. (9

244 ARITHMETIC 4.2.4 Proof. Consider the (Web site construction) functions Q&s)

Wednesday, November 28th, 2007

244 ARITHMETIC 4.2.4 Proof. Consider the functions Q&s) and R&s) defined by (ll), and let &m(t), t 5 r. (12) Sm(t) = &m(t) + &n(t), t > r. We will prove the lemma by induction on m. First note that &r(s) = (1 + (s - 1) + (r -10)/9)/s = 1 + (T -10)/9s, and RI(s) = (r -s)/s. From (8) we find that IPr(lOns) -Sr(s)/ = O(n)/lO ; this establishes the lemma when m = 1. Now for m > 1, we have and we want to approximate this quantity. By induction, the difference (13) is less than qc when 1 5 q 5 10 and j > N,-l(~). Since Sm-l(t) is continuous, it is a Riemann-integrable function; and the difference c s-,(~) -&1(t)dtl (14) lOj &-r(e). Therefore for 72 > N, the difference ),,lo~s)-~(o~,,~~los~-l~t)dt+/;.s~-l~~~~t)/ (15) - is bounded by C o

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 243 Pl(lons) (Best web design)

Tuesday, November 27th, 2007

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 243 Pl(lons) = -&rl -1 + [lOrl -10 +. . . + [lo - rl -ion-l + [lO s] + 1 - 10%) = –&1+ 10 +. . . + 10n-l) + O(n) + Lions] -1 -10 -. *. -10 ) = -y g lo? -lon+l) + lonsl+ O(n)). (8) 102 ( As n + 00, P1(lOns) therefore approaches the limiting value 1 + (r -10)/9s. The above calculation for the case s 5 r can be modified so that it is valid for s > r if we replace [lOns] + 1 by [lO+]; when s 2 r, we therefore obtain the limiting value lO(r -1)/9s. [See J. F ranel, Naturforschende Gesellschaft, Vierteljahrsschrift 62 (Ziirich, 1917), 286-295.1 In other words, the sequence (PI(n)) has subsequences (Pl(lOns)) whose limit goes from (T - 1)/9 up to lO(r -1)/9r and down again to (T - 1)/9, as s goes from 1 to T to 10. We see that PI(n) has no limit as n -+ m; and the values of PI(n) for large n are not particularly good approximations to our conjectured limit logI T either! Since PI(~) doesn t approach a limit, we can try to use the same idea as (7) once again, to average out the anomalous behavior. In general, let &+1(4= ; c Pm(k). l_ 1 and any real number t > 0, there are functions Qm(s), R,( s) an d an integer N%(E), such that whenever n > Nm(c) and 1 5 s 5 10, we have IPm(lOns)-&m(s)1 < t, ifs 5 r; I%(lW -(&m(s)+ %&))I < E, ifs > T. (10) Furthermore the functions Qm(s) and &(s) satisfy the relations s Rm(s) = ; s7 %-l(t)& I &o(s) = 1, Ro(s) = -1.

242 ARITHMETIC 4.2.4 One way out of the (Zeus web server)

Monday, November 26th, 2007

242 ARITHMETIC 4.2.4 One way out of the difficulty is to regard the logarithm law p(r) = log,, r as only a very close approximation to the true distribution. The true distribution itself may perhaps be changing as the universe expands, becoming a better and better approximation as time goes on; and if we replace 10 by an arbitrary base b, the approximation might be less accurate (at any given time) as b gets larger. Another rather appealing way to resolve the dilemma, by abandoning the traditional idea of a distribution function, has been suggested by R. A. Raimi, AMA4 76 (1969), 342-348. The hedging in the last paragraph is probably a very unsatisfactory explana- tion, and so the following further calculation (which sticks to rigorous mathe- matics and avoids any intuitive, yet paradoxical, notions of probability) should be welcome. Let us consider the distribution of the leading digits of the positive integers, instead of the distribution for some imagined set of real numbers. The investigation of this topic is quite interesting, not only because it sheds some light on the probability distributions of floating point data, but also because it makes a particularly instructive example of how to combine the methods of discrete mathematics with the methods of infinitesimal calculus. In the following discussion, let r be a fixed real number, 1 5 T 5 10; we will attempt to make a reasonable definition of p(r), the probability that the representation lOeN . fN of a random positive integer N has 1OfN < r, assuming infinite precision. To start, let us try to find the probability using a limiting method like the definition of Pr in Section 3.5. One nice way to rephrase that definition is to define 1, if n = 10e. f where 1Of < r, PO(n) = i.e., if (loglo n) mod 1 < loglo r; (6) 1 0, otherwise. Now PO(l), Po(2), . . . is an infinite sequence of zeros and ones, with ones to represent the cases that contribute to the probability we are seeking. We can try to average out this sequence, by defining Thus if we generate a random integer between 1 and n using the techniques of Chapter 3, and convert it to floating decimal form (e, f), the probability that 1Of < r is exactly PI(n). It is natural to let limn+oo PI(~) be the probability p(r) we are after, and that is just what we did in Section 3.5. But in this case the limit does not exist: For example, let us consider the subsequence PI(S), Pl(lOS), Pl(lOOS), . . . , pl(lons), . . .) where s is a real number, 1 5 s 5 10. If s 2 r, we find that

Michigan web site - 4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 241 By

Sunday, November 25th, 2007

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 241 By our assumption, we should also have p(r) = probability that (log,, U + log,, c) mod 1 5 log,, r probability that (logI U mod 1) 5 log,, r -logI c or (log,, U mod 1) 2 1 - log,,, c, ifc 5 r; = probability that (log,, U mod 1) 2 log,,, r + 1 -log,,, c I and (log,, U mod 1) 2 1 - log,, c, if c > -r; drlc) + 1 -P(lOlC), ifc < r; P(lo~lc) -PUOlC), ifc > r. (2) Let us now extend the function p(r) to values outside the range 1 2 r 5 10, by defining p(l09) = p(r) + n; then if we replace 10/c by d, the last equation of (2) may be written PC4 = p(r) + ~(4. (3) If our assumption about invariance of the distribution under multiplication by a constant factor is valid, then Eq. (3) must hold for all r > 0 and 1 5 d 2 10. The facts that p(1) = 0, ~(10) = 1 now imply that 1 = p(10) = p((XO)n) = p(CCl) + p((KG) - ) = . *. = rip(Z); hence we deduce that p(lOmln ) = m/n for all positive integers m and n. If we now decide to require that p is continuous, we are forced to conclude that p(r) = logm r, and this is the desired law. Although this argument may be more convincing than the first one, it doesn t really hold up under scrutiny if we stick to conventional notions of probability. The traditional way to make the above argument rigorous is to assume that there is some underlying distribution of numbers F(u) such that a given positive number U is < u with probability F(u); then the probability of concern to us is p(r) = C (F(lO9) -F(lOm)), (4 m summed over all values -IX < m < 00. Our assumptions about scale invariance and continuity have led us to conclude that p(r) = loglo r. Using the same argument, we could prove that C (F(b9) -F(bm)) = log, r, (5) m for each integer b 2 2, when 1 5 r 5 b. But there is no distribution function F that satisfies this equation for all such b and r! (See exercise 7.)

240 ARITHMETIC 4.2.4 (Michigan web site) gave good grounds for believing

Saturday, November 24th, 2007

240 ARITHMETIC 4.2.4 gave good grounds for believing that the leading digit d occurs with probability log,,,(l + l/d). The same distribution was discovered empirically, many years later, by Frank Benford, who reported the results of 20,229 observations taken from different sources [Proc. Amer. Philosophical Sot. 78 (1938), 551-5721. In order to account for this leading-digit law, let s take a closer look at the way we write numbers in floating point notation. If we take any positive number u, its leading digits are determined by the value (logrc u) mod 1: The leading digit is less than d if and only if (loglo u) mod 1 < log,, d, (1) since 10fu = 10(lofJlo u, mod l. Now if we have a random positive number U, chosen from some reasonable distribution that might occur in nature, we might expect that (log,, U) mod 1 would be uniformly distributed between zero and one, at least to a very good approximation. (Similarly, we expect U mod 1, U2 mod 1, d-mod 1, etc., to be uniformly distributed. We expect a roulette wheel to be unbiased, for essentially the same reason.) Therefore by (1) the leading digit will be 1 with probability log,, 2 M 30.103 percent; it will be 2 with probability log,, 3 - log,,2 M 17.609 percent; and, in general, if T is any real value between 1 and 10, we ought to have 1Ofu 2 T approximately logic T of the time. Another way to explain this law is to say that a random value U should appear at a random point on a slide rule, according to the uniform distribution, since the distance from the left end of a slide rule to the position of U is propor- tional to (log,,, U) mod 1. The analogy between slide rules and floating point calculation is very close when multiplication and division are being considered. The fact that leading digits tend to be small is important to keep in mind; it makes the most obvious techniques of average error estimation for floating point calculations invalid. The relative error due to rounding is usually a little more than expected. Of course, it may justly be said that the heuristic argument above does not prove the stated law. It merely shows us a plausible reason why the leading digits behave the way they do. An interesting approach to the analysis of leading digits has been suggested by R. Hamming: Let p(r) be the probability that 1Ofu 2 r, where 1 2 r 5 10 and fry is the normalized fraction part of a random normalized floating point number U. If we think of random quantities in the real world, we observe that they are measured in terms of arbitrary units; and if we were to change the definition of a meter or a gram, many of the fundamental physical constants would have different values. Suppose then that all of the numbers in the universe are suddenly multiplied by a constant factor c; our universe of random floating point quantities should be essentially unchanged by this transformation, so p(r) should not be affected. Multiplying everything by c has the effect of transforming (log,, U) mod 1 into (log,, U + log,, c) mod 1. It is now time to set up formulas that describe the desired behavior; we may assume that 1 2 c 2 10. By definition, p(r) = probability that (log,, U) mod 1 2 log,, r.

Web hosting ratings - 4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 239 Table

Friday, November 23rd, 2007

4.2.4 DISTRIBUTION OF FLOATING POINT NUMBERS 239 Table 2 EMPIRICAL DATA FOR NORMALIZATION AFTER ADDITION b=2 b= 10 b= 16 b = 64 Shift right 1 0.20 0.07 0.06 0.03 No shift 0.59 0.80 0.82 0.87 Shift left 1 0.07 0.08 0.07 0.06 Shift left 2 0.03 0.02 0.01 0.01 Shift left 3 0.02 0.00 0.01 0.00 Shift left 4 0.02 0.01 0.00 0.01 Shift left >4 0.06 0.02 0.02 0.02 When u and v have the same sign and are normalized, then u + v either requires one shift to the right (for fraction overflow), or no normalization shifts whatever. When u and v have opposite signs, we have zero or more left shifts during the normalization. Table 2 gives the observed number of shifts required; the last line of that table includes all cases where the result was zero. The average number of left shifts per normalization was about 0.9 when b = 2; about 0.2 when b = 10 or 16; and about 0.1 when b = 64. B. The fraction parts. Further analysis of floating point routines can be based on the statistical distribution of the fraction parts of randomly chosen normalized floating point numbers. In this case the facts are quite surprising, and there is an interesting theory that accounts for the unusual phenomena that are observed. For convenience let us temporarily assume that we are dealing with floating decimal (i.e., radix 10) arithmetic; modifications of the following discussion to any other positive integer base b will be very straightforward. Suppose we are given a random positive normalized number (e, f) = 10e . f. Since f is normalized, we know that its leading digit is 1, 2, 3, 4, 5, 6, 7, 8, or 9, and it seems natural to assume that each of these nine possible leading digits will occur about one-ninth of the time. But, in fact, the behavior in practice is quite different. For example, the leading digit tends to be equal to 1 over 30 percent of the time! One way to test the assertion just made is to take a table of physical constants (e.g., the speed of light, the acceleration of gravity) from some standard reference. If we look at the Handbook of Mathematical Functions (U.S. Dept of Commerce, 1964), for example, we find that 8 of the 28 different physical constants given in Table 2.3, roughly 29 percent, have leading digit equal to 1. The decimal values of n! for 1 5 n 5 100 include exactly 30 entries beginning with 1; so do the decimal values of 2n and of F,, for 1 5 n 5 100. We might also try looking at census reports, or a Farmer s Almanack (but not a telephone directory). In the days before pocket calculators, the pages in well-used tables of loga- rithms tended to get quite dirty in the front, while the last pages stayed relatively clean and neat. This phenomenon was apparently first mentioned in print by the American astronomer Simon Newcomb [Amer. J. Math. 4 (1881), 39-401, who