242 ARITHMETIC 4.2.4

One way out of the difficulty is to regard the logarithm law $p(r) = \log_{10} r$ as only a very close approximation to the true distribution. The true distribution itself may perhaps be changing as the universe expands, becoming a better and better approximation as time goes on; and if we replace 10 by an arbitrary base $b$, the approximation might be less accurate (at any given time) as $b$ gets larger. Another rather appealing way to resolve the dilemma, by abandoning the traditional idea of a distribution function, has been suggested by R. A. Raimi, AMM 76 (1969), 342-348.

The hedging in the last paragraph is probably a very unsatisfactory explanation, and so the following further calculation (which sticks to rigorous mathematics and avoids any intuitive, yet paradoxical, notions of probability) should be welcome. Let us consider the distribution of the leading digits of the positive integers, instead of the distribution for some imagined set of real numbers. The investigation of this topic is quite interesting, not only because it sheds some light on the probability distributions of floating point data, but also because it makes a particularly instructive example of how to combine the methods of discrete mathematics with the methods of infinitesimal calculus.

In the following discussion, let $r$ be a fixed real number, $1 \le r \le 10$; we will attempt to make a reasonable definition of $p(r)$, the probability that the representation $10^{e_N} \cdot f_N$ of a random positive integer $N$ has $10 f_N < r$, assuming infinite precision.

To start, let us try to find the probability using a limiting method like the definition of Pr in Section 3.5. One nice way to rephrase that definition is to define

$$P_0(n) = \begin{cases} 1, & \text{if } n = 10^e \cdot f \text{ where } 10f < r, \text{ i.e., if } (\log_{10} n) \bmod 1 < \log_{10} r; \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$

Now $P_0(1)$, $P_0(2)$, $\ldots$ is an infinite sequence of zeros and ones, with ones to represent the cases that contribute to the probability we are seeking.
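To make definition (6) concrete, here is a minimal Python sketch of the indicator. The function name `p0` and the use of exact rational arithmetic are illustrative choices, not part of the text; the sketch assumes the normalization $1/10 \le f < 1$, so that $10f$ is the leading significand of $n$ and lies in $[1, 10)$.

```python
from fractions import Fraction


def p0(n: int, r) -> int:
    """Indicator P_0(n) from (6): 1 if 10*f < r, else 0.

    Writing n = 10**e * f with 1/10 <= f < 1 (an assumed normalization),
    the quantity 10*f equals n / 10**(d - 1), where d is the number of
    decimal digits of n.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    d = len(str(n))                           # number of decimal digits of n
    significand = Fraction(n, 10 ** (d - 1))  # exact value of 10*f, in [1, 10)
    return 1 if significand < r else 0


# First twenty terms of the 0/1 sequence for r = 2,
# i.e. the integers whose leading digit is 1.
sequence = [p0(n, 2) for n in range(1, 21)]
print(sequence)
```

With $r = 2$ the ones appear at $n = 1$ and at $n = 10, \ldots, 19$, so the ones arrive in runs whose lengths grow by a factor of ten from one decade to the next.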
We can try to average out this sequence, by defining

$$P_1(n) = \frac{1}{n} \sum_{k=1}^{n} P_0(k). \tag{7}$$

Thus if we generate a random integer between 1 and $n$ using the techniques of Chapter 3, and convert it to floating decimal form $(e, f)$, the probability that $10f < r$ is exactly $P_1(n)$. It is natural to let $\lim_{n\to\infty} P_1(n)$ be the probability $p(r)$ we are after, and that is just what we did in Section 3.5.

But in this case the limit does not exist: For example, let us consider the subsequence $P_1(s)$, $P_1(10s)$, $P_1(100s)$, $\ldots$, $P_1(10^n s)$, $\ldots$, where $s$ is a real number, $1 \le s \le 10$. If $s \le r$, we find that