3.3.2 EMPIRICAL TESTS

Suppose we have $m$ urns and we throw $n$ balls at random into those urns, where $m$ is much greater than $n$. Most of the balls will land in urns that were previously empty, but if a ball falls into an urn that already contains at least one ball we say that a collision has occurred. The collision test counts the number of collisions, and a generator passes this test if it doesn't induce too many or too few collisions.

To fix the ideas, suppose $m = 2^{20}$ and $n = 2^{14}$. Then each urn will receive only one 64th of a ball, on the average. The probability that a given urn will contain exactly $k$ balls is $p_k = \binom{n}{k} m^{-k} (1 - m^{-1})^{n-k}$, so the expected number of collisions per urn is

$$\sum_{k \ge 1} (k-1)\,p_k = \sum_{k \ge 0} k\,p_k - \sum_{k \ge 1} p_k = n/m - 1 + p_0.$$

Since $p_0 = (1 - m^{-1})^n = 1 - n/m + \binom{n}{2} m^{-2} +$ smaller terms, we find that the average total number of collisions taken over all $m$ urns is very slightly less than $n^2/(2m) = 128$.

We can use the collision test to rate a random number generator in a large number of dimensions. For example, when $m = 2^{20}$ and $n = 2^{14}$ we can test the 20-dimensional randomness of a number generator by letting $d = 2$ and forming 20-dimensional vectors $V_j = (Y_{20j}, Y_{20j+1}, \ldots, Y_{20j+19})$ for $0 \le j < n$. It suffices to keep a table of $m = 2^{20}$ bits to determine collisions, one bit for each possible value of the vector $V_j$; on a computer with 32 bits per word, this amounts to $2^{15}$ words. Initially all $2^{20}$ bits of this table are cleared to zero; then for each $V_j$, if the corresponding bit is already 1 we record a collision, otherwise we set the bit to 1. This test can also be used in 10 dimensions with $d = 4$, and so on.

To decide if the test is passed, we can use the following table of percentage points when $m = 2^{20}$ and $n = 2^{14}$:

    collisions <=       101   108   119   126   134   145   153
    with probability   .009  .043  .244  .476  .742  .946  .989

The theory underlying these probabilities is the same we used in the poker test, Eq. (5); the probability that $c$ collisions occur is the probability that $n - c$ urns are occupied, namely

$$\frac{m(m-1)\cdots(m-n+c+1)}{m^n} {n \brace n-c}.$$

Although $m$ and $n$ are very large, it is not difficult to compute these probabilities using the following method:

Algorithm S (Percentage points for collision test). Given $m$ and $n$, this algorithm determines the distribution of the number of collisions that occur when $n$ balls are scattered into $m$ urns. An auxiliary array $A[0], A[1], \ldots, A[n]$ of floating point numbers is used for the computation; actually $A[j]$ will be nonzero only for $j_0 \le j \le j_1$, and $j_1 - j_0$ will be at most of order $\log n$, so it would be possible to get by with considerably less storage.
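The bit-table procedure and the expected collision count described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names (`vector_index`, `count_collisions`, `expected_collisions`, `collision_distribution`) are hypothetical, Python's `random.Random` stands in for whatever generator is being tested, and `collision_distribution` uses a plain occupancy recurrence that is practical only for small $m$ and $n$. It is not Algorithm S.

```python
import math
import random


def vector_index(digits, d):
    """Pack a vector of base-d digits into one integer (its urn number)."""
    index = 0
    for y in digits:
        index = index * d + y
    return index


def count_collisions(urn_indices, m):
    """Count collisions using a table of m bits, one bit per urn."""
    table = bytearray((m + 7) // 8)      # all bits start cleared to zero
    collisions = 0
    for v in urn_indices:
        byte, mask = v >> 3, 1 << (v & 7)
        if table[byte] & mask:
            collisions += 1              # bit already 1: record a collision
        else:
            table[byte] |= mask          # otherwise set the bit to 1
    return collisions


def expected_collisions(m, n):
    """Total expected collisions: m * (n/m - 1 + p0), p0 = (1 - 1/m)**n."""
    p0 = math.exp(n * math.log1p(-1.0 / m))
    return n - m + m * p0


def collision_distribution(m, n):
    """Exact P(c collisions), c = 0..n, by a direct recurrence (small m, n).

    Assumption: this O(n^2) recurrence replaces the closed formula and is
    not the book's Algorithm S.
    """
    occ = [0.0] * (n + 1)                # occ[k] = P(k urns occupied so far)
    occ[0] = 1.0
    for _ in range(n):
        new = [0.0] * (n + 1)
        for k, p in enumerate(occ):
            if p:
                new[k] += p * k / m                  # ball lands in an occupied urn
                if k < n:
                    new[k + 1] += p * (m - k) / m    # ball lands in an empty urn
        occ = new
    # c collisions occur exactly when n - c urns are occupied
    return [occ[n - c] for c in range(n + 1)]


if __name__ == "__main__":
    m, n, d, dims = 2**20, 2**14, 2, 20
    rng = random.Random(12345)           # stand-in generator under test
    vectors = (
        vector_index([rng.randrange(d) for _ in range(dims)], d)
        for _ in range(n)
    )
    print("observed collisions:", count_collisions(vectors, m))
    print("expected collisions:", expected_collisions(m, n))
```

The observed count can then be compared against the percentage points tabulated above; a count far outside the range 101 to 153 would be suspicious for these values of $m$ and $n$.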