New version of my General Usage Routines

new topic     » topic index » view thread      » older message » newer message

This is a multi-part message in MIME format.

------=_NextPart_000_0027_01C259D1.5110FD20
	charset="iso-8859-1"

Rob:
I am attaching a new version of the package. Apart from minor corrections
(missing 'global' word, comments), it contains an interesting non-repetitive
random number generator (named DealFast()) I developed years ago, to be
precise in 1971. It uses only the storage needed for output and is faster
than other non-repetitive random number generators, really much faster when
the range of numbers to select from is large and the quantity of numbers to
be generated is small. I don't remember how I hit on this algorithm, and can
only justify why the numbers don't repeat, but not why they have reasonably
good randomness. Perhaps someone in the list with higher mathematics
knowledge could help in this topic.
I mainly used the algorithm to treat the problem of synonyms when using
hashing to locate keys in a file, because it avoids the problem of
clustering when one goes to the next record in case of a synonym.
Regards.

Here is the new code:

global function DealFast(integer upto, integer howmany)
--Returns a sequence of 'howmany' random integers, without repetition,
-- taken from 1..upto.
--If hownany > upto, it starts repeating numbers cyclically.
--This algorithm (for which I have no good justification, but it works) is
-- faster than any other, except when 'howmany' and 'upto' are small,
-- due to its larger initialization time.
-- Probably it does not allow for the generation of all possible
permutations.
-- Up to now, this has not been proved or disproved.
-- Perhaps it does not satisfy all randomness tests.
--It also uses the minimum amount of storage.
--For an even faster execution, when 'upto' is fixed, you may previously
-- call subroutine calcpot3(), get hold of its result, and provide it as an
-- additional parameter instead of calling calcpot3() inside the main
routine.
--This algorithm can be easily modified to treat the problem of synonyms
-- while accessing a file through hashing routines. You can provide the
-- hashed key as an argument to the set_rand() function, getting the first
-- location to try. If the bin is full, you can get the next location to try
-- by iterating the formula u = remainder(x - u - u, pot3). This way,
-- you can scan the whole file without inspecting any location more than
once,
-- and without scanning successive locations, which tends to develop
clusters.
--A somewhat easier algorithm can be programmed using powers of 2 instead of
3.
-- This is enough for hashing routines, but the pseudo-random numbers are
not
-- so well behaved. Maybe someone can design an improved algorithm
-- using powers of 2, but anyhow the performance gain is not really
-- significative.
    integer u, x, pot3
    sequence r
    pot3 = calcpot3(upto)
    r = repeat(0, howmany)
    x = rand(floor(pot3 / 3) * 2) - 1
    x += 3 * pot3 - 2 + floor(x / 2)
    u = rand(pot3) - 1
    for i = 1 to howmany do
 while u >= upto do
     u = remainder(x - 2 * u, pot3)
 end while
 r[i] = u + 1
 u = remainder(x - 2 * u, pot3)
    end for
    return r
end function




------=_NextPart_000_0027_01C259D1.5110FD20
Content-Type: application/x-zip-compressed;
	name="genfunc.ZIP"

new topic     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu