Primes & Factorization
- Posted by MAP <ddhinc at ALA.NET>
Mar 11, 1998
Ok... my inept attempt at funny stuff and subtle
sarcasms at Mr. Riley's expense aside now. Sorry,
Mr. Riley... I apparently came off as being way
too serious;) Also, let me recant some of my
assertions in my previous post. The truth is
factorization is something which could benefit
from optimization. I found several articles
on the web concerning it, which employ some fairly
complex methods. Short of those, it would seem
to me that a particularly good way to speed it up would
be with a (at least partial) list of prime numbers,
since the base factors of any number are prime.
How 'bout that, there's a chance that these buckets
of primes have a niche market after all.
If you're thinking such a list would be too expensive
to maintain memory-wise, that was my first
thought as well...
My first primes program dealt with the idea of all primes
being 1 above or below a multiple of 6, and my
inclination was to avoid the flag-system RDS employed.
When that turned out rather slow I went off in another
direction, an ill-fated attempt at a recursive algorithm.
As my displeasure over not being abled to make it work
grew, I began to reconsider the sieve. Then I happened
upon an insight, a way to store primes with pretty good
efficiency by using the 6n +/- 1 method and the flags.
I discarded the idea though, because at the time I was
only interested in speed, and I hadn't thought it would
be useful for anything else.
Here's what the memory optimization looks like:
Where 1=prime and 0=composite,
Bits
Byte [ 0 1 1 1 1 1 1 1]*
25 23 19 17 13 11 7 5
Byte [ 0 1 1 1 1 0 1 1]... etc.
49 47 43 41 37 35 31 29
*Of course, 2 and 3 are left out because they don't
fall into the 6n +/- 1 set, but we don't really need
them stored.
Compared to the size of using an atom or integer
(8 bytes or 4 bytes) for each prime, that's a pretty
decent savings. You can see though why I didn't employ
that method, the extra overhead in processing to
store them like that would have dragged the speed
back to several seconds at least. But in a
factorization application, if you were going to
factor a multitude of numbers, a few seconds of
extra calculation at startup doesn't seem so bad.
And with any perhaps forthcoming innovations in the
generators, even that could be improved.
You might argue that the cost of accessing them
in that form during factorization would be expensive
as well, killing any efficiency they bring to the
looping. Hopefully not... You can't compare it to
the time to find and store primes used by the sieve
because sieves run through the same list of numbers
hundreds, thousands *and more* times, hitting
many of the numbers repeatedly. With factorization,
you can always eliminate the worst case scenerio
first (it being a large prime) by testing that
directly. If it's composite, you only have to loop
as many times as there are factors. With each factor
divided out (provided the looping is linear from low
to high) you know the next factor will be no lower
than the last one. The high end of the loop will be
sqrt(n), and that decreases with every factor
divided out, making the looping tighter with each
progression.
I'm not ready to tackle such a program yet, I have
other projects requiring my attention. But I thought
I'd go ahead and post it to see what discussion, if
any, it might generate.
Christopher D. Hickman
|
Not Categorized, Please Help
|
|