1. Enumerating Combinations (6-from-54)

Well, thanks Matt! smile

I'm working with a combin(54,6), which produces 25,827,165 combinations...
What's the least amount of memory into which I can fit all these
combinations? I'm wondering if it would be smart to put all of them into a
file and then use an index to all the combinations.

Also, when generating random combinations using combin(54,6), what's the
quickest way to check for duplicates? I'm using simple find() or match() (i
forget at the moment)... But I was wondering, when I'm going to start having
to check thousands of combinations for duplicates, would it be simpler to
check, say, a length 3 sequence instead of length 6? For example, I might
have the following in the combos group:

{ 1, 2, 3, 4, 5, 6 }
{ 1, 2, 3, 4, 5, 7 }
{ 1, 2, 4, 5, 6, 7 }
{ 1, 3, 5, 7, 8, 9 }
{ 1, 3, 6, 8, 9, 10 }
...ad infinitum

Instead of checking the entire sequence against each other, what if i were
to check just the first three (and upon finding a duplicate, THEN check all
6)?

I wonder... Of course, it may not matter (in regards to the speed of
checking for duplicates) until I'm up into the hundreds of thousands of
combinations... just something for you guys to think about and comment on.

Thanks!
<\<

----- Original Message -----
From: Matthew Lewis <MatthewL at KAPCOUSA.COM>
To: <EUPHORIA at LISTSERV.MUOHIO.EDU>
Sent: Monday, September 11, 2000 10:09 AM
Subject: Re: enumerating all the combinations


> > From: ck lester
> >
> > Can anybody decode this into EUPHORIA?
> <snip>
>
> Erm...no thanks...


_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

new topic     » topic index » view message » categorize

2. Re: Enumerating Combinations (6-from-54)

> From: ck lester

> Well, thanks Matt! smile

Sure.

> I'm working with a combin(54,6), which produces 25,827,165
> combinations...
> What's the least amount of memory into which I can fit all these
> combinations? I'm wondering if it would be smart to put all
> of them into a
> file and then use an index to all the combinations.

You might look into using EDS.  I don't know that anyone's actually used it
for anything this big, but...

> Also, when generating random combinations using combin(54,6),
> what's the
> quickest way to check for duplicates? I'm using simple find()
> or match() (i
> forget at the moment)...

<snip>

> I wonder... Of course, it may not matter (in regards to the speed of
> checking for duplicates) until I'm up into the hundreds of
> thousands of
> combinations... just something for you guys to think about
> and comment on.

Based on the docs, find or match starts to get slow with sequences over 50
elements in length.  You could do some kind of a hashing function or binary
search, or you could let EDS do that work for you.

I'm not sure that you really need to check for duplicates, though.  When you
use my algorithm, you'll need to do things a little differently than I did,
since each of your choices comes from the same population (mine don't), and
I'm guessing you're doing this without replacement (looks like the lottery,
actually).

Basically, you'll need to dynamically recalculate the min sequence each time
you go to a different combination, and you won't have any problem with
generating duplicates.  Just make sure that the min value is one greater
than the min value for the preceding value.  Of course, the first value will
always be 1.

It should start as:
min = {1,2,3,4,5,6}

And actually, max should be:
max = {49,50,51,52,53,54}

That'll save you a lot of time, since you won't bother to generate all those
duplicates.  Be careful, though, because you'll probably still run out of
memory with something this big.

Now that I think about it, you might not want to do this at all.  What if
you just consider this a number base 53 (since there's no zero in the
lottery), and use something like the big-number arithmetic library in the
archives?  There wouldn't be an easy way to exclude all the dup's, but maybe
it doesn't matter that much (can't tell, since I don't know what exactly
you're doing).  Since the corresponding integer could be expressed in no
more than 30-bits (5-bits * 6 digits), it should fit into the Euphoria
integer range of 31-bits.  Then you could use rand() to get a combination.
Hmm...I guess the dup's would be a problem, and I'm not sure there's a quick
way to weed them out.  Oh, well, that was interesting (IMHO), anyway.

Matt Lewis

new topic     » goto parent     » topic index » view message » categorize

3. Re: Enumerating Combinations (6-from-54)

The attached file which I have called "lotto.exu" may be of interest... or
maybe not.  :)

All it does though is select 6 random unduplicated number from min - max
using Michael Nelson's recently posted generic random number function.  It
then runs again until a match is found.

I get dizzy thinking about a real lotto emulation - would be interesting
though.  My purpose was to illustrate to my children just how miniscule odds
are in winning a lotto.

I have been thinking about using this code and set_rand(x) to bench mark
insertion_ sort, shell, sort, and bucket sort.  Casual observation doesn't
reveal much difference.

Ken Rhodes




On Mon, 11 Sep 2000 10:57:39 -0700, Euphoria Programming for MS-DOS wrote:

>  > From: ck lester
>
>  > Well, thanks Matt! smile
>
>  Sure.
>
>  > I'm working with a combin(54,6), which produces 25,827,165
>  > combinations...
>  > What's the least amount of memory into which I can fit all these
>  > combinations? I'm wondering if it would be smart to put all
>  > of them into a
>  > file and then use an index to all the combinations.
>
>  You might look into using EDS.  I don't know that anyone's actually used
it
>  for anything this big, but...
>
>  > Also, when generating random combinations using combin(54,6),
>  > what's the
>  > quickest way to check for duplicates? I'm using simple find()
>  > or match() (i
>  > forget at the moment)...
>
>  <snip>
>
>  > I wonder... Of course, it may not matter (in regards to the speed of
>  > checking for duplicates) until I'm up into the hundreds of
>  > thousands of
>  > combinations... just something for you guys to think about
>  > and comment on.
>
>  Based on the docs, find or match starts to get slow with sequences over
50
>  elements in length.  You could do some kind of a hashing function or
binary
>  search, or you could let EDS do that work for you.
>
>  I'm not sure that you really need to check for duplicates, though.  When
you
>  use my algorithm, you'll need to do things a little differently than I
did,
>  since each of your choices comes from the same population (mine don't),
and
>  I'm guessing you're doing this without replacement (looks like the
lottery,
>  actually).
>
>  Basically, you'll need to dynamically recalculate the min sequence each
time
>  you go to a different combination, and you won't have any problem with
>  generating duplicates.  Just make sure that the min value is one greater
>  than the min value for the preceding value.  Of course, the first value
will
>  always be 1.
>
>  It should start as:
>  min = {1,2,3,4,5,6}
>
>  And actually, max should be:
>  max = {49,50,51,52,53,54}
>
>  That'll save you a lot of time, since you won't bother to generate all
those
>  duplicates.  Be careful, though, because you'll probably still run out of
>  memory with something this big.
>
>  Now that I think about it, you might not want to do this at all.  What if
>  you just consider this a number base 53 (since there's no zero in the
>  lottery), and use something like the big-number arithmetic library in the
>  archives?  There wouldn't be an easy way to exclude all the dup's, but
maybe
>  it doesn't matter that much (can't tell, since I don't know what exactly
>  you're doing).  Since the corresponding integer could be expressed in no
>  more than 30-bits (5-bits * 6 digits), it should fit into the Euphoria
>  integer range of 31-bits.  Then you could use rand() to get a
combination.
>  Hmm...I guess the dup's would be a problem, and I'm not sure there's a
quick
>  way to weed them out.  Oh, well, that was interesting (IMHO), anyway.
>
>  Matt Lewis





_______________________________________________________
Say Bye to Slow Internet!
http://www.home.com/xinbox/signup.html

new topic     » goto parent     » topic index » view message » categorize

4. Re: Enumerating Combinations (6-from-54)

Matt,

> You might look into using EDS.  I don't know that anyone's actually used
it
> for anything this big, but...

Yeah. What I'll do is run a program to populate a EUPHORIA database with
every possible combination... or a significant portion thereof (for test
purposes) and measure the performance...

> I'm not sure that you really need to check for duplicates, though.  When
you
> use my algorithm, you'll need to do things a little differently than I
did,
> since each of your choices comes from the same population (mine don't),
and
> I'm guessing you're doing this without replacement (looks like the
lottery,
> actually).

Yes, it's for the lottery. I need a program that will spew extra sets as
needed. We maintain a database of the picks we use in our club*, and every
so often, when we hit 4-of-6 or something, we'll need a couple hundred or
more picks. I want to generate these picks on the fly but not duplicate any
in the database.

I guess I'm just curious as to how far EUPHORIA can handle this kind of
number crunching, especially when the picks get up into the thousands or
millions or whatever. It's more an exercise than anything applicable (at
least for now).

So, you see, I can't have duplicates.

<\<

* I realize the sheer impossibilities of winning a lottery, so no flames
please**. We do this in our extra time and with our extra dough, hoping to
get lucky, but not betting on it. blink

** This is mainly because I know jiri will have something to say about the
"idiot tax." smile



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu