Re: Enumerating Combinations (6-from-54)

new topic     » goto parent     » topic index » view thread      » older message » newer message

> From: ck lester

> Well, thanks Matt! smile

Sure.

> I'm working with a combin(54,6), which produces 25,827,165
> combinations...
> What's the least amount of memory into which I can fit all these
> combinations? I'm wondering if it would be smart to put all
> of them into a
> file and then use an index to all the combinations.

You might look into using EDS.  I don't know that anyone's actually used it
for anything this big, but...

> Also, when generating random combinations using combin(54,6),
> what's the
> quickest way to check for duplicates? I'm using simple find()
> or match() (i
> forget at the moment)...

<snip>

> I wonder... Of course, it may not matter (in regards to the speed of
> checking for duplicates) until I'm up into the hundreds of
> thousands of
> combinations... just something for you guys to think about
> and comment on.

Based on the docs, find or match starts to get slow with sequences over 50
elements in length.  You could do some kind of a hashing function or binary
search, or you could let EDS do that work for you.

I'm not sure that you really need to check for duplicates, though.  When you
use my algorithm, you'll need to do things a little differently than I did,
since each of your choices comes from the same population (mine don't), and
I'm guessing you're doing this without replacement (looks like the lottery,
actually).

Basically, you'll need to dynamically recalculate the min sequence each time
you go to a different combination, and you won't have any problem with
generating duplicates.  Just make sure that the min value is one greater
than the min value for the preceding value.  Of course, the first value will
always be 1.

It should start as:
min = {1,2,3,4,5,6}

And actually, max should be:
max = {49,50,51,52,53,54}

That'll save you a lot of time, since you won't bother to generate all those
duplicates.  Be careful, though, because you'll probably still run out of
memory with something this big.

Now that I think about it, you might not want to do this at all.  What if
you just consider this a number base 53 (since there's no zero in the
lottery), and use something like the big-number arithmetic library in the
archives?  There wouldn't be an easy way to exclude all the dup's, but maybe
it doesn't matter that much (can't tell, since I don't know what exactly
you're doing).  Since the corresponding integer could be expressed in no
more than 30-bits (5-bits * 6 digits), it should fit into the Euphoria
integer range of 31-bits.  Then you could use rand() to get a combination.
Hmm...I guess the dup's would be a problem, and I'm not sure there's a quick
way to weed them out.  Oh, well, that was interesting (IMHO), anyway.

Matt Lewis

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu