Re: Enumerating Combinations (6-from-54)
- Posted by Matthew Lewis <MatthewL at KAPCOUSA.COM> Sep 11, 2000
- 397 views
> From: ck lester > Well, thanks Matt! Sure. > I'm working with a combin(54,6), which produces 25,827,165 > combinations... > What's the least amount of memory into which I can fit all these > combinations? I'm wondering if it would be smart to put all > of them into a > file and then use an index to all the combinations. You might look into using EDS. I don't know that anyone's actually used it for anything this big, but... > Also, when generating random combinations using combin(54,6), > what's the > quickest way to check for duplicates? I'm using simple find() > or match() (i > forget at the moment)... <snip> > I wonder... Of course, it may not matter (in regards to the speed of > checking for duplicates) until I'm up into the hundreds of > thousands of > combinations... just something for you guys to think about > and comment on. Based on the docs, find or match starts to get slow with sequences over 50 elements in length. You could do some kind of a hashing function or binary search, or you could let EDS do that work for you. I'm not sure that you really need to check for duplicates, though. When you use my algorithm, you'll need to do things a little differently than I did, since each of your choices comes from the same population (mine don't), and I'm guessing you're doing this without replacement (looks like the lottery, actually). Basically, you'll need to dynamically recalculate the min sequence each time you go to a different combination, and you won't have any problem with generating duplicates. Just make sure that the min value is one greater than the min value for the preceding value. Of course, the first value will always be 1. It should start as: min = {1,2,3,4,5,6} And actually, max should be: max = {49,50,51,52,53,54} That'll save you a lot of time, since you won't bother to generate all those duplicates. Be careful, though, because you'll probably still run out of memory with something this big. Now that I think about it, you might not want to do this at all. What if you just consider this a number base 53 (since there's no zero in the lottery), and use something like the big-number arithmetic library in the archives? There wouldn't be an easy way to exclude all the dup's, but maybe it doesn't matter that much (can't tell, since I don't know what exactly you're doing). Since the corresponding integer could be expressed in no more than 30-bits (5-bits * 6 digits), it should fit into the Euphoria integer range of 31-bits. Then you could use rand() to get a combination. Hmm...I guess the dup's would be a problem, and I'm not sure there's a quick way to weed them out. Oh, well, that was interesting (IMHO), anyway. Matt Lewis