1. Random function 2
- Posted by aku at inbox.as Mar 11, 2001
- 471 views
How I get x items randomly from a sequence that has length y, where y > x, but no duplicate items in x? This function (untested): function getsome(object howmany, object y) object result, r result = {} for i=1 to howmany do r = rand(length(y)) result = append(result, y[r]) y = y[1..r-1] & y[r+1..length(y)] end for return result end function seems to be inefficient. Any suggestion? Thanks!
2. Re: Random function 2
- Posted by Ted Fines <fines at macalester.edu> Mar 11, 2001
- 438 views
It might be much more efficient to just store the sequence items randomly in the first place, which is always a fast operation. Then when you need a number n of them, you can just grab the first n from the sequence. Storing them randomly would be easy enough: function storeitem(sequence s, object o) -- this is not refined and needs to be modified to account for empty sequence, -- extreme values of r, etc. r=rand(length(s)) s=s[1..r] & o & s[r+1..length(s)] return s end function Thus, s[1..n] is a random sampling of n items from sequenece n. But if you need an ordered sequence also, just whenever you add an item to it, add a function call to the above function to add the item to your randomized sequence. When you delete from your ordered sequence, be sure to delete from your randomized one also. --Ted --On Sunday, March 11, 2001 12:59:33 AM -0800 aku at inbox.as wrote: > > How I get x items randomly from a sequence that has length y, > where y > x, but no duplicate items in x? > > > This function (untested): > > function getsome(object howmany, object y) > object result, r > result = {} > for i=1 to howmany do > r = rand(length(y)) > result = append(result, y[r]) > y = y[1..r-1] & y[r+1..length(y)] > end for > return result > end function > > seems to be inefficient. > > Any suggestion? > > Thanks! > > > >
3. Re: Random function 2
- Posted by Jiri Babor <jbabor at PARADISE.NET.NZ> Mar 12, 2001
- 426 views
Hi Aku, It's an interesting problem. Initially I tried to optimize your function, but could get only about 10% speed increase. Then it occurred to me I could use the random shuffle discussed few days ago on the input sequence and then spit out only the front slice of it: function getsome(integer n, sequence s) object temp integer j for i = length(s) to 2 by -1 do j = rand(i) temp = s[j] s[j] = s[i] s[i] = temp end for return s[1..n] end function This works well when the number of required items is a significant proportion of the total length. The more the better. But of course it is quite lousy if you need just a few. So eventually I went for a hybrid: function getsome(integer n, sequence s) sequence u integer l,r l = length(s) if 2*n > l then for i = 1 to l-n do r = rand(l) s[r..l-1] = s[r+1..l] l -= 1 end for return s[1..n] end if u = repeat(0, n) for i = 1 to n do r = rand(l) u[i] = s[r] s[r..l-1] = s[r+1..l] l -= 1 end for return u end function It seems to be 2 to 6 times faster than the original depending on the relative size of n. But to be honest, I am not terribly happy with it. I have a funny feeling there must be a more elegant, much faster solution lurking just around the corner. Unfortunately, I just have not got the time to chase it right now... jiri ----- Original Message ----- From: <aku at inbox.as> To: "EUforum" <EUforum at topica.com> Sent: Sunday, March 11, 2001 9:59 PM Subject: Random function 2 > > > How I get x items randomly from a sequence that has length y, > where y > x, but no duplicate items in x? > > > This function (untested): > > function getsome(object howmany, object y) > object result, r > result = {} > for i=1 to howmany do > r = rand(length(y)) > result = append(result, y[r]) > y = y[1..r-1] & y[r+1..length(y)] > end for > return result > end function > > seems to be inefficient. > > Any suggestion? > > Thanks! > > > > >
4. Re: Random function 2
- Posted by Kat <gertie at PELL.NET> Mar 12, 2001
- 446 views
On 12 Mar 2001, at 0:01, Jiri Babor wrote: > > > Hi Aku, > > It's an interesting problem. Initially I tried to optimize your > function, but could get only about 10% speed increase. Then it > occurred to me I could use the random shuffle discussed few days ago > on the input sequence and then spit out only the front slice of it: > > function getsome(integer n, sequence s) > object temp > integer j > > for i = length(s) to 2 by -1 do > j = rand(i) > temp = s[j] > s[j] = s[i] > s[i] = temp > end for > return s[1..n] > end function > > This works well when the number of required items is a significant > proportion of the total length. The more the better. But of course it > is quite lousy if you need just a few. So eventually I went for a > hybrid: > > function getsome(integer n, sequence s) > sequence u > integer l,r > > l = length(s) > if 2*n > l then > for i = 1 to l-n do > r = rand(l) > s[r..l-1] = s[r+1..l] > l -= 1 > end for > return s[1..n] > end if > > u = repeat(0, n) > for i = 1 to n do > r = rand(l) > u[i] = s[r] > s[r..l-1] = s[r+1..l] > l -= 1 > end for > return u > end function > > It seems to be 2 to 6 times faster than the original depending on the > relative size of n. But to be honest, I am not terribly happy with it. > I have a funny feeling there must be a more elegant, much faster > solution lurking just around the corner. Unfortunately, I just have > not got the time to chase it right now... This seems too simple to be workable, but i had to ask, what about this? function getsome(integer n, sequence s) object temp integer j for i = 1 to n do j = rand(i) temp = s[j] s[j] = s[i] s[i] = temp end for return s[1..n] end function Kat, curious as a.
5. Re: Random function 2
- Posted by Jiri Babor <jbabor at PARADISE.NET.NZ> Mar 12, 2001
- 421 views
Kat, You are a genius, and you don't even know about it! Your function does not work, because it merely churns elements in the 1..n range. But if we simply change its fifth line from j = rand(i) to j = rand(length(s)) we have a very efficient beast for ( 2*n <= length(s) ): function getsome(integer n, sequence s) object temp integer l,r l = length(s) for i = 1 to n do r = rand(l) temp = s[r] s[r] = s[i] s[i] = temp end for return s[1..n] end function For the other half of the range we can use the following trick: function getsome(integer n, sequence s) object temp integer l,r l = length(s) for i = n+1 to l do r = rand(l) temp = s[r] s[r] = s[i] s[i] = temp end for return s[1..n] end function And if we combine them, we get just what we always wanted: function getsome(integer n, sequence s) object temp integer l,r l = length(s) if 2*n > l then for i = n+1 to l do r = rand(l) temp = s[r] s[r] = s[i] s[i] = temp end for else for i = 1 to n do r = rand(l) temp = s[r] s[r] = s[i] s[i] = temp end for end if return s[1..n] end function Only when I compared the last function with about 6 other solutions I noticed that I had copied a wrong routine into my previous post. The second and the third functions are *wrong*! Please do not used them! jiri ----- Original Message ----- From: "Kat" <gertie at PELL.NET> To: "EUforum" <EUforum at topica.com> Sent: Monday, March 12, 2001 10:56 PM Subject: Re: Random function 2 > > > On 12 Mar 2001, at 0:01, Jiri Babor wrote: > > > > > > > Hi Aku, > > > > It's an interesting problem. Initially I tried to optimize your > > function, but could get only about 10% speed increase. Then it > > occurred to me I could use the random shuffle discussed few days ago > > on the input sequence and then spit out only the front slice of it: > > > > function getsome(integer n, sequence s) > > object temp > > integer j > > > > for i = length(s) to 2 by -1 do > > j = rand(i) > > temp = s[j] > > s[j] = s[i] > > s[i] = temp > > end for > > return s[1..n] > > end function > > > > This works well when the number of required items is a significant > > proportion of the total length. The more the better. But of course it > > is quite lousy if you need just a few. So eventually I went for a > > hybrid: > > > > function getsome(integer n, sequence s) > > sequence u > > integer l,r > > > > l = length(s) > > if 2*n > l then > > for i = 1 to l-n do > > r = rand(l) > > s[r..l-1] = s[r+1..l] > > l -= 1 > > end for > > return s[1..n] > > end if > > > > u = repeat(0, n) > > for i = 1 to n do > > r = rand(l) > > u[i] = s[r] > > s[r..l-1] = s[r+1..l] > > l -= 1 > > end for > > return u > > end function > > > > It seems to be 2 to 6 times faster than the original depending on the > > relative size of n. But to be honest, I am not terribly happy with it. > > I have a funny feeling there must be a more elegant, much faster > > solution lurking just around the corner. Unfortunately, I just have > > not got the time to chase it right now... > > > This seems too simple to be workable, but i had to ask, what about this? > > > function getsome(integer n, sequence s) > object temp > integer j > > for i = 1 to n do > j = rand(i) > temp = s[j] > s[j] = s[i] > s[i] = temp > end for > return s[1..n] > end function > > Kat, > curious as a. > > > >
6. Re: Random function 2
- Posted by aku at inbox.as Mar 13, 2001
- 472 views
------------171D3A5278D0659 Thanks to Jiri, Kat, Derek, Al who sent the getsome function. I tried to benchmark it (attached) and this is the result... getsome_kat 6.92 getsome_jiri 6.81 getsomeA 6.92 getsomeA_derek 6.92 tmp.ex:109 in function getsomeB() slice ends past end of sequence (50000 > 104) .. called from tmp.ex:171 --> see ex.err Warning: private variable result in getsomeC() is not used no much difference... getsomeB and getsomeC is failed... ------------171D3A5278D0659