1. Random function 2

How I get x items randomly from a sequence that has length y,
 where y > x, but no duplicate items in x?


 This function (untested):

 function getsome(object howmany, object y)
 object result, r
     result = {}
     for i=1 to howmany do
         r = rand(length(y))
         result = append(result, y[r])
         y = y[1..r-1] & y[r+1..length(y)]
     end for
     return result
 end function

 seems to be inefficient.

 Any suggestion?

 Thanks!

new topic     » topic index » view message » categorize

2. Re: Random function 2

It might be much more efficient to just store the sequence items randomly 
in the first place, which is always a fast operation.  Then when you need a 
number n of them, you can just grab the first n from the sequence.  Storing 
them randomly would be easy enough:

function storeitem(sequence s, object o)
-- this is not refined and needs to be modified to account for empty 
sequence,
-- extreme values of r, etc.
r=rand(length(s))
s=s[1..r] & o & s[r+1..length(s)]
return s
end function

Thus, s[1..n] is a random sampling of n items from sequenece n.

But if you need an ordered sequence also, just whenever you add an item to 
it, add a function call to the above function to add the item to your 
randomized sequence.  When you  delete from your ordered sequence, be sure 
to delete from your randomized one also.

--Ted


--On Sunday, March 11, 2001 12:59:33 AM -0800 aku at inbox.as wrote:

>
>  How I get x items randomly from a sequence that has length y,
>  where y > x, but no duplicate items in x?
>
>
>  This function (untested):
>
>  function getsome(object howmany, object y)
>  object result, r
>      result = {}
>      for i=1 to howmany do
>          r = rand(length(y))
>          result = append(result, y[r])
>          y = y[1..r-1] & y[r+1..length(y)]
>      end for
>      return result
>  end function
>
>  seems to be inefficient.
>
>  Any suggestion?
>
>  Thanks!
>
>
>
>

new topic     » goto parent     » topic index » view message » categorize

3. Re: Random function 2

Hi Aku,

It's an interesting problem. Initially I tried to optimize your
function, but could get only about 10% speed increase. Then it
occurred to me I could use the random shuffle discussed few days ago
on the input sequence and then spit out only the front slice of it:

    function getsome(integer n, sequence s)
        object temp
        integer j

        for i = length(s) to 2 by -1 do
            j = rand(i)
            temp = s[j]
            s[j] = s[i]
            s[i] = temp
        end for
        return s[1..n]
    end function

This works well when the number of required items is a significant
proportion of the total length. The more the better. But of course it
is quite lousy if you need just a few. So eventually I went for a
hybrid:

    function getsome(integer n, sequence s)
        sequence u
        integer l,r

        l = length(s)
        if 2*n > l then
            for i = 1 to l-n do
                r = rand(l)
                s[r..l-1] = s[r+1..l]
                l -= 1
            end for
            return s[1..n]
        end if

        u = repeat(0, n)
        for i = 1 to n do
            r = rand(l)
            u[i] =  s[r]
            s[r..l-1] = s[r+1..l]
            l -= 1
        end for
        return u
    end function

It seems to be 2 to 6 times faster than the original depending on the
relative size of n. But to be honest, I am not terribly happy with it.
I have a funny feeling there must be a more elegant, much faster
solution lurking just around the corner. Unfortunately, I just have
not got the time to chase it right now...

jiri


----- Original Message -----
From: <aku at inbox.as>
To: "EUforum" <EUforum at topica.com>
Sent: Sunday, March 11, 2001 9:59 PM
Subject: Random function 2


>
>
>  How I get x items randomly from a sequence that has length y,
>  where y > x, but no duplicate items in x?
>
>
>  This function (untested):
>
>  function getsome(object howmany, object y)
>  object result, r
>      result = {}
>      for i=1 to howmany do
>          r = rand(length(y))
>          result = append(result, y[r])
>          y = y[1..r-1] & y[r+1..length(y)]
>      end for
>      return result
>  end function
>
>  seems to be inefficient.
>
>  Any suggestion?
>
>  Thanks!
>
>
>
>
>

new topic     » goto parent     » topic index » view message » categorize

4. Re: Random function 2

On 12 Mar 2001, at 0:01, Jiri Babor wrote:

> 
> 
> Hi Aku,
> 
> It's an interesting problem. Initially I tried to optimize your
> function, but could get only about 10% speed increase. Then it
> occurred to me I could use the random shuffle discussed few days ago
> on the input sequence and then spit out only the front slice of it:
> 
>     function getsome(integer n, sequence s)
>         object temp
>         integer j
> 
>         for i = length(s) to 2 by -1 do
>             j = rand(i)
>             temp = s[j]
>             s[j] = s[i]
>             s[i] = temp
>         end for
>         return s[1..n]
>     end function
> 
> This works well when the number of required items is a significant
> proportion of the total length. The more the better. But of course it
> is quite lousy if you need just a few. So eventually I went for a
> hybrid:
> 
>     function getsome(integer n, sequence s)
>         sequence u
>         integer l,r
> 
>         l = length(s)
>         if 2*n > l then
>             for i = 1 to l-n do
>                 r = rand(l)
>                 s[r..l-1] = s[r+1..l]
>                 l -= 1
>             end for
>             return s[1..n]
>         end if
> 
>         u = repeat(0, n)
>         for i = 1 to n do
>             r = rand(l)
>             u[i] =  s[r]
>             s[r..l-1] = s[r+1..l]
>             l -= 1
>         end for
>         return u
>     end function
> 
> It seems to be 2 to 6 times faster than the original depending on the
> relative size of n. But to be honest, I am not terribly happy with it.
> I have a funny feeling there must be a more elegant, much faster
> solution lurking just around the corner. Unfortunately, I just have
> not got the time to chase it right now...


This seems too simple to be workable, but i had to ask, what about this?


    function getsome(integer n, sequence s)
        object temp
        integer j

        for i = 1 to n do
            j = rand(i)
            temp = s[j]
            s[j] = s[i]
            s[i] = temp
        end for
        return s[1..n]
    end function

Kat,
curious as a.

new topic     » goto parent     » topic index » view message » categorize

5. Re: Random function 2

Kat,

You are a genius, and you don't even know about it! Your function does
not work, because it merely churns elements in the 1..n range. But if
we simply change its fifth line from

    j = rand(i)

to

    j = rand(length(s))

we have a very efficient beast for ( 2*n <= length(s) ):

    function getsome(integer n, sequence s)
        object temp
        integer l,r

        l = length(s)
        for i = 1 to n do
            r = rand(l)
            temp = s[r]
            s[r] = s[i]
            s[i] = temp
        end for
        return s[1..n]
    end function


For the other half of the range we can use the following trick:

    function getsome(integer n, sequence s)
        object temp
        integer l,r

        l = length(s)
        for i = n+1 to l do
            r = rand(l)
            temp = s[r]
            s[r] = s[i]
            s[i] = temp
        end for
        return s[1..n]
    end function

And if we combine them, we get just what we always wanted:

    function getsome(integer n, sequence s)
        object temp
        integer l,r

        l = length(s)
        if 2*n > l then
            for i = n+1 to l do
                r = rand(l)
                temp = s[r]
                s[r] = s[i]
                s[i] = temp
            end for
        else
            for i = 1 to n do
                r = rand(l)
                temp = s[r]
                s[r] = s[i]
                s[i] = temp
            end for
        end if
        return s[1..n]
    end function

Only when I compared the last function with about 6 other solutions I
noticed that I had copied a wrong routine into my previous post. The
second and the third functions are *wrong*! Please do not used them!

jiri

----- Original Message -----
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Sent: Monday, March 12, 2001 10:56 PM
Subject: Re: Random function 2


>
>
> On 12 Mar 2001, at 0:01, Jiri Babor wrote:
>
> >
> >
> > Hi Aku,
> >
> > It's an interesting problem. Initially I tried to optimize your
> > function, but could get only about 10% speed increase. Then it
> > occurred to me I could use the random shuffle discussed few days
ago
> > on the input sequence and then spit out only the front slice of
it:
> >
> >     function getsome(integer n, sequence s)
> >         object temp
> >         integer j
> >
> >         for i = length(s) to 2 by -1 do
> >             j = rand(i)
> >             temp = s[j]
> >             s[j] = s[i]
> >             s[i] = temp
> >         end for
> >         return s[1..n]
> >     end function
> >
> > This works well when the number of required items is a significant
> > proportion of the total length. The more the better. But of course
it
> > is quite lousy if you need just a few. So eventually I went for a
> > hybrid:
> >
> >     function getsome(integer n, sequence s)
> >         sequence u
> >         integer l,r
> >
> >         l = length(s)
> >         if 2*n > l then
> >             for i = 1 to l-n do
> >                 r = rand(l)
> >                 s[r..l-1] = s[r+1..l]
> >                 l -= 1
> >             end for
> >             return s[1..n]
> >         end if
> >
> >         u = repeat(0, n)
> >         for i = 1 to n do
> >             r = rand(l)
> >             u[i] =  s[r]
> >             s[r..l-1] = s[r+1..l]
> >             l -= 1
> >         end for
> >         return u
> >     end function
> >
> > It seems to be 2 to 6 times faster than the original depending on
the
> > relative size of n. But to be honest, I am not terribly happy with
it.
> > I have a funny feeling there must be a more elegant, much faster
> > solution lurking just around the corner. Unfortunately, I just
have
> > not got the time to chase it right now...
>
>
> This seems too simple to be workable, but i had to ask, what about
this?
>
>
>     function getsome(integer n, sequence s)
>         object temp
>         integer j
>
>         for i = 1 to n do
>             j = rand(i)
>             temp = s[j]
>             s[j] = s[i]
>             s[i] = temp
>         end for
>         return s[1..n]
>     end function
>
> Kat,
> curious as a.
>
>
>
>

new topic     » goto parent     » topic index » view message » categorize

6. Re: Random function 2

------------171D3A5278D0659

Thanks to Jiri, Kat, Derek, Al who sent the getsome function.

I tried to benchmark it (attached) and this is the result...

getsome_kat 6.92
getsome_jiri 6.81
getsomeA 6.92
getsomeA_derek 6.92

tmp.ex:109 in function getsomeB()
slice ends past end of sequence (50000 > 104)
.. called from tmp.ex:171
--> see ex.err
Warning: private variable result in getsomeC() is not used


no much difference...

getsomeB and getsomeC is failed...

------------171D3A5278D0659

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu