1. Looking for the best "JOIN"

Im looking for the best JOIN method.

Heres what I have:   {"str1","str2","str3"}

Heres what I want:   "str1str2str3"

No slow ideas please>>>>

Euman
euman at bellsouth.net

new topic     » topic index » view message » categorize

2. Looking for the best "JOIN"

BTW, anyother way to do what I wanted other than this is what I would like to
see..

for i = 1 to length(ipath) do
      path &= ipath[i][1..length(ipath[i])]
end for

Euman
euman at bellsouth.net

new topic     » goto parent     » topic index » view message » categorize

3. Re: Looking for the best "JOIN"

----- Original Message ----- 
From: <euman at bellsouth.net>
To: "EUforum" <EUforum at topica.com>
Subject: Looking for the best "JOIN"


> 
> Im looking for the best JOIN method.
> 
> Heres what I have:   {"str1","str2","str3"}
> 
> Heres what I want:   "str1str2str3"
> 
> No slow ideas please>>>>

Okay, I'll bite.

---------------------
global function joinStrings(sequence s)
    integer l,a,b
    sequence t

    if length(s) = 0 then
        return s
    end if

    l = 0
    for i = 1 to length(s) do
        l += length(s[i])
    end for

    t = repeat(0, l)

    a = 1
    for i = 1 to length(s) do
        b = a + length(s[i]) - 1
        t[a..b] = s[i]
        a = b + 1
    end for

    return t
end function
-- Test code --
constant words = {"abc","defghi","jklmnopqr","","st","u","vwxyz"}
atom e
sequence k
integer x
x = 0
e = time() + 10
while e >= time() do
    x += 1
    k =joinStrings(words)
end while
printf(1, "%d interations in 10 seconds\n", x)

---------------------
Derek

new topic     » goto parent     » topic index » view message » categorize

4. Re: Looking for the best "JOIN"

Euman wrote:

> Im looking for the best JOIN method.
> 
> Heres what I have:   {"str1","str2","str3"}
> 
> Heres what I want:   "str1str2str3"

function join(sequence s) -- this is the way most people would do it
    sequence j
    if not length(s) then return s end if
    j = {}
    for i = 1 to length(s)
        j &= s[i]
    end for
    return j
end function

...or...:

function join2(sequence s) -- this shouldn't be any faster but who knows?
    sequence j
    integer mid, len
    len = length(s)
    if not len then return s
    elsif len = 1 then return s[1]
    elsif len = 2 then return s[1]&s[2]
    end if
    mid = floor(len/2)
    return join2(s[1..mid])&join2(s[mid+1..len])
end function
    
Carl

new topic     » goto parent     » topic index » view message » categorize

5. Re: Looking for the best "JOIN"


new topic     » goto parent     » topic index » view message » categorize

6. Re: Looking for the best "JOIN"

Derek,

simple concatenation seems to be marginally faster, I got 214,361 :
193,488.
Try it.

jiri


global function joinStrings(sequence s)
    integer l,a,b
    sequence t

    if length(s) = 0 then
        return s
    end if

    l = 0
    for i = 1 to length(s) do
        l += length(s[i])
    end for

    t = repeat(0, l)

    a = 1
    for i = 1 to length(s) do
        b = a + length(s[i]) - 1
        t[a..b] = s[i]
        a = b + 1
    end for

    return t
end function

function join(sequence s)
    sequence r

    r = {}
    for i = 1 to length(s) do
        r &= s[i]
    end for
    return r
end function

-- Test code --
constant words = {"abc","defghi","jklmnopqr","","st","u","vwxyz"}
puts(1, join(words) & "\n")
puts(1, joinStrings(words) & "\n")

atom e
sequence k
integer x

x = 0
e = time() + 10
while e >= time() do
    x += 1
    k =join(words)
end while
printf(1, "%d interations in 10 seconds\n", x)

x = 0
e = time() + 10
while e >= time() do
    x += 1
    k =joinStrings(words)
end while
printf(1, "%d interations in 10 seconds\n", x)

new topic     » goto parent     » topic index » view message » categorize

7. Re: Looking for the best "JOIN"

I found this too. I got about 285,000 vs 270,000. I suspect that Euphoria
has better garbage collection when using the '&' operator.

----- Original Message -----
From: "Jiri Babor" <jbabor at PARADISE.NET.NZ>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Looking for the best "JOIN"


>
> Derek,
>
> simple concatenation seems to be marginally faster, I got 214,361 :
> 193,488.
> Try it.
>
> jiri
>
>
> global function joinStrings(sequence s)
>     integer l,a,b
>     sequence t
>
>     if length(s) = 0 then
>         return s
>     end if
>
>     l = 0
>     for i = 1 to length(s) do
>         l += length(s[i])
>     end for
>
>     t = repeat(0, l)
>
>     a = 1
>     for i = 1 to length(s) do
>         b = a + length(s[i]) - 1
>         t[a..b] = s[i]
>         a = b + 1
>     end for
>
>     return t
> end function
>
> function join(sequence s)
>     sequence r
>
>     r = {}
>     for i = 1 to length(s) do
>         r &= s[i]
>     end for
>     return r
> end function
>
> -- Test code --
> constant words = {"abc","defghi","jklmnopqr","","st","u","vwxyz"}
> puts(1, join(words) & "\n")
> puts(1, joinStrings(words) & "\n")
>
> atom e
> sequence k
> integer x
>
> x = 0
> e = time() + 10
> while e >= time() do
>     x += 1
>     k =join(words)
> end while
> printf(1, "%d interations in 10 seconds\n", x)
>
> x = 0
> e = time() + 10
> while e >= time() do
>     x += 1
>     k =joinStrings(words)
> end while
> printf(1, "%d interations in 10 seconds\n", x)
>
>
>

new topic     » goto parent     » topic index » view message » categorize

8. Re: Looking for the best "JOIN"

Unsaid in Euman request, but assumed by others is the desire to find a
method of joining an abitary number of strings. The exact number would not
be known until run-time.

If we were to take Euman's request a face value the answer might be ...


  function join()
     return "str1str2str3"
  end function

because that is literally what he asked for, 'Heres what I want:
"str1str2str3"'.

----- Original Message -----
From: "Andy Drummond" <kestrelandy at xalt.co.uk>
To: "EUforum" <EUforum at topica.com>
Sent: Monday, October 22, 2001 10:54 PM
Subject: RE: Looking for the best "JOIN"


>
> Have I missed something?
> Surely this is built in to Euphoria as a fundamental?
>
> sequence x
>
> x = "abc"&"def"&"ghijk"
>
>  ... and d then = "abcdefghijk"
>
> I just tried it and it does ... so?
>
> Andy
>
>
> euman at bellsouth.net wrote:
> > Im looking for the best JOIN method.
> >
> > Heres what I have:   {"str1","str2","str3"}
> >
> > Heres what I want:   "str1str2str3"
> >
> > No slow ideas please>>>>
> >
> > Euman
> > euman at bellsouth.net
> >
> >
>
>

new topic     » goto parent     » topic index » view message » categorize

9. Re: Looking for the best "JOIN"

Hi Andy,
Sorry about my offhanded comment before.

Your function, I believe, is the most obvious. However, I've learnt that the
most obvious algorithm is not always the fastest one. Here is the reasoning
behind my first effort...

The obvious algorithm, I guessed, does this sort of thing internally...

> function join (sequence strings)
>    sequence result
>    result = ""
>    for i=1 to length(strings)
        -- Allocate some RAM big enough to hold the current
        -- length of result plus the length of strings[i]
        -- Copy the values in result to the new area
        -- Copy the values in strings[i] to the new area
        -- Return the old area (aka old results) to the system
>       result = result & strings[i]
>    end for
>    return result
> end function

Thus for each element in strings it would do 1 Allocate, 2 Copys, and one
deallocate. Also, the copys would be increasing in size for each iteration.
Meaning that some values would be copied multiple times.

I wrote my function thinking that if I just did the one allocation,
regardless of how many elements in 'strings' then only copy each value once
it would be faster. This would mean I'd have to do a loop through 'strings'
to calc the total length, then loop through again to copy each of the
values.

However, it turns out that the overheads in running two loops through the
elements in 'strings' is more than the allocate/copy/copy/deallocate cycle.
(I guess).
----
Derek.

new topic     » goto parent     » topic index » view message » categorize

10. Re: Looking for the best "JOIN"

Hi Euman,

----------
> Im looking for the best JOIN method.
> 
> Heres what I have:   {"str1","str2","str3"}
> 
> Heres what I want:   "str1str2str3"
> 
> No slow ideas please>>>>

Your question has analogy with my question
from the Threaded Euphoria thread, remember ?

{java & browser & windows & applet} system

Just analogy which seems to be amusing,
and it seems to be similar with
string execution from my favourite
Basic 02 of Soviet Iskra 226 machine.

Just analogies, which may be useful
for the list authors.

Good luck with this.

Regards,
Igor Kachan
kinz at peterlink.ru

new topic     » goto parent     » topic index » view message » categorize

11. Re: Looking for the best "JOIN"

I really never intended for this to be a war to see who's routine was the best
but it sure has been fun sitting back watching you all duke it out...hehe

Now that we have the fastest method, what the hek do we do with it? just
picking!

Thanks JB, DP, CB, RF, IK, AD, RS, CW and ROB....

Euman
euman at bellsouth.net

new topic     » goto parent     » topic index » view message » categorize

12. Re: Looking for the best "JOIN"

That's funny,

I did'nt know that I found Jiri's fast 'join()' in advance!

Have a nice day, Rolf

new topic     » goto parent     » topic index » view message » categorize

13. Re: Looking for the best "JOIN"

You can have it, Rolf, since  you appear so desperate.
Frame it!

I only wish you lot stopped using my name in vain...

jiri

----- Original Message ----- 
From: <rolf.schroeder at desy.de>
To: "EUforum" <EUforum at topica.com>
Subject: Re: Looking for the best "JOIN"


> 
> That's funny,
> 
> I did'nt know that I found Jiri's fast 'join()' in advance!
> 
> Have a nice day, Rolf
> 
> 
>

new topic     » goto parent     » topic index » view message » categorize

14. Re: Looking for the best "JOIN"

Jiri Babor wrote:
 
> You can have it, Rolf, since  you appear so desperate.
> Frame it!
> 
> I only wish you lot stopped using my name in vain...
> 
> jiri

Hi Jiri,

I'm waiting vor ALL responses to my mail just to 'join' them with the
quickest routine . . .

Have a nice day, Rolf

new topic     » goto parent     » topic index » view message » categorize

15. Re: Looking for the best "JOIN"


new topic     » goto parent     » topic index » view message » categorize

16. Re: Looking for the best "JOIN"

Yes, my routine takes into account an empty input sequence, not explicitly,
but it does. Try it. In this case, it is a bit slower, but I think it
doesn't matter... blink
----- Original Message -----
From: <bensler at mail.com>
To: "EUforum" <EUforum at topica.com>
Subject: RE: Looking for the best "JOIN"


>
> First of all,
>
>    Igor, the entire reason I attempted to improve on the JOIN algorithm,
was because I thought the same thing as you :), that it isn't necessary to
have a return sequence, when you can utilize the first element of the input
sequence. But this proves to be MUCH slower because EU has to shuffle the
data in the sequence too much(this is my assumption anyways).
>
>    Rforno, your version doesn't take into account empty sequences for
input.
>
> Now, I made a NEW bench test, which should prove to be much more accurate.
It's attached.
>
> Here are my results.
> (For EXE.EXE only)
>
> 10000 iterations of operations on 10000 random sequences using 4 different
versions of join()
> (Total of 4000000 iterations)
>
> my version : 139180 iterations per second
> Derek's    : 142229
> Jiri's     : 133288
> Igor's     :  71904
>
> The results are fairly close in this test but I found that on the whole,
Derek's routine was much more consistent, and always greater than the
others.
>
> Using not quite as extreme of a test, the results are similar from a DOS
box. I didn't test pure DOS.
>
> Also, I was able to improve Derek's routine by about 8%.
> He originally had it to test for an empty input sequence at the start of
the routine.
> This means that his routine requires a conditional jump every time the
input sequence is NOT empty.
> I just reversed it, so it only branches when the input sequence IS empty.
>
> There ya go Euman :)
> Derek's routine is fastest(with minor modification)
>
> function joinStrings(sequence s)
>    integer l,a,b
>    sequence t
>
>    if length(s) then
>       l=0
>       for i = 1 to length(s) do
>          l+=length(s[i])
>       end for
>
>       t =repeat(0,l)
>       a=1
>       for i = 1 to length(s) do
>          b = a + length(s[i]) - 1
>          t[a..b] = s[i]
>          a = b+1
>       end for
>
>       return t
>    end if
>    return s
> end function
>
> Chris
> --
>
>
>

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu