1. Fast appending and sorting of alot of short strings
Can somone give me some good theory to get this to go fast?
there is a sequence and it gets searched alot, and also appended to.
I thought that using a hash_find for search was good, but then it has
to be sorted.
So I use hash_append to find where string goes, and split the sequence
in insert the sting inside. The search must return an integer (0 not
found or subscript number). I would like to keep this in euphoria, but
I'm always open to an assembly idea.
Daniel
<code>
global function append_sort(sequence seq, sequence query)
--AKA hash_append
integer low,high,current,len,comp
len=length(seq)
if length(seq)=0 then return {query} end if
low=0
high=len-1
while 1 do
current=floor((high+low)/2)
comp=compare(query,seq[current+1])
if comp=0 then return seq end if
if high=low then
if comp<0 then return seq[1..current]&{query}&seq[current+1..len]
else return seq[1..current+1]&{query}&seq[current+2..len] end if
end if
if comp<0 then high=current-1
else low=current+1 end if
if high<low then
if comp<0 then return seq[1..current]&{query}&seq[current+1..len]
else return seq[1..current+1]&{query}&seq[current+2..len] end if
end if
end while
end function
global function findseqhashzero(sequence query, sequence seq)
--AKA hash_find
integer low,high,current,len,comp
len=length(seq)
if length(seq)=0 then return 0 end if
low=0
high=len-1
while 1 do
current=floor((high+low)/2)
comp=compare(query,seq[current+1])
if comp=0 then return current+1 end if
if high=low then return 0 end if
if comp<0 then high=current-1
else low=current+1 end if
if high<low then return 0 end if
end while
end function
</code>
2. Re: Fast appending and sorting of alot of short strings
On Wed, 6 Oct 2004 07:30:31 -0700, codepilot Gmail Account
<codepilot at gmail.com> wrote:
>Can somone give me some good theory to get this to go fast?
I was going to experiment with this a little, but got sidetracked & it
does not seem like I will get round to it. Still, I'll comment:
>there is a sequence and it gets searched alot, and also appended to.
>I thought that using a hash_find for search was good, but then it has
>to be sorted.
The point of using a hash is that you do not sort the main table.
Have a play with jiri's hash tables in the archives.
If I assume you try that and don't like it:
1) If you have a big table that gets added to alot, then it is far
better to use the following scheme:
sequence table table=repeat(0,64)
integer tabused tabused=0
...
if tabused=length(table) then
table&=repeat(0,64)
end if
tabused+=1
table[tabused]=<blah> -- see next
2) In order to keep the table sorted, you are currently using:
seq[1..current]&{query}&seq[current+1..len]
Instead try:
table[current+1..tabused]=table[current..tabused-1]
table[current]=query
This is much faster (I think, as I said untested) because instead of
creating two large slices of table, stitching them all together into a
big result, then discarding the original table, only the second big
slice of the table is created (and discarded), the original table
remains (and obviously the first half remains, completely untouched).
A similar argument applies to deleting entries.
3) While the overhead of passing a large table to a function is
negligible, modifying/returning, and replacing it adds a huge
overhead.
If possible, code a separate routine to modify each (large) table
directly, rather than passing/returning it. You can use a single
search function, provided it does not modify/return the whole table.
HTH,
Pete
3. Re: Fast appending and sorting of alot of short strings
Right now after some testing I think that binary search trees would be
the optimal solution in euphoria.
to find just compare if equal return, less goto leg1, greater leg2 repeat
easy
to append, just find and add to empty leg, very very fast and easy.
the only concern is keeping it balanced and thats pretty easy.
But I've decided to break up the one big array of strings into lots of
special cases which is faster in my case.
On Sat, 09 Oct 2004 02:35:45 +0100, Pete Lomax
<petelomax at blueyonder.co.uk> wrote:
>
> On Wed, 6 Oct 2004 07:30:31 -0700, codepilot Gmail Account
> <codepilot at gmail.com> wrote:
>
> >Can somone give me some good theory to get this to go fast?
> I was going to experiment with this a little, but got sidetracked & it
> does not seem like I will get round to it. Still, I'll comment:
> >there is a sequence and it gets searched alot, and also appended to.
> >I thought that using a hash_find for search was good, but then it has
> >to be sorted.
> The point of using a hash is that you do not sort the main table.
> Have a play with jiri's hash tables in the archives.
>
> If I assume you try that and don't like it:
>
> 1) If you have a big table that gets added to alot, then it is far
> better to use the following scheme:
>
> sequence table table=repeat(0,64)
> integer tabused tabused=0
> ...
> if tabused=length(table) then
> table&=repeat(0,64)
> end if
> tabused+=1
> table[tabused]=<blah> -- see next
>
> 2) In order to keep the table sorted, you are currently using:
>
> seq[1..current]&{query}&seq[current+1..len]
>
> Instead try:
>
> table[current+1..tabused]=table[current..tabused-1]
> table[current]=query
>
> This is much faster (I think, as I said untested) because instead of
> creating two large slices of table, stitching them all together into a
> big result, then discarding the original table, only the second big
> slice of the table is created (and discarded), the original table
> remains (and obviously the first half remains, completely untouched).
> A similar argument applies to deleting entries.
>
> 3) While the overhead of passing a large table to a function is
> negligible, modifying/returning, and replacing it adds a huge
> overhead.
> If possible, code a separate routine to modify each (large) table
> directly, rather than passing/returning it. You can use a single
> search function, provided it does not modify/return the whole table.
>
> HTH,
> Pete
>
>
>
>
>
4. Re: Fast appending and sorting of alot of short strings
On 9 Oct 2004, at 2:35, Pete Lomax wrote:
>
>
> On Wed, 6 Oct 2004 07:30:31 -0700, codepilot Gmail Account
> <codepilot at gmail.com> wrote:
>
> >Can somone give me some good theory to get this to go fast?
> I was going to experiment with this a little, but got sidetracked & it
> does not seem like I will get round to it. Still, I'll comment:
> >there is a sequence and it gets searched alot, and also appended to.
> >I thought that using a hash_find for search was good, but then it has
> >to be sorted.
> The point of using a hash is that you do not sort the main table.
> Have a play with jiri's hash tables in the archives.
>
> If I assume you try that and don't like it:
>
> 1) If you have a big table that gets added to alot, then it is far
> better to use the following scheme:
>
> sequence table table=repeat(0,64)
> integer tabused tabused=0
> ...
> if tabused=length(table) then
> table&=repeat(0,64)
> end if
> tabused+=1
> table[tabused]=<blah> -- see next
>
> 2) In order to keep the table sorted, you are currently using:
>
> seq[1..current]&{query}&seq[current+1..len]
>
> Instead try:
>
> table[current+1..tabused]=table[current..tabused-1]
But you are dropping the last t abused element of t able, aren't you? Or is it
spare space preappended to table?
> table[current]=query
Kat
5. Re: Fast appending and sorting of alot of short strings
On Fri, 8 Oct 2004 22:56:04 -0500, Kat <gertie at visionsix.com> wrote:
>> table[current+1..tabused]=table[current..tabused-1]
>
>But you are dropping the last tabused element of table, aren't you?
Perhaps. I should have coded tabused+=1 immediately before the above
statement, to make it clearer. It also depends on how you calculate
"current", eg inserting 4 into {1,3,5,7}, is current 2 or 3? If a
binary search fails, I'm not sure which it would return. Perhaps it
would be wise to test and add 1 just in case. To be honest, I've only
used this technique to remove chunks of a table, and usually shifting
relatively small blocks about near the end of the table, which may
have exaggerated the benefits I experienced (vs. "&").
> Or is it spare space preappended to table?
Yes, the intention is that table[tabused+1..length(tabused)] is
unused, and that before the above if tabused=length(tabused) you
append additional spare space.
Regards,
Pete
6. Re: Fast appending and sorting of alot of short strings
A little trick I found for this kind of dynamically growing sequences,
and that I use a lot in Win4Eu, is storing the number of used elements
in the sequence itself, instead of using a separate variable.
A very good place for this is the last element (seq[length(seq)]).
Here's a piece of code from Win4Eu that deals with collections (my name
for this kind of sequences):
constant GROWSIZE = 64 -- number of elements by which collections grow
constant BUFFER = repeat(0, GROWSIZE)
global function newCollection()
-- Create a new collection
-- (the last element of a collection contains the element-count)
return BUFFER
end function
global function addToCollection(sequence collection, object element)
-- Add an element to a collection
integer count, growSize
count = collection[length(collection)] + 1
growSize = count - length(collection)
if growSize > 0 then
if growSize > GROWSIZE then
collection &= repeat(0, growSize + GROWSIZE)
else collection &= BUFFER
end if
end if
collection[count] = element
collection[length(collection)] = count
return collection
end function
global function getCollectionCount(sequence collection)
-- Get the number of elements in the collection
return collection[length(collection)]
end function
global function findInCollection(object element, sequence collection)
-- Find an element in a collection
integer index
index = find(element, collection)
if index > collection[length(collection)] then index = 0 end if
return index
end function
--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
7. Re: Fast appending and sorting of alot of short strings
On Sat, 09 Oct 2004 05:47:34 -0700, Tommy Carlier
<guest at RapidEuphoria.com> wrote:
>Here's a piece of code from Win4Eu
I applied my theory to it and compared your:
>}}}
<eucode>
>global function addToCollection(sequence collection, object element)
> -- Add an element to a collection
> integer count, growSize
> count = collection[length(collection)] + 1
> growSize = count - length(collection)
> if growSize > 0 then
> if growSize > GROWSIZE then
> collection &= repeat(0, growSize + GROWSIZE)
> else collection &= BUFFER
> end if
> end if
> collection[count] = element
> collection[length(collection)] = count
> return collection
>end function
></eucode>
{{{
with:
sequence collection
procedure addTocollection(object element)
-- Add an element to a collection
integer count, growSize
count = collection[length(collection)] + 1
growSize = count - length(collection)
if growSize > 0 then
if growSize > GROWSIZE then
collection &= repeat(0, growSize + GROWSIZE)
else collection &= BUFFER
end if
end if
collection[count] = element
collection[length(collection)] = count
end procedure
They are identical apart from one is a function acting on and
returning a parameter, whereas the other is a procedure acting
directly on a file-level variable.
It may surprise you to find that it is 66 times faster.
The testing code I used to arrive at that figure was:
sequence MyTable
collection=newCollection()
MyTable=newCollection()
integer c1,c2
atom t
t=time()+1
c1=0
while t>time() do
c1+=1
addTocollection(c1)
end while
t=time()+1
c2=0
while t>time() do
c2+=1
MyTable=addToCollection(MyTable,c2)
end while
?c1
?c2
?c1/c2
abort(0)
Of course it is a pain to have to duplicate code for each table, but
for that kind of gain it is well worth it.
Regards,
Pete
8. Re: Fast appending and sorting of alot of short strings
- Posted by "Kat" <gertie at visionsix.com>
Oct 09, 2004
-
Last edited Oct 10, 2004
On 9 Oct 2004, at 20:19, Pete Lomax wrote:
>
>
> On Sat, 09 Oct 2004 05:47:34 -0700, Tommy Carlier
> <guest at RapidEuphoria.com> wrote:
>
> >Here's a piece of code from Win4Eu
>
> I applied my theory to it and compared your:
> >}}}
<eucode>
> >global function addToCollection(sequence collection, object element)
> > -- Add an element to a collection
> > integer count, growSize
> > count = collection[length(collection)] + 1
> > growSize = count - length(collection)
> > if growSize > 0 then
> > if growSize > GROWSIZE then
> > collection &= repeat(0, growSize + GROWSIZE)
> > else collection &= BUFFER
> > end if
> > end if
> > collection[count] = element
> > collection[length(collection)] = count
> > return collection
> >end function
> ></eucode>
{{{
> with:
> }}}
<eucode>
> sequence collection
>
> procedure addTocollection(object element)
> -- Add an element to a collection
> integer count, growSize
> count = collection[length(collection)] + 1
> growSize = count - length(collection)
> if growSize > 0 then
> if growSize > GROWSIZE then
> collection &= repeat(0, growSize + GROWSIZE)
> else collection &= BUFFER
> end if
> end if
> collection[count] = element
> collection[length(collection)] = count
> end procedure
> </eucode>
{{{
>
> They are identical apart from one is a function acting on and
> returning a parameter, whereas the other is a procedure acting
> directly on a file-level variable.
>
> It may surprise you to find that it is 66 times faster.
Perfect example of why global variables are sometimes a necessity for
speed, even if some people claim they are bad programming practice. Ditto
for "goto". Thanks, Pete.
Kat
9. Re: Fast appending and sorting of alot of short strings
- Posted by "Juergen Luethje" <j.lue at gmx.de>
Oct 09, 2004
-
Last edited Oct 10, 2004
Kat wrote:
<snip>
> Perfect example of why global variables are sometimes a necessity for
> speed, even if some people claim they are bad programming practice.
> Ditto for "goto". Thanks, Pete.
We don't know where to GOTO if we don't know where we've COME FROM.
http://www.fortran.com/fortran/come_from.html
Regards,
Juergen
10. Re: Fast appending and sorting of alot of short strings
- Posted by codepilot Gmail Account <codepilot at gmail.com>
Oct 09, 2004
-
Last edited Oct 10, 2004
What is goto good for, that for while prcedure function routineid cant do?
On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
>
> Kat wrote:
>
> <snip>
>
> > Perfect example of why global variables are sometimes a necessity for
> > speed, even if some people claim they are bad programming practice.
> > Ditto for "goto". Thanks, Pete.
>
> We don't know where to GOTO if we don't know where we've COME FROM.
> http://www.fortran.com/fortran/come_from.html
>
> Regards,
> Juergen
>
>
>
>
>
11. Re: Fast appending and sorting of alot of short strings
- Posted by "Juergen Luethje" <j.lue at gmx.de>
Oct 09, 2004
-
Last edited Oct 10, 2004
codepilot Gmail Account wrote
[quoting order rearranged]:
> On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
>>
>> Kat wrote:
>>
>> <snip>
>>
>>> Perfect example of why global variables are sometimes a necessity for
>>> speed, even if some people claim they are bad programming practice.
>>> Ditto for "goto". Thanks, Pete.
>>
>> We don't know where to GOTO if we don't know where we've COME FROM.
>> http://www.fortran.com/fortran/come_from.html
>>
>> Regards,
>> Juergen
>
> What is goto good for, that for while prcedure function routineid cant do?
No doubt that 'goto' is the most important and cleanest construct in
higher programming languages, but 'come frome' is even more important,
and its cleaner than clean.
Read THE TRUTH[tm] at <http://www.fortran.com/fortran/come_from.html>.
Regards,
Juergen
--
/"\ ASCII ribbon campain | Money is the root of all evil.
\ / against HTML in | Send 20 Dollars for more info.
X e-mail and news, |
/ \ and unneeded MIME | http://home.arcor.de/luethje/prog/
12. Re: Fast appending and sorting of alot of short strings
On 10 Oct 2004, at 1:55, Juergen Luethje wrote:
>
>
> codepilot Gmail Account wrote
> [quoting order rearranged]:
>
> > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> >>
> >> Kat wrote:
> >>
> >> <snip>
> >>
> >>> Perfect example of why global variables are sometimes a necessity for
> >>> speed, even if some people claim they are bad programming practice.
> >>> Ditto for "goto". Thanks, Pete.
> >>
> >> We don't know where to GOTO if we don't know where we've COME FROM.
> >> http://www.fortran.com/fortran/come_from.html
> >>
> >> Regards,
> >> Juergen
> >
> > What is goto good for, that for while prcedure function routineid cant do?
>
> No doubt that 'goto' is the most important and cleanest construct in
> higher programming languages, but 'come frome' is even more important,
> and its cleaner than clean.
> Read THE TRUTH[tm] at <http://www.fortran.com/fortran/come_from.html>.
Heh, i had argued for access to the call stack before, here, to see what point
in the program called a function, but i didn't know anyone had named such a
ability.
Kat
13. Re: Fast appending and sorting of alot of short strings
On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
>
>
> What is goto good for, that for while prcedure function routineid cant do?
goto :end
> On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> >
> > Kat wrote:
> >
> > <snip>
> >
> > > Perfect example of why global variables are sometimes a necessity for
> > > speed, even if some people claim they are bad programming practice.
> > > Ditto for "goto". Thanks, Pete.
> >
> > We don't know where to GOTO if we don't know where we've COME FROM.
> > http://www.fortran.com/fortran/come_from.html
> >
> > Regards,
> > Juergen
:end
Getting to here, instead of top posting.
There were no conditionals, and no calls outside this email.
Kat
14. Re: Fast appending and sorting of alot of short strings
So would come from really be a good thing? Cause it just looks screwy,
to me that is.
On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
>
>
> On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
>
> >
> > What is goto good for, that for while prcedure function routineid cant do?
>
> goto :end
>
> > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> > >
> > > Kat wrote:
> > >
> > > <snip>
> > >
> > > > Perfect example of why global variables are sometimes a necessity for
> > > > speed, even if some people claim they are bad programming practice.
> > > > Ditto for "goto". Thanks, Pete.
> > >
> > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > http://www.fortran.com/fortran/come_from.html
> > >
> > > Regards,
> > > Juergen
>
> :end
>
> Getting to here, instead of top posting.
> There were no conditionals, and no calls outside this email.
>
> Kat
>
>
>
>
>
15. Re: Fast appending and sorting of alot of short strings
On 9 Oct 2004, at 21:17, codepilot Gmail Account wrote:
>
>
> So would come from really be a good thing? Cause it just looks screwy,
> to me that is.
goto :end2
> On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
> >
> >
> > On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
> >
> > >
> > > What is goto good for, that for while prcedure function routineid cant do?
> >
> > goto :end
> >
> > > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de>
> > > wrote:
> > > >
> > > > Kat wrote:
> > > >
> > > > <snip>
> > > >
> > > > > Perfect example of why global variables are sometimes a necessity for
> > > > > speed, even if some people claim they are bad programming practice.
> > > > > Ditto for "goto". Thanks, Pete.
> > > >
> > > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > > http://www.fortran.com/fortran/come_from.html
> > > >
> > > > Regards,
> > > > Juergen
> >
> > :end
> >
> > Getting to here, instead of top posting.
> > There were no conditionals, and no calls outside this email.
> >
> > Kat
:end2
No, that page was a joke, it wasn't serious.
But i still say it would be nice to have a way to know how one got to where
one is, like the built-in trace knows. I again vote we have some access to
that if we have "with trace" or some analog enabled.
Kat
16. Re: Fast appending and sorting of alot of short strings
How abouts a call stack.
On Sat, 9 Oct 2004 23:33:46 -0500, Kat <gertie at visionsix.com> wrote:
>
>
> On 9 Oct 2004, at 21:17, codepilot Gmail Account wrote:
>
> >
> > So would come from really be a good thing? Cause it just looks screwy,
> > to me that is.
>
> goto :end2
>
> > On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
> > >
> > >
> > > On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
> > >
> > > >
> > > > What is goto good for, that for while prcedure function routineid cant
> > > > do?
> > >
> > > goto :end
> > >
> > > > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de>
> > > > wrote:
> > > > >
> > > > > Kat wrote:
> > > > >
> > > > > <snip>
> > > > >
> > > > > > Perfect example of why global variables are sometimes a necessity
> > > > > > for
> > > > > > speed, even if some people claim they are bad programming practice.
> > > > > > Ditto for "goto". Thanks, Pete.
> > > > >
> > > > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > > > http://www.fortran.com/fortran/come_from.html
> > > > >
> > > > > Regards,
> > > > > Juergen
> > >
> > > :end
> > >
> > > Getting to here, instead of top posting.
> > > There were no conditionals, and no calls outside this email.
> > >
> > > Kat
>
> :end2
> No, that page was a joke, it wasn't serious.
>
> But i still say it would be nice to have a way to know how one got to where
> one is, like the built-in trace knows. I again vote we have some access to
> that if we have "with trace" or some analog enabled.
>
>
> Kat
>
>
>
>
17. Re: Fast appending and sorting of alot of short strings
On 9 Oct 2004, at 22:14, codepilot Gmail Account wrote:
>
>
> How abouts a call stack.
goto :end3
> On Sat, 9 Oct 2004 23:33:46 -0500, Kat <gertie at visionsix.com> wrote:
> >
> >
> > On 9 Oct 2004, at 21:17, codepilot Gmail Account wrote:
> >
> > >
> > > So would come from really be a good thing? Cause it just looks screwy,
> > > to me that is.
> >
> > goto :end2
> >
> > > On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
> > > >
> > > >
> > > > On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
> > > >
> > > > >
> > > > > What is goto good for, that for while prcedure function routineid cant
> > > > > do?
> > > >
> > > > goto :end
> > > >
> > > > > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de>
> > > > > wrote:
> > > > > >
> > > > > > Kat wrote:
> > > > > >
> > > > > > <snip>
> > > > > >
> > > > > > > Perfect example of why global variables are sometimes a necessity
> > > > > > > for speed, even if some people claim they are bad programming
> > > > > > > practice. Ditto for "goto". Thanks, Pete.
> > > > > >
> > > > > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > > > > http://www.fortran.com/fortran/come_from.html
> > > > > >
> > > > > > Regards,
> > > > > > Juergen
> > > >
> > > > :end
> > > >
> > > > Getting to here, instead of top posting.
> > > > There were no conditionals, and no calls outside this email.
> > > >
> > > > Kat
> >
> > :end2
> > No, that page was a joke, it wasn't serious.
> >
> > But i still say it would be nice to have a way to know how one got to where
> > one is, like the built-in trace knows. I again vote we have some access to
> > that if we have "with trace" or some analog enabled.
> >
> >
> > Kat
Well, that *is* what i had asked for. Even tho it is limited to what called
functions and procedures. I'd think giving us access to the trace function in
the program would be just as easy?
Kat
18. Re: Fast appending and sorting of alot of short strings
Pete Lomax wrote:
> They are identical apart from one is a function acting on and
> returning a parameter, whereas the other is a procedure acting
> directly on a file-level variable.
<SNIP>
> It may surprise you to find that it is 66 times faster.
> Of course it is a pain to have to duplicate code for each table, but
> for that kind of gain it is well worth it.
I can't do this. The Collection-routines are for general use, not only
for one specific library. There are at least 10 different kinds of
collections in Win4Eu.
However, if anyone can optimize the routines, without changing the
functionality, I would really be grateful. Collections are used a lot
in Win4Eu, especially on the lowest level.
--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://creativeportal.ca
19. Re: Fast appending and sorting of alot of short strings
On Sun, 10 Oct 2004 02:02:33 -0700, Tommy Carlier
<guest at RapidEuphoria.com> wrote:
>I can't do this. The Collection-routines are for general use, not only
>for one specific library. There are at least 10 different kinds of
>collections in Win4Eu.
>However, if anyone can optimize the routines, without changing the
>functionality, I would really be grateful. Collections are used a lot
>in Win4Eu, especially on the lowest level.
How about giving external programs an integer index, rather than
letting/making them store the table as a sequence. Using a table of
tables internally, you can do this:
sequence AllMyTables
global function newCollectionIdx()
AllMyTables=append(AllMyTables,BUFFER)
return length(AllMyTables)
end function
global procedure addToTable(integer tableid, object element)
integer count, growSize
sequence OneOfMyTables
OneOfMyTables=AllMyTables[tableid]
AllMyTables[tableid]=0 --<<-- Ensure Ref Count of 1!!
count = OneOfMyTables[length(OneOfMyTables)] + 1
growSize = count - length(OneOfMyTables)
if growSize > 0 then
if growSize > GROWSIZE then
OneOfMyTables &= repeat(0, growSize + GROWSIZE)
else OneOfMyTables &= BUFFER
end if
end if
OneOfMyTables[count] = element
OneOfMyTables[length(OneOfMyTables)] = count
AllMyTables[tableid]=OneOfMyTables
end procedure
...
integer MyTbleIdx
MyTbleIdx=newCollectionIdx()
addToTable(MyTbleIdx,c1)
Despite the fact that it looks like you are copying huge chunks of
data about, it only copies the top level ref, which is nearly as fast
as assigning an integer. By temporarily removing the table to be
updated from your table of tables, you avoid reference counts of >1,
which trigger the creation of new sequences.
Done this way, however, it is only 57 times faster
)
Regards,
Pete
PS I couldn't see the purpose of:
if growSize > GROWSIZE then
collection &= repeat(0, growSize + GROWSIZE)
20. Re: Fast appending and sorting of alot of short strings
Pete Lomax wrote:
> How about giving external programs an integer index, rather than
> letting/making them store the table as a sequence. Using a table of
> tables internally, you can do this:
True, but then every operation has to be done through routine-calls.
The collections I have now, are just sequences. I can subscript them,
and slice them using normal sequence-syntax.
<SNIP>
> PS I couldn't see the purpose of:
> if growSize > GROWSIZE then
> collection &= repeat(0, growSize + GROWSIZE)
OK, this piece of code probably doesn't fit in anymore, because
it used to be in a different function. If you just add 1 element
to a collection, this isn't necessary. I previously had a function
'ensureCapacity', which would grow the collection to ensure a given
capacity. It's useful if you know in advance that you'll add 100
elements, you first call 'ensureCapacity', and the collection grows
enough to allow you to add the 100 elements faster.
--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com
21. Re: Fast appending and sorting of alot of short strings
I've decided to remove collections from Win4Eu. I implemented collections
in the first place, because I thought that it's better to append multiple
elements at once than to append element per element.
But I just did a little test, and it appears that my collection-code is
a lot slower than the default appending 1 element to a sequence.
Here's the test-code:
sequence s1, s2
atom t1, t2
s1 = newCollection()
t1 = time()
for i=1 to 100000 do
s1 = addToCollection(s1, i)
end for
t1 = time() - t1
s2 = {}
t2 = time()
for i=1 to 100000 do
s2 &= i
end for
t2 = time() - t2
? {t1, t2}
--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com