1. Fast appending and sorting of alot of short strings

Can somone give me some good theory to get this to go fast?
there is a sequence and it gets searched alot, and also appended to.
I thought that using a hash_find for search was good, but then it has
to be sorted.
So I use hash_append to find where string goes, and split the sequence
in insert the sting inside. The search must return an integer (0 not
found or subscript number). I would like to keep this in euphoria, but
I'm always open to an assembly idea.

Daniel

<code>
global function append_sort(sequence seq, sequence query)
--AKA hash_append
	integer low,high,current,len,comp
	len=length(seq)
	if length(seq)=0 then return {query} end if
	low=0
	high=len-1
	while 1 do
		current=floor((high+low)/2)
		comp=compare(query,seq[current+1])
		if comp=0 then return seq end if
		if high=low then
			if comp<0 then return seq[1..current]&{query}&seq[current+1..len]
			else return seq[1..current+1]&{query}&seq[current+2..len] end if
		end if
		if comp<0 then high=current-1
		else low=current+1 end if
		if high<low then
			if comp<0 then return seq[1..current]&{query}&seq[current+1..len]
			else return seq[1..current+1]&{query}&seq[current+2..len] end if
		end if
	end while
end function

global function findseqhashzero(sequence query, sequence seq)
--AKA hash_find
	integer low,high,current,len,comp
	len=length(seq)
	if length(seq)=0 then return 0 end if
	low=0
	high=len-1
	while 1 do
		current=floor((high+low)/2)
		comp=compare(query,seq[current+1])
		if comp=0 then return current+1 end if
		if high=low then return 0 end if
		if comp<0 then high=current-1
		else low=current+1 end if
		if high<low then return 0 end if
	end while
end function

</code>

new topic     » topic index » view message » categorize

2. Re: Fast appending and sorting of alot of short strings

On Wed, 6 Oct 2004 07:30:31 -0700, codepilot Gmail Account
<codepilot at gmail.com> wrote:

>Can somone give me some good theory to get this to go fast?
I was going to experiment with this a little, but got sidetracked & it
does not seem like I will get round to it. Still, I'll comment:
>there is a sequence and it gets searched alot, and also appended to.
>I thought that using a hash_find for search was good, but then it has
>to be sorted.
The point of using a hash is that you do not sort the main table.
Have a play with jiri's hash tables in the archives.

If I assume you try that and don't like it:

1) If you have a big table that gets added to alot, then it is far
better to use the following scheme:

sequence table	table=repeat(0,64)
integer tabused	tabused=0
...
	if tabused=length(table) then
		table&=repeat(0,64)
	end if
	tabused+=1
	table[tabused]=<blah>	-- see next

2) In order to keep the table sorted, you are currently using:

	seq[1..current]&{query}&seq[current+1..len]

Instead try:

	table[current+1..tabused]=table[current..tabused-1]
	table[current]=query

This is much faster (I think, as I said untested) because instead of
creating two large slices of table, stitching them all together into a
big result, then discarding the original table, only the second big
slice of the table is created (and discarded), the original table
remains (and obviously the first half remains, completely untouched).
A similar argument applies to deleting entries.

3) While the overhead of passing a large table to a function is
negligible, modifying/returning, and replacing it adds a huge
overhead.
If possible, code a separate routine to modify each (large) table
directly, rather than passing/returning it. You can use a single
search function, provided it does not modify/return the whole table.

HTH,
Pete

new topic     » goto parent     » topic index » view message » categorize

3. Re: Fast appending and sorting of alot of short strings

Right now after some testing I think that binary search trees would be
the optimal solution in euphoria.
to find just compare if equal return, less goto leg1, greater leg2 repeat
easy
to append, just find and add to empty leg, very very fast and easy.
the only concern is keeping it balanced and thats pretty easy.

But I've decided to break up the one big array of strings into lots of
special cases which is faster in my case.


On Sat, 09 Oct 2004 02:35:45 +0100, Pete Lomax
<petelomax at blueyonder.co.uk> wrote:
> 
> On Wed, 6 Oct 2004 07:30:31 -0700, codepilot Gmail Account
> <codepilot at gmail.com> wrote:
> 
> >Can somone give me some good theory to get this to go fast?
> I was going to experiment with this a little, but got sidetracked & it
> does not seem like I will get round to it. Still, I'll comment:
> >there is a sequence and it gets searched alot, and also appended to.
> >I thought that using a hash_find for search was good, but then it has
> >to be sorted.
> The point of using a hash is that you do not sort the main table.
> Have a play with jiri's hash tables in the archives.
> 
> If I assume you try that and don't like it:
> 
> 1) If you have a big table that gets added to alot, then it is far
> better to use the following scheme:
> 
> sequence table  table=repeat(0,64)
> integer tabused tabused=0
> ...
>        if tabused=length(table) then
>                table&=repeat(0,64)
>        end if
>        tabused+=1
>        table[tabused]=<blah>   -- see next
> 
> 2) In order to keep the table sorted, you are currently using:
> 
>        seq[1..current]&{query}&seq[current+1..len]
> 
> Instead try:
> 
>        table[current+1..tabused]=table[current..tabused-1]
>        table[current]=query
> 
> This is much faster (I think, as I said untested) because instead of
> creating two large slices of table, stitching them all together into a
> big result, then discarding the original table, only the second big
> slice of the table is created (and discarded), the original table
> remains (and obviously the first half remains, completely untouched).
> A similar argument applies to deleting entries.
> 
> 3) While the overhead of passing a large table to a function is
> negligible, modifying/returning, and replacing it adds a huge
> overhead.
> If possible, code a separate routine to modify each (large) table
> directly, rather than passing/returning it. You can use a single
> search function, provided it does not modify/return the whole table.
> 
> HTH,
> Pete
> 
> 
> 
> 
>

new topic     » goto parent     » topic index » view message » categorize

4. Re: Fast appending and sorting of alot of short strings

On 9 Oct 2004, at 2:35, Pete Lomax wrote:

> 
> 
> On Wed, 6 Oct 2004 07:30:31 -0700, codepilot Gmail Account
> <codepilot at gmail.com> wrote:
> 
> >Can somone give me some good theory to get this to go fast?
> I was going to experiment with this a little, but got sidetracked & it
> does not seem like I will get round to it. Still, I'll comment:
> >there is a sequence and it gets searched alot, and also appended to.
> >I thought that using a hash_find for search was good, but then it has
> >to be sorted.
> The point of using a hash is that you do not sort the main table.
> Have a play with jiri's hash tables in the archives.
> 
> If I assume you try that and don't like it:
> 
> 1) If you have a big table that gets added to alot, then it is far
> better to use the following scheme:
> 
> sequence table	table=repeat(0,64)
> integer tabused	tabused=0
> ...
>  if tabused=length(table) then
>   table&=repeat(0,64)
>  end if
>  tabused+=1
>  table[tabused]=<blah>	-- see next
> 
> 2) In order to keep the table sorted, you are currently using:
> 
>  seq[1..current]&{query}&seq[current+1..len]
> 
> Instead try:
> 
>  table[current+1..tabused]=table[current..tabused-1]

But you are dropping the last t abused element of t able, aren't you? Or is it 
spare space preappended to table?
>  table[current]=query

Kat

new topic     » goto parent     » topic index » view message » categorize

5. Re: Fast appending and sorting of alot of short strings

On Fri, 8 Oct 2004 22:56:04 -0500, Kat <gertie at visionsix.com> wrote:

>>  table[current+1..tabused]=table[current..tabused-1]
>
>But you are dropping the last tabused element of table, aren't you?
Perhaps. I should have coded tabused+=1 immediately before the above
statement, to make it clearer. It also depends on how you calculate
"current", eg inserting 4 into {1,3,5,7}, is current 2 or 3? If a
binary search fails, I'm not sure which it would return. Perhaps it
would be wise to test and add 1 just in case. To be honest, I've only
used this technique to remove chunks of a table, and usually shifting
relatively small blocks about near the end of the table, which may
have exaggerated the benefits I experienced (vs. "&").
> Or is it spare space preappended to table?
Yes, the intention is that table[tabused+1..length(tabused)] is
unused, and that before the above if tabused=length(tabused) you
append additional spare space.

Regards,
Pete

new topic     » goto parent     » topic index » view message » categorize

6. Re: Fast appending and sorting of alot of short strings

A little trick I found for this kind of dynamically growing sequences,
and that I use a lot in Win4Eu, is storing the number of used elements
in the sequence itself, instead of using a separate variable.
A very good place for this is the last element (seq[length(seq)]).

Here's a piece of code from Win4Eu that deals with collections (my name
for this kind of sequences):
constant GROWSIZE = 64 -- number of elements by which collections grow
constant BUFFER = repeat(0, GROWSIZE)

global function newCollection()
	-- Create a new collection
	-- (the last element of a collection contains the element-count)
	return BUFFER
end function

global function addToCollection(sequence collection, object element)
	-- Add an element to a collection
	integer count, growSize
	count = collection[length(collection)] + 1
	growSize = count - length(collection)
	if growSize > 0 then
		if growSize > GROWSIZE then
			collection &= repeat(0, growSize + GROWSIZE)
		else collection &= BUFFER
		end if
	end if
	collection[count] = element
	collection[length(collection)] = count
	return collection
end function

global function getCollectionCount(sequence collection)
	-- Get the number of elements in the collection
	return collection[length(collection)]
end function

global function findInCollection(object element, sequence collection)
	-- Find an element in a collection
	integer index
	index = find(element, collection)
	if index > collection[length(collection)] then index = 0 end if
	return index
end function


--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com

new topic     » goto parent     » topic index » view message » categorize

7. Re: Fast appending and sorting of alot of short strings

On Sat, 09 Oct 2004 05:47:34 -0700, Tommy Carlier
<guest at RapidEuphoria.com> wrote:

>Here's a piece of code from Win4Eu 

I applied my theory to it and compared your:
>}}}
<eucode>
>global function addToCollection(sequence collection, object element)
>	-- Add an element to a collection
>	integer count, growSize
>	count = collection[length(collection)] + 1
>	growSize = count - length(collection)
>	if growSize > 0 then
>		if growSize > GROWSIZE then
>			collection &= repeat(0, growSize + GROWSIZE)
>		else collection &= BUFFER
>		end if
>	end if
>	collection[count] = element
>	collection[length(collection)] = count
>	return collection
>end function
></eucode>
{{{

with:
sequence collection

procedure addTocollection(object element)
	-- Add an element to a collection
	integer count, growSize
	count = collection[length(collection)] + 1
	growSize = count - length(collection)
	if growSize > 0 then
		if growSize > GROWSIZE then
			collection &= repeat(0, growSize + GROWSIZE)
		else collection &= BUFFER
		end if
	end if
	collection[count] = element
	collection[length(collection)] = count
end procedure


They are identical apart from one is a function acting on and
returning a parameter, whereas the other is a procedure acting
directly on a file-level variable.

It may surprise you to find that it is 66 times faster.

The testing code I used to arrive at that figure was:
sequence MyTable
	collection=newCollection()
	MyTable=newCollection()
integer c1,c2
atom t
	t=time()+1
	c1=0
	while t>time() do
		c1+=1
		addTocollection(c1)
	end while
	t=time()+1
	c2=0
	while t>time() do
		c2+=1
		MyTable=addToCollection(MyTable,c2)
	end while
?c1
?c2
?c1/c2
abort(0)


Of course it is a pain to have to duplicate code for each table, but
for that kind of gain it is well worth it.

Regards,
Pete

new topic     » goto parent     » topic index » view message » categorize

8. Re: Fast appending and sorting of alot of short strings

On 9 Oct 2004, at 20:19, Pete Lomax wrote:

> 
> 
> On Sat, 09 Oct 2004 05:47:34 -0700, Tommy Carlier
> <guest at RapidEuphoria.com> wrote:
> 
> >Here's a piece of code from Win4Eu 
> 
> I applied my theory to it and compared your:
> >}}}
<eucode>
> >global function addToCollection(sequence collection, object element)
> >	-- Add an element to a collection
> >	integer count, growSize
> >	count = collection[length(collection)] + 1
> >	growSize = count - length(collection)
> >	if growSize > 0 then
> >		if growSize > GROWSIZE then
> >			collection &= repeat(0, growSize + GROWSIZE)
> >		else collection &= BUFFER
> >		end if
> >	end if
> >	collection[count] = element
> >	collection[length(collection)] = count
> >	return collection
> >end function
> ></eucode>
{{{

> with:
> }}}
<eucode>
> sequence collection
> 
> procedure addTocollection(object element)
>  -- Add an element to a collection
>  integer count, growSize
>  count = collection[length(collection)] + 1
>  growSize = count - length(collection)
>  if growSize > 0 then
>   if growSize > GROWSIZE then
>    collection &= repeat(0, growSize + GROWSIZE)
>   else collection &= BUFFER
>   end if
>  end if
>  collection[count] = element
>  collection[length(collection)] = count
> end procedure
> </eucode>
{{{

> 
> They are identical apart from one is a function acting on and
> returning a parameter, whereas the other is a procedure acting
> directly on a file-level variable.
> 
> It may surprise you to find that it is 66 times faster.

Perfect example of why global variables are sometimes a necessity for 
speed, even if some people claim they are bad programming practice. Ditto 
for "goto". Thanks, Pete.

Kat

new topic     » goto parent     » topic index » view message » categorize

9. Re: Fast appending and sorting of alot of short strings

Kat wrote:

<snip>

> Perfect example of why global variables are sometimes a necessity for
> speed, even if some people claim they are bad programming practice.
> Ditto for "goto". Thanks, Pete.

We don't know where to GOTO if we don't know where we've COME FROM.
http://www.fortran.com/fortran/come_from.html

Regards,
   Juergen

new topic     » goto parent     » topic index » view message » categorize

10. Re: Fast appending and sorting of alot of short strings

What is goto good for, that for while prcedure function routineid cant do?


On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> 
> Kat wrote:
> 
> <snip>
> 
> > Perfect example of why global variables are sometimes a necessity for
> > speed, even if some people claim they are bad programming practice.
> > Ditto for "goto". Thanks, Pete.
> 
> We don't know where to GOTO if we don't know where we've COME FROM.
> http://www.fortran.com/fortran/come_from.html
> 
> Regards,
>   Juergen
> 
> 
> 
> 
>

new topic     » goto parent     » topic index » view message » categorize

11. Re: Fast appending and sorting of alot of short strings

codepilot Gmail Account wrote
[quoting order rearranged]:

> On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
>>
>> Kat wrote:
>>
>> <snip>
>>
>>> Perfect example of why global variables are sometimes a necessity for
>>> speed, even if some people claim they are bad programming practice.
>>> Ditto for "goto". Thanks, Pete.
>>
>> We don't know where to GOTO if we don't know where we've COME FROM.
>> http://www.fortran.com/fortran/come_from.html
>>
>> Regards,
>>   Juergen
>
> What is goto good for, that for while prcedure function routineid cant do?

No doubt that 'goto' is the most important and cleanest construct in
higher programming languages, but 'come frome' is even more important,
and its cleaner than clean.
Read THE TRUTH[tm] at <http://www.fortran.com/fortran/come_from.html>.

Regards,
   Juergen

-- 
 /"\  ASCII ribbon campain  |  Money is the root of all evil.
 \ /  against HTML in       |  Send 20 Dollars for more info.
  X   e-mail and news,      |
 / \  and unneeded MIME     |  http://home.arcor.de/luethje/prog/

new topic     » goto parent     » topic index » view message » categorize

12. Re: Fast appending and sorting of alot of short strings

On 10 Oct 2004, at 1:55, Juergen Luethje wrote:

> 
> 
> codepilot Gmail Account wrote
> [quoting order rearranged]:
> 
> > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> >>
> >> Kat wrote:
> >>
> >> <snip>
> >>
> >>> Perfect example of why global variables are sometimes a necessity for
> >>> speed, even if some people claim they are bad programming practice.
> >>> Ditto for "goto". Thanks, Pete.
> >>
> >> We don't know where to GOTO if we don't know where we've COME FROM.
> >> http://www.fortran.com/fortran/come_from.html
> >>
> >> Regards,
> >>   Juergen
> >
> > What is goto good for, that for while prcedure function routineid cant do?
> 
> No doubt that 'goto' is the most important and cleanest construct in
> higher programming languages, but 'come frome' is even more important,
> and its cleaner than clean.
> Read THE TRUTH[tm] at <http://www.fortran.com/fortran/come_from.html>.

Heh, i had argued for access to the call stack before, here, to see what point 
in the program called a function, but i didn't know anyone had named such a 
ability.

Kat

new topic     » goto parent     » topic index » view message » categorize

13. Re: Fast appending and sorting of alot of short strings

On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:

> 
> 
> What is goto good for, that for while prcedure function routineid cant do?

goto :end 
 
> On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> > 
> > Kat wrote:
> > 
> > <snip>
> > 
> > > Perfect example of why global variables are sometimes a necessity for
> > > speed, even if some people claim they are bad programming practice.
> > > Ditto for "goto". Thanks, Pete.
> > 
> > We don't know where to GOTO if we don't know where we've COME FROM.
> > http://www.fortran.com/fortran/come_from.html
> > 
> > Regards,
> >   Juergen

:end

Getting to here, instead of top posting.
There were no conditionals, and no calls outside this email.

Kat

new topic     » goto parent     » topic index » view message » categorize

14. Re: Fast appending and sorting of alot of short strings

So would come from really be a good thing? Cause it just looks screwy,
to me that is.


On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
> 
> 
> On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
> 
> >
> > What is goto good for, that for while prcedure function routineid cant do?
> 
> goto :end
> 
> > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> > >
> > > Kat wrote:
> > >
> > > <snip>
> > >
> > > > Perfect example of why global variables are sometimes a necessity for
> > > > speed, even if some people claim they are bad programming practice.
> > > > Ditto for "goto". Thanks, Pete.
> > >
> > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > http://www.fortran.com/fortran/come_from.html
> > >
> > > Regards,
> > >   Juergen
> 
> :end
> 
> Getting to here, instead of top posting.
> There were no conditionals, and no calls outside this email.
> 
> Kat
> 
> 
> 
> 
>

new topic     » goto parent     » topic index » view message » categorize

15. Re: Fast appending and sorting of alot of short strings

On 9 Oct 2004, at 21:17, codepilot Gmail Account wrote:

> 
> 
> So would come from really be a good thing? Cause it just looks screwy,
> to me that is.

goto :end2
 
> On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
> > 
> > 
> > On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
> > 
> > >
> > > What is goto good for, that for while prcedure function routineid cant do?
> > 
> > goto :end
> > 
> > > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de>
> > > wrote:
> > > >
> > > > Kat wrote:
> > > >
> > > > <snip>
> > > >
> > > > > Perfect example of why global variables are sometimes a necessity for
> > > > > speed, even if some people claim they are bad programming practice.
> > > > > Ditto for "goto". Thanks, Pete.
> > > >
> > > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > > http://www.fortran.com/fortran/come_from.html
> > > >
> > > > Regards,
> > > >   Juergen
> > 
> > :end
> > 
> > Getting to here, instead of top posting.
> > There were no conditionals, and no calls outside this email.
> > 
> > Kat

:end2 
No, that page was a joke, it wasn't serious.

But i still say it would be nice to have a way to know how one got to where 
one is, like the built-in trace knows. I again vote we have some access to 
that if we have "with trace" or some analog enabled.

Kat

new topic     » goto parent     » topic index » view message » categorize

16. Re: Fast appending and sorting of alot of short strings

How abouts a call stack.


On Sat, 9 Oct 2004 23:33:46 -0500, Kat <gertie at visionsix.com> wrote:
> 
> 
> On 9 Oct 2004, at 21:17, codepilot Gmail Account wrote:
> 
> >
> > So would come from really be a good thing? Cause it just looks screwy,
> > to me that is.
> 
> goto :end2
> 
> > On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
> > >
> > >
> > > On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
> > >
> > > >
> > > > What is goto good for, that for while prcedure function routineid cant
> > > > do?
> > >
> > > goto :end
> > >
> > > > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de>
> > > > wrote:
> > > > >
> > > > > Kat wrote:
> > > > >
> > > > > <snip>
> > > > >
> > > > > > Perfect example of why global variables are sometimes a necessity
> > > > > > for
> > > > > > speed, even if some people claim they are bad programming practice.
> > > > > > Ditto for "goto". Thanks, Pete.
> > > > >
> > > > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > > > http://www.fortran.com/fortran/come_from.html
> > > > >
> > > > > Regards,
> > > > >   Juergen
> > >
> > > :end
> > >
> > > Getting to here, instead of top posting.
> > > There were no conditionals, and no calls outside this email.
> > >
> > > Kat
> 
> :end2
> No, that page was a joke, it wasn't serious.
> 
> But i still say it would be nice to have a way to know how one got to where
> one is, like the built-in trace knows. I again vote we have some access to
> that if we have "with trace" or some analog enabled.
> 
> 
> Kat
> 
> 
> 
>

new topic     » goto parent     » topic index » view message » categorize

17. Re: Fast appending and sorting of alot of short strings

On 9 Oct 2004, at 22:14, codepilot Gmail Account wrote:

> 
> 
> How abouts a call stack.

goto :end3
 
 
> On Sat, 9 Oct 2004 23:33:46 -0500, Kat <gertie at visionsix.com> wrote:
> > 
> > 
> > On 9 Oct 2004, at 21:17, codepilot Gmail Account wrote:
> > 
> > >
> > > So would come from really be a good thing? Cause it just looks screwy,
> > > to me that is.
> > 
> > goto :end2
> > 
> > > On Sat, 9 Oct 2004 20:49:35 -0500, Kat <gertie at visionsix.com> wrote:
> > > >
> > > >
> > > > On 9 Oct 2004, at 15:29, codepilot Gmail Account wrote:
> > > >
> > > > >
> > > > > What is goto good for, that for while prcedure function routineid cant
> > > > > do?
> > > >
> > > > goto :end
> > > >
> > > > > On Sun, 10 Oct 2004 00:01:05 +0200, Juergen Luethje <j.lue at gmx.de>
> > > > > wrote:
> > > > > >
> > > > > > Kat wrote:
> > > > > >
> > > > > > <snip>
> > > > > >
> > > > > > > Perfect example of why global variables are sometimes a necessity
> > > > > > > for speed, even if some people claim they are bad programming
> > > > > > > practice. Ditto for "goto". Thanks, Pete.
> > > > > >
> > > > > > We don't know where to GOTO if we don't know where we've COME FROM.
> > > > > > http://www.fortran.com/fortran/come_from.html
> > > > > >
> > > > > > Regards,
> > > > > >   Juergen
> > > >
> > > > :end
> > > >
> > > > Getting to here, instead of top posting.
> > > > There were no conditionals, and no calls outside this email.
> > > >
> > > > Kat
> > 
> > :end2
> > No, that page was a joke, it wasn't serious.
> > 
> > But i still say it would be nice to have a way to know how one got to where
> > one is, like the built-in trace knows. I again vote we have some access to
> > that if we have "with trace" or some analog enabled.
> > 
> > 
> > Kat

Well, that *is* what i had asked for. Even tho it is limited to what called 
functions and procedures. I'd think giving us access to the trace function in 
the program would be just as easy?

Kat

new topic     » goto parent     » topic index » view message » categorize

18. Re: Fast appending and sorting of alot of short strings

Pete Lomax wrote:
> They are identical apart from one is a function acting on and
> returning a parameter, whereas the other is a procedure acting
> directly on a file-level variable.
<SNIP> 
> It may surprise you to find that it is 66 times faster.
> Of course it is a pain to have to duplicate code for each table, but
> for that kind of gain it is well worth it.

I can't do this. The Collection-routines are for general use, not only
for one specific library. There are at least 10 different kinds of
collections in Win4Eu.
However, if anyone can optimize the routines, without changing the
functionality, I would really be grateful. Collections are used a lot
in Win4Eu, especially on the lowest level.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://creativeportal.ca

new topic     » goto parent     » topic index » view message » categorize

19. Re: Fast appending and sorting of alot of short strings

On Sun, 10 Oct 2004 02:02:33 -0700, Tommy Carlier
<guest at RapidEuphoria.com> wrote:

>I can't do this. The Collection-routines are for general use, not only
>for one specific library. There are at least 10 different kinds of
>collections in Win4Eu.
>However, if anyone can optimize the routines, without changing the
>functionality, I would really be grateful. Collections are used a lot
>in Win4Eu, especially on the lowest level.

How about giving external programs an integer index, rather than
letting/making them store the table as a sequence. Using a table of
tables internally, you can do this:

sequence AllMyTables

global function newCollectionIdx()
	AllMyTables=append(AllMyTables,BUFFER)
	return length(AllMyTables)
end function

global procedure addToTable(integer tableid, object element)
integer count, growSize
sequence OneOfMyTables

	OneOfMyTables=AllMyTables[tableid]

	AllMyTables[tableid]=0		--<<-- Ensure Ref Count of 1!!

	count = OneOfMyTables[length(OneOfMyTables)] + 1
	growSize = count - length(OneOfMyTables)
	if growSize > 0 then
		if growSize > GROWSIZE then
			OneOfMyTables &= repeat(0, growSize + GROWSIZE)
		else OneOfMyTables &= BUFFER
		end if
	end if
	OneOfMyTables[count] = element
	OneOfMyTables[length(OneOfMyTables)] = count

	AllMyTables[tableid]=OneOfMyTables
end procedure

...
integer MyTbleIdx
	MyTbleIdx=newCollectionIdx()
		addToTable(MyTbleIdx,c1)


Despite the fact that it looks like you are copying huge chunks of
data about, it only copies the top level ref, which is nearly as fast
as assigning an integer. By temporarily removing the table to be
updated from your table of tables, you avoid reference counts of >1,
which trigger the creation of new sequences.

Done this way, however, it is only 57 times faster blink)

Regards,
Pete
PS I couldn't see the purpose of:
		if growSize > GROWSIZE then
			collection &= repeat(0, growSize + GROWSIZE)

new topic     » goto parent     » topic index » view message » categorize

20. Re: Fast appending and sorting of alot of short strings

Pete Lomax wrote:
> How about giving external programs an integer index, rather than
> letting/making them store the table as a sequence. Using a table of
> tables internally, you can do this:
True, but then every operation has to be done through routine-calls.
The collections I have now, are just sequences. I can subscript them,
and slice them using normal sequence-syntax.

<SNIP>

> PS I couldn't see the purpose of:
> 		if growSize > GROWSIZE then
> 			collection &= repeat(0, growSize + GROWSIZE)

OK, this piece of code probably doesn't fit in anymore, because
it used to be in a different function. If you just add 1 element
to a collection, this isn't necessary. I previously had a function
'ensureCapacity', which would grow the collection to ensure a given
capacity. It's useful if you know in advance that you'll add 100
elements, you first call 'ensureCapacity', and the collection grows
enough to allow you to add the 100 elements faster.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com

new topic     » goto parent     » topic index » view message » categorize

21. Re: Fast appending and sorting of alot of short strings

I've decided to remove collections from Win4Eu. I implemented collections
in the first place, because I thought that it's better to append multiple
elements at once than to append element per element.
But I just did a little test, and it appears that my collection-code is
a lot slower than the default appending 1 element to a sequence.

Here's the test-code:
sequence s1, s2
atom t1, t2

s1 = newCollection()
t1 = time()
for i=1 to 100000 do
	s1 = addToCollection(s1, i)
end for
t1 = time() - t1

s2 = {}
t2 = time()
for i=1 to 100000 do
	s2 &= i
end for
t2 = time() - t2

? {t1, t2}


--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu