1. Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 11, 2005
- 617 views
Surprisingly my old Python code was faster than Euphoria. I was reading a file of 40,000 lines like this:

a10_Clo10_BL26_ClFish_O_C251A_0_0_0_2003X04 Clown Fish 17 mark 120 B 405 425 404.83 425.86 71.0 51.0 false -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0

into memory. The code is below. It took 2 minutes just to read the file into memory, during which ntvdm.exe ran at full speed and memory allocation rose slowly to around 55 megabytes. If I skipped the line table = append(table,line), this took 2 seconds.

include kparse.e

sequence line, table
integer file
object o_line
constant TRUE = 1
constant FALSE = 0
constant TAB = 9

file = open("big_file.txt", "r")
table = {}
while TRUE do
    o_line = gets(file)
    if atom(o_line) then
        exit
    end if
    line = Kparse(o_line, TAB)
    table = append(table, line)
end while
2. Re: Slow memory allocation
- Posted by Matt Lewis <matthewwalkerlewis at gmail.com> Jun 11, 2005
- 592 views
Haflidi Asgrimsson wrote:
> Surprisingly my old Python code was faster than Euphoria. I was reading a file
> of 40,000 lines [...] It took 2 minutes just reading the file into
> memory [...] If I skipped the line: table = append(table,line) this
> took 2 seconds.

If you know how big table will eventually be, you should initialize it like:
table = repeat( "", 40000 )
Even if you don't know how big it will be, you'll get better performance if you grow it in chunks. Chunk size is, of course, up to you, but here's an example:
include kparse.e

sequence line, table
integer file, table_size, table_index
object o_line
constant TRUE = 1
constant FALSE = 0
constant TAB = 9
constant TABLE_CHUNK = 1024

table_size = TABLE_CHUNK
table_index = 0
file = open("big_file.txt", "r")
table = repeat( 0, table_size )
while TRUE do
    o_line = gets(file)
    if atom(o_line) then
        exit
    end if
    line = Kparse(o_line, TAB)
    table_index += 1
    if table_index = table_size then
        table_size += TABLE_CHUNK
        table &= repeat( 0, TABLE_CHUNK )
    end if
    table[table_index] = line
end while
table = table[1..table_index]  -- trim to the entries actually used
You can play around with that and change the size of TABLE_CHUNK, or grow TABLE_CHUNK dynamically, to allocate more memory each time. Basically, whenever you create a sequence, Euphoria will allocate enough space for it, plus a little extra whenever you start to grow it. Once you go beyond that size, it allocates a new chunk, and moves the memory. So you want to minimize the number of times this happens. Matt Lewis
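Since the original post benchmarks against Python anyway, here is the same chunk-growth idea sketched in Python for comparison (names and the chunk size are illustrative; Python lists already grow amortized, so this only demonstrates the pattern, not a Python speedup):

```python
# Sketch of the chunk-growth pattern: preallocate a block of slots, grow
# by a whole chunk only when full, and trim the unused tail at the end.
CHUNK = 1024

def read_table(lines):
    table = [None] * CHUNK            # preallocate one chunk up front
    size, index = CHUNK, 0
    for line in lines:
        index += 1
        if index == size:             # out of room: grow by a whole chunk
            size += CHUNK
            table.extend([None] * CHUNK)
        table[index - 1] = line.split("\t")
    return table[:index]              # trim to the entries actually used

rows = read_table(["a\tb\tc", "1\t2\t3"])
```

The point, as in the Euphoria version, is that reallocation happens once per chunk instead of once per line.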
3. Re: Slow memory allocation
- Posted by Al Getz <Xaxo at aol.com> Jun 11, 2005
- 616 views
Haflidi Asgrimsson wrote:
> Surprisingly my old Python code was faster than Euphoria. I was reading a file
> of 40,000 lines [...] It took 2 minutes just reading the file into
> memory [...] If I skipped the line: table = append(table,line) this
> took 2 seconds.

Hi there,

That's interesting, because I do the same thing with my personal editor and it opens a file with 56,000 lines (4 megabytes) in about a second.

How much physical RAM do you have installed in your computer?

Take care,
Al

And, good luck with your Euphoria programming!

My bumper sticker: "I brake for LED's"
4. Re: Slow memory allocation
- Posted by Mario Steele <eumario at trilake.net> Jun 11, 2005
- 566 views
- Last edited Jun 12, 2005
Al Getz wrote:
> Haflidi Asgrimsson wrote:
> > Surprisingly my old Python code was faster than Euphoria. [...] If I skipped
> > the line: table = append(table,line) this took 2 seconds.
>
> That's interesting, because i do the same thing with my personal editor
> and it opens a file with 56,000 lines (4 Megabytes) in about a second.
>
> How much physical RAM do you have installed in your computer?

Actually, this isn't a problem with slow memory allocation. This is a problem with append(). That function has long been known to be slow among the "old timers" here. It is much better to use the &= operator to concatenate data; it has proven faster than append() on many occasions. Here's an example:
sequence a
a = "This is part of a string"
a &= ", and this is the rest"
a = {a}
a &= { "This is a new string." }
-- a now looks like:
-- {"This is part of a string, and this is the rest", "This is a new string."}
Always remember, though: when you want to concatenate two sequences and want each to keep its own place, rather than being merged into one sequence, put { } brackets around the data, even if it is already a sequence. That tells the Euphoria interpreter that you want to keep the two sequences separate.

Mario Steele
http://enchantedblade.trilake.net
Attaining World Domination, one byte at a time...
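Mario's { } wrapping has a close analogue in Python (only loosely comparable, since Python lists are not Euphoria sequences): += splices in the elements of any sequence on the right-hand side, unless you wrap it in a list first.

```python
# Unwrapped: a string is a sequence, so += splices in its characters.
a = ["This is part of a string, and this is the rest"]
a += "xy"                        # spliced in as 'x', 'y'

# Wrapped in [ ]: the string is appended as a single element,
# just like wrapping in { } in Euphoria.
b = ["This is part of a string, and this is the rest"]
b += ["This is a new string."]
```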
5. Re: Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 12, 2005
- 587 views
Mario Steele wrote:
> Actually, this isn't a problem with slow memory allocation. This is a problem
> with append(). [...] It is much more suggestable that you use the &= oper
> sign to concat data together. It has been proven to be much faster then
> append() on many occasions.

I tried this on two systems, both running Windows XP Pro; the latter is 2GHz with 1.5 GB memory, and there it took 25 seconds. What I found most interesting is that append(), &= and plain assignment all gave the same result, 25 seconds. table[i] = line is taking around 24 seconds of those.
include file.e    -- for seek()
include kparse.e

sequence line, table
integer file, n_line
object o_line
constant TRUE = 1
constant FALSE = 0
constant TAB = 9

file = open("big_file.txt", "r")
n_line = 0
while TRUE do
    o_line = gets(file)
    if atom(o_line) then
        exit
    end if
    n_line += 1   -- count only lines actually read
end while
if seek(file, 0) then
    puts(1, "Seek failed\n")
end if
table = repeat( {}, n_line )
for i = 1 to n_line do
    o_line = gets(file)
    if atom(o_line) then
        exit
    end if
    line = Kparse(o_line, TAB)
    table[i] = line
end for
6. Re: Slow memory allocation
- Posted by Al Getz <Xaxo at aol.com> Jun 12, 2005
- 557 views
Haflidi Asgrimsson wrote:
> I tried this on two systems both running Windows XP Pro, the latter is 2GHz
> with 1.5 GB memory, there it took 25 seconds. What I found most interesting
> is that append(), &= and assignment gave the same result, 25 seconds.
> table[i] = line is taking around 24 seconds of those.

Hello again,

My only question now is what version of Euphoria are you using?

I'm asking these questions because I do this:

object line
sequence buff
atom fn

buff = {}
fn = open("c:\\myfile.txt", "r")
while 1 do
    line = gets(fn)
    if atom(line) then
        exit
    end if
    buff = append(buff, line)
end while

The above code opens a 2,839,910-byte text file in about 1 second using Euphoria v2.4 with a bindw'd exe.

Did you try loading the file completely first, then parsing after? If so, does it speed it up any? I know you have tried leaving out the 'append' line, but what happens when you leave out the parse line without leaving out the append line?

Take care,
Al

And, good luck with your Euphoria programming!

My bumper sticker: "I brake for LED's"
7. Re: Slow memory allocation
- Posted by Al Getz <Xaxo at aol.com> Jun 12, 2005
- 578 views
Hello again,

I'm not sure if I replied to the correct post, so here's the post I meant to reply to:

> I tried this on two systems both running Windows XP Pro,
> the latter is 2GHz with 1.5 Gb memory, there it took 25 seconds.
> What I found most interesting is that append(),
> &= and assignment gave the same result, 25 seconds.
> table[i] = line is taking around 24 seconds of those.

Your system is faster than mine and has more memory, so I would have expected it to read the file faster than mine does, and mine reads a 2,900,000+ byte file in about a second. I tried a non-bindw'd .exw file and it was the same, and I tried the .exw file with Version 2.5 of Euphoria (PD Beta version) and it was the same: about one second to read the whole file into the sequence.

What I would do is try that exact code fragment (in the previous post) without the parse line and see if it works faster. If not, I would wonder if you have any active virus software, or whether the page file was moved by a non-Windows disk manager. If it does in fact speed up, then perhaps you should do your parsing AFTER the whole file is read into the sequence.

I've been using Euphoria for several years now and I've had my editor up and running for most of them, and it's always been fast even though I'm using 'append' to store the lines in the sequence. Pete Lomax recently started a new editor which uses basically the same technique, and that is about the same speed (fast). This makes me think something else is wrong.

Take care,
Al

And, good luck with your Euphoria programming!

My bumper sticker: "I brake for LED's"
8. Re: Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 12, 2005
- 566 views
Al Getz wrote:
> What i would do is try that exact code fragment (in the previous post)
> without the parse line and see if it works faster. [...] If it does in fact
> speed up, then perhaps you should do your parsing AFTER the whole file is
> read into the sequence.

Us getting different results led to me trying your code; it ran in less than a second. The culprit line is:

line = Kparse(o_line, TAB)

although it only slows the code with append(), &= or assignment following.
9. Re: Slow memory allocation
- Posted by Matt Lewis <matthewwalkerlewis at gmail.com> Jun 12, 2005
- 577 views
Haflidi Asgrimsson wrote:
> Us getting different results led to me trying your code, ran in less than
> a second. The culprit line is:
> line = Kparse(o_line, TAB)
> Although it only slows the code with append(), &= or assignment following.

That would have been my next question. What does Kparse() do? Could you tell us what the results are when you:

1) Comment out the Kparse() call
2) Comment out the assignment to table
3) Comment out the Kparse() call and the assignment (just read and ignore)

Matt Lewis
10. Re: Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 12, 2005
- 609 views
Matt Lewis wrote:
> That would have been my next question. What does Kparse() do? Could you
> tell us what the results are when you:
>
> 1) Comment out the Kparse() call
> 2) Comment out the assignment to table
> 3) Comment out the Kparse() call and the assignment (just read and ignore)

Kparse is from kparse.e:

--KPARSE.E (parse with keep)
--(c) 05/01/104 Michael J Raley (thinkways at yahoo.com)
--Turns a string of delimited text into a list of items,
--while retaining the position of empty elements.

The function is below. A string like "1\t2\t\t3" is converted into the list {"1","2","","3"}.

In all three cases I get a nearly instant response. Only when both lines

line = Kparse(o_line, TAB)
table = append(table, line)

are present does the CPU run for 25 seconds. This must be some kind of late type checking, because if TAB is replaced by a sequence, as in

line = Kparse(o_line, "9")

then this takes only about 2 seconds.
--------------------------------------------------------
global function Kparse(object s, object o)
    sequence clipbook, parsed_list
    atom ls, lc, clip

    if atom(s) then return s end if

    clipbook = {}
    parsed_list = {}
    ls = length(s)

    -- convert atom delimiter into a 1-element sequence
    if atom(o) then o = {o} end if

    -- bookmark the position of all delimiters in 1 pass
    for a = 1 to ls do
        if match({s[a]}, o) then
            clipbook &= a
        end if
    end for

    lc = length(clipbook)
    if lc = 0 then return {} end if

    -- find the text between the recorded delimiter positions to create a list.
    -- First check to see if the first bookmarked delimiter starts sequence s
    clip = clipbook[1]
    if clip = 1 then
        parsed_list = {{}}              -- Yes. Create the first element empty
    else
        parsed_list = {s[1..clip-1]}    -- No. Build the first element from s up to our bookmark
    end if

    -- now we can process the rest of the sequence
    for ic = 2 to lc do
        if clip+1 = clipbook[ic] then
            parsed_list = append(parsed_list, {})
        else
            parsed_list = append(parsed_list, s[clip+1..clipbook[ic]-1])
        end if
        clip = clipbook[ic]
    end for

    -- test if end of s is past last delimiter
    if ls > clipbook[lc] then
        parsed_list = append(parsed_list, s[clipbook[lc]+1..ls])
    end if

    return parsed_list
end function
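Since the thread started as a Euphoria-vs-Python comparison, it may help to note that Kparse's keep-the-empty-fields contract matches Python's str.split with an explicit separator:

```python
# Splitting with an explicit separator keeps empty fields,
# just like Kparse("1\t2\t\t3", TAB) -> {"1","2","","3"}.
fields = "1\t2\t\t3".split("\t")
```

One difference: for a line containing no delimiter at all, Kparse returns {} (the `if lc = 0 then return {}` branch above), whereas split returns the whole line as a one-element list.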
11. Re: Slow memory allocation
- Posted by Vincent <darkvincentdude at yahoo.com> Jun 12, 2005
- 542 views
Haflidi Asgrimsson wrote:
> In all three cases I get nearly instant response, only when both lines:
> line = Kparse(o_line, TAB)
> table = append(table, line)
> then the CPU runs for 25 seconds.
> [...]

Look at all the appends here, with sequence subscripting and slicing operations. With append() and the &= shorthand operator, sequences grow dynamically (allocating more memory) whenever more data is pushed into them. With repeat() you can specify exactly how big you want the sequence to be (without dynamic growth and reallocation), and push data into the already allocated sequence elements using a loop.

Dynamic allocation is very useful in Euphoria, but maybe not in this case, where performance is the biggest issue. See if you can modify this code to use some repeats. That could help the Kparse routine perform quicker and more efficiently.

Regards,
Vincent

--
Without walls and fences, there is no need for Windows and Gates.
12. Re: Slow memory allocation
- Posted by Matt Lewis <matthewwalkerlewis at gmail.com> Jun 12, 2005
- 558 views
Haflidi Asgrimsson wrote:
> I tried this on two systems both running Windows XP Pro, the
> latter is 2GHz with 1.5 Gb memory, there it took 25 seconds. What
> I found most interesting is that append(), &= and assignment gave
> the same result, 25 seconds.
> table[i] = line is taking around 24 seconds of those.

This seems really odd. I just made a text file that's 64,886 lines of 10 random numbers with 11 characters each, tab delimited. I can't make the time go much beyond 1.5 seconds, and this is on a 2.4GHz Celeron with 512MB RAM running WinXP Home. Total memory usage is about 55 megs, which seems correct to me (the file's about 7 megs).

Is there something odd about the file? For one thing, you're adding extra refs and derefs to sequences by using the line variable. Cut that out and see if it makes any difference (it didn't on my machine). Also, I'd advise passing a sequence to kparse, since it just converts an integer delimiter into a sequence anyway, so you're wasting some cycles right there.

Can you post the source file (or one just like it, but with the data changed, if that's an issue)? Maybe there's something strange about the format of the data that's causing issues.

Anyway, here's my code that runs in about 1.5s on my machine. Replace "bigrand.txt" with your file, and let me know what happens. If you want, I can email you my file (it's about 3MB zipped).
include get.e

global function kparse(object s, object o)
    sequence clipbook, parsed_list
    atom ls, lc, clip

    if atom(s) then return s end if

    clipbook = {}
    parsed_list = {}
    ls = length(s)

    -- convert atom delimiter into a 1-element sequence
    if atom(o) then o = {o} end if

    -- bookmark the position of all delimiters in 1 pass
    for a = 1 to ls do
        if match({s[a]}, o) then
            clipbook &= a
        end if
    end for

    lc = length(clipbook)
    if lc = 0 then return {} end if

    -- find the text between the recorded delimiter positions to create a list.
    -- First check to see if the first bookmarked delimiter starts sequence s
    clip = clipbook[1]
    if clip = 1 then
        parsed_list = {{}}
    else
        parsed_list = {s[1..clip-1]}
    end if

    -- now we can process the rest of the sequence
    for ic = 2 to lc do
        if clip+1 = clipbook[ic] then
            parsed_list = append(parsed_list, {})
        else
            parsed_list = append(parsed_list, s[clip+1..clipbook[ic]-1])
        end if
        clip = clipbook[ic]
    end for

    -- test if end of s is past last delimiter
    if ls > clipbook[lc] then
        parsed_list = append(parsed_list, s[clipbook[lc]+1..ls])
    end if

    return parsed_list
end function

procedure main()
    atom t
    integer fn
    object in
    sequence table

    fn = open( "bigrand.txt", "r" )
    table = {}
    t = time()
    in = gets( fn )
    while sequence(in) do
        table = append( table, kparse( in, "\t" ) )
        in = gets( fn )
    end while
    printf( 1, "%gsec\n", time() - t )
    if wait_key() then
    end if
end procedure

main()
13. Re: Slow memory allocation
- Posted by Al Getz <Xaxo at aol.com> Jun 12, 2005
- 558 views
- Last edited Jun 13, 2005
Haflidi Asgrimsson wrote:
> In all three cases I get nearly instant response, only when both lines:
> line = Kparse(o_line, TAB)
> table = append(table, line)
> then the CPU runs for 25 seconds.
> [...]

Hi again,

It's usually better to load the entire file first and then parse later.

Take care,
Al

And, good luck with your Euphoria programming!

My bumper sticker: "I brake for LED's"
14. Re: Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 12, 2005
- 559 views
- Last edited Jun 13, 2005
As with all other interpreters, the more information you give, the more efficiently it runs the code. I think I've learned a valuable lesson here:

constant TAB = 9 is bad
constant TAB = '9' is OK
constant TAB = "9" is OK
15. Re: Slow memory allocation
- Posted by Matt Lewis <matthewwalkerlewis at gmail.com> Jun 12, 2005
- 555 views
- Last edited Jun 13, 2005
Haflidi Asgrimsson wrote:
> I think I've learned a valuable lesson here:
> constant TAB = 9 is bad
> constant TAB = '9' is OK
> constant TAB = "9" is OK

This is misleading. '9' != '\t' and "9" != "\t". The reason it's much faster is that you're getting many fewer delimiters (only where the character 9 is encountered).

Do you have other things running? How much *free* memory do you have? Maybe your memory is swapping out or something? There's something else going on here, because we're all getting very different results on relatively similar hardware.

Matt Lewis
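Matt's point is easy to confirm with character codes (shown here in Python for comparison, since the character codes are the same in both languages): the constant 9 is the tab character, while '9' is the digit character with code 57.

```python
# The three "TAB" constants from the posts above name different values;
# only 9 is actually the tab character's code.
tab_code = ord("\t")    # the tab character
digit_code = ord("9")   # the digit character '9'
```

So splitting on '9' or "9" only splits where a literal digit 9 appears, which is why it looked faster.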
16. Re: Slow memory allocation
- Posted by "Elliott S. de Andrade" <quantum_analyst at hotmail.com> Jun 12, 2005
- 590 views
- Last edited Jun 13, 2005
Haflidi Asgrimsson wrote:
> As with all other interpreters the more information you give the more
> efficiently it runs the code.
> I think I've learned a valuable lesson here:
> constant TAB = 9 is bad
> constant TAB = '9' is OK
> constant TAB = "9" is OK

Those last two are not correct. They just mean the number 9, not a TAB character. If you're splitting by 9's and not TAB's, then maybe you aren't splitting anything at all, thereby making it seem faster. The constants should be like this:

constant TAB = '\t'

or

constant TAB = "\t"

~[ WingZone ]~
http://wingzone.tripod.com/
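The point Matt and Elliott are making can be checked with a quick sketch. This is an illustrative Python example, not code from the thread: splitting a tab-delimited record on `'\t'` produces one field per column, while splitting on the character `'9'` barely splits at all, which is exactly why the wrong delimiter "looked" so much faster.

```python
# A tab-delimited record similar to the one in the thread.
record = "Clown Fish\t17\tmark\t120\tB\t405\t425"

by_tab = record.split("\t")   # splits at every TAB: one field per column
by_nine = record.split("9")   # splits only where the character '9' appears

print(len(by_tab))    # many fields
print(len(by_nine))   # almost none: '9' rarely occurs in this data
```

Fewer delimiters means fewer slices and fewer appends, so the run finishes quickly while silently returning the wrong result.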
17. Re: Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 12, 2005
- 573 views
- Last edited Jun 13, 2005
Except Kparse stops parsing and returns an empty list. So I wrote my own function, and it parses my file in 3 seconds:
function mySplit(string s_input, sequence s_char)
    sequence l_return, s_return
    integer n_start, n_stop
    atom a_char

    -- pick a masking character that differs from the delimiter
    if equal(s_char, "#") then
        a_char = '!'
    else
        a_char = '#'
    end if

    l_return = {}
    n_start = 1
    while TRUE do
        n_stop = match(s_char, s_input)
        if n_stop then
            s_input[n_stop] = a_char  -- mask the match so the next match() finds the next delimiter
            s_return = s_input[n_start..n_stop-1]
            l_return = append(l_return, s_return)
            n_start = n_stop+1
        else
            l_return = append(l_return, s_input[n_start..$])
            exit
        end if
    end while
    return l_return
end function
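The masking trick in mySplit exists because `match()` always searches from the start of the sequence, so the found delimiter has to be overwritten to keep the scan moving forward. A Python equivalent (an illustrative sketch, not thread code) needs no masking, since `str.find()` accepts a start offset:

```python
# Python sketch of mySplit's scan-and-cut loop. Instead of masking the
# matched delimiter, we pass a start offset to find(), which resumes the
# search past the previous match.
def my_split(s, delim):
    parts = []
    start = 0
    while True:
        stop = s.find(delim, start)   # resumable search; no masking needed
        if stop != -1:
            parts.append(s[start:stop])
            start = stop + len(delim)
        else:
            parts.append(s[start:])   # tail after the last delimiter
            return parts

fields = my_split("Clown Fish\t17\tmark\t120", "\t")
```

The overall shape is the same: cut off the field before each delimiter, append it, and emit whatever remains after the last one.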
Thank you all for trying to help.
18. Re: Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 12, 2005
- 577 views
- Last edited Jun 13, 2005
I made the big file in Excel by filling down the following line, 40000 lines:

a10_Clo10_BL26_ClFish_O_C251A_0_0_0_2003X04 Clown Fish 17 mark 120 B 405 425 404.83 425.86 71.0 51.0 FALSE -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0

The "\t" did not change anything, but of course this is right. But 9 works too. I wrote my own function that works, so I blame all this on the Kparse function.

Thank you
19. Re: Slow memory allocation
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Jun 13, 2005
- 569 views
On Sun, 12 Jun 2005 15:59:17 -0700, Haflidi Asgrimsson <guest at RapidEuphoria.com> wrote:

>I made the bigfile in Excel filling down the following line, 40000 lines

While I don't want to re-open that can of worms, I am reminded of a previous thread:
http://www.listfilter.com/cgi-bin/esearch.exu?thread=1&fromMonth=A&fromYear=9&toMonth=C&toYear=9&keywords=%22Dramatic+slowdown+-ping+Rob%22

>a10_Clo10_BL26_ClFish_O_C251A_0_0_0_2003X04 Clown Fish 17
>mark 120 B 405 425 404.83 425.86 71.0 51.0 FALSE -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0

..and in this unusual test case you are apparently allocating 40,000 strings of length 43. The program might well have run fine on real data; it could just have been that particular test set.

>I wrote my own function that works so I blame all this on Kparse function.

Perhaps not the kindest words ever written, but I'm glad you resolved it. I imagine it is a programmer's lot to occasionally run into such problems, and they probably occur whatever language we code in.

Regards,
Pete
20. Re: Slow memory allocation
- Posted by Haflidi Asgrimsson <haflidi at prokaria.com> Jun 13, 2005
- 561 views
Pete Lomax wrote:
>
> On Sun, 12 Jun 2005 15:59:17 -0700, Haflidi Asgrimsson
> <guest at RapidEuphoria.com> wrote:
>
> >I made the bigfile in Excel filling down the following line, 40000 lines
>
> While I don't want to re-open that can of worms, I am reminded of a
> previous thread:
> http://www.listfilter.com/cgi-bin/esearch.exu?thread=1&fromMonth=A&fromYear=9&toMonth=C&toYear=9&keywords=%22Dramatic+slowdown+-ping+Rob%22
>
> >a10_Clo10_BL26_ClFish_O_C251A_0_0_0_2003X04 Clown Fish 17
> >mark 120 B 405 425 404.83 425.86 71.0 51.0 FALSE -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0
>
> ..and in this unusual test case you are apparently allocating 40,000
> strings of length 43. The program might well have run fine on real
> data, it could just have been that particular test set.
>
> >I wrote my own function that works so I blame all this on Kparse function.
>
> Perhaps not the kindest words ever written, but I'm glad you resolved
> it. I imagine it is a programmer's lot to occasionally run into such
> problems, and they probably occur whatever language we code in.
>
> Regards,
> Pete

Sorry, it was meant as a joke at my own expense. When one is stuck in one's own code, the last resort is to blame someone else's.

Actually my solution wasn't good enough, so I'm using the Kparse function without reading the whole file into memory, just one line at a time, and it works fine. I'm lacking insight into the Euphoria interpreter, so I found this case interesting. And I got a lot of hints from you all, thank you!

And I repeat, I'm very sorry if I sounded rude!