OpenEuphoria: Forum: getc( ) and gets( ) speed

1. getc( ) and gets( ) speed

Posted by "Euman" <euman at bellsouth.net> Dec 14, 2003
560 views

Hello,                                                  
                                                        
Can the 1st example be made any faster than the second? 
                                                        
atom t, t1, t2, fn, lFileLen                            
integer lWS, lWE                                        
sequence lFile, line                                    
 
-- Example 1                                                       
--------------------------------------------------------
lFileLen = seek(fn, -1)                                 
lFileLen = where(fn)                                    
if seek(fn, 0) then end if                              
lFile = repeat(0, lFileLen + 1)                         
lFile[lFileLen+1] = crlf                                
                                                        
t1 = time()                                             
for i =1 to lFileLen do                                 
    lFile[i] = getc(fn)                                 
end for                                                 
                                                        
while lWE < lFileLen do
   line = lFile[lWS..lWE]
   lWS = lWE + 1
   lWE = lWS + 1
end while
t2 = time()- t1                                         
--------------------------------------------------------
 
-- Example 2                                                                    
--------------------------------------------------------
t1 = time()                                             
for i = 1 to val[2]  do                                 
    line = gets(fn)                                     
    if atom(line) then                                  
       exit                                             
    end if                                              
    line = line[1..200]                                 
end for                                                 
t2 = time()- t1                                         
--------------------------------------------------------


2. Re: getc( ) and gets( ) speed

Posted by "Derek Parnell" <ddparnell at bigpond.com> Dec 15, 2003
548 views

----- Original Message ----- 
From: "Euman" <euman at bellsouth.net>
To: <EUforum at topica.com>
Subject: getc( ) and gets( ) speed


> 
> 
> Hello,                                                  
>                                                         
> Can the 1st example be made any faster than the second? 

You do realize that they are not equivalent processes, don't you? What is it you
are trying to achieve?

The first one doesn't actually work 'cos lWS and lWE are not initialized; and
even if they were, what is the ...
   lWS = lWE + 1                                        
   lWE = lWS + 1                                        
trying to do? Did you mean this instead...

   lWS = lWE + 1                                        
   lWE = lWS + 199

The second just grabs the first 200 chars of each line in the file.

If you are trying to break a file into 200 byte chunks, you could try this ...


atom t, t1, t2, fn, lFileLen
integer lChunks
sequence lFile, line

-- Example 1
--------------------------------------------------------
lFileLen = seek(fn, -1)
lFileLen = where(fn)
if seek(fn, 0) then end if
lChunks = floor((lFileLen + 1) / 200) + 1
lFile = repeat(repeat(0, 200), lChunks)

t1 = time()
for i = 1 to lChunks do
    for j = 1 to 200 do
        lFile[i][j] = getc(fn)   -- getc() returns -1 once the end of file is reached
    end for
end for

t2 = time() - t1
--------------------------------------------------------




3. Re: getc( ) and gets( ) speed

Posted by "Euman" <euman at bellsouth.net> Dec 15, 2003
539 views

----- Original Message ----- 
From: "Derek Parnell" <ddparnell at bigpond.com>
To: <EUforum at topica.com>
Subject: Re: getc( ) and gets( ) speed


> > Hello,                                                  
> >                                                         
> > Can the 1st example be made any faster than the second? 
> 
> You do realize that they are not equivalent processes, don't you? What is it
> you are trying to achieve?
> 
> The first one doesn't actually work 'cos lWS and lWE are not initialized; and
> even if they were, what is the ...
>    lWS = lWE + 1                                        
>    lWE = lWS + 1                                        
> trying to do? Did you mean this instead...
> 
>    lWS = lWE + 1                                        
>    lWE = lWS + 199

Yes.

> The second just grabs the first 200 chars of each line in the file.
> 
> If you are trying to break a file into 200 byte chunks, you could try this ...
> 
> 
> [code snipped]

What about the 13,10 (crlf) at each line ending?
If I have 5000 lines with 201 bytes and I need
only 200, I would need to bypass the crlf on each
line.

This shoots holes in the theory that using getc( )
is faster than gets( ) for most things.

Euman
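
A minimal sketch of the gets() route for the case described above, assuming a text file in which every line carries at least 200 data bytes followed by a line ending. The helper strip_eol() is hypothetical and only illustrates the idea; "trmdemo.txt" is the file name used elsewhere in this thread.

```euphoria
-- Sketch: read each line with gets() and drop the line ending,
-- so the CR LF never has to be skipped by hand.
-- strip_eol() is a hypothetical helper, not a library routine.

function strip_eol(sequence s)
    -- remove any trailing LF / CR characters
    while length(s) > 0 and find(s[length(s)], "\r\n") do
        s = s[1..length(s)-1]
    end while
    return s
end function

integer fn
object line
sequence rec

fn = open("trmdemo.txt", "r")
if fn = -1 then
    puts(1, "cannot open file\n")
else
    line = gets(fn)
    while sequence(line) do        -- gets() returns the atom -1 at end of file
        rec = strip_eol(line)
        if length(rec) > 200 then
            rec = rec[1..200]      -- keep only the first 200 bytes
        end if
        -- ... use rec here ...
        line = gets(fn)
    end while
    close(fn)
end if
```

The length check before slicing avoids a subscript error on lines shorter than 200 bytes, which the plain line[1..200] form would hit.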


4. Re: getc( ) and gets( ) speed

Posted by "Euman" <euman at bellsouth.net> Dec 15, 2003
536 views

Here you go Derek...

This is more correct in the way it should read the ascii file...

-------------
include file.e

with trace

atom t, t1, t2, fn, lFileLen
integer lWS, lWE, crlf, line_len
sequence lFile, line

crlf = 10
lWS = 1
lWE = 201

--trace(1)

fn = open("trmdemo.txt", "r")
--------------------------------------------------------
lFileLen = seek(fn, -1)
lFileLen = where(fn)
if seek(fn, 0) then end if
lFile = repeat(0, lFileLen + 1)
lFile[lFileLen+1] = crlf

t1 = time()
for i =1 to lFileLen do
    lFile[i] = getc(fn)
end for

while lWE < lFileLen do
   line = lFile[lWS..lWE-1]
   line_len = length(line)
   lWS = lWE + 1
   lWE = lWS + 200
end while
t2 = time()- t1

printf(1,"Average Time : %1.6f sec\n", t2 )

--------------------------------------------------------
close(fn)

fn = open("trmdemo.txt", "r")
--------------------------------------------------------
t1 = time()
for i = 1 to 5000  do
    line = gets(fn)
    if atom(line) then
       exit
    end if
    line = line[1..200]
end for
t2 = time()- t1
--------------------------------------------------------
close(fn)

printf(1,"Average Time : %1.6f sec\n", t2 )

if getc(0) then end if
---------




5. Re: getc( ) and gets( ) speed

Posted by "Derek Parnell" <ddparnell at bigpond.com> Dec 15, 2003
535 views

----- Original Message ----- 
From: "Euman" <euman at bellsouth.net>
To: <EUforum at topica.com>
Subject: Re: getc( ) and gets( ) speed


> 
> 
> Here you go Derek...
> 
> This is more correct in the way it should read the ascii file...

I still don't get it. The two examples are doing different things, so why are
you comparing their speeds? Are they meant to be doing the same thing?

The first example reads the whole file into RAM then breaks it up into 200 byte
chunks.

The second example reads the first 5000 lines and extracts the first 200 bytes of
each line.

Clearly not the same thing.

What are you TRYING to do?



6. Re: getc( ) and gets( ) speed

Posted by "Derek Parnell" <ddparnell at bigpond.com> Dec 15, 2003
522 views

----- Original Message ----- 
From: "Euman" <euman at bellsouth.net>
To: <EUforum at topica.com>
Subject: Re: getc( ) and gets( ) speed

[snip]
> 
> what about 13,10 (crlf) from each line ending
> if I have 5000 lines with 201 bytes and I need
> only 200,  I would need to bypass the crlf on each
> line..
> 
> This shoot holes in the theory that using getc( )
> is faster than gets( ) for most things.
> 

Never heard of that theory.

But think about it: you need to call getc() for each and every character inside
some form of (slow) Euphoria loop, but gets() only needs to be called once for each
line (multiple characters per call), so in many circumstances gets() will be
faster than getc(). However, gets() only works for text files, it makes a mess of
binary files.

-- 
Derek
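
A minimal sketch of how the two approaches could be timed side by side, assuming a test file named "test.txt" (a placeholder). The function names count_getc() and count_gets() are made up for this illustration. Both simply count the bytes they read, so the only difference is one call per byte versus one call per line.

```euphoria
-- Sketch: time a getc() loop against a gets() loop on the same file.
-- Both functions return the number of bytes read, or -1 if the file
-- cannot be opened. "test.txt" is a placeholder file name.

function count_getc(sequence name)
    integer fn, c
    atom n
    fn = open(name, "rb")
    if fn = -1 then
        return -1
    end if
    n = 0
    c = getc(fn)              -- one interpreter-level call per byte
    while c != -1 do
        n += 1
        c = getc(fn)
    end while
    close(fn)
    return n
end function

function count_gets(sequence name)
    integer fn
    object line
    atom n
    fn = open(name, "rb")
    if fn = -1 then
        return -1
    end if
    n = 0
    line = gets(fn)           -- one call per line
    while sequence(line) do
        n += length(line)
        line = gets(fn)
    end while
    close(fn)
    return n
end function

atom t, n

t = time()
n = count_getc("test.txt")
printf(1, "getc(): %d bytes in %.2f sec\n", {n, time() - t})

t = time()
n = count_gets("test.txt")
printf(1, "gets(): %d bytes in %.2f sec\n", {n, time() - t})
```

Because the second pass reads a file that the first pass has probably pulled into the OS cache, it may be worth running the program more than once or swapping the order before drawing conclusions.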


7. Re: getc( ) and gets( ) speed

Posted by "Euman" <euman at bellsouth.net> Dec 15, 2003
521 views

Hi Derek,

How about this example? Perhaps there is a way to incorporate it
into win32lib for reading ASCII files where each line is identical
in length; not much would need to change except the line length
count, with 2 added to that variable for the CR LF.

BTW, this example is faster than using getc( ) and gets( ).

If you need the code because of line wrap, let me know...

-- Co-Authors H.W Overman and Tommy Carlier
-- Dec 15, 2003

include machine.e
include file.e
include dll.e

atom t1, t2, fn, lFileLen, result
sequence lFile, line

atom kernel32
kernel32 = open_dll("kernel32.dll")

constant
xCreateFile = define_c_func(kernel32, "CreateFileA",
              {C_POINTER,C_LONG,C_LONG,C_POINTER,C_LONG,C_LONG,C_INT}, C_LONG),
xReadFile   = define_c_func(kernel32, "ReadFile",
              {C_INT,C_POINTER,C_UINT,C_POINTER,C_POINTER}, C_LONG)

global constant 
  GENERIC_READ             = #80000000,
  FILE_ATTRIBUTE_NORMAL    = #80,
  FILE_FLAG_SEQUENTIAL_SCAN= #8000000,
  OPEN_EXISTING            = 3

atom hFile
  
global function CreateFile(sequence fname)
atom FileName 
     FileName = allocate_string(fname)
     hFile = c_func(xCreateFile,{FileName,
                                GENERIC_READ,
                                0,
                                NULL,
                                OPEN_EXISTING,
                                FILE_ATTRIBUTE_NORMAL+FILE_FLAG_SEQUENTIAL_SCAN,
                                NULL})
    return hFile
end function

atom lpNumberOfBytesRead
lpNumberOfBytesRead = allocate(4) --lpNumberOfBytesRead

function ReadFile(atom hFile, atom lpBuffer, atom nNumberOfBytesToRead)
return
  c_func(xReadFile,{hFile,lpBuffer,nNumberOfBytesToRead,lpNumberOfBytesRead,0})
end function

fn = open("trmdemo.txt", "r")
lFileLen = seek(fn, -1)                                 
lFileLen = where(fn)
close(fn)

atom lpFileBuff, cIN, cIndex, cMaxIndex
lpFileBuff = allocate(lFileLen)
cIN = 0

hFile = CreateFile("trmdemo.txt")
result = ReadFile(hFile, lpFileBuff, lFileLen) 

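if result then
   -- note: the ReadFile() call above is outside the timed region
   t1 = time()
   cMaxIndex = lpFileBuff + lFileLen
   cIndex = lpFileBuff
   while cIndex + 200 <= cMaxIndex do   -- never peek past the end of the buffer
      line = peek({cIndex, 200})
      cIndex += 202                     -- 200 data bytes plus CR LF
   end while
   t2 = time() - t1
   printf(1, "Average Time : %1.4f sec\n", t2)
else
   puts(1, "ReadFile failed\n")         -- t2 was never assigned in this case
end if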


if getc(0) then end if

-- Euman 2003


8. Re: getc( ) and gets( ) speed

Posted by "Kat" <gertie at visionsix.com> Dec 15, 2003
521 views

On 16 Dec 2003, at 8:44, Derek Parnell wrote:

> 
> 
> ----- Original Message ----- 
> From: "Euman" <euman at bellsouth.net>
> To: <EUforum at topica.com>
> Subject: Re: getc( ) and gets( ) speed
> 
> 
> [snip]
> > 
> > what about 13,10 (crlf) from each line ending
> > if I have 5000 lines with 201 bytes and I need
> > only 200,  I would need to bypass the crlf on each
> > line..
> > 
> > This shoot holes in the theory that using getc( )
> > is faster than gets( ) for most things.
> > 
> 
> Never heard of that theory.
> 
> But think about it, you need to call getc() for each and every character
> inside some form of (slow) Euphoria loop; but gets() only needs to be called
> for each line (multiple characters per time), so in many circumstances gets()
> will be faster than getc(). However, gets() only works for text files, it
> makes a mess of binary files.

Which makes it even more sensible to use fget() (or something like that) to grab
the whole file from the OS into a sequence, glom the whole thing at once.
Then if you want it broken into strings, parse() it on the line terminator. If
it's a bitmap, blast it to the screen buffer (in Pascal, I'd let DOS take it from
the drive directly to the screen buffer).

Kat
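
A rough sketch of the "grab it all, then split" idea, using only routines from the standard get.e and file.e. The names read_whole_file() and split_lines() are hypothetical stand-ins for the fget() and parse() mentioned above, and "test.txt" is a placeholder file name.

```euphoria
-- Sketch: read an entire file into one sequence, then split it on '\n'.
-- read_whole_file() and split_lines() are made-up helper names.

include get.e    -- get_bytes()
include file.e   -- seek(), where()

function read_whole_file(sequence name)
    -- returns a sequence of bytes, or -1 on failure
    integer fn
    atom size
    sequence data
    fn = open(name, "rb")
    if fn = -1 then
        return -1
    end if
    if seek(fn, -1) then      -- seek to the end; non-zero means failure
        close(fn)
        return -1
    end if
    size = where(fn)          -- position at end of file = file size
    if seek(fn, 0) then
        close(fn)
        return -1
    end if
    data = get_bytes(fn, size)   -- assumes the size fits a Euphoria integer
    close(fn)
    return data
end function

function split_lines(sequence text)
    -- cut text at each '\n'; a final line without '\n' is kept as well
    sequence lines
    integer start
    lines = {}
    start = 1
    for i = 1 to length(text) do
        if text[i] = '\n' then
            lines = append(lines, text[start..i-1])
            start = i + 1
        end if
    end for
    if start <= length(text) then
        lines = append(lines, text[start..length(text)])
    end if
    return lines
end function

object data
sequence lines

data = read_whole_file("test.txt")
if sequence(data) then
    lines = split_lines(data)
    printf(1, "%d lines\n", length(lines))
else
    puts(1, "cannot read file\n")
end if
```

Since the file is opened in binary mode, lines from a CR LF file keep a trailing '\r' after the split; whether that matters depends on what is done with them next.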


9. Re: getc( ) and gets( ) speed

Posted by "Kat" <gertie at visionsix.com> Dec 15, 2003
518 views

On 16 Dec 2003, at 8:44, Derek Parnell wrote:

> 
> 
> ----- Original Message ----- 
> From: "Euman" <euman at bellsouth.net>
> To: <EUforum at topica.com>
> Subject: Re: getc( ) and gets( ) speed
> 
> 
> [snip]
> > 
> > what about 13,10 (crlf) from each line ending
> > if I have 5000 lines with 201 bytes and I need
> > only 200,  I would need to bypass the crlf on each
> > line..
> > 
> > This shoot holes in the theory that using getc( )
> > is faster than gets( ) for most things.
> > 
> 
> Never heard of that theory.
> 
> But think about it, you need to call getc() for each and every character
> inside some form of (slow) Euphoria loop; but gets() only needs to be called
> for each line (multiple characters per time), so in many circumstances gets()
> will be faster than getc(). However, gets() only works for text files, it
> makes a mess of binary files.

Can 
FileContents = fgetb(filehandle,1,SizeOfFile) 
get the entire file without using getc() behind the scene?

Kat


10. Re: getc( ) and gets( ) speed

Posted by "Juergen Luethje" <j.lue at gmx.de> Dec 16, 2003
512 views

Kat wrote:

> On 16 Dec 2003, at 8:44, Derek Parnell wrote:

<snip>

>> But think about it, you need to call getc() for each and every character
>> inside some form of (slow) Euphoria loop; but gets() only needs to be called
>> for each line (multiple characters per time), so in many circumstances
>> gets() will be faster than getc(). However, gets() only works for text
>> files, it makes a mess of binary files.
>
> Which makes even more sense to use fget() (or something like that) to grab
> the whole file from the OS into a sequence, glom the whole thing at once.

<snip>

Yes.
Once upon a time I thought that reading just a number of bytes from
a file should be faster than reading the file in text mode, because text
mode needs some interpretation. Then I realised that the following
little program

-----------=-----------=------------=------------=----------=-----------
global function read_text_file (sequence filename)
   integer fn
   sequence content
   object line

   fn = open(filename, "r")
   if fn = -1 then
      return -1
   end if

   content = ""
   line = gets(fn)
   while sequence(line) do
      content &= line
      line = gets(fn)
   end while
   close(fn)
   return content
end function
-----------=-----------=------------=------------=----------=-----------

is on my PC almost twice(!) as fast as doing the same, using get_bytes()
instead of gets(). That's strange IMHO.

Regards,
   Juergen

-- 
 /"\  ASCII ribbon campain  |    |\      _,,,---,,_
 \ /  against HTML in       |    /,`.-'`'    -.  ;-;;,_
  X   e-mail and news,      |   |,4-  ) )-,_..;\ (  `'-'
 / \  and unneeded MIME     |  '---''(_/--'  `-'\_)


11. Re: getc( ) and gets( ) speed

Posted by Robert Craig <rds at RapidEuphoria.com> Dec 16, 2003
534 views

Derek Parnell wrote:
 > However, gets() only works for text files, it makes a
 > mess of binary files.

gets() works fine on any kind of file.
It won't make a mess. You can use it to copy any file.
If the '\n' characters happen to be few and
far between, then the length of the "line"
that you read with gets() could be unpredictable and
very large. gets() is currently coded to run fast with
lines less than 1040 characters, but it will handle
lines of any length (as long as there's enough memory).

Regards,
    Rob Craig
    Rapid Deployment Software
    http://www.RapidEuphoria.com
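
A minimal sketch of the file-copy use mentioned above, assuming both files are opened in binary mode so that no bytes are translated. The function name copy_with_gets() and the file names are hypothetical.

```euphoria
-- Sketch: copy any file using gets(), one '\n'-terminated chunk at a time.
-- copy_with_gets() and the file names below are made up for illustration.

function copy_with_gets(sequence src, sequence dst)
    -- returns 1 on success, 0 if either file cannot be opened
    integer fin, fout
    object chunk
    fin = open(src, "rb")
    if fin = -1 then
        return 0
    end if
    fout = open(dst, "wb")
    if fout = -1 then
        close(fin)
        return 0
    end if
    chunk = gets(fin)
    while sequence(chunk) do     -- the atom -1 marks end of file
        puts(fout, chunk)        -- each chunk includes its '\n', if any
        chunk = gets(fin)
    end while
    close(fin)
    close(fout)
    return 1
end function

if copy_with_gets("input.dat", "output.dat") then
    puts(1, "copied\n")
else
    puts(1, "copy failed\n")
end if
```

On a file with few '\n' bytes the chunks can become very long, which is the unpredictable "line" length described above.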


12. Re: getc( ) and gets( ) speed

Posted by "Derek Parnell" <ddparnell at bigpond.com> Dec 16, 2003
509 views

----- Original Message ----- 
From: "Robert Craig" <rds at RapidEuphoria.com>
To: <EUforum at topica.com>
Subject: Re: getc( ) and gets( ) speed


> 
> 
> Derek Parnell wrote:
>  > However, gets() only works for text files, it makes a
>  > mess of binary files.
> 
> gets() works fine on any kind of file.
> It won't make a mess. You can use it to copy any file.
> If the '\n' characters happen to be few and
> far between, then the length of the "line"
> that you read with gets() could be unpredictable and
> very large. gets() is currently coded to run fast with
> lines less than 1040 characters, but it will handle
> lines of any length (as long as there's enough memory).
> 

Of course you are correct. I was getting mixed up with opening a binary file in
text mode. Now *that* is a bad idea.

-- 
Derek
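
A small sketch of why text mode is risky for binary data, assuming a DOS or Windows system, where text mode is documented to translate CR LF into LF on input and to treat Ctrl-Z (byte 26) as end of file. The scratch file name and the helper count_bytes() are hypothetical.

```euphoria
-- Sketch: write six known bytes, then count how many come back
-- in text mode ("r") and in binary mode ("rb").
-- NAME and count_bytes() are made up for this illustration.

constant NAME = "modetest.bin"

function count_bytes(sequence name, sequence mode)
    -- returns the number of bytes getc() delivers, or -1 on open failure
    integer fn, c
    atom n
    fn = open(name, mode)
    if fn = -1 then
        return -1
    end if
    n = 0
    c = getc(fn)
    while c != -1 do
        n += 1
        c = getc(fn)
    end while
    close(fn)
    return n
end function

integer fn

fn = open(NAME, "wb")
if fn = -1 then
    puts(1, "cannot create scratch file\n")
else
    puts(fn, {65, 13, 10, 66, 26, 67})   -- 'A', CR, LF, 'B', Ctrl-Z, 'C'
    close(fn)
    printf(1, "text mode   (\"r\"):  %d bytes\n", count_bytes(NAME, "r"))
    printf(1, "binary mode (\"rb\"): %d bytes\n", count_bytes(NAME, "rb"))
end if
```

If the two counts differ, the text-mode read has dropped or cut off data that a binary file would need.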


13. Re: getc( ) and gets( ) speed

Posted by "Juergen Luethje" <j.lue at gmx.de> Jan 03, 2004
513 views

Euman wrote:

> ----- Original Message ----- 
> From: "Juergen Luethje"
>
>> Tommy wrote:

<snip>

>>> Robert, couldn't you consider making get_bytes a
>>> low-level, builtin routine to dramatically improve its performance? I
>>> think get_bytes is a really standard routine that is probably quite
>>> frequently used.
>>
>> I would appreciate that _very_ much.
>
> If you're using windows, try this:

Yes, at the moment, I need it for Windows, so your code is very useful
for me. Thank you!
However, this Windows API stuff is neither simple nor cross-platform.
So I agree with Tommy, that get_bytes() is really a standard routine,
which should be considerably faster.

[code snipped]

> This is FAST....be careful, I think Rob C. says gets( ) reads 1024 bytes
> at any one time. This routine can read as many as you supply, you could get
> into trouble or have a slow routine if virtual memory is needed to store
> the data.

Thanks for the warning. My program reads chunks of 32 KB each, so this
shouldn't be a problem.

I wrote a small program that compares the speed of gets(), get_bytes(),
and your Windows API code. On my Pentium 2, 400 MHz, 64 MB RAM (under
Win 98/1st ed.), the results for reading a 10 MB text file, consisting
of 150000 lines are as follows:
1.13 sec. using gets()                -- 2 times faster than get_bytes()
2.25 sec. using get_bytes()
0.70 sec. using Windows API routines  -- 3 times faster than get_bytes()


--====================================================================--
include file.e
include get.e
include dll.e
include machine.e

constant MAX_CHUNK = 32*1024    -- 32 KB

-----------=-----------=------------=------------=----------=-----------

global function read_file1 (sequence fileName)
   object buffer
   integer fn

   fn = open(fileName, "r")
   if fn = -1 then
      return -1        -- error
   end if

   buffer = gets(fn)
   while sequence(buffer) do
      --** do something with the stuff in the buffer
      buffer = gets(fn)
   end while

   close(fn)
   return 0            -- success
end function

-----------=-----------=------------=------------=----------=-----------

global function read_file2 (sequence fileName, atom fileSize)
   sequence buffer
   atom remaining
   integer fn, buffSize

   fn = open(fileName, "r")
   if fn = -1 then
      return -1        -- error
   end if

   buffSize = MAX_CHUNK
   remaining = fileSize
   while remaining > 0 do
      if remaining < buffSize then
         buffSize = remaining
      end if
      buffer = get_bytes(fn, buffSize)
      --** do something with the stuff in the buffer
      remaining -= buffSize
   end while

   close(fn)
   return 0            -- success
end function

-----------=-----------=------------=------------=----------=-----------
-- Euman's API code

constant
   kernel32 = open_dll("kernel32.dll"),
   xCreateFile = define_c_func(kernel32,"CreateFileA",{C_POINTER,C_LONG,
                               C_LONG,C_POINTER,C_LONG,C_LONG,C_INT},
                               C_LONG),
   xReadFile = define_c_func(kernel32,"ReadFile",{C_INT,C_POINTER,C_UINT,
                               C_POINTER,C_POINTER},C_LONG),
   xCloseHandle = define_c_func(kernel32,"CloseHandle",{C_LONG},C_LONG)

constant
   GENERIC_READ              = #80000000,
   FILE_ATTRIBUTE_NORMAL     = #80,
   FILE_FLAG_SEQUENTIAL_SCAN = #8000000,
   OPEN_EXISTING             = 3

function OpenFile_rb (sequence fname)
   atom handle, FileName

   FileName = allocate_string(fname)
   handle = c_func(xCreateFile,{FileName,
                               GENERIC_READ,
                               0,
                               NULL,
                               OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL+FILE_FLAG_SEQUENTIAL_SCAN,
                               NULL})
   free(FileName)      -- release the temporary C string
   return handle
end function

atom lpNumberOfBytesRead       -- actual No. of bytes read by routine

function ReadFile (atom hFile, atom lpBuffer, atom nNumberOfBytesToRead)
return
   c_func(xReadFile,{hFile,lpBuffer,nNumberOfBytesToRead,lpNumberOfBytesRead,0})
end function


global function read_file3 (sequence fileName, atom fileSize)
   sequence buffer
   atom lpBuffer, remaining, fn   -- fn is a Win32 handle; it may not fit a Euphoria integer
   integer buffSize, void

   fn = OpenFile_rb(fileName)
   if fn = -1 then
      return -1        -- error
   end if

   buffSize = MAX_CHUNK
   lpBuffer = allocate(buffSize)
   lpNumberOfBytesRead = allocate(4)
   remaining = fileSize
   while remaining > 0 do
      if remaining < buffSize then
         buffSize = remaining
      end if
      void = ReadFile(fn, lpBuffer, buffSize)
      buffer = peek({lpBuffer, buffSize})
      --** do something with the stuff in the buffer
      remaining -= buffSize
   end while

   free(lpBuffer)
   free(lpNumberOfBytesRead)
   void = c_func(xCloseHandle, {fn})
   return 0            -- success
end function

-----------=-----------=------------=------------=----------=-----------

-- Compare speed of the 3 functions.

procedure wait_abort (sequence msg, integer code)
   puts(1, msg & "\n\nPress any key ...")
   if wait_key() then end if
   abort(code)
end procedure

object temp
sequence file
atom fileSize, t1, t2, t3
integer err

file = "test.txt"      -- Text file

temp = dir(file)
if atom(temp) then
   wait_abort("File '" & file & "' not found.", 1)
end if
fileSize = temp[1][D_SIZE]
printf(1, "Results of reading '%s' (%.2f MB):\n",
          {file, fileSize/(1024*1024)})

t1 = time()
err = read_file1(file)
t1 = time()-t1
if err != 0 then
   wait_abort("\n\nerror using gets().", 1)
end if
printf(1, "   %.2f sec. using gets()\n", {t1})

t2 = time()
err = read_file2(file, fileSize)
t2 = time()-t2
if err != 0 then
   wait_abort("\n\nerror using get_bytes().", 1)
end if
printf(1, "   %.2f sec. using get_bytes()\n", {t2})

t3 = time()
err = read_file3(file, fileSize)
t3 = time()-t3
if err != 0 then
   wait_abort("\n\nerror using Windows API routines.", 1)
end if
printf(1, "   %.2f sec. using Windows API routines", {t3})

wait_abort("", 0)
--====================================================================--


Regards,
   Juergen

-- 
 /"\  ASCII ribbon campain  |  This message has been ROT-13 encrypted
 \ /  against HTML in       |  twice for higher security.
  X   e-mail and news,      |
 / \  and unneeded MIME     |  http://home.arcor.de/luethje/prog/


14. Re: getc( ) and gets( ) speed

Posted by "Juergen Luethje" <j.lue at gmx.de> Jan 03, 2004
522 views

Me wrote:

<big snip>

> I wrote a small program that compares the speed of gets(), get_bytes(),
> and your Windows API code.

I just realized that in the functions read_file1() and read_file2(),
it should be
   fn = open(fileName, "rb")
rather than
   fn = open(fileName, "r").

> On my Pentium 2, 400 MHz, 64 MB RAM (under
> Win 98/1st ed.), the results for reading a 10 MB text file, consisting
> of 150000 lines are as follows:
> 1.13 sec. using gets()                -- 2 times faster than get_bytes()
> 2.25 sec. using get_bytes()
> 0.70 sec. using Windows API routines  -- 3 times faster than get_bytes()

The speed of the functions remains unchanged.

<big snip>

Regards,
   Juergen
