Re: Speed question

Here is a fast way of reading a file. I extracted it from the forum not
long ago, and you may be able to use it to your advantage.
One way would be to read the whole file, then sort your data.
Another way would be to sort the data as each chunk is read.

Example:
    object x    -- an object, because read() returns -1 on error
    x = read(fileName, 32)

read() takes 2 arguments: the file name and a buffer size in kilobytes.

* read() only reads a whole file. I do not understand what you are
trying to seek in the line
fp=seek(fn,0)
--====================================================================--
-- This section is used by the global function below.

include win32lib.ew

constant
    kernel32 = open_dll("kernel32.dll"),
    xCreateFile = define_c_func(kernel32,"CreateFileA",{C_POINTER,C_LONG,
           C_LONG,C_POINTER,C_LONG,C_LONG,C_INT},
           C_LONG),
    xReadFile = define_c_func(kernel32,"ReadFile",{C_INT,C_POINTER,C_UINT,
           C_POINTER,C_POINTER},C_LONG),
    xCloseHandle = define_c_func(kernel32,"CloseHandle",{C_LONG},C_LONG)

constant
    GENERIC_READ              = #80000000,
    FILE_ATTRIBUTE_NORMAL     = #80,
    FILE_FLAG_SEQUENTIAL_SCAN = #8000000,
    OPEN_EXISTING             = 3

function OpenFile_rb(sequence fname)
    atom handle, FileName

    FileName = allocate_string(fname)
    handle = c_func(xCreateFile,{FileName,
           GENERIC_READ,
           0,
           NULL,
           OPEN_EXISTING,
           FILE_ATTRIBUTE_NORMAL+FILE_FLAG_SEQUENTIAL_SCAN,
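           NULL})
    free(FileName)      -- release the C string allocated above
    return handle       -- -1 (INVALID_HANDLE_VALUE) if the open failed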
end function

atom lpNumberOfBytesRead       -- actual No. of bytes read by routine

function ReadFile(atom hFile, atom lpBuffer, atom nNumberOfBytesToRead)
    return c_func(xReadFile,{hFile,lpBuffer,nNumberOfBytesToRead,lpNumberOfBytesRead,0})
end function
--====================================================================--
--This is a very fast way of reading a file

global function read(sequence fileName, integer KbChunks)
    sequence buffer, data
    atom lpBuffer, remaining, fileSize
    atom fn, ok, got        -- atoms: handles and byte counts may not fit in an integer
    integer buffSize
    object temp

    temp = dir(fileName)
    if atom(temp) then
        return -1       -- error: file not found
    end if
    fileSize = temp[1][D_SIZE]

    fn = OpenFile_rb(fileName)
    if fn = -1 then
        return -1       -- error: could not open the file
    end if

    data = {}

    buffSize = KbChunks * 1024
    lpBuffer = allocate(buffSize)
    lpNumberOfBytesRead = allocate(4)
    remaining = fileSize
    while remaining > 0 do
        if remaining < buffSize then
            buffSize = remaining
        end if
        ok = ReadFile(fn, lpBuffer, buffSize)
        got = peek4u(lpNumberOfBytesRead)   -- bytes actually read
        if ok = 0 or got = 0 then
            exit        -- read failed or the file ended early; return what we have
        end if
        buffer = peek({lpBuffer, got})
        -- you can process the read data here before appending it to 'data'
        data &= buffer
        remaining -= got
    end while

    free(lpBuffer)
    free(lpNumberOfBytesRead)
    ok = c_func(xCloseHandle, {fn})
    -- drop a trailing newline, if any (the length check guards an empty file)
    if length(data) > 0 and data[length(data)] = '\n' then
        data = data[1..length(data) - 1]
    end if
    -- or you can process 'data' here before returning it.
    return data            -- success
end function
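
Once read() has returned the file as one flat sequence of bytes, it still has to be cut into words before word_array[4][6] means anything. Below is a minimal sketch of that step. It assumes the words are stored back to back with no separators and are all the same length; split_words() is a hypothetical helper of mine, not part of the code above or of any library.

```euphoria
-- Hypothetical helper: split a flat sequence into words of wordLen characters.
-- Assumes fixed-length words with no separators between them.
-- Any leftover characters at the end (fewer than wordLen) are ignored.
function split_words(sequence data, integer wordLen)
    sequence words
    integer count

    count = floor(length(data) / wordLen)
    words = repeat(0, count)    -- preallocate rather than growing with &=
    for i = 1 to count do
        words[i] = data[(i-1)*wordLen + 1 .. i*wordLen]
    end for
    return words
end function

sequence word_array
word_array = split_words("ALPHABETBASELINECOMPUTERDATABASE", 8)
-- word_array[4] is "DATABASE", so word_array[4][6] is 'A'
```

With the real file you would pass the result of read() instead of the literal string, after checking that it is a sequence and not -1.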


----- Original Message -----
From: "Kat" <gertie at visionsix.com>
To: <EUforum at topica.com>
Sent: Wednesday, January 21, 2004 3:33 AM
Subject: Re: Speed question




On 19 Jan 2004, at 20:08, Allen Robnett wrote:

> 
> 
> After opening a Euphoria text file "r", I am reading in one million 
> 8-character words, (the entire file).
> 
> clear_screen() 
> fp=seek(fn,0) -- why do you seek()?
> s = get(fn)

Since the file is not \n delimited, i'd use gets()

> close(fn)
> word_array = s[2] -- what?
> word_array[4][6] is then the 6th letter of the 4th word in the array.

using gets, word_array[wordlen x wordnum][6] is the same.

> It works fine, but it takes fifteen minutes to read in the array. Is 
> there a better way?

There must be, i can get a megabyte off the internet in 15 minutes! Take a 
peek at function getf() in file.e.

Kat




