1. I/O: What is the best format? (To CChris [and All])

CChris, still on the subject of translating from Pascal to Euphoria: below
are some possible formats for my text files. I would like to know which one
is the best for reading the data into my variables. The first file
(KillCPs.txt) could have this format:

UDUDUDUDXXXXUD,3,4,5,10
XXXUUDDDUUDXUD,2,3,6,11
UDUUUDUDXUDDUD,2,4,4,10
XUUUDDDDXXXXUD,3,4,5,10
or this one:
"UDUDUDUDXXXXUD",3,4,5,10
"XXXUUDDDUUDXUD",2,3,6,11
"UDUUUDUDXUDDUD",2,4,4,10
"XUUUDDDDXXXXUD",3,4,5,10
or this one:
UDUDUDUDXXXXUD 3 4 510
XXXUUDDDUUDXUD 2 3 611
UDUUUDUDXUDDUD 2 4 410
XUUUDDDDXXXXUD 3 4 510
or this one:
UDUDUDUDXXXXUD03040510
XXXUUDDDUUDXUD02030611
UDUUUDUDXUDDUD02040410
XUUUDDDDXXXXUD03040510

The second file (parameter arqEnt) could be like this:
UDUDUDUDXXXXUD
XXXUUDDDUUDXUD
UDUUUDUDXUDDUD
XUUUDDDDXXXXUU
UDUDUDUDXXXXUD
XXXUUDDDUUDXUD
UDUUUDUDXUDDUD
XUUUDDDDXXXXUD
or like this one:
"UDUDUDUDXXXXUD"
"XXXUUDDDUUDXUD"
"UDUUUDUDXUDDUD"
"XUUUDDDDXXXXUU"
"UDUDUDUDXXXXUD"
"XXXUUDDDUUDXUD"
"UDUUUDUDXUDDUD"
"XUUUDDDDXXXXUD"

Considering that 5,000,000 records will be read, which format should I 
choose? Or another one, of course. I am asking because the 'split' function 
you wrote takes up computer time. Is there some way to simplify the data 
input?

Thanks a lot.

Paulo Fernandes
Porto Alegre/RS 
Brasil

PS: I have discovered the documentation, the EE editor, and Judith's IDE 
    (which is not working for now)! It is a new world!!



CChris wrote:
> 
> Well, you'll thank me when your program runs! I haven't coded much in Pascal
> for the last 10 years. As a result, I made some assumptions about I/O which
> may work... or not.
> I think you won't have much to tweak in order to get a working program.
> It can be made simpler if you don't use the formatted file format of Pascal.
> However, you may need to convert old files to flat files if you do. Let me
> know how things go.
> 
> CChris
> 
> 
> Paulo Fernandes wrote:
> > 
> > 
> > Hello, CChris!!
> > 
> > I have no words to thank you for the work. 
> > I am deeply grateful! It was very 
> > kind of you!
> > And most important: your translation is 
> > a true lesson for me and for others, of course 
> > (today or in the future)!!
> > I now intend to study your annotations
> > and get the new program running!
> > 
> > Thank you very much. Cheers and good luck!
> > 
> > Paulo Fernandes
> > Brazil
> > 
> > 
> > CChris wrote:
> > > .................. 
> > > 
> > > Here is a translation attempt. I haven't programmed in Pascal for a while,
> > > so I may have forgotten some subtleties of buffered and formatted file
> > > I/O.
> > > I left translation notes to help you see how I translated nontrivial parts.
> > > Don't assume it will have the right formatting out of the box. For this
> > > sort of stuff, Pascal is way easier to write code with.
> > > 
> > > CChris
> > > 
> > > -- TR: {} comment brackets are replaced by trailing -- comments
> > > ----------------------------------------------------------------------
> > > -- * Prog.: MyProg.pas
> > > -- * Date : 19/05/2007
> > > -- * Note : Does part of the work of another program
> > > -- *----------------------------------------------------------------------
> > > -- TR: call your file MyProg.ex, and process cmd line therein
> > > -- Program  MyProg (p1, p2, p3);
> > > -- TR: not needed, as we are going to use cross platform I/O routines
> > > -- Uses  Dos;
> > > constant TAMBLOCO = 10000,
> > >        NUMJOGOS = 14
> > > integer totCPs
> > > integer limTol
> > > --       aCPs     : array [1..TAMBLOCO]      of string[NUMJOGOS];
> > > -- TR: Eu doesn't know about arrays or typed sequences. This data will be
> > > -- used at initialisation time
> > > sequence aCPs
> > > --       mMgs     : array [1..TAMBLOCO,1..4] of byte;
> > > sequence mMgs
> > > ----------------------------------------------------------------------
> > > procedure MostHora()
> > > sequence s1, s2
> > > -- TR: files are represented by integer handles
> > > integer f9 
> > > -- TR: system_exec() is a function, so its result must be stored
> > > integer rc
> > > -- TR: Eu doesn't know about instruction groups
> > > -- begin
> > > -- TR: '\' must be escaped in manifest strings, as it is the escape
> > > -- character
> > >     s1="C:\\Windows\\System32\\Cmd.exe "
> > >     s2="/C Time < C:\\Enter. > Wrk09.txt"
> > > --    exec(s1,s2);
> > > -- TR: please refer to the date() and time() commands. I don't know if the
> > > -- formats are the same as yours.
> > >     rc = system_exec(s1 & s2,2)
> > > --    assign(f9,'Wrk09.txt');
> > > --    reset (f9);
> > >       f9 = open("Wrk09.txt","r")
> > > --    readln(f9,s1);
> > >       s1 = gets(f9)
> > >     close (f9)
> > > --    s2:=copy(s1,13,20);
> > >       s2 = s1[13..13+20-1]
> > > --    write(s2);
> > > -- TR: 1 is standard output
> > >       puts(1,s2)
> > >  end procedure
> > > 
> > > constant FieldSeparator = ',' -- hopefully
> > > include get.e
> > > 
> > > function split(sequence s)
> > > -- This converts a flat comma separated record into parts,
> > > -- converting numeric text to values
> > > integer pos
> > > sequence field,result
> > > 
> > > result={}
> > > while 1 do
> > >     pos=find(FieldSeparator,s)
> > >     field = value(s)
> > >     if field[1] = GET_SUCCESS then -- number
> > >         result &= field[2] 
> > >     else -- text
> > >         if pos>0 then
> > >             result=append(result,s[1..pos-1])
> > >         else
> > >             result=append(result,s)
> > >         end if
> > >     end if
> > >     if pos=0 then exit 
> > >     else s=s[pos+1..$]
> > >     end if
> > > end while
> > > return result
> > > end function
> > > ----------------------------------------------------------------------
> > > procedure PegaCPs()
> > > -- var   comb  : string[NUMJOGOS];
> > > -- TR: this may be an integer as it will receive the gets() output
> > > object comb
> > > -- TR: i and j are for-loop indexes, which must not be declared
> > > integer f1
> > > -- TR: needed to read the blocks
> > > -- begin
> > >     MostHora()
> > > --    writeln('--Pegando CPs...');
> > > -- TR: The line break must be explicit
> > >       puts(1,"\n--Pegando CPs...\n")
> > > --    assign (f1,'KillCPs.txt');
> > > --    reset  (f1);
> > >       f1 = open("KillCPs.txt","r")
> > >     totCPs=1
> > > -- TR: must initialise sequences before writing to them
> > >      mMgs = repeat(0,TAMBLOCO)
> > >      aCPs = repeat(0,TAMBLOCO)
> > > --    while not(eof(f1)) do begin
> > > -- TR: There is no eof(), use the type of what gets() returns
> > >      while 1 do
> > > --       read(f1,comb);
> > >         comb = gets(f1)
> > >         if atom(comb) then exit end if
> > > -- TR: I assumed your record format is: a
> > > -- string,comma,byte1,comma,byte2,comma,byte3,comma,byte4,comma,limTol
> > > -- TR: the string being at most NUMJOGOS char long.
> > >         comb = split(comb)
> > >         aCPs[totCPs]= comb[1]
> > > --        for i=1 to 4 do            -- read limits
> > > --          read(f1,mMgs[totCPs,i]);
> > > --        end for
> > >         mMgs[totCPs] = comb[2..5]
> > > --       read(f1,limTol);
> > >         limTol = comb[$]
> > > --       readln(f1);
> > >          totCPs+=1
> > >      end while
> > >      totCPs-=1
> > >      for i=1 to totCPs do 
> > > --       write(aCPs[i]:NUMJOGOS);
> > > -- TR: can't use constants in formats directly
> > > -- TR: you'd have to build a format string and then apply it
> > > -- TR: I also assumed you want left justified text. Remove the '-' to get
> > > -- right justification.
> > >         printf(1,"%-14s",{aCPs[i]})
> > >         for j=1 to 4 do 
> > > -- TR: %d, not %s: %s would print a number as a single character
> > >            printf(1,"%-3d",{mMgs[i][j]})
> > >         end for
> > >         printf(1,"%-3d",{limTol})
> > > --       writeln('');
> > >         puts(1,'\n')
> > >      end for
> > >      close(f1)
> > >  end procedure
> > > ----------------------------------------------------------------------
> > > constant false = 0, true = not false
> > > procedure Expurga (sequence arqEnt,sequence arqSai,sequence arqRes)
> > > object comb --       : string[NUMJOGOS];
> > > integer f1, f2, f3
> > > integer iReg, oks 
> > > integer i, j, nTol
> > > integer nPts
> > > atom perc
> > > -- no builtin boolean type, although it was requested many times
> > > integer eBoa
> > > sequence splComb
> > >     MostHora()
> > >     puts(1,"\n--Fazendo Expurgo...\n")
> > > --    assign (f1,arqEnt);
> > > --    reset  (f1);
> > >     f1 = open(arqEnt,"r")
> > > --    assign (f2,arqSai);
> > > --    rewrite(f2);
> > >      f2 = open(arqSai,"w")
> > >      iReg=0 oks=0
> > >      while 1 do 
> > >         comb = gets(f1)
> > >         if atom(comb) then exit end if
> > >         iReg+=1
> > >         eBoa=true
<snip>


2. Re: I/O: What is the best format? (To CChris [and All])

In my translation, I tried to respect the initial Pascal record format, because
I didn't know if you were reusing old files built with this format.

Now, there are obviously better ways. The simplest, though perhaps not the
fastest, would be to store each record as a Euphoria sequence such as
{"XUUUDDDDXXXXUD",3,4,5,10}. Wasn't there a 6th param read into limTol?
Then you could store using sprint(), retrieve using get() and avoid the 
need to split().
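
As a minimal sketch of that sprint()/get() round trip (the file name
KillCPs.dat is made up for the example, and only one record is written):

```euphoria
include get.e   -- get(), GET_SUCCESS
include misc.e  -- sprint()

constant FILE_NAME = "KillCPs.dat"  -- hypothetical file name

integer fn
sequence rec, res

-- Write one record per line as a printed Euphoria object.
fn = open(FILE_NAME, "w")
if fn = -1 then
    puts(2, "cannot create " & FILE_NAME & "\n")
    abort(1)
end if
rec = {"XUUUDDDDXXXXUD", 3, 4, 5, 10}
puts(fn, sprint(rec) & "\n")
close(fn)

-- Read the records back; each get() call parses one object.
fn = open(FILE_NAME, "r")
while 1 do
    res = get(fn)
    if res[1] != GET_SUCCESS then
        exit   -- end of file (or malformed data)
    end if
    rec = res[2]
    printf(1, "%s %d %d %d %d\n", rec)
end while
close(fn)
```

Note that sprint() writes the string part as a list of character codes, so
the file is bigger and harder to read by eye than the comma separated one.
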
Since your records actually appear to be fixed-size, the best shot
for speed would be to treat your files as binary: every record has the
same number of bytes, so you'd use get_bytes() to read and puts() to write.
I may come up with a revised translation of your program using the above
techniques, but later, this weekend.
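
A rough sketch of the binary approach, under assumptions of mine: each
record is the 14-character string followed by 4 one-byte values (add one
more byte if limTol is stored too), and the file name KillCPs.bin is
invented for the example.

```euphoria
include get.e   -- get_bytes()

constant NUMJOGOS  = 14,             -- length of the U/D/X string
         NUMLIMS   = 4,              -- byte values after the string (assumed)
         RECLEN    = NUMJOGOS + NUMLIMS,
         FILE_NAME = "KillCPs.bin"   -- hypothetical file name

integer fn
sequence rec, comb, lims

-- Write two sample records; every value must fit in one byte (0..255).
fn = open(FILE_NAME, "wb")
if fn = -1 then
    puts(2, "cannot create " & FILE_NAME & "\n")
    abort(1)
end if
puts(fn, "UDUDUDUDXXXXUD" & {3, 4, 5, 10})
puts(fn, "XXXUUDDDUUDXUD" & {2, 3, 6, 11})
close(fn)

-- Read them back, one fixed-size record per get_bytes() call.
fn = open(FILE_NAME, "rb")
while 1 do
    rec = get_bytes(fn, RECLEN)
    if length(rec) < RECLEN then
        exit   -- end of file (or a truncated last record)
    end if
    comb = rec[1..NUMJOGOS]          -- the text part
    lims = rec[NUMJOGOS+1..RECLEN]   -- the numbers, already integers
    printf(1, "%s %d %d %d %d\n", {comb} & lims)
end while
close(fn)
```

No parsing is needed here: the slices give the string and the numbers
directly, which is where the speed gain over split() should come from.
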

CChris


3. Re: I/O: What is the best format? (To CChris [and All])

Very good, CChris. No problem at all! When you have a little 
time to review the translation, please go ahead.
Binary files are the answer...
Ah, yes, and there is another parameter (I had forgotten it), the
sixth, *limTol*.
And the procedure MostHora (function MostHora in Pascal) is working. The
code looks like this:
------------------------------------------------------------------------
procedure   MostHora()
   sequence s1, s2
   integer  f9, rc 
   -- TR: files are represented by integer handles
   -- TR: Eu doesn't know about instruction groups -- begin
   -- TR: '\' must be escaped in manifest strings, as it is 
   --     the escape character
   s1 = "C:\\Windows\\System32\\Cmd.exe "
   s2 = "/C Time < C:\\Enter. > Wrk09.txt"
   rc = system_exec(s1 & s2,2)
   -- TR: please refer to the date() and time() commands. 
   --     I don't know if the formats are the same as yours.
   --     system_exec(s1 & s2,2)
   --     assign(f9,'Wrk09.txt');
   --     reset (f9);
   f9 = open("Wrk09.txt","r")
   s2 = gets(f9)
   s2 = s2[13..13+11-1]
   --     readln(f9,s1);
   --     s2: = copy(s1,13,20);
   --     write(s2);
   -- TR: 1 is standard output
   puts(1,s2)
   close (f9)
end procedure
------------------------------------------------------------------------
Now I know that there are the functions date() and time(), but I prefer
to keep the original function as it is.
A question: the interpreter gives me an error message when I declare the
variable of a 'for' loop. Is this correct?

Thanks a lot!
God bless you!

Paulo Fernandes
Brasil




4. Re: I/O: What is the best format? (To CChris [and All])

<snip>
> Now I know that there are the functions date() and time(), but I prefer
> to keep the original function as it is.
> A question: the interpreter gives me an error message when I declare the
> variable of a 'for' loop. Is this correct?
> 
> Thanks a lot!
> God bless you!
> 
> Paulo Fernandes
> Brasil

That's correct: for-loop variables are created automatically. You can read
about it here: http://www.rapideuphoria.com/refman_2.htm#for
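
A tiny illustration (the variable name total is just for the example):

```euphoria
integer total
total = 0

-- No "integer i" here: the for statement declares i itself.
for i = 1 to 4 do
    total += i
end for
-- i no longer exists at this point.

printf(1, "total = %d\n", {total})   -- prints: total = 10
```
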

Best regards,
    Guillermo Bonvehí


5. Re: I/O: What is the best format? (To CChris [and All])

Paulo Fernandes wrote:
> 
<snip>
> Now I know that there are the functions date() and time(), but I prefer
> to keep the original function as it is.
> A question: the interpreter gives me an error message when I declare the
> variable of a 'for' loop. Is this correct?
> 
> Thanks a lot!
> God bless you!
> 
> Paulo Fernandes
> Brasil
> 

<snip>
That's ok, I'll keep the original stuff; after all it's not a performance
killer since you invoke it only once. 

Yes, loop index variables should not be declared. This is justified because
loop indexes have a very peculiar scope, unlike that of any declared
variable: the index exists only inside its loop. If I remember correctly,
Pascal allows you to define the index and so reuse it after the loop has
finished. That could be useful, but if you need this, you currently have to
use a while loop and manually increment a counter you declared, which
behaves as just another variable.
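
For instance, a small sketch with made-up sample data, where the counter is
still needed after the loop ends:

```euphoria
constant DATA = {4, 8, 15, 16, 23}   -- made-up sample values

integer i
i = 1
-- Stop at the first odd value, or when the data runs out.
while i <= length(DATA) do
    if remainder(DATA[i], 2) = 1 then
        exit
    end if
    i += 1
end while

-- Unlike a for-loop index, i is still in scope here.
printf(1, "stopped at i = %d\n", {i})   -- prints: stopped at i = 3
```
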

Obrigado.
CChris
PS: while I manage to understand simple Portuguese, I'd have problems writing
it, because I'm far more fluent in Spanish.
