1. Update file

Hello, all!

Could someone please provide or direct me to a sample of the proper use
of the "u" (update) mode in open("infile","u")?
I've been trying to use it for "random" access as in QBASIC, but that is
apparently wrong. My updates totally corrupt the file.
Thanks,
Ray in Colorado


2. Re: Update file

Ray Connolly wrote:
> Could someone please provide or direct me to a sample of the proper use
> of the "u" (update) mode in open("infile","u")?
> I've been trying to use it as "random" access as in QBASIC, but that is
> apparently wrong. My updates totally corrupt the file.

I'm a bit confused about this too.
You might want to clarify this for us and in your documentation.
What exactly happens when I write to the file in update mode and then
read from it?
Is it true that when I read one byte and then write that same byte out
again, the file is at least 2 bytes long and the first two bytes are the
same?
So to double all values in a file, I have to read one byte, seek back one
byte and then write the newly calculated byte over the old one?
(I think the way that QBasic/QuickBasic opened files for random access
was very nice, and a good idea for a future version of Euphoria.)
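
The read, seek back, write sequence asked about above can be sketched
roughly like this. It is only a sketch: it assumes seek() from file.e, a
hypothetical file named bytes.dat, and that doubled values should wrap at
256 so they still fit in one byte (that last part is an assumption of the
sketch, not something stated in this thread).

```euphoria
-- double.ex (hypothetical name)
-- doubles every byte of a file in place, using binary update mode

include file.e

integer fn, c, pos

fn = open("bytes.dat", "ub")    -- hypothetical file; it must already exist
if fn = -1 then
    puts(1, "Couldn't open bytes.dat\n")
    abort(1)
end if

pos = 0
while 1 do
    -- position explicitly before the read
    if seek(fn, pos) then
        exit
    end if
    c = getc(fn)
    if c = -1 then
        exit                    -- end of file
    end if
    -- the read moved the current position one byte forward,
    -- so go back to the byte that was just read
    if seek(fn, pos) then
        puts(1, "seek failed!\n")
        abort(1)
    end if
    puts(fn, remainder(c * 2, 256))   -- overwrite the old byte
    pos = pos + 1
end while
close(fn)
```

Tracking the position in a variable, rather than relying on the position
left behind by the previous read or write, keeps a seek() between every
read and write on the same file.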

Ralf Nieuwenhuijsen
nieuwen at xs4all.nl


3. Re: Update file

> (I think the way that qbasic/QuickBasic opened files for
> random access was very nice, and a good idea for a future version of
> Euphoria)

If you are talking about being able to load up a record like this:
OPEN "this.dat" FOR RANDOM AS #1
TYPE that
        a AS STRING
        b AS INTEGER
END TYPE
DIM this(1 TO 5) AS that
FOR i = 1 TO 5
        GET #1, , this(i)
NEXT i

to fill this(1).a and this(1).b through this(5).a and this(5).b, with only
one call for each: the INTEGER and such would destroy the simplicity of
Euphoria... Try this:
include file.e
include get.e
sequence this
atom fn
fn = open("this.dat", "u")
this = get(fn)
this = this[2]

will work, but the file will not be compatible with one written by QBasic.
It will load the data like this:
{{"String", 4}, {"String2", 19}..... and you can access it like:

this[2][A] where the constant A is 1 and the constant B is 2. This
lets you do what I am doing in my Asteroids clone:
player[2][ENERGY] = player[2][ENERGY] - 1
as opposed to QBasic's:
player(2).energy = player(2).energy - 1
Simple... Hope this helps. (And helps you to understand that there is
a 99.9% chance that the random access you are thinking of will never come
to Euphoria.)
Oh, and to write the data to the disk, instead of using PUT for a certain
updated record for a database, just seek back to the start of the file
(seek(fn, 0), from file.e) and place the whole sequence back:
print(fn, this)


4. Re: Update file

Robert B Pilkington wrote:

> If you are talking about being able to load up a record like this:
> OPEN "this.dat" FOR RANDOM AS #1
> TYPE that
>         a AS STRING
>         b AS INTEGER
> END TYPE
> DIM this(1 TO 5) AS that
> FOR i = 1 TO 5
>         GET #1, , this(i)
> NEXT i

No, I didn't mean it like that; I meant with the use of FIELD and
LEN =.

The problem with writing to a file is that, unlike Euphoria, a file does
work with fixed-size lengths and does require you to know in which
order (not the Euphoria word "sequence") the items will be read from and
written to it.

Euphoria is perfect for any type of data management because of its
flexible memory structure, but sometimes we really need to update a file
in real time, and then it doesn't give you the same flexibility that
Euphoria's memory management does.

I once wrote a routine that compiles a Euphoria sequence of any depth,
with any kind of data, into a very compact series of bits while keeping
the original structure. Later on I used the compression routines from,
eh, Daniel Bernstein to make EDOM (routines that save and load a whole
sequence to a file, keeping all the original data and structure). This
was too slow, and still had no option for real-time updating.

> Try this:
> include file.e
> include get.e
> sequence this
> atom fn
> fn = open("this.dat", "u")
> this = get(fn)
> this = this[2]
>
> will work, but not be compatible with one done by Qbasic. But it will
> load it up like this:
> {{"String", 4}, {"String2", 19}..... and you can access it like:

Yes, and it will, I'm afraid, be *very* inefficient in size. Saving a
sequence like this: " { { 345, 234 } , 123 } " isn't good for a large
amount of data. I mean, look at the spaces, and at the fact that a number
which could be stored in a single byte, like 234 or 123, takes 3 bytes
when written out as text. And how can we look up the 3rd sequence/object
we saved to disk? Real-time updating is impossible, because we aren't
working with fixed-size records.
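
What fixed-size records make possible can be sketched roughly as follows:
record n always starts at byte (n-1) * record length, so it can be found
and overwritten with one seek(). This is only an illustration under
assumptions: REC_LEN, pad(), put_record(), get_record() and the file
recs.dat are made-up names, and records are plain text padded with spaces.

```euphoria
-- fixrec.ex (hypothetical name)
-- fixed-length records updated in place with seek()

include file.e

constant REC_LEN = 20    -- assumed record size in bytes

function pad(sequence s)
    -- force s to exactly REC_LEN bytes (truncate or pad with spaces)
    if length(s) >= REC_LEN then
        return s[1..REC_LEN]
    end if
    return s & repeat(' ', REC_LEN - length(s))
end function

procedure put_record(integer fn, integer n, sequence s)
    -- overwrite record n (counting from 1)
    if seek(fn, (n - 1) * REC_LEN) then
        puts(1, "seek failed!\n")
        abort(1)
    end if
    puts(fn, pad(s))
end procedure

function get_record(integer fn, integer n)
    -- read record n (counting from 1); assumes the record exists
    sequence s
    s = repeat(' ', REC_LEN)
    if seek(fn, (n - 1) * REC_LEN) then
        puts(1, "seek failed!\n")
        abort(1)
    end if
    for i = 1 to REC_LEN do
        s[i] = getc(fn)
    end for
    return s
end function

integer fn

-- create a small file holding three identical records
fn = open("recs.dat", "wb")
if fn = -1 then
    puts(1, "Couldn't create recs.dat\n")
    abort(1)
end if
for n = 1 to 3 do
    puts(fn, pad("record"))
end for
close(fn)

-- reopen in binary update mode and change only the second record
fn = open("recs.dat", "ub")
if fn = -1 then
    puts(1, "Couldn't open recs.dat\n")
    abort(1)
end if
put_record(fn, 2, "changed")
puts(1, get_record(fn, 2) & "\n")
close(fn)
```

The price of this layout is the one described above: every record takes
REC_LEN bytes, however little data it actually holds.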



> Simple..... Hope this helps... (And helps you to understand that there is
> a 99.9% chance that the random access you are thinking of will never come
> to Euphoria.)

> Oh, and to write the data to the disk, instead of using PUT for a certain
> updated record for a database, just place the whole sequence back:
> print(fn, this)

Yeah, it writes a sequence like this: {{"Euphoria Rulezzzz"},12}
out to a file like this:
{       {       {   69, 117, ......etc.
All the tabs and spaces, the fact that the string is represented by
numbers, and the fact that the structure uses up a byte for each
item: that is not very smart for a 10000+ record database, is it?

Ralf Nieuwenhuijsen
nieuwen at xs4all.nl


5. Re: Update file

Ray Connolly writes:
> Could someone please provide or direct me to a sample of the proper use
> of the "u" (update) mode in open("infile","u")?

You can use the functions where() and seek() to position yourself
in the file. Each open file has a "current position" based on reads
and writes to the file. You can find out the current position using where().
You can set the current position using seek().

Here's a simple example. It opens a file using "ub" - binary update mode.
I think it's better to use binary mode than text mode with seek() and
where(), because in text mode you don't read \r characters, but they are
still counted as far as the seek position is concerned, so things might
get confusing.
The example records the starting position of each line in the "test.dat" file.
It is then able to pick a line at random and replace it with a string of X's.

-- update.ex
-- demo of binary update mode "ub"

include file.e

integer fn, replace_start, replace_len
integer replace_line, nlines
object line
sequence line_start

fn = open("test.dat", "ub")
if fn = -1 then
    puts(1, "Couldn't open test.dat\n")
    abort(1)
end if

-- read each line and record its starting byte position
line_start = {}
while 1 do
    line_start = append(line_start, where(fn))
    line = gets(fn)
    if atom(line) then
        exit
    end if
end while

-- pick a line at random and replace it with all 'X' characters
nlines = length(line_start)-1 -- last one is EOF position
if nlines = 0 then
    abort(1)
end if
replace_line = rand(nlines)
printf(1, "Replacing line number %d\n", replace_line)
replace_start = line_start[replace_line]
replace_len = line_start[replace_line+1] - replace_start
if seek(fn, replace_start) then
    puts(1, "seek failed!\n")
    abort(1)
end if
puts(fn, repeat('X', replace_len-2) & "\r\n") -- need \r in binary mode
close(fn)
puts(1, "see \"test.dat\"\n")
-----------------------------------------------

test.dat:

ABCDEFGHIJKL
012345678
00000001111111111
etc. etc. etc.

--------------------------------------------------

Regards,
  Rob Craig
  Rapid Deployment Software


6. Re: Update file

On 18 Sep 99 , Ralf Nieuwenhuijsen wrote:

> the original structure. Later on i used the compression routines from
> eh. Daniel Bernstein to make EDOM. (Routines that save & load a whole

        Actually, my name is Daniel Berstein ;)

>         All the tabs and spaces, and the fact that the string is represented
> by numbers and the fact that the structure is using up a byte for each
> item, is not very smart for a 10000+ record database is it???

        The way you can use print() and get() is quite useful for saving
complex data structures, such as binary trees, to disk... that means that
you can have an indexed database without a separate index file (like
Xbase's .DBF and .I?X).
        The problem would be how to load the complete database into memory
(without spending 'n' Mb of RAM). The solution would be to break your
database into several "prints" ({},{}).
        Some time ago I said I was going to update my dbf routines... well, I
decided to spend my time creating a better (more flexible)
database format. I was thinking of creating a Euphoria database system
and just coding a translator from DBF to this new format... "please allow
2-3" (Robert Craig's words) months for it to be ready; at this pace it will
be released at the same time as the Euphoria for win32 alpha ;)

Regards,
  Daniel Berstein
  danielberstein at usa.net
  http://www.geocities.com/SiliconValley/Heights/9316


7. Re: Update file

Robert Craig wrote:
>
> I think it's better to use binary mode than text mode, with seek() and
> where(), because in text mode you don't read \r characters, but they are
> still counted as far as the seek position is concerned, so things might
> get confusing.

Ahah! Thank you, sir!

Ray in Colorado


8. Re: Update file

Robert Craig wrote:

>You can use the functions where() and seek() to position
>yourself in the file. Each open file has a "current position"
>based on reads and writes to the file. You can find out the
>current position using where(). You can set the current position
>using seek().

>Here's a simple example. It opens a file using "ub" - binary
>update mode. I think it's better to use binary mode than text
>mode, with seek() and where(), because in text mode you don't
>read \r characters, but they are still counted as far as the
>seek position is concerned, so things might get confusing. The
>example records the starting position of each line in the
>"test.dat" file. It is then able to pick a line at random and
>replace it with a string of X's.


Hello Rob,

From your example I deduce that in update mode a line can be replaced, but
only if the new line is exactly the same length as the old one. Is this the
only way to work in update mode? Is it for instance not possible to insert
a line in between other lines?
I know it is possible to append lines at the end of a file, but for that
purpose you also have the append mode.
Personally when I update a file, I first read them into (a) sequence(s),
close the file and after updating I open the file again for writing. Maybe
this is more time-consuming, but when time is not that important, at the
beginning and end of a program.....

Sincerely,

Ad Rienks
email Ad_Rienks at compuserve.com
writing at 19:08 on Monday 22 September 1997
Using EMail Assist for WinCIM


9. Re: Update file

Ad Rienks wrote:

> From your example I deduce that in update mode a line can be replaced, but
> only if the new line is exactly the same length as the old one. Is this the
> only way to work in update mode? Is it for instance not possible to insert
> a line in between other lines?
> I know it is possible to append lines at the end of a file, but for that
> purpose you also have the append mode.
> Personally when I update a file, I first read them into (a) sequence(s),
> close the file and after updating I open the file again for writing. Maybe
> this is more time-consuming, but when time is not that important, at the
> beginning and end of a program.....
>
>
You "could" insert a longer line (more bytes) into a disk file, but
you'd have to move all the following bytes to make room. That would be
time-consuming. This has nothing to do with Euphoria; any programming
language requires this. The exception is a full database engine, which
solves the problem of inserts and deletes by creating links and indexes to
pieces of data. New pieces are stored at the end of the physical file, and
retrieved via list pointers.

All this moving and linking is automatic when using sequences.
The trouble is that it can take a looong time to read/write a big sequence
from/to disk. Too long, when the data file begins to reach a usable length.
A lot of the extra time is used up in reading/writing two, three or four
bytes for each data byte (an ASCII character, for example).
Irv


10. Re: Update file

Ad Rienks writes:
> From your example I deduce that in update mode a line can be replaced,
> but only if the new line is exactly the same length as the old one. Is this
> the only way to work in update mode? Is it for instance not possible to
> insert a line in between other lines?

There's no easy way to insert a line. Actually, your file doesn't have
to be made up of lines. You could have any kind of data in it.

> I know it is possible to append lines at the end of a file, but for that
> purpose you also have the append mode.
> Personally when I update a file, I first read them into (a) sequence(s),
> close the file and after updating I open the file again for writing.
> Maybe this is more time-consuming, but when time is not that important,
> at the beginning and end of a program.....

I agree. For small to medium sized files, it's simpler to just read the
whole file into memory, manipulate it as a bunch of sequences,
and then write out the new version. That's what ed.ex and
demo\mydata.ex do.

Where update mode, seek() and where() would be crucial
is in the case of a huge database. Suppose you had
a 100 Mb file and you wanted to update record # N
in the middle of it. It would be very time consuming to have to
read all the records up to record N and write them out to a new file,
then write out a new record N, then read and write records N+1
to the end. If you had a small index file showing where each record
in the big file started, you could quickly seek to any record and
update it.
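
A rough sketch of the index idea described above, under assumptions that
are not part of the original example: records are taken to be the lines of
test.dat, the index is kept in a made-up file test.idx, it is saved with
print() and reloaded with get(), and build_index() is a hypothetical helper
name.

```euphoria
-- index.ex (hypothetical name)
-- keep a small index of record start positions in a separate file

include file.e
include get.e

function build_index(sequence name)
    -- return the starting byte position of every line in the file
    integer fn
    object line
    sequence starts
    fn = open(name, "rb")
    if fn = -1 then
        return {}
    end if
    starts = {}
    while 1 do
        starts = append(starts, where(fn))
        line = gets(fn)
        if atom(line) then
            exit
        end if
    end while
    close(fn)
    return starts[1..length(starts)-1]   -- drop the EOF position
end function

integer idx, fn
object line
sequence starts, result

-- build the index once and save it as text
starts = build_index("test.dat")
idx = open("test.idx", "w")
if idx = -1 then
    puts(1, "Couldn't create test.idx\n")
    abort(1)
end if
print(idx, starts)
close(idx)

-- later: load the index and jump straight to record 3
idx = open("test.idx", "r")
if idx = -1 then
    puts(1, "Couldn't open test.idx\n")
    abort(1)
end if
result = get(idx)
close(idx)
if result[1] != GET_SUCCESS then
    puts(1, "bad index file\n")
    abort(1)
end if
starts = result[2]

if length(starts) >= 3 then
    fn = open("test.dat", "rb")
    if fn != -1 then
        if seek(fn, starts[3]) = 0 then
            line = gets(fn)
            if sequence(line) then
                puts(1, line)   -- shows the third line without reading lines 1-2
            end if
        end if
        close(fn)
    end if
end if
```

Opening the data file with "ub" instead of "rb" would allow the record to
be overwritten at that position, as long as the replacement has the same
length.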

To make insertion of records practical, real databases are
not implemented as contiguous records in a huge file, like my example
above, but rather as "B-trees" or equivalent, with indexes that point to
small fixed-size pages of contiguous disk space that might contain at most
a few hundred consecutive records. Inserting a record into a 4Kb page
that's only half full, by shifting a bunch of other records, is quite
reasonable. Inserting into a 100Mb file would be very painful.

Regards,
   Rob Craig
   Rapid Deployment Software
