Re: HELP with 'seek'


Al,

> Hope this helps some, if not perhaps in another project. Good luck,
>             --Al Getz

Yes, it helped a lot, although I have not finished studying your concepts. Your
response has been something of a mini-course on data access for me.

Thanks for taking so much time to explain your thinking.
I will be a while studying it and implementing your ideas.

> if your file is so large it's not practical to read through
> the whole file at startup,

Yeah, some of the files I'll be processing are pretty large: 17,000 records
of 100 bytes each (1.7 million bytes). Your solution 'D' sounds unique...
and will be one of the first I'll implement.

Thanks for all the help.

Regards,

Jim Duffy
Al Getz wrote:

> On reading files using r, rb, seek(), get(), gets(), where(), etc.:
>
> I don't know if this helps, but I was working on a project some time
> ago that required repeated comparisons of data stored in a file in the form
> of text (the file written using printf() statements). The problem was,
> because you never knew where you were going to have to go next in the
> file and the file was very large, you couldn't take the time to read
> through the whole file time after time just to get one or two items
> randomly placed in the file. Since it was all text lines, gets() seemed
> like a good candidate method for reading back the data without having to
> program a get() interface to allow random accessing of variable
> length 'records'.
>
>   The solution came out somewhat simple:
>
>   [A]
>
>   1.  open the file in "r" mode
>   2.  while reading all the records through once do:
>           do a where() followed by a gets() and
>           log all the addresses returned by where() in a sequence.
>           If you want you can save the first two letters in the same
>           sequence to function as a hash alpha lookup. This would yield
>           sub seq's such as: {100,"ab"},{123,"ac"},...
>   3. now that you have the address for every text group it's simply a
>      matter of using seek() followed by gets() to get to the data.
>
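
For anyone following along, approach [A] might look roughly like the sketch below. It is written in Python rather than Euphoria, so `tell()` and `readline()` stand in for where() and gets(); the function names `build_index` and `read_record` are made up for illustration and are not from Al's project.

```python
def build_index(path):
    """Scan the file once, logging the offset and first two letters of each line."""
    index = []
    with open(path, "rb") as f:      # binary mode, so offsets are plain byte counts
        while True:
            pos = f.tell()           # stands in for where()
            line = f.readline()      # stands in for gets()
            if not line:
                break                # end of file
            index.append((pos, line[:2].decode("ascii", "replace")))
    return index


def read_record(path, offset):
    """Jump straight to a logged offset and read that one line."""
    with open(path, "rb") as f:
        f.seek(offset)               # stands in for seek()
        return f.readline().decode("ascii", "replace").rstrip("\r\n")
```

The two-letter keys let you narrow a search to a handful of offsets before touching the file at all.
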
>   [B]
>
>   if your data is mixed (not all text) then you simply use an
>   alternating series of gets() for text fields and get() for other
>   fields.  You only need to log 'where()' addresses once for the
>   first field of each 'record'.
>   Ultimately, also record the next record address within the file and
>   you've got a linked list on the next run. Record the previous address
>   also and you can query up and down the list as well.
>
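
A possible reading of [B], again as a Python sketch: each record is a text line followed by a binary field, and only the address of the first field is logged. The record layout here (a name line plus a 4-byte little-endian integer) is an assumption chosen for the example, as are the function names.

```python
import struct


def write_records(path, records):
    """Write (name, value) pairs; return the offset of each record's first field."""
    offsets = []
    with open(path, "wb") as f:
        for name, value in records:
            offsets.append(f.tell())                # logged once per record
            f.write(name.encode("ascii") + b"\n")   # text field, read back line-wise
            f.write(struct.pack("<I", value))       # binary field, fixed 4 bytes
    return offsets


def read_record(f, offset):
    """Read one mixed record; also return the address of the record after it."""
    f.seek(offset)
    name = f.readline().rstrip(b"\n").decode("ascii")
    (value,) = struct.unpack("<I", f.read(4))
    return name, value, f.tell()                    # f.tell() is the 'next' link
```

The third return value is the address of the following record, which is the forward link of the linked list Al mentions; storing the previous offset alongside it would give the backward link.
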
>   [C]
>
>   if your data is doubly mixed (not all records are the same type)
>   then simply make the first field the format type identifier.
>
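
One way to picture [C] in Python: the first byte of every record names its type, and a lookup table says how to decode the rest. The two record types and their layouts below are invented purely for the example.

```python
import struct

# Hypothetical record layouts, keyed by a one-byte type identifier.
FORMATS = {
    b"P": struct.Struct("<HH"),   # e.g. a point: two 16-bit unsigned integers
    b"T": struct.Struct("<d"),    # e.g. a reading: one 64-bit float
}


def pack_record(kind, *fields):
    """Prefix the packed fields with their type identifier."""
    return kind + FORMATS[kind].pack(*fields)


def read_record(f):
    """Read the type identifier first, then decode the rest accordingly."""
    kind = f.read(1)
    if not kind:
        return None               # end of file
    layout = FORMATS[kind]
    return kind, layout.unpack(f.read(layout.size))
```
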
>   [D]
>
>   if your file is so large it's not practical to read through
>   the whole file at startup, you can start a separate file to record
>   where()'s whenever a record IS found during the normal application
>   run. Each time the app runs more and more records are located
>   making the time to locate data less and less each time.  Of course
>   when something is added to the file the location is stored at the
>   same time.
>
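
A rough Python sketch of [D], under a few assumptions of my own: records are text lines whose key is the first comma-separated field, and the separate file of addresses is a small JSON file. The sketch also logs every record it passes while scanning, not only the one it was looking for, which is a slight extension of what Al describes.

```python
import json
import os


def load_index(index_path):
    """Load the offsets learned on earlier runs, if any."""
    if os.path.exists(index_path):
        with open(index_path) as f:
            return json.load(f)
    return {}


def find(data_path, index_path, key):
    """Return the line whose first field is `key`, or None if it is absent."""
    index = load_index(index_path)
    with open(data_path, "rb") as f:
        if key in index:                       # already located on an earlier run
            f.seek(index[key])
            return f.readline().decode().rstrip("\r\n")
        found = None
        while found is None:                   # otherwise scan, learning as we go
            pos = f.tell()
            line = f.readline()
            if not line:
                break
            text = line.decode().rstrip("\r\n")
            k = text.split(",", 1)[0]
            index[k] = pos
            if k == key:
                found = text
    with open(index_path, "w") as out:         # save what was learned for next time
        json.dump(index, out)
    return found
```

For simplicity every miss rescans from the start of the file; resuming from the highest offset already logged would avoid rereading the known part.
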
>   [E]
>   I've also had great success with using home made delimiters chosen
>   such that an occurrence of the delimiter char(s) never or seldom
>   occurs naturally in the target data, or only in known locations.
>   Filenames are a good example as quite a few characters are not
>   allowed.
>
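
A small Python sketch of the delimiter idea in [E], using file names as the data. The choice of `|` is my assumption: Windows does not allow it in file names, so it should never clash with the fields. The helper names are hypothetical.

```python
DELIM = "|"   # not allowed in Windows file names, so it cannot occur in the data


def pack_fields(fields):
    """Join fields into one text line, refusing data that contains the delimiter."""
    for field in fields:
        if DELIM in field:
            raise ValueError("field contains the delimiter: %r" % field)
    return DELIM.join(fields)


def unpack_fields(line):
    """Split a line written by pack_fields() back into its fields."""
    return line.rstrip("\r\n").split(DELIM)
```

The check in `pack_fields` covers the "seldom" case: if the delimiter ever does turn up in the data, you find out when writing rather than when reading back.
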
>
>
>    Hope this helps some, if not perhaps in another project. Good luck,
>               --Al Getz
