1. Searching for Data in EDS
- Posted by "C. K. Lester" <cklester at yahoo.com> Nov 11, 2002
Let's say I have records set up like this:

    key: X, data: Y

where X and Y are integers (atoms).

How would I find all records with Y = 2?

Do I just have to loop through all the records? What if I have a million records?!

Elsewhere, anybody already have good code for managing EDS files to handle cases like this (a function that returns a sequence with the record numbers of all records where Y = 2)...?

Thanks!
ck
2. Re: Searching for Data in EDS
- Posted by irv at take.maxleft.com Nov 11, 2002
On Monday 11 November 2002 03:27 pm, you wrote:
> > From: C. K. Lester [mailto:cklester at yahoo.com]
> >
> > Let's say I have records set up like this:
> >
> > key: X, data: Y
> >
> > where X and Y are integers (atoms).
> >
> > How would I find all records with Y = 2?
> >
> > Do I just have to loop through all the records? What if I have a million records?!
> >
> > Elsewhere, anybody already have good code for managing EDS files to handle cases like this (a function that returns a sequence with the record numbers of all records where Y = 2)...?
>
> Your best bet is to use indices (there was a fairly in depth treatment on this topic previously--I believe the question was regarding record albums, and I think Irv was the main responder), where you keep a separate table/record to remember all the keys which have certain properties.

If you are likely to have duplicate Y values, then you might save lookup time by creating a unique value index. For example, given the following records:

    key  val
    1    99
    2    102
    3    99
    4    25
    5    99

You can build an index like this:

    key  val
    25   {4}
    99   {1,3,5}
    102  {2}

So looking up a given value in the index is very fast (binary lookup?), and then looking up the individual records which contain that value is simply a matter of iterating through the short list.

Whether this is really practical depends on how many of your values are actual dupes, how often the database is updated, and whether there is a lot of extra data that goes with each record.

Regards,
Irv
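
For anyone who wants to see the unique value index spelled out, here is a minimal sketch. It is written in Python rather than Euphoria, and a plain dict stands in for the EDS table, so none of this is the EDS API. The names build_value_index and keys_with_value are made up for illustration. The data is the five-record example from the post above.

```python
from collections import defaultdict

# Main table modelled as a plain dict (key -> val).
# This is a stand-in for an EDS table, not the EDS API.
records = {1: 99, 2: 102, 3: 99, 4: 25, 5: 99}


def build_value_index(table):
    """Map each distinct value to the list of keys that hold it."""
    index = defaultdict(list)
    for key in sorted(table):          # sorted so each key list comes out in key order
        index[table[key]].append(key)
    return dict(index)


def keys_with_value(index, value):
    """Return the keys whose data equals value (empty list if there are none)."""
    return index.get(value, [])


value_index = build_value_index(records)
matches = keys_with_value(value_index, 99)   # keys of all records with val = 99
```

Building the index still costs one pass over every record, so the saving only shows up when the same index answers many lookups. A Python dict does a hash lookup rather than the binary lookup guessed at above, but the idea is the same: one fast probe into the index, then a short list of keys to walk.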
3. Re: Searching for Data in EDS
- Posted by "C. K. Lester" <cklester at yahoo.com> Nov 11, 2002
Irv, do you save the index to the database? I use key -1 as the field name record. I'm thinking to use -2 or somesuch negative value as the index. Is it proper to store the index in the database file itself, or should that be kept separate? I wouldn't think one would want to regen the index every time it's opened...
4. Re: Searching for Data in EDS
- Posted by irv at take.maxleft.com Nov 12, 2002
On Monday 11 November 2002 05:49 pm, you wrote:
> Irv, do you save the index to the database?
>
> I use key -1 as the field name record. I'm thinking to use -2 or somesuch negative value as the index. Is it proper to store the index in the database file itself, or should that be kept separate? I wouldn't think one would want to regen the index every time it's opened...

I store the indices as tables in the database. For a more realistic example, suppose I have an employee database, and want quick access by lastname, city, state, zipcode, work location.

My data would be stored in the main table with a unique key (ssn), and I would have additional tables for name_idx, city_idx, state_idx, zip_idx, loc_idx.

Since EDS has built-in protection against storing dupe keys, I can use very simple code to store the index. For example, to create zip_idx, I just read the main data table sequentially, extract each zipcode, and use that as a key into the zip_idx table. If a record with that key doesn't exist, I can add a record:

    key = zip
    data = {ssn}

If that key already exists, I can load that record, append the ssn to the data, and store it back, so the record now looks like:

    key = zip
    data = {ssn1, ssn2,....}

So you can see that getting all employees in a given zip code, for example, is dead simple and quick.

1. From the zip_idx table, look up the record with the key (zip).
2. For each ssn in the retrieved data, read the record from the main table using the key (ssn).

Tiny fraction of a second, and you're done.

Regenerating the indexes each time the db is opened is not necessary, but you do need to write code to update all indexes with each addition, deletion or change. But (based on experience) you should also write re-indexing routines; they will come in handy.

Regards,
Irv
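
The bookkeeping described above (update the index on every addition, deletion or change, plus a re-indexing routine) could look roughly like the sketch below. It is Python, not Euphoria, and two dicts stand in for the main table and the zip_idx table, so this is an illustration of the idea and not the EDS API. Every function name, the record layout with a "zip" field, and the sample ids are assumptions made up for the example.

```python
# Hypothetical in-memory stand-ins for two EDS tables (not the EDS API).
employees = {}   # main table: ssn -> record, a dict that has a "zip" field
zip_idx = {}     # index table: zip -> list of ssns


def index_add(zip_code, ssn):
    """Add ssn under zip_code, creating the index record if it is missing."""
    ssns = zip_idx.setdefault(zip_code, [])
    if ssn not in ssns:
        ssns.append(ssn)


def index_remove(zip_code, ssn):
    """Drop ssn from zip_code and delete the index record once it is empty."""
    ssns = zip_idx.get(zip_code, [])
    if ssn in ssns:
        ssns.remove(ssn)
    if not ssns:
        zip_idx.pop(zip_code, None)


def put_employee(ssn, record):
    """Insert or change a record, keeping zip_idx in step with the main table."""
    old = employees.get(ssn)
    if old is not None and old["zip"] != record["zip"]:
        index_remove(old["zip"], ssn)
    employees[ssn] = record
    index_add(record["zip"], ssn)


def delete_employee(ssn):
    """Delete a record and its index entry."""
    old = employees.pop(ssn, None)
    if old is not None:
        index_remove(old["zip"], ssn)


def employees_in_zip(zip_code):
    """Step 1: look up the zip in the index. Step 2: read each record by ssn."""
    return [employees[ssn] for ssn in zip_idx.get(zip_code, [])]


def reindex():
    """Rebuild zip_idx from scratch by reading the main table sequentially."""
    zip_idx.clear()
    for ssn, record in employees.items():
        index_add(record["zip"], ssn)


# Made-up sample data.
put_employee("111", {"name": "Ann", "zip": "30301"})
put_employee("222", {"name": "Bob", "zip": "30301"})
put_employee("333", {"name": "Cy", "zip": "10001"})
```

Routing every change through put_employee and delete_employee is what keeps the index honest; reindex is the safety net for when something has written to the main table behind the index's back.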