1. Robert...EDS - questions/comments
- Posted by Jonas Temple <jktemple at yhti.net> Feb 12, 2001
- 507 views
Robert,

I have some questions/comments concerning the EDS:

1. Say I have a table with a key structure of {atom, atom}. I insert 3 records with the following key values: {1, 1} {1, 2} {1, 3}. I add another 10,000 entries with other key values. I then add {1, 4} {1, 5}. Now I want to get all records with the first key value of 1. Assuming I use db_find_key and find and read the first three records, would I have to read through the next 10,000 to get the last two records? If I checked for the first key value to change and then dropped out of my read loop, I would think I would miss the last two records for key value of 1. Is this right? If so, would it be possible to add a function to "retrieve_next_record_by_partial_key"? I suppose I could increment the second key value by 1 and use db_find_key, but what if the second key values were not sequential?

2. Somewhat relating to question 1, if I used db_find_key with a key value of {1, 0} (assuming there will never be a record with a second key value of 0) I would get a negative result. If I am reading the docs right, the negative number would be the record number if it were inserted into the file. For example, if db_find_key returned -4, would this mean that if I were to retrieve record 4, it would have a key value of {1, 1} (assuming key value {1, 1} is in the file)?

3. Let me say that I really like the simplicity and ease of use of the EDS. I have written programs using direct calls to Borland's Database Engine and it's not pretty.

Jonas
2. Re: Robert...EDS - questions/comments
- Posted by matthewwalkerlewis at YAHOO.COM Feb 12, 2001
- 500 views
--- Jonas Temple <jktemple at yhti.net> wrote:

> 1. Say I have a table with a key structure of {atom, atom}. I insert
> 3 records with the following key values: {1, 1} {1, 2} {1, 3}. I add
> another 10,000 entries with other key values. I then add {1, 4} {1, 5}.
> Now I want to get all records with the first key value of 1. Assuming I
> use db_find_key and find and read the first three records, would I have
> to read through the next 10,000 to get the last two records? If I
> checked for the first key value to change and then drop out of my read
> loop I would think I would miss the last two records for key value of 1.
> Is this right?

No. The records are automatically 'sorted' by key, so the record numbers follow key order.

Another way to do this would be to use indices with your db. I have some plans to do just that with my EuSQL package, which should make [some] queries very fast. Right now I'm getting insert/update queries to work with literal values and parameters. Probably indices will be the next thing I tackle (don't hold your breath, though :).

Currently, EuSQL would search through all records looking for keys starting with '1', although the syntax for doing so would be easy, assuming you've defined the key as a couple of fields: "select * from tablename where key.field1 = 1". (Actually, you can't use '*' in the version currently at RDS, but I should probably have an update by the end of the week which will allow delete, insert, update and parameterized queries.)

Matt Lewis
3. Re: Robert...EDS - questions/comments
- Posted by Kat <gertie at PELL.NET> Feb 12, 2001
- 484 views
On 12 Feb 2001, at 12:34, matthewwalkerlewis at YAHOO.COM wrote:

> Another way to do this would be to use indices with your db. I have some plans
> to do just that with my EuSQL package, which should make [some] queries very
> fast.

I have a problem that can only be solved with brute force, since it involves comparisons with 150,000 words, minimum, and if that works out, it could be expanded to a million. Problem is, 150K comparisons take 82 seconds on my K2-6-266, and i am not convinced buying a faster puter will solve it, simply because that is a software-based retrieval and compare solution. Even if a 5x faster 1 GHz dedicated puter were thrown at the problem, a 10-word sentence would still take about 3 minutes to run, which is intolerable.

I wish i could get my hands on one or more of the mythical Lisp machines, where hardware was thrown at the problem. Has anyone else given this a thought? Has anyone here met one of these machines, or know about them?

Kat
4. Re: Robert...EDS - questions/comments
- Posted by Michael Sabal <mikes at notations.com> Feb 12, 2001
- 496 views
Could I see your compare routine? Maybe there's a way to sort the data so 150K~1M compares aren't necessary. (NB: the sort routine could be run when Tiggr recognizes she's the only one in the room. The learning may be a bit slower if there is significant lag between updating the DB and sorting it; but then a priority level could be assigned to that task to let it be done more often.)

Just a thot....

Michael J. Sabal

>>> gertie at PELL.NET 02/12/01 04:06PM >>>
I have a problem that can only be solved with brute force, since it involves comparisons with 150,000 words, minimum, and if that works out, it could be expanded to a million.
5. Re: Robert...EDS - questions/comments
- Posted by Kat <gertie at PELL.NET> Feb 12, 2001
- 477 views
On 12 Feb 2001, at 13:19, Michael Sabal wrote:

> Could I see your compare routine? Maybe there's a way to sort the data so
> 150K~1M compares aren't necessary.

If one assumes every word in the sentence is a typo, then every word must be compared to find a tree of possible correct words for the entire sentence. Anything else is throwing away info.

> (NB: the sort routine could be run when Tiggr recognizes she's
> the only one in the room.

Sorta useless to reply to conversation only after everyone leaves, isn't it?

Kat

> The learning may be a bit slower if there is significant lag
> between updating the DB and sorting it; but then a priority level could be
> assigned to that task to let it be done more often.)
>
> Just a thot....
>
> Michael J. Sabal
>
> >>> gertie at PELL.NET 02/12/01 04:06PM >>>
> I have a problem that can only be solved with brute force, since it involves
> comparisons with 150,000 words, minimum, and if that works out, it could be
> expanded to a million.
6. Re: Robert...EDS - questions/comments
- Posted by Robert Craig <rds at RapidEuphoria.com> Feb 12, 2001
- 483 views
Jonas Temple writes:

> 1. Say I have a table with a key structure of {atom, atom}.
> I insert 3 records with the following key values:
> {1, 1} {1, 2} {1, 3}.
> I add another 10,000 entries with other key values.
> I then add {1, 4} {1, 5}.
> Now I want to get all records with the first key value of 1.
> Assuming I use db_find_key and find and read the
> first three records, would I have to read through the
> next 10,000 to get the last two records?

No. As Matt Lewis pointed out, the records are always organized in order of key value. That allows a fast binary search to be used to find any key. Sequences are sorted in the usual "alphabetic" way, with the first element being the most significant.

> 2. Somewhat relating to question 1, if I used db_find_key
> with a key value of {1, 0} (assuming there will never be
> a record with a second key value of 0) I would get
> a negative result.

Yes.

> If I am reading the docs right, the negative number
> would be the record number if it were inserted into the file.

Yes.

> For example, if the db_find_key returned -4, would this mean
> that if I were to retrieve record 4, would it have a key
> value of {1, 1} (assuming key value {1, 1} is in the file)?

Yes, the -4 tells you that if {1,0} were inserted right now, it would be the 4th record. Since you haven't actually inserted it yet, the current 4th record is the record that would come right after {1,0}: {1,1} in your case.

Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
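[Editor's illustration] The behaviour Rob describes can be sketched outside Euphoria. The following Python is only a model of the idea, not EDS code: `find_key` and `records_with_first` are hypothetical helpers, a sorted Python list stands in for the table, and tuple comparison stands in for Euphoria's element-by-element sequence ordering. It assumes, as stated above, that records are kept in key order and that a missing key yields the negative of the position where it would be inserted.

```python
from bisect import bisect_left

def find_key(keys, key):
    """Model of the lookup described above (hypothetical helper, not EDS).

    Returns the 1-based record number of `key`, or the negative of the
    1-based position where it would be inserted. `keys` must be sorted.
    """
    i = bisect_left(keys, key)          # binary search
    if i < len(keys) and keys[i] == key:
        return i + 1
    return -(i + 1)

def records_with_first(keys, first):
    """Collect every key whose first element equals `first`.

    Probes with a key that sorts before any real key for `first`, then
    reads forward until the first element changes.
    """
    pos = abs(find_key(keys, (first, float("-inf"))))
    found = []
    while pos <= len(keys) and keys[pos - 1][0] == first:
        found.append(keys[pos - 1])
        pos += 1
    return found

# Jonas's scenario: three keys, 10,000 unrelated keys, then two more.
# The three (0, n) keys are added so that {1, 0} would land in position 4.
inserted = [(1, 1), (1, 2), (1, 3)]
inserted += [(0, n) for n in range(3)]
inserted += [(2, n) for n in range(10000)]
inserted += [(1, 4), (1, 5)]
keys = sorted(inserted)                 # the table keeps key order

print(find_key(keys, (1, 0)))           # -4: would become the 4th record
print(keys[4 - 1])                      # (1, 1): the current 4th record
print(records_with_first(keys, 1))      # all five keys with first element 1
```

Because the keys stay in order, the five records with first element 1 sit next to each other no matter when they were inserted, so the read loop can stop as soon as the first element changes and never touches the 10,000 unrelated records.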
7. Re: Robert...EDS - questions/comments
- Posted by Michael Sabal <mikes at notations.com> Feb 13, 2001
- 519 views
I'm going to make a few assumptions about your dictionary format that may be way off. If they are, then probably most of what I say will be useless; but here goes anyway.

I assume you have a dictionary that Tiggr can look in to determine the meaning of words she reads, and in which she can find appropriate words with which to respond. In order to know the appropriateness of a word, I assume you have a class field as part of the dictionary entry. For example, a homonym with both a casual class entry and a technical class entry, but different meanings, would have to be decided between based on the context of the discussion.

You must have a means for Tiggr to learn new vocabulary based on the context. Why not allow Tiggr to learn typos in the same way she learns other vocabulary, but with a class that prevents her from using the typo in her own responses?

Also consider that about half the words in a sentence are grammatical. That means that, based on the position in the sentence, grammatical typos need only be compared to grammatical words and not the other half-million nouns, verbs, adjectives, etc. in the dictionary. If the word expected is a noun, compare the typo to only nouns, etc. This is a large reason why I chose to go with a hex-based language for internal processing, even though the overhead of translating to that language would be a bit higher.

As for my statement about sorting during down-times, I was referring only to sorting, not to comparing. The compare would obviously have to happen during the conversation.

Michael J. Sabal

>>> gertie at PELL.NET 02/12/01 04:41PM >>>
If one assumes every word in the sentence is a typo, then every word must be compared to find a tree of possible correct words for the entire sentence. Anything else is throwing away info.

> (NB: the sort routine could be run when Tiggr recognizes she's
> the only one in the room.

Sorta useless to reply to conversation only after everyone leaves, isn't it?

Kat
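[Editor's illustration] Michael's suggestion of comparing a suspected typo only against words of the expected class can be sketched roughly as follows. Everything here is an assumption made for illustration: the tiny lexicon, the class names, and the helpers `build_index` and `candidates` are hypothetical, and Python's standard `difflib.get_close_matches` is swapped in as a generic similarity measure. It is not Kat's compare routine, which the thread never shows.

```python
from collections import defaultdict
from difflib import get_close_matches

def build_index(lexicon):
    """Group words by class once, ahead of time (hypothetical helper)."""
    by_class = defaultdict(list)
    for word, word_class in lexicon:
        by_class[word_class].append(word)
    return by_class

def candidates(by_class, typo, expected_class, n=3):
    """Compare `typo` only against words of the expected class."""
    pool = by_class.get(expected_class, [])
    return get_close_matches(typo, pool, n=n)

# Toy lexicon; a real dictionary would hold 150,000 or more entries.
lexicon = [
    ("the", "grammatical"), ("and", "grammatical"),
    ("of", "grammatical"), ("with", "grammatical"),
    ("tea", "noun"), ("ten", "noun"), ("tree", "noun"),
]
index = build_index(lexicon)

# Only the four grammatical words are compared; the nouns are skipped.
print(candidates(index, "teh", "grammatical"))
```

The saving comes from shrinking the pool before any comparison runs. Whether the expected class can be predicted reliably from sentence position is the open question Kat raises, since a wrong guess would discard the correct word.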