Re: RDMS Database's and/or Record Managers What good is Euphoria?
- Posted by "Kat" <gertie at visionsix.com> Jan 23, 2004
On 23 Jan 2004, at 2:20, Isaac Raway wrote:

> Kat wrote:
>
> >>Well, I've never used strtok, but I will give it a try now that you
> >>mention it. It looks to be most useful for parsing source code (which is
> >>great), but not for creating a database. Databases should not be doing any
> >>string tokenization of any kind. That wouldn't be a database, it'd be a flat
> >>text file.
> >
> >Strtok was written to do data handling as far away from strict Pascal and C
> >records as possible. Been there, done that, not again.
>
> I didn't say anything about Pascal or C records. I'm talking about
> records in a relational database table, or a more flexible system in an
> object persistence system.

I didn't write the disk access methods, or the memory storage methods, used
by euphoria programs. And i didn't say the strtok lib made up a whole and
complete database "system".

> >> Databases use virtual file systems with paged
> >>storage methods and separate indexes, and most importantly, always
> >>access their data files in a binary fashion.
> >
> >Not always. I found most of my accesses are wildmatches. Can't do a btree
> >with wildcards. And as for "relational", Tiggr's has pointers to other
> >data items, executable code, etc. And the db is not stored in memory.
> >And the
>
> I didn't mean memory by paged storage methods. I mean that the
> database's actual disk file uses its own pages, or sectors, to store
> the data. So, it allocates pages when needed, and marks them as
> deleted to be reused just like a physical hard drive. For seriously
> large databases with variable sized records, this is the only practical
> way to handle the data.

I didn't write the disk access methods, or the memory storage methods, used
by euphoria programs. And i didn't say the strtok lib made up a whole and
complete database "system".
Besides, for R&D purposes, my purposes, for this moment, right now, not
necessarily appropriate for you, i prefer they stay in a human-readable form.
Making them fixed-anything, or coded, or anysuch munging would just make
things more difficult for me right now.

> >contents are not fixed in any way or form.
>
> For a business application, that could be a bad thing, though personally
> I do like flexibility to add fields to records, etc.

To each their own solution.

> >If i turn her loose, there is no way
> >the 5 gigs of data can be put in a Eu sequence in memory, i'd need 20gig to
> >load it, and another 40gig to manipulate it, and i don't have the money to
> >waste like that. How you store and access the data on the drive is not
> >strtok's business.
>
> It should be, because serious databases have most of their data on the
> HDD most of the time.

I didn't write the disk access methods, or the memory storage methods, used
by euphoria programs. And i didn't say the strtok lib made up a whole and
complete database "system".

> >But recently, someone asked about the access speed for a 1meg file, i gave
> >code to read it in one sec, and parse the lines out in 1-2 sec (233mhz
> >cpu), as if each line was a data item. Each line could have held separate
> >fields, free form lengths, and each of those could have held more fields,
> >etc etc.
>
> Most SQLs read from multimegabyte files much faster than that. If you
> can count access time in seconds, you're in trouble.

I didn't write the disk access methods, or the memory storage methods, used
by euphoria programs. And i didn't say the strtok lib made up a whole and
complete database "system". Besides, not much happens quickly on a
multitasking OS like win95 on a 233mhz cpu. I am sure it would have done
faster on dos32. And the time i quoted was one file, of one megabyte, not a
small file on a database of 1 million records. The small file in one million
records would have been read much much faster.
Heck, using a direct windoze api call would have been much faster.

> Through all of this discussion, please keep in mind that I am talking
> about truly massive databases. At least thousands of records.

Really, "at least thousands"? Let me look at harddrive T:/ here.... 3.4
Gigabytes, 353,815 "records" in 1,185 "folders". The Eu plugin for mirc could
open 15 "folders", read in one record from each, parse out a field from each
record, parse out the fields in that field, and display the data almost
before i could get my finger off the <enter> key. I was doing natural
language parsing with 150,000+ "records", each with multiply nested
"sub-records". I used getxml() heavily, which isn't in the current release of
strtok, no one was interested in it.

To get carried away, my translator files have outgrown the 2 gigabyte
partition they started out on. Is 2Gigs a big enough database for you to
consider "massive"? That's E:/. One of its "records" has 355,742
"sub-records". How about if you add the aforementioned T:/ to E:/? What if
you toss in F:/ with its 4.6 Gigs of data (in 105,052 "records")?

<kat> apa khabar?
<[Tiggr]> all's kool here

> >YMMV
>
> Forgive me, I don't understand this... ?

<kat> Tiggr, can you tell me what YMMV means?
<[Tiggr]> Your Mileage May Vary

Kat, just luvs it when people feel the need to explain such simple concepts
as "massive" to her.
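
For anyone trying to picture the "fields inside fields" idea and the wildcard lookups mentioned above, here is a rough sketch. It is written in Python, not Euphoria, and it is not the strtok library's API: the function names `parse_nested` and `wild_lines`, the delimiters, and the sample data are all made up for illustration.

```python
from fnmatch import fnmatchcase


def parse_nested(text, delimiters):
    """Split text on the first delimiter, then split every piece on the
    next delimiter, and so on, giving nested lists of free-length fields."""
    if not delimiters:
        return text
    head, rest = delimiters[0], delimiters[1:]
    return [parse_nested(piece, rest) for piece in text.split(head)]


def wild_lines(lines, pattern):
    """Return the lines matching a shell-style wildcard pattern.
    This is a plain linear scan: no index, no btree."""
    return [line for line in lines if fnmatchcase(line, pattern)]


# Hypothetical data: one record per line, '|' between fields,
# ',' between sub-fields. Nothing here has a fixed length.
SAMPLE = "cat|noun|feline,pet\nrun|verb|move,operate,flow"

lines = SAMPLE.splitlines()
records = [parse_nested(line, ["|", ","]) for line in lines]
verbs = wild_lines(lines, "*|verb|*")
```

The records stay human-readable on disk, and adding a field or a sub-field needs no schema change. The price is that every wildcard query walks the whole list of lines.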
( say something Lomax.... )
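
The paged-file scheme Isaac describes in the quoted text (allocate pages when needed, mark deleted ones for reuse) can be sketched roughly as below. This is a toy in Python under stated assumptions, not how any particular database or any Euphoria library does it: the class name `PagedFile`, the 64-byte `PAGE_SIZE`, and the in-memory free list are all hypothetical, and a real system would also persist the free list and handle records larger than one page.

```python
import io

PAGE_SIZE = 64  # hypothetical; real databases use much larger pages


class PagedFile:
    """Toy fixed-size page store on top of a seekable binary file object."""

    def __init__(self, fileobj):
        self.f = fileobj
        self.page_count = 0
        self.free = []  # page numbers marked deleted, available for reuse

    def allocate(self):
        """Reuse a deleted page if there is one, else grow the file by a page."""
        if self.free:
            return self.free.pop()
        page = self.page_count
        self.page_count += 1
        self.f.seek(page * PAGE_SIZE)
        self.f.write(b"\0" * PAGE_SIZE)
        return page

    def write(self, page, data):
        """Store data in a page, zero-padded to the page size."""
        if len(data) > PAGE_SIZE:
            raise ValueError("data does not fit in one page")
        self.f.seek(page * PAGE_SIZE)
        self.f.write(data.ljust(PAGE_SIZE, b"\0"))

    def read(self, page):
        """Return the page contents with the zero padding stripped."""
        self.f.seek(page * PAGE_SIZE)
        return self.f.read(PAGE_SIZE).rstrip(b"\0")

    def delete(self, page):
        """Mark a page as deleted; the file itself does not shrink."""
        self.free.append(page)


store = PagedFile(io.BytesIO())
first = store.allocate()
second = store.allocate()
store.write(first, b"old record")
store.write(second, b"kept record")
store.delete(first)
reused = store.allocate()          # gets the deleted page back
store.write(reused, b"new record")
```

Deleting never moves data around; it only records that a page may be overwritten, which is why the file stays the same size until new pages are actually needed.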