Re: Out of memory error
- Posted by Larry D Poos <ldpoos at JUNO.COM> Apr 27, 1997
On Fri, 25 Apr 1997 21:41:28 -0400 Robert Craig writes:

>Larry D Poos writes:
>> When I ran it in a DOS window with Win3.1 and it read the entire file
>The filesort.ex demo program (less than 20 lines of code) is pretty
>fast at sorting data up to a "reasonable" size. If you want to push the
>limits of your machine's memory you should consider the following:
>
> 1. Euphoria stores all integer values in 4 bytes. i.e. characters
>    are stored in 4 bytes (not 1). This might change in the future,
>    but for now it makes the implementation simple and fast.

Yes, that almost maxes out the available memory: 1.9M characters x 4 bytes = 7.6MB of the 8MB available. This code will have to work on machines with a minimum of 1MB, so a swap file is the only way to handle it.

> 2. If you were to use something like great_sort of allsorts.ex you
>    could sort buffer in-place.

Yep, did that too. Swapping was still unacceptable in the Windows environment, and I still could not get past the read loop without the "out of memory" error in the DOS environment. I disabled the RAM drive, which opened up an additional 10MB of memory and got it past the read loop, but even great_sort was too slow (18 min) on this file. I have a program using a radix sort that benched this same file in 1.5 min; if I can find the radix sort algorithms and understand how they work, then I can write one for the Euphoria language. Once I get the code to work as an entity, I can work on the memory-swapping problem, as I now feel it lies with the extender in DOS mode.

> 3. The disk swapping feature of the DOS extender has its
>    limitations, as you've discovered.

I think this is what is causing it to choke with the Buffer variable being so large, since I can get past the read loop when I run in a DOS box under Windows. If I understand your docs correctly, when run under Windows Euphoria uses the Windows swap protocol, and when run under DOS it uses the extender's swap protocol.
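For what it's worth, the radix sort idea can be sketched as follows. This is only an illustrative LSD (least-significant-digit-first) radix sort written in Python, not the actual benchmarked program, which would have been Euphoria or compiled code; it shows why the approach is fast: one linear bucketing pass per character position instead of pairwise comparisons.

```python
def radix_sort_lines(lines):
    """LSD radix sort for a list of byte strings (e.g. file lines).

    Runs in O(w * n) where w is the longest line length -- this linear
    behaviour is why a radix sort can beat O(n log n) comparison sorts
    on a large tagline file.
    """
    if not lines:
        return lines
    width = max(len(s) for s in lines)
    # One pass per character position, rightmost (least significant) first.
    for pos in range(width - 1, -1, -1):
        buckets = [[] for _ in range(257)]  # 256 byte values + "line too short"
        for s in lines:
            # Lines shorter than pos+1 characters sort before everything else.
            key = s[pos] + 1 if pos < len(s) else 0
            buckets[key].append(s)
        # Stable concatenation of buckets preserves earlier passes' order.
        lines = [s for b in buckets for s in b]
    return lines
```

Because each pass is stable, after the final (leftmost) pass the lines are in full lexicographic order.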
>I once experimented a bit with this, figuring that
>a "divide and conquer" type of sort, such as quick sort (allsorts.ex) might
>be better suited to LRU swapping. I think the answer was yes, but it was
>still pretty slow.

I think you are correct on this; a "divide and conquer" method is going to be the only way to handle the large files. The trick will be to develop the fastest possible algorithm for the range of file sizes expected to be encountered.

> 4. For Euphoria Gurus Only: If you have lots of duplicate lines of
>    input, you might try matching each incoming line against lines you've
>    already read in. You'll need a very fast check for duplicates, or this
>    will be slow.

This speed problem is what I want to overcome in the "steal taglines" portion of the program. Selecting the line to steal is not a problem; it is the dup checking that gets slow with large tagline files, especially if you want to use fuzzy-logic dup checking. I feel that ordering both the source and destination files before the check would speed this up a bit by allowing a type of indexed look-up in the destination file before the dup check. At some point, though, the product has to be combined and ordered. If it can't be read in and ordered as one collection of objects, then it has to be split, operated on, and merged, which may cost speed, as those steps require disk I/O vs. RAM I/O.

Thanks for your insight and help with my difficulty.

Larry D. Poos -[USMC (Retar{bks}{bks}ired) Havelock, NC]-
e-mail: ldpoos at juno.com or Fido: 1:3629/101.6
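As a footnote on the "sort both files first, then check" idea: once the source and destination lists are each sorted, the combine-and-dup-check step collapses into a single linear merge. The sketch below is a hypothetical illustration in Python (the names are mine, not from any of the programs discussed); it merges two sorted lists while dropping exact duplicates in one pass.

```python
def merge_unique(sorted_a, sorted_b):
    """Merge two already-sorted sequences into one sorted list with
    exact duplicates removed.

    Each input is consumed once, so the dup check costs O(n + m)
    instead of comparing every incoming line against every stored
    line, which is O(n * m).
    """
    out = []
    i = j = 0
    while i < len(sorted_a) or j < len(sorted_b):
        # Take the smaller head; exhaust the other list when one runs out.
        if j >= len(sorted_b) or (i < len(sorted_a)
                                  and sorted_a[i] <= sorted_b[j]):
            item = sorted_a[i]
            i += 1
        else:
            item = sorted_b[j]
            j += 1
        if not out or out[-1] != item:  # skip exact duplicates
            out.append(item)
    return out
```

Note that this only catches exact duplicates; fuzzy-logic dup checking would still need a comparison window around each insertion point, though sorting at least keeps near-identical lines adjacent.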