RE: String compression
- Posted by "Kat" <gertie at visionsix.com> Dec 17, 2003
On 16 Dec 2003, at 19:53, Brian Broker wrote:

> If anybody was so concerned about string storage to implement such a
> scheme I'd tell them it's time to upgrade their system or start
> programming in a lower level language... Did you know you can buy a 512MB
> stick of PC100 RAM for less than $50? That's less than 10 cents per
> Meg. If I did my math right, that's over 31,000 characters per penny
> using a plain old sequence.

Hmm, 512 megabytes: that's 100 for the OS to play in (browser, email, DLLs, etc.), leaving 400 for Eu. At 4 bytes per sequence element, that's one 100-megabyte sequence. Leaving enough room to provide scratch space for operations on the sequence, I'd leave space for 3 instances of the sequence or its parts, which leaves 33 megabytes for the one sequence. Or $1.51 / megabyte of useful room. Replacing sequence-type with ascii-string-type would allow 132 megs of useful space, 4x as much.

If Eu gets ported to 64-bit CPUs, a sequence will consume 8 bytes per 7-bit ASCII character. That's a different thought. But I could have sworn there already were machine-level peek/poke routines in the archives to store 4 ASCII chars per 32-bit memory word.

Kat

> Just a thought,
> -- Brian
>
> Hayden McKay wrote:
> > I noticed that people were concerned about string memory usage so I came up
> > with this. To my knowledge an integer is stored as an integer regardless of
> > its length (since computers store such things in binary). Please correct me
> > if I'm wrong. It's not much but a good start, modify it as you see fit.
> >
> > include machine.e
> >
> > -- atom = p_ints(integer a, integer b)
> > -- Pack two 16-bit integers into one 32-bit value (a in the low bits).
> > -- Returns an atom: the result can exceed Euphoria's 31-bit integer range.
> > global function p_ints(integer a, integer b)
> >     sequence s
> >     atom n
> >     s = int_to_bits(a, 16)
> >     s &= int_to_bits(b, 16)
> >     n = bits_to_int(s)
> >     return n
> > end function
> >
> > -- sequence = u_ints(atom n)
> > -- Unpack a 32-bit value into two 16-bit integers.
> > -- Returns a two-element sequence {a, b}.
> > -- The parameter is an atom, not an integer, so that packed values
> > -- above the 31-bit integer range do not fail the type check.
> > global function u_ints(atom n)
> >     sequence s
> >     object a
> >     s = int_to_bits(n, 32)
> >     a = bits_to_int(s[1..16])
> >     a &= bits_to_int(s[17..32])
> >     return a
> > end function
> >
> > I have another routine that compresses whole words into an integer, but
> > since two different words can equal the same integer I run into problems
> > reversing it. (It only works for certain words, about 5 - 10% of the time.
> > 95% of the time I get different run-time errors.) I have three theories to
> > overcome this:
> > 1. Store a digit on the end as a seed for reversing the word,
> >    e.g. the length of the word.
> > 2. Compress the word to a decimal atom that contains a seed in itself.
> > 3. It cannot be done!
> >
> > If anyone knows where I can find related topics it would be very helpful.
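
For what it's worth, the peek/poke idea mentioned in the reply (4 ASCII chars per 32-bit memory word, i.e. one byte per character) might look roughly like the sketch below. This is not the routine from the archives; store_string() and fetch_string() are hypothetical names made up for illustration, and the sketch assumes a non-empty string whose characters all fit in 8 bits.

```euphoria
include machine.e   -- allocate(), free()

-- Hypothetical helper: copy a string into raw memory, one byte per character.
-- Returns {address, length}; the caller must free() the address when done.
-- Assumes s is non-empty and every element is in the range 0..255.
function store_string(sequence s)
    atom addr
    addr = allocate(length(s))
    poke(addr, s)               -- poke writes one byte per sequence element
    return {addr, length(s)}
end function

-- Hypothetical helper: read the bytes back into an ordinary sequence.
function fetch_string(sequence handle)
    return peek(handle)         -- peek({addr, len}) returns len bytes
end function

sequence h
h = store_string("Hello, Euphoria")
puts(1, fetch_string(h) & '\n')
free(h[1])
```

The trade-off is that the stored string is no longer a sequence: it has to be fetched back before it can be sliced or concatenated, and the memory has to be freed by hand.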