1. Bit-Addon & Compression
- Posted by Ralf Nieuwenhuijsen <nieuwen at XS4ALL.NL> Sep 22, 1997
- 785 views
- Last edited Sep 23, 1997
--------------B346DE9BAF9CC92B22013CD4 Included in this mail is my eupbit.e It is an add-on which you can use instead of file.e It also replaces a few in-built routines. (puts, gets, getc, open, close, etc.) They will work the same way the built-in wil, but.. you now have these 6 new routines also.. + put_bits (file_handle, boolean or bit-sequence) This routine will write out a bit or a sequence of bits to the file_handle. Example: put_bits(1,{1,0,1}) This will write the bits 1,0,1 to the screen. (Actually nothing is written yet, until all the 8 bits are known.) You can also use it like this: put_bits(1,1) Which will write bit 1 to the screen. But if you know write "Hello" with puts, all the bits are moved one place! So to say things clearly... you can now write bytes and bits to any file or device!! The rest of the routines are: + get_bit (file_handle) Returns one bit from file_handle + get_bits (file_handle, number of bits) Returns a sequence of bits + set_cp_normal (file_handle) Sets normal compression (lzh) Very slow.. but works. All output is compressed and/or all input is decompressed. + set_cp_off (file_handle) Turns compression off. + set_cp_full (file_handle) Doesn't work yet, at this point it gives the same compression as lzh. I am working on a lzh with binairy-tree. The compression is *very* slow, but at this point i was already happy when it worked, i will now try to optimize it, i think a lot of optimization can be done (to speed and size). For ya people out there knowing what lzh is, i now use 2(!!) bytes for the look-up-table, but have already made restrictions that the table can't be bigger than 32678. (12 bytes instead of 16) When i optimize this, the compression ratio will also increase. Enjoy until an upgrade.... (BTW I created this as an basis for EDOM 2) BTW2 is there any1 out there that really understands arithematic.. (yeah i know it is slower.. but in Euphoria the difference might not win, since real-time data manulation in Euphoria is offcourse a lot slower that in C or ASM (and sorry, but their code would also be lesss complicated, cause i have to cover the bridge from the simplicity of Euphoria to the complicated world of pc-dos/fat-file-system/bytes-bits eh. you get my point) Ralf Nieuwenhuijsen nieuwen at xs4all.nl (BTW3 Until i get some bugs out, the routine will not allow you to open a file for update... simply because at this point i will corrupt the file!) --------------B346DE9BAF9CC92B22013CD4 Content-Disposition: inline; filename="Eupbit.e" -- The trick is (a note to remember)... -- Setting a bit positive --> or_bits(i,val_pos[bit_to_set]) -- Setinng a bit negative --> and_bits(i,val_neg[bit_to_set]) -- I wrote this here so we all can look it up easily... -- BiTwIsE Add-On Alpha Version For Euphoria with problely many bugs. -- BBBBBBB iii TTTTTTTTTTTT -- BB BB iiiii TTTT & COMPRESSION -- BB BB iii TT c o m p r e s s i o n -- BBBBB B_I_T B_I_T TT B_I_T -- BB BB wise iii wise TT wise -- BB BB iii TT -- BBBBBBB iiiii TTTT by Ralf Nieuwenhuijsen -- FreeWare to all, please improve, change as you like. -- I'm *not* responsible for whatever happens to your computer your life, -- your pet, your haircut, etc. -- SO USE AT YOUR OWN RISK -- ------<<.. Euphoria BitWise Add-On & Built-In Compression Add-On ..>>-------- -- Completely programmed by Ralf Nieuwenhuijsen. -- For more information you can contact this 15-years old boy at: -- .._/^home_addres^\_.. >|< .._/^email_addres^\_.. -- Schoener 22 >|< -- 2401 MT >|< nieuwen at xs4all.nl -- Alphen A/D Rijn >|< -- Holland >|< -- Please do so for any comment, question, bomb, etc. -- If you use this code for anything commercial *please* give me a -- a copy of your program, althrough it is your own choice. -- This library is FreeWare !!!! -- See the file 'readme.txt' for instructions for any how-to's. -- Please report any bugs, language mistakes and misleading explenation -- to above addresses. (Either by normal or e-mail) ---------------------<<.. Code starts here..>>------------------------------- --<<.. Recommended settings are ..>> -- without type_check -- without warning -- without trace --<<.. Constants & Locals ..>> constant val_pos = { 1, 2, 4, 8, 16, 32, 64, 128} constant val_neg = {254, 253, 251, 247, 239, 223, 191, 127} integer dummy_i dummy_i = 0 --<<.. Organizing the Bit Handlers ..>> sequence bit_pos -- Saves the current bit position sequence cur_char -- Saves the current character sequence read_mode -- Saves the read-enable mode value sequence write_mode -- Saves the write-enable value sequence cp_mode -- Saves the compression mode sequence cs_mode -- Saves the 'changed ?' mode --<<.. Compression Tables ..>> sequence ctable -- Saves the data tables for the compression sequence cfreq -- Saves the table entries frequentie sequence cbc -- Saves the tables binairy codes sequence cstack -- Saves the stack sequence cpoint -- Saves the update point sequence clast -- Saves the last character written constant CP_OFF = 0 -- No compression constant CP_NORMAL = 1 -- Normal LZH compression constant CP_FULL = 2 -- Huffman & LZH compression (slow) constant CP_UPDATE_RATE = 1 -- Update rate for frequency list -- Higher is speed+ and size- -- Lower is speed- and size+ global integer CP_DEFAULT -- Default compression method CP_DEFAULT = CP_OFF -- No compression as default --<<.. Initializing the default file-handlers (screen & error-out) ..>> -- Euphoria has two default file-handlers: -- + The terminal -the screen- output will generate ASCII-text -- + The error-out -the screen- output will generate ASCII-text -- I really *don't* know the use of the error-out, but it is supported bit_pos = {1,1} -- Both screen and error-out start normal cur_char = {0,0} -- + with value #00 read_mode = {0,0} -- + non-readable device write_mode = {1,1} -- + writable device cp_mode = {0,0} -- + no-compression cs_mode = {0,0} -- + not changed yet ctable = {{},{}} -- + empty tables cfreq = {repeat(0,256),repeat(0,256)} clast = {0,0} cbc = {repeat({},256),repeat({},256)} cstack = {{},{}} cpoint = {0,0} --<<.. Types ..>> type bool_o ( object i ) -- Boolean object -- either an atom of 1 or 0 or an 1D sequence with 1 and 0's if sequence(i) then for j = 1 to length(i) do if not atom(i[j]) then return 0 end if if i[j] != 0 and i[j] != 1 then return 0 end if end for else return (i = 0) or (i = 1) end if return 1 end type type file_handle ( integer f ) -- File Handler object can NOT be negative return f >= 0 end type --<<.. Update Compression Routine ..>> procedure update_cp (file_handle f) atom min integer wim if length(cfreq) >= 32768 - CP_UPDATE_RATE then for d = 1 to CP_UPDATE_RATE do if length(cfreq) < 257 then return end if min = cfreq[f][1] wim = 1 for j = 257 to length(cfreq[f]) do if cfreq[f][j] > min then min = cfreq[f][j] wim = j end if end for cfreq[f] = cfreq[f][1..wim-1] & cfreq[f][wim+1..length(cfreq)] ctable[f] = ctable[f][1..wim-1] & cfreq[f][wim+1..length(cfreq)] end for end if end procedure --<<.. Real Routines ..>> -- Still be able to call the built-in routines function real_open (sequence s1, sequence s2) return open(s1,s2) end function procedure real_close (file_handle i) if write_mode[i] then if cp_mode[i] then if clast[i] != 0 then puts(i,remainder(clast[i], 256)) puts(i,clast[i]/256) end if end if end if close(i) end procedure procedure real_puts (file_handle f, object x) if cp_mode[f] then if sequence(x) then for j = 1 to length(x) do if cpoint[f] = 0 then cpoint[f] = CP_UPDATE_RATE update_cp (f) end if if clast[f] = 0 then clast[f] = x[j] else if clast[f] < 256 then dummy_i = find ({clast[f],x[j]},ctable[f]) + 255 if dummy_i = 255 then ctable[f] = append(ctable[f],{clast[f],x[j]}) cfreq[f] = append(cfreq[f],1) cfreq[f][clast[f]] = cfreq[f][clast[f]] + 1 puts(f,remainder(clast[f], 256)) puts(f,clast[f]/256) puts(f,x[j]) clast[f] = 0 else clast[f] = dummy_i end if else dummy_i = find (append(ctable[f][clast[f]-255],x[j]),ctable[f]) + 255 if dummy_i = 255 then ctable[f] = append(ctable[f],append(ctable[f][clast[f]-255],x[j])) cfreq[f] = append(cfreq[f],1) cfreq[f][clast[f]] = cfreq[f][clast[f]] + 1 puts(f,remainder(clast[f], 256)) puts(f,clast[f]/256) puts(f,x[j]) clast[f] = 0 else clast[f] = dummy_i end if end if end if cpoint[f] = cpoint[f] - 1 end for else if cpoint[f] = 0 then cpoint[f] = CP_UPDATE_RATE update_cp (f) end if if clast[f] = 0 then clast[f] = x else if clast[f] < 256 then dummy_i = find ({clast[f],x},ctable[f]) + 255 if dummy_i = 255 then ctable[f] = append(ctable[f],{clast[f],x}) cfreq[f] = append(cfreq[f],1) cfreq[f][clast[f]] = cfreq[f][clast[f]] + 1 puts(f,remainder(clast[f], 256)) puts(f,clast[f]/256) puts(f,x) clast[f] = 0 else clast[f] = dummy_i end if else dummy_i = find (append(ctable[f][clast[f]-255],x),ctable[f]) + 255 if dummy_i = 255 then ctable[f] = append(ctable[f],append(ctable[f][clast[f]-255],x)) cfreq[f] = append(cfreq[f],1) cfreq[f][clast[f]] = cfreq[f][clast[f]] + 1 puts(f,remainder(clast[f], 256)) puts(f,clast[f]/256) puts(f,x) clast[f] = 0 else clast[f] = dummy_i end if end if end if cpoint[f] = cpoint[f] - 1 end if else puts(f, x) end if end procedure function real_getc (file_handle f) if cp_mode[f] then if cpoint[f] = 0 then cpoint[f] = CP_UPDATE_RATE update_cp (f) end if if length(cstack[f]) = 0 then dummy_i = getc(f)+(getc(f)*256) if dummy_i < 256 then ctable[f] = append(ctable[f],{dummy_i,getc(f)}) cfreq[f] = append(cfreq[f],1) cfreq[f][dummy_i+1] = cfreq[f][dummy_i+1] + 1 else ctable[f] = append(ctable[f],append(ctable[f][dummy_i-255],getc(f))) cfreq[f] = append(cfreq[f],1) cfreq[f][dummy_i+1] = cfreq[f][dummy_i+1] + 1 end if cstack[f] = ctable[f][length(ctable[f])] end if if cstack[f][1] < 0 then return -1 end if dummy_i = cstack[f][1] cstack[f] = cstack[f][2..length(cstack[f])] cpoint[f] = cpoint[f] - 1 return dummy_i else return getc(f) end if end function function real_gets (file_handle f) sequence ret if cp_mode[f] then ret = {real_getc(f)} if ret[1] = -1 then return -1 end if while ret[length(ret)] != '\n' do ret = append(ret,real_getc(f)) if ret[length(ret)] = -1 then return ret[1..length(ret)-1] end if end while else return gets (f) end if return ret end function --<<.. Seek Routine ..>> -- New & Improved Seek Routine (to be binairy compatible) global function seek ( file_handle fn, object x ) integer ret if cs_mode[fn] then if cp_mode[fn] then real_puts(fn,cur_char[fn]) else puts(fn,cur_char[fn]) end if end if -- Machine_func 20 is WHERE if sequence (x) then -- Machine_func 19 is SEEK if length(x) < 2 then if machine_func(20,fn) != floor(x[1]/8) then ret = machine_func( 19, {fn, floor(x[1]/8)} ) else ret = 0 end if bit_pos[fn] = remainder (x[1],8) if read_mode[fn] then if cp_mode[fn] then cur_char[fn] = real_getc(fn) else cur_char[fn] = getc(fn) end if ret = machine_func (19,{fn, floor(x[1]/8)}) else cur_char[fn] = 0 end if else if machine_func(20,fn) != x[1] then ret = machine_func (19,{fn, x[1]}) else ret = 0 end if bit_pos[fn] = x[2] if read_mode[fn] then if cp_mode[fn] then cur_char[fn] = real_getc(fn) else cur_char[fn] = getc(fn) end if ret = machine_func (19,{fn,x[1]}) else cur_char[fn] = 0 end if end if else if machine_func(20,fn) != x then ret = machine_func (19,{fn, x}) else ret = 0 end if bit_pos[fn] = 1 if read_mode[fn] then if cp_mode[fn] then cur_char[fn] = real_getc(fn) else cur_char[fn] = getc(fn) end if ret = machine_func (19,{fn, x}) else cur_char[fn] = 0 end if end if cs_mode[fn] = 0 return ret end function --<<.. Close Routine ..>> -- New & Improved close routine (to be binairy compatible) global procedure close ( object x ) if sequence(x) then for j = 1 to length(x) do real_close(x[j]) end for else if x = 1 or x = 2 then return end if if cs_mode[x] then if cp_mode[x] then real_puts (x,cur_char[x]) else puts (x,cur_char[x]) end if cs_mode[x] = 0 end if real_close (x) bit_pos[x] = -1 for j = length(bit_pos) to 1 by -1 do if bit_pos[j] != -1 then exit end if bit_pos = bit_pos[1..j-1] cur_char = cur_char[1..j-1] read_mode = read_mode[1..j-1] write_mode = write_mode[1..j-1] cp_mode = cp_mode[1..j-1] cs_mode = cs_mode[1..j-1] ctable = ctable[1..j-1] cfreq = cfreq[1..j-1] cbc = cbc[1..j-1] cpoint = cpoint[1..j-1] clast = clast[1..j-1] cstack = cstack[1..j-1] end for end if end procedure --<<.. Open Routine ..>> -- New & Improved open routine ( some stuff need to be done then ) global function open (sequence s1, sequence s2) integer ret ret = real_open(s1, s2) if ret = -1 then return -1 end if s2 = s2 + ((s2 < 97)* 32) if find('u',s2) then puts(1,"\n !OOPS! >> Sorry, but update mode isn't possible *yet* with.. \n") puts(1,"\n <<.. The Binairy Add-On for Euphoria by Ralf Nieuwenhuijsen ..>> \n") puts(1,"\n <-:-> Make Sure you have the latest version <-:-> \n") puts(1,"\n You can email me about this: <<.. nieuwen at xs4all.nl ..>>\n") return -1 end if bit_pos = append(bit_pos, 1) read_mode = append(read_mode, find('u',s2) or find('r',s2)) write_mode = append(write_mode, find('u',s2) or find('a',s2) or find('w',s2)) cp_mode = append(cp_mode,CP_DEFAULT) cs_mode = append(cs_mode, 0) ctable = append(ctable, {}) cfreq = append(cfreq, repeat(0,256)) cbc = append(cbc, repeat({},256)) cpoint = append(cpoint, 0) clast = append(clast, 0) cstack = append(cstack,{}) if read_mode[ret] then if cp_mode[ret] then cur_char = append(cur_char, real_getc(ret)) else cur_char = append(cur_char, getc(ret)) end if else cur_char = append(cur_char, 0) end if if seek (ret, {0,1}) != 0 then close(ret) return -1 end if return ret end function --<<.. Bits Put Routine ..>> -- Outputs a bit or a sequence of bits global procedure put_bits ( file_handle bh, bool_o bit) if atom(bit) then if bit then if cur_char[bh] < val_pos[bit_pos[bh]] then cur_char[bh] = cur_char[bh] + val_pos[bit_pos[bh]] else cur_char[bh] = or_bits(cur_char[bh],val_pos[bit_pos[bh]]) end if else if cur_char[bh] >= val_pos[bit_pos[bh]] then cur_char[bh] = and_bits(cur_char[bh],val_neg[bit_pos[bh]]) end if end if bit_pos[bh] = bit_pos[bh] + 1 cs_mode[bh] = 1 if bit_pos[bh] > 8 then if cp_mode[bh] then real_puts (bh,cur_char[bh]) if read_mode[bh] then cur_char[bh] = real_getc(bh) dummy_i = machine_func(19,{bh, machine_func(20,bh) - 1}) else cur_char[bh] = 0 end if else puts (bh, cur_char[bh]) if read_mode[bh] then cur_char[bh] = getc(bh) dummy_i = machine_func(19,{bh, machine_func(20,bh) - 1}) else cur_char[bh] = 0 end if end if bit_pos[bh] = 1 cs_mode[bh] = 0 end if else for j = 1 to length(bit) do if bit[j] then if cur_char[bh] < val_pos[bit_pos[bh]] then cur_char[bh] = cur_char[bh] + val_pos[bit_pos[bh]] else cur_char[bh] = or_bits(cur_char[bh],val_pos[bit_pos[bh]]) end if else if cur_char[bh] >= val_pos[bit_pos[bh]] then cur_char[bh] = and_bits(cur_char[bh],val_neg[bit_pos[bh]]) end if end if bit_pos[bh] = bit_pos[bh] + 1 cs_mode[bh] = 1 if bit_pos[bh] > 8 then if cp_mode[bh] then real_puts (bh,cur_char[bh]) if read_mode[bh] then cur_char[bh] = real_getc(bh) dummy_i = machine_func(19,{bh, machine_func(20,bh) - 1}) else cur_char[bh] = 0 end if else puts (bh, cur_char[bh]) if read_mode[bh] then cur_char[bh] = getc(bh) dummy_i = machine_func(19,{bh, machine_func(20,bh) - 1}) else cur_char[bh] = 0 end if end if bit_pos[bh] = 1 cur_char[bh] = 0 end if end for end if end procedure --<<.. Sequence put routine ..>> -- New & Improved puts routine (to be binairy compatible) global procedure puts ( file_handle bh, object x ) if bit_pos[bh] = 1 then real_puts(bh, x) if read_mode[bh] then if cp_mode[bh] then cur_char[bh] = real_getc(bh) else cur_char[bh] = getc(bh) end if dummy_i = machine_func(19,{bh, machine_func(20,bh) - 1}) end if cs_mode[bh] = 0 else if atom(x) then x = {x} end if for j = 1 to length(x) do if cur_char[bh] < val_pos[bit_pos[bh]] then cur_char[bh] = cur_char[bh] + (x[j] - remainder(x[j],val_pos[bit_pos[bh]])) else cur_char[bh] = remainder(cur_char[bh],val_pos[bit_pos[bh]]) + (x[j] - remainder(x[j],val_pos[bit_pos[bh]])) end if real_puts(bh, cur_char[bh]) if read_mode[bh] then if cp_mode[bh] then cur_char[bh] = real_getc(bh) else cur_char[bh] = getc(bh) end if cur_char[bh] = remainder(x[j],val_pos[bit_pos[bh]]) + (cur_char[bh] - remainder(cur_char[bh],val_pos[bit_pos[bh]])) dummy_i = machine_func(19, {bh, machine_func(20,bh) -1 }) else cur_char[bh] = remainder(x[j],val_pos[bit_pos[bh]]) end if end for cs_mode[bh] = 1 end if end procedure --<<.. Get Bit Routine ..>> -- Returns the next bit from file_handle f or -1 if EOF global function get_bit ( file_handle f ) integer ret if cur_char[f] = -1 then return -1 end if if or_bits(cur_char[f],val_pos[bit_pos[f]]) = cur_char[f] then ret = 1 else ret = 0 end if bit_pos[f] = bit_pos[f] + 1 if bit_pos[f] > 8 then if cs_mode[f] then real_puts(f,cur_char[f]) cs_mode[f] = 0 end if bit_pos[f] = 1 if cp_mode[f] then cur_char[f] = real_getc(f) cur_char[f] = real_getc(f) else cur_char[f] = getc(f) cur_char[f] = getc(f) end if cs_mode[f] = 0 end if return ret end function --<<.. Get Bits Routine ..>> -- Returns the number of requested bits or -1 if EOF global function get_bits ( file_handle f, integer nob ) sequence ret if cur_char[f] = -1 then return -1 end if ret = repeat (0,nob) for j = 1 to nob do if or_bits(cur_char[f],val_pos[bit_pos[f]]) = cur_char[f] then ret[j] = 1 end if bit_pos[f] = bit_pos[f] + 1 if bit_pos[f] > 8 then if cs_mode[f] then real_puts(f,cur_char[f]) else if cp_mode[f] then cur_char[f] = real_getc(f) else cur_char[f] = getc(f) end if end if cs_mode[f] = 0 bit_pos[f] = 1 if cp_mode[f] then cur_char[f] = real_getc(f) else cur_char[f] = getc(f) end if if cur_char[f] = -1 then return ret[1..j-1] end if end if end for return ret end function --<<.. Get Sequence Routine ..>> -- New & Improved gets (to be binairy compatible) global function gets (file_handle f) object ret if bit_pos[f] = 1 then if cs_mode[f] then real_puts(f,cur_char[f]) dummy_i = machine_func(19, {f, machine_func(20,f) -1 }) end if cs_mode[f] = 0 ret = real_gets(f) if cp_mode[f] then cur_char[f] = real_getc(f) else cur_char[f] = getc(f) end if if cur_char[f] != -1 then dummy_i = machine_func(19, {f, machine_func(20,f) -1 }) end if else if cs_mode[f] then real_puts(f,cur_char[f]) else if cp_mode[f] then cur_char[f] = real_getc(f) else cur_char[f] = getc(f) end if end if cs_mode[f] = 0 if cur_char[f] = -1 then return -1 end if ret = {cur_char[f] - remainder(cur_char[f],val_pos[bit_pos[f]])} if cp_mode[f] then cur_char[f] = real_getc(f) else cur_char[f] = getc(f) end if if cur_char[f] = -1 then return ret end if ret[1] = ret[1] + remainder(cur_char[f],val_pos[bit_pos[f]]) while ret[length(ret)] != '\n' do ret = append(ret, cur_char[f] - remainder(cur_char[f],val_pos[bit_pos[f]])) if cp_mode[f] then cur_char[f] = real_getc(f) else cur_char[f] = getc(f) end if if cur_char[f] = -1 then return ret end if ret[length(ret)] = ret[length(ret)] + remainder(cur_char[f],val_pos[bit_pos[f]]) end while end if return ret end function --<<.. Get Character Routine ..>> -- New & Improved getc routine (to be binairy compatible) global function getc ( file_handle f ) integer ret if cs_mode[f] then real_puts(f,cur_char[f]) else dummy_i = machine_func(19, {f, machine_func(20,f) + 1 }) end if cs_mode[f] = 0 if bit_pos[f] = 1 then ret = cur_char[f] cur_char[f] = real_getc (f) else ret = cur_char[f] - remainder(cur_char[f],val_pos[bit_pos[f]]) cur_char[f] = real_getc(f) ret = ret + remainder(cur_char[f],val_pos[bit_pos[f]]) end if dummy_i = machine_func(19, {f, machine_func(20,f) -1 }) return ret end function --<<.. Compression Set Routines ..>> global procedure set_cp_off (file_handle f) cp_mode[f] = CP_OFF end procedure global procedure set_cp_normal (file_handle f) cp_mode[f] = CP_NORMAL end procedure global procedure set_cp_full (file_handle f) cp_mode[f] = CP_FULL end procedure --<<.. End of file ..>>-----------<<.. All done by Ralf Nieuwenhuijsen ..>>-- --------------B346DE9BAF9CC92B22013CD4--