Re: compression


Lewis wrote:

>Just in case you or anyone else is interested, I was wanting
>this function for a compression algorithm I have been working on.
>Here is how it works:


> -- snip--


>I have got up to 17%
>compression with this. Has anyone done this before? Can
>anyone see any potential flaws and/or inefficiencies?

If the data to be compressed is a text file then how about compressing the
text in the headers? i.e.:
a to z = 26
A to Z = 26
total 52 (5.7 bits)

If the output is written to a disk file as 1 character per byte (8 bits) then
packing each letter into about 5.7 bits instead will save some space. Of
course you will have to have a length/delimiter character of sorts. Maybe you
could group the header text into length sizes and do away with a local
delimiter in favour of a length index at the very start of the compressed
file, or will this approach spike the algorithm? If not then perhaps the
principle could be applied to the compressed data itself.
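
To show roughly what I mean, here is a minimal sketch in Python (my choice for
illustration only; the names pack_6bit and unpack_6bit are made up, and it
rounds the 5.7 bits up to a whole 6 bits per letter to keep things simple). The
character count is returned alongside the packed bytes, standing in for the
length index, so no delimiter is needed:

```python
import math
import string

# Assumption: the header text holds only a-z and A-Z (52 symbols).
ALPHABET = string.ascii_letters

def bits_per_symbol(n):
    """Theoretical bits needed per symbol for an alphabet of n values."""
    return math.log2(n)

def pack_6bit(text):
    """Pack letters-only text at 6 bits per character.

    Returns (character count, packed bytes). Raises ValueError if the
    text contains anything outside ALPHABET.
    """
    acc = 0
    for ch in text:
        acc = (acc << 6) | ALPHABET.index(ch)
    nbits = 6 * len(text)
    nbytes = (nbits + 7) // 8
    # Shift left so the unused padding bits sit at the end of the last byte.
    acc <<= nbytes * 8 - nbits
    return len(text), acc.to_bytes(nbytes, "big")

def unpack_6bit(count, data):
    """Reverse pack_6bit, using the stored character count."""
    acc = int.from_bytes(data, "big")
    acc >>= len(data) * 8 - 6 * count  # drop the padding bits
    chars = []
    for _ in range(count):
        chars.append(ALPHABET[acc & 0x3F])  # lowest 6 bits = last char
        acc >>= 6
    return "".join(reversed(chars))
```

With this, 4 letters fit in 3 bytes, so letters-only text shrinks to 75% of
its size before any real compression is applied.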

Actually,

>I have got up to 17% compression with this.

Does that mean that a 100k file was reduced to 83k or that 100k -> 17k?

If the former is meant then a pure text file of, say,
a to z = 26
A to Z = 26
0 to 9 = 10
!@#$%^&*()-_=+\|]}[{;:'",<.>/? = 30
<space> <tab> = 2
TOTAL 94 (about 6.585 bits)

could be compressed to about 82% of its original size (6.585 / 8) just by
'packing' each char into the said number of bits. To make it easy the total
number of unique values could be boosted to 96, but note that 2 chars then
need 14 bits rather than 13 (96 x 96 = 9216, which is more than 2^13 = 8192);
3 chars per 20 bits is a tighter fit.
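
The group sizes are easy to check in a few lines. A rough sketch in Python
(the function names are hypothetical, just for illustration) that counts how
many whole bits a group of chars needs and what fraction of the 8-bits-per-char
size that works out to:

```python
def bits_for_group(alphabet_size, chars_per_group):
    """Smallest whole number of bits that can hold one group of chars."""
    # The largest group value is alphabet_size**chars_per_group - 1.
    return (alphabet_size ** chars_per_group - 1).bit_length()

def packed_ratio(alphabet_size, chars_per_group):
    """Packed size as a fraction of storing each char in a full byte."""
    return bits_for_group(alphabet_size, chars_per_group) / (8 * chars_per_group)

if __name__ == "__main__":
    for k in (1, 2, 3, 5):
        print(k, bits_for_group(96, k), round(packed_ratio(96, k), 3))
```

By this count 2 chars out of 96 values take 14 bits (87.5% of the original
size), 3 chars take 20 bits (about 83%) and 5 chars take 33 bits (82.5%), so
bigger groups creep toward the 82% figure. A pair only fits in 13 bits if the
alphabet is cut to 90 values or fewer.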

>It finds all substrings that are repeated in a string of bytes
>and sorts them by how many matches were found in descending
> order.(This was done with Michael's code) Then I re-sort these
>strings based on a "score"..

Would it be possible to amalgamate the 2 sorting processes into a single
comprehensive sort? i.e., loop through each matched group, calculate the
"score", then sort them (once).
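
Something along these lines, as a sketch only. I don't know your actual
"score" formula, so the one below (bytes saved if each occurrence is replaced
by a 2-byte pointer) is purely a stand-in, and Match, score and rank_matches
are made-up names:

```python
from collections import namedtuple

# One entry per repeated substring: the bytes and how many times they occur.
Match = namedtuple("Match", ["substring", "count"])

def score(match, pointer_cost=2):
    """HYPOTHETICAL score, not the one from the original post.

    Assumes each occurrence is replaced by a pointer of pointer_cost bytes.
    """
    return (len(match.substring) - pointer_cost) * match.count

def rank_matches(matches):
    """Sort once, best score first; no prior sort by match count needed."""
    # sorted() calls the key function once per element, so each score
    # is calculated a single time.
    return sorted(matches, key=score, reverse=True)

if __name__ == "__main__":
    found = [Match(b"ab", 10), Match(b"abcdefgh", 3), Match(b"abcd", 5)]
    for m in rank_matches(found):
        print(m.substring, m.count, score(m))
```

The order coming out of the match finder doesn't matter here, so the first
sort by number of matches could be skipped unless something else relies on it.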

Yours Truly
Mike

vulcan at win.co.nz
