Re: Let's talk about zip files

petelomax said...

I expect you've also seen https://github.com/kuba--/zip (which wraps miniz)

Yes, I came across that as well. I'm basically doing the same thing by using miniz for the "low level" functions and crafting the "high level" functions directly in std/zip.e.

petelomax said...

Apparently that is easily compiled into a dll (/so?) but for me it's just an exercise in frustration.

Miniz and "kubazip" (that doesn't really have a name, does it?) both use CMake, but it's not really necessary.

Kubazip already includes the amalgamated source for miniz which makes it easier. A quick test on my Ubuntu system:

$ git clone https://github.com/kuba--/zip kubazip 
$ cd kubazip 
$ gcc -shared -fPIC -O2 -s -D_GNU_SOURCE -Isrc/ -o libkubazip.so src/zip.c 

If you just want miniz without "kubazip" you could build with all the source files. You just need to create miniz_export.h first (running CMake would do this).

$ git clone https://github.com/richgel999/miniz 
$ cd miniz 
$ printf "#ifndef MINIZ_EXPORT\n#define MINIZ_EXPORT\n#endif\n" > miniz_export.h 
$ gcc -shared -fPIC -O2 -s -D_GNU_SOURCE -o libminiz.so miniz*.c 

Note: Adding -D_GNU_SOURCE is required for large file support on 64-bit. I don't think it causes any harm to leave in when building for 32-bit either.
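
A quick way to check that the library built above actually loads is to call one of its exported functions. The following is only a rough sketch in Python using ctypes; the helper name miniz_version and the default path ./libminiz.so are assumptions for illustration, while mz_version() is the version function declared in miniz.h.

```python
import ctypes


def miniz_version(lib_path="./libminiz.so"):
    """Return miniz's version string, or None if the library cannot be loaded.

    miniz_version and the default path are illustrative names, not part of miniz.
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        # Missing file, wrong architecture, or unresolved dependencies.
        return None
    # Declared in miniz.h as: const char *mz_version(void);
    lib.mz_version.restype = ctypes.c_char_p
    lib.mz_version.argtypes = []
    return lib.mz_version().decode("ascii")


if __name__ == "__main__":
    print(miniz_version() or "could not load libminiz.so")
```

If this prints a version number, the shared library is loadable and exports its symbols. On Windows the path would presumably be miniz.dll instead.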

petelomax said...

I would quite probably quite happily jump ship to that or similar if only I could actually build it.

Here's a Makefile that should work on Windows or Linux. Hope that helps.

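# Build miniz as a shared library on Windows (.dll) or Linux (.so).
# Assumes miniz_export.h already exists (see the printf command above).
CC = gcc$(EXE_EXT)
CFLAGS = -fPIC -O2 -s
TARGET = $(LIB_PRE)miniz$(LIB_EXT)

ifeq ($(OS),Windows_NT)
    EXE_EXT = .exe
    LIB_EXT = .dll
else
    LIB_PRE = lib
    LIB_EXT = .so
    # Needed for large file support (see the note above).
    CFLAGS += -D_GNU_SOURCE
endif

# Compile every miniz*.c in this directory into the shared library.
$(TARGET) : $(wildcard miniz*.c)
	$(CC) -shared $(CFLAGS) -o $@ $^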

petelomax said...

As I've noted there is an annoying niggle with LiteZip that will probably never ever be fixed,
plus it's still 32-bit only and completely untested on Linux, both of which are not exactly ideal.

Miniz hasn't had an actual tagged release in a while, but it still sees very frequent commits. Pretty sure it'll run on darn near anything.

petelomax said...

Not that it really matters but technically LiteZip is 99K all in, or just 59K for extract-only,
though I accept that argument would be blown out of the water for anyone shipping a 64-bit app.

Using the Makefile I provided above, libminiz.so is 95K on my 64-bit Ubuntu system. When built into the interpreter, be_miniz.o is only 86K. libkubazip.so is about 120K.

Keep in mind that miniz also provides the lower-level zlib deflate/inflate functions, which I'm going to wrap separately in std/zlib.e.
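
To illustrate why the deflate/inflate layer is useful on its own: a zip entry stored with the "deflate" method is just a raw deflate stream sitting behind a small local header. The sketch below uses Python's standard zipfile and zlib modules (not miniz or std/zlib.e) to pull one entry's compressed bytes out of an archive by hand and inflate them directly. The helper name raw_entry_data is made up for this example.

```python
import io
import struct
import zipfile
import zlib


def raw_entry_data(archive_bytes, info):
    """Return one entry's compressed bytes exactly as stored in the archive."""
    # Local file header: 30 fixed bytes, then the file name and the extra field.
    # The two length fields live at offsets 26 and 28 of that header.
    name_len, extra_len = struct.unpack_from(
        "<HH", archive_bytes, info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return archive_bytes[start:start + info.compress_size]


payload = b"Let's talk about zip files. " * 50

# Build a small archive in memory with one deflated entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", payload)
archive = buf.getvalue()

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    info = zf.getinfo("hello.txt")

raw = raw_entry_data(archive, info)
# A negative wbits value tells zlib to expect a raw deflate stream
# (no zlib header or trailing checksum), which is what zip stores.
restored = zlib.decompress(raw, -15)
```

The same inflate call works on any raw deflate data, which is the kind of thing a separate zlib wrapper would expose.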

petelomax said...

Regarding "directly into the backend" I trust you've considered and covered the eu2c implications.

Yes, indeed. What I've done is package the amalgamated files into be_miniz.h and be_miniz.c, which get compiled directly into the interpreter and into the translator's static library.

petelomax said...

PS Good find, it astonishes me how hard it is to find a decent zip component.

There are a few out there, but I wanted to provide some functionality directly in Euphoria and leave the "Swiss Army tool" functionality to shared libraries.

If you're looking for a one-stop library for all your archiving and compression needs, I'd recommend libarchive. For encryption and hashing, I'd recommend libtomcrypt.

petelomax said...

PPS One thing you may have missed is the ability to delete entries from a zip file? (no biggie)

I wouldn't say I've missed it; none of this is fully-baked yet. What I showed in the first post was just an example of what I'm putting together.

Zip files are weird in that "deleting" entries isn't really a thing. You basically have three options:

  1. Remove the entry from the central directory and zero-out its data in the file (does not reduce the file size at all).
  2. Do the same as #1, but then "shift" all of the other entries down to close the gap and rewrite the central directory (complicated, possibly destructive).
  3. Create a new zip file and copy all the entries, excluding what you're removing, and then delete the original zip file (simpler, slower, uses more disk space).

Obviously #3 is the safest approach, so I'll probably implement that in std/zip.e and document it similarly to db_compress(), which basically does the same thing.
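
As a rough illustration of option #3, here is a minimal sketch in Python using the standard zipfile module. This is not the std/zip.e implementation; the function name remove_entries is hypothetical. Note that Python's zipfile has no raw-copy call, so each kept entry is decompressed and recompressed here, whereas a real implementation could copy the compressed data as-is.

```python
import os
import tempfile
import zipfile


def remove_entries(zip_path, names_to_remove):
    """Rewrite zip_path without the named entries (copy, then replace).

    remove_entries is an illustrative name, not an existing API.
    """
    names_to_remove = set(names_to_remove)
    directory = os.path.dirname(os.path.abspath(zip_path))
    # Create the copy next to the original so the final swap stays on
    # the same filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=directory)
    os.close(fd)
    try:
        with zipfile.ZipFile(zip_path, "r") as src, \
                zipfile.ZipFile(tmp_path, "w") as dst:
            for info in src.infolist():
                if info.filename in names_to_remove:
                    continue  # this entry is being "deleted"
                # Re-add the entry with its original metadata and
                # compression method.
                dst.writestr(info, src.read(info),
                             compress_type=info.compress_type)
        # Only replace the original once the copy is complete.
        os.replace(tmp_path, zip_path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

Usage would be something like remove_entries("archive.zip", ["old.txt"]). The original file is left untouched until the new archive has been written in full, which is what makes this the safe option, at the cost of temporarily needing disk space for both copies.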

-Greg
