1. Block Size
I'm considering a data structure and algorithm to store characters as a binary
tree but without
the standard overead of 1 byte character to 8 bytes overhead which is quite
excessive. That's
in C where as doing it in euphoria sequences is 1 : 11+. So i decided to do it
in blocks. The
thing I am pondering is the size of the character to the number of entries per
block.
I'm leaning towards 256 chars, 64 per block or 65536 chars 64 per block, but
then I suppose I
could do both. The advantage of the first is it may take less time to peek at
the data.
The advantage of the larger blocks is they require less time to find the next
entry.
I was wondering what other people tought.
Each entry has
x bits for es char.
y *2 bits for the ptr.
1 bit for end of str flag.
2 bit for ptr modifier.
-----
3 + x + y *2 bits
So in 3 bytes x + y*2 = 21, or in 4 bytes x + y*2 = 29
Entry Chars per Block OH : Oh/Data Waste
Size Block Size b Data
(AVG)
3b 2048 32 96 13:11
1.18 48
3b 512 64 192 15:9
1.67 96
3b 128 128 384 17:7
2.43 192
4b 131072 64 256 15:17 0.88
128
4b 32768 128 512 17:15 1.13
256
4b 8192 256 1K 19:13
1.46 512
4b 2048 512 2K 21:11
1.91 1K
4b 512 1024 4K 23:9
2.56 2K
4b 128 2048 8K 25:7
3.57 3K
-------------------------
Sincerely,
Mathew Hounsell
mat.hounsell at excite.com