Another example of monster sequences
- Posted by Michael J. Sabal <m_sabal at ?a?oo.com> Jun 10, 2008
I know the discussion has mostly passed, but I've just come across a scenario where an internal 8-bit representation of strings would be useful.

I have an FTP directory that contains over 100,000 entries. I didn't design the directory and I have no authority to change it. In the directory listing (just the listing, not the file data), each entry can be anywhere from 57 bytes to over 1,000 bytes, depending on the length of the file name. Assuming an average of 65 bytes per entry, the list request will stream about 6.5MB of data.

This data is sent in blocks, and there is no way to tell in advance how many blocks are coming. Therefore, as each 4K block comes in, it must be concatenated to the data sequence. Since each streamed byte is stored as 4 bytes in memory, the internal sequence grows to 26MB. By the time the stream has been parsed and the results stored in a separate sequence, over 50MB of memory is in use before anything can be written to a database. The additional memory use also increases the time it takes to process the stream.

An 8-bit representation may be too difficult to actually implement; I'll admit I haven't looked into it yet. I only bring this up because there has been some doubt as to the need, since only a handful of examples have been presented to illustrate it. This case adds one more.
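
For anyone who wants to check the arithmetic, here is a rough sketch in Python (not Euphoria, purely as a calculator). The entry count, average entry size, block size, and 4-bytes-per-element figure come from the post; the function name `estimate` is made up for illustration, and treating the parsed copy as roughly the same size as the raw sequence is my assumption to account for the "over 50MB" figure.

```python
# Back-of-the-envelope memory estimate using the figures quoted in the post.
ENTRIES = 100_000         # "over 100,000 entries", so this is a lower bound
AVG_ENTRY_BYTES = 65      # assumed average length of one listing entry
BYTES_PER_ELEMENT = 4     # each streamed byte is stored as 4 memory bytes
BLOCK_SIZE = 4096         # the listing arrives in 4K blocks


def estimate(entries=ENTRIES, avg_entry_bytes=AVG_ENTRY_BYTES,
             bytes_per_element=BYTES_PER_ELEMENT, block_size=BLOCK_SIZE):
    """Return (streamed bytes, in-memory bytes, number of blocks)."""
    streamed = entries * avg_entry_bytes
    in_memory = streamed * bytes_per_element
    blocks = -(-streamed // block_size)  # ceiling division
    return streamed, in_memory, blocks


streamed, in_memory, blocks = estimate()

# Assumption: the parsed result is about as large as the raw sequence,
# so the two copies together roughly double the footprint.
with_parsed_copy = 2 * in_memory

print(f"streamed over the wire: {streamed / 1e6:.1f} MB in {blocks} blocks")
print(f"raw internal sequence:  {in_memory / 1e6:.1f} MB")
print(f"raw + parsed copy:      {with_parsed_copy / 1e6:.1f} MB")
```

With a 1-byte-per-element representation, the same calculation (`bytes_per_element=1`) leaves the raw sequence at the streamed size, which is the saving being argued for here.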