Memory trouble update
- Posted by Andy Serpa <ac at onehorseshy.com> Jul 16, 2003
Well, because of all the problems I was having with Euphoria and other things, I went ahead and got XP Pro a few days ago. I was simply being too unproductive. Anyway, all problems are gone! Everything is running faster than ever! (I think I also traced those "mystery crashes" to improper latency settings for my RAM.) So while I do still have another system here running Windows ME, I don't get much chance to use it and won't be able to spend any time tracking down this problem anymore. But I think I figured out the gist of it anyway, and using these guidelines anyone should be able to create slow performance at will with Eu 2.4 running Win ME (and 98, etc., I assume).

My original artificial example was this:

--------------------------------------------
include machine.e

function seq_transform(sequence s)
    sequence snew
    snew = {}
    for i = 1 to length(s) do
        if rand(2) = 1 then
            snew &= int_to_bytes(s[i])     -- 4-byte representation
        else
            snew &= atom_to_float64(s[i])  -- 8-byte representation
        end if
    end for
    return snew
end function

object x, t, s1, s2, s3, s4  -- t holds timings and the final gets() result

s1 = {}
s2 = {}
s3 = {}
s4 = {}

puts(1, "Creating big sequences...\n")
-- grow four large sequences *in parallel*
for i = 1 to 50000 do
    s1 &= rand(1073741823) * 100
    s2 &= rand(1073741823) * 100
    s3 &= rand(1073741823) * 100
    s4 &= rand(1073741823) * 100
end for

puts(1, "Processing sequences...\n")

t = time()
for i = 1 to 50 do
    x = seq_transform(s1[rand(10000)..20000])
end for
? time() - t
s1 = {}  -- free the first big sequence

t = time()
for i = 1 to 50 do
    -- uncomment the following line to see just how slow
    -- ? i
    x = seq_transform(s2[rand(10000)..20000])
end for
? time() - t
s2 = {}

t = time()
for i = 1 to 50 do
    x = seq_transform(s3[rand(10000)..20000])
end for
? time() - t
s3 = {}

t = time()
for i = 1 to 50 do
    x = seq_transform(s4[rand(10000)..20000])
end for
? time() - t
s4 = {}

puts(1, "\n\nDone.")
t = gets(0)
---------------------------

Now Rob said I was basically fragmenting the heap with this, but what failed to hit me is just how specifically I was doing that:

for i = 1 to 50000 do
    s1 &= rand(1073741823) * 100
    s2 &= rand(1073741823) * 100
    s3 &= rand(1073741823) * 100
    s4 &= rand(1073741823) * 100
end for

It has much less to do with what you do after you free up one of the sequences than with how you built them up in the first place. The really important part here is not just that they are large, or that they contain part integers and part floats (integer values falling outside Euphoria's native integer range get stored as floats), but that they are "grown" in parallel: a little bit is added to one sequence, then a little bit to the next, and so on. This has the effect (I assume) of breaking the memory used for each one into lots of non-adjacent bits. So if you do this instead:

for i = 1 to 50000 do
    s1 &= rand(1073741823) * 100
end for
for i = 1 to 50000 do
    s2 &= rand(1073741823) * 100
end for
for i = 1 to 50000 do
    s3 &= rand(1073741823) * 100
end for
for i = 1 to 50000 do
    s4 &= rand(1073741823) * 100
end for

then the program runs fine, because when each sequence is later freed ("s1 = {}") what gets freed is (I assume) one big, more or less contiguous block. (Unfortunately, in real programs this is not always possible, or it would create a new kind of slowdown.) But when you build them up in parallel and then free one of them, you create a ton of little "holes" in memory, and when you then allocate some more, it seems to take the system forever to do it. The other important thing to note is that if you build up something again (after freeing a big fragmented sequence) that is the same shape and size RAM-wise as the freed sequence, it will work OK.
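To make that last point concrete, the rebuild I mean looks something like this (a reconstruction rather than the exact code I ran; imagine it spliced in right after the first "s1 = {}" in the program above, while s2, s3 and s4 are still held):

--------------------------------------------
t = time()
for i = 1 to 50000 do
    -- same mix of integers and floats, same element count as the
    -- freed sequence, so (presumably) the new blocks fit right
    -- back into the holes that freeing s1 left behind
    s1 &= rand(1073741823) * 100
end for
? time() - t  -- this same-shape rebuild runs at normal speed
--------------------------------------------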
It is when you allocate chunks of memory that don't fit well into the "holes" that the problems occur. I don't think it has much to do with integers vs. floats per se, only that they require different numbers of bytes to store. I did some experiments where, after creating the fragmentation, I built up some new sequences using only integers. That worked fine, but if I made it add a slice of random length (say, 1 to 10 integer elements) at a time, the slowdown would occur.

So, to create bad performance, do the following (a condensed sketch follows at the end of this post):

-- "Grow" several large sequences incrementally, *in parallel*.
-- Free one of them.
-- Grow some new sequences of a different shape and size (RAM-wise, i.e. mixing elements with different storage needs) than the one you freed.
-- You should get diminished performance while building up the new sequences. To diminish it further, iterate the process.

With that, I leave this problem for Rob to deal with, if desired...
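For the record, here is a condensed sketch of that recipe (a reconstruction, not my exact test code; the sizes are arbitrary):

--------------------------------------------
sequence a, b, c, junk
atom t

a = {}
b = {}
c = {}

-- step 1: grow several large sequences incrementally, in parallel
-- (mixed integers and floats, as in the program above)
for i = 1 to 30000 do
    a &= rand(1073741823) * 100
    b &= rand(1073741823) * 100
    c &= rand(1073741823) * 100
end for

-- step 2: free one of them, leaving scattered holes in the heap
a = {}

-- step 3: grow something with a different RAM-wise shape -- here,
-- random-length slices of 1 to 10 integers at a time instead of
-- single elements -- and time how long it takes
t = time()
junk = {}
for i = 1 to 30000 do
    junk &= repeat(rand(1000), rand(10))
end for
? time() - t  -- on Win ME/98 this is where the slowdown shows up

-- to diminish further, free b and repeat step 3, then c, etc.
--------------------------------------------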