OpenEuphoria: Forum: Re: Phix segfault when wildcard_match file between 1.5 and 2 million lines

Posted by petelomax 3 months ago

lib9 said...

I uploaded it there.

Ah! I found it now, as file.txt.gz rather than file.txt.zip - so I guess you are running on Linux?
I've just tried it on Mint Cinnamon and it is crashing now (unlike my previous tests on Windows).

Update: I can now tell you it seems to be crashing when trying to say you have run out of memory....
Maybe a "paltry" 192MB should not really trouble it (and it doesn't on Windows, but see the final note).
First thing though, I'm testing mmap() for null returns, but on failure it returns MAP_FAILED, i.e. (void*)-1, so let's fix that:
(Actually, all this probably won't really help you very much...)

builtins\VM\pHeap.e line 1074 said...

call "libc.so.6","mmap"

so let's just add right after that:

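--9/2/24: 
            test rax,rax                -- mmap failure is (void*)-1, negative when signed
            jg @f                       -- positive, so a usable address
                xor rax,rax             -- treat failure as null, which the existing code tests for
         @@: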

Which ran me straight into some missing error handling (the fix is the five lines above int3 in the next snippet; no doubt other similar instances exist):

builtins\VM\pHeap.e line 3691 said...

        call :%pGetPool                 -- allocate rcx bytes, rounded up 
        test rax,rax 
--      jz :memoryallocationfailure 
        jnz @f 
--9/2/24: 
            mov rdx,[rsp+48] 
            mov al,33   -- e33maf 
            sub rdx,1 
            jmp :!iDiag 
            int3 
      @@:

A quick "./p -c p" later...
It now goes a bit mental with "Your program has run out of memory, one moment please", but you can kill that with Ctrl C.
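
For anyone not familiar with the mmap() convention mentioned above, here is a minimal C sketch of the same idea as the assembly patch: mmap() signals failure with MAP_FAILED, i.e. (void*)-1, not with a null pointer, so a wrapper can fold that into null for callers that only test for null. The helper name map_or_null is made up for illustration and is not part of Phix or libc.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper (not from pHeap.e): ask the kernel for `bytes` of
   anonymous read/write memory and return NULL on failure, so callers can
   keep using a plain null test. */
static void *map_or_null(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    /* On failure mmap() returns MAP_FAILED, which is (void*)-1, never NULL. */
    return (p == MAP_FAILED) ? NULL : p;
}
```

A zero-length request is an easy way to see the failure path on Linux, since mmap() rejects it with EINVAL.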

lib9 said...

If I remove object txt = read_lines(t_file): It doesn't segfault.

Well that's certainly going to put this on the back burner. If you are going to load a really big file you really should process
it one line at a time and throw them away once dealt with. It takes Phix (running on a VM, so not fast) about 40 seconds
to plough through that file. In contrast, I gave up trying to load it in gedit after 10 minutes, when it was not even 1/4
of the way through, so fully loading the same file would take it (and gedit is not noted for being slow) at least 45 minutes.
[Of course gedit manages to load the first screenful very quickly, that's not what I'm talking about, try scrolling.]
Presumably what you want is to scrape some info and store it in a much faster database or similar for later use.
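
To make the "one line at a time" advice concrete, here is a rough sketch in C (only to show the shape, it is not Phix code): each line is read into a single reusable buffer, checked, and then overwritten by the next one, so memory use stays flat however big the file is. The function name count_matching_lines is hypothetical, and the plain substring test with strstr() is just a stand-in for whatever wildcard_match-style check you actually need. In Phix the equivalent would be a loop over gets() until it returns -1.

```c
#define _POSIX_C_SOURCE 200809L   /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: count the lines of `path` that contain `needle`,
   holding only one line in memory at a time.
   Returns -1 if the file cannot be opened. */
static long count_matching_lines(const char *path, const char *needle)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char *line = NULL;   /* buffer allocated and resized by getline() */
    size_t cap = 0;
    long hits = 0;

    while (getline(&line, &cap, fp) != -1) {
        if (strstr(line, needle) != NULL)
            hits++;      /* scrape what you need here, then move on */
    }

    free(line);          /* one buffer to release, however many lines were read */
    fclose(fp);
    return hits;
}
```

Anything worth keeping from a matching line would be written out (to a database or a smaller file) inside the loop, so the line itself never needs to outlive its iteration.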
