1. Stopping phix (bug?)
- Posted by irv Dec 22, 2020
- 2157 views
- Last edited Dec 26, 2020
When I end a EuGTK program, the window(s) close and if I'm running from a terminal, the terminal prompt returns.
When I end a PhixGTK program in the same way, the window(s) remain on the screen - non-responsive - and you have to do a ctl-c on the terminal, which closes the non-responsive windows, and gets you back to a prompt.
Watching the process monitor, it shows p using 10.9 megs of memory and 320 megs of virtual memory.
Clicking on the window close button shows p still running, but memory and virtual memory are N/A. Only ctl-c on the term or killing p in the system monitor will destroy the window and return to the prompt.
Doing the same with Euphoria, Eu disappears from the process list as soon as you close the main GTK window.
Calling gtk_main_quit() before abort makes no difference.
This is especially a problem if you compile a program and run it by clicking on the file icon. There's no way to stop it, short of waiting for the "Non-responding" dialog and clicking on "force quit", which is after a long delay.
Why the difference between Eu and Phix?
2. Re: Stopping phix
- Posted by ChrisB (moderator) Dec 23, 2020
- 2133 views
Hi
I also have found this problem occasionally, and while this may not help, and I can't remember specifics, it may be because the memory allocated by one of the GTK dlls is not explicitly being released, so it may be that you have to tell GTK to release the memory it's using. As I said it may not help. I usually find that these hangs are associated with DLLs.
Cheers
Chris
3. Re: Stopping phix
- Posted by petelomax Dec 23, 2020
- 2146 views
I'm stumped.
It is exiting gtk_main, but then both a normal exit and abort() just hang, and hang the debugger too.
ps -a shows the process as <defunct> which leads me to https://en.wikipedia.org/wiki/Zombie_process only there's no child/parent relation here at all.
The only other thing I could try is that opAbort is implemented in pStack as
    [ELF64]
        mov rdi,rax             -- error_code (p1)
        mov rax,60              -- sys_exit(rdi=int error_code)
        syscall
whereas exit_thread ended up as
    [ELF64]
    --  xor rbx,rbx
    --  mov rax,60              -- sys_exit
    --  syscall
        xor rdi,rdi
        call "libpthread.so.0","pthread_exit"
So I'm thinking along those lines for trying glibc/exit(), but I can't remember how/where I ever figured the last line out...
(mind you, even if/when I figure that out, I'm betting it won't change a thing, though https://stackoverflow.com/questions/46903180/syscall-implementation-of-exit suggests it might.)
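For anyone unfamiliar with the <defunct> state mentioned above, here is a rough C sketch (not Phix code) of how a process ends up listed as a zombie: it has exited, but nothing has collected its exit status yet. The function names are made up for illustration, and the sketch assumes Linux with /proc mounted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the one-letter state from /proc/<pid>/stat ('Z' = zombie); '?' on error. */
static char proc_state(pid_t pid)
{
    char path[64], line[512];
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    if (!ok)
        return '?';
    /* Format is "pid (comm) state ..."; comm may itself contain ')'. */
    char *close_paren = strrchr(line, ')');
    if (!close_paren || close_paren[1] != ' ' || close_paren[2] == '\0')
        return '?';
    return close_paren[2];
}

/* Fork a child that exits at once, and report its state before it is reaped. */
static char state_of_unreaped_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(0);                 /* child: exit immediately */

    char state = '?';
    for (int i = 0; i < 100; i++) {   /* poll for up to about one second */
        state = proc_state(pid);
        if (state == 'Z')
            break;
        usleep(10 * 1000);
    }
    pid_t reaped = waitpid(pid, NULL, 0);   /* reaping removes the zombie entry */
    assert(reaped == pid);
    (void)reaped;
    return state;
}
```

Calling state_of_unreaped_child() should report 'Z' on a typical Linux box, which is the same state ps prints as <defunct>.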
4. Re: Stopping phix
- Posted by irv Dec 26, 2020
- 2096 views
Calling system("ps -U irv axjf",0) as the last thing before ending the program (after calling gtk_main_quit) shows a difference which might explain why the window doesn't go away when the program ends. Notice there are two invocations of p test1, but only one with Eu.
    7784 7791 7791 7791 pts/1 8162 Ss  1000 0:00 \_ bash
    7791 8162 8162 7791 pts/1 8162 Sl+ 1000 0:00 \_ p test1
    8162 8165 8162 7791 pts/1 8162 S+  1000 0:00 \_ p test1
    8165 8166 8162 7791 pts/1 8162 S+  1000 0:00 \_ sh -c ps -U irv axjf
    8166 8167 8162 7791 pts/1 8162 R+  1000 0:00 \_ ps -U irv axjf

    7784 7791 7791 7791 pts/1 8265 Ss  1000 0:00 \_ bash
    7791 8265 8265 7791 pts/1 8265 Sl+ 1000 0:00 \_ eui test1
    8265 8271 8265 7791 pts/1 8265 S+  1000 0:00 \_ sh -c ps -U irv axjf
    8271 8272 8265 7791 pts/1 8265 R+  1000 0:00 \_ ps -U irv axjf
If you run the same program compiled by Phix (which exhibits the same misbehaviour as the interpreted version):
       1 8618 1277 1277 ? -1 Sl 1000 0:00 /home/irv/pgtk/test1
    8618 8622 1277 1277 ? -1 Z  1000 0:00 \_ [test1] <defunct>
    8618 8625 1277 1277 ? -1 S  1000 0:00 \_ /home/irv/pgtk/test1
    8625 8626 1277 1277 ? -1 S  1000 0:00 \_ sh -c ps -U irv axjf > PuOut.txt
    8626 8627 1277 1277 ? -1 R  1000 0:00 \_ ps -U irv axjf
5. Re: Stopping phix
- Posted by irv Dec 28, 2020
- 2016 views
Here's a work-around for what appears to be two copies of phix launched when I run one GTK program. Create a tiny shared library: libpid.so
    #include <stdio.h>
    #include <unistd.h>

    int get_pid() { return getpid(); }
    int get_ppid() { return getppid(); }
Then, on startup of your program, obtain the pid and ppid (process_id and parent_process_id) by calling these two functions e.g.
    constant PID = open_dll("libpid.so")
    integer pid = c_func(define_c_func(PID,"get_pid",{},C_INT),{})
    integer ppid = c_func(define_c_func(PID,"get_ppid",{},C_INT),{})
Then, when your program ends, call
system(sprintf("kill %d %d",{pid,ppid}))
Without this, your graphics remain on screen but unresponsive. Just a guess, but this may have something to do with GTK using threads. Euphoria doesn't "do" threads, so it doesn't have this problem. Maybe that, or something else entirely?
At least, with this patch, your program behaves properly, and cleans up after itself. Now, if I could learn to do that in my workshop...
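For comparison, a minimal C sketch of the same idea done with kill(2) directly rather than through a shell. This is not what the workaround above does (it shells out to the kill command); both function names are hypothetical, and the demo runs two forks deep so that only throwaway processes get signalled.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Signal the parent first, then ourselves (same targets as `kill pid ppid`). */
static void signal_parent_and_self(int sig)
{
    assert(sig > 0);
    kill(getppid(), sig);
    kill(getpid(), sig);
}

/*
 * Hypothetical demo harness: the "middle" process stands in for the copy
 * that refuses to go away, the "inner" process is the one that applies
 * the workaround. Returns the signal that ended the middle process, or -1.
 */
static int demo_kill_workaround(void)
{
    pid_t middle = fork();
    if (middle < 0)
        return -1;
    if (middle == 0) {
        pid_t inner = fork();
        if (inner < 0)
            _exit(1);
        if (inner == 0) {
            signal_parent_and_self(SIGTERM);
            _exit(0);             /* not reached: SIGTERM ends this process */
        }
        for (;;)
            pause();              /* hang until signalled */
    }
    int status = 0;
    if (waitpid(middle, &status, 0) != middle)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

demo_kill_workaround() should return SIGTERM, showing that the otherwise stuck parent was taken down by its child.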
6. Re: Stopping phix
- Posted by katsmeow Dec 29, 2020
- 1990 views
system(sprintf("kill %d %d",{pid,ppid}))
On winders, i used taskinfo and taskkill. On win7 taskinfo was changed to tasklist. There's other task* too, found in /system32/.
Kat
7. Re: Stopping phix
- Posted by petelomax Jan 01, 2021
- 1915 views
- Last edited Jan 02, 2021
I think I finally found it, at last. Simply change pStack.e line 1771:
    --  mov rax,60      -- sys_exit(rdi=int error_code)
        mov rax,231     -- sys_exit_group(rdi=int error_code)
Obviously ./p -cp and then, as long as you're calling gtk_main_quit, it should shut down properly.
Hmm, I tried all these things last week, turns out I was testing against the wrong pigging file...
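A rough C sketch of why that one-number change matters, for anyone following along. On Linux the raw exit syscall ends only the calling thread, while exit_group ends every thread in the process. If some library has started a helper thread (an assumption here, simulated by idle_thread), a main thread that leaves via exit turns into a <defunct> entry while the process itself lingers. The function names below are made up for illustration, and this is plain C, not the Phix runtime.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a helper thread that a GUI library might have started. */
static void *idle_thread(void *arg)
{
    (void)arg;
    for (;;)
        pause();
    return NULL;
}

/*
 * Fork a child that starts one extra thread and then issues the raw
 * syscall `exit_nr` (SYS_exit or SYS_exit_group) from its main thread.
 * Returns 1 if the parent can reap the child within about half a second,
 * 0 if the child lingers and has to be killed, -1 if fork fails.
 */
static int child_goes_away(long exit_nr)
{
    assert(exit_nr == SYS_exit || exit_nr == SYS_exit_group);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t t;
        if (pthread_create(&t, NULL, idle_thread, NULL) != 0)
            _exit(2);
        syscall(exit_nr, 0);      /* the call under test */
        _exit(3);                 /* not reached */
    }
    for (int i = 0; i < 50; i++) {
        if (waitpid(pid, NULL, WNOHANG) == pid)
            return 1;             /* whole process has terminated */
        usleep(10 * 1000);
    }
    kill(pid, SIGKILL);           /* clean up the lingering child */
    waitpid(pid, NULL, 0);
    return 0;
}
```

With SYS_exit_group the child is reaped straight away; with SYS_exit it stays around until killed, much like the hung window described earlier in the thread. On older glibc versions this needs to be linked with -pthread.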
8. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1904 views
This is why I like Phix. Bugs can be fixed without having to wait for ever for someone to re-compile the works.
Thanks!
9. Re: Stopping phix
- Posted by petelomax Jan 02, 2021
- 1900 views
For completeness (on this thread), the equivalent change for 32-bit a couple of lines above:
    --  mov eax,1       -- sys_exit(ebx=int error_code)
        mov eax,252     -- sys_exit_group(ebx=int error_code)
10. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1904 views
This works for interpreted code, but compiled still hangs.
Watching the System Monitor shows something interesting:
When you execute "test1" as a compiled program (either from a command line or by clicking on the compiled program icon), the System Monitor shows one instance of test1 running. Things work as they should.
When you click on a close button to shut down test1, a SECOND, "zombie" instance of test1 pops up on the sys monitor.
A fraction of a second after that, the first instance is also converted to a zombie. It's just like the TV show!
Then you get the "Not responding" window after a few seconds delay.
Another clue might be: when you run the program in the interpreter, from a terminal, you are returned to the terminal prompt after the program ends. e.g.
irv@irv-Mint19:/pgtk$
When you run a compiled version, you are not returned to the prompt until you do a "force quit" from the "Not responding" dialog.
11. Re: Stopping phix
- Posted by petelomax Jan 02, 2021
- 1904 views
This works for interpreted code, but compiled still hangs.
That's a shame, it works flawlessly for me here.
12. Re: Stopping phix
- Posted by _tom (admin) Jan 02, 2021
- 1875 views
This works for interpreted code, but compiled still hangs.
That's a shame, it works flawlessly for me here.
Must work on the same problem.
Download a Mint iso, and burn to a usb. We now have identical starting points for a phix installation. Yes, phix will only exist until you turn the power off--an advantage since we get a fresh install for every test.
be well
_tom
13. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1868 views
To eliminate the possibility that GTK or my code is causing the problem, we eliminate both:
    puts(1,"Hello\n")
    system("ps -U irv axjf | grep hello",0)
    irv@irv-Mint19:~/pgtk$ eui hello
    Hello
    7209 7234 7234 7209 pts/1 7234 S+ 1000 0:00 \_ eui hello
    7234 7235 7234 7209 pts/1 7234 S+ 1000 0:00 \_ sh -c ps -U irv axjf | grep hello
    7235 7237 7234 7209 pts/1 7234 S+ 1000 0:00 \_ grep hello
    irv@irv-Mint19:~/pgtk$

    irv@irv-Mint19:~/pgtk$ p hello
    Hello
    7209 7239 7239 7209 pts/1 7239 S+ 1000 0:00 \_ p hello
    7239 7240 7239 7209 pts/1 7239 S+ 1000 0:00 \_ p hello
    7240 7241 7239 7209 pts/1 7239 S+ 1000 0:00 \_ sh -c ps -U irv axjf | grep hello
    7241 7243 7239 7209 pts/1 7239 S+ 1000 0:00 \_ grep hello
    irv@irv-Mint19:~/pgtk$
Note the difference.
I think it is safe to say that the problem is caused when one of those 2 instances of 'p' dies but just won't go away.
14. Re: Stopping phix
- Posted by petelomax Jan 02, 2021
- 1860 views
In builtins/syswait.ew line 430 we have:
    [ELF32]
        mov eax,[child]         -- shl eax,2
        push eax
        call "libc.so.6","system"
        add esp,4
        mov ebx,eax
        mov eax,1               -- sys_exit
        int 0x80
    [ELF64]
        mov rdi,[child]         -- shl rdi,2
        call "libc.so.6","system"
        mov rdi,rax
        mov rax,60              -- sys_exit
        syscall
    []
Maybe (untried) if you replace those sys_exit with sys_exit_group as above and in the following recap from pStack.e:
    [ELF32]
        mov ebx,eax             -- error_code (p1)
    --  mov eax,1               -- sys_exit(ebx=int error_code)
        mov eax,252             -- sys_exit_group(ebx=int error_code)
        int 0x80
    --  xor ebx,ebx             -- (common requirement after int 0x80)
    [ELF64]
        mov rdi,rax             -- error_code (p1)
    --  mov rax,60              -- sys_exit(rdi=int error_code)
        mov rax,231             -- sys_exit_group(rdi=int error_code)
        syscall
    []
it may help?...
15. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1867 views
Unfortunately, no success. No harm, either, as far as I can see.
What puzzles me is why are there two 'p's running in the first place? Threads, perhaps?
And do you get the same results, or only one instance when running this test?
16. Re: Stopping phix
- Posted by petelomax Jan 02, 2021
- 1845 views
If I invoke the clib system('gedit'), it will not return until gedit closes.
So, in the bit of code you just edited, which implements the Phix system(), I do a fork and let the child take that wait penalty.
It is in fact the system(ps) that is creating the interim child (I couldn't have told you that when I woke up this morning).
I also get two p hello, but no hang and no zombies.
So, is it "still going wrong" as in "two p hello" or as in "compiled gtkwin hangs"?
17. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1852 views
Still get the two instances of p hello, whether running it interpreted or compiled.
It doesn't "hang", it comes back to the prompt.
If you add a line to the test, and open a process monitor, you'll see two instances of P, one a zombie. All I can figure is that phix (or the OS, somehow) is invoking p again on startup.
    puts(1,"Hello\n")
    system("ps -U irv axjf | grep hello",0)
    ?wait_key() -- add this
BUT WAIT! I may have solved it:
    ?wait_key() -- add this
    puts(1,"Hello\n")
    system("ps -U irv axjf | grep hello",0)
    ?wait_key()
Now, watch the system monitor. ONE instance is running. Hit any key. Now TWO instances are running! The second is brain dead. That right there ought to be a clue as to what is happening.
I suspect the reason the GtkWindows "hang around" when you use them is that the zombie process doesn't have the ability to tell GTK to clean up and shut down. Zombies aren't very good communicators, according to the TV show...
19. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1840 views
system(ps) is doing a fork.
Yeah, that makes sense. sorry for the wild goose chase.
But it still doesn't explain the results in message 13 above. I'm still curious about why the exact same program calls eui once, but p twice.
Anyway, the previous changes mean that interpreted programs close properly. I'll worry about compiled ones later. Working around the problem is easy enough.
20. Re: Stopping phix
- Posted by petelomax Jan 02, 2021
- 1840 views
But it still doesn't explain the results in message 13 above. I'm still curious about why the exact same program calls eui once, but p twice.
It makes perfect sense to me. Does this help:
    parent pid
    7209   7239 \_ p hello -- original/parent-post-fork paused at wait() [having exited builtins\syswait.ew]
    7239   7240 \_ p hello -- child-post-fork paused at system() [still inside builtins\syswait.ew]
    7240   7241 \_ sh -c ps -U irv axjf | grep hello
    7241   7243 \_ grep hello
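The fork-then-system arrangement described in this thread can be sketched in C roughly as follows. This is an illustration only, not the actual syswait.ew code: the function name is hypothetical, and decoding the exit status with WEXITSTATUS is a simplification chosen for the sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `cmd` by forking, letting the child sit inside the blocking libc
 * system() call, and having the original process wait for that child.
 * Returns the command's exit code, or -1 on failure.
 */
static int run_in_forked_child(const char *cmd)
{
    assert(cmd != NULL);
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Second copy of the program: the extra entry seen in ps. */
        int status = system(cmd);
        _exit(WIFEXITED(status) ? WEXITSTATUS(status) : 127);
    }
    /* Original copy: paused here until the child is done. */
    int status = 0;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

While the command runs, ps would list the calling program twice (the waiting original and the forked copy inside system()), followed by the sh -c process and the command itself.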
21. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1821 views
OK. Which of the first two steps aren't necessary for Euphoria to do? There was no wait_key() involved in the test program used in message 13:
    puts(1,"Hello\n")
    system("ps -U irv axjf | grep hello",0)
22. Re: Stopping phix
- Posted by petelomax Jan 02, 2021
- 1826 views
OK. Which of the first two steps aren't necessary for Euphoria to do?
Afaict eui invokes [C's] system() directly and somehow it returns immediately, which is point blank dead against everything I have ever read about it and my direct experience in a low-level debugger.
I am now utterly convinced this is a complete red herring and absolutely nothing to do with whatever else is going wrong.
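For reference, the blocking behaviour of C's system() is easy to check with a small timing sketch. The function name is made up for illustration, and the sketch says nothing about what eui does internally.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock seconds spent inside a single libc system() call. */
static double seconds_inside_system(const char *cmd)
{
    struct timespec start, end;
    assert(cmd != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    int status = system(cmd);     /* blocks until the shell it spawns exits */
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)status;
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

seconds_inside_system("sleep 1") should come out at roughly one second or more, since system() does not return until the shell has finished running the command.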
23. Re: Stopping phix
- Posted by irv Jan 02, 2021
- 1883 views
Afaict eui invokes [C's] system() directly and somehow it returns immediately, which is point blank dead against everything I have ever read about it and my direct experience in a low-level debugger.
I am now utterly convinced this is a complete red herring and absolutely nothing to do with whatever else is going wrong.
I agree.