#1755 closed bug (fixed)
APR 0.9.x configure hangs
Reported by: | andreasf | Owned by: | bonefish |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | - General | Version: | R1/pre-alpha1 |
Keywords: | Cc: | bonefish | |
Blocked By: | Blocking: | ||
Platform: | x86 |
Description
Configuring apr-0.9.17 or the 0.9.x SVN branch (--with-build=i586-pc-beos) reproducibly hangs at the point where it is supposed to generate the Makefiles.
Expected: almost instant generation of the Makefiles, as on R5.
Spawned processes include sed and sort. CPU usage is < 1% for > 5 mins. While this is likely some deadlock, it also happens with SMP disabled.
Entering the kernel debugger, listing all threads and exiting results in an immediate error "/bin/sort: write failed: standard output: Broken pipe" and generation of the Makefiles (and then another hang at "config.status: executing default commands"). When immediately exiting the kernel debugger without entering other commands, it still hangs.
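For reference, "Broken pipe" is the EPIPE error a writer receives when the read end of its pipe is gone. The following is a minimal Python sketch of that condition on a generic POSIX system, not a reproduction of the Haiku hang. The function name is made up for illustration.

```python
import errno
import os


def write_to_closed_pipe() -> int:
    """Write into a pipe whose read end is already closed; return the errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader is left on this pipe
    try:
        os.write(write_fd, b"sorted output\n")
    except BrokenPipeError as exc:
        # CPython ignores SIGPIPE by default, so the failure surfaces as EPIPE.
        return exc.errno
    finally:
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    print(errno.errorcode[write_to_closed_pipe()])
```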
Last experienced at hrev23891, with GCC 2.95.3 from Haiku site with its include dir replaced from Linux' generated/, Haiku headers and libs symlinked/copied.
Change History (14)
comment:1 by , 17 years ago
comment:2 by , 17 years ago
Make that "maximizing and restoring". After maximizing still nothing happens.
follow-up: 5 comment:3 by , 17 years ago
That the process continues after playing with the Terminal window size is likely due to our missing automatic syscall restarts (cf. #1743). Why it hangs in the first place is a different problem. If you can reproduce the problem, you could check in the kernel debugger, where the responsible thread hangs ("sc") and -- if you've kernel tracing enabled (for syscalls at least, even better also for signals and teams) -- also print the last "traced" entries of this thread.
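To illustrate what automatic syscall restart spares applications from: without it, a blocking call that is interrupted by a signal fails with EINTR and the caller has to re-issue it itself. That a window resize delivers such a signal (presumably SIGWINCH) is an assumption here; the ticket does not name it. The sketch below simulates the interruption, because current CPython already retries such calls internally. Both helper names are hypothetical.

```python
import errno
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_eintr(call: Callable[[], T]) -> T:
    """Re-issue call() for as long as it fails with EINTR (hypothetical helper)."""
    while True:
        try:
            return call()
        except InterruptedError:
            # Interrupted by a signal before completing; simply try again.
            continue


def make_flaky_read(interruptions: int) -> Callable[[], bytes]:
    """Simulate a blocking read that is interrupted a given number of times."""
    remaining = [interruptions]

    def flaky_read() -> bytes:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise InterruptedError(errno.EINTR, "Interrupted system call")
        return b"data"

    return flaky_read
```

With restart handled by the kernel, the equivalent of `retry_on_eintr` happens transparently and the program never sees the interruption.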
comment:4 by , 17 years ago
Any hint how to find the "responsible thread"? I'm not even sure which process, there are about five.
follow-up: 6 comment:5 by , 17 years ago
Cc: | added |
---|
Replying to bonefish:
If you can reproduce the problem, you could check in the kernel debugger, where the responsible thread hangs ("sc")
I do have five teams, each single-threaded, waiting for different semaphores (0x9...).
sh 9375 appears to be the configure script, and interpreting sc, it is waiting for a child process (kernel:wait_for_child).
sed 9376 appears to be reading from a pipe (kernel:pipefs_read).
sh 9380 appears to be writing to a pipe (kernel:pipefs_write).
sh 9381 appears to be waiting for a child process (kernel:wait_for_child).
sort 9382 appears to be reading from a pipe (kernel:pipefs_read).
Obviously I've shortened the symbol names and picked a meaningful one from the top of the list - if you need the full backtrace, is there a better way than a digicam?
Sounds like a reader-writer-lock problem to me.
I don't know how to interpret the sem output; however sem for the sed semaphore (0x94c7de9c) printed [*** READ/WRITE FAULT ***] as the last line and above as name two triangles and as id 0 and as owner 1 (count and queue both large negative numbers; all others had a hexadecimal next and negative next_id instead, no name and a negative id).
and -- if you've kernel tracing enabled (for syscalls at least, even better also for signals and teams) -- also print the last "traced" entries of this thread.
traced was not recognized as command in the kernel debugger. If I need to enable this to help debug this further, please tell me how.
comment:6 by , 17 years ago
Replying to andreasf:
Replying to bonefish:
If you can reproduce the problem, you could check in the kernel debugger, where the responsible thread hangs ("sc")
I do have five teams, each single-threaded, waiting for different semaphores (0x9...).
sh 9375 appears to be the configure script, and interpreting sc, it is waiting for a child process (kernel:wait_for_child).
sed 9376 appears to be reading from a pipe (kernel:pipefs_read).
sh 9380 appears to be writing to a pipe (kernel:pipefs_write).
sh 9381 appears to be waiting for a child process (kernel:wait_for_child).
sort 9382 appears to be reading from a pipe (kernel:pipefs_read).
Obviously I've shortened the symbol names and picked a meaningful one from the top of the list - if you need the full backtrace, is there a better way than a digicam?
If you don't have a serial port and a second computer to record the serial output, then taking a picture is the only way.
Sounds like a reader-writer-lock problem to me.
Doesn't look too bad. At least there are both pipe readers and writers. The question is why they don't make progress. Using the "team" command for each of the teams, you can also get (a part of) their command line arguments.
I don't know how to interpret the sem output; however sem for the sed semaphore (0x94c7de9c) printed [*** READ/WRITE FAULT ***] as the last line and above as name two triangles and as id 0 and as owner 1 (count and queue both large negative numbers; all others had a hexadecimal next and negative next_id instead, no name and a negative id).
If the "sem/cv" number listed by the "threads" command is greater than 0x80000000, then it isn't a semaphore but a condition variable (not unlikely, since the pipefs implementation does indeed use condition variables). You get information about it via the "cvar" command -- not much, since condition variables are quite simple.
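The rule of thumb above can be written down as a small check. This is only a sketch of the stated rule, not code from the kernel debugger; the function and constant names are invented for illustration.

```python
CONDITION_VARIABLE_THRESHOLD = 0x80000000  # boundary named in the rule above


def wait_object_kind(sem_cv: int) -> str:
    """Classify the "sem/cv" value listed by the "threads" command."""
    if sem_cv > CONDITION_VARIABLE_THRESHOLD:
        return "condition variable"  # inspect it with "cvar"
    return "semaphore"  # inspect it with "sem"
```

Applied to the value reported for sed, `wait_object_kind(0x94c7de9c)` classifies it as a condition variable, which would explain why "sem" printed garbage for it.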
and -- if you've kernel tracing enabled (for syscalls at least, even better also for signals and teams) -- also print the last "traced" entries of this thread.
traced was not recognized as command in the kernel debugger. If I need to enable this to help debug this further, please tell me how.
I recently wrote an article about the kernel debugger, including a section "Kernel Tracing" with a subsection "Enabling It":
http://www.haiku-os.org/documents/dev/welcome_to_kernel_debugging_land
comment:7 by , 17 years ago
Just a short note that I still get this at hrev23990, but with the new automatic syscall restart I can work around it by maximizing/minimizing, without a broken pipe error.
comment:8 by , 17 years ago
I also ran into this issue on hrev24209, and the max/minimize seems to restart just fine. I noticed that if I do a "ps", it prints "Bad semaphore ID(-1)" for sh and sed processes. But this seems to show up all the time so perhaps it doesn't mean anything.
follow-up: 10 comment:9 by , 17 years ago
The error message is due to my recent change to http://dev.haiku-os.org/changeset/24023/haiku/trunk/src/bin/ps.c
Previously, ps would just list the previous value found by the while loop for each such line; it now prints the error message instead.
I guess the -1 is due to waiting on a condition variable. http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/condition_variable.cpp?rev=23980 (Look for -1 in PrivateConditionVariableEntry::Wait())
follow-up: 11 comment:10 by , 17 years ago
Replying to jonas.kirilla:
I guess the -1 is due to waiting on a condition variable.
Yes, Ingo pointed this out above.
comment:11 by , 17 years ago
Replying to andreasf:
Replying to jonas.kirilla:
I guess the -1 is due to waiting on a condition variable.
Yes, Ingo pointed this out above.
Yeah, I just wanted to follow up on Andrew's observation of 'ps' output since I'm responsible for the latest change to it. I forgot to press reply (for proper quotation). I didn't mean to comment on the reported issue.
comment:12 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I can reproduce a problem with blocking pipes with unzip -l large.zip | less + "G" (not always, but often enough). Looking into it.
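For comparison, this is how a pipe is expected to behave: once it is full the writer has to wait, and it can continue as soon as the reader consumes some data. Below is a rough Python sketch for a generic POSIX system. It uses non-blocking mode so that the script reports the "would block" state instead of hanging. It is not the Haiku pipefs code, and the function name is hypothetical.

```python
import os
from typing import Tuple

CHUNK = b"x" * 4096


def fill_then_drain() -> Tuple[int, bool]:
    """Fill a pipe until it would block, drain one chunk, then write again."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # report "would block" instead of hanging
    written = 0
    try:
        while True:
            try:
                written += os.write(write_fd, CHUNK)
            except BlockingIOError:
                break  # pipe is full: a blocking writer would sleep here
        os.read(read_fd, len(CHUNK))  # the reader consumes one chunk
        try:
            os.write(write_fd, CHUNK)  # there should be room again now
            resumed = True
        except BlockingIOError:
            resumed = False
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written, resumed
```

A hang like the one described would correspond to the writer never being woken up even though the reader has made room.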
comment:14 by , 17 years ago
Finally got around to checking on this, and it no longer hangs for me. Thanks!
Checking the 1.2.x branch I got a similar hang. Interestingly, there, maximizing the Terminal window resolves the hang and makes it continue to the end without errors or further hangs.
For 0.9.x, maximizing the window at the hang leads to the same broken pipe error as when returning from kernel debugger and hangs again at the end; maximizing once more makes the script end normally without pipe error.