#10977 closed bug (fixed)
kdl in pthread_cond_wait on netsurf buildslave
Reported by: | pulkomandy | Owned by: | bonefish |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | | Cc: | hamish |
Blocked By: | | Blocking: | #11000 |
Platform: | All |
Description
Another KDL from NetSurf's build slave. We have tried to extract as much info as possible over IRC. This happens somewhere in OpenJDK. It seems the condition variable doesn't exist and two threads are accessing it.
This is on a virtual machine installed as follows: http://wiki.netsurf-browser.org/Continuous_Integration_Manual_Haiku_Slave_Setup
Attachments (5)
Change History (14)
comment:1 by , 10 years ago
The immediate cause of the panic is just a missing DEBUG_PAGE_ACCESS_END(context.page); in vm_soft_fault() before unlocking everything when having to wait for a to-be-unmapped page to become unwired.
Unfortunately that will leave another issue to be resolved: The page we want to unmap isn't actually wired in this case. We have two threads that want to wire the same virtual page for writing. The way wire_page() works, they both first mark the respective address range wired before calling vm_soft_fault() to map a writable page (there's only a readable one from a lower cache). Either thread ignores its own pre-wired range, but not that of the other thread. Hence the read-only page looks wired and cannot be unmapped. Both threads would wait forever.
Not sure how involved a solution would be. It might be possible to mark pre-wired ranges respectively and ignore them in vm_soft_fault(), but this needs to be thought through thoroughly (particularly when to unmark the ranges).
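To make the mutual wait easier to follow, here is a small user-space model of the situation described in this comment. It is only a sketch under stated assumptions: WiredRange, AddressSpaceModel, PreWire(), LooksWired() and BothThreadsWait() are hypothetical names invented for illustration, not kernel code, and the ignorePreWired switch merely illustrates the idea floated above, not necessarily the fix that was eventually committed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model. None of these names exist in the Haiku
// kernel; they only mirror the behaviour described in comment:1.
struct WiredRange {
	int owner;            // thread that marked the range wired
	std::uintptr_t base;  // start address of the range
	std::size_t size;     // length of the range in bytes
	bool preWired;        // marked wired, but no writable page mapped yet
};

struct AddressSpaceModel {
	std::vector<WiredRange> ranges;

	// Models the step wire_page() is described as taking before it calls
	// vm_soft_fault(): the range is marked wired up front.
	void PreWire(int thread, std::uintptr_t base, std::size_t size)
	{
		ranges.push_back({thread, base, size, true});
	}

	// Does the page at `address` look wired from the point of view of
	// `thread`? A thread always skips its own ranges. With ignorePreWired
	// set, ranges that are only pre-wired are skipped as well.
	bool LooksWired(int thread, std::uintptr_t address,
		bool ignorePreWired) const
	{
		for (const WiredRange& range : ranges) {
			if (range.owner == thread)
				continue;
			if (ignorePreWired && range.preWired)
				continue;
			if (address >= range.base && address - range.base < range.size)
				return true;
		}
		return false;
	}
};

// Two threads pre-wire the same page, then each checks whether the
// read-only page may be unmapped. Returns true if both would have to wait
// for the page to become unwired, i.e. neither can make progress.
bool BothThreadsWait(bool ignorePreWired)
{
	AddressSpaceModel space;
	const std::uintptr_t page = 0x1000;

	space.PreWire(1, page, 4096);
	space.PreWire(2, page, 4096);

	return space.LooksWired(1, page, ignorePreWired)
		&& space.LooksWired(2, page, ignorePreWired);
}
```

In this model BothThreadsWait(false) returns true (each thread sees the other's pre-wired range and waits), while BothThreadsWait(true) returns false. The model deliberately leaves out the hard part mentioned above, namely when a range should stop counting as pre-wired.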
by , 10 years ago
Attachment: | haiku-kdl-20140701.log added |
---|
Same panic occurred again, attaching complete syslog.
comment:3 by , 10 years ago
Cc: | added |
---|
comment:4 by , 10 years ago
Please don't attach more stuff to this ticket (unless it's a solution or a small test case). The problem itself is understood, as documented in comment:1. Adding more comments/attachments will just bury that comment.
If you run into an issue which you think might be different, please create a new ticket. The pattern here is: one thread panic()s in fault_get_page() while another thread waits in vm_soft_fault().
comment:5 by , 10 years ago
For reference, it would be nice to know the actual virtual machine software and version.
comment:7 by , 10 years ago
Attached a test program that fairly reliably reproduces the issue on (virtual) hardware with 3 or more CPUs. Tested with qemu.
Compile with:
g++ -Wall -o 10977 10977.cpp
Run with:
while ./10977; do true; done
comment:8 by , 10 years ago
Status: | new → in-progress |
---|
comment:9 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | in-progress → closed |
Backtraces of the two threads and KDL message