#9798 closed bug (fixed)
mount_nfs hangs, blocked in nfs add on
Reported by: | jua | Owned by: | pdziepak |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | Network & Internet/UDP | Version: | R1/Development |
Keywords: | nfs nfs_mount | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
While trying to mount NFS shares (old NFS, not v4) using mount_nfs, something goes wrong and the command line tool simply hangs and is then unkillable. It can neither be killed with Ctrl+C in the terminal, nor via ProcessController. The only way to get rid of it is a reboot.
I've tracked the problem down to the NFS file system add-on. The following happens in there:
(1) fs_mount() in nfs_add_on.c calls nfs_mount(), which fails for some reason (why it fails is possibly material for another bug report; I have not investigated that yet).
(2) fs_mount() thus goes to its error handling and calls shutdown_postoffice().
(3) shutdown_postoffice() sets the quit flag for the postoffice thread to true, closes the socket, and then waits for the postoffice thread to exit using wait_for_thread(). This wait_for_thread() never returns, which causes the hang.
(4) Meanwhile, in the postoffice thread: the thread is blocked in recvfrom() inside its main loop in postoffice_func(). Since nothing is received anymore, it waits there forever and never sees that its quit flag was set, so it never terminates.
(5) => deadlock!
I'm not quite sure what the correct way to handle it would be -- maybe a simple workaround would be to set a read timeout on the socket so the postoffice thread would terminate at least at some point. Any ideas?
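For illustration, the read-timeout workaround could look roughly like the sketch below. It is not the add-on's actual code: `struct postoffice`, `postoffice_loop()`, `postoffice_start()` and `postoffice_stop()` are hypothetical names, and it uses POSIX threads instead of the kernel's wait_for_thread() so that it compiles on any POSIX system. It also joins the thread before closing the socket, which is a different order from the one described in step (3).

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the add-on's postoffice state. */
struct postoffice {
	int         fd;      /* UDP socket the receive thread reads from */
	atomic_bool quit;    /* set by postoffice_stop() to request exit */
	pthread_t   thread;  /* the receive thread */
};

/* Receive loop: because of SO_RCVTIMEO, recvfrom() returns at least once
 * per timeout interval, so the quit flag gets re-checked regularly. */
static void *postoffice_loop(void *arg)
{
	struct postoffice *po = arg;
	char buffer[2048];

	while (!atomic_load(&po->quit)) {
		ssize_t bytes = recvfrom(po->fd, buffer, sizeof(buffer), 0,
			NULL, NULL);
		if (bytes < 0) {
			if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
				continue;  /* timed out or interrupted: re-check quit */
			break;         /* any other error: give up */
		}
		/* ... hand the received reply to whoever is waiting for it ... */
	}
	return NULL;
}

/* Creates the socket with a receive timeout and starts the thread.
 * Returns 0 on success, -1 on failure. */
static int postoffice_start(struct postoffice *po, long timeout_ms)
{
	struct timeval tv = {
		.tv_sec = timeout_ms / 1000,
		.tv_usec = (timeout_ms % 1000) * 1000
	};

	po->fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (po->fd < 0)
		return -1;
	atomic_init(&po->quit, false);

	if (setsockopt(po->fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0
		|| pthread_create(&po->thread, NULL, postoffice_loop, po) != 0) {
		close(po->fd);
		return -1;
	}
	return 0;
}

/* Requests exit, waits for the thread, then closes the socket.
 * The wait is bounded by roughly one timeout interval. */
static int postoffice_stop(struct postoffice *po)
{
	atomic_store(&po->quit, true);
	int error = pthread_join(po->thread, NULL);
	close(po->fd);
	return error == 0 ? 0 : -1;
}
```

With a timeout of, say, 100 ms, `postoffice_stop()` would return after at most about one timeout interval even if no packet ever arrives. The cost is that shutdown is delayed by up to that interval and the thread wakes up periodically while idle.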
Change History (6)
comment:1 by , 11 years ago
Component: | Network & Internet → File Systems/NFS |
---|---|
Owner: | changed from | to
comment:2 by , 11 years ago
Component: | File Systems/NFS → Network & Internet/UDP |
---|---|
Owner: | changed from | to
comment:3 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I came across this issue last summer when working on NFS4. Unfortunately, I completely forgot to commit the patch.
hrev45719 should fix this bug for UDP. I have no idea, though, whether the problem exists in the implementations of other transport layer protocols (i.e. TCP since we do not support SCTP). Anyway, our NFS2 client is UDP only so I believe this ticket can be closed.
comment:4 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:5 by , 11 years ago
I've definitely developed software that relied on that specific feature in TCP, so I'm pretty sure it at least did work at one point, and probably still does :-)
While the behavior of the socket is undefined in this case, the policy Haiku follows is that functions waiting on a file descriptor will return when the file descriptor is closed.
If UDP (I assume?) does not follow that policy, it should be fixed. Thanks for the investigation!