mount_nfs hangs, blocked in nfs add on
|Reported by:||jua||Owned by:||pdziepak|
|Component:||Network & Internet/UDP||Version:||R1/Development|
|Has a Patch:||no||Platform:||All|
While trying to mount NFS shares (old NFS, not v4) using mount_nfs, something goes wrong and the command line tool simply hangs and is then unkillable. It can neither be killed with Ctrl+C in the terminal, nor via ProcessController. The only way to get rid of it is a reboot.
I've tracked the problem down to the NFS file system add-on, the following happens in there:
(1) fs_mount() in nfs_add_on.c calls nfs_mount(), which fails for some reason (why that fails is possibly material for another bug report, have not investigated that yet) (2) fs_mount() thus goes to its error handling and calls shutdown_postoffice() (3) shutdown_postoffice() sets the quit-flag for the postoffice-thread to true, closes the socket and then waits for the postoffice-thread to exit using wait_for_thread(). This wait_for_thread() never returns and causes the hanging. (4) ... meanwhile in the postoffice-thread: The postoffice-thread is currently in recvfrom() inside its main loop in postoffice_func(). Since nothing is received anymore, it waits there forever and doesn't see that its quit flag was set, so it never terminates. (5) => deadlock!
I'm not quite sure what the correct way to handle it would be -- maybe a simple workaround would be to set a read timeout on the socket so the postoffice thread would terminate at least at some point. Any ideas?
Change History (6)
comment:2 by , 7 years ago
|Component:||File Systems/NFS → Network & Internet/UDP|