Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#4194 closed bug (invalid)

fuse (userlandfs) causes a page fault - user access in kernel area

Reported by: Blub
Owned by: bonefish
Priority: normal
Milestone: R1
Component: File Systems
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

I'm currently trying to run sshfs on Haiku, svn rev 32167. I was able to successfully run the "hello-world" FUSE filesystem, but when I try to mount using sshfs, a page fault happens when creating the file-cookie locker, just after I have logged in and typed in my password. I built userlandfs and its subdirectories with debug information and have prepared GDB and syslog output. I am trying to debug this, but I hardly know the code, so it's hard for me to find the problem.

It's a bit hard to reproduce since sshfs depends on glib2 (with thread support) and you need a fuse pkgconfig (.pc) file, but once you have that you can simply - without any changes to the source - do:
./configure
make
mkdir ~/config/add-ons/userlandfs #if not already done
cp sshfs ~/config/add-ons/userlandfs/sshfs

then:
mount -t userlandfs -o 'sshfs user@host:' mountpoint/

You can download all the needed dependencies from:
http://rear.endoftheinternet.org/~blub/haiku-files/
(scroll down to the package list)
They can simply be unzipped into /boot: unzip -d /boot package....pkg.zip

glib2 there is compiled with the system PCRE, so you need to download:
pcre
glib2
fuse-proto (if you don't already have a fuse.pc file)
pkgconfig (so ./configure finds what it needs)
and the current sshfs-fuse source from http://fuse.sourceforge.net/sshfs.html

syslog and gdb output files follow...

The Problem

The problem happens within FUSE, so I doubt it is a problem with sshfs.
And even if it was a bug in sshfs, it shouldn't be allowed to crash userlandfs/fuse in such a way.
After such an unsuccessful mount, any further attempt to mount something else results in the message:
mount: General system error

Attachments (3)

haiku_userlandfs_fuse_sshfs_syslog.debug.txt (101.9 KB) - added by Blub 10 years ago.
syslog output after userlandfs_server crashes
haiku_userlandfs_fuse_sshfs_gdb.debug.txt (4.5 KB) - added by Blub 10 years ago.
gdb backtrace after the crash
haiku_userlandfs_moredebug.txt (8.5 KB) - added by Blub 10 years ago.
I added the debug flag to some more directories, this might be more helpful now


Change History (21)

Changed 10 years ago by Blub

syslog output after userlandfs_server crashes

Changed 10 years ago by Blub

gdb backtrace after the crash

comment:1 Changed 10 years ago by anevilyak

Owner: changed from axeld to bonefish

Changed 10 years ago by Blub

I added the debug flag to some more directories, this might be more helpful now

comment:2 Changed 10 years ago by Blub

(damnit, consider that 3rd file I added useless, since I messed something up when I added some debug output :( )

comment:3 Changed 10 years ago by Blub

KERN: userlandfs [156373326: 136] userlandfs_open() done: (7ffffff1, 0x800b396d)
KERN: userlandfs [156474719: 136] userlandfs_close(0x8105a72c, 0x810c5f60, 0x800b396d)

So basically, it tries to open something, and even though it fails, it tries to close it later... Actually, it seems this goes out of the userlandfs part since it does send the correct error value back to the kernel in userlandfs/kernel_interface.cpp

comment:4 Changed 10 years ago by Blub

Okay, even though it makes sense that the file cookie doesn't need to be valid on error, it still helps debugging, and it's done in the other requests (like OpenDir), so I suggest this for consistency: http://stud4.tuwien.ac.at/~e0725517/patches/haiku_null_file_cookie.diff

comment:5 Changed 10 years ago by Blub

PROGRESS!! :D

Okay, I got sshfs to work by making userlandfs_open return -1 on error. My motivation was that in vfs.cpp all the status checks looked like:
if(status < B_OK)

And I noticed the status being a "high positive" number in the syslog.

Now the question is, which part is now actually returning a wrong error-code? Is it fuse or userlandfs?

When I read the fuse-hellofs I noticed: FUSE filesystems are supposed to return negated error values, like -EACCES. So my first thought would be that FUSE doesn't handle that correctly?
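To make the convention concrete, here is a minimal sketch of a FUSE-style open handler. It is not the real FUSE API: `sketch_open` and its plain `flags` parameter are hypothetical stand-ins so the snippet compiles without the FUSE headers. The only point is that success is 0 and failure is a negated errno constant.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>

/* Hypothetical stand-in for a FUSE open callback: the real callback
 * takes a struct fuse_file_info instead of a plain flags argument. */
int sketch_open(const char *path, int flags)
{
	/* Only one file exists in this toy filesystem. */
	if (strcmp(path, "/hello") != 0)
		return -ENOENT;

	/* The file is read-only, so reject any other access mode. */
	if ((flags & O_ACCMODE) != O_RDONLY)
		return -EACCES;

	return 0;
}
```

Note that this only yields a negative result where the errno constants are positive. On a platform where they are themselves negative, the negation would produce a positive value, which might explain the "high positive" status in the syslog.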

I'll see if I can find out more :)

comment:6 Changed 10 years ago by axeld

You have to compile POSIX sources with the B_USE_POSITIVE_POSIX_ERRORS macro set to 1.

Error codes are negative in Haiku, but POSIX requires them to be positive (they changed that at some point, causing some trouble for BeOS/Haiku).

In any case, the "< B_OK" checks can most of the time be replaced by a "!= B_OK".
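As a rough illustration of the difference between the two checks, the sketch below feeds both of them the status value 7ffffff1 that appears in the syslog excerpt earlier in this ticket. `SKETCH_OK` is a stand-in for B_OK (assumed to be 0), and the function names are made up for this example.

```c
#include <stdint.h>

#define SKETCH_OK 0	/* stand-in for B_OK, assumed to be 0 */

/* Treats only negative values as errors. */
int is_error_lt(int32_t status)
{
	return status < SKETCH_OK;
}

/* Treats every value other than "OK" as an error. */
int is_error_ne(int32_t status)
{
	return status != SKETCH_OK;
}
```

With the "< 0" style check a large positive status slips through as success, while the "!= 0" style check still reports it as a failure. The stricter check is only valid for calls whose positive results carry no meaning of their own.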

comment:7 Changed 10 years ago by Blub

So sshfs should work when I compile it with -DB_USE_POSITIVE_POSIX_ERRORS?

Even then, the filesystem which *uses* fuse shouldn't be able to mess up Haiku in such a way that you cannot mount anything else after that happened, so I think the FUSE implementation should in any case negate positive error values.

comment:8 in reply to:  7 Changed 10 years ago by axeld

Replying to Blub:

So sshfs should work when I compile it with -DB_USE_POSITIVE_POSIX_ERRORS?

I haven't tried it, but that should have been more or less the point of the FUSE compatibility :-) Ingo would probably know more, but he's obviously short on time now.

Even then, the filesystem which *uses* fuse shouldn't be able to mess up Haiku in such a way that you cannot mount anything else after that happened, so I think the FUSE implementation should in any case negate positive error values.

Right thought, wrong solution: in hrev32184 I've made the VFS more robust against broken file systems.

comment:9 Changed 10 years ago by Blub

Okay, using !=B_OK in the VFS makes sense but I didn't know it's a valid solution, since I obviously don't know the rest of the system well enough. So I wouldn't know if maybe some other filesystem implementations relied on positive return codes not being an error.

Maybe later when I know the code better I can actually provide suitable patches for the (hopefully very few ;) ) other problems/bugs/... I'll encounter :)

Okay, so I can mount the filesystem now, even without modifications, and I can browse it, however, there are two problems left:
1) I cannot unmount it, it says "device/file/resource busy" even if I had just mounted it and not used the FS at all yet.
2) The userlandfs_server doesn't seem to be started automatically (or fails). So I have to call /boot/system/servers/userlandfs_server sshfs manually first.

comment:10 Changed 10 years ago by Blub

(oh, maybe I should note that unmount -f just hangs)

comment:11 Changed 10 years ago by Blub

Okay, seems like without B_USE_POSITIVE_POSIX_ERRORS browsing doesn't work as well as I thought. I can use ls mnt/blah/... copy, read, remove etc. but when I actually 'cd' into the filesystem and use 'ls' from within, the KDL pops up. I'm not getting these problems with B_USE_POSITIVE_POSIX_ERRORS, so I guess there's still some point in the code which isn't as robust yet.
KDL bt output: http://git.rear.endoftheinternet.org/~blub/images/haiku-sshfs-evil.png
tail of KDL syslog output: http://git.rear.endoftheinternet.org/~blub/images/haiku-sshfs-evil-syslog.png

comment:12 Changed 10 years ago by Blub

Somehow I missed that the unmount problem also only exists without B_USE_POSITIVE_POSIX_ERRORS. However, this still makes userlandfs unusable once it's in this state, which I think shouldn't be possible for the userspace fuse part.

comment:13 Changed 10 years ago by bonefish

Resolution: invalid
Status: new → closed

B_USE_POSITIVE_POSIX_ERRORS has to be used together with the POSIX error mapper library. My configure line (from a few months ago -- packages might have changed) looks like this:

SSHFS_CFLAGS="-DB_USE_POSITIVE_POSIX_ERRORS -I/boot/develop/headers/userlandfs/fuse -I/boot/common/include/glib-2.0 -I/boot/common/lib/glib-2.0/include -D_FILE_OFFSET_BITS=64" SSHFS_LIBS="-lposix_error_mapper -lnetwork -luserlandfs_fuse -lglib-2.0 -lgthread-2.0" ./configure --prefix=/boot/common

comment:14 Changed 10 years ago by Blub

Yes, with B_USE_POSITIVE_POSIX_ERRORS most things work, however:
With this the userlandfs_server still isn't automatically started.
Fuse still only allows one volume to be mounted so you cannot have multiple ssh filesystems mounted at the same time.
And personally I don't think it is acceptable that without these flags you can end up in the KDL, although this would only happen to developers.

Should I file tickets for the first 2 problems? (what about the 3rd?)

comment:15 in reply to:  14 Changed 10 years ago by bonefish

Replying to Blub:

Yes, with B_USE_POSITIVE_POSIX_ERRORS most things work, however:
With this the userlandfs_server still isn't automatically started.

Actually it is automatically started. The problem is that sshfs starts ssh which in turn tries to open the controlling terminal. Since the userlandfs_server when started by the kernel doesn't have a controlling terminal, ssh fails immediately and sshfs fails to mount. That's not a problem of the userlandfs, though, but of the sshfs "port". A real port should handle the interactive authentication differently. For lazy ports the FUSE bridge could offer a feature to open a terminal, but such an option doesn't exist yet.
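A quick way to see the difference between the two start modes is to probe for a controlling terminal from inside the process. The sketch below assumes the conventional POSIX path /dev/tty; the function name is made up for this example and nothing like it exists in userlandfs.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the calling process can open its controlling terminal,
 * 0 otherwise. "/dev/tty" is the conventional POSIX name for it. */
int has_controlling_terminal(void)
{
	int fd = open("/dev/tty", O_RDWR | O_NOCTTY);
	if (fd < 0)
		return 0;

	close(fd);
	return 1;
}
```

A server started by hand from a Terminal would be expected to get 1 here, one started by the kernel 0, and a program that insists on prompting through the terminal fails in the second case.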

Fuse still only allows one volume to be mounted so you cannot have multiple ssh filesystems mounted at the same time.

Yep, that's a known missing feature. For a file system add-on with Haiku or BeOS kernel style interface only one userlandfs_server needs to be started, since one instance can mount an arbitrary number of volumes. FUSE requires a process per volume, though, which doesn't quite fit the userlandfs design yet.

And personally I don't think it is acceptable that without these flags you can end up in the KDL, although this would only happen to developers.

Userland file systems shouldn't be able to cause KDLs. If that still happens after hrev32184, it's worth filing a ticket.

Should I file tickets for the first 2 problems? (what about the 3rd?)

The first isn't a bug/missing feature in Haiku or userlandfs, so there's no point in filing a ticket. Feel free to file a ticket for the second one -- I'm sure there's a TODO in the code, so you don't really have to.

comment:16 Changed 10 years ago by Blub

I was wondering if it would make sense to move the fuse-fs initialization from FUSEFileSystem to FUSEVolume, and instead of having one server per fuse-fs linked to the fuse library, have a fuse-userlandfs which would instantiate the fuse programs for each volume. Of course then the mount command would look something like "mount -t userlandfs -p 'fuse sshfs user@host:' mntpt". What do you think about that?

For the password input, I'll do the necessary porting there then :) Maybe a pop-up box would do, although, would it be possible to get the terminal of the process which executes mount? Such that if you use mount from a terminal, you could type in the password there, and if it is no terminal at all, a popup-box could appear? Would there be any sane way of getting the mount-process' terminal?

comment:17 in reply to:  16 Changed 10 years ago by bonefish

Replying to Blub:

I was wondering if it would make sense to move the fuse-fs initialization from FUSEFileSystem to FUSEVolume, and instead of having one server per fuse-fs linked to the fuse library, have a fuse-userlandfs which would instantiate the fuse programs for each volume. Of course then the mount command would look something like "mount -t userlandfs -p 'fuse sshfs user@host:' mntpt". What do you think about that?

That would be quite a bit of work. At least the solution I have in mind is significantly simpler. The userlandfs kernel module maintains a map from file system name to userlandfs server instance, so the same instance can be reused for mounting more volumes. I was thinking of introducing a FS capability flag indicating that only one volume can be mounted by a server instance, which would cause the kernel module to forget previous instances. A minor problem is that the mechanism to start a userlandfs server first and mount the volume afterwards requires the kernel module to actively look for a server instance for the respective file system (via find_port()). This can be solved by adjusting the handshake protocol accordingly, though.
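Roughly, the idea could be sketched like this. It is a toy model, not the actual kernel module code: the table, the `single_volume` flag and all function names are hypothetical, and a plain int stands in for the server's port.

```c
#include <string.h>

#define MAX_SERVERS 16
#define NAME_LEN 64

/* One known server instance per file system name (hypothetical layout). */
struct server_entry {
	char fs_name[NAME_LEN];
	int port;		/* stand-in for the server's port */
	int single_volume;	/* hypothetical capability flag */
	int in_use;
};

static struct server_entry servers[MAX_SERVERS];

/* Remembers a freshly started server. Returns 0, or -1 if the table is full. */
int registry_register(const char *fs_name, int port, int single_volume)
{
	for (int i = 0; i < MAX_SERVERS; i++) {
		if (servers[i].in_use)
			continue;
		strncpy(servers[i].fs_name, fs_name, NAME_LEN - 1);
		servers[i].fs_name[NAME_LEN - 1] = '\0';
		servers[i].port = port;
		servers[i].single_volume = single_volume;
		servers[i].in_use = 1;
		return 0;
	}
	return -1;
}

/* Picks a server for the next mount of fs_name. A single-volume server is
 * forgotten as soon as it is handed out, so the following mount finds
 * nothing and has to start a new instance. Returns the port, or -1. */
int registry_take_for_mount(const char *fs_name)
{
	for (int i = 0; i < MAX_SERVERS; i++) {
		if (!servers[i].in_use || strcmp(servers[i].fs_name, fs_name) != 0)
			continue;
		if (servers[i].single_volume)
			servers[i].in_use = 0;
		return servers[i].port;
	}
	return -1;
}
```

A multi-volume server keeps being returned for every mount, while a FUSE-style server is returned exactly once.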

For the password input, I'll do the necessary porting there then :) Maybe a pop-up box would do, although, would it be possible to get the terminal of the process which executes mount? Such that if you use mount from a terminal, you could type in the password there, and if it is no terminal at all, a popup-box could appear? Would there be any sane way of getting the mount-process' terminal?

I don't think so. The mounting process is known, but I don't think there's a way to get a process' controlling terminal ATM. I wouldn't find that particularly clean either. I'm pretty sure at some point we will introduce auto mounting and some kind of server that mounts volumes from the previous session (currently done by Tracker), and then there probably isn't a terminal at hand anyway. I don't think ssh can be used via pipes, so I guess one has to open a new tty for communicating with ssh. Maybe ssh has helpful options...

comment:18 Changed 10 years ago by Blub

I'm currently experimenting with a "password_app" option in sshfs, which basically works like this:
mount -t userlandfs -p 'sshfs -o password_app=/boot/common/bin/password_prompt user@host:' mntpt

When sshfs starts ssh, it creates a pty and makes it the controlling terminal for ssh, then it waits for ssh to put the password prompt into that pty, and if ssh prints to stdout first, then it assumes that the password prompt isn't going to show up.
It works fine when I start the server manually, however, when I do not, there are some problems there:

  • sshfs' execvp() call cannot find "ssh" - not really a problem since I can use: -o ssh_command=/boot/common/bin/ssh
  • userlandfs_server's stdout somehow ends up in the socket that sshfs creates using socketpair() to replace the fork's stdin/stdout. I assume that the fork()/exec() combination doesn't prevent userlandfs_server's stdout from ending up in the fork's stdout fd, which causes sshfs to not ask for the password. (Some fuse functions use printf(...) output, and with debug flags there's even more output.)