Opened 10 months ago

Closed 9 months ago

Last modified 9 months ago

#18805 closed bug (fixed)

haikuporter cannot unmount packages after building some packages

Reported by: 3dEyes
Owned by: waddlesplash
Priority: normal
Milestone: R1/beta5
Component: System/Kernel
Version: R1/Development
Keywords:
Cc: korli
Blocked By:
Blocking: #18849
Platform: All

Description

At the end of building large packages like mesa or libreoffice I get an error:

unmount: unmounting failed: Device/File/Resource busy
Unable to unmount /FastData/HaikuPorts/haikuports/app-office/libreoffice/work-24.2.1.1/boot/system.
FDs in use by applications:
Haikuporter could not unmount 'system' volume in chroot. Would you like to force an unmount? [y/N]

The problem apparently appeared after the changes in hrev57501 - hrev57508.

PS: Haiku x86_64, hrev57581

Attachments (2)

unmount_fail.text (30.5 KB) - added by begasus 10 months ago.
Unmount failure syslog
unmount_fail_use-F.txt (28.1 KB) - added by begasus 10 months ago.
Running haikuporter with -F

Change History (22)

comment:1 by waddlesplash, 10 months ago

Component: - General → System/Kernel

comment:2 by 3dEyes, 10 months ago

Owner: changed from nobody to waddlesplash
Status: new → assigned

comment:3 by waddlesplash, 10 months ago

Any chance this happens with any smaller packages than those?

That range contains not only my refactors but also korli's change to open() behavior. Any chance it could be narrowed down further?

comment:4 by 3dEyes, 10 months ago

At the moment I can reproduce it reliably on only two recipes: mesa and libreoffice. Today I also built a few dozen small packages, and all of them were fine.

comment:5 by begasus, 10 months ago

This also happens on R1B4 (both architectures).

comment:6 by bipolar, 10 months ago

One "easy" way for me to trigger this (not sure if 100% reliably though) is:

Start a haikuporter build, say, for Python. While the build is in progress, use Tracker's drill-down menus to look up something under the work-* directory.

That usually ends up with:

Unable to unmount /boot/home/SourceCode/haikuports/haikuports/dev-libs/tvision/work-2023.10.03~git/boot/system.
FDs in use by applications:
  317 105  R    18:257 /boot/system/Tracker
  317 107  R    18:255 /boot/system/Tracker
  317 108  R    18:264 /boot/system/Tracker
  317 111  R    18:226 /boot/system/Tracker
  317 113  R    18:504 /boot/system/Tracker
Haikuporter could not unmount 'system' volume in chroot. Would you like to force an unmount? [y/N]

Replying yes results in this nasty-looking output:

Forcing unmount
Command '['bash', '-c', '\n\ncheckedUnmount()\n{\n\tlocal mountPoint="$1"\n\n\tif ! [[ $mountPoint = /* ]]; then\n\t\tmountPoint=$PWD/$mountPoint\n\tfi\n\n\t# retry up to 5 times to unmount the given mountpoint\n\tlocal x=0\n\twhile true; do\n\t\tif unmount "$mountPoint"; then\n\t\t\tbreak\n\t\tfi\n\n\t\tlet x+=1\n\t\tif [ $x -ge 5 ]; then\n\t\t\techo -e "Unable to unmount $mountPoint.\\nFDs in use by applications:"\n\t\t\tfdinfo -d "$mountPoint"\n\n\t\t\tread -r -d \'\' message <<-"EOF"\n\t\t\t\tHaikuporter could not unmount "\'$(basename $mountPoint)\'" volume\n\t\t\t\tin chroot. Would you like to force an unmount? [y/N]\n\t\t\t\tEOF\n\t\t\tmessage=$(eval echo -e $message)\n\n\t\t\tnoForceUnmount=1\n\t\t\tif [ -t 0 ]; then\n\t\t\t\tread -p "$message" -n 1 -r\n\t\t\t\t[[ $REPLY =~ ^[Yy]$ ]]\n\t\t\t\tnoForceUnmount=$?\n\t\t\t\techo "$noForceUnmount"\n\t\t\telse\n\t\t\t\t# not running interactively, force an unmount anyway\n\t\t\t\tnoForceUnmount=0\n\t\t\tfi\n\n\t\t\tif [ $noForceUnmount -eq 0 ]; then\n\t\t\t\techo "Forcing unmount"\n\t\t\t\tunmount -f "$mountPoint"\n\t\t\tfi\n\n\t\t\t# fail no matter what was decided\n\t\t\texit 1\n\t\tfi\n\n\t\techo "unmounting $mountPoint failed - wait and retry ..."\n\t\tsleep $x\n\tdone\n}\n\n# ignore sigint\ntrap \'\' SIGINT\n\n# try to make sure we really are in a work directory\nif ! echo $(basename $PWD) | grep -qE \'^work-\'; then\n\techo "cleanupChroot invoked in $PWD, which doesn\'t seem to be a work dir!"\n\texit 1\nfi\n\n# if it is defined, unmount the cross-build sysroot\nif [[ -n $crossSysrootDir && -e $crossSysrootDir/boot/system/develop ]]; then\n\tcheckedUnmount $crossSysrootDir/boot/system\nfi\n\ncheckedUnmount dev\ncheckedUnmount boot/system\n\n# wipe files and directories if it is ok to do so\nif [[ $buildOk ]]; then\n\techo "cleaning chroot folder"\n\trm -rf \\\n\t\tboot \\\n\t\tbuild-packages \\\n\t\tdev \\\n\t\tpackage-infos \\\n\t\tpackages \\\n\t\tpackaging \\\n\t\tprereq-repository \\\n\t\trepository\n\trm -f \\\n\t\t.PackageInfo \\\n\t\tbin \\\n\t\tetc \\\n\t\tport.recipe \\\n\t\tsystem \\\n\t\ttmp \\\n\t\tvar\nelse\n\techo "cleaning \'chroot/boot\' folder"\n\trm -rf boot\n\techo "keeping chroot folder $PWD intact for inspection"\nfi\n']' returned non-zero exit status 1.

After that, I still have to manually unmount system from the chroot.
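For readability, the retry logic buried in the escaped dump above can be sketched as a small standalone function. This is an illustration only, not haikuporter's code: retry_with_backoff and the RETRY_SLEEP hook are hypothetical names, and a generic command stands in for unmount so the sketch can run anywhere. Two details of the dumped script are worth noting: it runs exit 1 after the prompt "no matter what was decided", which is consistent with the non-zero exit status reported even after a forced unmount, and it echoes $noForceUnmount right after reading the reply.

```shell
# Retry a command up to max_tries times, sleeping 1s, 2s, ... between
# attempts (mirrors the "sleep $x" in the dumped script).
# RETRY_SLEEP is a hypothetical hook so the delay can be stubbed out.
# Note: max_tries and attempt are plain globals to keep the sketch portable.
retry_with_backoff() {
	max_tries=$1
	shift
	attempt=0
	while true; do
		if "$@"; then
			return 0
		fi
		attempt=$((attempt + 1))
		if [ "$attempt" -ge "$max_tries" ]; then
			echo "giving up after $attempt attempts: $*" >&2
			return 1
		fi
		echo "attempt $attempt failed - wait and retry ..." >&2
		${RETRY_SLEEP:-sleep} "$attempt"
	done
}

# On Haiku this might be invoked as (not run here):
#   retry_with_backoff 5 unmount "$mountPoint" || fdinfo -d "$mountPoint"
```

With a limit of 5 this makes five attempts separated by four waits, the same shape as the log in the ticket description.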

comment:7 by X512, 10 months ago

> Use Tracker's drill-down menus to look up something under the work-* directory.

I suspect that is a different problem, one that occurs when files in the work directory are held open by Tracker or some other application.

comment:8 by bipolar, 10 months ago (in reply to: 7)

FWIW, the only other way I've ever seen logs exactly like those in 3dEyes's description (where "FDs in use by applications:" lists nothing before "Haikuporter could not unmount 'system' volume in chroot.") is when:

Running parallel builds or --test (for Python or uncrustify, for example) and having to kill some of the sub-processes when they run amok.

But yeah, both of my cases need "user-interaction" to trigger, so I guess that makes them a different issue.
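The two log shapes discussed in this thread (a listing that names Tracker versus an empty listing) can be told apart mechanically. The sketch below is a hedged helper, not part of haikuporter or fdinfo: summarize_busy_fds is a hypothetical name, and it assumes each listing line ends with the path of the application holding the descriptor, as in the Tracker lines in comment 6. Paths containing spaces would not be handled.

```shell
# Summarise the FD listing printed between the "FDs in use by applications:"
# header and the "Haikuporter could not unmount" prompt (log read from stdin).
# Assumption: the last whitespace-separated field is the application path.
summarize_busy_fds() {
	awk '
		/FDs in use by applications:/ { in_list = 1; next }
		/Haikuporter could not unmount/ { in_list = 0 }
		in_list && NF > 0 { count[$NF]++; total++ }
		END {
			if (total == 0) {
				print "no FDs listed"
			} else {
				for (app in count)
					print count[app], app
			}
		}
	'
}
```

Usage: pipe a saved build log into the function, for example `summarize_busy_fds < build.log` (the file name is illustrative). A "no FDs listed" result corresponds to the case in the ticket description, where nothing in userland is reported as holding the mount.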

comment:9 by begasus, 10 months ago

Just hit this while building kdenlive. I'm attaching the syslog from the first run, and a second one from running with "haikuporter -F *".

by begasus, 10 months ago

Attachment: unmount_fail.text added

Unmount failure syslog

by begasus, 10 months ago

Attachment: unmount_fail_use-F.txt added

Running haikuporter with -F

comment:10 by kallisti5, 9 months ago

Blocked By: 18849 added

comment:11 by waddlesplash, 9 months ago

Seems the problem can be reproduced with Mesa even on an incremental build/install. So that makes testing pretty simple.

comment:12 by kallisti5, 9 months ago

Yeah. I've found a workaround for getting stuff built on riscv64:

  • Build recipe
  • Once complete, you see:
    mimesetting files for package libtool_libltdl-2.4.6-3-riscv64.hpkg ...
    creating package libtool_libltdl-2.4.6-3-riscv64.hpkg ...
    ----- Package Info ----------------
    header size:                     80
    heap size:                    52133
    TOC size:                       230
    package attributes size:        578
    total size:                   52213
    -----------------------------------
    waiting for build package libtool-2.4.6-3 to be deactivated
    waiting for build package libtool_libltdl-2.4.6-3 to be deactivated
    unmount: unmounting failed: Device/File/Resource busy
    unmounting /boot/home/haikuports/dev-build/libtool/work-2.4.6/boot/system failed - wait and retry ...
    unmount: unmounting failed: Device/File/Resource busy
    unmounting /boot/home/haikuports/dev-build/libtool/work-2.4.6/boot/system failed - wait and retry ...
    unmount: unmounting failed: Device/File/Resource busy
    unmounting /boot/home/haikuports/dev-build/libtool/work-2.4.6/boot/system failed - wait and retry ...
    unmount: unmounting failed: Device/File/Resource busy
    unmounting /boot/home/haikuports/dev-build/libtool/work-2.4.6/boot/system failed - wait and retry ...
    unmount: unmounting failed: Device/File/Resource busy
    Unable to unmount /boot/home/haikuports/dev-build/libtool/work-2.4.6/boot/system.
    FDs in use by applications:
    Haikuporter could not unmount 'system' volume in chroot. Would you like to force an unmount? [y/N]n1
    

Say n, reboot, and rebuild. The packaging then succeeds, since there is less disk activity to potentially "hold" open the packagefs mount.

comment:13 by waddlesplash, 9 months ago

Blocked By: 18849 removed
Blocking: 18849 added
Cc: korli added

The culprit appears to be hrev57507; reverting that fixes the problem here.

comment:14 by waddlesplash, 9 months ago

(Specifically, just 77cf55c4ad5e09116e956223cfdfd7695bb903ae.)

I've been looking at that commit for a while and added some asserts around its edge cases, but so far I haven't figured out what could be leaking vnode references in there.

comment:15 by waddlesplash, 9 months ago

I found a few edge cases and pushed a change addressing them for review: https://review.haiku-os.org/c/haiku/+/7529

However, this does not fix the problem.

comment:16 by waddlesplash, 9 months ago

> (Specifically, just 77cf55c4ad5e09116e956223cfdfd7695bb903ae.)

It seems I mistested somehow, because that's not actually the problem. The real culprit is 79c0b6288f5b21adbe2902a10cbcf468cc9d0fd2.

comment:17 by waddlesplash, 9 months ago

Milestone: Unscheduled → R1/beta5

comment:18 by waddlesplash, 9 months ago

Resolution: fixed
Status: assigned → closed

Fixed in hrev57654.

comment:19 by X512, 9 months ago

@waddlesplash, can you use hrev* for references, not raw Git hashes? Or at least use links to git.haiku-os.org.

comment:20 by waddlesplash, 9 months ago

Some of those commits were in the middle of hrevs. I suppose I could've used the hrevxxx~1 etc. syntax.
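As a rough illustration of the ~N syntax mentioned above: in git, <rev>~1 names the first parent of <rev>, so a commit that sits just before a tagged revision can be written relative to that tag. The sketch below builds a throwaway repository to show this. The tag name hrev99999 and the function name demo_hrev_parent are made up for the demo, and it assumes hrev names are available as ordinary git tags.

```shell
# Create a scratch repository with three commits, tag the last one, and
# print the subject of the commit one step before the tag.
# The function body is a subshell, so the cd does not leak to the caller.
demo_hrev_parent() (
	demo_repo=$(mktemp -d) || exit 1
	cd "$demo_repo" || exit 1
	git init -q .
	for n in 1 2 3; do
		echo "$n" > file.txt
		git add file.txt
		git -c user.name=demo -c user.email=demo@example.invalid \
			commit -q -m "change $n"
	done
	git tag hrev99999    # hypothetical tag on "change 3"

	# hrev99999~1 resolves to the parent of the tagged commit.
	subject=$(git log -1 --format=%s hrev99999~1)
	cd / && rm -rf "$demo_repo"
	echo "$subject"
)
```

Calling demo_hrev_parent prints the subject of the second commit, since the tag points at the third.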