Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#10819 closed bug (fixed)

[package_daemon] goes into infinite loop calling PackageFileManager::GetPackageFile()

Reported by: ttcoder
Owned by: bonefish
Priority: normal
Milestone: R1
Component: Servers/package_daemon
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

This is on hrev47209.

Occurred twice in a row now, including after a fresh reboot: if I move my hpkg to /boot/home/config/packages, one of my 2 CPUs goes to 100%. Dropping the thread ("job runner") into the debugger for a "snapshot", I see it in GetPackageFile() each time.

Attachments (1)

package_daemon-480-debug-08-05-2014-22-28-55.report (10.7 KB) - added by ttcoder 10 years ago.
Indirection on "ascii" pointer after removing a package

Change History (8)

comment:1 by ttcoder, 10 years ago

(Note to self: try removing file administrative/activated-packages see if the OS works again.)

Also, the code path is exactly the same as this one, up to GetPackageFile(): https://dev.haiku-os.org/attachment/ticket/10817/package_daemon-201-debug-07-05-2014-10-32-16.report

As to the looping, I suppose it's this while() loop that is forever-ing ? http://cgit.haiku-os.org/haiku/tree/src/servers/package/Volume.cpp#n688

EDIT: I suppose it's hard to imagine that a bug in events.RemoveHead() would fail to exhaust its contents eventually, so maybe it's actually Volume::ProcessPendingNodeMonitorEvents() itself which is being called an infinite number of times.
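
To illustrate the reasoning above: a loop that only removes events from the head of a list has to terminate, unless something keeps adding events while it runs. The following is a toy Python model of that idea, not the actual Volume.cpp code; the names drain, passive_handler and requeueing_handler, and the max_steps safety cap, are all made up for this sketch.

```python
from collections import deque


def drain(events, handler, max_steps=1000):
    """Pop events from the head until the queue is empty.

    Mimics a `while (event = events.RemoveHead())` style loop.
    Returns the number of events handled. max_steps is only a
    safety cap so this demo cannot spin forever.
    """
    steps = 0
    while events and steps < max_steps:
        handler(events.popleft(), events)
        steps += 1
    return steps


def passive_handler(event, events):
    # Handling the event does not produce any new events.
    pass


def requeueing_handler(event, events):
    # Hypothetical failure mode: handling an event queues another one.
    events.append(event)


print(drain(deque(["created", "removed"]), passive_handler))  # 2
print(drain(deque(["created"]), requeueing_handler))          # 1000 (cap reached)
```

With a passive handler the loop ends after two steps; only the handler that re-queues work keeps it busy until the cap. That matches the suspicion that the loop itself is fine and something outside it keeps feeding or re-invoking it.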

comment:2 by ttcoder, 10 years ago

While trying to reproduce that infinite loop I got a crash. The code path is suspiciously similar so I'll wait for Ingo's advice before filing a new ticket. That "new" ticket would go something like this:

Title: daemon crashes dereferencing 0x676b7068 while activating/deactivating packages

Body: the "address" 0x676b7068 is in fact the ASCII string "hpkg", so that's an obvious sign of memory corruption, isn't it? Here's the sequence of events (below). Seems I have a knack for rubbing this hrev47209 the wrong way, so don't hesitate to ask me for experiments and reproducible cases :-)

KERN: package_daemon [144533659:   490] Volume::_PackagesEntryRemoved("ArmyKnifeTTE-5.1.0.0-1-x86_gcc2.hpkg")
KERN: package_daemon [144622162:   490] KERN: CommitTransactionHandler::_ChangePackageActivation(): activating 0, deactivating 1 packages
KERN: packagefs [144623493:   490] Volume::_ChangeActivation(): 0 new packages, 1 old packages
KERN: packagefs [144623962:   490] package "ArmyKnifeTTE-5.1.0.0-1-x86_gcc2.hpkg" deactivated
KERN: package_daemon [147890978:   490] Volume::_PackagesEntryCreated("ArmyKnifeTTE-5.1.0.0-2-x86_gcc2.hpkg")
KERN: package_daemon [148608053:   490] CommitTransactionHandler::_ChangePackageActivation(): activating 1, deactivating 0 packages
KERN: packagefs [148609381:   490] Volume::_ChangeActivation(): 1 new packages, 0 old packages
KERN: packagefs [148611939:   490] package "ArmyKnifeTTE-5.1.0.0-2-x86_gcc2.hpkg" activated
KERN: intel_extreme accelerant:CALLED status_t intel_get_edid_info(void *, long unsigned int, uint32 *)
KERN: Last message repeated 3 times.
KERN: package_daemon [166608014:   490] Volume::_PackagesEntryRemoved("taglib-1.7.2-1-x86_gcc2.hpkg")
KERN: package_daemon [172344080:   490] KERN: CommitTransactionHandler::_ChangePackageActivation(): activating 0, deactivating 1 packages
KERN: packagefs [172346161:   490] Volume::_ChangeActivation(): 0 new packages, 1 old packages
KERN: packagefs [172346706:   490] package "ArmyKnifeTTE-5.1.0.0-2-x86_gcc2.hpkg" deactivated
KERN: package_daemon [172402122:   490] KERN: CommitTransactionHandler::_ChangePackageActivation(): activating 0, deactivating 1 packages
KERN: packagefs [172403256:   490] Volume::_ChangeActivation(): 0 new packages, 1 old packages
KERN: packagefs [172403825:   490] package "taglib-1.7.2-1-x86_gcc2.hpkg" deactivated
KERN: package_daemon [172845443:   490] Volume::_PackagesEntryRemoved("ArmyKnifeTTE-5.1.0.0-2-x86_gcc2.hpkg")
KERN: package_daemon [186396923:   490] Volume::_PackagesEntryCreated("ArmyKnifeTTE-5.1.0.0-2-x86_gcc2.hpkg")
KERN: vm_soft_fault: va 0x676b7000 not covered by area in address space
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x676b7068, ip 0x1f4b5d8, write 0, user 1, thread 0x1ea
KERN: vm_page_fault: thread "job runner" (490) in team "package_daemon" (480) tried to read address 0x676b7068, ip 0x1f4b5d8 ("libbe.so_seg0ro" +0x2615d8)
KERN: debug_server: Thread 490 entered the debugger: Segment violation
KERN: stack trace, current PC 0x1f4b5d8  HashValue__7BStringPCc + 0xc:
KERN:   (0x7242a4e8)  0x19383a0  GetPackageFile__18PackageFileManagerRC9entry_refRP11PackageFile + 0x940
KERN:   (0x7242a5a8)  0x19386b9  CreatePackage__18PackageFileManagerRC9entry_refRP7Package + 0x29
KERN:   (0x7242a5e8)  0x194684f  _PackagesEntryCreated__6VolumePCc + 0x18b
KERN:   (0x7242a638)  0x1945281  ProcessPendingNodeMonitorEvents__6Volume + 0x115
KERN:   (0x7242a678)  0x194161f  _ProcessNodeMonitorEvents__4RootP6Volume + 0x27
KERN:   (0x7242a768)  0x1941e8f  Do__Q24Root9VolumeJob + 0x6b
KERN:   (0x7242a7a8)  0x1941a97  _JobRunner__4Root + 0x3f
KERN:   (0x7242a7d8)  0x1941a4f  _JobRunnerEntry__4RootPv + 0x1f
KERN:   (0x7242a808)  0x253b693  thread_entry + 0x23
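
As a quick sanity check of the "hpkg" observation: on a little-endian x86 machine, the four bytes of the faulting address 0x676b7068, read in memory order, are 68 70 6b 67. A minimal Python sketch decodes them (the helper name pointer_as_ascii is made up for illustration):

```python
def pointer_as_ascii(address, byteorder="little"):
    """Return the 4 bytes of a 32-bit address as text, in memory order."""
    raw = address.to_bytes(4, byteorder)
    # Non-ASCII bytes are replaced rather than raising an error.
    return raw.decode("ascii", errors="replace")


print(pointer_as_ascii(0x676b7068))  # hpkg
```

So the "pointer" being read looks like the extension of a package file name, which would fit string data having overwritten a pointer or a stale pointer being reused.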

EDIT: is it possible to run the daemon in "malloc debug" mode? Something like quitting it, then launching it from Terminal with LIBROOT=libroot_debug.so /servers/pkg_daemon ?

in reply to:  2 comment:3 by bonefish, 10 years ago

Replying to ttcoder:

EDIT: is it possible to run the daemon in "malloc debug" mode? Something like quitting it, then launching it from Terminal with LIBROOT=libroot_debug.so /servers/pkg_daemon ?

Yes, the variable is LD_PRELOAD. Cf. https://www.haiku-os.org/blog/mmlr/2010-02-08_using_malloc_debug_find_memory_related_bugs.

Anyway, the cause is likely the same as for #10817.

comment:4 by bonefish, 10 years ago

Does hrev47215 also fix this issue?

comment:5 by ttcoder, 10 years ago

Indeed it does, please close this ticket!

==

Side-notes: in order to get that hrev I used pkgman update on the "core" depot as outlined in #10278 and it worked like a charm (awesome work, Ingo, Oliver et al!). Then I rebooted and found myself in 47215 indeed.. But package_daemon was no longer "live": moving packages in and out of home/config/packages no longer resulted in the corresponding application appearing and disappearing.. Maybe it was confused between the "use now" and the "use after reboot" state introduced recently? At any rate, after an additional reboot everything worked again: it had taken into account the changes made previously in home/config/packages AND further changes were now applied "live". So I fiddled with it all for a while and it was perfectly stable. Bug solved thanks!

comment:6 by jprostko, 10 years ago

Resolution: fixed
Status: new → closed

in reply to:  5 comment:7 by bonefish, 10 years ago

Replying to ttcoder:

Side-notes: in order to get that hrev I used pkgman update on the "core" depot as outlined in #10278 and it worked like a charm (awesome work, Ingo, Oliver et al!). Then I rebooted and found myself in 47215 indeed.. But package_daemon was no longer "live": moving packages in and out of home/config/packages no longer resulted in the corresponding application appearing and disappearing.. Maybe it was confused between the "use now" and the "use after reboot" state introduced recently?

Come to think of it, there isn't any connection between the system and home packagefs volumes with respect to when either switches to the non-live mode. I've created #10827 to track that issue.

In fact it also means that ATM home should always be live, since you generally won't install/uninstall a system package there. If you happen to run into the issue again, please file a new ticket. Since every change will be tracked by an old state directory, it should be fairly simple to see what you installed/uninstalled.
