Opened 16 years ago
Closed 16 years ago
#2757 closed bug (fixed)
regression: r27665 broke booting, panics at the red rocket
Reported by: | luroh | Owned by: | bonefish |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System/Kernel | Version: | R1/pre-alpha1 |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Attachments (6)
Change History (35)
by , 16 years ago
Attachment: | Screenshot-Haiku.png added |
---|
by , 16 years ago
Attachment: | haiku-serial-port.txt added |
---|
comment:1 by , 16 years ago
comment:2 by , 16 years ago
Summary: | regression: r27665 broke booting under VMware → regression: r27665 broke booting, panics at the red rocket |
---|
Having tested some more, it's not a VMware-only problem; I get the same KDL on real hw as well.
comment:3 by , 16 years ago
comment:4 by , 16 years ago
Sorry, the problem is still present in hrev27680, both in VMware and on real hw.
comment:5 by , 16 years ago
After backing out my uncommitted changes, disabling all image customizations, rebuilding the image from scratch, and trying various virtual hardware configurations (with VMware Server 1.0.6 build-91891), I still haven't been able to reproduce the problem. What image customizations and hardware configurations do you use?
comment:6 by , 16 years ago
Attaching UserBuildConfig and haiku.vmx, those are my only tweaks. No other local changes.
by , 16 years ago
Attachment: | UserBuildConfig added |
---|
by , 16 years ago
comment:7 by , 16 years ago
No customizations here other than kernel tracing enabled, attaching my UserBuildConfig also.
comment:8 by , 16 years ago
Hardware-wise: Athlon64 3200+ (single core) Asus A8N-SLI Nvidia CK804-based PCI Express motherboard ATI x800 PCI Express graphics board On-board Nforce Ethernet (via FreeBSD nforce driver). SBLive PCI audio (motherboard audio chipset disabled). Hard disk set up via on-board SATA controller (using legacy_sata driver, ata vs ide bus manager makes no difference to this KDL). Nothing else comes to mind that'd be relevant here.
comment:9 by , 16 years ago
Bah. Apologies for the formatting on that last comment, but I seem unable to edit after the fact.
comment:10 by , 16 years ago
Tested again with and without GPL add-ons enabled (forgot those first) and luroh's UserBuildConfig (the VM config is basically identical to mine anyway). Works fine. Also tested on real hardware (Core 2 Duo 2.2 GHz, 2 GB RAM), which also works fine. I'm pretty clueless...
follow-up: 12 comment:11 by , 16 years ago
Any extra tracing I can enable to help determine something useful? The PCI dump from my sys should be in the syslog from ticket #2756 if that's relevant at all (I notice it's crashing in pci_get_nth_pci_info).
comment:12 by , 16 years ago
Replying to anevilyak:
Any extra tracing I can enable to help determine something useful? The PCI dump from my sys should be in the syslog from ticket #2756 if that's relevant at all (I notice it's crashing in pci_get_nth_pci_info).
The attached serial debug output suggests that gPCI in src/add-ons/kernel/bus_managers/pci/pci.cpp is NULL. It also says that pci_init() has been invoked, though, so it should have been initialized. Presumably something overwrites it afterwards. You could enable kernel breakpoints (in headers/private/kernel/arch/user_debugger.h) and set a watchpoint on the memory location after it has been initialized:
arch_set_kernel_watchpoint(&gPCI, B_DATA_WRITE_WATCHPOINT, 4);
The mentioned header must be included (<arch/user_debugger.h>).
comment:13 by , 16 years ago
Didn't seem to hit the breakpoint. I did confirm that gPCI is initialized properly, though (0x90d95000). Otherwise it crashed identically.
follow-up: 15 comment:14 by , 16 years ago
Did you check whether gPCI is indeed NULL in pci_get_nth_pci_info() when crashing? If so and the watchpoint is not hit, that would suggest that it's not the same gPCI location, e.g. because the image is loaded twice.
comment:15 by , 16 years ago
Replying to bonefish:
Did you check whether gPCI is indeed NULL in pci_get_nth_pci_info() when crashing?
Just checked:
pci_get_nth_pci_info(0, 0x90e052d0) - gPCI = 0x00000000
vm_soft_fault: va 0x0 not covered by area in address space
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x0, ip 0xa0220b15, write 0, user 0, thread 0x3b
PANIC: vm_page_fault: unhandled page fault in kernel space at 0x0, ip 0xa0220b15
Any way to verify if two images are in fact loaded? I'm modifying my TRACE statements now to print the mem location of gPCI in both cases to see what that yields now.
comment:16 by , 16 years ago
At init:
PCI: gPCI initialized to 0x90d95000, &gPCI = 0x808cb1f4
Note also that quite a few successful calls to pci_get_nth_pci_info appear on serial debug with this initial base address. At crash however:
pci_get_nth_pci_info(0, 0x90e052d0) - gPCI = 0x00000000, &gPCI = 0xa02bb1f4
comment:17 by , 16 years ago
I should say, those successful get_nth_pci_info calls are during the usual PCI dump that appears at the beginning of the log.
comment:18 by , 16 years ago
If it helps:
kdebug> team_images 1
Registered images of team 1
ID text size data size name
1 0x80000000 1105920 0x8010e000 139264 /Haiku/beos/system/kernel_x86
3 0x80904000 49152 0x80910000 4096 /Haiku/beos/system/add-ons/kernel/boot/usb
6 0x808e2000 16384 0x808e6000 4096 /Haiku/beos/system/add-ons/kernel/boot/scsi_periph
7 0x808df000 8192 0x808e1000 4096 /Haiku/beos/system/add-ons/kernel/boot/scsi_disk
8 0x808db000 12288 0x808de000 4096 /Haiku/beos/system/add-ons/kernel/boot/scsi_cd
9 0x808cc000 36864 0x808d5000 8192 /Haiku/beos/system/add-ons/kernel/boot/scsi
10 0x80810000 528384 0x80891000 241664 /Haiku/beos/system/add-ons/kernel/boot/pci
11 0x807fc000 61440 0x8080b000 4096 /Haiku/beos/system/add-ons/kernel/boot/ohci
12 0x807f9000 8192 0x807fb000 4096 /Haiku/beos/system/add-ons/kernel/boot/locked_pool
13 0x807f6000 8192 0x807f8000 4096 /Haiku/beos/system/add-ons/kernel/boot/legacy_sata
15 0x807ee000 4096 0x807ef000 4096 /Haiku/beos/system/add-ons/kernel/boot/isa
16 0x807e2000 40960 0x807ec000 8192 /Haiku/beos/system/add-ons/kernel/boot/intel
18 0x807db000 12288 0x807de000 4096 /Haiku/beos/system/add-ons/kernel/boot/ide_adapter
19 0x807cd000 36864 0x807d6000 4096 /Haiku/beos/system/add-ons/kernel/boot/ide
20 0x807cb000 4096 0x807cc000 4096 /Haiku/beos/system/add-ons/kernel/boot/generic_ide_pci
21 0x807b7000 61440 0x807c6000 4096 /Haiku/beos/system/add-ons/kernel/boot/ehci
22 0x807b5000 4096 0x807b6000 4096 /Haiku/beos/system/add-ons/kernel/boot/config_manager
23 0x80783000 167936 0x807ac000 4096 /Haiku/beos/system/add-ons/kernel/boot/bfs
217 0x80967000 8192 0x80969000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/console
218 0x8096a000 4096 0x8096b000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/dprintf
219 0x8096c000 8192 0x8096e000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/keyboard
220 0x80980000 4096 0x80981000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/null
221 0x80982000 12288 0x80985000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/random
222 0x80986000 36864 0x8098f000 16384 /boot/beos/system/add-ons/kernel/drivers/dev/tty
223 0x80993000 4096 0x80994000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/zero
293 0x8077c000 4096 0x8077d000 8192 /boot/beos/system/add-ons/kernel/cpu/generic_x86
294 0x80995000 28672 0x8099c000 94208 /boot/beos/system/add-ons/kernel/debugger/disasm
295 0x807df000 8192 0x807e1000 4096 /boot/beos/system/add-ons/kernel/debugger/hangman
296 0x807f4000 4096 0x807f5000 4096 /boot/beos/system/add-ons/kernel/debugger/invalidate_on_exit
1569 0x8074b000 86016 0x80760000 8192 /boot/beos/system/add-ons/kernel/network/stack
1579 0x80762000 24576 0x80768000 4096 /boot/beos/system/add-ons/kernel/network/protocols/udp
1580 0x80769000 36864 0x80772000 4096 /boot/beos/system/add-ons/kernel/network/protocols/ipv4
1582 0x80247000 4096 0x80248000 4096 /boot/beos/system/add-ons/kernel/network/devices/loopback
1583 0x80249000 4096 0x8024a000 4096 /boot/beos/system/add-ons/kernel/network/datalink_protocols/loopback_frame
1584 0x809b3000 86016 0x809c8000 8192 /boot/beos/system/add-ons/kernel/drivers/dev/net/3com
1585 0xa0200000 528384 0xa0281000 241664 /Haiku/beos/system/add-ons/kernel/boot/pci
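The two &gPCI values from comment 16 can be checked against this listing. The sketch below is only address arithmetic on the numbers printed in this ticket; the struct and function names are made up for illustration and are not kernel APIs. It tests whether an address falls inside an image's text or data segment and computes its offset from the image's text base.

```cpp
#include <cassert>
#include <cstdint>

// One row of the team_images listing (hypothetical helper type, not a kernel
// structure): where an image's text and data segments were mapped.
struct ImageInfo {
    uint32_t textBase;
    uint32_t textSize;
    uint32_t dataBase;
    uint32_t dataSize;
};

// True if `address` lies inside the image's text or data segment.
bool ContainsAddress(const ImageInfo& image, uint32_t address)
{
    bool inText = address >= image.textBase
        && address - image.textBase < image.textSize;
    bool inData = address >= image.dataBase
        && address - image.dataBase < image.dataSize;
    return inText || inData;
}

// Offset of `address` from the image's text base. The address must belong
// to the image.
uint32_t OffsetFromTextBase(const ImageInfo& image, uint32_t address)
{
    assert(ContainsAddress(image, address));
    return address - image.textBase;
}

// Values copied from the team_images listing in comment 18
// (image IDs 10 and 1585, both named .../add-ons/kernel/boot/pci).
const ImageInfo kPreloadedPci = {0x80810000u, 528384u, 0x80891000u, 241664u};
const ImageInfo kReloadedPci = {0xa0200000u, 528384u, 0xa0281000u, 241664u};

// &gPCI as printed at init and at crash time in comment 16.
const uint32_t kGpciAtInit = 0x808cb1f4u;
const uint32_t kGpciAtCrash = 0xa02bb1f4u;
```

Calling OffsetFromTextBase(kPreloadedPci, kGpciAtInit) and OffsetFromTextBase(kReloadedPci, kGpciAtCrash) both yield 0xbb1f4, and each address lies in the data segment of its own image. That is consistent with the same variable existing once in each of two copies of the pci image, where only the first copy was initialized.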
comment:19 by , 16 years ago
comment:20 by , 16 years ago
I suppose some debug output in the kernel's module code (src/system/kernel/module.cpp) would make it possible to track this down. It is particularly interesting that image 1585 has a name that looks like it has been loaded by the boot loader (starting with "/Haiku/"), but has an ID greater than modules that have been loaded by the kernel ("/boot/").
I guess for some reason the kernel loads the image again, although it had been pre-loaded by the boot loader. Obviously it has something to do with my change that the modules pre-loaded by the boot loader do now have paths that can be resolved to actual files. No idea why I can't reproduce the problem over here, though.
comment:21 by , 16 years ago
Will try booting with TRACE_MODULE enabled and see what happens, will report back shortly. It definitely isn't a stale build in any case, I've run jam -aq a few times and even tried jam clean ; jam -q as well just to be sure, results identical. Give me a few minutes.
comment:22 by , 16 years ago
Log attached, if I'm reading this right, it managed to resolve the correct path to the existing module, but decided to load it again anyways? Will see if I can add more tracing in the right areas to narrow down why.
comment:23 by , 16 years ago
Seems it tries, but fails to look up the full path in sModuleImagesHash and thus tries to reload it. Are the preloaded modules inserted into the hash by the boot loader or is the kernel responsible for populating that once it's started?
comment:24 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
The kernel does that during startup. In any case, the problem has been fixed in hrev27685, thanks for looking into it :-)
comment:25 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
The problem persists in hrev27688. Now I can actually reproduce it:
PANIC: vm_page_fault: unhandled page fault in kernel space at 0x4c, ip 0x90b589b6
Welcome to Kernel Debugging Land...
Thread 122 "media_addon_server" running on CPU 0
kdebug> sc
stack trace for thread 122 "media_addon_server"
kernel stack: 0x9201e000 to 0x92022000
user stack: 0x7efef000 to 0x7ffef000
frame caller <image>:function + offset
0 92021820 (+ 48) 80057a35 </boot/beos/system/kernel_x86>:invoke_debugger_command + 0x00f5
1 92021850 (+ 64) 80057825 </boot/beos/system/kernel_x86>:invoke_pipe_segment__FP21debugger_command_pipelPc + 0x0079
2 92021890 (+ 64) 80057bad </boot/beos/system/kernel_x86>:invoke_debugger_command_pipe + 0x009d
3 920218d0 (+ 48) 800590e8 </boot/beos/system/kernel_x86>:_ParseCommandPipe__16ExpressionParserRi + 0x0234
4 92021900 (+ 64) 80058522 </boot/beos/system/kernel_x86>:EvaluateCommand__16ExpressionParserPCcRi + 0x02ba
5 92021940 (+ 224) 8005a510 </boot/beos/system/kernel_x86>:evaluate_debug_command + 0x0088
6 92021a20 (+ 64) 80055e36 </boot/beos/system/kernel_x86>:kernel_debugger_loop__Fv + 0x01ae
7 92021a60 (+ 48) 80056a09 </boot/beos/system/kernel_x86>:kernel_debugger + 0x0121
8 92021a90 (+ 192) 800568dd </boot/beos/system/kernel_x86>:panic + 0x0029
9 92021b50 (+ 80) 800b4b2d </boot/beos/system/kernel_x86>:vm_page_fault + 0x0139
10 92021ba0 (+ 64) 800c4045 </boot/beos/system/kernel_x86>:page_fault_exception + 0x00e1
11 92021be0 (+ 12) 800c7606 </boot/beos/system/kernel_x86>:int_bottom + 0x0036 (nearest)
kernel iframe at 0x92021bec (end = 0x92021c3c)
eax 0x90b58b25 ebx 0x90b59c84 ecx 0x8010869c edx 0x0
esi 0x90b59dc0 edi 0x90b91190 ebp 0x92021c74 esp 0x92021c20
eip 0x90b589b6 eflags 0x10296
vector: 0xe, error code: 0x0
12 92021bec (+ 136) 90b589b6 </boot/beos/system/add-ons/kernel/busses/ide/generic_ide_pci>:supports_device + 0x002e
13 92021c74 (+ 64) 80062d78 </boot/beos/system/kernel_x86>:_RegisterPath__11device_nodePCc + 0x0030
14 92021cb4 (+ 96) 80062f87 </boot/beos/system/kernel_x86>:_RegisterDynamic__11device_nodeP11device_node + 0x0147
15 92021d14 (+ 48) 800630f1 </boot/beos/system/kernel_x86>:_Probe__11device_node + 0x0055
16 92021d44 (+ 80) 80063287 </boot/beos/system/kernel_x86>:Probe__11device_nodePCcUl + 0x018b
17 92021d94 (+ 80) 800632d2 </boot/beos/system/kernel_x86>:Probe__11device_nodePCcUl + 0x01d6
18 92021de4 (+ 64) 80063ad3 </boot/beos/system/kernel_x86>:device_manager_probe + 0x004f
19 92021e24 (+ 64) 80063cf3 </boot/beos/system/kernel_x86>:scan_for_drivers__FP11devfs_vnode + 0x0073
20 92021e64 (+ 64) 8006583d </boot/beos/system/kernel_x86>:devfs_open_dir__FP9fs_volumeP8fs_vnodePPv + 0x0081
21 92021ea4 (+ 48) 80091f44 </boot/beos/system/kernel_x86>:open_dir_vnode__FP5vnodeb + 0x0028
22 92021ed4 (+ 48) 800929bd </boot/beos/system/kernel_x86>:dir_open__FiPcb + 0x0051
23 92021f04 (+ 64) 80097f2b </boot/beos/system/kernel_x86>:_user_open_dir + 0x0093
24 92021f44 (+ 100) 800c7832 </boot/beos/system/kernel_x86>:pre_syscall_debug_done + 0x0002 (nearest)
user iframe at 0x92021fa8 (end = 0x92022000)
eax 0x62 ebx 0x440fec ecx 0x7ffedde0 edx 0xffff0104
esi 0x7ffee269 edi 0x3eae2d ebp 0x7ffee76c esp 0x92021fdc
eip 0xffff0104 eflags 0x203 user esp 0x7ffedde0
vector: 0x63, error code: 0x0
25 92021fa8 (+ 0) ffff0104
26 7ffee76c (+ 48) 0037df1c <libbe.so>:SetTo__6BEntryPCcb + 0x003c
27 7ffee79c (+ 48) 0037dc5f <libbe.so>:__6BEntryPCcb + 0x0053
28 7ffee7cc (+ 128) 00743a8d <libdevice.so>:__12RosterLooperP10BUSBRoster + 0x0069
29 7ffee84c (+ 64) 0074401e <libdevice.so>:Start__10BUSBRoster + 0x005a
30 7ffee88c (+ 48) 0072adbc <usb_webcam.media_addon>:__16WebCamMediaAddOnl + 0x0108
31 7ffee8bc (+ 64) 0072b2a9 <usb_webcam.media_addon>:make_media_addon + 0x0031
32 7ffee8fc (+ 48) 004b4b85 <libmedia.so>:LoadAddon__Q38BPrivate5media18DormantNodeManagerPP11BMediaAddOnPlPCcl + 0x00a5
33 7ffee92c (+ 128) 004b4230 <libmedia.so>:GetAddon__Q38BPrivate5media18DormantNodeManagerl + 0x00ac
34 7ffee9ac (+ 144) 00206b8d <_APP_>:AddOnAdded__16MediaAddonServerPCcx + 0x0069
35 7ffeea3c (+ 192) 00208245 <_APP_>:MessageReceived__16MediaAddonServerP8BMessage + 0x0211
36 7ffeeafc (+ 304) 00207ea0 <_APP_>:WatchDir__16MediaAddonServerP6BEntry + 0x0140
37 7ffeec2c (+ 176) 002064dd <_APP_>:ReadyToRun__16MediaAddonServer + 0x0209
38 7ffeecdc (+ 496) 002b31a1 <libbe.so>:DispatchMessage__12BApplicationP8BMessageP8BHandler + 0x02f9
39 7ffeeecc (+ 64) 002bda5d <libbe.so>:task_looper__7BLooper + 0x0211
40 7ffeef0c (+ 64) 002b1a19 <libbe.so>:Run__12BApplication + 0x0075
41 7ffeef4c (+ 48) 002087e6 <_APP_>:main + 0x007e
42 7ffeef7c (+ 48) 00205c03 <_APP_>:_start + 0x005b
43 7ffeefac (+ 48) 001008ea 3183:runtime_loader_seg0ro@0x00100000 + 0x8ea
44 7ffeefdc (+ 0) 7ffeefec 3182:media_addon_server_main_stack@0x7efef000 + 0xffffec
kdebug> team_images 1
Registered images of team 1
ID text size data size name
1 0x80000000 1081344 0x80108000 139264 /boot/beos/system/kernel_x86
3 0x806fe000 49152 0x8070a000 4096 /boot/beos/system/add-ons/kernel/boot/usb
4 0x806e9000 61440 0x806f8000 8192 /boot/beos/system/add-ons/kernel/boot/uhci
6 0x806dc000 16384 0x806e0000 4096 /boot/beos/system/add-ons/kernel/boot/scsi_periph
7 0x806d9000 8192 0x806db000 4096 /boot/beos/system/add-ons/kernel/boot/scsi_disk
9 0x806c6000 36864 0x806cf000 8192 /boot/beos/system/add-ons/kernel/boot/scsi
10 0x8060b000 528384 0x8068c000 237568 /boot/beos/system/add-ons/kernel/boot/pci
12 0x805f4000 8192 0x805f6000 4096 /boot/beos/system/add-ons/kernel/boot/locked_pool
15 0x805e9000 4096 0x805ea000 4096 /boot/beos/system/add-ons/kernel/boot/isa
18 0x805d6000 12288 0x805d9000 4096 /boot/beos/system/add-ons/kernel/boot/ide_adapter
19 0x805c8000 36864 0x805d1000 4096 /boot/beos/system/add-ons/kernel/boot/ide
20 0x805c6000 4096 0x805c7000 4096 /boot/beos/system/add-ons/kernel/boot/generic_ide_pci
22 0x805b0000 4096 0x805b1000 4096 /boot/beos/system/add-ons/kernel/boot/config_manager
23 0x8057e000 167936 0x805a7000 4096 /boot/beos/system/add-ons/kernel/boot/bfs
215 0x80518000 8192 0x8051a000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/console
216 0x80530000 4096 0x80531000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/dprintf
217 0x80532000 8192 0x80534000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/keyboard
218 0x80535000 4096 0x80536000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/null
219 0x80537000 12288 0x8053a000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/random
220 0x8075c000 36864 0x80765000 16384 /boot/beos/system/add-ons/kernel/drivers/dev/tty
221 0x8053b000 4096 0x8053c000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/zero
255 0x8053f000 4096 0x80540000 8192 /boot/beos/system/add-ons/kernel/cpu/generic_x86
256 0x80769000 28672 0x80770000 94208 /boot/beos/system/add-ons/kernel/debugger/disasm
257 0x80574000 8192 0x80576000 4096 /boot/beos/system/add-ons/kernel/debugger/hangman
258 0x80577000 4096 0x80578000 4096 /boot/beos/system/add-ons/kernel/debugger/invalidate_on_exit
893 0x80546000 86016 0x8055b000 8192 /boot/beos/system/add-ons/kernel/network/stack
903 0x8055d000 24576 0x80563000 4096 /boot/beos/system/add-ons/kernel/network/protocols/udp
904 0x80564000 36864 0x8056d000 4096 /boot/beos/system/add-ons/kernel/network/protocols/ipv4
906 0x80244000 4096 0x80245000 4096 /boot/beos/system/add-ons/kernel/network/devices/loopback
907 0x8056e000 4096 0x8056f000 4096 /boot/beos/system/add-ons/kernel/network/datalink_protocols/loopback_frame
913 0x80787000 196608 0x807b7000 8192 /boot/beos/system/add-ons/kernel/drivers/dev/net/ipro1000
923 0x805ac000 8192 0x805ae000 8192 /boot/beos/system/add-ons/kernel/network/devices/ethernet
924 0x805b6000 4096 0x805b7000 4096 /boot/beos/system/add-ons/kernel/network/datalink_protocols/ipv4_datagram
925 0x805b8000 12288 0x805bb000 4096 /boot/beos/system/add-ons/kernel/network/datalink_protocols/arp
926 0x805bc000 4096 0x805bd000 4096 /boot/beos/system/add-ons/kernel/network/datalink_protocols/ethernet_frame
933 0x805e4000 8192 0x805e6000 8192 /boot/beos/system/add-ons/kernel/drivers/dev/graphics/vesa
935 0x9277c000 49152 0x92788000 4096 /boot/beos/system/add-ons/kernel/network/protocols/tcp
1054 0x92000000 20480 0x92005000 4096 /boot/beos/system/add-ons/kernel/bus_managers/ps2
1056 0x807fd000 8192 0x807ff000 4096 /boot/beos/system/add-ons/kernel/drivers/dev/input/wacom
1114 0x90b58000 4096 0x90b59000 4096 /boot/beos/system/add-ons/kernel/busses/ide/generic_ide_pci
The names of modules loaded by the boot loader and by the kernel still differ. The former point to the boot symlinks. Here the generic_ide_pci is loaded twice.
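To illustrate how differing names can lead to a double load, here is a minimal, self-contained model of an image table keyed by path string. It is a sketch only: ImageTable, LoadedImage and their methods are hypothetical and do not reproduce the code in src/system/kernel/module.cpp. It assumes that a lookup succeeds only on an exact path match, so a file registered under its boot symlink path is not found under its real path.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a loaded kernel image.
struct LoadedImage {
    int id;
    std::string path;
};

// Simplified model of a path-keyed image table (not the real kernel table).
class ImageTable {
public:
    // Registers an image that was already loaded by the boot loader.
    void RegisterPreloaded(const std::string& path) { Insert(path); }

    // Returns the ID of the image registered under exactly `path`. If the
    // string is not in the table, a new copy is "loaded" and its ID returned.
    int GetOrLoad(const std::string& path)
    {
        std::map<std::string, LoadedImage>::const_iterator it
            = fImages.find(path);
        if (it != fImages.end())
            return it->second.id;
        return Insert(path);
    }

    std::size_t CountImages() const { return fImages.size(); }

private:
    int Insert(const std::string& path)
    {
        LoadedImage image = {fNextID++, path};
        fImages[path] = image;
        return image.id;
    }

    std::map<std::string, LoadedImage> fImages;
    int fNextID = 1;
};

// Both paths appear in the team_images listing above (IDs 20 and 1114).
const char* const kBootSymlinkPath
    = "/boot/beos/system/add-ons/kernel/boot/generic_ide_pci";
const char* const kRealPath
    = "/boot/beos/system/add-ons/kernel/busses/ide/generic_ide_pci";

// Registers the add-on under the symlink path, then requests it by both
// names. Returns the number of images that end up in the table.
std::size_t DemonstrateDoubleLoad()
{
    ImageTable table;
    table.RegisterPreloaded(kBootSymlinkPath);
    int preloadedID = table.GetOrLoad(kBootSymlinkPath); // found, no reload
    int reloadedID = table.GetOrLoad(kRealPath);         // miss, second copy
    assert(preloadedID != reloadedID);
    return table.CountImages();
}
```

In this model DemonstrateDoubleLoad() ends with two table entries for what is one file on disk. Resolving both names to one canonical path before the lookup would avoid the second entry, though whether that matches the eventual kernel fix is not stated in this ticket.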
Unless there are objections I'll revert the whole mess.
comment:26 by , 16 years ago
Owner: | changed from | to
---|---|
Status: | reopened → new |
Investigating this a bit, first.
comment:27 by , 16 years ago
Status: | new → assigned |
---|
That's the exact same KDL I get.