Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#5128 closed bug (duplicate)

Booting on Shuttle SN41G2 (Nvidia Nforce2 chipset) only possible within Safe Mode

Reported by: JscBeTrayer Owned by: nobody
Priority: normal Milestone: R1
Component: System/Kernel Version: R1/Development
Keywords: SN41G2 boot freeze Cc:
Blocked By: #5119 Blocking:
Has a Patch: no Platform: x86

Description

Haiku was installed on real hardware: a dedicated hard drive in a Shuttle SN41G2 barebone with 1 GB RAM, nForce2 chipset, NV40 GeForce 6800 video card, a CD burner, and for now no add-in cards other than the video card. I tried both the alpha ISO and the revision 34648 nightly build ISO (GCC2 hybrid) from December 12th (http://haiku-files.org/cd/haiku-nightly-r34648-x86gcc2hybrid-cd.zip).

I downloaded the ISO, burned it, and booted from the CD. When I was offered the choice between running from the live CD and launching the Installer, I chose the Installer. Installation went fine. I removed the CD and rebooted.

When booting, the desktop background and mouse pointer appear, but that's all I get: the system freezes instantly, before the desktop icons are drawn, and Tracker doesn't appear either. Neither keyboard nor mouse responds, and the only way I can reboot is with the hardware reset button.

In order to boot the system and be able to use it, I have to press the space bar before the Haiku boot screen appears and choose the very first menu entry, "Safe Mode". I have tried various combinations of the other entries (video-safe mode, disable BIOS access, disable DMA, etc.), but none of them allows me to boot. Only the very first "Safe Mode" entry lets me boot properly.

I have not been able to find the syslog (I opened Terminal, cd'ed to /var, and listed the directory contents, but didn't see any log subdirectory in it).

I suspect an audio problem, because fail-safe video/VESA doesn't help, but if I boot in safe mode and turn the media server on, everything freezes instantly.

Change History (10)

comment:1 Changed 9 years ago by stippi

Indeed. A problem with audio is most likely, since the media_server is not started in safe mode. When the system freezes, you could try to enter the kernel debugger with Alt-SysReq-D and in the KDL, you could type "ints" to get information about interrupts. It sounds a bit like it is an interrupt storm.

comment:2 Changed 9 years ago by axeld

To be sure, you could manually launch the media_server (under /boot/system/servers/). If that still doesn't lock up, try launching the net_server from the same directory.

comment:3 Changed 9 years ago by JscBeTrayer

Many thanks for the fast answers. Here is some more info:

-> after starting in safe mode, I opened a Terminal and started the net_server manually from there: no hang, and the network worked perfectly (DHCP gave me an address in less than a few seconds and I could ping my ISP's servers).

Then I manually started the media_server, and the system froze! Since it was started from Terminal, I have some echoed messages telling me what was going on, and I could also fire up KDL to capture ints.

-> Here is the last Terminal output of media_server before freeze:

AddOnManager::_RegisterDecoder, name vorbis
AddOnManager::_RegisterAddOn(): trying to load "/boot/system/add-ons/media/plugins/wav_reader"
AddOnManager::_RegisterReader, name wav_reader

-> Here is the output of ints command in KDL:

int 1, enabled 1, handled 345, unhandled 0, ACTIVE
   ps2:ps2_interrupt (0xcd1196a4), data 0x00000000, handled 345

int 3, enabled 1, handled 3, unhandled 0
   firewire:fwohci_intr (0x80e3c6d0), data 0x81bea000, handled 3

int 10, enabled 2, handled 8651, unhandled 0
   nvidia:nv_interrupt (0x8067084c), data 0xcd45c030, handled 8651
   ohci:_InterruptHandler__4OHCIPv (0x8052fce8), data 0x80eb5220, handled 0

int 11, enabled 2, handled 10, unhandled 0
   ehci:InterruptHandler__4EHCIPv (0x804f1de8), data 0x80eb5440, handled 0
   nforce:nfe_intr (0xcd1c7050), data 0x81cea000, handled <unknown>

int 12, enabled 3, handled 428, unhandled 0
   auich:auich_int (0xcefc4934), data 0xcefcb240, handled 0
   ps2:ps2_interrupt (0xcd1196a4), data 0x00000000, handled 0
   ohci:_InterruptHandler__4OHCIPv (0x8052fce8), data 0x80eb5000, handled 428

int 14, enabled 1, handled 4096, unhandled 2
   ata_adapter:ata_adapter_inthand (0x804b1aec), data 0x80eb6038, handled 4096

int 15, enabled 1, handled 0, unhandled 0
   ata_adapter:ata_adapter_inthand (0x804b1aec), data 0x80eb6070, handled 0

int 219, enabled 1, handled 52282, unhandled 40932
   kernel_x86:apic_timer_interrupt_FPv (0x800ecca0), data 0x00000000, handled 52282

comment:4 Changed 9 years ago by JscBeTrayer

Ah... wiki formatting killed my efforts :) I can provide the screenshot of the error if needed.

comment:5 Changed 9 years ago by axeld

Thanks for the info! Next time, you might want to use {{{ and }}} to surround output such as that from the ints command.

Anyway, the ints output doesn't look that suspicious yet - since the auich driver is already working, how did you enter KDL if the system hung? Just by pressing "alt-sysreq-d" as usual?

Also, you could try to remove the firewire driver to see if that brings any difference.

comment:6 Changed 9 years ago by anevilyak

I'm wondering why/how firewire is present, I thought we took it out of the default image?

comment:7 Changed 9 years ago by axeld

Only in the alpha branch.

comment:8 Changed 9 years ago by JscBeTrayer

Hello, I now have proof that it's the media_server failing :) First, answers: yes, yesterday I entered KDL by pressing "alt-sysreq-d".

Here is what I did this time: I installed the latest nightly again (hrev34684) and tried to boot with on-screen debug output. The result was a freeze again, with no possibility to enter KDL (alt-sysreq-d didn't work that time).

So I rebooted in safe mode, sent the firewire driver to Trash, and edited my Bootscript file: the only change I made was to comment out the two lines that launch media_server and midi_server. I rebooted normally, and tada! Haiku was there, working fine with working network, backgrounds, etc. :)

So I opened a Terminal and tailed the syslog to monitor changes in it, opened another Terminal and launched media_server from there... and I was thrown into KDL without even having to request it this time. In KDL I took pictures of ints, teams, and the threads of the media_server team. I wanted to take pictures of the sems too, but there were too many of them and I gave up after the 5th screen ;-)

Please find below a transcript of this output (syslog tail, last media_server messages in the Terminal, ints, threads of the media_server team):

1/ Latest Syslog tail before KDL took over:

KERN: loaded driver /boot/system/add-ons/kernel/drivers/dev/bus/usb_raw
KERN: auich: init_hardware()
KERN: auich: init_driver()
KERN: auich: auich_setup(0xcee77240)
KERN: auich: audio/hmulti/auich/1 deviceid - 0x6a chiprev = a1 model = f541 enhanced at d800
KERN: auich: PCI command before: 7
KERN: auich: PCI command after: 7

2/ Latest media_server output to Terminal before KDL took over:

DefaultManager: Trying connect in format 2
DefaultManager: can't find free mixer output
DefaultManager: can't find free mixer output
DefaultManager: failed to connect mixer and soundcard
DefaultManager: RescanThread() leave
DefaultManager: can't find free mixer output
BMediaRoster::Connect: aborted after BBufferProducer::PrepareToConnect, status = 0x80004077
DefaultManager: failed to connect mixer and soundcard
DefaultManager: RescanThread() leave
DefaultManager: Trying connect in format 2
DefaultManager: failed to connect mixer and soundcard
DefaultManager: RescanThread() leave
BMediaRoster::Connect: aborted after BBufferProducer::PrepareToConnect, status = 0x80004077
BMediaRoster::ReleaseNode, trying to release reference counting disabled timesource, node 1, port 729223, team 267
BTimeSource::DirectAddMe should not add itself to slave nodes
DefaultManager: Trying connect in format 3
BMediaRoster::Connect: aborted after BBufferProducer::PrepareToConnect, status = 0x80004077

Note: team 267 was media_addon_server, but I forgot to capture threads of this team
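The media_server output above repeats a handful of messages, which is easier to see when the lines are tallied. The following is only a reading aid, a minimal Python sketch that counts the repeated lines in the transcript pasted from this ticket; the variable and function names are made up for illustration, and the sketch does not interpret the status code.

```python
from collections import Counter

# Transcript copied verbatim from section 2/ of this comment.
LOG = """\
DefaultManager: Trying connect in format 2
DefaultManager: can't find free mixer output
DefaultManager: can't find free mixer output
DefaultManager: failed to connect mixer and soundcard
DefaultManager: RescanThread() leave
DefaultManager: can't find free mixer output
BMediaRoster::Connect: aborted after BBufferProducer::PrepareToConnect, status = 0x80004077
DefaultManager: failed to connect mixer and soundcard
DefaultManager: RescanThread() leave
DefaultManager: Trying connect in format 2
DefaultManager: failed to connect mixer and soundcard
DefaultManager: RescanThread() leave
BMediaRoster::Connect: aborted after BBufferProducer::PrepareToConnect, status = 0x80004077
BMediaRoster::ReleaseNode, trying to release reference counting disabled timesource, node 1, port 729223, team 267
BTimeSource::DirectAddMe should not add itself to slave nodes
DefaultManager: Trying connect in format 3
BMediaRoster::Connect: aborted after BBufferProducer::PrepareToConnect, status = 0x80004077
"""


def tally(log_text):
    """Count identical non-empty lines in a log transcript."""
    return Counter(line.strip() for line in log_text.splitlines() if line.strip())


counts = tally(LOG)

# Print only the messages that occur more than once.
for message, n in counts.most_common():
    if n > 1:
        print(f"{n}x {message}")
```

The tally shows the same connect attempt failing several times before the panic, which matches the three "rescan defaults" threads listed in section 5/.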

3/ Auto-entering KDL:

PANIC: port 737420: no messages found

Welcome to Kernel Debugging Land...
Thread 828 "Audio Mixer control" running on CPU 0
kdebug>

4/ ints in KDL:

int   1, enabled 1, handled      177, unhandled        0
   ps2:ps2_interrupt                          (0xceaa26a4), data 0x00000000, handled      177

int  10, enabled 2, handled     6525, unhandled        0
   nvidia:nv_interrupt                        (0x803d284c), data 0xcd469030, handled     6525
   ohci:_InterruptHandler__4OHCIPv            (0x804cace8), data 0x80e37220, handled     0

int  11, enabled 2, handled       15, unhandled        0
   ehci:InterruptHandler__4EHCIPv             (0x8049cde8), data 0x80e37440, handled        0
   nforce:nfe_intr                            (0x803dcfe4), data 0x81bd3d80, handled <unknown>

int  12, enabled 3, handled      751, unhandled        0
   auich:auich_int                           (0xcee70934), data 0xcee77240, handled        8
   ps2:ps2_interrupt                         (0xceaa26a4), data 0x00000000, handled        0
   ohci:_InterruptHandler__4OHCIPv           (0x804cace8), data 0x80e37000, handled      743

int  14, enabled 1, handled     4391, unhandled        2
                                          func 0x80461aec, data 0x80e39038, handled     4391

int  15, enabled 1, handled        0, unhandled        0
                                          func 0x80461aec, data 0x80e39070, handled        0

int 219, enabled 1, handled    50800, unhandled   108787
                                          func 0x800ece10, data 0x00000000, handled    50800
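Since comment:1 raised the possibility of an interrupt storm, it can help to compare handled and unhandled counts per interrupt line. Below is a minimal Python sketch that parses the summary lines of an `ints` listing in the layout shown in this ticket; the regular expression and function names are assumptions written for this example and are not part of any Haiku tool. It only computes ratios and makes no claim about which values indicate a fault.

```python
import re

# A few lines copied from the `ints` listing in this comment.
SAMPLE = """\
int  12, enabled 3, handled      751, unhandled        0
   auich:auich_int                           (0xcee70934), data 0xcee77240, handled        8
   ps2:ps2_interrupt                         (0xceaa26a4), data 0x00000000, handled        0
   ohci:_InterruptHandler__4OHCIPv           (0x804cace8), data 0x80e37000, handled      743
int  14, enabled 1, handled     4391, unhandled        2
int 219, enabled 1, handled    50800, unhandled   108787
"""

# Matches only the per-interrupt summary lines; indented handler lines are skipped.
INT_RE = re.compile(
    r"^int\s+(\d+), enabled\s+(\d+), handled\s+(\d+), unhandled\s+(\d+)"
)


def parse_ints(text):
    """Return one dict per 'int N, ...' summary line."""
    rows = []
    for line in text.splitlines():
        m = INT_RE.match(line)
        if m:
            vector, enabled, handled, unhandled = map(int, m.groups())
            rows.append({
                "vector": vector,
                "enabled": enabled,
                "handled": handled,
                "unhandled": unhandled,
            })
    return rows


def unhandled_share(row):
    """Fraction of all recorded interrupts on this line that went unhandled."""
    total = row["handled"] + row["unhandled"]
    return row["unhandled"] / total if total else 0.0


rows = parse_ints(SAMPLE)
for row in rows:
    print(f"int {row['vector']:3d}: unhandled share {unhandled_share(row):.2%}")
```

Running the same parser over two captures taken a few seconds apart would show which counters are actually growing, which is more telling than a single snapshot.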

5/ threads of team 260 in KDL (260 being the ID of the media_server team):

thread         id  state     wait for   object  cpu pri  stack      team  name
0xcd4f1000    281  waiting   sem          1743    -   8  0xceea5000  260  rescan defaults
0xcd4f3000    284  ready             -            -   8  0xceeb1000  260  rescan defaults
0xcd4f3800    285  waiting   sem          1743    -   8  0xceeb5000  260  rescan defaults
0xcd4ee000    260  waiting   cvar   0x80d22f70    -  10  0xcee38000  260  media_server
0xcd4ea800    263  waiting   sem          1767    -  19  0xcee40000  260  notification broadcast
0xcd4ee800    264  waiting   sem          1775    -  10  0xcee44000  260  big brother is watching you
0xcd4eb800    265  waiting   cvar   0x80d2309c    - 105  0xcee48000  260  media_server cotrol
0xcd4fb000    268  waiting   cvar   0x80d232f4    -  10  0xcee50000  260  AddOnMonitor
0xcd4fe000    275  waiting   cvar   0x80d2354c    -  20  0xcee68000  260  _BMediaRoster_
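To see at a glance which threads are blocked on the same object, the thread listing can be grouped by its "wait for" and "object" columns. The following Python sketch assumes the column layout shown in this listing (a "ready" row carries one token fewer than a "waiting" row); the function names are hypothetical and written only for this example.

```python
from collections import defaultdict

# Rows copied from the thread listing in this comment.
SAMPLE = """\
0xcd4f1000    281  waiting   sem          1743    -   8  0xceea5000  260  rescan defaults
0xcd4f3000    284  ready             -            -   8  0xceeb1000  260  rescan defaults
0xcd4f3800    285  waiting   sem          1743    -   8  0xceeb5000  260  rescan defaults
0xcd4ee000    260  waiting   cvar   0x80d22f70    -  10  0xcee38000  260  media_server
"""


def parse_thread_row(line):
    """Split one row of the listing into named fields."""
    tokens = line.split()
    thread_id, state = int(tokens[1]), tokens[2]
    if state == "waiting":
        wait_type, wait_object = tokens[3], tokens[4]
        rest = tokens[5:]
    else:
        # Non-waiting rows show a single "-" in place of wait type and object.
        wait_type = wait_object = None
        rest = tokens[4:]
    # rest = [cpu, priority, stack, team, name words...]
    return {
        "id": thread_id,
        "state": state,
        "wait": (wait_type, wait_object),
        "priority": int(rest[1]),
        "team": int(rest[3]),
        "name": " ".join(rest[4:]),
    }


def group_waiters(text):
    """Map (wait type, object) to the IDs of the threads blocked on it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if line.strip():
            row = parse_thread_row(line)
            if row["state"] == "waiting":
                groups[row["wait"]].append(row["id"])
    return dict(groups)


waiters = group_waiters(SAMPLE)
for (kind, obj), ids in waiters.items():
    print(f"{kind} {obj}: threads {ids}")
```

On the rows above this groups threads 281 and 285, both named "rescan defaults", under semaphore 1743.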

comment:9 Changed 9 years ago by bonefish

Blocked By: 5119 added
Component: - General → System/Kernel
Resolution: → duplicate
Status: new → closed

Duplicate of #5119 and fixed in hrev34687.

comment:10 Changed 9 years ago by stargatefan

Broken in 36769 gcc2 hybrid. Same issues, with the auich driver.

The issue is with the media server to kernel hook and this specific device driver; Open Sound produces the same result. Either the driver formatting is wrong or the kernel has a vulnerability.

I am currently looking into the driver.

Here is the listdev output.

Welcome to the Haiku shell.

~> listdev

device Communication controller (Modem, Generic) [7|3|0]

vendor 8086: Intel Corporation device 24c6: 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Modem Controller

device Multimedia controller (Multimedia audio controller) [4|1|0]

vendor 8086: Intel Corporation device 24c5: 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller

device Serial bus controller (SMBus) [c|5|0]

vendor 8086: Intel Corporation device 24c3: 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller

device Mass storage controller (IDE interface) [1|1|8a]

vendor 8086: Intel Corporation device 24ca: 82801DBM (ICH4-M) IDE Controller

device Bridge (ISA bridge) [6|1|0]

vendor 8086: Intel Corporation device 24cc: 82801DBM (ICH4-M) LPC Interface Bridge

device Network controller (Ethernet controller) [2|0|0]

vendor 10ec: Realtek Semiconductor Co., Ltd. device 8139: RTL-8139/8139C/8139C+

device Bridge (CardBus bridge) [6|7|0]

vendor 1180: Ricoh Co Ltd device 0476: RL5c476 II

device Bridge (CardBus bridge) [6|7|0]

vendor 1180: Ricoh Co Ltd device 0476: RL5c476 II

device Bridge (PCI bridge, Normal decode) [6|4|0]

vendor 8086: Intel Corporation device 2448: 82801 Mobile PCI Bridge

device Serial bus controller (USB Controller, EHCI) [c|3|20]

vendor 8086: Intel Corporation device 24cd: 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller

device Serial bus controller (USB Controller, UHCI) [c|3|0]

vendor 8086: Intel Corporation device 24c4: 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2

device Serial bus controller (USB Controller, UHCI) [c|3|0]

vendor 8086: Intel Corporation device 24c2: 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1

device Display controller [3|80|0]

vendor 8086: Intel Corporation device 3582: 82852/855GM Integrated Graphics Device

device Display controller (VGA compatible controller, VGA controller) [3|0|0]

vendor 8086: Intel Corporation device 3582: 82852/855GM Integrated Graphics Device

device Bridge (PCI bridge, Normal decode) [6|4|0]

vendor 8086: Intel Corporation device 3581: 82852/82855 GM/GME/PM/GMV Processor to AGP Controller

device Generic system peripheral [8|80|0]

vendor 8086: Intel Corporation device 3585: 82852/82855 GM/GME/PM/GMV Processor to I/O Controller

device Generic system peripheral [8|80|0]

vendor 8086: Intel Corporation device 3584: 82852/82855 GM/GME/PM/GMV Processor to I/O Controller

device Bridge (Host bridge) [6|0|0]

vendor 8086: Intel Corporation device 3580: 82852/82855 GM/GME/PM/GMV Processor to I/O Controller

~>
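When comparing reports from different machines, it helps to pull the audio controller's vendor and device IDs out of the listdev output. The sketch below is a minimal Python example that assumes the layout as pasted in this ticket, with a "device ... [class|subclass|interface]" line followed by a combined "vendor ...: ... device ...: ..." line; the regular expressions and function name are written for this example only. It filters on class 4, subclass 1, which the listing itself labels as a multimedia audio controller.

```python
import re

# Two entries copied from the listdev output in this comment.
SAMPLE = """\
device Multimedia controller (Multimedia audio controller) [4|1|0]

vendor 8086: Intel Corporation device 24c5: 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller

device Network controller (Ethernet controller) [2|0|0]

vendor 10ec: Realtek Semiconductor Co., Ltd. device 8139: RTL-8139/8139C/8139C+
"""

CLASS_RE = re.compile(r"^device (.+) \[([0-9a-f]+)\|([0-9a-f]+)\|([0-9a-f]+)\]$")
IDS_RE = re.compile(r"^vendor ([0-9a-f]{4}): (.*?) device ([0-9a-f]{4}): (.*)$")


def parse_listdev(text):
    """Pair each class line with the vendor/device line that follows it."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = CLASS_RE.match(line)
        if m:
            current = {"class": m.group(1), "code": m.group(2, 3, 4)}
            continue
        m = IDS_RE.match(line)
        if m and current is not None:
            current.update(
                vendor_id=m.group(1), vendor=m.group(2),
                device_id=m.group(3), device=m.group(4),
            )
            devices.append(current)
            current = None
    return devices


devices = parse_listdev(SAMPLE)

# Class 4 / subclass 1 is the audio controller entry in this listing.
audio = [(d["vendor_id"], d["device_id"]) for d in devices if d["code"][:2] == ("4", "1")]
print(audio)
```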

debug log of incident

GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-pc-haiku"...(no debugging symbols found)

[tcsetpgrp failed in terminal_inferior: Invalid Argument]
Thread 126 caused an exception: Segment violation
Reading symbols from /boot/system/runtime_loader...done.
Loaded symbols for /boot/system/runtime_loader
Reading symbols from /boot/system/lib/libbe.so...done.
Loaded symbols for /boot/system/lib/libbe.so
Reading symbols from /boot/system/lib/libmedia.so...done.
Loaded symbols for /boot/system/lib/libmedia.so
Reading symbols from /boot/system/lib/libstdc++.hrev4.so...done.
Loaded symbols for /boot/system/lib/libstdc++.hrev4.so
Reading symbols from /boot/system/lib/libroot.so...done.
Loaded symbols for /boot/system/lib/libroot.so
[tcsetpgrp failed in terminal_inferior: Invalid Argument]
[Switching to team /boot/system/servers/media_server (106) thread big brother is watching you (126)]
0x0021e9d7 in AppManager::_BigBrother ()
(gdb)
