Opened 6 months ago

Closed 6 months ago

#9128 closed bug (fixed)

Network Deadlock in R1A4

Reported by: kallisti5
Owned by: axeld
Priority: normal
Milestone: R1/alpha4
Component: Network & Internet/Stack
Version: R1/Development
Keywords: net_timer
Cc: luroh
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

I've seen this issue twice in recent images... it's pretty bad.

1) Boot the system off a CD.
2) Deskbar is frozen.
3) Open Terminal; ifconfig freezes and waits forever.

No errors in syslog. Pulkomandy said that something may be putting junk data in one of the ports.

Attachments (1)

syslog (164.4 KB) - added by kallisti5 6 months ago.
syslog with port / ports kdl info

Change History (14)

comment:1 Changed 6 months ago by kallisti5

From the syslog, looking at a port used by Deskbar, we get a read fault:

KERN: kdebug> port 333
PORT: 0xce3c72a8
KERN:  id:              333
KERN:  name:            "Deskbar"
KERN:  owner:           152
KERN:  capacity:        200
KERN:  read_count:      200
KERN:  write_count:     0
KERN:  total count:     39
KERN: messages:
KERN:
KERN: [*** READ FAULT at 0x8198d000, pc: 0x8006151a ***]
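
For anyone triaging similar dumps, here is a minimal Python sketch that pulls the numeric fields out of a `port` dump like the one above. The function names are hypothetical and not part of any Haiku tool. The "saturated" check rests on an assumption, not on anything confirmed in this ticket: that read_count equal to capacity, with write_count at 0, means the port's queue is full and nobody is draining it.

```python
import re

# Matches numeric fields such as "KERN:  capacity:        200".
# re.search is used so a leading syslog line number does not matter.
FIELD_RE = re.compile(
    r"KERN:\s+(id|owner|capacity|read_count|write_count|total count):\s+(-?\d+)"
)

# Excerpt copied from the dump in this ticket.
SAMPLE_DUMP = """\
KERN: kdebug> port 333
KERN:  id:              333
KERN:  name:            "Deskbar"
KERN:  owner:           152
KERN:  capacity:        200
KERN:  read_count:      200
KERN:  write_count:     0
KERN:  total count:     39
"""


def parse_port_dump(text):
    """Return the numeric fields of a kdebug 'port' dump as a dict."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.search(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def looks_saturated(fields):
    """Heuristic (assumption): the queue is full when read_count has
    reached capacity and write_count is 0."""
    capacity = fields.get("capacity")
    if capacity is None:
        return False
    return fields.get("read_count") == capacity and fields.get("write_count") == 0


if __name__ == "__main__":
    port = parse_port_dump(SAMPLE_DUMP)
    print(port)
    print("possibly saturated:", looks_saturated(port))
```

If that assumption holds, the Deskbar port above would be flagged, which would be consistent with Deskbar no longer reading its messages.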

comment:2 Changed 6 months ago by kallisti5

I've rebooted and seen this issue 2 out of 3 boots.

I tried killing net_server without success (it won't die). Next I'm going to attach gdb to ifconfig to find out what it's doing.

comment:3 Changed 6 months ago by kallisti5

  • Version changed from R1/alpha3 to R1/Development

comment:4 Changed 6 months ago by kallisti5

  • Priority changed from blocker to normal

Now I can't reproduce it.

I did see a few CD SCSI media errors recently in syslog, so I am going to blame a bad CD-ROM drive or cheap CD media.

Removing blocker status... It will return if anyone else sees the same issue. I booted the latest R1A4 image in qemu 5-6 times without any problems.

comment:5 Changed 6 months ago by jscipione

I was unable to reproduce this bug using hrevr1alpha4-44699 in VMware and QEMU. I tried QEMU with the default e1000 network card as well as rtl8139. I also tried running from the ISO in QEMU to see if it was a problem on a read-only medium. I'll try running on real hardware tonight.

comment:6 Changed 6 months ago by jscipione

I was also unable to reproduce this bug using hrevr1alpha4-44699 on real hardware, either from the live CD or after installing to read-write media. I see no reason for this bug to hold up the release.

comment:7 Changed 6 months ago by mmadia

  • Milestone changed from R1/alpha4 to R1/beta1

This was seen by others, though the issue seems to exist only on single-core machines.

It happens only on the first boot of a fresh install (or every boot of a LiveCD) when safe mode is enabled. I believe it's because, in that scenario:

  1. net_server is not launched via the Bootscript,
  2. the freshInstallIndicator exists, which triggers the launch of the post-install scripts (postInstallDir=/boot/common/boot/post_install), and
  3. lastly, default_deskbar_items.sh unconditionally installs NetworkStatus into Deskbar.

luroh was able to confirm that commenting out /boot/system/apps/NetworkStatus --deskbar from default_deskbar_items.sh allows Haiku to boot successfully into safe mode on the first boot.

As far as R1A4 is concerned, this has been noted as a Known Issue in the Release Notes.
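
For triage, the trigger conditions reported in this comment can be written down as a small predicate. This is only a Python restatement of what was observed in this ticket; it is not how Haiku itself decides anything, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BootScenario:
    cpu_count: int
    safe_mode: bool
    fresh_install: bool  # first boot after install (freshInstallIndicator present)
    live_cd: bool        # booted as a LiveCD


def matches_reported_conditions(boot):
    """True if the boot matches the conditions under which the hang was reported.

    Per the comment above: single-core machine, safe mode enabled, and
    either the first boot of a fresh install or any LiveCD boot.
    """
    post_install_runs = boot.fresh_install or boot.live_cd
    return boot.cpu_count == 1 and boot.safe_mode and post_install_runs


if __name__ == "__main__":
    suspect = BootScenario(cpu_count=1, safe_mode=True, fresh_install=False, live_cd=True)
    normal = BootScenario(cpu_count=2, safe_mode=True, fresh_install=True, live_cd=False)
    print(matches_reported_conditions(suspect))
    print(matches_reported_conditions(normal))
```

A report that matches this predicate is a candidate for the same issue. One that does not match may be a different problem, such as the bad-media theory raised earlier in the thread.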

comment:8 Changed 6 months ago by luroh

  • Cc luroh added

comment:9 Changed 6 months ago by kallisti5

Great... my laptop now has had this problem 2 out of the last 3 boots on R1A4.

Another solution may be to burn the anyboot to an ISO, and choose the 'install' option vs the 'LiveCD' mode.

comment:10 Changed 6 months ago by luroh

The problem (if it is really the same one) can be reproduced here on my single-core machines with an R1Alpha3 CD as well, if I boot in safe mode and select Live-CD.
VMware can also reproduce the problem if I boot a virgin vmdk in safe mode using a single-core VM.

comment:11 Changed 6 months ago by anevilyak

Please check if this problem persists with hrev44832 or later.

comment:12 Changed 6 months ago by luroh

My home-brewed gcc2 virgin vmdk hrevr1alpha-44699 reproduces the problem every time when booting in safe mode, but my hrevr1alpha-44701 does not. Both are very clean builds. I think this bug has been squashed.

comment:13 Changed 6 months ago by anevilyak

  • Component changed from Servers/net_server to Network & Internet/Stack
  • Keywords net_timer added
  • Milestone changed from R1/beta1 to R1/alpha4
  • Resolution set to fixed
  • Status changed from new to closed

Thanks for verifying!
