Opened 17 months ago

Last modified 16 months ago

#13911 reopened bug

hrev51714-x86_gcc2_hybrid_anyboot kernel issue due to mild terminal usage and webpositive

Reported by: domcaf Owned by: bonefish
Priority: normal Milestone: Unscheduled
Component: File Systems/packagefs Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description (last modified by diver)

hrev51714-x86_gcc2_hybrid_anyboot kernel issue due to mild terminal usage and webpositive

REPRODUCTION STEPS:

  • Run from USB thumbdrive on physical hardware.
  • Make it to desktop.
  • Enter WiFi/WPA credentials.
  • Open Terminal and issue the following commands:
  • ping -c2 www.yahoo.com, for which successful responses were received.
  • set -o vi
  • ping -c2 www.google.com, for which successful responses were received.
  • perl -v, which produced the expected output.
  • Closed Terminal using the Ctrl-D key sequence.
  • Opened WebPositive, with no success.
  • Opened Terminal again, with no success.
  • The kernel debugger starts up by itself. See the attached photos of the white-background debugger output and the "bt" command output. Multiple photos of each are provided in case one is easier to read than another.
  • I was able to get the same behavior multiple times using the steps above. Additional hardware info is also attached in case this is processor/hardware specific.
  • Thanks for reviewing!

Attachments (16)

IMG_20171228_153212.jpg (1.0 MB) - added by domcaf 17 months ago.
Initial debugger output after it started automatically, background is white.
IMG_20171228_153234.jpg (926.9 KB) - added by domcaf 17 months ago.
Initial debugger output after it started automatically, background is white, alternate photo in case more readable
bt-IMG_20171228_183238.jpg (943.0 KB) - added by domcaf 17 months ago.
kernel "bt" command output
bt-IMG_20171228_183315.jpg (844.2 KB) - added by domcaf 17 months ago.
kernel "bt" command output; alternate photo for readability
TestBed-system-info_20171227.txt (2.2 KB) - added by domcaf 17 months ago.
Hardware info for the system on which the problem occurred, collected with the "sysinfo" utility while the same system was booted into Lubuntu Linux
IMG_20171125_164446.jpg (1.1 MB) - added by domcaf 17 months ago.
Bios info # 1
IMG_20171125_164309.jpg (1.0 MB) - added by domcaf 17 months ago.
Bios info # 2
IMG_20171125_164250.jpg (1.1 MB) - added by domcaf 17 months ago.
Bios info # 3
IMG_20171125_164224.jpg (1.1 MB) - added by domcaf 17 months ago.
Bios info # 4
IMG_20171125_164131.jpg (1002.0 KB) - added by domcaf 17 months ago.
Bios info # 5
memtest86plus-IMG_20171229_003013.jpg (999.6 KB) - added by domcaf 17 months ago.
memtest86+ results screenshot
memtest86plus-IMG_20171229_003025.jpg (1003.6 KB) - added by domcaf 17 months ago.
memtest86+ results screenshot alternate for readability
bt.jpg (779.4 KB) - added by domcaf 16 months ago.
bt output
bt_alt.jpg (961.3 KB) - added by domcaf 16 months ago.
bt output, alternate photo in case of readability problems
syslog.jpg (966.0 KB) - added by domcaf 16 months ago.
syslog | tail from KDL
syslog_alt.jpg (872.5 KB) - added by domcaf 16 months ago.
syslog | tail from KDL, alternate photo in case of readability problems

Change History (30)

Changed 17 months ago by domcaf

Attachment: IMG_20171228_153212.jpg added

Initial debugger output after it started automatically, background is white.

Changed 17 months ago by domcaf

Attachment: IMG_20171228_153234.jpg added

Initial debugger output after it started automatically, background is white, alternate photo in case more readable

Changed 17 months ago by domcaf

Attachment: bt-IMG_20171228_183238.jpg added

kernel "bt" command output

Changed 17 months ago by domcaf

Attachment: bt-IMG_20171228_183315.jpg added

kernel "bt" command output; alternate photo for readability

Changed 17 months ago by domcaf

Attachment: TestBed-system-info_20171227.txt added

Hardware info for the system on which the problem occurred, collected with the "sysinfo" utility while the same system was booted into Lubuntu Linux

Changed 17 months ago by domcaf

Attachment: IMG_20171125_164446.jpg added

Bios info # 1

Changed 17 months ago by domcaf

Attachment: IMG_20171125_164309.jpg added

Bios info # 2

Changed 17 months ago by domcaf

Attachment: IMG_20171125_164250.jpg added

Bios info # 3

Changed 17 months ago by domcaf

Attachment: IMG_20171125_164224.jpg added

Bios info # 4

Changed 17 months ago by domcaf

Attachment: IMG_20171125_164131.jpg added

Bios info # 5

comment:1 Changed 17 months ago by waddlesplash

Keywords: kernel debugger self start removed
Priority: high → normal

Can you please get the file /var/log/syslog from the booted Haiku system (before you reproduce the crash, of course) and upload it here? (Or is that somehow not possible?)

Secondly, have you run a memtest86+ on this system recently?

comment:2 Changed 17 months ago by domcaf

memtest86+ ran through 3 full cycles with no errors before I stopped testing. Please see the attached screenshots. I will try to get you the syslog soon.

Changed 17 months ago by domcaf

Attachment: memtest86plus-IMG_20171229_003013.jpg added

memtest86+ results screenshot

comment:3 Changed 17 months ago by domcaf

Has a Patch: set

Changed 17 months ago by domcaf

Attachment: memtest86plus-IMG_20171229_003025.jpg added

memtest86+ results screenshot, alternate for readability

comment:4 Changed 17 months ago by diver

Component: - General → File Systems/packagefs
Description: modified (diff)
Owner: changed from nobody to bonefish
Platform: x86 → All

comment:5 Changed 17 months ago by diver

Looks like Haiku drops into KDL because packagefs fails to read package content. It might be a problem with the thumb drive.

comment:6 Changed 17 months ago by domcaf

I'll try a different thumbdrive and post back what happens.

comment:7 Changed 17 months ago by domcaf

I was NOT able to reproduce this bug when I used a different thumb drive. Sorry for the "red herring". I will try multiple thumb drives BEFORE reporting other issues that I might experience. Thank you all for your patience. I consider this issue resolved.

comment:8 Changed 17 months ago by waddlesplash

Resolution: invalid
Status: new → closed

comment:9 Changed 17 months ago by mmlr

Resolution: invalid
Status: closed → reopened

It should still not crash in this way. The page fault indicates that some error condition isn't checked/handled. Probably a buffer that failed to allocate/transfer but was then still used. The error should bubble up and eventually be shown to the user somehow, not KDL.
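As a rough illustration of the handling described above, the following C++ sketch shows a read path in which a failed allocation or transfer returns an error to the caller instead of leaving a buffer around to be used afterwards. It is not packagefs code: `FakeDevice`, `ReadChunk`, `SumBytes` and the `k...` status constants are hypothetical names invented for this example, and the status convention (0 for success, negative for errors) is only loosely modeled on Haiku's `status_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical status codes for this sketch (not Haiku's real constants).
typedef int32_t status_t;
const status_t kOK = 0;
const status_t kTimedOut = -1;
const status_t kNoMemory = -2;
const status_t kBadValue = -3;

// Hypothetical stand-in for a block device whose reads can time out.
struct FakeDevice {
    std::vector<uint8_t> data;
    bool failReads;

    status_t Read(size_t offset, void* buffer, size_t size) const
    {
        if (failReads)
            return kTimedOut;
        if (offset > data.size() || size > data.size() - offset)
            return kBadValue;
        if (size > 0)
            std::memcpy(buffer, data.data() + offset, size);
        return kOK;
    }
};

// Reads `size` bytes at `offset` into `out`. On any failure the error is
// returned and `out` is left empty, so stale data cannot be used by accident.
status_t ReadChunk(const FakeDevice& device, size_t offset, size_t size,
    std::vector<uint8_t>& out)
{
    out.clear();

    std::vector<uint8_t> buffer;
    try {
        buffer.resize(size);
    } catch (const std::bad_alloc&) {
        return kNoMemory;
    }

    status_t status = device.Read(offset, buffer.data(), size);
    if (status != kOK)
        return status;

    out.swap(buffer);
    assert(out.size() == size);
    return kOK;
}

// A caller that checks the status before touching the data and passes any
// error further up. `sum` is only written on success.
status_t SumBytes(const FakeDevice& device, size_t offset, size_t size,
    uint32_t& sum)
{
    std::vector<uint8_t> chunk;
    status_t status = ReadChunk(device, offset, size, chunk);
    if (status != kOK)
        return status;

    uint32_t total = 0;
    for (uint8_t byte : chunk)
        total += byte;
    sum = total;
    return kOK;
}
```

With a device whose `failReads` flag is set, `SumBytes` returns `kTimedOut` and leaves `sum` untouched, which is the "bubble up" behavior asked for here; what the top-level caller then shows to the user is outside the scope of this sketch.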

Since it seems that you are able to readily reproduce the issue, can you keep that stick around for a while so that any investigation may get more info if needed? Could you try to reproduce the issue and print the last part of the syslog in KDL with 'syslog | tail' and post the output here? Maybe the issue triggering this leaves some traces which may make it easier to find the problem.

comment:10 Changed 17 months ago by domcaf

I've set aside the thumb drive for additional research on this issue. Do you want additional reproduction attempts with the same build as initially reported, or with a more recent one?

comment:11 Changed 17 months ago by mmlr

It is quite possible that the problem only occurs when hitting certain blocks of the stick or only with a certain access pattern. Since modifying the FS layout by updating packages may make the nicely reproducible test case go away, I'd stick to the current setup for now. Once the problem is better understood, updating may make sense.

comment:12 Changed 17 months ago by pulkomandy

Has a Patch: unset

comment:13 Changed 16 months ago by domcaf

I got to the point where the system would no longer boot off the thumb drive, so I re-imaged it with the same build as originally reported and tried again. With the re-imaged thumb drive, I get to the blue background of the desktop and then, after about 30 seconds, end up in the kernel debugger (KDL). Photos of the "bt" and "syslog | tail" output are attached.

Changed 16 months ago by domcaf

Attachment: bt.jpg added

bt output

Changed 16 months ago by domcaf

Attachment: bt_alt.jpg added

bt output, alternate photo in case of readability problems

Changed 16 months ago by domcaf

Attachment: syslog.jpg added

syslog | tail from KDL

Changed 16 months ago by domcaf

Attachment: syslog_alt.jpg added

syslog | tail from KDL, alternate photo in case of readability problems

comment:14 Changed 16 months ago by mmlr

Thanks for the update.

The syslog shows that the read fails because of a timeout on the USB device. Since the block number seems to have changed, it doesn't look like a specific region of the device is faulty. So there is probably a compatibility issue that leads to the device timeouts. That issue should be investigated separately, though.

What should be solved here is that the read error apparently isn't handled and therefore leads to the KDL instead of bubbling up.
