Opened 11 years ago

Last modified 7 days ago

#10468 new bug

KDL: assert failed, scheduler_thread.h:268: stolenTime >= 0

Reported by: jscipione Owned by: pdziepak
Priority: normal Milestone: R1
Component: System/Kernel Version: R1/Development
Keywords: Cc:
Blocked By: Blocking: #18845
Platform: All

Description (last modified by jscipione)

This KDL occurred while running the Icons screensaver overnight.

src/system/kernel/scheduler/scheduler_thread.h:268: stolenTime >= 0

KDL image attached

Attachments (4)

KDL running Icons.png (31.6 KB ) - added by jscipione 11 years ago.
KDL assert failed while running Icons screensaver overnight
stolenTime.png (53.7 KB ) - added by diver 11 years ago.
10468-assertions.diff (1.7 KB ) - added by pdziepak 11 years ago.
additional assertion checks
syslog (121.3 KB ) - added by diver 11 years ago.

Download all attachments as: .zip

Change History (21)

by jscipione, 11 years ago

Attachment: KDL running Icons.png added

KDL assert failed while running Icons screensaver overnight

comment:1 by jscipione, 11 years ago

Description: modified (diff)

comment:2 by diver, 11 years ago

Component: - GeneralSystem/Kernel

comment:3 by diver, 11 years ago

I'm also seeing this panic right after boot in vbox with 2 CPUs in hrev46759.

by diver, 11 years ago

Attachment: stolenTime.png added

comment:4 by diver, 11 years ago

Still happens in hrev47003.

by pdziepak, 11 years ago

Attachment: 10468-assertions.diff added

additional assertion checks

comment:5 by pdziepak, 11 years ago

Unfortunately, I am not able to reproduce this issue and reviewing the involved code does not show anything obviously wrong. I have attached a patch that adds numerous assertion checks to the code which, hopefully, would help to narrow down the issue.

comment:6 by jscipione, 11 years ago

Just a crazy thought due to the timing of the occurrence of this bug... could it be possible that the assert failed due to a time calculation that was skewed by the system clock changing due to DST?

comment:7 by pdziepak, 11 years ago

Not really, system_time() is monotonic (at least when invoked on the same logical processor). DST, time zones, etc affect only real_time_clock*() family of functions.

by diver, 11 years ago

Attachment: syslog added

comment:8 by diver, 11 years ago

Applied the patch. This syslog contains several crashes in scheduler related code.

comment:9 by diver, 11 years ago

ping

comment:10 by pdziepak, 11 years ago

Some of the assertion fails in the syslog really shouldn't happen (namely fail at timeUsed >= 0) and are very strange. However, it just occurred to me that these problems may have also been caused by normal threads running with B_IDLE_PRIORITY like in case of #10766.

Could you check whether the problem is still present on hrev47129?

comment:11 by diver, 11 years ago

So far I've reproduced timeUsed >= 0 assertion fail in hrev47140 in vbox with 2 cpu.

comment:12 by jua, 11 years ago

I'm seeing this issue as well, running hrev47380 in VirtualBox. Got both kinds, timeUsed >= 0 and stolenTime >= 0 failed asserts. I've never seen it happen when running directly on hardware, but since installing in virtualbox a few days ago, it occured several times already, so maybe it has to do with running in a VM... maybe something related to timing inaccuracy by the VM?

In all cases, the system was idling and not even being interacted with, the KDL was then waiting for me when I got back to it. I can provide KDL screenshots in case they are helpful (but I guess they don't show anything that's not already shown in the other attachments on this ticket).

comment:13 by waddlesplash, 6 years ago

Summary: scheduler related KDL assert failedKDL: assert failed, scheduler_thread.h:268: stolenTime >= 0

comment:14 by diver, 11 months ago

Blocking: 18845 added

comment:15 by diver, 2 months ago

Started getting this panic (scheduler_thread.h:317: timeUsed >= 0) in VMware with both 1 and 2 cpus. Curiously, booting older states also panic with the same message.

comment:16 by diver, 2 months ago

Weird, rebooted my windows laptop and now it doesn't happen anymore.

comment:17 by diver, 7 days ago

Triggered again. I think what happened was that I had rebooted windows to linux while Haiku was running in VMware. When I booted to windows again I had all my apps running including vmware. Haiku was very laggy (app_server and kernel were pegging the cpu for no reason) and now I have the same KDL on every Haiku reboot with the same panic but different back traces:

KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/scheduler/scheduler_thread.h:313): timeUsed >= 0
KERN: Welcome to Kernel Debugging Land...
KERN: Thread 15 "usb explore" running on CPU 0
KERN: stack trace for thread 15 "usb explore"
KERN:     kernel stack: 0xffffffff8119a000 to 0xffffffff8119f000
KERN: frame                       caller             <image>:function + offset
KERN:  0 ffffffff8119ea60 (+  32) ffffffff80154a30   <kernel_x86_64> arch_debug_call_with_fault_handler() + 0x1a
KERN:  1 ffffffff8119eab0 (+  80) ffffffff800b89a8   <kernel_x86_64> debug_call_with_fault_handler() + 0x78
KERN:  2 ffffffff8119eb10 (+  96) ffffffff800ba094   <kernel_x86_64> _ZL20kernel_debugger_loopPKcS0_P13__va_list_tagi() + 0xf4
KERN:  3 ffffffff8119eb60 (+  80) ffffffff800ba42e   <kernel_x86_64> _ZL24kernel_debugger_internalPKcS0_P13__va_list_tagi() + 0x6e
KERN:  4 ffffffff8119ec50 (+ 240) ffffffff800ba7c7   <kernel_x86_64> panic() + 0xb7
KERN:  5 ffffffff8119ece0 (+ 144) ffffffff800a3fc1   <kernel_x86_64> _ZL10reschedulei() + 0xba1
KERN:  6 ffffffff8119ed30 (+  80) ffffffff8008de6e   <kernel_x86_64> thread_block_with_timeout() + 0xbe
KERN:  7 ffffffff8119edd0 (+ 160) ffffffff80074e96   <kernel_x86_64> switch_sem_etc() + 0x306
KERN:  8 ffffffff8119ee20 (+  80) ffffffff81e161ad   <usb> _ZN11ControlPipe11SendRequestEhhtttPvmPm() + 0x8d
KERN:  9 ffffffff8119ee80 (+  96) ffffffff81e13db8   <usb> _ZN3Hub16UpdatePortStatusEh() + 0x48
KERN: 10 ffffffff8119eee0 (+  96) ffffffff81e14454   <usb> _ZN3Hub7ExploreEPP11change_item() + 0x3b4
KERN: 11 ffffffff8119ef40 (+  96) ffffffff81e14546   <usb> _ZN3Hub7ExploreEPP11change_item() + 0x4a6
KERN: 12 ffffffff8119ef90 (+  80) ffffffff81e117dd   <usb> _ZN5Stack7ExploreEv() + 0xdd
KERN: 13 ffffffff8119efb0 (+  32) ffffffff81e11948   <usb> _ZN5Stack13ExploreThreadEPv() + 0x18
KERN: 14 ffffffff8119efd0 (+  32) ffffffff80090657   <kernel_x86_64> _ZL19common_thread_entryPv() + 0x37
KERN: 15 0000000000000000 (+   0) ffffffff8119efe0   160:usb explore_15_kstack@0xffffffff8119a000 + 0x4fe0
KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/scheduler/scheduler_thread.h:313): timeUsed >= 0
KERN: Welcome to Kernel Debugging Land...
KERN: Thread 25 "scsi_bus_service" running on CPU 0
KERN: stack trace for thread 25 "scsi_bus_service"
KERN:     kernel stack: 0xffffffff8022f000 to 0xffffffff80234000
KERN: frame                       caller             <image>:function + offset
KERN:  0 ffffffff80233b50 (+  32) ffffffff80154a30   <kernel_x86_64> arch_debug_call_with_fault_handler + 0x1a
KERN:  1 ffffffff80233ba0 (+  80) ffffffff800b89a8   <kernel_x86_64> debug_call_with_fault_handler + 0x78
KERN:  2 ffffffff80233c00 (+  96) ffffffff800ba094   <kernel_x86_64> kernel_debugger_loop(char const*, char const*, __va_list_tag*, int) + 0xf4
KERN:  3 ffffffff80233c50 (+  80) ffffffff800ba42e   <kernel_x86_64> kernel_debugger_internal(char const*, char const*, __va_list_tag*, int) + 0x6e
KERN:  4 ffffffff80233d40 (+ 240) ffffffff800ba7c7   <kernel_x86_64> panic + 0xb7
KERN:  5 ffffffff80233dd0 (+ 144) ffffffff800a3fc1   <kernel_x86_64> reschedule(int) + 0xba1
KERN:  6 ffffffff80233e20 (+  80) ffffffff8008de6e   <kernel_x86_64> thread_block_with_timeout + 0xbe
KERN:  7 ffffffff80233e90 (+ 112) ffffffff8005bddb   <kernel_x86_64> ConditionVariableEntry::Wait(unsigned int, long) + 0x16b
KERN:  8 ffffffff80233ec0 (+  48) ffffffff819fcd93   <ata> ATAChannel::WaitForInterrupt(long) + 0x23
KERN:  9 ffffffff80233ef0 (+  48) ffffffff819feedc   <ata> ATADevice::ExecuteReadWrite[clone .localalias] (ATARequest*, unsigned long, unsigned int) + 0x10c
KERN: 10 ffffffff80233f20 (+  48) ffffffff819fc08e   <ata> ATAChannel::ExecuteIO(scsi_ccb*) + 0x9e
KERN: 11 ffffffff80233f40 (+  32) ffffffff819ff93b   <ata> ata_sim_scsi_io(void*, scsi_ccb*) + 0x1b
KERN: 12 ffffffff80233f80 (+  64) ffffffff81de9d55   <scsi> scsi_check_exec_service + 0xb5
KERN: 13 ffffffff80233fb0 (+  48) ffffffff81de5d9f   <scsi> scsi_service_threadproc(void*) + 0x3f
KERN: 14 ffffffff80233fd0 (+  32) ffffffff80090657   <kernel_x86_64> common_thread_entry(void*) + 0x37
KERN: 15 0000000000000000 (+   0) ffffffff80233fe0   247:scsi_bus_service_25_kstack@0xffffffff8022f000 + 0x4fe0
Note: See TracTickets for help on using tickets.