Opened 11 years ago

Closed 4 months ago

#2604 closed bug (fixed)

Heavy I/O requests make the system unusable

Reported by: emitrax
Owned by: nobody
Priority: normal
Milestone: R1
Component: System/Kernel
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All


Couldn't find a better summary, sorry.

I was running four instances of bonnie++, plus a wget -r on a website, although the latter was on a different BFS partition.

The system became unusable quite quickly. It looks like an infinite loop rather than a deadlock, as there was always a thread ready (the low resource manager and the SCSI scheduler).

To summarize: one instance of bonnie++ was waiting on a condition variable in steal_pages while holding the BFS journal lock. The other three instances of bonnie++ were all waiting for the journal lock. The page writer thread was also waiting on a condition variable, in vfs_write_pages -> Request::Wait().
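To make the wait chain above easier to picture, here is a minimal user-space sketch in Python. It is not Haiku kernel code: `simulate`, `journal_lock`, `pages_free` and the timeouts are all hypothetical stand-ins, and the assumption that nobody ever signals the condition variable is only one possible explanation of the hang, not something the backtraces confirm.

```python
import threading


def simulate(pages_get_freed: bool, waiters: int = 3) -> int:
    """Return how many waiters gave up waiting for the journal lock."""
    journal_lock = threading.Lock()      # stand-in for the BFS journal lock
    pages_free = threading.Condition()   # stand-in for the wait in steal_pages
    state = {"free": False}
    lock_held = threading.Event()        # lets the other threads start after the holder
    gave_up = []

    def holder():
        # First bonnie++ instance: takes the journal lock, then waits for pages
        # while still holding it.
        with journal_lock:
            lock_held.set()
            with pages_free:
                pages_free.wait_for(lambda: state["free"], timeout=1.0)

    def waiter(name):
        # Other bonnie++ instances: all they want is the journal lock.
        lock_held.wait()
        if journal_lock.acquire(timeout=0.3):
            journal_lock.release()
        else:
            gave_up.append(name)

    def page_writer():
        # Assumption: if the page writer is itself stuck, pages are never freed
        # and the holder is never woken up.
        lock_held.wait()
        if pages_get_freed:
            with pages_free:
                state["free"] = True
                pages_free.notify_all()

    threads = [threading.Thread(target=holder), threading.Thread(target=page_writer)]
    threads += [threading.Thread(target=waiter, args=(i,)) for i in range(waiters)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(gave_up)
```

With `simulate(False)` every waiter times out on the lock, which mirrors the state in the backtraces; with `simulate(True)` the holder is woken, releases the lock, and the waiters get through. The timeouts exist only so the sketch terminates; the real threads had none.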

Unfortunately I forgot to check the scsi scheduler thread backtrace.

I hope it helps.

I'm attaching the serial log.

Attachments (1)

haiku-serial-port.txt (43.0 KB) - added by emitrax 11 years ago.

Change History (9)

Changed 11 years ago by emitrax

Attachment: haiku-serial-port.txt added

comment:1 Changed 10 years ago by Adek336

Also note #2812.

comment:2 Changed 2 years ago by diver

Component: - General → System/Kernel

This one is probably fixed now?

comment:3 Changed 2 years ago by axeld

Owner: changed from axeld to nobody
Status: new → assigned

comment:4 Changed 6 months ago by waddlesplash

Nope, the GCC2 buildmaster builder is presently hung in exactly this state: some thread is holding the BFS journal lock and waiting, and all other threads wanting to do disk I/O are blocked waiting for it.

The page writer was also blocked on a cvar in ::Go.

comment:5 Changed 6 months ago by mmlr

Are you sure this was not virtio_block related? If it was, it would have hung waiting for the condition variable of the virtio request.

comment:6 Changed 6 months ago by waddlesplash

I didn't manage to locate the thread that was actually blocked and didn't locate this ticket until after I rebooted, so it's possible.

comment:7 Changed 6 months ago by mmlr

The virtio_block hang is unfortunately somewhat common, so I would intuitively blame that rather than the issue described in this ticket. Finding the thread is usually easy: it's the last thread the build process forked off.

The setup described in this ticket seems easy enough to reproduce, so that should be tried and the ticket then closed or investigated.

comment:8 Changed 4 months ago by waddlesplash

Resolution: fixed
Status: assigned → closed

Indeed, I can't reproduce this ticket anymore; the system gets somewhat sluggish but it never locks up altogether.