Opened 9 years ago

Closed 9 years ago

#8749 closed bug (fixed)

Recursive dependency waiting in the job/worker mechanism can easily result in stack overflows

Reported by: anevilyak Owned by: anevilyak
Priority: normal Milestone: R1
Component: Applications/Debugger Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

If one is in a stack frame with several compound variables that themselves have more compound members, expands quite a few of them so lots of members are visible in the variables view, and then steps, it's quite easy to cause the debugger to crash with a stack overflow in the worker thread. Examining the backtrace indicates that the issue is one job with many dependencies, where each dependency recursively waits. This probably needs to be rethought since the problem's currently trivial to trigger.

Change History (6)

comment:1 by bonefish, 9 years ago

As a first step it would be worthwhile to examine, if certain jobs make heavy use of the stack. If so, the code could be converted to heap allocations. Also, increasing the stack size for the worker threads would be possible, though only after decreasing the stack use. Obviously this is only an acceptable solution, if the amount of jobs we're talking about, is still manageable.

The more consequent solution is to rewrite the jobs to wait asynchronously. I.e. Do() would have to return indicating that the job is waiting and needs to be re-called when the job it is waiting for is done. This requires a bit more overhead in the job implementations, since the ones that can wait would need to be stateful.

in reply to:  1 ; comment:2 by anevilyak, 9 years ago

Replying to bonefish:

As a first step it would be worthwhile to examine, if certain jobs make heavy use of the stack. If so, the code could be converted to heap allocations. Also, increasing the stack size for the worker threads would be possible, though only after decreasing the stack use. Obviously this is only an acceptable solution, if the amount of jobs we're talking about, is still manageable.

I can do that, though in some cases that might be a bit more difficult to control, since most of them call into other code outside of the job system itself. For reference though, what is our default thread stack size currently?

The more consequent solution is to rewrite the jobs to wait asynchronously. I.e. Do() would have to return indicating that the job is waiting and needs to be re-called when the job it is waiting for is done. This requires a bit more overhead in the job implementations, since the ones that can wait would need to be stateful.

This sounds like ultimately a more scalable approach to me, even if it is a bit more work.

in reply to:  2 comment:3 by bonefish, 9 years ago

Replying to anevilyak:

Replying to bonefish: I can do that, though in some cases that might be a bit more difficult to control, since most of them call into other code outside of the job system itself.

Good point.

For reference though, what is our default thread stack size currently?

256 KB.

The more consequent solution is to rewrite the jobs to wait asynchronously. I.e. Do() would have to return indicating that the job is waiting and needs to be re-called when the job it is waiting for is done. This requires a bit more overhead in the job implementations, since the ones that can wait would need to be stateful.

This sounds like ultimately a more scalable approach to me, even if it is a bit more work.

Yeah, the more I think about it, the more this appears to be the solution to go for.

comment:4 by anevilyak, 9 years ago

Owner: changed from bonefish to anevilyak
Status: newassigned

comment:5 by anevilyak, 9 years ago

Status: assignedin-progress

comment:6 by anevilyak, 9 years ago

Resolution: fixed
Status: in-progressclosed

Fixed in hrev44360.

Note: See TracTickets for help on using tickets.