Ticket #3261 (closed bug: fixed)

Opened 15 months ago

Last modified 14 months ago

multi_audio KDL: page fault with interrupts disabled

Reported by: umccullough
Owned by: korli
Priority: normal
Milestone: R1
Component: Drivers/Audio
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Platform: All
Blocking:

Description

Experienced a KDL while compiling some stuff on a machine I use regularly for porting stuff on Haiku.

PANIC: page fault, but interrupts were disabled. Touching address 0x70102f40 from eip 0x832ad165

Welcome to Kernel Debugging Land...
Thread 149 "multi_audio audio output" running on CPU 0
kdebug>

Stack trace is attached as jpg. Machine is still sitting at the crash and I'll hook up the serial cable to get text info next. I'll leave it like this as long as possible if anyone has any additional things to check.

The machine has probably been on for > 24 hours, with some compiling here and there, testing of some software, and running of dnetc for a few hours. Was compiling when it happened.

Attachments

KDL_multi_audio_dellgx200.jpg (349.2 KB) - added by umccullough 15 months ago.
KDL and backtrace
multi_audio_backtrace.txt (2.7 KB) - added by umccullough 15 months ago.
Backtrace from serial connection
area_70102f40.txt (3.1 KB) - added by umccullough 14 months ago.
area -m 0x70102f40 output
page_8217b978_50.txt (0.7 KB) - added by umccullough 14 months ago.
page 0x8217b978 and 0x8217b950

Change History

Changed 15 months ago by umccullough

KDL and backtrace

Changed 15 months ago by umccullough

Backtrace from serial connection

follow-up: ↓ 7   Changed 14 months ago by korli

It seems the data parameter passed to the ioctl() is not valid. Maybe it's a bad idea to copy to userland with interrupts disabled.

  Changed 14 months ago by umccullough

Is there anything further I can get from the KDL? It's still sitting there, and I can run more commands from KDL later on if desired, about 4 hours from now.

  Changed 14 months ago by korli

Could you type "area 0x70102f40" and copy the result here?

  Changed 14 months ago by korli

Even better: "area -m 0x70102f40"

Changed 14 months ago by umccullough

area -m 0x70102f40 output

  Changed 14 months ago by umccullough

I attached the text output from "area -m 0x70102f40" to the ticket.

Let me know if there's anything else that would help.

  Changed 14 months ago by korli

"page 0x8217b978" and "page 0x8217b950"

in reply to: ↑ 1 ; follow-up: ↓ 8   Changed 14 months ago by axeld

Replying to korli:

Maybe it's a bad idea to copy to userland with interrupts disabled.

I can assure you that this is indeed a bad idea. You either need to lock the memory (using lock_memory()) before copying (and turning off interrupts), or use user_memcpy() with enabled interrupts.

in reply to: ↑ 7   Changed 14 months ago by korli

Replying to axeld:

Maybe it's a bad idea to copy to userland with interrupts disabled.

I can assure you that this is indeed a bad idea. You either need to lock the memory (using lock_memory()) before copying (and turning off interrupts), or use user_memcpy() with enabled interrupts.

I can do that, but I don't see many drivers which use user_memcpy(). This might need to be reviewed globally then. What do you think?

Changed 14 months ago by umccullough

page 0x8217b978 and 0x8217b950

  Changed 14 months ago by umccullough

Added output from the two page commands

  Changed 14 months ago by axeld

You always need to use user_memcpy() in the kernel for memory that has not been locked. The reason is simply that the memory could go away at any time; if the resulting page fault could not be resolved, the kernel would crash with memcpy(), whereas user_memcpy() would just return an error.
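To illustrate the difference, here is a minimal userspace sketch of the user_memcpy() pattern. It is a simulation under stated assumptions, not the actual driver fix: mock_user_memcpy(), mock_is_user_range(), fake_user_space, struct sketch_info and sketch_ioctl_get_info() are hypothetical names invented for this example, and the error value is a placeholder. The idea shown is to gather the data while interrupts are off, then copy it out only after they are back on, and to pass the copy's error back to the caller.

```c
/* Userspace simulation only: every kernel call is mocked so the control
 * flow compiles and runs anywhere. This is not the r28861 patch. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t status_t;
#define B_OK          0
#define B_BAD_ADDRESS (-1)  /* placeholder value for this sketch */

/* Hypothetical stand-in for userland: only this array counts as valid. */
static unsigned char fake_user_space[64];
static int interrupts_enabled = 1;  /* simulated interrupt state */

/* Returns nonzero if [p, p + size) lies inside fake_user_space. */
static int mock_is_user_range(const void *p, size_t size)
{
    uintptr_t base = (uintptr_t)fake_user_space;
    uintptr_t addr = (uintptr_t)p;

    return size <= sizeof(fake_user_space)
        && addr >= base
        && addr - base <= sizeof(fake_user_space) - size;
}

/* Mock of user_memcpy(): reports a bad destination instead of faulting.
 * Only the destination is checked because this sketch copies to userland. */
static status_t mock_user_memcpy(void *to, const void *from, size_t size)
{
    assert(interrupts_enabled);  /* the rule discussed in this ticket */
    if (!mock_is_user_range(to, size))
        return B_BAD_ADDRESS;
    memcpy(to, from, size);
    return B_OK;
}

/* Hypothetical payload an audio ioctl might hand back to the caller. */
struct sketch_info {
    int32_t played_frames;
    int32_t buffer_cycle;
};

status_t sketch_ioctl_get_info(void *user_data)
{
    struct sketch_info info;

    /* Gather state into a kernel-side copy while interrupts are off. */
    interrupts_enabled = 0;
    info.played_frames = 1024;
    info.buffer_cycle = 1;
    interrupts_enabled = 1;

    /* Copy out only after interrupts are back on; a bad pointer becomes
     * an error code for the caller instead of a kernel panic. */
    return mock_user_memcpy(user_data, &info, sizeof(info));
}
```

Calling sketch_ioctl_get_info() with a pointer into fake_user_space returns B_OK and fills the buffer, while a bogus pointer yields B_BAD_ADDRESS without touching memory.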

And yes, you are right in that we have to review lots of drivers to do that correctly.

We could follow BeOS, which locked the buffers for read/write calls. There, only ioctl() was completely unsafe, and each app could easily crash the system by passing an invalid buffer to an ioctl. Unfortunately, there is no way to automatically make sure the buffers passed to ioctl are safe to use.

In any case, if you need to copy a buffer with interrupts disabled, you have to use lock_memory() to protect it from going away at the wrong time.
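The lock_memory() variant can be sketched the same way. Again this is a userspace simulation and not the driver's code: mock_lock_memory(), mock_unlock_memory(), the range_locked and interrupts_enabled flags, and sketch_copy_with_interrupts_off() are hypothetical, and the real kernel functions take additional arguments (such as flags) that are omitted here. The point is the ordering: lock first while interrupts are still enabled, bail out on failure, and only then disable interrupts and copy.

```c
/* Userspace simulation only: the kernel calls are mocked so the ordering
 * can be compiled and run anywhere. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t status_t;
#define B_OK          0
#define B_BAD_ADDRESS (-1)  /* placeholder value for this sketch */

/* Hypothetical stand-in for userland: only this array counts as valid. */
static unsigned char fake_user_space[64];
static int range_locked = 0;        /* simulated lock state */
static int interrupts_enabled = 1;  /* simulated interrupt state */

/* Returns nonzero if [p, p + size) lies inside fake_user_space. */
static int mock_is_user_range(const void *p, size_t size)
{
    uintptr_t base = (uintptr_t)fake_user_space;
    uintptr_t addr = (uintptr_t)p;

    return size <= sizeof(fake_user_space)
        && addr >= base
        && addr - base <= sizeof(fake_user_space) - size;
}

/* Mock of lock_memory(): fails cleanly on an invalid range. Locking may
 * itself need to fault pages in, so interrupts must still be enabled. */
static status_t mock_lock_memory(void *address, size_t size)
{
    assert(interrupts_enabled);
    if (!mock_is_user_range(address, size))
        return B_BAD_ADDRESS;
    range_locked = 1;
    return B_OK;
}

static void mock_unlock_memory(void)
{
    range_locked = 0;
}

status_t sketch_copy_with_interrupts_off(void *user_data, const void *src,
    size_t size)
{
    /* 1. Lock the user buffer before touching the interrupt state. */
    status_t status = mock_lock_memory(user_data, size);
    if (status != B_OK)
        return status;

    /* 2. With the pages pinned, a plain copy cannot page-fault. */
    interrupts_enabled = 0;
    assert(range_locked);
    memcpy(user_data, src, size);
    interrupts_enabled = 1;

    /* 3. Release the lock once interrupts are enabled again. */
    mock_unlock_memory();
    return B_OK;
}
```

Compared with the user_memcpy() approach, this keeps the copy inside the interrupts-disabled section at the cost of a lock and unlock around every call.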

follow-up: ↓ 12   Changed 14 months ago by umccullough

Will there be anything else I need to post - or can I reboot that machine tonight? :)

in reply to: ↑ 11   Changed 14 months ago by korli

Replying to umccullough:

Will there be anything else I need to post - or can I reboot that machine tonight? :)

No, you can reboot. Thanks a lot.

  Changed 14 months ago by korli

Applied a patch in r28861. Could you please check?

follow-up: ↓ 15   Changed 14 months ago by umccullough

I'm not sure I can reproduce the error at will - this was the first time I've ever seen it while using this rev of Haiku pretty regularly on this machine for several weeks now.

With that, I'd say close this issue if you believe it's resolved...

in reply to: ↑ 14   Changed 14 months ago by korli

Replying to umccullough:

I'm not sure I can reproduce the error at will - this was the first time I've ever seen it while using this rev of Haiku pretty regularly on this machine for several weeks now.

Understood

With that, I'd say close this issue if you believe it's resolved...

I'm not sure at all; at least check that it still works normally :)

  Changed 14 months ago by umccullough

Listening to audio on that machine at this very moment :)

One thing of note, when I was compiling before, I had no audio playing that I knew of (didn't even have speakers hooked up at the time).

  Changed 14 months ago by umccullough

Oops - forgot to mention that the above comment was made while using r28868.

  Changed 14 months ago by korli

  • status changed from new to closed
  • resolution set to fixed

Nice! Could you also check with r28887?
