Opened 3 weeks ago

Last modified 9 days ago

#18885 new bug

panic: bounce buffer already in use!

Reported by: davidkaroly Owned by: nobody
Priority: normal Milestone: Unscheduled
Component: Drivers/Network Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description (last modified by davidkaroly)

Got this panic while building llvm18. It seems to happen after heavy usage of the file system, with lots of small files.

Steps to reproduce look something like this:

  1. build llvm18 with haikuporter
  2. delete the work directory: rm -rf work-18.1.2
  3. start the build again.
  4. we get a panic

Screenshot: tbd, unfortunately I didn't create one. It will take some time to trigger this again.

I saw this on recent hrevs, e.g. hrev57667, in Hyper-V. It seems to happen on both 32-bit and 64-bit.

Attachments (1)

panic.png (126.1 KB ) - added by davidkaroly 9 days ago.
screenshot


Change History (7)

comment:1 by waddlesplash, 3 weeks ago

Component: System/Kernel → Drivers/Network
Priority: high → normal

Seems odd this would happen on heavy filesystem usage, because the message comes from the DMA code in the FreeBSD network compatibility layer.

comment:2 by davidkaroly, 3 weeks ago

Description: modified (diff)

Ehh, sorry, I stand corrected: it happened in Hyper-V, not VMware.

Anyway, I probably also deleted the download folder, so the panic could have happened while re-downloading the source tarball.

Does this make more sense like this? E.g. the file system uses up some kind of buffers, and therefore at a later point the network stack runs out of resources?

I saw this happen a few times, every time after heavy file system usage. If I reboot the VM and then re-run the build (possibly including a wget download), I don't get any panic.

Last edited 3 weeks ago by davidkaroly (previous) (diff)

comment:3 by waddlesplash, 3 weeks ago

The panic occurs when a driver tries to load a network buffer into a bounce buffer that already has a different network buffer in it. Maybe something about memory pressure could bring that on, but if it does, there's still a bug elsewhere in that drivers shouldn't ever try to do that.

comment:4 by davidkaroly, 9 days ago

I re-tested on hrev57708. The issue cannot be reproduced on x86_64.

I was able to reproduce it on x86, though. After deleting the old working dir for the llvm build (lots of small files!), I get the panic when trying to download the new tarball.

See attached screenshot.

by davidkaroly, 9 days ago

Attachment: panic.png added

screenshot

comment:5 by davidkaroly, 9 days ago

Is it possible that the file system takes up the buffers and doesn't release them, so that in the next step the network driver runs out of buffers? (I'm just guessing; I'm really not familiar with that part of the kernel.)

comment:6 by waddlesplash, 9 days ago

No, that's not what the message means. In the FreeBSD bus-dma APIs, bounce buffers have other buffers "loaded into" them; then you send the bounce buffer to the hardware, and when the IO is done you "unload" the buffer you loaded in (and then you can reuse the bounce buffer.) This panic triggers when you try to load some buffer into a bounce buffer that is currently in use and hasn't yet had its buffer "unloaded".

I'm not sure FreeBSD has any equivalent sanity check here, so I think it's possible this is a driver bug that gets caught on Haiku but is silently missed on FreeBSD.

Looking at the logic in tulip_txput, though, I'm not sure how this happens. It first checks if there are any free descriptors, and if there aren't, it calls tulip_tx_intr, which in turn calls tulip_dequeue_mbuf (sometimes indirectly through other functions), which in turn calls bus_dmamap_unload (which is what "unloads" network buffers from bounce buffers.) If we don't wind up with any free descriptors, though, txput just bails without invoking load_mbuf_sg.

There isn't any way for bus_dmamap_unload to fail, so that can't be the problem here. Maybe somewhere in this convoluted logic there's a way for txput to return > 1 without actually having freed the buffer, but if this happens under high-load/high-memory-usage conditions, I don't know what that would be.

I do notice there are some "#ifdef i386" blocks in the driver, which wouldn't be compiled on x86_64. You might try disabling some of those and seeing if that changes the behavior on x86.
