Opened 9 years ago

Closed 9 years ago

#6312 closed enhancement (fixed)

slab: protection from wrong freed objects

Reported by: lucian Owned by: axeld
Priority: normal Milestone: R1
Component: System/Kernel Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: yes Platform: All

Description

A kernel add-on can easily corrupt a SLAB by calling free() on a pointer that was not received from a corresponding malloc().

Attachments (2)

Change History (6)

comment:1 by lucian, 9 years ago

Has a Patch: set

comment:2 by bonefish, 9 years ago

A few remarks:

  • Coding style: No space after the cast operator.
  • The second check is not correct. Slabs can have a "color" (slab::offset) by which the objects are misaligned.
  • The first check is not quite correct either. Due to the slab color the lower bounds check can be stricter (though the second check would catch that). The upper bounds check should also consider the slab color, and the greatest valid object address is after source->size - 1 objects only.
  • Since both checks are not exactly for free, they should be performed only for KDEBUG >= 1.
  • Generally, unless the situation is unrecoverable, one should try to gracefully continue after a panic(), i.e. return in this case.

comment:3 by lucian, 9 years ago

Are you sure the second check is not correct?

  intptr_t objectOffset = data - source->offset - (uint8*) source->pages; 
  if (objectOffset % object_size != 0) { panic... }

I do remove the slab::offset before checking: data - source->offset.

I took as guide the way objects are created in InitSlab:

    uint8* data = ((uint8*)pages) + slab->offset;
    for (size_t i = 0; i < slab->size; i++) {
        status = constructor(cookie, data);
        data += object_size;
    }

The lower bounds check can be tighter (should be data < source->pages + source->offset I'll address that in a second version, but as far as I understand the code there's no off-by-one error (the check uses ">=", not ">").

I wasn't aware one could return after a panic().

I'm not as sure as you are about the KDEBUG >= 1 checks. Trashing a slab's ->free list can easily lead to data corruption or data loss. As a user (not only as a developer) I'd prefer to reboot than have data silently corrupted (think of a malloc() from a file system that receives a pointer to an object that's being written by someone else in parallel corrupting data in transit or before being sent to the disk). I'll put them in because you requested them, but I still disagree with you.

I've rewritten the patch.

in reply to:  3 comment:4 by bonefish, 9 years ago

Resolution: fixed
Status: newclosed

Replying to lucian:

Are you sure the second check is not correct?

  intptr_t objectOffset = data - source->offset - (uint8*) source->pages; 
  if (objectOffset % object_size != 0) { panic... }

I do remove the slab::offset before checking: data - source->offset.

Yeah, sorry, I overlooked that.

I took as guide the way objects are created in InitSlab:

    uint8* data = ((uint8*)pages) + slab->offset;
    for (size_t i = 0; i < slab->size; i++) {
        status = constructor(cookie, data);
        data += object_size;
    }

The lower bounds check can be tighter (should be data < source->pages + source->offset I'll address that in a second version, but as far as I understand the code there's no off-by-one error (the check uses ">=", not ">").

You're right. The check could be tighter, but just like for the lower bounds check, the alignment check catches those cases anyway. Never mind.

I wasn't aware one could return after a panic().

Yeah, the continue KDL command does that for you. Depending on the case that might not make much sense or will cause the same panic() to occur again (e.g. on page fault), but some situations are well continuable (as this one). Although one might want to shut the system down orderly as soon as possible and fix the cause of the problem.

I'm not as sure as you are about the KDEBUG >= 1 checks. Trashing a slab's ->free list can easily lead to data corruption or data loss. As a user (not only as a developer) I'd prefer to reboot than have data silently corrupted (think of a malloc() from a file system that receives a pointer to an object that's being written by someone else in parallel corrupting data in transit or before being sent to the disk). I'll put them in because you requested them, but I still disagree with you.

There's always a trade-off when it comes to error checks. This is pretty hot code (on UP machines called for every free()), so only very cheap checks for programmer errors should be in the production code. I'd also like to add a check for double-frees, but since that is even more expensive, I would only enable it at a higher debug level or as an optional debug feature.

I've rewritten the patch.

Thanks. Committed in hrev37508 with small changes: I joined the two ifs and adjusted the panic() text.

Note: See TracTickets for help on using tickets.