Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#1920 closed bug (fixed)

Userland Heap Garbage in Kernel Heap Page

Reported by: bonefish Owned by: bonefish
Priority: blocker Milestone: R1/alpha1
Component: System/Kernel Version: R1/pre-alpha1
Keywords: Cc: mmlr
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

hrev24356, VMware

Cf. attached KDL session. Kernel page fault occurred while running the perl test suite. As can be seen in the attachment, there's a complete page of garbage in the entry array of a hash table. The contents looks like it stems from the heap of one of the perl tests (t/TEST or one of its children). The corresponding page looks good though -- it's in the right cache and has no bogus area mappings.

My suspicion would be that e.g. on an exec() a userland heap page has been freed, but not correctly unmapped, so that writes to that virtual address (likely the heap of the exec()ed program) would not cause a page fault and continue to modify that page. At some point the kernel heap would reuse the page and hand out an chunk of memory backed by it. Subsequent writes to the userland heap would then overwrite the contents of this allocation.

Needless to say that I've looked through the involved code and didn't spot a problem. There's still a TODO regarding removing page mappings in delete_cache(), but that looks obsolete to me (mappings and wired_count of the page have been checked a few lines before). There's another TODO in vm_soft_fault() talking about the possibility of mapping a page twice, but that seems unrelated.

I saved the virtual machine state, so if there're any hints what more to examine, I can do that.

Might be related to #1900.

Attachments (2)

kdl-session-userland-heap-garbage-in-kernel-heap-page (229.7 KB) - added by bonefish 11 years ago.
heap_large_alloc.diff (1.2 KB) - added by mmlr 11 years ago.
Use the first allocation page as allocation_id

Download all attachments as: .zip

Change History (10)

comment:1 Changed 11 years ago by axeld

I haven't found much time to look into this, but this looks like environment variables. And those aren't only part of userland apps, but also copied inside the kernel when a new app launches (see team.cpp create_team_arg()).

Since parts of this page are filled with 0xcccccccc, it might have been handed out by the kernel allocator, not the userland allocator (only the former fills it with 0xcc).

And since we found it in the middle of a hashtable allocation, and it's still filled with 0xcc, I'd say the kernel allocator has given this page away twice. It's just the question if it has been mapped twice (as you assumed), or if there is a bug in the heap implementation.

Note also that the cache uses hash_insert_grow() which means the memory of the hash table might change over time.

comment:2 in reply to:  1 ; Changed 11 years ago by bonefish

Cc: mmlr added

Replying to axeld:

I haven't found much time to look into this, but this looks like environment variables. And those aren't only part of userland apps, but also copied inside the kernel when a new app launches (see team.cpp create_team_arg()).

Yep, those definitely are environment variables, but this isn't an intact array. Different variables have been written on top of each other. The longest is BEINCLUDES, on top of it is probably PATH, on top of it something containing the perl test directory path, then some empty variable, on top of that "PERL_DESTRUCT_LEVEL=2" and on top of that "EMXSHELL=sh". The latter two are definitely set only by t/TEST (in that order). That's why I was more thinking of some kind of buffer that perl uses internally.

Since parts of this page are filled with 0xcccccccc, it might have been handed out by the kernel allocator, not the userland allocator (only the former fills it with 0xcc).

Good point! I thought the userland allocator did that too.

And since we found it in the middle of a hashtable allocation, and it's still filled with 0xcc, I'd say the kernel allocator has given this page away twice. It's just the question if it has been mapped twice (as you assumed), or if there is a bug in the heap implementation.

I'm having the suspicion that the page wasn't mapped twice after all, but that the kernel allocator did indeed hand out the chunk of memory more than once. It probably even kept doing that. That would perfectly explain that memory contents: First the hash table gets the chunk, clears it and repopulates the table. Later another large allocation gets the same chunk and causes the table to be 0xcc'd again. Later an environment array is copied from userland during an exec() call, each string being copied over the previous one.

Note also that the cache uses hash_insert_grow() which means the memory of the hash table might change over time.

Yep, I noticed that. The hash table already contains 2622 entries, though, so the last rehashing was already almost 400 entries ago. Since the table contains complete garbage only, I'd suspect that the trouble started only recently, maybe with the ongoing exec().

I guess it doesn't harm to review the allocator code... and add Michael to the CC. :-) Pity that I didn't have the allocator tracing enabled.

comment:3 in reply to:  2 ; Changed 11 years ago by axeld

Replying to bonefish:

I'm having the suspicion that the page wasn't mapped twice after all, but that the kernel allocator did indeed hand out the chunk of memory more than once. It probably even kept doing that. That would perfectly explain that memory contents: First the hash table gets the chunk, clears it and repopulates the table. Later another large allocation gets the same chunk and causes the table to be 0xcc'd again. Later an environment array is copied from userland during an exec() call, each string being copied over the previous one.

Not sure if that really fits so well: if the allocator would have done that, it would have overwritten the memory with 0xcc again. Another possibility would be that the page was copied into that region.

Yep, I noticed that. The hash table already contains 2622 entries, though, so the last rehashing was already almost 400 entries ago. Since the table contains complete garbage only, I'd suspect that the trouble started only recently, maybe with the ongoing exec().

Have you had a look at the pages around that one? Maybe it's not the only one with such contents?

I guess it doesn't harm to review the allocator code... and add Michael to the CC. :-) Pity that I didn't have the allocator tracing enabled.

Indeed :-)

comment:4 in reply to:  3 ; Changed 11 years ago by bonefish

Owner: changed from axeld to bonefish
Status: newassigned

Replying to axeld:

Replying to bonefish:

I'm having the suspicion that the page wasn't mapped twice after all, but that the kernel allocator did indeed hand out the chunk of memory more than once. It probably even kept doing that. That would perfectly explain that memory contents: First the hash table gets the chunk, clears it and repopulates the table. Later another large allocation gets the same chunk and causes the table to be 0xcc'd again. Later an environment array is copied from userland during an exec() call, each string being copied over the previous one.

Not sure if that really fits so well: if the allocator would have done that, it would have overwritten the memory with 0xcc again. Another possibility would be that the page was copied into that region.

Sure, I wouldn't rule that out. BTW it's very interesting to have a look at user_copy_strings_array() (team.cpp). It malloc()s a temporary buffer for the environment strings it copies from userland, which would result in exactly the memory pattern seen on that page. Even more interestingly the buffer size is 4 * B_PAGE_SIZE, while the size allocated for the hash table array is 4 * 4093. Both are "large" allocations, i.e. greater than the maximal allocator bin size.

I've reviewed the large allocation code and it looks good, save for the allocation ID stuff. heap_next_alloc_id() gives out IDs limited to interval [0, 511]. So, if enough large allocations are made, it is possible that two allocations with the same ID are assigned adjacent page ranges. If the first one is freed, the pages of the second one are freed as well and will thus be reused although still being in use. Which could definitely cause this problem.

I don't see why heap_next_alloc_id() limits the IDs like that in the first place, but even if it used the full 16 bits, that would just make the situation less likely. Instead I'd use consecutive numbers for the pages of an allocation (starting with 1). I'll do that.

Yep, I noticed that. The hash table already contains 2622 entries, though, so the last rehashing was already almost 400 entries ago. Since the table contains complete garbage only, I'd suspect that the trouble started only recently, maybe with the ongoing exec().

Have you had a look at the pages around that one? Maybe it's not the only one with such contents?

The attachment also shows the last bytes of the previous (0x00) and several bytes of the next page (0xcc).

comment:5 in reply to:  4 Changed 11 years ago by mmlr

Replying to bonefish:

I don't see why heap_next_alloc_id() limits the IDs like that in the first place, but even if it used the full 16 bits, that would just make the situation less likely. Instead I'd use consecutive numbers for the pages of an allocation (starting with 1). I'll do that.

It needs to limit the count because the allocation_id is in fact the free_count field, which is only 10 bits in size. But it is really a superfluous field anyway. You could use consecutive numbers in that field, or you could simply mark the last page of the allocation by setting the free_count to some specific value.

comment:6 Changed 11 years ago by mmlr

Indeed there is an even simpler way to solve this problem. When using the empty_index field as the allocation_id you get full 16 bits. Then one can simply use the index of the first page of the allocation as the allocation_id. This is guaranteed to be unique and there is no need for any separate id allocation. Also it needs no change to the rest of the code. I will apply a diff for that right after this comment.

Changed 11 years ago by mmlr

Attachment: heap_large_alloc.diff added

Use the first allocation page as allocation_id

comment:7 Changed 11 years ago by mmlr

Resolution: fixed
Status: assignedclosed

I couldn't resist fixing it right away. Sorry if I stepped on your toes here Ingo. Should be fixed in hrev24405.

comment:8 in reply to:  7 Changed 11 years ago by bonefish

Replying to mmlr:

I couldn't resist fixing it right away. Sorry if I stepped on your toes here Ingo.

Not at all. Thanks! Will check later, what the Perl test suite has to say about this change. :-)

Note: See TracTickets for help on using tickets.