Opened 11 years ago
Last modified 5 years ago
#10169 assigned bug
Check kernel functions' stack usage
Reported by: | korli | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | x86-64 |
Description
I reproduced on real hardware and QEmu 1.4.0
Here is the QEmu command line (090c:1000 is the vendor_id:device_id for the usb key)
qemu-system-x86_64 -m 512 -usb -serial stdio --enable-kvm -usbdevice host:090c:1000 -boot menu=on
then press F12, then select the USB device.
As you may understand this bug is a bit blocking to test x86_64 on real hardware :)
Attachments (1)
Change History (19)
by , 11 years ago
Attachment: | log_usb_boot_hrev46281_x86_64.txt added |
---|
comment:1 by , 11 years ago
It seems the kernel stack is the same size on 32-bit and 64-bit, which might explain the behavior difference between architectures. The following patch fixes the problem by increasing the kernel stack size by another page.
Is it OK as is or would it be better to differentiate with B_HAIKU_64_BIT?
diff --git a/headers/private/kernel/kernel.h b/headers/private/kernel/kernel.h index a5c01e3..d068a48 100644 --- a/headers/private/kernel/kernel.h +++ b/headers/private/kernel/kernel.h @@ -47,7 +47,7 @@ // At least, you then know that the stack overflows in this case :) /** Size of the kernel stack */ -#define KERNEL_STACK_SIZE (B_PAGE_SIZE * 3) // 12 kB +#define KERNEL_STACK_SIZE (B_PAGE_SIZE * 4) // 16 kB #ifdef DEBUG_KERNEL_STACKS # define KERNEL_STACK_GUARD_PAGES 1
comment:2 by , 11 years ago
IIRC it's only 3 pages because of the guard pages -- at least IIRC it used to be 8 kB. Anyway, it should be a 64-bit only change. The kernel stacks can take a lot of (unpageable) memory when you have many threads around.
follow-up: 4 comment:3 by , 11 years ago
Fine. But isn't the 2832 bytes stack usage in vm_page_write_modified_page_range() a bit high anyway?
ffffffff811a8010 (+2832) ffffffff8010f3b4 <kernel_x86_64> vm_page_write_modified_page_range() + 0x4cb
comment:4 by , 11 years ago
Replying to korli:
Fine. But isn't the 2832 bytes stack usage in vm_page_write_modified_page_range() a bit high anyway?
Yes, that's way too much. The main issue is
PageWriteWrapper* wrappers[maxPages];
With maxPages
being up to 256, this means 2048 bytes on a 64 bit architecture. It should be allocated in the same way wrapperPool
is allocated (i.e. on the heap with small stack array fallback).
Another issue is the PageWriteTransfer
object allocated on the stack. The class contains the following attribute:
generic_io_vec fVecs[32]; // TODO: make dynamic/configurable
That's another 512 bytes.
One can see several heavy stack users in the stack trace. The file cache's read_into_cache()
being the most problematic, since there's no limit for nested mounting (e.g. an image file in an image file in an image file ...), with certain calls going through as many file cache layers.
I suppose many functions could use a review wrt stack usage. Even 200 bytes is quite a bit. And all code that isn't performance critical and is allowed to fail could allocate variables on the heap.
comment:6 by , 11 years ago
Here is a patch for the wrappers stack usage issue. I have no idea better than stackWrappers2 for the variable name ATM.
Testing with a 128MB VM seemed OK. I encountered "PageWriteWrapper: Failed to write page" messages in the syslog on low memory condition with or without this patch.
diff --git a/src/system/kernel/vm/vm_page.cpp b/src/system/kernel/vm/vm_page.cpp index ee20257..34d7370 100644 --- a/src/system/kernel/vm/vm_page.cpp +++ b/src/system/kernel/vm/vm_page.cpp @@ -3081,7 +3081,14 @@ vm_page_write_modified_page_range(struct VMCache* cache, uint32 firstPage, int32 nextWrapper = 0; - PageWriteWrapper* wrappers[maxPages]; + PageWriteWrapper* stackWrappers2[1]; + PageWriteWrapper** wrappers + = new(malloc_flags(allocationFlags)) PageWriteWrapper*[maxPages]; + if (wrappers == NULL) { + // don't fail, just limit our capabilities + wrappers = stackWrappers2; + maxPages = 1; + } int32 usedWrappers = 0; PageWriteTransfer transfer; @@ -3153,6 +3160,8 @@ vm_page_write_modified_page_range(struct VMCache* cache, uint32 firstPage, if (wrapperPool != stackWrappers) delete[] wrapperPool; + if (wrappers != stackWrappers2) + delete[] wrappers; return B_OK; }
comment:7 by , 11 years ago
Regarding the variable naming, I would rename the old stackWrappers
to stackWrapperPool
(to match the naming of the actual variable) and go with stackWrappers
for the new variable.
I'd also merge the two allocation checks into a single if
, like this:
PageWriteWrapper* wrapperPool = ... allocate ... PageWriteWrapper** wrappers = ... allocate ... if (wrapperPool == NULL || wrappers == NULL) { delete both use stack }
follow-up: 9 comment:8 by , 11 years ago
Thanks for the feedback. Applied with your ideas in hrev46354.
read_into_cache() has already a TODO about the stack usage. It would be interesting if there is any static analysis tool to provide stack usage information, but I doubt any exists.
follow-up: 10 comment:9 by , 11 years ago
Replying to korli:
It would be interesting if there is any static analysis tool to provide stack usage information, but I doubt any exists.
Not sure what you're looking for specifically. Coverity does have a "stack depth" checker but I have little clue how to configure it. I found this snippet in the documentation:
A.1.5.1.11. __coverity_stack_depth__(<max_memory>) Indicates to the STACK_USE checker that the function and its callees should not use more memory (in bytes) than specified by the constant integer <max_memory>. This feature is useful for situations where threads are created with different stack sizes. The primitive should be used in the thread entry-point function.
It seems perhaps you can create a "model" to declare this primitive and it will check the stack below any given function that is modeled.
follow-up: 11 comment:10 by , 11 years ago
Replying to umccullough:
It seems perhaps you can create a "model" to declare this primitive and it will check the stack below any given function that is modeled.
I suppose Coverity should fit our use case then. You can try with 8KB or less for kernel threads. We probably have to insert coverity_stack_depth in our code. Kernel entry points would be kernel threads and syscall entry points.
comment:11 by , 11 years ago
Replying to korli:
We probably have to insert coverity_stack_depth in our code. Kernel entry points would be kernel threads and syscall entry points.
I think they also provide a "model file" submission for analysis purposes, to direct Coverity how to treat certain functions so you don't have to do anything with the code.
I admit, I have only a vague idea of how it is supposed to work, but I have made you an admin if you'd like to investigate and mess with it. You can login and then visit this page: https://scan.coverity.com/projects/82?tab=Analysis+Settings
Further documentation is available here: https://scan.coverity.com/tune and here: https://scan.coverity.com/models
comment:12 by , 11 years ago
Blocking: | 7665 added |
---|
comment:13 by , 10 years ago
Milestone: | R1 → Unscheduled |
---|
Moving non-x86 related tickets out of R1 milestone.
comment:14 by , 8 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:15 by , 6 years ago
Keywords: | boot-failure added |
---|
comment:16 by , 6 years ago
Blocking: | 7665 removed |
---|
comment:17 by , 5 years ago
Keywords: | boot-failure removed |
---|---|
Summary: | Double Fault Exception on usb boot x86_64 → Check kernel functions' stack usage |
Coverity's support for this seems too limited. Instead we should probably use the GCC flag -fstack-usage
, which dumps the maximum stack usage of all functions in a file, and look over the kernel as a whole for this.
comment:18 by , 5 years ago
Below are all the functions which use more than 256 bytes of stack in the kernel (the utility I used to generate it is here.)
Likely the node_monitor should be fixed. Otherwise most of these functions using 1-2KB of stack is probably OK; they are mostly immediate syscalls that do not call much else.
file | usage | function |
user_debugger.cpp | 2336 | status_t debug_nub_thread(void*)
|
elf.cpp | 1856 | status_t UserSymbolLookup::LookupSymbolAddress(addr_t, addr_t*, const char**, const char**, bool*)
|
elf.cpp | 1648 | status_t elf_load_user_image(const char*, BKernel::Team*, uint32, addr_t*)
|
user_debugger.cpp | 1600 | status_t thread_hit_debug_event_internal(debug_debugger_message, const void*, int32, bool, bool&)
|
ddm_userland_interface.cpp | 1408 | status_t _user_find_disk_system(const char*, user_disk_system_info*)
|
safemode_settings.cpp | 1344 | status_t _user_get_safemode_option(const char*, char*, size_t*)
|
node_monitor.cpp | 1328 | status_t NodeMonitorService::NotifyEntryMoved(dev_t, ino_t, const char*, ino_t, const char*, ino_t)
|
VMAnonymousCache.cpp | 1312 | void swap_init_post_modules()
|
node_monitor.cpp | 1296 | status_t NodeMonitorService::NotifyAttributeChanged(dev_t, ino_t, ino_t, const char*, int32)
|
node_monitor.cpp | 1296 | status_t NodeMonitorService::NotifyEntryCreatedOrRemoved(int32, dev_t, ino_t, const char*, ino_t)
|
team.cpp | 1280 | status_t team_create_thread_start_internal(void*)
|
node_monitor.cpp | 1280 | status_t NodeMonitorService::NotifyStatChanged(dev_t, ino_t, ino_t, uint32)
|
elf.cpp | 1200 | void register_elf_image(elf_image_info*)
|
team.cpp | 1184 | status_t _user_exec(const char*, const char* const*, size_t, int32, int32, mode_t)
|
node_monitor.cpp | 1184 | status_t notify_query_entry_event(int32, port_id, int32, dev_t, ino_t, const char*, ino_t)
|
user_debugger.cpp | 1168 | void user_debug_image_deleted(const image_info*)
|
user_debugger.cpp | 1168 | void user_debug_image_created(const image_info*)
|
image.cpp | 1168 | status_t _user_get_next_image_info(team_id, int32*, image_info*, size_t)
|
image.cpp | 1152 | status_t _user_get_image_info(image_id, image_info*, size_t)
|
image.cpp | 1152 | image_id _user_register_image(extended_image_info*, size_t)
|
image.cpp | 1136 | void _user_image_relocated(image_id)
|
signal.cpp | 1120 | void handle_signals(BKernel::Thread*)
|
vfs_net_boot.cpp | 1088 | virtual status_t NetBootMethod::Init()
|
vfs_boot.cpp | 1072 | void vfs_mount_boot_file_system(kernel_args*)
|
hostname.cpp | 1072 | int sethostname(const char*, size_t)
|
hostname.cpp | 1072 | int gethostname(char*, size_t)
|
find_directory.cpp | 1008 | status_t __find_directory(directory_which, dev_t, bool, char*, int32)
|
file_cache.cpp | 976 | status_t write_to_cache(file_cache_ref*, void*, off_t, int32, addr_t, size_t, bool, vm_page_reservation*, size_t)
|
file_cache.cpp | 944 | status_t read_into_cache(file_cache_ref*, void*, off_t, int32, addr_t, size_t, bool, vm_page_reservation*, size_t)
|
signal.cpp | 880 | int64 _user_restore_signal_frame(signal_frame_data*)
|
vfs_net_boot.cpp | 768 | void NetStackInitializer::_ScanDevices(BPrivate::DiskDevice::KPath&)
|
vm_page.cpp | 768 | status_t vm_page_write_modified_page_range(VMCache*, uint32, uint32)
|
user_debugger.cpp | 752 | void user_debug_watchpoint_hit()
|
user_debugger.cpp | 752 | void user_debug_single_stepped()
|
user_debugger.cpp | 752 | void user_debug_breakpoint_hit(bool)
|
block_cache.cpp | 704 | status_t block_notifier_and_writer(void*)
|
block_cache.cpp | 688 | void block_cache_discard(void*, off_t, size_t)
|
block_cache.cpp | 672 | status_t cache_sync_transaction(void*, int32)
|
block_cache.cpp | 656 | status_t block_cache_sync(void*)
|
irq_routing_table.cpp | 656 | status_t choose_link_device_configurations(acpi_module_info*, IRQRoutingTable&, interrupt_available_check_function)
|
block_cache.cpp | 656 | status_t block_cache_sync_etc(void*, off_t, size_t)
|
KPartition.cpp | 640 | void BPrivate::DiskDevice::KPartition::SetVolumeID(dev_t)
|
KPartition.cpp | 640 | void BPrivate::DiskDevice::KPartition::SetDiskSystem(BPrivate::DiskDevice::KDiskSystem*, float)
|
KDiskDeviceManager.cpp | 640 | void BPrivate::DiskDevice::KDiskDeviceManager::_NotifyDeviceEvent(BPrivate::DiskDevice::KDiskDevice*, int32, uint32)
|
block_cache.cpp | 608 | static status_t {anonymous}::BlockWriter::WriteBlock({anonymous}::block_cache*, {anonymous}::cached_block*)
|
block_cache.cpp | 608 | status_t write_blocks_in_previous_transaction({anonymous}::block_cache*, {anonymous}::cache_transaction*)
|
device_manager.cpp | 592 | status_t control_device_manager(const char*, uint32, void*, size_t)
|
KPartition.cpp | 576 | virtual status_t BPrivate::DiskDevice::KPartition::PublishDevice()
|
vfs.cpp | 576 | status_t common_rename(int, char*, int, char*, bool)
|
AVLTreeBase.cpp | 576 | int AVLTreeBase::_Insert(AVLTreeNode*)
|
thread.cpp | 560 | void thread_exit()
|
IOCache.cpp | 560 | status_t IOCache::_TransferPages(size_t, size_t, bool, bool)
|
thread.cpp | 560 | status_t _user_unblock_threads(thread_id*, uint32, status_t)
|
debug.cpp | 544 | status_t _user_kernel_debugger(const char*)
|
arch_cpu.cpp | 544 | void dump_feature_string(int, cpu_ent*)
|
vfs_boot.cpp | 544 | uint32 compute_check_sum(BPrivate::DiskDevice::KDiskDevice*, off_t)
|
debug.cpp | 544 | void _user_debug_output(const char*)
|
vfs.cpp | 528 | status_t _user_read_fs_info(dev_t, fs_info*)
|
vfs.cpp | 528 | status_t _user_write_fs_info(dev_t, const fs_info*, int)
|
system_info.cpp | 496 | status_t _user_get_system_info(system_info*)
|
vfs.cpp | 496 | dev_t fs_mount(char*, const char*, const char*, uint32, const char*, bool)
|
thread.cpp | 496 | thread_id thread_create_thread(const BKernel::ThreadCreationAttributes&, bool)
|
vm.cpp | 480 | area_id vm_create_anonymous_area(team_id, const char*, addr_t, uint32, uint32, uint32, addr_t, const virtual_address_restrictions*, const physical_address_restrictions*, bool, void**)
|
arch_cpu.cpp | 480 | status_t arch_cpu_init_percpu(kernel_args*, int)
|
team.cpp | 448 | thread_id load_image_internal(char**&, size_t, int32, int32, int32, team_id, uint32, port_id, uint32)
|
vfs.cpp | 448 | status_t _user_stat_attr(int, const char*, attr_info*)
|
arch_debug.cpp | 448 | ssize_t arch_debug_gdb_get_registers(char*, size_t)
|
vm.cpp | 448 | area_id _vm_map_file(team_id, const char*, void**, uint32, size_t, uint32, uint32, bool, int, off_t, bool)
|
irq_routing_table.cpp | 432 | status_t read_irq_routing_table_recursive(acpi_module_info*, pci_module_info*, acpi_handle, uint8, IRQRoutingTable&, IRQRoutingTable&, bool, interrupt_available_check_function)
|
vfs.cpp | 416 | status_t _user_read_index_stat(dev_t, const char*, stat*)
|
irq_routing_table.cpp | 416 | status_t read_irq_routing_table(acpi_module_info*, IRQRoutingTable&, interrupt_available_check_function)
|
vfs.cpp | 416 | status_t vfs_create_special_node(const char*, fs_vnode*, mode_t, uint32, bool, fs_vnode*, vnode**)
|
ddm_userland_interface.cpp | 400 | status_t _user_get_next_disk_system_info(int32*, user_disk_system_info*)
|
vfs.cpp | 400 | int create_vnode(vnode*, const char*, int, int, bool)
|
vfs.cpp | 400 | status_t dir_vnode_to_path(vnode*, char*, size_t, bool)
|
vfs.cpp | 400 | dev_t _user_mount(const char*, const char*, const char*, uint32, const char*, size_t)
|
socket.cpp | 384 | status_t _user_get_next_socket_stat(int, uint32*, net_stat*)
|
arch_debug.cpp | 384 | int stack_trace(int, char**)
|
ddm_userland_interface.cpp | 384 | status_t _user_get_disk_system_info(disk_system_id, user_disk_system_info*)
|
vm.cpp | 384 | status_t vm_soft_fault(VMAddressSpace*, addr_t, bool, bool, bool, vm_page**)
|
team.cpp | 384 | thread_id fork_team()
|
node_monitor.cpp | 384 | status_t NodeMonitorService::NotifyMount(dev_t, dev_t, ino_t)
|
vfs.cpp | 384 | status_t vfs_write_pages(vnode*, void*, off_t, const generic_io_vec*, size_t, uint32, generic_size_t*)
|
vfs.cpp | 384 | status_t _user_create_fifo(int, const char*, mode_t)
|
vfs.cpp | 384 | status_t vfs_read_pages(vnode*, void*, off_t, const generic_io_vec*, size_t, uint32, generic_size_t*)
|
system_info.cpp | 368 | virtual void SystemNotificationService::EventOccurred(NotificationService&, const BPrivate::KMessage*)
|
vm.cpp | 368 | area_id vm_create_null_area(team_id, const char*, void**, uint32, addr_t, uint32)
|
driver_settings.cpp | 368 | void* load_driver_settings(const char*)
|
legacy_drivers.cpp | 368 | status_t legacy_driver_probe(const char*)
|
vfs.cpp | 368 | int _user_open_attr(int, const char*, const char*, uint32, int)
|
vfs.cpp | 368 | int _user_open_parent_dir(int, char*, size_t)
|
vfs.cpp | 368 | status_t normalize_path(char*, size_t, bool, bool)
|
vfs.cpp | 368 | status_t _user_entry_ref_to_path(dev_t, ino_t, const char*, char*, size_t)
|
syscalls.cpp | 352 | status_t _user_generic_syscall(const char*, uint32, void*, size_t)
|
vfs.cpp | 352 | status_t entry_ref_to_vnode(dev_t, ino_t, const char*, bool, bool, vnode**)
|
vfs_request_io.cpp | 352 | status_t do_synchronous_iterative_vnode_io(vnode*, void*, io_request*, iterative_io_get_vecs, iterative_io_finished, void*)
|
vfs.cpp | 352 | status_t vfs_init(kernel_args*)
|
vfs.cpp | 352 | status_t dir_remove(int, char*, bool)
|
device_manager.cpp | 352 | void publish_directories(const char*)
|
VMUtils.cpp | 352 | status_t get_mount_point(BPrivate::DiskDevice::KPartition*, BPrivate::DiskDevice::KPath*)
|
vfs.cpp | 336 | status_t get_vnode_name(vnode*, vnode*, char*, size_t, bool)
|
KPartition.cpp | 336 | virtual void BPrivate::DiskDevice::KPartition::Dump(bool, int32)
|
vfs.cpp | 336 | status_t common_create_link(int, char*, int, char*, bool, bool)
|
devfs.cpp | 336 | status_t devfs_open(fs_volume*, fs_vnode*, int, void**)
|
thread.cpp | 336 | void ThreadNotificationService::Notify(uint32, team_id, thread_id, BKernel::Thread*)
|
node_monitor.cpp | 336 | status_t NodeMonitorService::NotifyUnmount(dev_t)
|
devfs.cpp | 320 | status_t devfs_ioctl(fs_volume*, fs_vnode*, void*, uint32, void*, size_t)
|
vfs.cpp | 320 | status_t common_create_symlink(int, char*, const char*, int, bool)
|
vfs.cpp | 320 | ssize_t _user_write_attr(int, const char*, uint32, off_t, const void*, size_t)
|
vfs.cpp | 320 | int file_create(int, char*, int, int, bool)
|
debug.cpp | 320 | void debug_init_post_modules(kernel_args*)
|
device_manager.cpp | 320 | status_t device_node::_GetNextDriver(void*, driver_module_info*&)
|
vm.cpp | 320 | status_t vm_resize_area(area_id, size_t, bool)
|
device_manager.cpp | 320 | void init_node_tree()
|
cpu.cpp | 304 | void load_cpufreq_module()
|
vm.cpp | 304 | status_t _user_get_next_area_info(team_id, ssize_t*, area_info*)
|
file_cache.cpp | 304 | status_t file_cache_control(const char*, uint32, void*, size_t)
|
vfs.cpp | 304 | status_t _user_create_index(dev_t, const char*, uint32, uint32)
|
thread.cpp | 304 | status_t arch_thread_enter_userspace(BKernel::Thread*, addr_t, void*, void*)
|
IOSchedulerSimple.cpp | 304 | status_t IOSchedulerSimple::_Scheduler()
|
KPartition.cpp | 304 | virtual status_t BPrivate::DiskDevice::KPartition::RepublishDevice()
|
vm.cpp | 304 | status_t _user_set_memory_protection(void*, size_t, uint32)
|
signal.cpp | 304 | status_t _user_sigwait(const sigset_t*, siginfo_t*, uint32, bigtime_t)
|
vm.cpp | 304 | area_id _user_create_area(const char*, void**, uint32, size_t, uint32, uint32)
|
cpu.cpp | 304 | void load_cpuidle_module()
|
vfs.cpp | 304 | ssize_t _user_read_attr(int, const char*, off_t, void*, size_t)
|
user_debugger.cpp | 304 | void nub_thread_cleanup(BKernel::Thread*)
|
UserTimer.cpp | 304 | int32 _user_create_timer(clockid_t, thread_id, uint32, const sigevent*, const thread_creation_attributes*)
|
vfs.cpp | 304 | status_t dir_create(int, char*, int, bool)
|
vfs.cpp | 304 | int _user_open_entry_ref(dev_t, ino_t, const char*, int, int)
|
vfs.cpp | 304 | status_t common_unlink(int, char*, bool)
|
vfs.cpp | 304 | status_t _user_create_dir_entry_ref(dev_t, ino_t, const char*, int)
|
arch_cpu.cpp | 304 | status_t arch_cpu_init_post_modules(kernel_args*)
|
user_debugger.cpp | 288 | port_id install_team_debugger(team_id, port_id, thread_id, bool, bool)
|
vfs.cpp | 288 | status_t _user_remove_attr(int, const char*)
|
vm.cpp | 288 | status_t vm_set_area_protection(team_id, area_id, uint32, bool)
|
AVLTreeBase.cpp | 288 | int AVLTreeBase::_RemoveRightMostChild(AVLTreeNode**, AVLTreeNode**)
|
KPartition.cpp | 288 | virtual status_t BPrivate::DiskDevice::KPartition::GetPath(BPrivate::DiskDevice::KPath*)
|
vfs.cpp | 288 | status_t _user_remove_index(dev_t, const char*)
|
vfs.cpp | 288 | status_t common_file_io_vec_pages(vnode*, void*, const file_io_vec*, size_t, const iovec*, size_t, uint32*, size_t*, size_t*, bool)
|
image.cpp | 288 | image_id register_image(BKernel::Team*, extended_image_info*, size_t, bool)
|
socket.cpp | 288 | ssize_t _user_recvmsg(int, msghdr*, int)
|
team.cpp | 288 | pid_t _user_wait_for_child(thread_id, uint32, siginfo_t*, team_usage_info*)
|
KDiskDevice.cpp | 288 | void BPrivate::DiskDevice::KDiskDevice::_InitPartitionData()
|
vfs.cpp | 288 | int _user_open_dir_entry_ref(dev_t, ino_t, const char*)
|
module.cpp | 272 | status_t read_next_module_name(void*, char*, size_t*)
|
image.cpp | 272 | status_t unregister_image(BKernel::Team*, image_id)
|
module.cpp | 272 | status_t module_init_post_boot_device(bool)
|
vm.cpp | 272 | area_id vm_clone_area(team_id, const char*, void**, uint32, uint32, uint32, area_id, bool)
|
vm.cpp | 272 | area_id vm_copy_area(team_id, const char*, void**, uint32, uint32, area_id)
|
panic serial log