Context Navigation

#10169 assigned bug

Check kernel functions' stack usage

Reported by:	korli	Owned by:	nobody
Priority:	normal	Milestone:	Unscheduled
Component:	System/Kernel	Version:	R1/Development
Keywords:		Cc:
Blocked By:		Blocking:
Platform:	x86-64

Description

I reproduced on real hardware and QEmu 1.4.0

Here is the QEmu command line (090c:1000 is the vendor_id:device_id for the usb key)

qemu-system-x86_64 -m 512 -usb -serial stdio  --enable-kvm -usbdevice host:090c:1000 -boot menu=on

then press F12, then select the USB device.

As you may understand this bug is a bit blocking to test x86_64 on real hardware :)

Attachments (1)

log_usb_boot_hrev46281_x86_64.txt (37.7 KB ) - added by korli 11 years ago.: panic serial log

Download all attachments as: .zip

Change History (19)

by korli, 11 years ago

Attachment:	log_usb_boot_hrev46281_x86_64.txt added

panic serial log

comment:1 by korli, 11 years ago

It seems the kernel stack is the same size on 32-bit and 64-bit, which might explain the behavior difference between architectures. The following patch fixes the problem by increasing the kernel stack size by another page.

Is it OK as is or would it be better to differentiate with B_HAIKU_64_BIT?

diff --git a/headers/private/kernel/kernel.h b/headers/private/kernel/kernel.h
index a5c01e3..d068a48 100644
--- a/headers/private/kernel/kernel.h
+++ b/headers/private/kernel/kernel.h
@@ -47,7 +47,7 @@
        // At least, you then know that the stack overflows in this case :)
 
 /** Size of the kernel stack */
-#define KERNEL_STACK_SIZE              (B_PAGE_SIZE * 3)       // 12 kB
+#define KERNEL_STACK_SIZE              (B_PAGE_SIZE * 4)       // 16 kB
 
 #ifdef DEBUG_KERNEL_STACKS
 #      define KERNEL_STACK_GUARD_PAGES 1

comment:2 by axeld, 11 years ago

IIRC it's only 3 pages because of the guard pages -- at least IIRC it used to be 8 kB. Anyway, it should be a 64-bit only change. The kernel stacks can take a lot of (unpageable) memory when you have many threads around.

follow-up: 4 comment:3 by korli, 11 years ago

Fine. But isn't the 2832 bytes stack usage in vm_page_write_modified_page_range() a bit high anyway?

ffffffff811a8010 (+2832) ffffffff8010f3b4   <kernel_x86_64> vm_page_write_modified_page_range() + 0x4cb

in reply to: 3 comment:4 by bonefish, 11 years ago

Replying to korli:

Fine. But isn't the 2832 bytes stack usage in vm_page_write_modified_page_range() a bit high anyway?

Yes, that's way too much. The main issue is

PageWriteWrapper* wrappers[maxPages];

With maxPages being up to 256, this means 2048 bytes on a 64 bit architecture. It should be allocated in the same way wrapperPool is allocated (i.e. on the heap with small stack array fallback).

Another issue is the PageWriteTransfer object allocated on the stack. The class contains the following attribute:

generic_io_vec      fVecs[32]; // TODO: make dynamic/configurable

That's another 512 bytes.

One can see several heavy stack users in the stack trace. The file cache's read_into_cache() being the most problematic, since there's no limit for nested mounting (e.g. an image file in an image file in an image file ...), with certain calls going through as many file cache layers.

I suppose many functions could use a review wrt stack usage. Even 200 bytes is quite a bit. And all code that isn't performance critical and is allowed to fail could allocate variables on the heap.

comment:5 by korli, 11 years ago

Applied a working patch in hrev46314.

comment:6 by korli, 11 years ago

Here is a patch for the wrappers stack usage issue. I have no idea better than stackWrappers2 for the variable name ATM.

Testing with a 128MB VM seemed OK. I encountered "PageWriteWrapper: Failed to write page" messages in the syslog on low memory condition with or without this patch.

diff --git a/src/system/kernel/vm/vm_page.cpp b/src/system/kernel/vm/vm_page.cpp
index ee20257..34d7370 100644
--- a/src/system/kernel/vm/vm_page.cpp
+++ b/src/system/kernel/vm/vm_page.cpp
@@ -3081,7 +3081,14 @@ vm_page_write_modified_page_range(struct VMCache* cache, uint32 firstPage,
 
        int32 nextWrapper = 0;
 
-       PageWriteWrapper* wrappers[maxPages];
+       PageWriteWrapper* stackWrappers2[1];
+       PageWriteWrapper** wrappers
+               = new(malloc_flags(allocationFlags)) PageWriteWrapper*[maxPages];
+       if (wrappers == NULL) {
+               // don't fail, just limit our capabilities
+               wrappers = stackWrappers2;
+               maxPages = 1;
+       }
        int32 usedWrappers = 0;
 
        PageWriteTransfer transfer;
@@ -3153,6 +3160,8 @@ vm_page_write_modified_page_range(struct VMCache* cache, uint32 firstPage,
 
        if (wrapperPool != stackWrappers)
                delete[] wrapperPool;
+       if (wrappers != stackWrappers2)
+               delete[] wrappers;
 
        return B_OK;
 }

comment:7 by bonefish, 11 years ago

Regarding the variable naming, I would rename the old stackWrappers to stackWrapperPool (to match the naming of the actual variable) and go with stackWrappers for the new variable.

I'd also merge the two allocation checks into a single if, like this:

PageWriteWrapper* wrapperPool = ... allocate ...
PageWriteWrapper** wrappers = ... allocate ...
if (wrapperPool == NULL || wrappers == NULL) {
  delete both
  use stack
}

follow-up: 9 comment:8 by korli, 11 years ago

Thanks for the feedback. Applied with your ideas in hrev46354.

read_into_cache() has already a TODO about the stack usage. It would be interesting if there is any static analysis tool to provide stack usage information, but I doubt any exists.

in reply to: 8 ; follow-up: 10 comment:9 by umccullough, 11 years ago

Replying to korli:

It would be interesting if there is any static analysis tool to provide stack usage information, but I doubt any exists.

Not sure what you're looking for specifically. Coverity does have a "stack depth" checker but I have little clue how to configure it. I found this snippet in the documentation:

A.1.5.1.11.  __coverity_stack_depth__(<max_memory>)

Indicates to the STACK_USE checker that the function and its callees should not use more memory (in bytes) than specified by the constant integer <max_memory>. This feature is useful for situations where threads are created with different stack sizes. The primitive should be used in the thread entry-point function.

It seems perhaps you can create a "model" to declare this primitive and it will check the stack below any given function that is modeled.

in reply to: 9 ; follow-up: 11 comment:10 by korli, 11 years ago

Replying to umccullough:

It seems perhaps you can create a "model" to declare this primitive and it will check the stack below any given function that is modeled.

I suppose Coverity should fit our use case then. You can try with 8KB or less for kernel threads. We probably have to insert coverity_stack_depth in our code. Kernel entry points would be kernel threads and syscall entry points.

in reply to: 10 comment:11 by umccullough, 11 years ago

Replying to korli:

We probably have to insert coverity_stack_depth in our code. Kernel entry points would be kernel threads and syscall entry points.

I think they also provide a "model file" submission for analysis purposes, to direct Coverity how to treat certain functions so you don't have to do anything with the code.

I admit, I have only a vague idea of how it is supposed to work, but I have made you an admin if you'd like to investigate and mess with it. You can login and then visit this page: https://scan.coverity.com/projects/82?tab=Analysis+Settings

Further documentation is available here: https://scan.coverity.com/tune and here: https://scan.coverity.com/models

comment:12 by luroh, 11 years ago

Blocking:	7665 added

comment:13 by luroh, 10 years ago

Milestone:	R1 → Unscheduled

Moving non-x86 related tickets out of R1 milestone.

comment:14 by axeld, 8 years ago

Owner:	changed from axeld to nobody
Status:	new → assigned

comment:15 by waddlesplash, 7 years ago

Keywords:	boot-failure added

comment:16 by waddlesplash, 7 years ago

Blocking:	7665 removed

comment:17 by waddlesplash, 6 years ago

Keywords:	boot-failure removed
Summary:	Double Fault Exception on usb boot x86_64 → Check kernel functions' stack usage

Coverity's support for this seems too limited. Instead we should probably use the GCC flag -fstack-usage, which dumps the maximum stack usage of all functions in a file, and look over the kernel as a whole for this.

comment:18 by waddlesplash, 5 years ago

Below are all the functions which use more than 256 bytes of stack in the kernel (the utility I used to generate it is here.)

Likely the node_monitor should be fixed. Otherwise most of these functions using 1-2KB of stack is probably OK; they are mostly immediate syscalls that do not call much else.

file	usage	function
user_debugger.cpp	2336	`status_t debug_nub_thread(void*)`
elf.cpp	1856	`status_t UserSymbolLookup::LookupSymbolAddress(addr_t, addr_t, const char, const char, bool)`
elf.cpp	1648	`status_t elf_load_user_image(const char, BKernel::Team, uint32, addr_t*)`
user_debugger.cpp	1600	`status_t thread_hit_debug_event_internal(debug_debugger_message, const void*, int32, bool, bool&)`
ddm_userland_interface.cpp	1408	`status_t _user_find_disk_system(const char, user_disk_system_info)`
safemode_settings.cpp	1344	`status_t _user_get_safemode_option(const char, char, size_t*)`
node_monitor.cpp	1328	`status_t NodeMonitorService::NotifyEntryMoved(dev_t, ino_t, const char, ino_t, const char, ino_t)`
VMAnonymousCache.cpp	1312	`void swap_init_post_modules()`
node_monitor.cpp	1296	`status_t NodeMonitorService::NotifyAttributeChanged(dev_t, ino_t, ino_t, const char*, int32)`
node_monitor.cpp	1296	`status_t NodeMonitorService::NotifyEntryCreatedOrRemoved(int32, dev_t, ino_t, const char*, ino_t)`
team.cpp	1280	`status_t team_create_thread_start_internal(void*)`
node_monitor.cpp	1280	`status_t NodeMonitorService::NotifyStatChanged(dev_t, ino_t, ino_t, uint32)`
elf.cpp	1200	`void register_elf_image(elf_image_info*)`
team.cpp	1184	`status_t _user_exec(const char, const char const*, size_t, int32, int32, mode_t)`
node_monitor.cpp	1184	`status_t notify_query_entry_event(int32, port_id, int32, dev_t, ino_t, const char*, ino_t)`
user_debugger.cpp	1168	`void user_debug_image_deleted(const image_info*)`
user_debugger.cpp	1168	`void user_debug_image_created(const image_info*)`
image.cpp	1168	`status_t _user_get_next_image_info(team_id, int32, image_info, size_t)`
image.cpp	1152	`status_t _user_get_image_info(image_id, image_info*, size_t)`
image.cpp	1152	`image_id _user_register_image(extended_image_info*, size_t)`
image.cpp	1136	`void _user_image_relocated(image_id)`
signal.cpp	1120	`void handle_signals(BKernel::Thread*)`
vfs_net_boot.cpp	1088	`virtual status_t NetBootMethod::Init()`
vfs_boot.cpp	1072	`void vfs_mount_boot_file_system(kernel_args*)`
hostname.cpp	1072	`int sethostname(const char*, size_t)`
hostname.cpp	1072	`int gethostname(char*, size_t)`
find_directory.cpp	1008	`status_t __find_directory(directory_which, dev_t, bool, char*, int32)`
file_cache.cpp	976	`status_t write_to_cache(file_cache_ref, void, off_t, int32, addr_t, size_t, bool, vm_page_reservation*, size_t)`
file_cache.cpp	944	`status_t read_into_cache(file_cache_ref, void, off_t, int32, addr_t, size_t, bool, vm_page_reservation*, size_t)`
signal.cpp	880	`int64 _user_restore_signal_frame(signal_frame_data*)`
vfs_net_boot.cpp	768	`void NetStackInitializer::_ScanDevices(BPrivate::DiskDevice::KPath&)`
vm_page.cpp	768	`status_t vm_page_write_modified_page_range(VMCache*, uint32, uint32)`
user_debugger.cpp	752	`void user_debug_watchpoint_hit()`
user_debugger.cpp	752	`void user_debug_single_stepped()`
user_debugger.cpp	752	`void user_debug_breakpoint_hit(bool)`
block_cache.cpp	704	`status_t block_notifier_and_writer(void*)`
block_cache.cpp	688	`void block_cache_discard(void*, off_t, size_t)`
block_cache.cpp	672	`status_t cache_sync_transaction(void*, int32)`
block_cache.cpp	656	`status_t block_cache_sync(void*)`
irq_routing_table.cpp	656	`status_t choose_link_device_configurations(acpi_module_info*, IRQRoutingTable&, interrupt_available_check_function)`
block_cache.cpp	656	`status_t block_cache_sync_etc(void*, off_t, size_t)`
KPartition.cpp	640	`void BPrivate::DiskDevice::KPartition::SetVolumeID(dev_t)`
KPartition.cpp	640	`void BPrivate::DiskDevice::KPartition::SetDiskSystem(BPrivate::DiskDevice::KDiskSystem*, float)`
KDiskDeviceManager.cpp	640	`void BPrivate::DiskDevice::KDiskDeviceManager::_NotifyDeviceEvent(BPrivate::DiskDevice::KDiskDevice*, int32, uint32)`
block_cache.cpp	608	`static status_t {anonymous}::BlockWriter::WriteBlock({anonymous}::block_cache, {anonymous}::cached_block)`
block_cache.cpp	608	`status_t write_blocks_in_previous_transaction({anonymous}::block_cache, {anonymous}::cache_transaction)`
device_manager.cpp	592	`status_t control_device_manager(const char, uint32, void, size_t)`
KPartition.cpp	576	`virtual status_t BPrivate::DiskDevice::KPartition::PublishDevice()`
vfs.cpp	576	`status_t common_rename(int, char, int, char, bool)`
AVLTreeBase.cpp	576	`int AVLTreeBase::_Insert(AVLTreeNode*)`
thread.cpp	560	`void thread_exit()`
IOCache.cpp	560	`status_t IOCache::_TransferPages(size_t, size_t, bool, bool)`
thread.cpp	560	`status_t _user_unblock_threads(thread_id*, uint32, status_t)`
debug.cpp	544	`status_t _user_kernel_debugger(const char*)`
arch_cpu.cpp	544	`void dump_feature_string(int, cpu_ent*)`
vfs_boot.cpp	544	`uint32 compute_check_sum(BPrivate::DiskDevice::KDiskDevice*, off_t)`
debug.cpp	544	`void _user_debug_output(const char*)`
vfs.cpp	528	`status_t _user_read_fs_info(dev_t, fs_info*)`
vfs.cpp	528	`status_t _user_write_fs_info(dev_t, const fs_info*, int)`
system_info.cpp	496	`status_t _user_get_system_info(system_info*)`
vfs.cpp	496	`dev_t fs_mount(char, const char, const char, uint32, const char, bool)`
thread.cpp	496	`thread_id thread_create_thread(const BKernel::ThreadCreationAttributes&, bool)`
vm.cpp	480	`area_id vm_create_anonymous_area(team_id, const char, addr_t, uint32, uint32, uint32, addr_t, const virtual_address_restrictions, const physical_address_restrictions, bool, void*)`
arch_cpu.cpp	480	`status_t arch_cpu_init_percpu(kernel_args*, int)`
team.cpp	448	`thread_id load_image_internal(char**&, size_t, int32, int32, int32, team_id, uint32, port_id, uint32)`
vfs.cpp	448	`status_t _user_stat_attr(int, const char, attr_info)`
arch_debug.cpp	448	`ssize_t arch_debug_gdb_get_registers(char*, size_t)`
vm.cpp	448	`area_id _vm_map_file(team_id, const char, void*, uint32, size_t, uint32, uint32, bool, int, off_t, bool)`
irq_routing_table.cpp	432	`status_t read_irq_routing_table_recursive(acpi_module_info, pci_module_info, acpi_handle, uint8, IRQRoutingTable&, IRQRoutingTable&, bool, interrupt_available_check_function)`
vfs.cpp	416	`status_t _user_read_index_stat(dev_t, const char, stat)`
irq_routing_table.cpp	416	`status_t read_irq_routing_table(acpi_module_info*, IRQRoutingTable&, interrupt_available_check_function)`
vfs.cpp	416	`status_t vfs_create_special_node(const char, fs_vnode, mode_t, uint32, bool, fs_vnode, vnode*)`
ddm_userland_interface.cpp	400	`status_t _user_get_next_disk_system_info(int32, user_disk_system_info)`
vfs.cpp	400	`int create_vnode(vnode, const char, int, int, bool)`
vfs.cpp	400	`status_t dir_vnode_to_path(vnode, char, size_t, bool)`
vfs.cpp	400	`dev_t _user_mount(const char, const char, const char, uint32, const char, size_t)`
socket.cpp	384	`status_t _user_get_next_socket_stat(int, uint32, net_stat)`
arch_debug.cpp	384	`int stack_trace(int, char**)`
ddm_userland_interface.cpp	384	`status_t _user_get_disk_system_info(disk_system_id, user_disk_system_info*)`
vm.cpp	384	`status_t vm_soft_fault(VMAddressSpace, addr_t, bool, bool, bool, vm_page*)`
team.cpp	384	`thread_id fork_team()`
node_monitor.cpp	384	`status_t NodeMonitorService::NotifyMount(dev_t, dev_t, ino_t)`
vfs.cpp	384	`status_t vfs_write_pages(vnode, void, off_t, const generic_io_vec, size_t, uint32, generic_size_t)`
vfs.cpp	384	`status_t _user_create_fifo(int, const char*, mode_t)`
vfs.cpp	384	`status_t vfs_read_pages(vnode, void, off_t, const generic_io_vec, size_t, uint32, generic_size_t)`
system_info.cpp	368	`virtual void SystemNotificationService::EventOccurred(NotificationService&, const BPrivate::KMessage*)`
vm.cpp	368	`area_id vm_create_null_area(team_id, const char, void*, uint32, addr_t, uint32)`
driver_settings.cpp	368	`void* load_driver_settings(const char*)`
legacy_drivers.cpp	368	`status_t legacy_driver_probe(const char*)`
vfs.cpp	368	`int _user_open_attr(int, const char, const char, uint32, int)`
vfs.cpp	368	`int _user_open_parent_dir(int, char*, size_t)`
vfs.cpp	368	`status_t normalize_path(char*, size_t, bool, bool)`
vfs.cpp	368	`status_t _user_entry_ref_to_path(dev_t, ino_t, const char, char, size_t)`
syscalls.cpp	352	`status_t _user_generic_syscall(const char, uint32, void, size_t)`
vfs.cpp	352	`status_t entry_ref_to_vnode(dev_t, ino_t, const char, bool, bool, vnode*)`
vfs_request_io.cpp	352	`status_t do_synchronous_iterative_vnode_io(vnode, void, io_request, iterative_io_get_vecs, iterative_io_finished, void)`
vfs.cpp	352	`status_t vfs_init(kernel_args*)`
vfs.cpp	352	`status_t dir_remove(int, char*, bool)`
device_manager.cpp	352	`void publish_directories(const char*)`
VMUtils.cpp	352	`status_t get_mount_point(BPrivate::DiskDevice::KPartition, BPrivate::DiskDevice::KPath)`
vfs.cpp	336	`status_t get_vnode_name(vnode, vnode, char*, size_t, bool)`
KPartition.cpp	336	`virtual void BPrivate::DiskDevice::KPartition::Dump(bool, int32)`
vfs.cpp	336	`status_t common_create_link(int, char, int, char, bool, bool)`
devfs.cpp	336	`status_t devfs_open(fs_volume, fs_vnode, int, void**)`
thread.cpp	336	`void ThreadNotificationService::Notify(uint32, team_id, thread_id, BKernel::Thread*)`
node_monitor.cpp	336	`status_t NodeMonitorService::NotifyUnmount(dev_t)`
devfs.cpp	320	`status_t devfs_ioctl(fs_volume, fs_vnode, void, uint32, void, size_t)`
vfs.cpp	320	`status_t common_create_symlink(int, char, const char, int, bool)`
vfs.cpp	320	`ssize_t _user_write_attr(int, const char, uint32, off_t, const void, size_t)`
vfs.cpp	320	`int file_create(int, char*, int, int, bool)`
debug.cpp	320	`void debug_init_post_modules(kernel_args*)`
device_manager.cpp	320	`status_t device_node::_GetNextDriver(void, driver_module_info&)`
vm.cpp	320	`status_t vm_resize_area(area_id, size_t, bool)`
device_manager.cpp	320	`void init_node_tree()`
cpu.cpp	304	`void load_cpufreq_module()`
vm.cpp	304	`status_t _user_get_next_area_info(team_id, ssize_t, area_info)`
file_cache.cpp	304	`status_t file_cache_control(const char, uint32, void, size_t)`
vfs.cpp	304	`status_t _user_create_index(dev_t, const char*, uint32, uint32)`
thread.cpp	304	`status_t arch_thread_enter_userspace(BKernel::Thread, addr_t, void, void*)`
IOSchedulerSimple.cpp	304	`status_t IOSchedulerSimple::_Scheduler()`
KPartition.cpp	304	`virtual status_t BPrivate::DiskDevice::KPartition::RepublishDevice()`
vm.cpp	304	`status_t _user_set_memory_protection(void*, size_t, uint32)`
signal.cpp	304	`status_t _user_sigwait(const sigset_t, siginfo_t, uint32, bigtime_t)`
vm.cpp	304	`area_id _user_create_area(const char, void*, uint32, size_t, uint32, uint32)`
cpu.cpp	304	`void load_cpuidle_module()`
vfs.cpp	304	`ssize_t _user_read_attr(int, const char, off_t, void, size_t)`
user_debugger.cpp	304	`void nub_thread_cleanup(BKernel::Thread*)`
UserTimer.cpp	304	`int32 _user_create_timer(clockid_t, thread_id, uint32, const sigevent, const thread_creation_attributes)`
vfs.cpp	304	`status_t dir_create(int, char*, int, bool)`
vfs.cpp	304	`int _user_open_entry_ref(dev_t, ino_t, const char*, int, int)`
vfs.cpp	304	`status_t common_unlink(int, char*, bool)`
vfs.cpp	304	`status_t _user_create_dir_entry_ref(dev_t, ino_t, const char*, int)`
arch_cpu.cpp	304	`status_t arch_cpu_init_post_modules(kernel_args*)`
user_debugger.cpp	288	`port_id install_team_debugger(team_id, port_id, thread_id, bool, bool)`
vfs.cpp	288	`status_t _user_remove_attr(int, const char*)`
vm.cpp	288	`status_t vm_set_area_protection(team_id, area_id, uint32, bool)`
AVLTreeBase.cpp	288	`int AVLTreeBase::_RemoveRightMostChild(AVLTreeNode, AVLTreeNode)`
KPartition.cpp	288	`virtual status_t BPrivate::DiskDevice::KPartition::GetPath(BPrivate::DiskDevice::KPath*)`
vfs.cpp	288	`status_t _user_remove_index(dev_t, const char*)`
vfs.cpp	288	`status_t common_file_io_vec_pages(vnode, void, const file_io_vec, size_t, const iovec, size_t, uint32, size_t, size_t*, bool)`
image.cpp	288	`image_id register_image(BKernel::Team, extended_image_info, size_t, bool)`
socket.cpp	288	`ssize_t _user_recvmsg(int, msghdr*, int)`
team.cpp	288	`pid_t _user_wait_for_child(thread_id, uint32, siginfo_t, team_usage_info)`
KDiskDevice.cpp	288	`void BPrivate::DiskDevice::KDiskDevice::_InitPartitionData()`
vfs.cpp	288	`int _user_open_dir_entry_ref(dev_t, ino_t, const char*)`
module.cpp	272	`status_t read_next_module_name(void, char, size_t*)`
image.cpp	272	`status_t unregister_image(BKernel::Team*, image_id)`
module.cpp	272	`status_t module_init_post_boot_device(bool)`
vm.cpp	272	`area_id vm_clone_area(team_id, const char, void*, uint32, uint32, uint32, area_id, bool)`
vm.cpp	272	`area_id vm_copy_area(team_id, const char, void*, uint32, uint32, area_id)`

Note: See TracTickets for help on using tickets.

Download in other formats: