Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#15995 closed bug (fixed)

create_area have O(n^2) on number of allocated areas

Reported by: X512 Owned by: nobody
Priority: normal Milestone: R1/beta3
Component: System/Kernel Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

This is hrev54155.

When creating a lot areas, creating new area become slower. Also task managers (ProcessController and Slayer) become extremely slow.

Part of profile:

        hits     unknown    image
  ------------------------------------------------------------------------------
       31370           0        1 kernel_x86_64

        hits       in us    in %   image  function
  ------------------------------------------------------------------------------
       15024    15024000   46.90       1  VMUserAddressSpace::_InsertAreaSlot(unsigned long, unsigned long, unsigned long, unsigned int, unsigned long, VMUserArea*, unsigned int)
       13643    13643000   42.59       1  VMUserAddressSpace::LookupArea(unsigned long) const
         682      682000    2.13       1  x86_page_fault_exception

Problem function: https://git.haiku-os.org/haiku/tree/src/system/kernel/vm/VMUserAddressSpace.cpp#n526.

Attachments (2)

MemTest.cpp (1.1 KB ) - added by X512 5 years ago.
Test program
screenshot10.png (6.8 KB ) - added by X512 5 years ago.
ActivityMonitor memory plot

Download all attachments as: .zip

Change History (7)

by X512, 5 years ago

Attachment: MemTest.cpp added

Test program

by X512, 5 years ago

Attachment: screenshot10.png added

ActivityMonitor memory plot

comment:1 by waddlesplash, 5 years ago

Indeed, appears LookupArea() just iterates over the area list. I'm not sure why it would be O(N2) though, wouldn't that just be O(N)?

Version 0, edited 5 years ago by waddlesplash (next)

comment:2 by X512, 5 years ago

O(n^2) is time complexity of whole test (creating or deleting n areas). create_area and area_for have O(n) time complexity.

Last edited 5 years ago by X512 (previous) (diff)

comment:3 by mmlr, 4 years ago

This would be fixed with these two parts:

https://review.haiku-os.org/c/haiku/+/2839

https://review.haiku-os.org/c/haiku/+/2842

I do note that this is a rather extreme test case, albeit a nice one to work with.

This generic allocator via areas could also be optimized without the above changes. For example the insertion point hinting, implemented internally in the second change above, could be done on the application side by using B_BASE_ADDRESS and supplying the address just after the previously created area. Then, the area id could simply be stored inside or alongside the item, so the deallocation path could avoid the need for a separate area lookup via area_for(). The latter is true also with the above changes, as it removes the overhead of a syscall.

Also note, that due to ASLR the allocated areas will, purposely, not be packed densely but have randomly sized gaps. On 64 bit this is unproblematic as the address space is huge, but with 32 bit this becomes quite limiting. Then there is the heap allocator that reserves a large chunk of address space, further reducing the space where areas can be inserted into efficiently. Under 32 bit you will therefore only be able to reach around 1 instead of the expected 2 GiBs of allocations. Purely for test purposes you can get around both issues by disabling ASLR and switching the system allocator:

DISABLE_ASLR=1 LD_PRELOAD=libroot_debug.so MALLOC_DEBUG=g MemTest

With that setup you get packed areas and no hindering address space reservations and can reach the expected maximum allocation the 32 bit address space allows for.

In my tests, allocating via areas is only slightly slower than using malloc. This is somewhat surprising considering the syscall overhead and the creation and insertion of the involved structures. It either speaks for the performance of the area allocation system, or against the performance of our malloc...

comment:4 by waddlesplash, 4 years ago

Resolution: fixed
Status: newclosed

Changes merged in hrev54275.

comment:5 by nielx, 4 years ago

Milestone: UnscheduledR1/beta3

Fixed in R1/beta3

Note: See TracTickets for help on using tickets.