Opened 15 years ago

Closed 10 years ago

#5773 closed bug (duplicate)

vm_page_fault after adding graph

Reported by: andreasf Owned by: axeld
Priority: normal Milestone: R1
Component: Applications/ActivityMonitor Version: R1/Development
Keywords: Cc:
Blocked By: #6481 Blocking:
Platform: x86

Description

Occasionally, File > Add graph leads to the app not properly redrawing and unable to close, followed by a vm_page_fault.

Last seen at hrev36382.

Change History (9)

comment:1 by bonefish, 15 years ago

Can you be more specific please. vm_page_fault() is just the general page fault handler function. It is called very often during normal operation. If this is a kernel panic, please add a stack trace. If this is an application crash, please add a gdb stack trace and the relevant part of the syslog (or the complete thing).

in reply to:  1 comment:2 by andreasf, 15 years ago

It's a non-recoverable KDL. Could reproduce the hang now but not the KDL, here's the serial log, still at hrev36382:

virtual void AddOnManager::MessageReceived(BMessage*) what: vdwI
virtual void AddOnManager::MessageReceived(BMessage*) what: vdwI
Last message repeated 5 times.
Last message repeated 10 times.
Last message repeated 11 times.
PageWriteWrapper: Failed to write page 0x83a0dfb0: General system error

The window fails to redraw beneath the menu, unless where I move the mouse. The graphs are not updated and the system seems sluggish. The window can be moved but not resized or closed, while other windows are okay. The CPU goes crazy, and after killing ActivityMonitor everything seemed normal again this time.

The PageWriteWrapper error was not reproducible next time.

comment:3 by andreasf, 15 years ago

Forgot to mention that this is on real SMP hardware. But the hang happens with second CPU disabled, too.

Sometimes the third (network) graph is sufficient to trigger this, sometimes it requires over twenty adds, but the hang is very much reproducible here. Not for you?

More PageWriteWrapper errors on third try:

slab memory manager: deleting area 0xf6000000 (4452)
PageWriteWrapper: Failed to write page 0x83d02510: General system error
PageWriteWrapper: Failed to write page 0x83d0253c: General system error
PageWriteWrapper: Failed to write page 0x83d02568: General system error

in reply to:  3 comment:4 by bonefish, 15 years ago

Replying to andreasf:

Forgot to mention that this is on real SMP hardware. But the hang happens with second CPU disabled, too.

Sometimes the third (network) graph is sufficient to trigger this, sometimes it requires over twenty adds, but the hang is very much reproducible here. Not for you?

At least on real hardware I have never seen it. Though I've never tried to add 20 graphs either. I just tested a bit under qemu with hrev36408 and haven't encountered any problem either.

More PageWriteWrapper errors on third try:

slab memory manager: deleting area 0xf6000000 (4452)
PageWriteWrapper: Failed to write page 0x83d02510: General system error
PageWriteWrapper: Failed to write page 0x83d0253c: General system error
PageWriteWrapper: Failed to write page 0x83d02568: General system error

When you encounter it again please try the page command in KDL for one of the pages that could not be written and then cache for its cache (if any). Also interesting would be avail and page_stats (no parameters). Note that you can inspect the syslog from KDL via syslog | tail. Furthermore please check which thread or threads eat the CPU and get a back trace for them (bt <thread ID> in KDL).

comment:5 by andreasf, 15 years ago

0x8281fe00  89216  0xccd24200  ActivityMonitor
kdebug> threads 89216
thread         id  state     wait for   object  cpu pri  stack      team  name
0x83342800  89311  waiting   sem        326321    -  20  0xdebdb00089216  source refresh
0x83347000  89280  waiting   sem        326321    -  20  0xdebc300089216  source refresh
0x83305800  89406  waiting   sem        326321    -  20  0xdebd300089216  source refresh
0x83302000  89251  zzz                            -  20  0xdeb5b00089216  source refresh
0x83341800  89220  waiting   cvar   0x81163fdc    -  20  0xdeb2b00089216  _BMediaRoster_
0xce12b000  89376  waiting   sem        326321    -  20  0xdebc700089216  source refresh
0x83343000  89221  waiting   sem        326321    -  20  0xdeb2f00089216  source refresh
0x832cc800  89222  waiting   sem        326321    -  20  0xdeb3300089216  source refresh
0xce19e000  89223  running           -            1  15  0xdeb5300089216  w>ActivityMonitor
0x83321000  89349  waiting   sem        326321    -  20  0xdeb6300089216  source refresh
0xce1a0800  89289  waiting   sem        326321    -  20  0xdebcb00089216  source refresh
0x83345000  89258  waiting   sem        326321    -  20  0xdebab00089216  source refresh
0xce19a000  89322  waiting   sem        326321    -  20  0xdebe300089216  source refresh
0xce2d3000  89229  waiting   sem        326321    -  20  0xdeb5f00089216  source refresh
0x83303800  89417  waiting   sem        329283    -  20  0x8133000089216  source refresh
0x83346000  89264  waiting   sem        326321    -  20  0xdebb300089216  source refresh
0x83345800  89390  waiting   sem        326321    -  20  0xdebaf00089216  source refresh
0xce2d6800  89235  waiting   sem        326321    -  20  0xdeb9300089216  source refresh
0x83346800  89363  zzz                            -  20  0xdebe700089216  source refresh
0xce19c800  89301  waiting   sem        326321    -  20  0xdeb9700089216  source refresh
0xce19b800  89270  waiting   sem        326321    -  20  0xdebbb00089216  source refresh
0xce19b000  89243  zzz                            -  20  0xdeb9b00089216  source refresh
0x83327800  89337  waiting   sem        326321    -  20  0xdebeb00089216  source refresh
0x83344000  89216  waiting   cvar   0x81163c58    -  10  0xdeb4b00089216  ActivityMonitor
kdebug> bt 89223
stack trace for thread 89223 "w>ActivityMonitor"
    kernel stack: 0xdeb53000 to 0xdeb57000
      user stack: 0x700c3000 to 0x70103000
frame               caller     <image>:function + offset
 0 deb56c78 (+  48) 8005bfa4   <kernel_x86> process_pending_ici(int32: 204) + 0x0094
 1 deb56ca8 (+  16) 8005c217   <kernel_x86>:smp_intercpu_int_handler + 0x0017
 2 deb56cb8 (+  48) 8004db28   <kernel_x86>:int_io_interrupt_handler + 0x0058
 3 deb56ce8 (+  32) 800f4244   <kernel_x86> hardware_interrupt(iframe*: 0xdeb56d14) + 0x0114
 4 deb56d08 (+  12) 800f9f8d   <kernel_x86>:int_bottom + 0x003d
kernel iframe at 0xdeb56d14 (end = 0xdeb56d64)
 eax 0x0            ebx 0xcd200d70      ecx 0x0          edx 0x80000000
 esi 0x3            edi 0x82fe4258      ebp 0xdeb56d64   esp 0xdeb56d48
 eip 0x8004c786  eflags 0x10296    
 vector: 0xfd, error code: 0x0
 5 deb56d14 (+  80) 8004c786   <kernel_x86>:free_etc + 0x0006
 6 deb56d64 (+  32) 800e7b85   <kernel_x86> VMArea::~VMArea() + 0x0015
 7 deb56d84 (+  32) 800ec004   <kernel_x86> VMUserAddressSpace<0x82fe4258>::DeleteArea(VMArea*: 0xcd200d70, uint32: 0x3 (3)) + 0x0014
 8 deb56da4 (+  96) 800d7109   <kernel_x86> map_backing_store(VMAddressSpace*: NULL, VMCache*: NULL, void**: 0x19790000, int64: 2, uint32: 0x33 (51), uint32: 0x0 (09
 9 deb56e04 (+ 192) 800d8c43   <kernel_x86>:vm_create_anonymous_area + 0x02b3
10 deb56ec4 (+ 128) 800d92b1   <kernel_x86>:_user_create_area + 0x00c1
11 deb56f44 (+ 100) 800fa1c2   <kernel_x86>:handle_syscall + 0x00af
user iframe at 0xdeb56fa8 (end = 0xdeb57000)
 eax 0xb3           ebx 0x817f68        ecx 0x70102630   edx 0xffff0114
 esi 0x18000000     edi 0x19786d10      ebp 0x7010265c   esp 0xdeb56fdc
 eip 0xffff0114  eflags 0x212      user esp 0x70102630
 vector: 0x63, error code: 0x0
12 deb56fa8 (+   0) ffff0114   <commpage>:commpage_syscall + 0x0004
13 7010265c (+ 112) 007e44c8   </boot/system/lib/libroot.so@0x00744000>:unknown + 0xa04c8
14 701026cc (+  48) 007e935f   </boot/system/lib/libroot.so@0x00744000>:unknown + 0xa535f
15 701026fc (+  80) 007e9da8   </boot/system/lib/libroot.so@0x00744000>:unknown + 0xa5da8
16 7010274c (+  48) 007eb348   </boot/system/lib/libroot.so@0x00744000>:unknown + 0xa7348
17 7010277c (+  48) 007eb563   </boot/system/lib/libroot.so@0x00744000>:unknown + 0xa7563
18 701027ac (+  48) 0035058f   </boot/system/lib/libbe.so@0x00227000>:unknown + 0x12958f
19 701027dc (+ 304) 002147f0   </boot/system/apps/ActivityMonitor@0x00200000>:unknown + 0x147f0
20 7010290c (+ 464) 002149ce   </boot/system/apps/ActivityMonitor@0x00200000>:unknown + 0x149ce
21 70102adc (+ 304) 0030cbf8   </boot/system/lib/libbe.so@0x00227000>:unknown + 0xe5bf8
22 70102c0c (+  48) 00385ca8   </boot/system/lib/libbe.so@0x00227000>:unknown + 0x15eca8
23 70102c3c (+ 752) 0038fc63   </boot/system/lib/libbe.so@0x00227000>:unknown + 0x168c63
24 70102f2c (+ 112) 0038c5d7   </boot/system/lib/libbe.so@0x00227000>:unknown + 0x1655d7
25 70102f9c (+  32) 002cfe4c   </boot/system/lib/libbe.so@0x00227000>:unknown + 0xa8e4c
26 70102fbc (+  32) 0076e64e   </boot/system/lib/libroot.so@0x00744000>:unknown + 0x2a64e
27 70102fdc (+   0) 70102fec   2388290:w>ActivityMonitor_89223_stack@0x700c3000 + 0x3ffec
kdebug> avail
Available memory: 901455872/1072496640 bytes
kdebug> page_stats
page stats:
total: 261840
active: 14244 (busy: 0)
inactive: 201358 (busy: 0)
cached: 781 (busy: 0)
unused: 42 (busy: 0)
wired: 40969 (busy: 0)
modified: 0 (busy: 0)
free: 0
clear: 4446
unreserved free pages: 4446
unsatisfied page reservations: 0
mapped pages: 58183
waiting threads:

free queue: 0x801654f0, count = 0
clear queue: 0x80165504, count = 4446
modified queue: 0x801654c8, count = 0 (0 temporary, 0 swappable, inactive: 0)
active queue: 0x801654a0, count = 14244
inactive queue: 0x801654b4, count = 201358
cached queue: 0x801654dc, count = 781
kdebug> 

comment:6 by andreasf, 15 years ago

Normally with three graphs ActivityMonitor memory usage is around 1.4 MB according to ProcessController. With a dozen graphs and the hang it skyrocketed over 800 MB and kept climbing.

Overall used memory according to AboutSystem went from maybe 120 MB to only around 390 MB and stagnated there. RAM is 1 GB, swap 2 GB.

The page_writer(?) thread in the kernel_team got pretty active at first but calmed down again.

This is at hrev36408.

in reply to:  6 comment:7 by bonefish, 15 years ago

Replying to andreasf:

Normally with three graphs ActivityMonitor memory usage is around 1.4 MB according to ProcessController. With a dozen graphs and the hang it skyrocketed over 800 MB and kept climbing.

Indeed. The stack trace seems to indicate that it just tried to allocate another area and failed (probably due to lack of resources -- though avail still shows almost 900 MB available).

Overall used memory according to AboutSystem went from maybe 120 MB to only around 390 MB and stagnated there. RAM is 1 GB, swap 2 GB.

I'm not quite sure what stats AboutSystem's info is based on, but page_stats and avail (well, I forgot swap) give a detailed picture. There are only 4446 free pages and 781 cached ones. There are 40969 wired ones, which is quite a lot, unless you have kernel tracing enabled with a sizable buffer. And we have 201358 inactive pages (i.e. actually (but not very recently) used ones), which were probably mostly allocated by ActivityMonitor. An areas <team id> for ActivityMonitor would be interesting.

The page_writer(?) thread in the kernel_team got pretty active at first but calmed down again.

This is expected when a team is allocating obscene amounts of memory, as it's the one swapping out pages. The page daemon should probably be even busier in the beginning. At least the low memory situation explains why the whole system slows down. That doesn't explain the crash, but without a stack trace there's little that can be done about it.

The "PageWriteWrapper: Failed to write page" output is probably related to http://dev.haiku-os.org/browser/haiku/trunk/src/system/kernel/port.cpp#L667 and basically harmless.

comment:8 by Coldfirex, 14 years ago

FYI on hrev39551 GCC2Hybrid, if I add graphs to ActivityMonitor it never reaches the debugger, but the program stops responding completely after I add around 8 or so graphs. This is under VirtualBox with 4 CPUs.

Version 0, edited 14 years ago by Coldfirex (next)

comment:9 by pulkomandy, 10 years ago

Blocked By: 6481 added
Resolution: duplicate
Status: newclosed

Closing this since the KDL is gone (at least here I can't reproduce that part). The freeze is tracked in #6481.

Note: See TracTickets for help on using tickets.