Opened 11 years ago

Closed 10 years ago

Last modified 10 years ago

#2046 closed bug (fixed)

jam on haiku-host occasionally fails with the message: vfork: Out of memory

Reported by: kaoutsis Owned by: axeld
Priority: high Milestone: R1/alpha1
Component: System/Kernel Version: R1/pre-alpha1
Keywords: Cc: black.belt.jimmy@…
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

hrev24856

for example:

~/trunk/haiku>cd src/apps/
~/trunk/haiku/src/apps>jam -q
...patience...
[...]
...found 19402 target(s)...
...updating 1218 target(s)...
C++ ../../generated/objects/haiku/x86/release/apps/cortex/support/ObservableHandler.o 
C++ ../../generated/objects/haiku/x86/release/apps/cortex/support/ObservableLooper.o 
C++ ../../generated/objects/haiku/x86/release/apps/cortex/support/observe.o 
C++ ../../generated/objects/haiku/x86/release/apps/cortex/support/SoundUtils.o 
C++ ../../generated/objects/haiku/x86/release/apps/cortex/support/TextControlFloater.o 
Archive ../../generated/objects/haiku/x86/release/apps/cortex/support/cortex_support.a 
/boot/develop/tools/gnupro/bin/ar: creating ../../generated/objects/haiku/x86/release/apps/cortex/support/cortex_support.a
Ranlib ../../generated/objects/haiku/x86/release/apps/cortex/support/cortex_support.a 
MkDir1 ../../generated/objects/haiku/x86/release/apps/3dmov 
C++ ../../generated/objects/haiku/x86/release/apps/3dmov/CubeView.o 
C++ ../../generated/objects/haiku/x86/release/apps/3dmov/GLMovApp.o 
vfork: Out of memory

Change History (26)

comment:1 Changed 11 years ago by bonefish

Get ready to buy more RAM, I'd say. Seriously, how much RAM does this machine have? Is anything reserved for kernel tracing?

comment:2 in reply to:  1 ; Changed 11 years ago by kaoutsis

Replying to bonefish:

Get ready to buy more RAM, I'd say.

i was afraid so; until then, what? :-)

Seriously, how much RAM does this machine have?

512MB

Is anything reserved for kernel tracing?

the default (1MB)

comment:3 in reply to:  2 ; Changed 11 years ago by bonefish

Replying to kaoutsis:

Replying to bonefish:

Get ready to buy more RAM, I'd say.

i was afraid so; until then, what? :-)

Hope that someone implements swap file support, or an actual vfork() (although the standard recommends making it synonymous with fork()).

Seriously, how much RAM does this machine have?

512MB

I just checked under Linux. There jam eats almost 200 MB; pretty much all of it should be heap, I suppose. When fork()ing, twice that amount needs to be committed, which means (since we don't have swap file support) you actually need to have that much available RAM. That would still leave more than 100 MB of memory in your case, which should be more than enough for Haiku, but I suppose we do still leak memory somewhere, or the computation of available memory is still broken.

comment:4 Changed 11 years ago by umccullough

I've seen the same thing happen shortly after an svn up followed by configure and jam on a machine with 1GB RAM - but I haven't really reproduced it since then either.

I just assumed it was some sort of issue with the current implementation of the VM/cache and would be resolved eventually.

comment:5 Changed 11 years ago by scottmc

Giving VMware Player 1024MB and running jam, I have seen the About box report at least 470MB used, and sometimes it jumps to 722MB or more.

comment:6 in reply to:  3 Changed 11 years ago by kaoutsis

Replying to bonefish:

I just checked under Linux. There jam eats almost 200 MB; pretty much all of it should be heap, I suppose. When fork()ing, twice that amount needs to be committed, which means (since we don't have swap file support) you actually need to have that much available RAM. That would still leave more than 100 MB of memory in your case, which should be more than enough for Haiku, but I suppose we do still leak memory somewhere, or the computation of available memory is still broken.

Some observations that might help, running Haiku hrev24864 with ProcessController during jam -q haiku-image: besides the 230MB jam takes up for itself (just before it quits due to vfork), the kernel grows from the 43MB it allocates once the desktop is loaded, progressively allocating another 100MB during the jam run, to a total of ~150MB. So, looking at the total reserved memory just before jam quits with "vfork: Out of memory", ProcessController reports approx. 450MB of total allocation.

comment:7 Changed 11 years ago by scottmc

Even with 1024MB in VMware, mine errored out with vfork after the 4600th target.

comment:8 Changed 11 years ago by kaoutsis

I made a small test on Linux:

  • logged in on a terminal; no X, no Gnome, etc.
  • swapoff /dev/hda13
  • top showed 80MB used (this number includes kernel allocations and background processes, amongst others apache2, bash, etc.)
  • jam clean and jam -q @disk completed successfully
  • jam indeed, as Ingo mentioned, needs 200 + 200 MB

comment:9 Changed 11 years ago by scottmc

It's been a couple of months, so I gave this a try on real hardware again with just 512MB, and it still errors out with the vfork() out of memory when I try to jam it. I'll see if I can bump it up to 1GB and try it again.

comment:10 Changed 11 years ago by anevilyak

For what it's worth, I run into that problem on my 1GB system also, but the outcome doesn't surprise me because there hasn't really been any work specifically addressing this problem (it's likely a VM issue).

comment:11 Changed 11 years ago by bonefish

Component: - General → System/Kernel
Milestone: R1 → R1/alpha1
Priority: normal → high

I think I understand the core of the problem, now. One issue is, as already mentioned, that jam uses a lot of heap (checking closer to the end of the build process it was about 270 MB) and when fork()ing twice that amount is reserved due to missing swap support.(*)

The second problem is that the block cache unnecessarily binds reserved memory in unused blocks. It doesn't free blocks until the low memory state is at least B_LOW_MEMORY_NOTE, which happens when only 2048 free pages are left. There are several issues. First of all, reserved memory and free pages aren't directly related. The available memory is just the lower boundary for pages that are either free or only used in caches (i.e. can be freed at any time). If, for instance, only 10 MB of available memory is left although there are more than 2048 free pages, we won't be in any low memory state. fork()ing jam at that point will fail, since an additional 270 MB would need to be reserved. At the same time, hundreds of MB could be bound in unused blocks.

Possible changes to improve the situation:

  • Low memory states might also need to be triggered by low amount of available memory.
  • The block cache needs to indicate somehow how much memory it could free, if necessary.
  • vm_try_reserve_memory() might need to trigger freeing of memory and wait.

There might also be other problems. When checking the caches situation in roughly the middle of a "jam @image" the kernel heap consisted of a 16 MB area and 11 additional 4 MB areas. This sounds like quite a bit more than it should be. Not sure to what degree this is indirectly caused by the massive caching that happens (i.e. allocations for cache structures, vnode structures, page mappings). Will examine this some more.

(*) ATM there's a bug (already fixed in my branch) that after the first fork() + exec() there's no memory reserved for jam's heap itself. Whenever the heap is resized (which should happen quite a few times during the build process) that is remedied though, and the subsequent fork() will indeed cause twice the heap size to be reserved.

comment:12 Changed 11 years ago by anevilyak

This might not necessarily be useful any more, but for reference, caches tells me the following after an svn checkout:

total committed memory: 967630848, total used pages: 253214
83254 caches (82965 root caches), sorted by page count per cache tree
0x90d759d8: pages: 51203 - tracing log
0x90d6bb40: pages: 4096 - kernel heap
0x90d6ba50: pages: 2816 - page structures
0x90d75d98: pages: 1664 - sem_table
0x90e5b0f0: pages: 1536 - radeon local memory

  • a cache tree for libbe with 73 pages committed
  • 20 or so additional RAM caches for additional heap, all 1024 pages, completely committed
  • 1024 pages for Radeon GATT
  • another 912 pages for heap
  • 854 pages for contig:/usr/home/rene/devel/haiku (where I had my checkout going), though the /usr prefix confuses me
  • 4 vnode caches, 617, 617, 602 and 602 pages
  • 512 pages for memalign area
  • 333 heap pages for team 120
  • 327 heap pages for team 117
  • 112 pages for a libroot cache tree
  • many many pages of block cache buffers, each 256 pages
  • a few others, less significant

All in all, I can confirm the block cache is definitely eating me alive also. Anything else that'd be of interest?

comment:13 Changed 11 years ago by anevilyak

Also, typing 'q' while still scrolling through that list (it was down to areas of 3 pages or so, and I decided those probably weren't important here), KDL froze entirely. Should I file a ticket for that?

comment:14 in reply to:  13 Changed 11 years ago by emitrax

Replying to anevilyak:

Also, typing 'q' while still scrolling through that list (it was down to areas of 3 pages or so, and I decided those probably weren't important here), KDL froze entirely. Should I file a ticket for that?

That's actually temporary. It has happened to me too, but eventually KDL comes back to life after a minute or so.

comment:15 in reply to:  13 ; Changed 11 years ago by bonefish

Replying to anevilyak:

Also, typing 'q' while still scrolling through that list (it was down to areas of 3 pages or so, and I decided those probably weren't important here), KDL froze entirely. Should I file a ticket for that?

Nope, it's not a bug. "q" quits the blue screen output, the command continues and prints the remaining output to the serial port only (which can take some time). You can use "a" to actually abort the command. Since the current "q" behavior is not what one wants in most cases, we might want to make it a hidden feature and make the other the default.

comment:16 in reply to:  15 Changed 11 years ago by axeld

Replying to bonefish:

Since the current "q" behavior is not what one wants in most cases, we might want to make it a hidden feature and make the other the default.

That sounds like a good idea. How about 'q' for abort, and 'h' for hide, then?

comment:17 Changed 11 years ago by anevilyak

I'm curious, under what circumstance is the 'h' actually desirable behavior? I.e. what KDL commands would you want to execute that actually have some effect beyond displaying pages of output?

comment:18 in reply to:  17 Changed 11 years ago by bonefish

Replying to anevilyak:

I'm curious, under what circumstance is the 'h' actually desirable behavior? I.e. what KDL commands would you want to execute that actually have some effect beyond displaying pages of output?

If you have a serial line connection you may want to capture extensive output for further analysis without slowing it further down by also having it printed to the screen (besides that one would first need to disable pagination). I don't do that very often, but from time to time the feature is nice to have.

comment:19 Changed 11 years ago by bonefish

The problem seems to be fixed in hrev26375 in my branch. At least a complete "jam -q @image" worked with 824 MB (1 GB minus a 200 MB tracing buffer; I also had kernel heap leak checking enabled). It might also work with less memory, but 512 MB is definitely not enough since, as written earlier, after a fork() jam's heap alone requires 2 * 270 MB. Swap file support will be required in this case.

I'll close the ticket when I've merged my branch into the trunk. If I forget, please do remind me.

comment:20 Changed 11 years ago by bbjimmy

Cc: black.belt.jimmy@… added

comment:21 Changed 11 years ago by mmlr

Ingo, as your branch is merged now, this should be closed I guess?

comment:22 Changed 11 years ago by bonefish

Resolution: fixed
Status: new → closed

Thanks for the reminder.

comment:23 Changed 10 years ago by haiqu

Resolution: fixed
Status: closed → reopened

...patience...
...patience...
...patience...
...patience...
...patience...
...patience...
...found 73860 target(s)...
...updating 8903 target(s)...
InitScript1 generated/haiku.image-init-vars
vfork: Out of memory
/boot/src/haiku>

hrev30142 on an 800Mb Duron with 640Mb memory. Note that I had Virtual Memory OFF and tried BOTH the native compiler and the cross-compiler with the same results.

I'm about to reboot and try with 1Gb of V.M.

comment:24 Changed 10 years ago by anevilyak

jam needs around 600MB of RAM just for itself while initially calculating deps. If you had swap disabled with that little RAM, this is perfectly expected.

comment:25 Changed 10 years ago by bonefish

Resolution: fixed
Status: reopened → closed

I don't quite understand the "800Mb Duron with 640Mb memory" part, but assuming your system has only 640 MB RAM I agree with Rene that this is just not enough to jam Haiku.

Also, please don't reopen this ticket. The original problems have been understood, fixed, and verified fixed. Open a new ticket, if you feel something is wrong now -- e.g. any new kernel memory leak will cause the same error eventually.

comment:26 Changed 10 years ago by haiqu

Just giving some basic details about my system, bonefish. Yes, my system "only" has 640Mb of memory. :)

Re-enabling Virtual Memory did stop this happening, although it appears memory management needs a further review. I wouldn't have expected 600Mb to be needed by Jam, and most of the time real memory usage sits at 250Mb while building, unless --use-gcc-pipe has been configured (which it wasn't at the time of this report).

Note: See TracTickets for help on using tickets.