Opened 2 months ago

Last modified 2 months ago

#18828 new bug

Can not spawn process after a while

Reported by: LupusMichaelis Owned by: nobody
Priority: normal Milestone: Unscheduled
Component: - General Version: R1/beta4
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

Notes:

  • I may use “team id” and “pid” interchangeably in this conversation
  • I'm not sure which category to post this in; if a moderator could sort this out and explain (in private) where it belongs, I'd be grateful
  • I didn't open a Trac issue, as there might already be one I didn't find; if not, please tell me and I'll open one (and tell me which module to triage it under)

I'm encountering a stability issue with HaikuOS. I'm trying to track down what's happening, so I wrote a program for that. But I'm not sure my watching app is right, so I'd like someone to give it a look and tell me if I'm wrong.

This didn't lead to any data loss.

I'm running HaikuOS (R1/beta4 hrev56578+95) in a QEMU virtual machine under Linux (Linux mothra 5.10.0-23-amd64 #1 SMP Debian 5.10.179-1 (2023-05-12) x86_64 GNU/Linux).

I'm using it daily to learn the BeOS API by writing programs and sometimes attempting to compile HaikuOS's components. To simplify my life, I downloaded and compiled Byobu, which works OK as long as it's not too demanding on the Terminal app (but that's a problem for a different day). Byobu spawns a couple of processes per second to refresh its status bar.

About every week or so, HaikuOS can't spawn processes or threads anymore. The Deskbar becomes unresponsive, but <ctrl>+<alt>+<del> (sometimes) allows a soft reboot, and all running apps are still usable.

Attempting to run a command from the Terminal app crashes it: it displays a message about fork failing and exits.

I observed that the team id keeps growing. During the past 2 months, I wrote down the uptime and the last team id reported by the ps command. Once I reached the end of my tether, I decided to monitor this growth: [collect-ps](https://gitlab.com/LupusMichaelis/belab/-/tree/trunk/collect-ps)

Once compiled, you can use it to fetch the last team id: `./collect-ps`

or watch the system: `./collect-ps -w &`

This will collect the last team id and the max team id every second, and attempt to spawn a process and a thread.

On my last attempt, the failure occurred at PID 25'465'541. This number doesn't look like anything to me. I'm waiting for the next crash.

At that point, when I ran ps in an already open Terminal, it didn't crash but printed this message:

`~/workshop/belab/todo> ps`
`-bash: fork: Unknown Device Error (-2147432385)`
`-bash: cannot make pipe for command substitution: Too many open files`

My hypothesis is that the kernel reaches an integer ceiling and overflows. But I might be wrong; maybe another resource id is exhausted.

I've attached the logs I collected.

So, in my monitoring tool, I observe a few strange problems:

  1. The team id seems stuck for a while, even though ps shows higher team ids. It's as if that value were cached at some point, but I don't have a good enough knowledge of the system, and my monitoring tool might be wrong.
  2. Sometimes the team id reverts to a previous value.

Is this a known behaviour? I didn't find anything in Trac concerning such an issue.

Attachments (4)

watch-pid2024-02-27.log (258.3 KB ) - added by LupusMichaelis 2 months ago.
watch-pid2024-02-27.log
watch-pid2024-02-27.err (600 bytes ) - added by LupusMichaelis 2 months ago.
watch-pid2024-02-27.err
watch-pid2024-02-28.log (14.4 KB ) - added by LupusMichaelis 2 months ago.
watch-pid2024-02-28.log
collect-ps.c (5.9 KB ) - added by LupusMichaelis 2 months ago.
The monitoring tool to watch team ids


Change History (6)

by LupusMichaelis, 2 months ago

Attachment: watch-pid2024-02-27.log added

watch-pid2024-02-27.log

by LupusMichaelis, 2 months ago

Attachment: watch-pid2024-02-27.err added

watch-pid2024-02-27.err

by LupusMichaelis, 2 months ago

Attachment: watch-pid2024-02-28.log added

watch-pid2024-02-28.log

by LupusMichaelis, 2 months ago

Attachment: collect-ps.c added

The monitoring tool to watch team ids

comment:1 by waddlesplash, 2 months ago

Please retest with a nightly build (you can just change your repos and full-sync with pkgman) and see if anything is different.

comment:2 by LupusMichaelis, 2 months ago

I downloaded the nightly (hrev57609) and installed it in a new VM, then compiled and launched the monitoring tool. I'm waiting for it to crash (I'll help it along by building HaikuOS from the VM every once in a while).

So far, the team id still appears stuck when activity is low.
