Opened 15 years ago
Last modified 5 years ago
#5790 assigned enhancement
Use "vmem" or a similar system for allocation of IDs
Reported by: | mmlr | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
On the WebPositive svn/trac/build server, which is still running on a hrev35294 kernel, I've got this panic after an uptime of way over a month:
out of ports, but sUsedPorts is broken
Sadly I didn't have a working keyboard at that time so I couldn't really dig into why this happened. Since this machine is doing a clean WebPositive build on each new revision, including building the webkit parts and all generated stuff, besides hosting the WebPositive svn and trac and serving the nightlies, as well as being regularly ssh brute-force attacked, this machine is in rather heavy use. Due to the long uptime there might have been some resource that overflowed (like area ids or some such), though it shouldn't have happened quite that "quickly".
Change History (17)
comment:1 by , 15 years ago
follow-up: 3 comment:2 by , 15 years ago
The port ID computation is actually pretty stupid, and can advance rapidly:
    // make the port_id be a multiple of the slot it's in
    if (i >= sNextPort % sMaxPorts)
        sNextPort += i - sNextPort % sMaxPorts;
    else
        sNextPort += sMaxPorts - (sNextPort % sMaxPorts - i);
Still, the code looks reasonably safe against overflows, so I think sUsedPorts is actually broken.
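To illustrate how quickly the counter can advance, here is a minimal stand-alone model of the quoted computation. The function name `advance_to_slot` and its parameters are made up for this sketch and are not part of the kernel; it uses `int64_t` so that the model itself cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the quoted snippet, not the actual kernel code.
// Given the current counter value and the index of the free slot that was
// found, returns the new counter value, which is congruent to the slot
// index modulo maxPorts.
int64_t
advance_to_slot(int64_t nextPort, int32_t slot, int32_t maxPorts)
{
	assert(maxPorts > 0 && slot >= 0 && slot < maxPorts);

	int64_t current = nextPort % maxPorts;
	if (slot >= current) {
		// The free slot lies ahead in the current lap: small step.
		nextPort += slot - current;
	} else {
		// The free slot lies behind: skip into the next lap, which costs
		// almost maxPorts IDs for a single allocation.
		nextPort += maxPorts - (current - slot);
	}
	return nextPort;
}
```

With an assumed table size of 4096, a counter at 5 that has to use slot 3 jumps to 4099, so a single allocation can consume nearly a whole table's worth of IDs.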
follow-up: 4 comment:3 by , 15 years ago
Replying to axeld:
Still, the code looks reasonably safe against overflows, so I think sUsedPorts is actually broken.
I don't see any code that'd deal with wrapping to negative port ids though (as port_id is an int32). Also, what happens if the port id happens to become -1, which is used to indicate a free slot? (I didn't actually try to figure out if this can happen though.) The concept of sUsedPorts looks pretty simple to me, so I don't see where it could really go wrong.
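The wrap can be shown with a small sketch. The helper `wrapping_add` is hypothetical; it does the addition in unsigned arithmetic because signed overflow is undefined behavior in C++, and it assumes a two's complement conversion back to `int32_t` (guaranteed since C++20, and what common compilers do before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: adds step to id the way a wrapping 32-bit counter
// would. The sum is computed as uint32_t to avoid undefined behavior, then
// converted back (assumes two's complement conversion).
int32_t
wrapping_add(int32_t id, int32_t step)
{
	assert(step >= 0);
	uint32_t sum = static_cast<uint32_t>(id) + static_cast<uint32_t>(step);
	return static_cast<int32_t>(sum);
}
```

Under that assumption, stepping past INT32_MAX lands on INT32_MIN, and a counter that keeps going through the negative range can reach -1, the value used here to mark a free slot.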
comment:4 by , 15 years ago
Replying to mmlr:
I don't see any code that'd deal with wrapping to negative port ids though (as port_id is an int32).
Indeed. The same problem exists for various other ID generating kernel services. Fixes welcome. :-)
comment:5 by , 15 years ago
Damn it, I didn't think about it being int32... but yes, that would be a good cause for this problem.
comment:6 by , 15 years ago
After a bit more than ten days of uptime the port ids are already at 544968881+, so it is quite likely that after the 30+ days of uptime they wrapped. Anything linking to libbe currently uses up the 3 default reply ports of the static BMessage initialization. Since the server is flooded by the ssh bruteforce attacks, which spawn sshd instances that happen to produce these reply ports, it sounds like a reasonable explanation. On the one hand, the static BMessage initialization could be made more lazy; on the other, we should probably come up with a way to handle the id reuse case in general.
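As a rough check of that estimate, the following sketch extrapolates the observed value linearly. The function name is made up, and the constant-rate and start-at-zero assumptions are simplifications, not something measured on the server.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical back-of-the-envelope estimate: assumes the ID counter
// started near 0 and grows at a constant rate.
double
estimate_days_until_wrap(int32_t currentId, double uptimeDays)
{
	assert(currentId > 0 && uptimeDays > 0);

	double idsPerDay = currentId / uptimeDays;
	return std::numeric_limits<int32_t>::max() / idsPerDay;
}
```

Calling `estimate_days_until_wrap(544968881, 10.0)` gives roughly 39 days until the positive int32 range is exhausted, which fits an uptime of "way over a month".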
follow-up: 8 comment:7 by , 15 years ago
As Rene points out, sshd doesn't actually link to libbe, but it links to libnetwork which in turn links to libbe, so that's where this is coming from. Since wget does also link to libnetwork and is run in a loop to test the availability of the trac instance, that'd be another regular port id consumer.
comment:8 by , 15 years ago
Milestone: | R1 → R1/alpha2 |
---|
Replying to mmlr:
As Rene points out, sshd doesn't actually link to libbe, but it links to libnetwork which in turn links to libbe, so that's where this is coming from.
I guess hrev28825 totally slipped by me. Apparently someone (no names :-)) reintroduced the libbe dependency after I removed it in hrev25485. Since this only concerns the private API start_watching_network() functions (which play with BMessengers), I'm very much in favor of removing the dependency again, either by simply making the functions inline (and only providing a port+token non-inline version) or by moving them to libbnetapi.
Anyway, regarding the ID overflow issues, I'm moving the ticket to the R1/alpha2 milestone to add further incentive to solve it soon. :-)
follow-up: 10 comment:9 by , 15 years ago
I thought about using a list of sorted "free ranges" that can be extended/joined on freeing an id and updated/removed on id allocation. Could be made generic and used for area_id as well.
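A minimal user-space sketch of such a free-range list follows, assuming a `std::map` from range start to inclusive range end. The class name `FreeRangeIdPool` and its methods are hypothetical, and a kernel version would need its own container and locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical sketch: keeps the free IDs as sorted, non-adjacent ranges.
class FreeRangeIdPool {
public:
	FreeRangeIdPool(int32_t first, int32_t last)
	{
		assert(first >= 0 && first <= last);
		fFree[first] = last;
	}

	// Returns the lowest free ID, or -1 if the pool is exhausted.
	int32_t Allocate()
	{
		if (fFree.empty())
			return -1;

		auto it = fFree.begin();
		int32_t id = it->first;
		int32_t last = it->second;
		fFree.erase(it);
		if (id < last)
			fFree[id + 1] = last;	// shrink the range from the front
		return id;
	}

	// Returns an ID to the pool; the ID must currently be allocated.
	void Free(int32_t id)
	{
		int32_t first = id;
		int32_t last = id;

		auto next = fFree.upper_bound(id);
		if (next != fFree.begin()) {
			auto prev = std::prev(next);
			assert(prev->second < id);	// otherwise the ID was already free
			if (prev->second + 1 == id) {
				// Join with the range that ends right before id.
				first = prev->first;
				fFree.erase(prev);
			}
		}
		if (next != fFree.end() && next->first - 1 == id) {
			// Join with the range that starts right after id.
			last = next->second;
			fFree.erase(next);
		}
		fFree[first] = last;
	}

	size_t CountRanges() const { return fFree.size(); }

private:
	std::map<int32_t, int32_t> fFree;	// range start -> inclusive range end
};
```

One design point to be aware of: this sketch hands out the lowest free ID first, so a freed ID is reused right away. If stale IDs held by clients are a concern, a different allocation order would be needed.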
comment:10 by , 15 years ago
Replying to mmlr:
I thought about using a list of sorted "free ranges" that can be extended/joined on freeing an id and updated/removed on id allocation. Could be made generic and used for area_id as well.
You might want to have a look at Bonwick's resource allocator (don't have a link at hand, but shouldn't be too hard to find -- usually in combination with the slab allocator), which was invented to do pretty much exactly that. Though, a simple solution -- like an increment + lookup loop until free spot found -- should work well enough (at least for alpha 2, where I wouldn't want to introduce larger amounts of untested code anymore), particularly in the cases where the domain (positive int32) is several orders of magnitude greater than the total count limit.
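The simple "increment + lookup loop" variant could look roughly like the sketch below. The class name, the `std::unordered_set` used as the lookup structure, and the `maxInUse` limit are all assumptions made for this example; the kernel would look the ID up in its own tables.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <unordered_set>

// Hypothetical sketch: hands out IDs from 1..INT32_MAX by incrementing a
// counter and skipping values that are still in use after a wrap.
class IncrementingIdAllocator {
public:
	explicit IncrementingIdAllocator(size_t maxInUse, int32_t firstId = 1)
		:
		fMaxInUse(maxInUse),
		fNext(firstId)
	{
		// The in-use limit must be smaller than the ID domain, otherwise
		// the search loop in Allocate() might never find a free value.
		assert(maxInUse > 0
			&& maxInUse < (size_t)std::numeric_limits<int32_t>::max());
		assert(firstId >= 1);
	}

	// Returns a free ID, or -1 if the in-use limit has been reached.
	int32_t Allocate()
	{
		if (fUsed.size() >= fMaxInUse)
			return -1;

		while (fUsed.count(fNext) != 0)
			Advance();

		int32_t id = fNext;
		fUsed.insert(id);
		Advance();
		return id;
	}

	void Free(int32_t id) { fUsed.erase(id); }

private:
	// Moves to the next candidate, wrapping to 1 instead of going negative.
	void Advance()
	{
		if (fNext == std::numeric_limits<int32_t>::max())
			fNext = 1;
		else
			fNext++;
	}

	size_t fMaxInUse;
	int32_t fNext;
	std::unordered_set<int32_t> fUsed;
};
```

Because the count limit is far below the size of the positive int32 domain, the loop normally finds a free value after very few probes, which is the argument made above for why this is good enough in practice.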
comment:11 by , 15 years ago
Here's the paper describing the resource allocator (which is called Vmem for some reason) in chapter 4. The kernel address space management was somewhat inspired by the design.
comment:13 by , 14 years ago
Milestone: | R1/alpha3 → R1/beta1 |
---|
comment:14 by , 10 years ago
The *BSD implementation of vmem (under a 2-clause BSD license): http://www.leidinger.net/FreeBSD/dox/kern/html/d8/d5d/subr__vmem_8c_source.html
comment:15 by , 10 years ago
Milestone: | R1/beta1 → Unscheduled |
---|---|
Type: | bug → enhancement |
It seems http://cgit.haiku-os.org/haiku/diff/src/system/kernel/port.cpp?id=24df65921befcd0ad0c5c7866118f922da61cb96 changed the port ID computation and the new code handles overflows. This solves the initial problem.
I'm making this an enhancement ticket and moving it out of beta1, since it would still be better to use vmem for allocation of port, area, and process IDs (and possibly in other places).
comment:16 by , 8 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:17 by , 5 years ago
Summary: | panic: out of ports, but sUsedPorts is broken → Use "vmem" or a similar system for allocation of IDs |
---|
About 6:30AM GMT I tried to svn checkout the trunk - mid way through it stopped and the site hasn't worked since, apologies if my use killed it.