Opened 4 years ago

Last modified 4 years ago

#16787 new enhancement

Library handling

Reported by: lelldorin Owned by: nobody
Priority: normal Milestone: Unscheduled
Component: - General Version: R1/beta2
Keywords: Libraries, handling Cc:
Blocked By: Blocking:
Platform: All

Description

Wouldn't it make sense to keep old libraries available in parallel with new ones for half a year, instead of simply replacing them? Programs often stop working after an update and their recipes have to be adapted because library versions have changed. If both versions were available side by side for a while, one could react without a program being broken for a long time or triggering endless questions and error reports.

Change History (19)

comment:1 by axeld, 4 years ago

Yes, that might be useful, and in fact, we're already doing this -- for some selected libraries that are supposed to live side by side (like SDL 1 vs. SDL 2).

However, our package management can only keep two packages of the same library side by side if they use different names. Unfortunately, this makes the approach impractical as a general solution.

comment:2 by X512, 4 years ago

I like the idea from GoboLinux, where each package is installed in a separate directory. Dependencies can be provided as symbolic links.

comment:3 by pulkomandy, 4 years ago

I think this is to be handled by the haikuports team. They already do this for some packages (for example libpng) without problems. They simply have packages with different names for different versions (libpng12, libpng16, etc).

We should fix the dependency solving to allow multiple versions of a package. This is not a problem of the package manager itself, only of the dependency solving in libsolv, which tries to keep a unique version of each package basename. But it is not an easy issue for us: the simple fix (ignoring the name) would result in all versions of a library or package remaining installed. Also, this is possible for some packages, but not all of them. Having multiple versions of the haiku package installed would surely create all kinds of "fun and interesting" problems.

So, I think the best plan of action here is first to have the people at haikuports do the right thing and be careful about this when they update packages: check if the sonames for packages have changed, if so, create a separate recipe and make sure proper version constraints are set for dependencies.

We can solve some things on Haiku side, but there will be no magic solution if the people at haikuports keep deleting "old" versions of libraries while they are still needed by other packages.

comment:4 by X512, 4 years ago

We can solve some things on Haiku side, but there will be no magic solution if the people at haikuports keep deleting "old" versions of libraries while they are still needed by other packages.

In the worst case, old packages can be archived and recovered outside of HaikuPorts. Old packages may still be installed on someone's system and can be shared from there.

I think that relying heavily on HaikuPorts being used correctly is a bad idea. Packages can also be provided by third-party developers without source code, so they can't be easily fixed.

comment:5 by X512, 4 years ago

The following solution could be used. Consider two packages: libsomelib-1.0.0-x86.hpkg and someapp-1.0.0-x86.hpkg.

Directory structure:

libsomelib
	lib
		libsomelib.so
someapp
	apps
		SomeApp

/boot/system/packages
	libsomelib-1.0.0-x86.hpkg
	someapp-1.0.0-x86.hpkg
/boot/system/package-data
	libsomelib-1.0.0-x86
		lib
			libsomelib.so
	someapp-1.0.0-x86
		apps
			SomeApp
			lib
				libsomelib.so -> /boot/system/package-data/libsomelib-1.0.0-x86/lib/libsomelib.so // generated by package_fs

package_fs would map all needed files from imported packages into the current package's directory as symbolic links, somewhat like symbol resolution for executables.

It would also improve OS boot speed (in my tests, mounting package_fs takes several seconds) and reduce the memory usage of package_fs, because directory merging would no longer be required. package_fs could look up a specific package when its package directory is accessed.

Last edited 4 years ago by X512 (previous) (diff)

comment:6 by pulkomandy, 4 years ago

How is that different from what packagefs already does in /package-links? Also, how would generating fake symlinks be more efficient than just making the files directly available? Wouldn't it only result in more levels of indirection and more round trips between userland and kernel?

Also, I don't see how it is related to the problem we are discussing here, which is, what happens when there is an update to libsomelib-1.0.1-x86.hpkg and it is not ABI compatible with libsomelib-1.0.0-x86.hpkg?

We handle the dependencies at the library soname level. Assuming sonames are properly set, you can end up with the following problems:

If the previous version (libsomelib-1.0.0-x86.hpkg) of the package is removed from the repository, you cannot install someapp-1.0.0-x86.hpkg unless you have a backup of libsomelib-1.0.0-x86.hpkg somewhere. This should be solved at haikuports: don't remove old versions of libraries while they are still needed. We can change how we manage packages, but if the packages we need are not in the depot, there is no solution.

If the previous version is kept in the depot, you can end up with situations like this:

  • someapp depends on libA-1.0.0 and libB-1.0.0
  • libA-1.0.0 also depends on libB-1.0.0
  • Then, libB is updated to version 2.0.0 (ABI incompatible)
  • libA is rebuilt to use libB-2.0.0, but someapp is not rebuilt
  • now, to run someapp you need both libB-1.0.0 and libB-2.0.0 at the same time. Since the author of libB probably did not plan for this, there are conflicting symbols and other problems between the two versions, and the app crashes in strange ways

In either case we cannot do much about it on Haiku side. There is simply no way to get a working install from the data we are provided. These are issues to be handled when building the packages, to ensure the package repository is consistent with itself. Otherwise our package resolution, no matter what we try, cannot work. It is "garbage in, garbage out".
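
The diamond situation described above can be detected mechanically when a repository is built. The following is a minimal C++ sketch, not how pkgman or libsolv work: `PackageId`, `DependencyGraph`, and both functions are hypothetical names, and the graph is assumed to record the exact package versions each package was built against.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// A package is identified by a base name plus a version, e.g. {"libB", "1.0.0"}.
struct PackageId {
    std::string name;
    std::string version;
    bool operator<(const PackageId& other) const {
        return name != other.name ? name < other.name : version < other.version;
    }
};

// Maps each package to the packages it was built against (hypothetical data model).
using DependencyGraph = std::map<PackageId, std::vector<PackageId>>;

// Collects every package reachable from `root`, including `root` itself.
void CollectClosure(const DependencyGraph& graph, const PackageId& root,
                    std::set<PackageId>& closure)
{
    if (!closure.insert(root).second)
        return; // already visited
    auto it = graph.find(root);
    if (it == graph.end())
        return; // leaf package without recorded dependencies
    for (const PackageId& dependency : it->second)
        CollectClosure(graph, dependency, closure);
}

// Returns the base names that occur with more than one version in the
// dependency closure of `root`.
std::vector<std::string> FindVersionConflicts(const DependencyGraph& graph,
                                              const PackageId& root)
{
    std::set<PackageId> closure;
    CollectClosure(graph, root, closure);

    std::map<std::string, int> versionsPerName;
    for (const PackageId& package : closure)
        ++versionsPerName[package.name];

    std::vector<std::string> conflicts;
    for (const auto& entry : versionsPerName) {
        if (entry.second > 1)
            conflicts.push_back(entry.first);
    }
    return conflicts;
}
```

If someapp is built against libA-1.0.0 and libB-1.0.0 while libA-1.0.0 is built against libB-2.0.0, `FindVersionConflicts` reports `libB` for someapp. A check of this kind would belong in the repository build, since the installed system can no longer fix the inconsistency.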

comment:7 by X512, 4 years ago

Assuming sonames are properly set

Wrong assumption. Also, files from different packages can collide, especially with independent repositories and individually distributed package files. Storing each package's files in its own directory would completely solve the name collision problem. Even package name collisions could be handled (automatically rename to package-2 and ask on install which package should be used).

The current approach is too HaikuPorts-centered.

Last edited 4 years ago by X512 (previous) (diff)

comment:8 by pulkomandy, 4 years ago

If sonames are not set we can do absolutely nothing. All libs have the same name and it is impossible to know that they are not compatible with each other. Then I still don't see what we can do on Haiku side about it.

How is dependency resolution supposed to work in that case?

When I do pkgman install someapp, and someapp depends on "libA.so", and there are 10 or more packages providing different versions of "libA.so", all with different ABIs, what should we do? No, "ask the user to pick one" isn't a reasonable choice.

The alternative is that someapp depends explicitly on libA-1.0.0.hpkg. Then, libA-1.0.1.hpkg becomes available, it has a critical bugfix to libA. But since someapp has hardcoded a very specific version of the package, it now has to be recompiled, even though libA-1.0.1 is ABI compatible with 1.0.0. This does not scale very well either.

This is why we use the sonames to manage the dependencies. That's what they are meant to do, and they are a good solution to this problem. So I will not consider any solution where people package things without setting a soname. The way they set it is irrelevant and haikuports does not need to be involved. It is standard good practice to set a soname on libraries, and in fact haikuporter does nothing special about it; it's a feature of build systems like autoconf or cmake.
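
As a rough illustration of why depending on a soname scales better than depending on an exact package version, here is a minimal C++ sketch. It is not the libsolv algorithm; `Package`, `provides`, and `ResolveBySoname` are hypothetical, and picking the highest version is an assumption made for the example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified package description: a package name, a version
// split into numeric parts, and the sonames of the libraries it ships.
struct Package {
    std::string name;
    std::vector<int> version;          // e.g. {1, 0, 1} for 1.0.1
    std::vector<std::string> provides; // e.g. {"libA.so.1"}
};

// Returns the newest package that provides `soname`, or nullptr if none does.
const Package* ResolveBySoname(const std::vector<Package>& repository,
                               const std::string& soname)
{
    const Package* best = nullptr;
    for (const Package& candidate : repository) {
        const bool providesSoname =
            std::find(candidate.provides.begin(), candidate.provides.end(),
                      soname) != candidate.provides.end();
        if (!providesSoname)
            continue;
        // std::vector compares lexicographically, so {1, 0, 1} > {1, 0, 0}.
        if (best == nullptr || candidate.version > best->version)
            best = &candidate;
    }
    return best;
}
```

Given libA 1.0.0 and libA 1.0.1 both providing `libA.so.1`, a request for `libA.so.1` resolves to 1.0.1 without someapp being rebuilt. An ABI-incompatible release would carry a different soname such as `libA.so.2` and would never be selected for that request.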

comment:9 by X512, 4 years ago

now, to run someapp you need both libB-1.0.0 and libB-2.0.0 at the same time. Since the author of libB probably did not plan for this, there are conflicting symbols and other problems between the two versions, and the app crashes in strange ways

This can be handled by symbol versioning. Even just enabling symbol versioning would switch symbol resolution from process-global to per shared object, as with Windows DLLs. It would be nice to enable symbol versioning by default in the linker. It would dramatically reduce the risk of symbol conflicts and make it possible to report which .so file a missing symbol belongs to.

Last edited 4 years ago by X512 (previous) (diff)

comment:10 by X512, 4 years ago

If sonames are not set we can do absolutely nothing.

sonames have symlinks that can cause conflict. For example: libsomelib.so -> libsomelib.so.1 -> libsomelib.so.1.0 -> libsomelib.so.1.0.0. Installing different package versions will cause these symlinks to conflict. Some applications may work with a wide range of library versions, while others work only with a specific one. This is currently not handled well in HaikuPorts.

I also mentioned a performance issue. A global merge is inefficient and does not scale compared to a separate directory for each package. Separate directories can be mounted instantly using very little memory. Package data can be read on demand when a directory is accessed.

comment:11 by pulkomandy, 4 years ago

sonames have symlinks that can cause conflict

In Haiku there is no libsomelib.so in /system/lib. There is only libsomelib.so.1 (the actual soname). The version of the library without a soname is moved to a different directory (/system/develop/lib) to avoid this issue. It is not possible to have multiple development versions of a lib installed at the same time, but I think that is a sane restriction.

If two libraries have the same soname, they are indeed in conflict, because the runtime loader cannot know which one to use. And I don't think it makes sense to allow multiple libs with the same soname as the normal way to use the system. It will just result in more confusion for the users.

The soname is what uniquely identifies a library, and if we can't assume that, we can do absolutely nothing about dependency resolution. It will result in either even more problems, or people giving up and using static linking or approximations of it (like flatpak/snap), which would make us unable to provide bugfixes to libraries without rebuilding the whole universe every time.

I also mentioned a performance issue.

Understood, but unrelated to the problem. Let's first make things work right, then, once we agree on what exactly we want to do and what problems we need to solve, let's see about making it fast.

I am also not immediately convinced that hundreds or thousands of smaller separate directories will be better, both in terms of performance and in terms of usability (it seems that it could be very confusing for users). It is worth experimenting with, but it is a somewhat unrelated issue, and I don't think we can make it work if we don't start with a sane set of packages and a clear model of how dependencies are handled and solved. Right now, we don't have that.

comment:12 by X512, 4 years ago

There is only libsomelib.so.1 (the actual soname).

Why not libsomelib.so.1.0.0? Some applications may depend on a specific library version, and it would be nice to always allow different library versions to be installed at the same time.

I am also not immediately convinced that hundreds or thousands of smaller separate directories will be better, both in terms of performance

It will be better because a global merge requires reading the contents of all packages even when accessing a single file from one package. Currently package_fs reads the data of all packages and builds caches, which takes a lot of boot time and a lot of RAM.

If each package is mounted in a separate directory (libsomelib-1.0.0-x86.hpkg -> /boot/system/package-data/libsomelib-1.0.0-x86/), a package can be mounted instantly and read on demand (this can't be done for merged directories, because a directory listing can potentially consist of entries from many packages, so all package contents must be traversed). Only the list of all package names needs to be stored globally, which is fast and doesn't require a lot of RAM.
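
The on-demand behaviour described here could look roughly like the following C++ sketch. It only models the bookkeeping, not a file system: `LazyPackageStore`, `PackageLoader`, and the idea that a loader callback reads a package's table of contents are all assumptions for illustration and do not reflect how package_fs is implemented.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an opened package: the entry names it contains.
using PackageContents = std::vector<std::string>;
// Hypothetical callback that reads one package's table of contents.
using PackageLoader = std::function<PackageContents(const std::string&)>;

class LazyPackageStore {
public:
    explicit LazyPackageStore(PackageLoader loader)
        : fLoader(std::move(loader)) {}

    // Registering a package records only its name; nothing is read yet.
    void Register(const std::string& packageName) {
        fKnown.insert(packageName);
    }

    // Returns the contents of one package directory, or nullptr for an
    // unknown package. The package is read on first access only.
    const PackageContents* List(const std::string& packageName) {
        if (fKnown.count(packageName) == 0)
            return nullptr;
        auto it = fLoaded.find(packageName);
        if (it == fLoaded.end())
            it = fLoaded.emplace(packageName, fLoader(packageName)).first;
        return &it->second;
    }

    // Number of packages that had to be read so far.
    std::size_t LoadedCount() const { return fLoaded.size(); }

private:
    PackageLoader fLoader;
    std::set<std::string> fKnown;
    std::map<std::string, PackageContents> fLoaded;
};
```

In this model, registering a thousand packages costs a thousand names, and listing one package directory triggers exactly one read. It deliberately leaves out queries and merged views, which are the cases raised as objections later in the thread.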

and in terms of usability

I think that separate directories are more user-friendly than a merged mess. The merged directory contents are basically useless for manual exploration: an attribute (SYS:PACKAGE) search is needed, but it currently does not work.

Understood, but unrelated to the problem.

It is actually related because package model and efficiency are related. Some package models can't be efficiently implemented.

Last edited 4 years ago by X512 (previous) (diff)

comment:13 by pulkomandy, 4 years ago

a package can be mounted instantly and read on demand (this can't be done for merged directories, because a directory listing can potentially consist of entries from many packages, so all package contents must be traversed). Only the list of all package names needs to be stored globally, which is fast and doesn't require a lot of RAM.

But on the other hand you will have a thousand copies of the Haiku package files in different places. And a hundred copies of all of Qt, one for each Qt app, each Qt lib, etc.

I don't understand how "read on demand" would work. Maybe for access by path, but Deskbar will access all your apps to get their icons, so already all your app packages are mounted when you open the deskbar menu. And if you do a query in packagefs (for example if you use the "open with" menu), you need to load everything in the filesystem anyway.

Also it will cause other problems, searching for a file (foo.so) using Tracker find will return thousands of results in all places where that lib is used.

Some package models can't be efficiently implemented.

And some can be implemented very efficiently but they just don't work. Which is worse?

comment:14 by X512, 4 years ago

But on the other hand you will have a thousand copies of the Haiku package files in different places.

Those are not copies, but dynamically generated items. They cost almost nothing.

I don't understand how "read on demand" would work.

When an application invokes a read-directory or open-file syscall, package_fs will return generated symlinks in addition to the package data that is actually stored. Something like this (pseudocode):

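struct LinksIterator;   // yields generated dependency symlinks, sorted by name
struct HpkgIterator;    // yields stored package entries, sorted by name

struct Iterator {
    // generates dependency symlinks based on the HPKG header
    LinksIterator* dynIt;
    // reads the actual compressed files and directories
    HpkgIterator* baseIt;
};

status_t MergeIterators(LinksIterator* dynIt, HpkgIterator* baseIt, struct stat& st)
{
    // in-place merge of two sorted sequences: return the smaller of the two
    // current entries in st and advance the iterator it came from
}

status_t ReadDir(void* cookie, struct stat& st)
{
    Iterator* it = static_cast<Iterator*>(cookie);
    return MergeIterators(it->dynIt, it->baseIt, st);
}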
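
The merge step left as a comment in the pseudocode can be written out in standard C++. This is a minimal sketch under assumptions: entries are reduced to names, both listings are sorted and free of internal duplicates, and `MergedListing` is a hypothetical name. When a name exists in both listings, the stored package entry wins, which is a choice made for this example since the thread does not discuss that case.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Merges two name-sorted directory listings one entry at a time, the way a
// readdir-style call would consume them.
class MergedListing {
public:
    MergedListing(std::vector<std::string> generatedLinks,
                  std::vector<std::string> packageEntries)
        : fLinks(std::move(generatedLinks)),
          fEntries(std::move(packageEntries)) {}

    // Stores the next name in `name` and returns true, or returns false
    // once both listings are exhausted.
    bool Next(std::string& name) {
        // Skip a generated link whose name is also stored in the package.
        if (fLinkIndex < fLinks.size() && fEntryIndex < fEntries.size()
            && fLinks[fLinkIndex] == fEntries[fEntryIndex])
            ++fLinkIndex;

        const bool haveLink = fLinkIndex < fLinks.size();
        const bool haveEntry = fEntryIndex < fEntries.size();
        if (!haveLink && !haveEntry)
            return false;

        // Take the smaller head, or whichever listing still has entries.
        if (haveLink && (!haveEntry || fLinks[fLinkIndex] < fEntries[fEntryIndex]))
            name = fLinks[fLinkIndex++];
        else
            name = fEntries[fEntryIndex++];
        return true;
    }

private:
    std::vector<std::string> fLinks;
    std::vector<std::string> fEntries;
    std::size_t fLinkIndex = 0;
    std::size_t fEntryIndex = 0;
};
```

Each call does a constant amount of work and needs no merged copy of the directory, which is the property the proposal relies on. It only holds as long as both sources can deliver their entries already sorted.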

Maybe for access by path, but Deskbar will access all your apps to get their icons, so already all your app packages are mounted when you open the deskbar menu.

It will not require traversing and storing all package files in RAM, as the current merge-based package_fs does. Only the packages that have a Deskbar menu entry will be read.

And if you do a query in packagefs (for example if you use the "open with" menu), you need to load everything in the filesystem anyway.

Not everything, only the index of the required attribute.

Also it will cause other problems, searching for a file (foo.so) using Tracker find will return thousands of results in all places where that lib is used.

An option for skipping auto-generated symlinks could be provided. Those symlinks do not even need to be indexed.

Last edited 4 years ago by X512 (previous) (diff)

comment:15 by pulkomandy, 4 years ago

It will not require traversing and storing all package files in RAM, as the current merge-based package_fs does. Only the packages that have a Deskbar menu entry will be read.

Now I'm curious how you can know that a package contains a file in the deskbar directory, without opening the package.

The idea is worth exploring, but there would be a lot of details to get right. The current solution is not perfect, but it is doing quite well. And when you start from an initial idea that looks simple and easy and add all the little things that make it actually work, you end up reducing performance in a lot of places.

To me, your solution does not look obviously better. It solves some problems, but it also creates many new ones. And I cannot say off hand if it will be better, or worse than what we have.

comment:16 by X512, 4 years ago

Now I'm curious how you can know that a package contains a file in the deskbar directory, without opening the package.

It will require reading a small part of a package, but not the whole file hierarchy of all packages.

The current merge approach requires reading the whole file hierarchy of all packages, because when you read a directory in one package, there is always a chance that another package has the same directory and it has to be merged.

comment:17 by waddlesplash, 4 years ago

There is probably a way of lowering packagefs memory usage without so drastic a solution as that. I had looked into this at one point a while ago and thought there were things that could be done.

comment:18 by X512, 4 years ago

There is probably a way of lowering packagefs memory usage without so drastic a solution as that.

How? The only way that I can see is to explicitly mark which directories can be merged and which cannot, and to mark as many directories as possible as unmergeable. Unmergeable directories do not need to be traversed at boot.

Last edited 4 years ago by X512 (previous) (diff)

comment:19 by axeld, 4 years ago

The easiest solution would simply be to cache the entries between boots, and just read the cache file on startup. Then all packages could be traversed only when they are actually needed.

I would assume that "unmergeable" packages would require a lot of work, and would make things IMO needlessly more complicated for the user, and packager.
