Opened 3 years ago

Last modified 3 years ago

#12917 reopened enhancement

get_package_dependencies needs to be refactored

Reported by: kallisti5 Owned by: kallisti5
Priority: normal Milestone: Unscheduled
Component: Kits/Package Kit Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: yes Platform: All

Description (last modified by kallisti5)

It appears the entire reason the url exists in the repo file (BRepositoryInfo blob) is to appease the get_package_dependencies binary. The requirements of get_package_dependencies could be greatly cleaned up.

Current Behaviour

get_package_dependencies takes in the following arguments:

get_package_dependencies then 'solves' the dependencies and links to where the package should be based on the url in the provided binary repo (created from repo.info).

get_package_dependencies seems to be the only thing that actually uses the url field in the repo.info / repo manifest. (this url field makes repo's unportable to other locations)

How it should work

  • A series of remote repo URL's are provided to the tool
    • Tool downloads repo manifest from each remote repo.
    • Tool now knows "where repos exist, and what packages they contain"
  • --
  • Every local hpkg file to be placed into the image.

Attachments (6)

output.txt (9.6 KB ) - added by kallisti5 3 years ago.
get_package_dependencies.diff (2.3 KB ) - added by kallisti5 3 years ago.
haiku-repo.png (82.4 KB ) - added by kallisti5 3 years ago.
package repo - today
repository_rework_squash.diff (19.3 KB ) - added by kallisti5 3 years ago.
complete squashed repo work rev2
s21ej8r9RP.diff (15.0 KB ) - added by kallisti5 3 years ago.
Jessica's fix for Haiku build breakage.
get_package_dependencies_fix_from_jessicah.diff (2.0 KB ) - added by miqlas 3 years ago.
[PATCH] get_package_dependencies: set include path for curl on Haiku

Download all attachments as: .zip

Change History (30)

by kallisti5, 3 years ago

Attachment: output.txt added

comment:1 by kallisti5, 3 years ago

Description: modified (diff)

comment:2 by kallisti5, 3 years ago

... except there is no easy host libcurl in the limbo get_package_dependencies exists in... which is likely why it just prints url's to the terminal, then calls wget... maybe url:localrepofile ? That's really ugly though.

in reply to:  description ; comment:3 by bonefish, 3 years ago

Replying to kallisti5:

It appears the entire reason the url exists in the repo file (BRepositoryInfo blob) is to appease the get_package_dependencies binary.

I wrote get_package_dependencies after Oliver had already implemented the repository handling and after we already had a basic Haiku repository. I don't think he had that particular tool in mind originally. IIRC the URL in the repository info simply identifies the repository. A mirror would contain the same URL, with the nice side effect that also provides a pointer to the original respository.

The requirements of get_package_dependencies could be greatly cleaned up.

Which requirements?

Current Behaviour

get_package_dependencies takes in the following arguments:

No, they are not the "current" repositories. You need to understand that the Haiku git repository defines the HaikuPorts package repositories. This provides us with the ability to build an old Haiku revision with the same packages repositories they matched originally. If you start building against the "current" repositories, you'll lose this build stability.

in reply to:  3 ; comment:4 by kallisti5, 3 years ago

Replying to bonefish:

Replying to kallisti5:

It appears the entire reason the url exists in the repo file (BRepositoryInfo blob) is to appease the get_package_dependencies binary.

I wrote get_package_dependencies after Oliver had already implemented the repository handling and after we already had a basic Haiku repository. I don't think he had that particular tool in mind originally. IIRC the URL in the repository info simply identifies the repository. A mirror would contain the same URL, with the nice side effect that also provides a pointer to the original respository.

The requirements of get_package_dependencies could be greatly cleaned up.

Which requirements?

The requirement for the repo.info and repo to know where they will exist ahead of time... it's ridiculous to have a package repository "need to know" where it will exist ahead of time... no other repo has this requirement *AND* the url in repo.info / repo is only used for get_package_dependencies.

As for the mirror scenario... lets just use a UUID or something in repo.info plus a last updated timestamp in repo.

Current Behaviour

get_package_dependencies takes in the following arguments:

No, they are not the "current" repositories. You need to understand that the Haiku git repository defines the HaikuPorts package repositories. This provides us with the ability to build an old Haiku revision with the same packages repositories they matched originally. If you start building against the "current" repositories, you'll lose this build stability.

There have been multiple people commenting on how much of a pain it requiring code updates to Haiku any time a HaikuPorts package is updated. We have pretty advanced versioning schematics in the package definitions already... lets put them to good use.

Here is an example patch showing the changes to "completely remove url" from repo.info: https://github.com/kallisti5/haiku/commit/df88967bf048725cadf6afb5389490204915023b

Every instance of url in repo was cosmetic. (besides get_package_dependencies)

in reply to:  4 ; comment:5 by bonefish, 3 years ago

Replying to kallisti5:

Replying to bonefish:

Replying to kallisti5:

The requirements of get_package_dependencies could be greatly cleaned up.

Which requirements?

The requirement for the repo.info and repo to know where they will exist ahead of time... it's ridiculous to have a package repository "need to know" where it will exist ahead of time... no other repo has this requirement *AND* the url in repo.info / repo is only used for get_package_dependencies.

Again, you've got it the wrong way around: get_package_dependencies needs a URL for the package downloads. It was conveniently already available in the repository file, which is needed for dependencies resolution anyway. If the URL hadn't been in the repository file, it would have been passed as an additional parameter.

As for the mirror scenario... lets just use a UUID or something in repo.info plus a last updated timestamp in repo.

I don't quite see the advantage, but feel free to change that.

Current Behaviour

get_package_dependencies takes in the following arguments:

No, they are not the "current" repositories. You need to understand that the Haiku git repository defines the HaikuPorts package repositories. This provides us with the ability to build an old Haiku revision with the same packages repositories they matched originally. If you start building against the "current" repositories, you'll lose this build stability.

There have been multiple people commenting on how much of a pain it requiring code updates to Haiku any time a HaikuPorts package is updated. We have pretty advanced versioning schematics in the package definitions already... lets put them to good use.

Maybe you could give some details on what you have in mind, because ATM I don't see a solution that could possibly work.

in reply to:  5 ; comment:6 by kallisti5, 3 years ago

Replying to bonefish:

Replying to kallisti5:

Replying to bonefish:

Replying to kallisti5:

The requirements of get_package_dependencies could be greatly cleaned up.

Which requirements?

The requirement for the repo.info and repo to know where they will exist ahead of time... it's ridiculous to have a package repository "need to know" where it will exist ahead of time... no other repo has this requirement *AND* the url in repo.info / repo is only used for get_package_dependencies.

Again, you've got it the wrong way around: get_package_dependencies needs a URL for the package downloads. It was conveniently already available in the repository file, which is needed for dependencies resolution anyway. If the URL hadn't been in the repository file, it would have been passed as an additional parameter.

I definitely get the convenience factor after taking a few passes at solving the issue :-).

My largest push for removing the url is to solve some of the "death by 1000 paper cuts". I've noticed a trend over the last two years of anyone working on package management repos + auto building getting burnt out pretty quickly. No-one can nail down any single issue, but the large number of steps and complex requirements seem to be of some cause. Anything we can simplify might make it easier on folks and upkeep easier.

As for the mirror scenario... lets just use a UUID or something in repo.info plus a last updated timestamp in repo.

I don't quite see the advantage, but feel free to change that.

Mostly to make repository tracking based on a random identifier vs a specific url. Repositories should be portable. Tracking URL's as unique identifiers can become a bit odd if the repo doesn't live at a specified url anymore... uuid's per are a bit less confusing.

If we paired a UUID and a timestamp with some text "mirrors" file in the repository, someday the Package Kit could store + offer users multiple mirrors and check how fresh they are while tracking them regardless of url.

Current Behaviour

get_package_dependencies takes in the following arguments:

No, they are not the "current" repositories. You need to understand that the Haiku git repository defines the HaikuPorts package repositories. This provides us with the ability to build an old Haiku revision with the same packages repositories they matched originally. If you start building against the "current" repositories, you'll lose this build stability.

There have been multiple people commenting on how much of a pain it requiring code updates to Haiku any time a HaikuPorts package is updated. We have pretty advanced versioning schematics in the package definitions already... lets put them to good use.

Maybe you could give some details on what you have in mind, because ATM I don't see a solution that could possibly work.

No clue. :-)

This all isn't any kind of attack.. our existing package kit and repository code is an amazing accomplishment. I'm just trying to find a way to shave off a few pointy bits that keep poking at people :P

in reply to:  6 comment:7 by bonefish, 3 years ago

Replying to kallisti5:

Replying to bonefish:

Replying to kallisti5:

Replying to bonefish:

Replying to kallisti5:

Current Behaviour

get_package_dependencies takes in the following arguments:

No, they are not the "current" repositories. You need to understand that the Haiku git repository defines the HaikuPorts package repositories. This provides us with the ability to build an old Haiku revision with the same packages repositories they matched originally. If you start building against the "current" repositories, you'll lose this build stability.

There have been multiple people commenting on how much of a pain it requiring code updates to Haiku any time a HaikuPorts package is updated. We have pretty advanced versioning schematics in the package definitions already... lets put them to good use.

Maybe you could give some details on what you have in mind, because ATM I don't see a solution that could possibly work.

No clue. :-)

This all isn't any kind of attack..

I haven't perceived it as such. If my reaction was a bit "alarmed", that's because Oliver and I had thought about the connection between the Haiku source tree and the HaikuPorts repositories and their management quite a bit. Switching to the "current" repositories would break an aspect we considered imported, while -- if that's all you change -- still having all the drawbacks of maintaining the repositories in the Haiku source tree. So, if that is changed, it should be done only as part of a larger paradigm change.

comment:8 by kallisti5, 3 years ago

I've tried to solve this as elegantly as possible... it seems like you can use a repo config vs a repository info definition... however the package kit seems to be crashing and not giving much info.

The alternate is calling curl within the get_package_dependencies binary... which I want to avoid for the moment.

/home/kallisti5/Code/haiku/generated.x86_64/objects/linux/x86_64/release/tools/get_package_dependencies/get_package_dependencies /home/kallisti5/Code/haiku/generated.x86_64/objects/haiku/x86_64/packaging/repositories/HaikuPorts-config /home/kallisti5/Code/haiku/generated.x86_64/objects/haiku/x86_64/packaging/repositories/Haiku-config -- /home/kallisti5/Code/haiku/generated.x86_64/download/dejavu-2.34-2-any.hpkg
terminate called after throwing an instance of 'BPackageKit::BManager::BPrivate::BFatalErrorException'
Aborted (core dumped)

Everything works until I pass the BRepositoryConfig to BRepositoryBuilder. It says it will accept it... but the results seem to disagree. Patch attached.

by kallisti5, 3 years ago

comment:9 by kallisti5, 3 years ago

Has a Patch: set

comment:10 by kallisti5, 3 years ago

oh my god the layers on layers. It's like a Java developer decided to make a movie called "C++ inception"

ok, after a lot of scratching, clawing, and blank C++ exceptions I ended up here:

src/kits/manager/RepositoryBuilder.cpp

V BSolverRepository

src/kits/package/solver/SolverRepository.cpp

V BPackageRoster

src/kits/package/PackageRoster.cpp :

status_t
BPackageRoster::GetRepositoryCache(const BString& name,
    BRepositoryCache* repositoryCache)
{
    if (repositoryCache == NULL)
        return B_BAD_VALUE;

    // user path has higher precedence than common path
    BPath path;
    status_t result = GetUserRepositoryCachePath(&path);
    if (result != B_OK)
        return result;
    path.Append(name.String());

    BEntry repoCacheEntry(path.Path());
    if (repoCacheEntry.Exists())
        return repositoryCache->SetTo(repoCacheEntry);

    if ((result = GetCommonRepositoryCachePath(&path, true)) != B_OK)
        return result;
    path.Append(name.String());

    repoCacheEntry.SetTo(path.Path());
    return repositoryCache->SetTo(repoCacheEntry);
}

Which looks like a dead end. No attempts to get remote repository caches based on the url provided in the RepositoryConfig.

Not sure at this point if that function should or should not attempt to fetch a remote RepositoryCache.

in reply to:  10 comment:11 by bonefish, 3 years ago

Replying to kallisti5:

Which looks like a dead end. No attempts to get remote repository caches based on the url provided in the RepositoryConfig.

Not sure at this point if that function should or should not attempt to fetch a remote RepositoryCache.

I don't think it was intended to do that. Downloads usually happen only via requests/jobs, which have to be created and processed explicitly. E.g. cf. how the pkgman commands are implemented.

comment:12 by kallisti5, 3 years ago

Ok, I have a solid solution in place:

https://github.com/kallisti5/haiku/commits/repo-cleanup

  • get_package_dependencies now accepts a series of RepositoryConfig files
  • get_package_dependencies HTTP get's the repo caches to tmp and passes them into the solver
  • The build system gives the local RepositoryConfig files to get_package_dependencies

We now can remove url from the RepositoryCache and have it only exist in the RepositoryConfig (which makes a lot more sense)

The default configured repositories also point to a "current" Haiku repo for now. We can lock them to a release branch pretty easily by specifying in the Jam scripts.

Down the road we can track mirrors via UUID's, and it won't be as ingrained in our build process as the Cache URL's :-)

I still need to see if i've broken bootstrap, however so far everything is working as expected.

Version 0, edited 3 years ago by kallisti5 (next)

comment:13 by kallisti5, 3 years ago

I'm looking again at https://github.com/kallisti5/haiku/commit/df88967bf048725cadf6afb5389490204915023b and I think I went one level too far around removing the url from repository_info/haiku and friends. I thought repository_info was used to generate the repo.info, but it looks like repository_info is used to generate the config file.

Might need to tweak that first commit and roll back a few smaller changes in it.

repository_info should likely be renamed to repository_def or something.

comment:14 by kallisti5, 3 years ago

For my own sanity.

  • BRepositoryInfo == repo_info driver config file. thing blah.
  • BRepositoryConfig == config file of thing=blah. For OS pkgman
  • BRepositoryCache == repo binary blob of all packages for fast dependency solving

I can't for the life of me figure out why we have to have a BRepositoryConfig *AND* a BRepositoryInfo. They contain the exact same freaking info just formatted slightly different.

by kallisti5, 3 years ago

Attachment: haiku-repo.png added

package repo - today

in reply to:  14 ; comment:15 by bonefish, 3 years ago

Replying to kallisti5:

I can't for the life of me figure out why we have to have a BRepositoryConfig *AND* a BRepositoryInfo. They contain the exact same freaking info just formatted slightly different.

Huh?

BRepositoryInfo:

            BString             fName;
            BString             fOriginalBaseURL;
            BString             fVendor;
            BString             fSummary;
            uint8               fPriority;
            BPackageArchitecture    fArchitecture;
            BStringList         fLicenseNames;
            BStringList         fLicenseTexts;

BRepositoryConfig:

            BString             fName;
            BString             fBaseURL;
            uint8               fPriority;
            bool                fIsUserSpecific;
            BEntry              fEntry;

How is that the same?

in reply to:  15 comment:16 by kallisti5, 3 years ago

Replying to bonefish:

Replying to kallisti5:

I can't for the life of me figure out why we have to have a BRepositoryConfig *AND* a BRepositoryInfo. They contain the exact same freaking info just formatted slightly different.

Huh?

BRepositoryInfo:

            BString             fName;
            BString             fOriginalBaseURL;
            BString             fVendor;
            BString             fSummary;
            uint8               fPriority;
            BPackageArchitecture    fArchitecture;
            BStringList         fLicenseNames;
            BStringList         fLicenseTexts;

BRepositoryConfig:

            BString             fName;
            BString             fBaseURL;
            uint8               fPriority;
            bool                fIsUserSpecific;
            BEntry              fEntry;

How is that the same?

Forgot this one:

BRepositoryCache:

            BEntry          fEntry;
            BRepositoryInfo fInfo;
            bool            fIsUserSpecific;

Ok, you're correct, BRepositoryInfo has a bit more in it than what I originally described.

So my original patch didn't go too far. (except I removed a url check in BRepositoryConfig I shouldn't have)

I drop the url information from BRepositoryInfo (and thus BRepositoryCache) and inject the url via the create_repository_config (incorrectly named create_package_config) in my diagram.

by kallisti5, 3 years ago

complete squashed repo work rev2

comment:17 by kallisti5, 3 years ago

repository_rework_squash.diff are all of my changes squashed into one commit. Let me know if it looks sane. I pretty much remove url from RepositoryInfo and deal with the fallout of that change. This removes the need for repos to be self-aware of their url. RepositoryConfig is the only thing url aware.

I noticed that RepositoryConfig's SetTo seems to load files based on BDriverConfig and write files as "key=value". Can BDriverConfig load files formatted as "key=value"? I don't see any docs suggesting it can.

comment:19 by bonefish, 3 years ago

Unfortunately I don't really have the time to review your patch. Looking at your diagram I wonder, however, what "create_package_config" is and why it would need a BRepositoryInfo and produce a BRepositoryConfig. AFAIK the only tool to create a repository config is pkgman (via the package kit) as a consequence of the user adding a package repository.

comment:20 by kallisti5, 3 years ago

Resolution: fixed
Status: newclosed

create_package_config is mostly so the build system can "pre-load" pkgman with existing configured remote repositories.

Changes applied in hrev50723, hrev50745, hrev50748

done.

comment:21 by kallisti5, 3 years ago

Resolution: fixed
Status: closedreopened

comment:22 by kallisti5, 3 years ago

Pulkomandy reverted these as of hrev50754 due to a build breakage on Haiku.

by kallisti5, 3 years ago

Attachment: s21ej8r9RP.diff added

Jessica's fix for Haiku build breakage.

comment:23 by korli, 3 years ago

@kallisti5 this diff isn't a correct patch (seems like a html saved page).

in reply to:  23 comment:24 by miqlas, 3 years ago

Replying to korli:

@kallisti5 this diff isn't a correct patch (seems like a html saved page).

Fixed and cleaned up.

by miqlas, 3 years ago

[PATCH] get_package_dependencies: set include path for curl on Haiku

Note: See TracTickets for help on using tickets.