“Packaging Kubernetes for Debian”:
lwn.net/SubscriberLink/835599/

This raises key questions: bundling (“vendoring”) and its implications, contemporary development practices and their impact on distro relevance, avoiding/resolving technical disputes, and more.

Guix has answers to some issues but is otherwise in a situation similar to that of Debian.

Regarding “modern” development practices, I’m both amazed at how much can be achieved with all these libraries at our fingertips, and scared at the idea that even distro “experts” give up on understanding how things fit together.

@civodul I see complexity as one of the modern barriers to practical software freedom. If a reasonably skilled person can't comprehend a system, they can't exercise the freedom to make any meaningful changes, let alone redistribute those changes.

Projects that describe themselves as open source may not begin to consider that point.

I feel a modern interpretation of software freedom requires mindfulness to complexity. Unfortunately, the backbones of modern systems are the opposite of that.

@mikegerwitz Agreed.

In some domains, complexity is hardly avoidable: compilers, video-editing applications, etc.

But in other domains, it’s mostly an “emerging phenomenon”: developers focus on one thing and build upon a pile of software regarded as a black box. All developers do that to some extent, but this has reached the point where everyone gives up. Definitely a barrier to practical user freedom.

@civodul @mikegerwitz after reading the article I mostly worry about people picking up development practices from a big team that manages dependencies and applying them in hobby projects. How will you ever update 30 dependencies if all you have are 5 hours per week? How will you ensure that your users get security updates? How can you actually find out that there are security updates in any of the 30 libs?
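
(A sketch of how that last question can at least be mechanized: the OSV database at osv.dev aggregates advisories across ecosystems, and its query API can be polled per pinned dependency. The package names and versions below are hypothetical placeholders.)

```python
# Sketch: ask osv.dev for known advisories affecting pinned npm dependencies.
# Names/versions below are placeholders; read them from your lockfile instead.
import json
import urllib.request

deps = {"lodash": "4.17.15", "minimist": "1.2.0"}  # hypothetical pins

for name, version in deps.items():
    body = json.dumps({
        "version": version,
        "package": {"name": name, "ecosystem": "npm"},
    }).encode("utf-8")
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        vulns = json.load(resp).get("vulns", [])
    print(f"{name}@{version}: {len(vulns)} known advisories")
```

Running this over 30 libraries is easy; triaging what it reports every week is the part that doesn't scale for a five-hours-per-week project.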

That comes down to change management — and minimizing its cost.

@civodul @mikegerwitz I wrote "30 dependencies", because the 300 I wanted to write initially seemed like a stretch. Now this: lwn.net/Articles/836143/ — 2120 dependencies to show google-maps.

As a user I certainly prefer installing only tools that are shipped in my distro. That’s why as a dev I mostly limit myself to using only the libs that are in my distro.

Also those are nicer to install :-)

@ArneBab @civodul Yes, unfortunately it's not atypical for JavaScript projects using NPM to pull in 100s of MiB (or even >1 GiB) worth of dependencies. I can't speak to Go.

This has also been a packaging headache in Guix. And for a FSDG distro, there's also the problem of trying to determine whether a program is actually free, given all of those dependencies.
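
(A first pass at the freeness question can be automated, though only as far as the metadata is honest. A rough sketch that walks an installed node_modules tree and prints each package's declared license; a real FSDG review still needs human eyes.)

```python
# Sketch: report the declared "license" field of every installed npm package.
# The field can be missing, wrong, or an SPDX expression; treat this as a
# starting point for review, not a verdict on freeness.
import json
from pathlib import Path

for manifest in sorted(Path("node_modules").rglob("package.json")):
    try:
        meta = json.loads(manifest.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        continue  # skip test fixtures that aren't real manifests
    if "name" in meta:
        print(f"{meta['name']}: {meta.get('license', 'NO LICENSE DECLARED')}")
```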

@mikegerwitz @ArneBab Yeah, @cwebber explained it very well: dustycloud.org/blog/javascript

Tools like NPM and Node support and encourage complexity by making it easy for developers to build gigantic dependency graphs and to ignore everything at the levels below.

It’s both an “impressive” feature and an invitation to create this incomprehensible mess.

@civodul @mikegerwitz @ArneBab At the time I wrote that, it was nearly 500 libraries to install jquery. My suspicion is that it is many more today. Somebody want to check? I'd rather not fire up the npm beast if I can avoid it :)
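
(For anyone who'd rather not fire up the beast either: a modern package-lock.json, lockfileVersion 2 or 3, already contains the flattened graph under its "packages" key, so counting installs is just JSON parsing. A sketch, assuming that lockfile format.)

```python
# Sketch: count installed packages recorded in a package-lock.json
# (lockfileVersion 2/3, where "packages" maps node_modules paths to entries).
import json

with open("package-lock.json", encoding="utf-8") as f:
    lock = json.load(f)

# The empty-string key is the root project itself, so exclude it.
installs = [path for path in lock.get("packages", {}) if path]
print(f"{len(installs)} packages installed for this project")
```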

@civodul @mikegerwitz @ArneBab (Also, has it really been half a decade since I wrote that post???)

@lthms @civodul @mikegerwitz @ArneBab Took me a minute to parse that. Is that a comment about the perceived length of time of 2020? ;)

@cwebber @civodul @mikegerwitz @ArneBab npm-the-tool + npm-the-software-collection really do need to be reworked. (And Yarn is not that thing; Facebook is one of the most egregious offenders of package bloat.)

Some Haxe folks at least have begun making an attempt to do package management differently, which can address some of the problems for the Haxe ecosystem. (The rest comes down to culture, though.)

github.com/lix-pm/lix.client

@cwebber @civodul @mikegerwitz @ArneBab if source code hosting platforms made repo size as prominent as number of forks, it would lead to a form of social pressure all its own. I know that @codeberg and other platforms that use Gitea (and Gogs?) display this for every repo.

@colby @cwebber @civodul @ArneBab @codeberg Dependencies are not typically committed to the repo itself (they're downloaded after cloning via the package manager) and so do not contribute to the repository size.

Even showing the size of the repository post-checkout isn't simple, since each package can run arbitrary scripts and perform environment/platform-specific tasks, including compilation.
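
(To make the arbitrary-scripts point concrete: npm runs a package's preinstall/install/postinstall hooks at install time with the installing user's permissions; its --ignore-scripts option turns them off, at the cost of breaking packages that genuinely need a build step. A sketch that flags which installed packages declare such hooks.)

```python
# Sketch: list installed npm packages declaring install-time lifecycle hooks,
# i.e. scripts npm would run with the installing user's full permissions.
import json
from pathlib import Path

HOOKS = ("preinstall", "install", "postinstall")

for manifest in sorted(Path("node_modules").rglob("package.json")):
    try:
        meta = json.loads(manifest.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        continue
    declared = [h for h in HOOKS if h in meta.get("scripts", {})]
    if declared and "name" in meta:
        print(f"{meta['name']}: {', '.join(declared)}")
```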

@mikegerwitz @colby @cwebber @civodul @codeberg if most people ran Guix, there would be another kind of pressure: Users actually see the dependencies and how much needs to be rebuilt when a single library has a security fix.

@mikegerwitz re not cleanly separating acquisition from execution/compilation:

I think accepting that this is the way things can and/or should be done is _the_ problem, though, not just part of it. Anxiety about dependency graphs, including bloat, is the result. I'm aware that this goes against modern orthodoxy.

From a capability-based systems view, it strikes me as an unnecessary capability that violates POLA.

@cwebber @civodul @ArneBab @codeberg

@colby @mikegerwitz @civodul @ArneBab @codeberg Yes, this is why an ocap/POLA approach is *more* important from a security perspective, but a) defense in depth, b) it is important for your trusted computing base, c) we don't live in ocap systems yet, oh no, and d) it's still critically important for *community hacking* purposes.

@cwebber @mikegerwitz @civodul @ArneBab @codeberg maybe I'm misreading, but I'm not so sure about the "critically important" part. We are arguably seeing that it's actively harmful in some ways (and actively harmful towards ever reaching the "yet").

This modern strategy that's been adopted for handling dependencies feels like it was an attractively packaged bad idea, like inheritance—and has been all along. (Worth noting that both are attempts to solve the code re-use problem?)

@cwebber @codeberg @ArneBab @mikegerwitz @colby Right. IMO, POLA or not, bundling and unbounded software stack complexity make it harder to reason about software composition, are a waste of resources, make it harder to modify the software, and so on.

@civodul

I see the resources argument (I suspect it's wrong in practice; cf "attractively packaged bad idea"), but where does "unbounded [...] complexity" fit in with clean separation between acquisition and compilation (surely it's the opposite?)

The effect it has on reasoning about composition looks to be null.

How is software modification made harder by this separation? Empirically, I've found it easier + @mikegerwitz: "_process_ of getting it to build".

@cwebber @codeberg @ArneBab

@colby @civodul @cwebber @codeberg @ArneBab This thread has taken a number of turns. A complex acquisition process is certainly no good, but that's just one level of complexity. Committing 100s of MiB of dependencies to a repository simplifies getting a hold of them, but it doesn't make the system any less complex.

Certainly not even being able to build the software is an unnecessarily high barrier to entry. But ironically I haven't had that problem with NPM.

@colby @civodul @cwebber @codeberg @ArneBab Actually, it's not ironic at all---the principal reason for this proliferation, IMO, is that it's so easy to download all those dependencies (recursively) that people don't even realize how many there are, and even if they do, don't really care; just allocate some more disk space to those CI boxes and call it a day.

@mikegerwitz @colby @civodul @cwebber @codeberg And that “don’t really care” means that they also don’t care — and don’t even know — about the quality of the code they depend on.

Just imagine what would be needed to inject malware into many of the npm-based systems.

… actually no need to imagine: it happened already.

@ArneBab @mikegerwitz @colby @civodul @codeberg The bigger thing to imagine is: "imagine what it would take to clean up once the mess is discovered". I saw a lot of responses after the event-stream incident from npm-depending people that went "oh wait, I have no idea how many risks are in this giant nest or how to fix them... maybe this is a bad situation"

@ArneBab @mikegerwitz @colby @civodul @codeberg POLA/ocaps should help us, security-wise, contain portions of the event-stream incident. But even POLA people agree that you need a) a Trusted Computing Base and b) you want each component to be as safe, understandable, modifiable as possible.

This is why I am hopeful we can get to a point where we have something like Guix but *plus* os-level ocap security. That combination together could finally lead me to being able to trust my computer.

@mikegerwitz @civodul @cwebber @codeberg @ArneBab hence my suggestion to abandon the strategy that has become popular with modern package managers. Their value proposition has never included a means to curtail complexity—what they offer at base is a way to hide it. Worse, in practice, this leads to subtly encouraging more of it. That you can start with a deceptively small-ish repo that doesn't reflect the program's true character is something we'd benefit greatly from correcting.

@mikegerwitz @civodul @cwebber @codeberg @ArneBab late fetching dependencies delivers much less value (by way of resource usage) than one's gut suggests. Break it down—the only substantial benefit it delivers is to a platform/service offering repository hosting. The benefit to actual users and their disks is negligible—and even if really considered a problem, can always be better handled locally. The problem that late-fetch-by-default solves is vaporish. Its downsides are pernicious.

@colby
I doubt that just reporting the size would create the pressure you anticipate. It requires developers to actually care, and it's not clear that all of us do.
@cwebber @civodul @mikegerwitz @ArneBab @codeberg

@cwebber @civodul @mikegerwitz @ArneBab jquery has and has always had zero dependencies on npm, so in the context of a conversation about npm encouraging dependency proliferation I'm not sure what you're basing that on, unless you're counting every element of a Debian system required to host jquery on a web server.

@cwebber @civodul @mikegerwitz @ArneBab oh, I found the article you were talking about. I see you mean you need a big pile of packages to build it.
Well that's true, but insisting on building everything from scratch is a self-imposed problem, a classic example of distros making trouble for themselves and then wondering why their job is so hard.
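
(Both counts can be right at once, because package.json keeps two lists: runtime "dependencies", which is empty for jQuery, and build/test "devDependencies", which is where the big graph comes from, transitively. A minimal sketch of reading the split from a checked-out package.json.)

```python
# Sketch: show the runtime vs. build/test dependency split of an npm package.
# For jQuery, "dependencies" is empty; the large graph is all "devDependencies".
import json

with open("package.json", encoding="utf-8") as f:
    meta = json.load(f)

print("runtime dependencies:", len(meta.get("dependencies", {})))
print("dev (build/test) dependencies:", len(meta.get("devDependencies", {})))
```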

@danielcassidy @civodul @mikegerwitz @ArneBab It's clear you don't care about reproducibility, but many of us do. There are both security and community-hacking-health reasons to do so.

@cwebber @civodul @mikegerwitz @ArneBab I do care about reproducibility, I just think you can verify reproducibility without tying it to unrelated packaging processes in a way that creates needless problems.

@danielcassidy @civodul @mikegerwitz @ArneBab Okay, well if you care about reproducibility... to quote you:

> Well that's true, but insisting on building everything from scratch is a self-imposed problem

But that's the definition of reproducibility right there, so...

@cwebber @civodul @mikegerwitz @ArneBab reproducibility means that someone you trust builds each component from scratch once, verifies that the output matches the package on npm, and then from then on you can safely use the npm package.
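
(If the build is bit-for-bit reproducible, that verification reduces to comparing digests of the independently built artifact against what the registry serves. A minimal sketch; both file paths are placeholders.)

```python
# Sketch: compare an independently rebuilt npm tarball against the registry's.
# Only meaningful if the build is bit-for-bit reproducible; paths are placeholders.
import hashlib
from pathlib import Path

def sha512_hex(path: str) -> str:
    return hashlib.sha512(Path(path).read_bytes()).hexdigest()

built = sha512_hex("dist/example-1.2.3.tgz")       # rebuilt from source
served = sha512_hex("registry/example-1.2.3.tgz")  # fetched from npm
print("MATCH" if built == served else "MISMATCH")
```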

@cwebber @civodul @mikegerwitz @ArneBab reproducibility doesn't mean you have to transplant the package to a completely different build-from-scratch package system. If you want to do that that's fine but don't pretend it somehow makes the package more "reproducible", and it's not the job of upstream developers to help you with this.

@cwebber @civodul @mikegerwitz @ArneBab I'm not clear if you mean that you want to build every package from scratch yourself, perhaps because you don't trust anyone else? If so then I'm afraid that's always going to be a lot of work and that's just life; if you're willing to trust literally nobody, then every aspect of life becomes incredibly difficult.

@danielcassidy @cwebber @civodul @mikegerwitz First off: I don't trust anyone 100%, and I think it would be dumb to. I do trust most people not to do anything that hurts others just for the sake of it, but pressure exists in that direction — and under sufficient pressure almost anyone would cave. That suffices to be able to trust that most things just work, if the structure isn't exceptionally bad.

Sadly in building software lots of things are exceptionally bad.

@danielcassidy @cwebber @civodul @mikegerwitz Reproducible builds from the source up (ideally from the hardware up!) are not a far-off request, but the minimal requirement that makes it possible for at least one person to verify whether the software does what you see in the code.

That’s not to say that I would require everyone to provide that right now, because we inherit a lot of breakage from the past (that can’t be separated from great tools we need).

It is the direction we should take.

@ArneBab @cwebber @civodul @mikegerwitz I agree, and that is exactly why almost every npm package lists all of its build-time dependencies with version numbers and SHA hashes in its package-lock file, so that you can reconstruct that exact environment and verify the build if you want to.
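
(Concretely, those hashes are Subresource Integrity strings, usually "sha512-" plus the base64 of the raw digest, so downloaded tarballs can be checked against the lockfile with no npm in the loop. A sketch over a lockfileVersion 2/3 file; the tarballs/ directory layout is a hypothetical convention.)

```python
# Sketch: verify already-downloaded npm tarballs against the SRI "integrity"
# strings in package-lock.json ("sha512-" + base64 of the raw sha512 digest).
import base64
import hashlib
import json
from pathlib import Path

lock = json.loads(Path("package-lock.json").read_text(encoding="utf-8"))

for path, entry in lock.get("packages", {}).items():
    integrity = entry.get("integrity", "")
    resolved = entry.get("resolved", "")
    if not path or not resolved or not integrity.startswith("sha512-"):
        continue
    # Hypothetical convention: tarballs saved under tarballs/<URL basename>.
    tarball = Path("tarballs") / resolved.rsplit("/", 1)[-1]
    if not tarball.exists():
        continue
    digest = base64.b64encode(hashlib.sha512(tarball.read_bytes()).digest()).decode()
    print(f"{path}: {'ok' if 'sha512-' + digest == integrity else 'MISMATCH'}")
```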

@ArneBab @cwebber @civodul @mikegerwitz it's not perfect but it's improving every day and contributing to that effort makes more sense than moaning about dependency proliferation and trying to reproduce all that work in every distribution's native package manager.

@danielcassidy @cwebber @civodul @mikegerwitz What do you do when you call native software — or software written in another language?

@ArneBab @danielcassidy @civodul @mikegerwitz It's not just an issue of trust as in terms of "is this person good or bad" (ocaps will provide better protection than reproducibility there) it's also "do I have the ability to study and repair this when I need to?"

@cwebber @ArneBab @civodul @mikegerwitz agreed again! Every Linux distributor is worrying about reproducible builds and manually QAing every package when they would get more value putting that time and effort into building a proper, modern capability-based OS so that the risks of installing packages are lower to begin with.

@cwebber @ArneBab @civodul @mikegerwitz To be clear: Yes, I think it's a problem that to build jQuery I have to run 300 programs that all have near-root access to my computer.
No, I don't think the problem is the number of dependencies. I think the problem is that they're all run with all of my user permissions by default. Which is a MUCH easier problem to fix.

@danielcassidy @cwebber @civodul @mikegerwitz When I build stuff on Guix, those programs do not have near-root access (nor my user access). The final product typically does — though I’d very much like to have a Hurd system for that where I can start programs without capabilities and add during runtime what they need and only when they need it.

@danielcassidy @cwebber @civodul @mikegerwitz I like the initial steps Android did to segment access: Programs have to request what they need. What I see in flatpak is very annoying, though: half the programs need me to jump through hoops to do regular interaction.

@danielcassidy @ArneBab @mikegerwitz @cwebber Distros like Debian and Guix build software from source and make those builds reproducible. It allows users, collectively and individually, to ensure they’re running the software they think they run. It’s crucial security-wise and from the point of view of user freedom.

@danielcassidy @cwebber @civodul @mikegerwitz How do I know that someone built each component from scratch, if the system they use makes exactly that hard?

That said: what you describe is what you get when you use Guix with substitutes. With those you can actually check whether they were built from source. Trust grows from actually being able to check.

@civodul @mikegerwitz @ArneBab @cwebber this is also one of the reasons why #PeerTube, #Mastodon, and the like are almost absent from most #Distributions that follow the #GNU #FSDG. In summary:

Following the dependency graphs of these projects, the learning curve to understand their packaging schemes, and the fact that none of the language-specific repositories commit to the #FSDG all lead the #Distros to opt out of per-language package managers.
