Can we get the benefits of transitive dependencies without undermining security?
If anything is going to put capabilities into the programmer ecosystem, I think it's this problem.
The neat thing about this particular problem is that you can do some really coarse things and get some immediate benefit. Capabilities in their original form, and perhaps their truest form, carry down the call stack, so that in the most sophisticated implementations code can say things like "everything I call may only append to files in this specific subtree". But you could do something coarser with libraries, asserting things like "these libraries cannot access the network", and get big wins from some simple assertions. If you're a library for turning jpegs into their pixels, you don't need network access, and with only a bit more work, you shouldn't even need filesystem access (get handed open files if you need them, but no ability to spontaneously open or create files).
This would not be a complete solution, or perhaps even an adequate solution, but it would be a great bang-for-the-buck solution, and a great way to step into that world and immediately get benefits without requiring the entire world to immediately rewrite everything to use granular capabilities everywhere.
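To make the jpeg example concrete, here's a minimal Rust sketch of the "hand the library a file, don't let it open anything" style; the `Decoder` type and the fake pixel output are made up for illustration:

```rust
use std::io::Read;

/// A hypothetical jpeg decoder that is handed an already-open source of bytes.
/// It never sees a path, so it has no way to wander the filesystem or the network.
pub struct Decoder<R: Read> {
    source: R,
}

impl<R: Read> Decoder<R> {
    pub fn new(source: R) -> Self {
        Decoder { source }
    }

    /// Reads the whole stream and returns raw bytes standing in for pixels.
    /// (A real decoder would parse jpeg markers here.)
    pub fn decode(mut self) -> std::io::Result<Vec<u8>> {
        let mut pixels = Vec::new();
        self.source.read_to_end(&mut pixels)?;
        Ok(pixels)
    }
}

fn main() -> std::io::Result<()> {
    // The *caller* decides what the library may read: a file it opened itself,
    // an in-memory buffer, a network stream. The library can't escalate.
    let input = std::io::Cursor::new(vec![0xFF, 0xD8, 0xFF, 0xD9]); // fake jpeg bytes
    let pixels = Decoder::new(input).decode()?;
    println!("decoded {} bytes", pixels.len());
    Ok(())
}
```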
I'm a bit surprised that browsers did not get a mention here. They are one of the big examples of how an application can be split into a gazillion small processes along trust boundaries, while also being a pretty performance-sensitive application.
As for transitive deps, I have some hopes for more "distro"-like models to arise. In particular I see clear parallels between how traditional Linux distros work and how large FAANG-style monorepos work. So maybe we could see more of these sorts of platforms/collections of curated libraries that avoid deep transitive dep trees by hoisting all the deps into one cohesive global dependency set that has at least some more eyes on it.
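In the Rust world, Cargo's workspace dependency inheritance is a small step in that direction: one hoisted, curated list that every crate in the repo draws from. The crates and versions below are just placeholders:

```toml
# Root Cargo.toml of a monorepo-style workspace: one shared, reviewed dependency list.
[workspace]
members = ["app", "tools/*"]

[workspace.dependencies]
serde = { version = "1.0", features = ["derive"] }
tokio = { version = "1", features = ["rt-multi-thread"] }

# In a member crate's Cargo.toml, deps are then inherited from that list:
# [dependencies]
# serde = { workspace = true }
```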
“All this has led me, slowly and reluctantly, to the conclusion that our dependency-heavy approach to building software is fundamentally incompatible with security.”
This vulnerability seemed so apparent, though. I used to object to all the dependencies we were adding when we had no idea how they were implemented, who implemented them, or why. I always got pushback that I was holding the team back from moving quickly and using the latest frameworks.

It is important to consider that many of these security issues derive from the way new developers are taught: strap together a bunch of libraries of which you only use 5 to 10%, resulting in a massive application with tons of unknown threats and, by the way, poor performance.

I mostly blame the largest tech companies, like Google and Meta, that turn out large frameworks that are overkill for 90% of the development out there. New developers held these companies in extremely high regard and considered their technology to be state of the art in every sense. Yes, it was state of the art for solving the problems Google and Meta needed to solve, but the adoption of these technologies by small startups and other companies has now made the dependency explosion problem endemic.
The worst violators of this principle that I've noticed are the proliferating web development frameworks, like Rails, React, etc. Further, it is ironic that these platforms, the web platform in particular, are promoted as more secure relative to the old ActiveX model of running binaries directly in the browser. However, I would rather run a trusted binary with 5k lines of code that my team has fully vetted in a browser or app than 1200 libraries and millions of lines of code to accomplish the same task.
Perhaps this is a good use for AI: scanning source code and library dependencies for security threats or potential security threats. Another solution would be to break libraries out into smaller components that perform specific functional tasks. These would be easier to validate and would also result in smaller applications. Obviously there is no easy solution, and running binaries in your browser is not a great answer either. However, we as developers need to consider the trade-off between the danger of, say, running a compact native app versus the "safety" of using jack-of-all-trades frameworks that include millions of lines of code.
This is an interesting article to have up alongside the SLAP and FLOP vulnerabilities. I like capabilities as much as the next programmer, but my gut tells me that process boundaries, or other sorts of hardware-enforceable memory boundaries, are only going to get more important, not less, as chips get faster and untrusted code gets more widely understood.
> Addressing the first of these points requires at least somewhat rethinking of hardware and operating systems
The (vaporware) Mill CPU design has "portals" that are like fat function calls with byte-granularity MMU capability protection, limiting mutual exposure between untrusting bits of code on opposite sides of the portal. Think of it as cheap, function-call-like syscalls into your kernel, but also usable for microkernel boundaries and pure userspace components.
https://www.youtube.com/watch?v=5osiYZV8n3U
Of course, we can't have nice things and are stuck with x86/arm/riscv, where it seems nothing better than CHERI is realistic and such security boundaries will suffer relatively enormous TLB-switching overheads.
Side note: I think the serde_yaml debacle was so predictable. As much as I admire Dtolnay, his choice to archive the repository and push a deprecated version to crates.io, which made everybody scramble for an alternative "maintained" crate, is on him. You can say whatever you want about checkbox security, but most people still have to deal with it, if only to make the tooling shut up so they can do their work.
Maybe the Rust Foundation should take over more of those fundamental crates when maintainers are not willing or able to continue working on them. A similar problem happened with the zip crate, which was transferred rather quickly to a new maintainer, and most people still use a very old pre-transfer version.
There are already SBOM (software bill of materials) standards, CycloneDX and SPDX, in development and in use. There are also VEX and SLSA.
The idea is that if everyone does the legwork to check their dependencies, you can trust your dependencies because they checked theirs.
It is still trust, but instead of implicitly hoping ("hey, you sure you checked your dependencies, and didn't just npm install a library some kid from Alaska created, who pulled in his dependency from a kid in Montenegro?"), it is stated explicitly.
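For anyone who hasn't looked at one, an SBOM is just a machine-readable inventory of components. A roughly minimal CycloneDX document looks something like this (the component names and versions are only illustrative, and field details vary by spec version):

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "version": 1,
  "components": [
    {
      "type": "library",
      "name": "serde",
      "version": "1.0.200",
      "purl": "pkg:cargo/serde@1.0.200"
    },
    {
      "type": "library",
      "name": "left-pad",
      "version": "1.3.0",
      "purl": "pkg:npm/left-pad@1.3.0"
    }
  ]
}
```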
Including random libraries just because we can and they had enough stars on GitHub was a bad idea already, but nowadays it is becoming an offense, and rightly so.
I think there is only one proper solution to the security problem of transitive dependencies: an open database of vetted/rejected libraries, plus tooling that helps pull dependency versions according to configured rules.
For example, by trusting several big players such as the Rust Foundation, Servo, Mozilla, Google, Facebook, etc. (developers will decide for themselves whom exactly they trust), who would manually review the dependencies they use, we would be able to cover the most important parts of the ecosystem, and developers would be able to review the more minor dependencies in their projects themselves. cargo-crev + cargo-audit come somewhat close to what is needed; everything else is a matter of adoption.
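As a toy illustration of "pull dependency versions according to configured rules", the sketch below scans a Cargo.lock for package names and flags anything outside a hardcoded vetted set. The allowlist and the naive line-based parsing are purely illustrative; real tooling like cargo-crev works from signed, shareable reviews rather than a hardcoded list:

```rust
use std::collections::HashSet;
use std::fs;

fn main() -> std::io::Result<()> {
    // Hypothetical vetted set; in practice this would come from trusted reviewers' proofs.
    let vetted: HashSet<&str> = ["serde", "serde_derive", "libc"].into_iter().collect();

    let lockfile = fs::read_to_string("Cargo.lock")?;
    for line in lockfile.lines() {
        // Cargo.lock lists each package with a line like: name = "serde"
        if let Some(rest) = line.trim().strip_prefix("name = \"") {
            let name = rest.trim_end_matches('"');
            if !vetted.contains(name) {
                println!("unvetted dependency: {name}");
            }
        }
    }
    Ok(())
}
```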
Capabilities and other automated tools can help immensely with manual reviews, but can not replace them completely, especially in compiled languages like Rust.
>It’s easy for me to forget that the trust I place in those 20 direct dependencies is extended unchanged to the 161 indirect dependencies.
Unfortunately, when people mention such numbers they commonly do not account for the difference between "number of libraries in the dependency tree" and "number of groups responsible for the dependency tree". In practice, the latter is often significantly smaller than the former. In other words, many projects (at least in the Rust world) lean towards the "micro-crate" approach, meaning that one group may be responsible for 20 crates in the dependency tree, which does not make the security risks 20x bigger.
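A toy way to see the difference between the two counts; the crate-to-maintainer mapping below is simplified and only illustrative:

```rust
use std::collections::{HashMap, HashSet};

fn main() {
    // Illustrative mapping from crates in a dependency tree to the group maintaining them.
    let maintainer_of: HashMap<&str, &str> = HashMap::from([
        ("serde", "dtolnay"),
        ("serde_derive", "dtolnay"),
        ("serde_json", "dtolnay"),
        ("syn", "dtolnay"),
        ("quote", "dtolnay"),
        ("rand", "rust-random"),
        ("rand_core", "rust-random"),
    ]);

    let groups: HashSet<&str> = maintainer_of.values().copied().collect();
    println!(
        "{} crates in the tree, but only {} groups to trust",
        maintainer_of.len(),
        groups.len()
    );
}
```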
I do not foresee the status quo improving on this front. If anything, it will continue to get worse until we are forced to deal with a massive problem that will make the security crises we’ve dealt with up til now look like a walk in the park.
I really like this article. I do think it's useful to consider that the unit of isolation ("process") of the cloud era is a VM or container, and that the major clouds do have some sort of permissions model.
A lot of these ideas are part of the design of Fuchsia (https://fuchsia.dev/).
I'm going to take a wild guess that WASM is orders of magnitude slower for IPC than raw unix, which is unfortunate because it seems like some of the most promising fertile soil for a security-first capability model.
Does that disqualify it as a potential path to a solution? How fine-grained would these components realistically need to be?
We use https://www.simplify4u.org/pgpverify-maven-plugin and a private PGP signing key allowlist, bound to an artifact namespace. This immediately cuts down on unknown dependencies creeping into our build.
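Roughly, the pom.xml fragment looks like this; I'm writing the goal and parameter names from memory, so treat it as a sketch and double-check against the plugin docs:

```xml
<!-- Sketch only: verify goal and parameter names against the pgpverify-maven-plugin docs. -->
<plugin>
  <groupId>org.simplify4u.plugins</groupId>
  <artifactId>pgpverify-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- Maps groupId:artifactId patterns to the PGP fingerprints we allow. -->
    <keysMapLocation>${project.basedir}/pgp-keys-map.list</keysMapLocation>
  </configuration>
</plugin>
```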
My one big problem with the title, and the way this blog post goes about it, is that it assumes infinite scaling: in performance, correctness, size, security. What works at a small scale is ludicrous at a huge one.
There is an assumption that building your blog and building your multi-billion-dollar SaaS should involve transferable skills. It's like expecting the person designing a shack and the person designing the next Fort Knox to use the same plans, materials, and people. Either you get extremely overbuilt shacks, with vault doors, separate HVAC, and OSHA regulations, that take decades to build and cost several billion dollars, or you get a Fort Knox where anyone can kick down the vault doors and steal the money.
If your blog, or your 72h hackathon game, takes 3000 dependencies, and maybe one of them is malicious (which is low probability), who cares?
If your multi-million-dollar SaaS has 3000 dependencies, yeah, it's time to slim it down. Granted, no one wants to do this because it costs money and takes time away from shipping another feature.