Docker BuildKit feels way late in arriving, and slow builds cost every team a lot of time every day. In the age of the transpiler, this affects even non-compiled codebases. The preview flags for caching package manager builds are underwhelming and won't help much on CI systems.

CI-based container builds are harder to cache than builds on my laptop, and BuildKit won't help without coupling more tightly with the tools that actually build software. This article is a spec for how I would want that to work if I were building it.

TLDR: every package manager that wants to be fast on Docker should expose these three commands:

  • download-urls
  • unpack
  • build

And the Docker buildsystem should integrate with these commands to do the caching. A sketch of that contract follows; details below.
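To pin the contract down, here's a minimal sketch of that three-command surface as a Go interface. The names and signatures are hypothetical, not an existing API:

    package pkgmgr

    import "io"

    // Integration is the hypothetical surface a package manager would
    // expose to the Docker buildsystem.
    type Integration interface {
        // DownloadURLs lists every artifact the build will need, so the
        // buildsystem (not the package manager) can fetch and cache them.
        DownloadURLs(lockfile io.Reader) ([]string, error)

        // Unpack extracts already-downloaded packages into destDir without
        // running any install scripts; it is safe to cache per package.
        Unpack(downloadDir, destDir string) error

        // Build does everything else: install scripts, compilation, linking.
        Build(projectDir string) error
    }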

download-urls: give the build cache a list of URLs

A lot of package managers fetch over HTTP (npm and apt-get, for example, both do). And many clouds keep in-cloud caches of these downloads (Amazon runs its own yum repos, for example).

Node.js even records HTTPS URLs in its package-lock.json (look for the resolved key in the example below).
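A trimmed package-lock.json entry looks roughly like this (integrity hash elided):

    {
      "dependencies": {
        "lodash": {
          "version": "4.17.21",
          "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
          "integrity": "sha512-…"
        }
      }
    }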

Every package manager that wants to integrate with Docker should have a command that generates the list of URLs it needs downloaded. Docker, in turn, can fetch them and drop the results in a specified mount directory.
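As a concrete sketch, here's roughly what download-urls could look like for npm, written against the old lockfileVersion 1 layout where entries nest under dependencies. It's hypothetical, not a real npm subcommand:

    // download-urls for npm: walk package-lock.json and print every
    // "resolved" URL, one per line, for the buildsystem to fetch.
    package main

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    // lockDep mirrors the subset of a lockfile entry we care about.
    type lockDep struct {
        Resolved     string             `json:"resolved"`
        Dependencies map[string]lockDep `json:"dependencies"`
    }

    type lockFile struct {
        Dependencies map[string]lockDep `json:"dependencies"`
    }

    func printURLs(deps map[string]lockDep) {
        for _, d := range deps {
            if d.Resolved != "" {
                fmt.Println(d.Resolved)
            }
            printURLs(d.Dependencies) // nested deps carry their own URLs
        }
    }

    func main() {
        raw, err := os.ReadFile("package-lock.json")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        var lock lockFile
        if err := json.Unmarshal(raw, &lock); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        printURLs(lock.Dependencies)
    }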

The Docker buildsystem should cache these downloads as aggressively as it can, but on-box caching isn't always realistic or performant for CI, because each build may land on a different box. And caching may not even be necessary when your cloud has a package CDN close at hand. Starting with URLs and letting the buildsystem decide is a good compromise.

unpack: extract packages to the filesystem, but don't modify anything

Using Node.js as an example: most of the time, when you npm install, you're touching an isolated folder inside node_modules and maybe dropping a file into node_modules/.bin. (By 'isolated' I mean that no other npm install in your build will write to those paths.) The same pattern likely holds for some OS-level package managers too, like dpkg/apt.

The buildsystem's package manager integration should distinguish between an 'unpack install' and a 'compile install' – these should be two separate steps. The 'unpack' step can then be cached aggressively: each package unpacks into its own filesystem layer, and the layers get merged and checked for path collisions at the end.
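Here's a hedged sketch of that unpack step as a BuildKit LLB graph, built with the Go client (github.com/moby/buildkit/client/llb): each tarball is fetched and extracted into its own layer, then the layers are combined with a merge op. The URLs are placeholders, and the collision check would be a separate pass over the merged tree:

    package main

    import (
        "context"
        "os"

        "github.com/moby/buildkit/client/llb"
    )

    // unpackOne turns one package tarball into its own filesystem layer.
    func unpackOne(url string) llb.State {
        tarball := llb.HTTP(url, llb.Filename("pkg.tgz"))
        // Extract into an empty mount so the layer holds only this package.
        return llb.Image("docker.io/library/busybox:latest").
            Run(
                llb.Shlex("tar -xzf /in/pkg.tgz -C /out"),
                llb.AddMount("/in", tarball, llb.Readonly),
            ).
            AddMount("/out", llb.Scratch())
    }

    func main() {
        urls := []string{ // stand-in for the output of download-urls
            "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
            "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
        }
        layers := make([]llb.State, 0, len(urls))
        for _, u := range urls {
            layers = append(layers, unpackOne(u))
        }
        merged := llb.Merge(layers) // one cacheable layer per package
        def, err := merged.Marshal(context.Background())
        if err != nil {
            panic(err)
        }
        if err := llb.WriteTo(def, os.Stdout); err != nil { // pipe into buildctl
            panic(err)
        }
    }

Because each package is its own vertex, changing one dependency invalidates one layer instead of the whole tree.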

build: do everything else

Caching compilation will be hard. Systems that do this in real life typically checksum every file that is an input to the build: sources, built binaries, configs, and the build tools themselves. You can't just key off a version string; build graphs are too complicated, and versions aren't always meaningful.
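A minimal sketch of that idea: hash every input, including the compiler binary itself, into a content-addressed cache key. The file paths here are hypothetical:

    package main

    import (
        "crypto/sha256"
        "fmt"
        "io"
        "os"
        "sort"
    )

    // cacheKey hashes the build inputs in a stable order, so the same
    // inputs always produce the same key on any machine.
    func cacheKey(inputs []string) (string, error) {
        sort.Strings(inputs)
        h := sha256.New()
        for _, path := range inputs {
            f, err := os.Open(path)
            if err != nil {
                return "", err
            }
            fmt.Fprintf(h, "%s\x00", path) // the path itself is part of the key
            if _, err := io.Copy(h, f); err != nil {
                f.Close()
                return "", err
            }
            f.Close()
        }
        return fmt.Sprintf("%x", h.Sum(nil)), nil
    }

    func main() {
        key, err := cacheKey([]string{
            "src/main.c",      // a source file
            "build/config.mk", // build config
            "/usr/bin/cc",     // the compiler binary is an input too
        })
        if err != nil {
            panic(err)
        }
        fmt.Println("cache key:", key)
    }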

Advanced build tools like Bazel can export your build graph. Rust sort of can. In theory, tools like these could export LLB themselves so they don't have to rebuild every piece.

For this to work, BuildKit has to allow the LLB graph to be edited, either during the build or in a final prep step before it starts. Then the Docker buildsystem can orchestrate a bunch of mini-builds that know their inputs (including compiler binaries and packages / system libs) and can cache where appropriate.
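Continuing the LLB sketch from the unpack section, one such mini-build might look like the following: an exec whose only inputs are a pinned compiler image, the sources, and the dependency layer, so BuildKit can skip it whenever none of those change. The image and mount names are assumptions:

    package main

    import (
        "context"
        "os"

        "github.com/moby/buildkit/client/llb"
    )

    func main() {
        compiler := llb.Image("docker.io/library/gcc:13") // pin by digest in real use
        srcs := llb.Local("src")                          // the source checkout
        deps := llb.Local("deps")                         // e.g. the merged unpack layer

        // One mini-build: its only inputs are compiler, srcs, and deps,
        // so BuildKit can reuse the cached result whenever all three match.
        app := compiler.Run(
            llb.Shlex("gcc -O2 -o /out/app /src/main.c"),
            llb.AddMount("/src", srcs, llb.Readonly),
            llb.AddMount("/deps", deps, llb.Readonly),
        ).AddMount("/out", llb.Scratch())

        def, err := app.Marshal(context.Background())
        if err != nil {
            panic(err)
        }
        // Feed to buildctl with --local src=. --local deps=<unpack output>.
        if err := llb.WriteTo(def, os.Stdout); err != nil {
            panic(err)
        }
    }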

Why does this matter

Slow tools kill you slow. If your programmers spend their time waiting for a computer, they will do one or more of: switch tools, switch teams, or work slowly. I've worked at places where builds were uncacheable and circular and a small change could eat half a day – people mentally check out.

Computers aren’t getting that much faster anymore and software is getting slower. Builds are also doing more as the software supply ecosystem grows, tests get more elaborate, and typesystems get clunkier (and somewhat more powerful).

And everyone is a programmer now, which means fixing slow builds is a force multiplier. If your builds take an hour to finish, and every dev runs two a day and has to wait on them, a build speedup is like doubling your head count. Chew on that.