Build transparency

Introduction

When we install a program, we usually trust that the software binaries that we download correspond to the program that we want. But how do we know that we aren't installing something else, for example something potentially malicious?

Typically, we have confidence in those binaries because we download them from a trusted provider. This trusted provider might add a digital signature and a cryptographic hash to the binaries. With them, we can check that what we download really comes from this provider and was not altered along the way, which guards against man-in-the-middle attacks.
The trusted provider might also tell us the build inputs, that is, the source code, programs, and commands that it supposedly used to construct the binaries, which are the build output.
But if the provider itself is compromised, it can still produce arbitrary outputs, and a signed and hashed binary, even one that undoubtedly came from the provider, could correspond to anything.
In summary, we know what goes in and what comes out, but what happens at the provider, and whether those inputs and outputs are actually related, is unclear.
The provider itself is thus a single point of failure in the chain of trust that ensures the integrity of software supply chains.
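
The hash check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example bytes are hypothetical, and signature verification is left out. It shows what the check proves, namely that the downloaded bytes match the hash the provider published, and nothing about how the provider produced them.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_hash(data: bytes, published_hex: str) -> bool:
    """Check downloaded bytes against the hash published by the provider.

    A match only shows that the bytes are the ones the provider published.
    It says nothing about which build inputs actually produced them.
    """
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_hex(data), expected)


# Hypothetical example data standing in for a downloaded binary.
binary = b"example build output"
published = sha256_hex(binary)

print(matches_published_hash(binary, published))                 # True
print(matches_published_hash(binary + b" tampered", published))  # False
```

A compromised provider would simply publish the hash of whatever it built, so this check passes for a malicious binary just as it does for an honest one.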

Trustix - build transparency reference implementation

Trustix is a tool that compares build outputs for a given build input across a group of independent providers.
Naively, we would expect the same build input to always yield the same build output.
Multiple providers can then vouch for the content of such a reproducible build, and a user can, for example, trust the consensus of the provider group.
In this case, Trustix can identify corrupted providers, whose outputs deviate from those of the rest of the group, and replace them along with any binaries that they distributed.
In reality, however, the situation is more complex, because builds are very often non-reproducible even when nobody has maliciously modified them.
For example, dates, hard-coded paths, and even random numbers can end up in a build output, and race conditions during the build can change it.
The content of many binaries is therefore expected to differ from one build to the next.
Even then, Trustix can still compare binaries and automatically track build reproducibility at scale.
Tracking and identifying non-reproducible builds is the first step toward comparing and verifying them as well.
Thus, we see Trustix as the first step towards an entire decentralized software supply chain that can securely distribute software without any central corruptible entity.
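
To make the comparison idea concrete, here is a minimal Python sketch of how output hashes reported by several providers could be compared. It is not Trustix's actual API or data model: the `Reports` type, the function names, the majority threshold, and the example data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Hypothetical data model: provider name -> output hash it reported
# for one particular build input.
Reports = Dict[str, str]


def consensus(reports: Reports, threshold: float = 0.5) -> Tuple[Optional[str], List[str]]:
    """Return (agreed_hash, dissenting_providers) for one build input.

    agreed_hash is None when no single hash is reported by more than
    `threshold` of the providers; in that case nobody is singled out.
    """
    if not reports:
        return None, []
    top_hash, top_votes = Counter(reports.values()).most_common(1)[0]
    if top_votes / len(reports) <= threshold:
        return None, []
    dissenters = sorted(p for p, h in reports.items() if h != top_hash)
    return top_hash, dissenters


def reproducibility_rate(all_reports: Dict[str, Reports]) -> float:
    """Fraction of build inputs for which every provider reported the same hash."""
    if not all_reports:
        return 0.0
    reproducible = sum(
        1 for reports in all_reports.values() if len(set(reports.values())) == 1
    )
    return reproducible / len(all_reports)


# Hypothetical reports from three providers for three build inputs.
logs = {
    "input-a": {"p1": "aaa", "p2": "aaa", "p3": "aaa"},  # all agree
    "input-b": {"p1": "bbb", "p2": "bbb", "p3": "ccc"},  # one provider deviates
    "input-c": {"p1": "xxx", "p2": "yyy", "p3": "zzz"},  # no agreement at all
}

for build_input, reports in logs.items():
    print(build_input, consensus(reports))
print(reproducibility_rate(logs))
```

A deviating hash on its own does not tell us whether a provider is compromised or the build is simply non-reproducible, which is why tracking reproducibility per build input matters before any trust decision is drawn from a disagreement.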