Announcing the Sigstore Transparency Log Research Dataset

Hayden Blauzvern, Eve Martin-Jones, Google Open Source Security Team

We’re pleased to announce a new BigQuery public dataset, rekor. The rekor dataset is an easily queryable mirror of the public good instance of Sigstore’s transparency log, Rekor.

Sigstore is an open source project for improving software supply chain security. The Sigstore framework and tooling empower software developers and consumers to securely sign and verify software artifacts. For example, deps.dev uses Sigstore to verify the provenance of software published to upstream package registries.

Signing events are recorded in Rekor, an append-only transparency log. Software producers can audit the log to confirm that the recorded signature metadata is what they expect whenever their identities or keys are used to sign artifacts. Software consumers rely on cryptographic proofs of log inclusion to verify that software artifacts are recorded in the log.
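
To make the idea of an inclusion proof concrete, here is a minimal Python sketch of how a consumer might check one against a Merkle tree root. It assumes RFC 6962-style SHA-256 hashing; that choice, and all function names below, are illustrative assumptions rather than details taken from this post. The sketch covers only the hash-path check. A real verifier would also need to validate the log's signed tree head, which is omitted here.

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 domain separation: leaves are prefixed with 0x00.
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # Interior nodes are prefixed with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()

def _split(n: int) -> int:
    # Largest power of two strictly less than n (for n >= 2).
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def merkle_root(leaves: list) -> bytes:
    # Hypothetical helper: root hash of a non-empty list of leaf values.
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

def inclusion_path(index: int, leaves: list) -> list:
    # Hypothetical helper: sibling hashes from the leaf up to the root.
    if len(leaves) == 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return inclusion_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return inclusion_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]

def verify_inclusion(index: int, tree_size: int, leaf: bytes,
                     path: list, root: bytes) -> bool:
    # Recompute the root from the leaf and its audit path, then compare.
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(leaf)
    for p in path:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            # Current node is a right child (or the last node on its level).
            r = node_hash(p, r)
            if not fn & 1:
                # Skip levels where the node has no sibling.
                while fn != 0 and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

For example, with seven made-up entries, verify_inclusion(6, 7, entries[6], inclusion_path(6, entries), merkle_root(entries)) succeeds, while the same call with a tampered entry fails. The proof holds only one hash per tree level, so it grows logarithmically with the size of the log.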

This dataset will allow open source supply chain researchers and other interested parties to gather aggregate data on how artifacts are being signed with Sigstore, answering questions like “what is the most common CI provider used to sign artifacts?” or “how many artifacts are signed each month?”

Sample queries can be found at the BigQuery marketplace listing. You can read more about the dataset on the Sigstore blog.

If you have any questions or feedback, please contact us at rekor-dataset@google.com.