Contributed by: Andrew Larsen, SAS

Event Provenance Registry (EPR) is the culmination of several years of effort at SAS to move from large-ship events to CI/CD. We built the first version internally to facilitate CI/CD in a complex, aging build system. The result enables SAS to build, package, scan, promote, and ship thousands of artifacts daily.
Want to learn more about how this project came about? Read SAS’s User Story.
How it works
EPR is simple in its operation but requires some explanation. At a high level, EPR collects events based on tasks done by the pipeline and sends them to a message queue. Other services that we call “watchers” monitor the message queue and act when they see events of interest. EPR can gate events by certain criteria as well. EPR supports Redpanda and Kafka as message queues. For the examples below, I will assume we’re using Redpanda.
To facilitate event collection, EPR has three data structures of note: events, event-receivers, and event-receiver-groups. I will also refer to the latter two as “receivers” and “groups” respectively for brevity.
NVRPP
NVRPP is an unpronounceable acronym that you’ll need to be familiar with to understand how EPR works. It stands for:
- Name
- Version
- Release
- Package
- Platform ID
Each of these fields is just a string, though we strongly recommend you impose some standards for how each is formatted, depending on your situation. NVRPP is based on the NEVRA from Fedora. It allows us to represent most types of artifacts that might flow through our pipeline. Events that have matching NVRPPs are associated with the same artifact. This allows us to trace the flow of any artifact through our pipeline, so long as events are posted at each step.
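As a rough illustration of how matching NVRPPs tie events to one artifact, here is a minimal Python sketch. The NVRPP class and the nvrpp_of helper are hypothetical names made up for this example, not part of EPR; the dictionary keys mirror the five fields listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NVRPP:
    """The five string fields that identify an artifact (illustrative type)."""
    name: str
    version: str
    release: str
    package: str
    platform_id: str


def nvrpp_of(event: dict) -> NVRPP:
    """Pull the NVRPP out of an event-shaped dict (hypothetical helper)."""
    return NVRPP(
        name=event["name"],
        version=event["version"],
        release=event["release"],
        package=event["package"],
        platform_id=event["platform_id"],
    )


build = {"name": "foo", "version": "1.0.1", "release": "2023.11.16",
         "package": "oci", "platform_id": "aarch64-gnu-linux-7", "success": True}
scan = dict(build, success=False)      # a later step for the same artifact
other = dict(build, version="1.0.2")   # a different artifact

same_artifact = nvrpp_of(build) == nvrpp_of(scan)
matches_other = nvrpp_of(build) == nvrpp_of(other)
```

Because the dataclass is frozen, it is hashable and compares field by field, so an NVRPP can also serve as a dictionary key when grouping events by artifact.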
Event Receivers
Event receivers are data structures stored within EPR that represent some kind of action with no dependencies (e.g., a build, a test run, packaging an artifact, or deploying a binary). They are classified by their name, type, and version. You might name a receiver after the action it represents, such as golang-build-complete. Types are arbitrary, but work better with a comprehensible structure. Enter the CDEvents spec. This spec provides a standard on which to build interoperability between CI/CD systems. The following examples take advantage of CDEvents.
Receivers may have multiple events that correspond with them. Any event associated with a receiver must have a payload that complies with the schema defined on the receiver. This provides some guarantees about the kind of data you can expect from events going to any given receiver. The receiver defined below requires a CDEvents schema for any incoming events.
{
  "name": "artifact-packaged",
  "type": "dev.cdevents.artifact.packaged.0.1.1",
  "version": "1.0.0",
  "description": "CDEvents Artifact Packaged",
  "enabled": true,
  "schema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://cdevents.dev/0.4.0-draft/schema/artifact-packaged-event",
    "properties": {
      "context": {
        "properties": {
          "version": {
            "type": "string",
            "minLength": 1
          },
          "id": {
            "type": "string",
            "minLength": 1
          },
          "source": {
            "type": "string",
            "minLength": 1,
            "format": "uri-reference"
          },
          "type": {
            "type": "string",
            "enum": [
              "dev.cdevents.artifact.packaged.0.1.1"
            ],
            "default": "dev.cdevents.artifact.packaged.0.1.1"
          },
          "timestamp": {
            "type": "string",
            "format": "date-time"
          }
        },
        "additionalProperties": false,
        "type": "object",
        "required": [
          "version",
          "id",
          "source",
          "type",
          "timestamp"
        ]
      },
      "subject": {
        "properties": {
          "id": {
            "type": "string",
            "minLength": 1
          },
          "source": {
            "type": "string",
            "minLength": 1,
            "format": "uri-reference"
          },
          "type": {
            "type": "string",
            "minLength": 1,
            "enum": [
              "artifact"
            ],
            "default": "artifact"
          },
          "content": {
            "properties": {
              "change": {
                "properties": {
                  "id": {
                    "type": "string",
                    "minLength": 1
                  },
                  "source": {
                    "type": "string",
                    "minLength": 1,
                    "format": "uri-reference"
                  }
                },
                "additionalProperties": false,
                "type": "object",
                "required": [
                  "id"
                ]
              }
            },
            "additionalProperties": false,
            "type": "object",
            "required": [
              "change"
            ]
          }
        },
        "additionalProperties": false,
        "type": "object",
        "required": [
          "id",
          "type",
          "content"
        ]
      },
      "customData": {
        "oneOf": [
          {
            "type": "object"
          },
          {
            "type": "string",
            "contentEncoding": "base64"
          }
        ]
      },
      "customDataContentType": {
        "type": "string"
      }
    },
    "additionalProperties": false,
    "type": "object",
    "required": [
      "context",
      "subject"
    ]
  }
}
When an event posts to a receiver, EPR will emit a message to Redpanda.
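To make the schema guarantee concrete, the sketch below checks a payload against a cut-down version of the receiver schema. It is only an illustration: the validate function is a hypothetical helper that understands a small subset of JSON Schema (type, required, enum, minLength, and additionalProperties), and it says nothing about how EPR performs validation internally. For real use, a complete validator such as the third-party jsonschema package would be the safer choice.

```python
def validate(value, schema, path="payload"):
    """Return a list of problems found; an empty list means the value passed.

    Handles only a small subset of JSON Schema, enough for illustration.
    """
    errors = []
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        if schema.get("additionalProperties") is False:
            for key in value:
                if key not in props:
                    errors.append(f"{path}.{key}: unexpected field")
        for key, sub in props.items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif expected == "string":
        if not isinstance(value, str):
            return [f"{path}: expected string"]
        if len(value) < schema.get("minLength", 0):
            errors.append(f"{path}: shorter than minLength")
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: not one of {schema['enum']}")
    return errors


# A trimmed stand-in for the receiver schema, not the full CDEvents schema.
schema = {
    "type": "object",
    "required": ["context", "subject"],
    "additionalProperties": False,
    "properties": {
        "context": {
            "type": "object",
            "required": ["id", "type"],
            "properties": {
                "id": {"type": "string", "minLength": 1},
                "type": {"type": "string",
                         "enum": ["dev.cdevents.artifact.packaged.0.1.1"]},
            },
        },
        "subject": {"type": "object"},
    },
}

good = {"context": {"id": "271069a8",
                    "type": "dev.cdevents.artifact.packaged.0.1.1"},
        "subject": {}}
bad = {"context": {"id": "", "type": "artifact.built"}, "extra": 1}

good_errors = validate(good, schema)
bad_errors = validate(bad, schema)
```

The second payload fails on four counts: the subject is missing, extra is not an allowed field, the context id is empty, and the context type is not the one the receiver accepts.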
Events
Events are a record of some action that took place in your pipeline and whether it was successful. Each event contains an NVRPP and is linked to a receiver by way of an ID. Events are strictly formatted at the root level, with a free-form JSON payload field that is validated against the schema of its receiver. Each event also contains a Boolean success field that records whether the action succeeded.
When EPR receives an event, it posts a message and some receiver and group data to Redpanda. Downstream watchers can then consume these messages and take their own actions. A common use case is watchers matching messages based on the success field of an event (and by extension, the message). This allows you to take different actions depending on whether an event passed or failed. For example, you could open a ticket against a team if their event to a receiver of type artifact.scanned had success=false. If we post an event for the receiver above, it might look something like this:
{
  "name": "foo",
  "version": "1.0.1",
  "release": "2023.11.16",
  "platform_id": "aarch64-gnu-linux-7",
  "package": "oci",
  "description": "packaged oci image foo",
  "payload": {
    "context": {
      "version": "0.4.0-draft",
      "id": "271069a8-fc18-44f1-b38f-9d70a1695819",
      "source": "/event/source/123",
      "type": "dev.cdevents.artifact.packaged.0.1.1",
      "timestamp": "2023-03-20T14:27:05.315384Z"
    },
    "subject": {
      "id": "pkg:golang/mygit.com/myorg/myapp@234fd47e07d1004f0aed9c",
      "source": "/event/source/123",
      "type": "artifact",
      "content": {
        "change": {
          "id": "myChange123",
          "source": "my-git.example/an-org/a-repo"
        }
      }
    }
  },
  "success": true,
  "event_receiver_id": "01HQK4JD53RYX04HZTMTEYBDTX"
}
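For readers who want to try posting such an event from a script, the following Python sketch builds the HTTP request using only the standard library. The server address and the /api/v1/events path are assumptions for illustration; check your EPR deployment for the actual endpoint. The request is constructed but deliberately not sent.

```python
import json
import urllib.request

EPR_URL = "http://localhost:8042"  # assumption: address of a local EPR server
EVENTS_PATH = "/api/v1/events"     # assumption: REST path for creating events


def build_event_request(event: dict) -> urllib.request.Request:
    """Wrap an event dict in a JSON POST request (hypothetical helper)."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        EPR_URL + EVENTS_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


event = {
    "name": "foo",
    "version": "1.0.1",
    "release": "2023.11.16",
    "platform_id": "aarch64-gnu-linux-7",
    "package": "oci",
    "description": "packaged oci image foo",
    "payload": {
        "context": {
            "version": "0.4.0-draft",
            "id": "271069a8-fc18-44f1-b38f-9d70a1695819",
            "source": "/event/source/123",
            "type": "dev.cdevents.artifact.packaged.0.1.1",
            "timestamp": "2023-03-20T14:27:05.315384Z",
        },
        "subject": {
            "id": "pkg:golang/mygit.com/myorg/myapp@234fd47e07d1004f0aed9c",
            "source": "/event/source/123",
            "type": "artifact",
            "content": {
                "change": {
                    "id": "myChange123",
                    "source": "my-git.example/an-org/a-repo",
                }
            },
        },
    },
    "success": True,
    "event_receiver_id": "01HQK4JD53RYX04HZTMTEYBDTX",
}

request = build_event_request(event)
# To actually send it against a running server:
#     urllib.request.urlopen(request)
```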
Event Receiver Groups
Event receiver groups can be thought of as gates that control whether an artifact advances through the pipeline. Each group comprises multiple receivers. Like receivers, groups can cause the generation of Redpanda messages. However, they only do this if each receiver has an event with a matching NVRPP where success=true. Since there may be multiple events of varying successes per receiver, only the most recent is considered. This allows you to run many tasks in parallel, but only advance your artifact through the pipeline once all its tasks have completed successfully. Groups don’t match exactly with the CDEvents spec, but we’ll still use the same event types for our example. Normally, you would define more than one receiver ID per group; this example uses a single ID for simplicity.
{
  "name": "release-checks",
  "type": "dev.cdevents.artifact.published.0.3.0-draft",
  "version": "3.3.3",
  "description": "Send an event to release our application if all pipeline tasks have passed.",
  "enabled": true,
  "event_receiver_ids": [
    "01JBX00KBSWSKPPQXQSRWRK93F"
  ]
}
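The gating rule described above (every receiver in the group needs a most-recent event with a matching NVRPP and success=true) can be sketched in a few lines of Python. This is an illustration of the rule as stated, not EPR's implementation: group_passes is a hypothetical function, the receiver IDs are shortened placeholders, the NVRPP is held as a plain tuple, and created_at is a simple integer rather than a timestamp.

```python
def group_passes(receiver_ids, events, nvrpp):
    """True when every receiver's most recent event for nvrpp succeeded."""
    latest = {}
    for event in sorted(events, key=lambda e: e["created_at"]):
        if event["nvrpp"] == nvrpp:
            # Later events overwrite earlier ones for the same receiver.
            latest[event["event_receiver_id"]] = event
    return all(
        rid in latest and latest[rid]["success"] for rid in receiver_ids
    )


FOO = ("foo", "1.0.1", "2023.11.16", "oci", "aarch64-gnu-linux-7")
group = ["scan", "test", "sign"]  # placeholder receiver IDs

events = [
    {"event_receiver_id": "scan", "nvrpp": FOO, "success": False, "created_at": 1},
    {"event_receiver_id": "test", "nvrpp": FOO, "success": True, "created_at": 2},
    {"event_receiver_id": "sign", "nvrpp": FOO, "success": True, "created_at": 3},
]
blocked = group_passes(group, events, FOO)   # the scan failed

# A rerun of the scan passes; only this newest scan event is considered.
events.append(
    {"event_receiver_id": "scan", "nvrpp": FOO, "success": True, "created_at": 4}
)
released = group_passes(group, events, FOO)

# A receiver with no event at all for this NVRPP keeps the gate closed.
incomplete = group_passes(group + ["deploy"], events, FOO)
```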
Watchers
Watchers are applications (typically microservices) that watch the EPR Redpanda topic for messages and then take some action. They use matching logic, defined in EPR’s SDK (Software Development Kit), to determine which messages to process. You can match on NVRPP, type, and success. We’ve written about a dozen watchers internally that do a variety of things. One of our more popular watchers fires webhooks if matching criteria are met. It is most often used to trigger Jenkins jobs, acting as glue between Jenkins and other systems. Another popular watcher creates Jira tickets when messages are matched. We use this one heavily as part of our security automation to open security issues against various teams when problems are detected.
The full content of a message that watchers look for is similar to this:
{
  "success": true,
  "id": "01JBX0GZ4VE1RZB53C37R65Q8Q",
  "specversion": "1.0",
  "type": "dev.cdevents.artifact.packaged.0.1.1",
  "source": "epr",
  "api_version": "v1",
  "name": "foo",
  "version": "1.0.1",
  "release": "2023.11.16",
  "platform_id": "aarch64-gnu-linux-7",
  "package": "oci",
  "data": {
    "events": [
      {
        "id": "01JBX0GZ4VE1RZB53C37R65Q8Q",
        "name": "foo",
        "version": "1.0.1",
        "release": "2023.11.16",
        "platform_id": "aarch64-gnu-linux-7",
        "package": "oci",
        "description": "packaged oci image foo",
        "payload": {
          "context": {
            "version": "0.4.0-draft",
            "id": "271069a8-fc18-44f1-b38f-9d70a1695819",
            "source": "/event/source/123",
            "type": "dev.cdevents.artifact.packaged.0.1.1",
            "timestamp": "2023-03-20T14:27:05.315384Z"
          },
          "subject": {
            "id": "pkg:golang/mygit.com/myorg/myapp@234fd47e07d1004f0aed9c",
            "source": "/event/source/123",
            "type": "artifact",
            "content": {
              "change": {
                "id": "myChange123",
                "source": "my-git.example/an-org/a-repo"
              }
            }
          }
        },
        "success": true,
        "created_at": "2024-11-04T20:55:13.180102-05:00",
        "event_receiver_id": "01JBX00KBSWSKPPQXQSRWRK93F",
        "EventReceiver": {
          "id": "01JBX00KBSWSKPPQXQSRWRK93F",
          "name": "artifact-packaged",
          "type": "dev.cdevents.artifact.packaged.0.1.1",
          "version": "1.0.0",
          "description": "CDEvents Artifact Packaged",
          "schema": {
            // "schema omitted for brevity"
          },
          "fingerprint": "ce507df4dc8dac35a365f720d540fd1be29fc8639e173a69804443ab4430aae8",
          "created_at": "2024-11-04T20:46:16.826009-05:00"
        }
      }
    ],
    "event_receivers": [
      {
        "id": "01JBX00KBSWSKPPQXQSRWRK93F",
        "name": "artifact-packaged",
        "type": "dev.cdevents.artifact.packaged.0.1.1",
        "version": "1.0.0",
        "description": "CDEvents Artifact Packaged",
        "schema": {
          // "schema omitted for brevity"
        },
        "fingerprint": "ce507df4dc8dac35a365f720d540fd1be29fc8639e173a69804443ab4430aae8",
        "created_at": "2024-11-04T20:46:16.826009-05:00"
      }
    ],
    "event_receiver_groups": null
  }
}
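To give a feel for what a watcher does with a message like this, here is a small Python sketch of matching on type, success, and NVRPP fields, then dispatching to an action. It does not use EPR's SDK, whose matching logic is not shown in this post; matches, handle, and the two rules are hypothetical, and the actions only return strings where a real watcher would fire a webhook or open a ticket.

```python
def matches(message: dict, criteria: dict) -> bool:
    """True when every criterion equals the same top-level message field."""
    return all(message.get(key) == wanted for key, wanted in criteria.items())


def handle(message: dict, rules: list) -> list:
    """Run the action of each rule whose criteria match; collect the results."""
    return [action(message) for criteria, action in rules
            if matches(message, criteria)]


# Each rule pairs matching criteria with an action (both illustrative).
rules = [
    (
        {"type": "dev.cdevents.artifact.packaged.0.1.1", "success": True},
        lambda m: f"trigger job for {m['name']} {m['version']}",
    ),
    (
        {"success": False},
        lambda m: f"open ticket for {m['name']}",
    ),
]

# Top-level fields copied from the message above; the data field is left out.
message = {
    "success": True,
    "type": "dev.cdevents.artifact.packaged.0.1.1",
    "name": "foo",
    "version": "1.0.1",
    "release": "2023.11.16",
    "platform_id": "aarch64-gnu-linux-7",
    "package": "oci",
}

on_success = handle(message, rules)
on_failure = handle(dict(message, success=False), rules)
ignored = handle(dict(message, type="some.other.type"), rules)
```

Keeping the criteria as plain data makes it easy to add a new rule without touching the dispatch code, which is one reasonable way to structure a watcher that serves several teams.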
Running EPR in Production
Now that you understand the basics, here’s a real-world example. We’ll start with a build of the fabulous my-app application. At the end of the build process, the build automation posts a passing event to EPR. The build automation generates an NVRPP, which is used by the first and all subsequent events for this artifact. Downstream watchers consume the successful build message on the Redpanda topic, triggering a security scan, integration tests, and artifact signing. Watchers for each of those tasks invoke them with the NVRPP used in the build event. These three tasks post more events to their corresponding receivers, which are all contained inside a release group. Once those three receivers have passing events for the NVRPP, the group triggers a release message that a downstream watcher catches to release the software.
Conclusion
By this point, you should have a good idea of how EPR works and what it can do. It will require some assembly, care, and feeding. Adopting new tools isn’t always easy. If, however, you’re willing to brave the change, our success can be yours as well.