
No more Additional Network Requests – Enter: OCI Image Layout

December 20, 2022 | Blog, Community

Contributed by @developer-guy, Batuhan Apaydin

This blog post is heavily inspired by Brandon Mitchell’s talk from the Open Source Summit 2022: Image Layout: Stop Putting Everything in Registries.

OCI Image Layout Meme
Credit: https://imgflip.com/i/6y6eti

The invention of containers has transformed the way we package and distribute software. Today, container images are one of the de facto standards for doing so. But in this blog post, we will demonstrate that a container image is nothing magical: it’s just a packed directory structure, much like a good old tar archive.

In this first part of a two-part blog series, we’ll talk about one of the features included in the Image Spec, called OCI image layout. In the second part, we’ll talk about the benefits of using the OCI image layout in today’s CI/CD pipelines.

Before diving into the details of what an OCI image layout is, we should first talk about what the term OCI means.

What is OCI?

Most of us stepped into the containerization world with Docker. As the adoption of containers increased, so did the need for an open standard around containers to avoid vendor lock-in. That’s why the OCI (Open Container Initiative) was created: to provide open standards for container images, runtimes, and, most recently, distribution. Docker continues to be a major player in the ecosystem, but it is no longer the only whale in the sea. These standards are vendor-neutral, allowing for interoperability and collaboration, which means individuals and organizations can now build their own container tools.

Since Docker’s formats for container images and runtimes were the most widely used at the time, the OCI based its specifications on them, which provided a smooth transition for the people who were already sailing the container sea.

The OCI Image Specification was created based on the Docker Image Manifest Version 2, Schema 2. This means that there are two common image formats in the wild, Docker (historically first) and OCI (the standardized version), although over 95% of the images on Docker Hub are probably still in the Docker schema.

The OCI Image Specification defines an OCI image, which consists of an image manifest, an image index (optional), a set of filesystem layers, and a configuration. You can find all the specifications that are hosted by OCI here.

Now, we need to answer the following questions:

  • What are the main differences between the two container image formats?
  • How do you know whether the given image is an OCI or a Docker image?

Let’s find out!

Docker vs OCI Images

Docker Format

The Docker V2.2 manifest format is a JSON blob with the following fields:

  • schemaVersion – 2 in this case
  • mediaType – application/vnd.docker.distribution.manifest.v2+json
  • config – descriptor of container configuration blob
  • layers – list of descriptors of layer blobs

A descriptor is a crucial data structure in the OCI world. It has fields to describe the content including the type of content, a content identifier (digest), and the byte size of the raw content.

You can read more about descriptors here.
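To make the descriptor idea concrete, here is a minimal Python sketch that computes the three fields described above for an arbitrary blob. The function name make_descriptor and the sample config content are assumptions made purely for illustration; they are not part of any OCI tooling.

```python
import hashlib
import json


def make_descriptor(content: bytes, media_type: str) -> dict:
    """Build a descriptor (mediaType, digest, size) for raw content."""
    return {
        "mediaType": media_type,
        # The digest is the content identifier: algorithm, colon, hex hash.
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
        # The size is the byte length of the raw content.
        "size": len(content),
    }


# Hypothetical config blob, used only for illustration.
config_bytes = json.dumps({"architecture": "amd64", "os": "linux"}).encode()
descriptor = make_descriptor(config_bytes, "application/vnd.oci.image.config.v1+json")
print(json.dumps(descriptor, indent=2))
```

Because the digest is derived from the bytes themselves, changing a single byte of the content yields a different descriptor.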

Here is an example of a V2.2 manifest (for the Docker Hub busybox image):

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 1497,
    "digest": "sha256:3a093384ac306cbac30b67f1585e12b30ab1a899374dabc3170b9bca246f1444"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 755724,
      "digest": "sha256:57c14dd66db0390dbf6da578421c077f6de8e88edd0815b4caa94607ba5f4c09"
    }
  ]
}

⚠️ Don’t worry if you don’t know how to inspect an image manifest yet; we’ll suggest some tools that allow you to do this.

OCI Format

The OCI manifest format is essentially the same as the Docker V2.2 format, but with a few differences.

  • mediaType – must be set to application/vnd.oci.image.manifest.v1+json
  • config.mediaType – must be set to application/vnd.oci.image.config.v1+json
  • Each object in layers must have its mediaType set to an OCI layer type, most commonly application/vnd.oci.image.layer.v1.tar+gzip or application/vnd.oci.image.layer.v1.tar.

Here is an example of an OCI manifest (for Chainguard’s static image):

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "size": 509,
    "digest": "sha256:603d476127fee8be7f6b199e99c50f5c53ac04effbdec369d77a5e7b030c4915"
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "size": 477639,
      "digest": "sha256:ad866408d508158af6c147ea71455ff3fa029610ec6551798d34ce2fa3641694"
    }
  ],
  "annotations": {
    "org.opencontainers.image.revision": "11e020dbca4e903163c0f8b766acb9a9114a4a2c",
    "org.opencontainers.image.source": "https://github.com/chainguard-images/static"
  }
}

Source: https://containers.gitbook.io/build-containers-the-hard-way/#registry-format-oci-image-manifest

As you can see from the OCI image manifest above, the fields are the same as in the Docker V2.2 format, which reflects the fact that the OCI format was derived from it. The key difference between the two is the mediaTypes: in the Docker V2.2 format they start with application/vnd.docker, while in an OCI image they start with application/vnd.oci. You can find more details about them in Dan Lorenc’s blog post.
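The mediaType prefix check can be sketched in a few lines of Python. The function name manifest_format is hypothetical, and falling back to config.mediaType when the top-level mediaType is missing is an assumption of this sketch, not something the specifications prescribe.

```python
def manifest_format(manifest: dict) -> str:
    """Classify a parsed manifest as 'docker', 'oci', or 'unknown' by its mediaType prefix."""
    # Assumption: if the top-level mediaType is absent, look at the config's mediaType.
    media_type = manifest.get("mediaType") or manifest.get("config", {}).get("mediaType", "")
    if media_type.startswith("application/vnd.docker."):
        return "docker"
    if media_type.startswith("application/vnd.oci."):
        return "oci"
    return "unknown"


docker_manifest = {"mediaType": "application/vnd.docker.distribution.manifest.v2+json"}
oci_manifest = {"mediaType": "application/vnd.oci.image.manifest.v1+json"}
print(manifest_format(docker_manifest))
print(manifest_format(oci_manifest))
```

A manifest whose mediaTypes carry neither prefix is reported as unknown, so a caller can decide how to handle it.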

OCI Image Layout

An OCI Image Layout is a directory structure whose files and folders together describe an OCI image. A container image is nothing more than a collection of tarballs and JSON blobs, and once laid out in the OCI Image Layout format, it looks like the following directory structure:

├── blobs
│   └── sha256
│       ├── ...       (image.manifest)
│       ├── ...       (image.config)
│       ├── ...       (image.layer)
│       ├── ...       (image.layer)
│       ├── ...       ...
│       ├── ...       ...
│       └── ...       (image.layerN)
├── index.json
└── oci-layout

Fortunately, you don’t have to do anything special to create the OCI image layout format on your own because many of the container image builders such as Docker Buildx, ko, and kaniko already support outputting the image that they built in the form of OCI Image Layout on disk. CLI tools for interacting with OCI registries, such as crane, skopeo, and regctl also have the ability to convert container images to OCI image layout format.

However, it is worth noting that you can build your own container image without any of the tools mentioned by simply creating an OCI image layout with bare-bones Linux commands like tar, sha256sum, wc, etc. If you want to know how, watch the great talk “Unraveling the Magic Behind Buildpacks” by Sambhav Kothari and Natalie Arellano from KubeCon + CloudNativeCon Europe 2022. They explain how you can achieve this in detail.
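As a rough illustration of how little is needed, the following Python sketch assembles a single-layer OCI image layout using only the standard library, playing the role of tar, sha256sum, and wc. All function names are hypothetical, the config is reduced to a bare minimum (a fixed linux/amd64 platform and the rootfs section), and this is not the method shown in the talk.

```python
import gzip
import hashlib
import io
import json
import tarfile
import tempfile
from pathlib import Path


def write_blob(layout: Path, data: bytes, media_type: str) -> dict:
    """Store data under blobs/sha256/<digest> and return its descriptor."""
    digest = hashlib.sha256(data).hexdigest()
    blob_dir = layout / "blobs" / "sha256"
    blob_dir.mkdir(parents=True, exist_ok=True)
    (blob_dir / digest).write_bytes(data)
    return {"mediaType": media_type, "digest": "sha256:" + digest, "size": len(data)}


def build_layout(layout: Path, files: dict) -> dict:
    """Write a one-layer image layout containing `files` (name -> bytes); return the index."""
    # 1. Pack the files into an uncompressed tar archive (the layer content).
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    layer_tar = raw.getvalue()

    # 2. Store the gzip-compressed layer as a blob.
    layer = write_blob(layout, gzip.compress(layer_tar, mtime=0),
                       "application/vnd.oci.image.layer.v1.tar+gzip")

    # 3. The config lists the digest of the *uncompressed* layer as its diff_id.
    config_doc = {
        "architecture": "amd64",  # assumed platform, for illustration only
        "os": "linux",
        "rootfs": {"type": "layers",
                   "diff_ids": ["sha256:" + hashlib.sha256(layer_tar).hexdigest()]},
    }
    config = write_blob(layout, json.dumps(config_doc).encode(),
                        "application/vnd.oci.image.config.v1+json")

    # 4. The manifest ties the config and the layers together.
    manifest_doc = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": config,
        "layers": [layer],
    }
    manifest = write_blob(layout, json.dumps(manifest_doc).encode(),
                          "application/vnd.oci.image.manifest.v1+json")

    # 5. index.json points at the manifest; oci-layout declares the layout version.
    index = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.index.v1+json",
        "manifests": [manifest],
    }
    (layout / "index.json").write_text(json.dumps(index))
    (layout / "oci-layout").write_text(json.dumps({"imageLayoutVersion": "1.0.0"}))
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_layout(Path(tmp) / "oci-layout", {"hello.txt": b"hello world!!\n"})
        for path in sorted(Path(tmp).rglob("*")):
            print(path.relative_to(tmp))
```

The result contains exactly three blobs (layer, config, manifest) plus index.json and oci-layout. A real image would normally carry more configuration, such as Env and Cmd.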

Saving OCI Image to Disk

Let me show you a quick example of how you can save an OCI image to disk in the form of OCI image layout by using one of the tools mentioned above and have a look at what it looks like in practice.

I’m going to use Docker Buildx, which is a Docker CLI plugin for extended build capabilities with BuildKit, to build and output my OCI image to disk, and regctl, one of the more recent CLI tools that facilitate interacting with container registries without leaving your lovely terminal.

Let’s benefit from the Buildx features and use the following commands to build an image:

$ docker buildx create --use
$ docker buildx build -o type=oci,dest=oci-layout.tar  --quiet -<<-EOF
	FROM busybox
	CMD echo 'hello world!!'
EOF

As you can see from the build command’s output options above (-o type=oci,dest=oci-layout.tar), the only thing that we need to do is specify the output type as oci and the name of the file that we want to output. This command will output a file called oci-layout.tar, and we can use regctl to extract that tar file into a directory. The image import command is one of the useful commands in regctl since it is compatible with both a Docker-formatted tar from “docker save” and an OCI Layout compatible tar.

$ regctl image import ocidir://oci-layout oci-layout.tar

Once you do that, you should be able to see the following directory structure:

$ tree -L 5 oci-layout/
oci-layout/
├── blobs
│   └── sha256
│       ├── 22b70bddd3acadc892fca4c2af4260629bfda5dfd11ebc106a93ce24e752b5ed
│       ├── 6088636f9d31a31b66cacb36e04a4f11b04c76b0fcc83418f86a6d607df731df
│       └── f6ccd1e532f6c7ab2f352dcce5bdabce17bd382bce3520008c98a75c22cf2953
├── index.json
└── oci-layout

Let’s look at another scenario. You might already be pushing container images to your favorite container registry by using output options such as: -o type=registry,name=devopps/hello-world-server:latest

If you want to save such an image to disk in the form of an OCI image layout, you can use the “regctl image copy” command:

$ regctl image copy devopps/hello-world-server:latest ocidir://oci-layout

Once the command above is successfully completed, you should have the same directory structure as above.

Let’s clarify how we can easily find what we are looking for while navigating the OCI image layout directory:

  • The index.json file is an image index JSON object: a higher-level manifest that points to specific image manifests. We said “manifests” because an index can reference several of them, for example when building multi-architecture images. But if you follow the example above, you will see only one manifest referenced by the image index, like the following:
$ cat oci-layout/index.json | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 505,
      "digest": "sha256:f6ccd1e532f6c7ab2f352dcce5bdabce17bd382bce3520008c98a75c22cf2953",
      "annotations": {
        "org.opencontainers.image.ref.name": "latest"
      }
    }
  ]
}
  • The oci-layout file declares the version of the image layout in use through its imageLayoutVersion field, which aligns with the version of the OCI Image Specification.
$ cat oci-layout/oci-layout | jq
{
  "imageLayoutVersion": "1.0.0"
}
  • The blobs directory contains the content-addressable blobs, and this is where everything we’ve learned about container images so far comes together. Container images are composed of a set of read-only layers that are stacked on top of each other to assemble the image’s root filesystem, rootfs for short. Everything in a container registry is content addressable: each blob is identified by the sha256 hash of its content, which is key for security because it lets you verify that the content you pull is exactly the content that was pushed.

Let’s verify this by simply running the shasum -a 256 command against one of the files stored in the ./oci-layout/blobs/sha256 directory and check whether the file name, which is the hash of the actual content of that file, is the same one you get from the command output:

$ shasum -a 256 oci-layout/blobs/sha256/22b70bddd3acadc892fca4c2af4260629bfda5dfd11ebc106a93ce24e752b5ed
22b70bddd3acadc892fca4c2af4260629bfda5dfd11ebc106a93ce24e752b5ed  oci-layout/blobs/sha256/22b70bddd3acadc892fca4c2af4260629bfda5dfd11ebc106a93ce24e752b5ed
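The same check can be run over every blob at once. The Python sketch below (verify_blobs is a hypothetical helper name) hashes each file in blobs/sha256 and compares the result with the file name.

```python
import hashlib
from pathlib import Path


def verify_blobs(layout_dir: str) -> dict:
    """Map each blob file name to True if the name equals the sha256 of its content."""
    results = {}
    for blob in sorted(Path(layout_dir, "blobs", "sha256").iterdir()):
        hasher = hashlib.sha256()
        with blob.open("rb") as handle:
            # Read in 1 MiB chunks so large layers are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                hasher.update(chunk)
        results[blob.name] = hasher.hexdigest() == blob.name
    return results
```

Calling verify_blobs("oci-layout") on the directory created earlier should report True for every blob; a False entry would mean the file content no longer matches its name.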

Navigating the Land of OCI Image Layout

Now that we have a clear understanding of what the files and folders inside that directory are and what they mean to us, let’s find out what the hidden gems are within that.

The OCI container image consists of three main elements: config blob, manifest blob, and a bunch of layer blobs as we mentioned earlier in the blog post.

To explain it better, we can visualize the OCI container image to see what it looks like in general:

OCI Index Structure
Credit: https://github.com/sudo-bmitch/presentations/blob/main/oci-refers/img/oci-image.png

As you can see from the diagram above, you can reach the correct manifest for the platform you are working on (for example, linux/amd64). To do so, look into the index.json file first to find the manifest blob in the ./blobs/sha256 sub-directory within the OCI image layout directory.

As we said, everything in the container registry is stored as a content-addressable blob. Therefore, you can find everything, including the config, the manifest, and the layers, within that directory (./blobs/sha256).

To find the correct manifest, we should take a look at the digest field that acts as a content identifier in the image index, which is stored in the index.json file.

Let’s have a look at the index.json file:

$ cat oci-layout/index.json | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 505,
      "digest": "sha256:f6ccd1e532f6c7ab2f352dcce5bdabce17bd382bce3520008c98a75c22cf2953",
      "annotations": {
        "org.opencontainers.image.ref.name": "latest"
      }
    }
  ]
}

Voilà! You can use the digest above, which is f6ccd1e532f6c7ab2f352dcce5bdabce17bd382bce3520008c98a75c22cf2953, to locate the manifest blob in the ./blobs/sha256 directory.

⚠️ If the image has been built for multiple architectures, the index lists one manifest per platform, and we should use the digest of the entry whose platform field matches the OS and architecture we care about.
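For a multi-architecture index, the selection by platform could look like the following Python sketch. The function name select_manifest and the sample index with placeholder digests are hypothetical; real indexes may also carry variant or os.version fields that this sketch ignores.

```python
def select_manifest(index: dict, os_name: str, arch: str) -> dict:
    """Return the manifest descriptor from an image index that matches os/architecture."""
    manifests = index.get("manifests", [])
    for descriptor in manifests:
        platform = descriptor.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return descriptor
    # Single-manifest layouts (like the example in this post) often have no platform field.
    if len(manifests) == 1 and "platform" not in manifests[0]:
        return manifests[0]
    raise LookupError(f"no manifest for {os_name}/{arch}")


# Hypothetical multi-architecture index with placeholder digests.
sample_index = {
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:" + "a" * 64, "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:" + "b" * 64, "platform": {"os": "linux", "architecture": "arm64"}},
    ],
}
print(select_manifest(sample_index, "linux", "arm64")["digest"])
```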

As config and manifest blobs are just regular JSON files (look at the mediaTypes of these blobs), you can see their content by simply running the cat command and piping the output to jq for better visualization.

Let’s have a look at the manifest blob:

$ cat oci-layout/blobs/sha256/f6ccd1e532f6c7ab2f352dcce5bdabce17bd382bce3520008c98a75c22cf2953 | jq
{
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:6088636f9d31a31b66cacb36e04a4f11b04c76b0fcc83418f86a6d607df731df",
    "size": 1086
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:22b70bddd3acadc892fca4c2af4260629bfda5dfd11ebc106a93ce24e752b5ed",
      "size": 772993
    }
  ]
}

The manifest blob points to the config blob and the layer blobs. You can find the digest of the config blob by looking at the config field, which is 6088636f9d31a31b66cacb36e04a4f11b04c76b0fcc83418f86a6d607df731df, in this case.

Let’s have a look at the configuration of an image:

$ cat oci-layout/blobs/sha256/6088636f9d31a31b66cacb36e04a4f11b04c76b0fcc83418f86a6d607df731df | jq
{
  "architecture": "amd64",
  "config": {
    "Env": [
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    ],
    "Cmd": [
      "/bin/sh",
      "-c",
      "echo 'hello world!!'"
    ],
    "ArgsEscaped": true,
    "OnBuild": null
  },
  "created": "2022-10-26T06:30:33.794221299Z",
  "history": [
    {
      "created": "2022-10-26T06:30:33.700079457Z",
      "created_by": "/bin/sh -c #(nop) ADD file:5e991de3200129dc05c3130f7a64bebb5704486b4f773bfcaa6b13165d6c2416 in / "
    },
    {
      "created": "2022-10-26T06:30:33.794221299Z",
      "created_by": "/bin/sh -c #(nop)  CMD [\"sh\"]",
      "empty_layer": true
    },
    {
      "created": "2022-10-26T06:30:33.794221299Z",
      "created_by": "CMD [\"/bin/sh\" \"-c\" \"echo 'hello world!!'\"]",
      "comment": "buildkit.dockerfile.v0",
      "empty_layer": true
    }
  ],
  "moby.buildkit.buildinfo.v1": "eyJmcm9udGVuZCI6ImRvY2tlcmZpbGUudjAiLCJzb3VyY2VzIjpbeyJ0eXBlIjoiZG9ja2VyLWltYWdlIiwicmVmIjoiZG9ja2VyLmlvL2xpYnJhcnkvYnVzeWJveDpsYXRlc3QiLCJwaW4iOiJzaGEyNTY6NmJkZDkyYmY1MjQwYmUxYjVmM2JmNzEzMjRmNWUzNzFmZTU5ZjBlMTUzYjI3ZmExZjE2MjBmNzhiYTE2OTYzYyJ9XX0=",
  "os": "linux",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:0438ade5aeea533b00cd75095bec75fbc2b307bace4c89bb39b75d428637bcd8"
    ]
  }
}
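The whole walk from index.json to the manifest, the config, and the layers can be sketched in Python as follows. The names blob_path and walk_image are hypothetical, and the sketch assumes a layout with a single manifest in its index. It also recomputes each diff_id, which is the sha256 of the uncompressed layer, so it can be compared with the rootfs.diff_ids declared in the config.

```python
import gzip
import hashlib
import json
from pathlib import Path


def blob_path(layout: Path, digest: str) -> Path:
    """Translate a digest such as 'sha256:abc...' into blobs/sha256/abc..."""
    algorithm, _, hex_digest = digest.partition(":")
    return layout / "blobs" / algorithm / hex_digest


def walk_image(layout_dir: str) -> dict:
    """Follow index.json -> manifest -> config and layers of a single-manifest layout."""
    layout = Path(layout_dir)
    index = json.loads((layout / "index.json").read_text())
    manifest_desc = index["manifests"][0]  # assumption: only one manifest in the index
    manifest = json.loads(blob_path(layout, manifest_desc["digest"]).read_bytes())
    config = json.loads(blob_path(layout, manifest["config"]["digest"]).read_bytes())

    computed_diff_ids = []
    for layer in manifest["layers"]:
        data = blob_path(layout, layer["digest"]).read_bytes()
        # Only gzip-compressed layers are decompressed; other encodings are hashed as-is.
        if layer["mediaType"].endswith(("+gzip", ".gzip")):
            data = gzip.decompress(data)
        computed_diff_ids.append("sha256:" + hashlib.sha256(data).hexdigest())

    return {
        "cmd": config.get("config", {}).get("Cmd"),
        "declared_diff_ids": config["rootfs"]["diff_ids"],
        "computed_diff_ids": computed_diff_ids,
    }
```

Run against the oci-layout directory from this post, the declared and computed diff_ids should be identical, which ties the config back to the actual layer content.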

Last but not least, let’s continue with the layer blobs. As I mentioned, you can understand what type of file is stored within a blob by looking at the mediaType of that blob’s descriptor. This is essential for the client tool to decide how it needs to interpret that blob. To make it clearer, let’s take Helm’s OCI support as an example.

Helm’s OCI support graduated from experimental to general availability with Helm v3.8.0, which means that you can start storing and distributing your Helm charts as OCI Artifacts in any OCI-compliant registry. You can follow the official documentation to start publishing your Helm charts to OCI registries, so I won’t dive into these details. In this post, we’re more interested in what they look like from an OCI standpoint after you push them.

Let’s have a look at the manifest of the OCI image that includes the Helm chart:

$ regctl image manifest devopps/nginx:0.1.0 --format body | jq
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.cncf.helm.config.v1+json",
    "digest": "sha256:cbcce8ad7fdaea54f60aeef01f57f2f46f67382b5d391d63d2b55f536195935a",
    "size": 139
  },
  "layers": [
    {
      "mediaType": "application/vnd.cncf.helm.chart.content.v1.tar+gzip",
      "digest": "sha256:5941d0917965edf6c530feea26c8a131eb2a108bf577008a32f7b920bfc44559",
      "size": 3752
    }
  ]
}

What if you try to pull that image with a regular docker pull command? Let’s see:

$ docker image pull devopps/nginx:0.1.0
0.1.0: Pulling from devopps/nginx
5941d0917965: Pulling fs layer
invalid rootfs in image configuration

We got the error invalid rootfs in image configuration, which looks strange at first. 🤔 But this is exactly the point I’m trying to make. I said before that mediaTypes are the essential thing the client tool needs to decide how to interpret a blob. The Docker CLI doesn’t know how to interpret the Helm OCI Artifact mediaTypes, so the pull above didn’t work as we expected. If you use the helm pull command instead, it will work, because Helm knows how to interpret these mediaTypes.
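The idea that a client decides what to do based on mediaTypes can be illustrated with a small Python sketch. This is a deliberately simplified model, not how the Docker CLI or Helm are implemented; the KNOWN_CONFIG_TYPES table and the artifact_kind function are hypothetical.

```python
# Hypothetical lookup table from config mediaType to the kind of artifact it describes.
KNOWN_CONFIG_TYPES = {
    "application/vnd.oci.image.config.v1+json": "container image",
    "application/vnd.docker.container.image.v1+json": "container image",
    "application/vnd.cncf.helm.config.v1+json": "helm chart",
}


def artifact_kind(manifest: dict) -> str:
    """Decide what a manifest describes by looking at its config mediaType."""
    media_type = manifest.get("config", {}).get("mediaType", "")
    if media_type not in KNOWN_CONFIG_TYPES:
        raise ValueError(f"unsupported config mediaType: {media_type!r}")
    return KNOWN_CONFIG_TYPES[media_type]


helm_manifest = {
    "schemaVersion": 2,
    "config": {"mediaType": "application/vnd.cncf.helm.config.v1+json"},
}
print(artifact_kind(helm_manifest))
```

A client that only knows about container images would have no entry for the Helm type and would have to reject the artifact, which is analogous to what happened with docker pull above.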

Conclusion

Thanks to this post, you now have a clear and solid understanding of the OCI image layout format and the OCI image specification! In the second part of the series (coming early 2023), we will explore how to use the OCI image layout format in CI/CD pipelines to reduce network traffic between the pipeline system and OCI registries in the steps where we create SBOMs, perform vulnerability scans, and generate provenance.