Skip to main content

Kata Containers in Screwdriver

By June 10, 2020Blog

Written by Lakshminarasimhan Parthasarathy

Introduction

Screwdriver is a scalable CI/CD solution which uses Kubernetes to manage user builds. Screwdriver build workers interfaces with Kubernetes using either “executor-k8s” or “executor-k8s-vm” depending on required build isolation. 

executor-k8s runs builds directly as Kubernetes pods while executor-k8s-vm  uses HyperContainers along with Kubernetes for stricter build isolation with containerized Virtual Machines (VMs). This setup was ideal for running builds in an isolated, ephemeral, and lightweight environment. However, hyperd is now deprecated, has no support, is based on an older Docker runtime and it also required non-native Kubernetes setup for build execution. Therefore, it was time to find a new solution.

Why Kata Containers ?

Kata Containers is an open source project and community that builds a standard implementation of lightweight virtual machines (VMs) that perform like containers, but provide the workload isolation and security advantages of VMs. It combines the benefits of using a hypervisor, such as enhanced security, along with container orchestration capabilities provided by Kubernetes. It is the same team behind HyperD where they successfully merged the best parts of Intel Clear Containers with Hyper.sh RunV. As a Kubernetes runtime, Kata enables us to deprecate executor-k8s-vm and use executor-k8s exclusively for all Kubernetes based builds.

Screwdriver Journey to Kata

As we faced a growing number of instabilities with the current HyperD – like network and devicemapper issues and IP cleanup workarounds, we started our initial evaluation of Kata in early 2019 (https://github.com/screwdriver-cd/screwdriver/issues/818#issuecomment-482239236) and identified two major blockers to move ahead with Kata:

1. Security concern for privileged mode (required to run docker daemon in kata)

2. Disk performance. 

We recently started reevaluating Kata in early 2020 based on a fix to “add flag to overload default privileged host device behaviour” provided by Containerd/cri (https://github.com/containerd/cri/pull/1225), but still we faced issues with disk performance and switched from overlayfs to devicemapper, which yielded significant improvement. With our two major blockers resolved and initial tests with Kata looking promising, we moved ahead with Kata.

Screwdriver Build Architecture

Replacing Hyper with Kata led to a simpler build architecture. We were able to remove the custom build setup scripts to launch Hyper VM and rely on native Kubernetes setup. 

Setup

To use Kata containers for running user builds in a Screwdriver Kubernetes build cluster, a cluster admin needs to configure Kubernetes to use Containerd container runtime with Cri-plugin.

Components

Screwdriver build Kubernetes cluster (minimum version: 1.14+) nodes must have the following components set up for using Kata containers for user builds. 

Containerd:

Containerd is a container runtime that helps with management of the complete lifecycle of the container.

Reference: https://containerd.io/docs/getting-started/

CRI-Containerd plugin:

Cri-Containerd is a containerd plugin which implements Kubernetes container runtime interface. CRI plugin interacts with containerd to manage the containers.

Reference: https://github.com/containerd/cri

Image credit: containerd / cri. Photo licensed under CC-BY-4.0.

Architecture:

Image credit: containerd / cri. Photo licensed under CC-BY-4.0

Installation:

Reference: 

https://github.com/containerd/cri/blob/master/docs/installation.md

https://github.com/containerd/containerd/blob/master/docs/ops.md

Tarball: https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.3.3.linux-amd64.tar.gz

Crictl:

To debug, inspect, and manage their pods, containers, and container images.

Reference: https://github.com/containerd/cri/blob/master/docs/crictl.md

Kata:

Builds lightweight virtual machines that seamlessly plugin to the containers ecosystem.

Architecture:

Image credit: kata-containers Project licensed under Apache License Version 2.0

Installation:

  1. https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#run-kata-containers-with-kubernetes
  2. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md
  3. https://github.com/kata-containers/documentation/blob/master/how-to/how-to-use-k8s-with-cri-containerd-and-kata.md
  4. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#kubernetes-runtimeclass
  5. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#configuration

Routing builds to Kata nodes in Screwdriver build cluster

Screwdriver uses Runtime Class to route builds to Kata nodes in Screwdriver build clusters. The Screwdriver plugin executor-k8s config handles this based on: 

  1. Pod configuration:
apiVersion: v1
kind: Pod
metadata:
  name: kata-pod
  namespace: sd-build-namespace
  labels:
    sdbuild: "sd-kata-build"
    app: screwdriver
    tier: builds
spec:
  runtimeClassName: kata
  containers:
  - name: "sd-build-container"
    image: <<image>>
    imagePullPolicy: IfNotPresent
  1. Update  the plugin to use k8s in your buildcluster-queue-worker configuration

---
executor:
    # Default executor
    plugin: k8s
    k8s:
      exclusion:
        - 'rhel6'
      weightage: 0
      options:
        kubernetes:
            # The host or IP of the kubernetes cluster
            host: kubernetes.default
            # Privileged mode, default restricted, set to true for trusted container runtime use-case
            privileged: false
            automountServiceAccountToken: false
            dockerFeatureEnabled: false
            resources:
                cpu:
                    # Number of cpu cores
                    micro: "0.5"
                    low: 2
                    high: 6
                    turbo: 12
                memory:
                    # Memory in GB
                    micro: 1
                    low: 2
                    high: 12
                    turbo: 16
            # Default build timeout for all builds in this cluster
            buildTimeout: 90
            # Default max build timeout
            maxBuildTimeout: 120
            # k8s node selectors for approprate pod scheduling
            nodeSelectors: {"dedicated":"screwdriver-kata"}
            preferredNodeSelectors: {}
            annotations: {}
            # support for kata-containers-as-a-runtimeclass
            runtimeClass: "kata"
        # Launcher image to use
        launchImage: screwdrivercd/launcher
        # Container tags to use
        launchVersion: stable
        # Circuit breaker config
        fusebox:
            breaker:
                # in milliseconds
                timeout: 10000
        # requestretry configs
        requestretry:
            # in milliseconds
            retryDelay: 3000
            maxAttempts: 5

Production rollout

  1. Test out the new setup with pilot users
  2. Route a percentage of traffic to Kata nodes using the weightage configuration
  3. Based on the limitation “Kata default guest kernel does not support IA32 bit binaries”, maintain a list of containers to exclude; only route builds to nodes with Kata when the container is not in the list

Performance

The below tables compare build setup and overall execution time for Kata and Hyper when the image is pre-cached or not cached.

Image: node12with Image cached in nodeKata (with 1 min wait in build)Hyper (with 1 min wait in build)
Setup time28 secs50 secs
Overall execution time1 min 32 secs1 min 56 secs
Image: node12without Image cached in nodeKata (with 1 min wait in build)HyperD (with 1 min wait in build)
Setup time51 secs1 min 32 secs
Overall time1 min 55 secs2 min 40 secs

Known problems

While the new Kata implementation offers a lot of advantages, there are some known problems we are aware of with fixes or workarounds:

  1. Run images based on Rhel6 containers don’t start and immediately exit 
  1. Yum install will hang forever
Before fixAfter fix
sh-4.1# time yum remove wget -yreal 6m22.190suser 2m38.387ssys 3m38.619s
sh-4.1# time yum install wget -yreal 6m23.407suser 2m39.387ssys 3m42.606s
sh-4.1# time yum remove wget -yreal 0m4.774suser 0m0.783ssys 0m0.123s
sh-4.1# time yum install wget -yreal 0m2.169suser 0m1.760ssys 0m0.298s
  1. 32-bit executable cannot be loaded refer kata issue  https://github.com/kata-containers/runtime/issues/886 
  • To workaround/mitigate we maintain a container exclusion list and route to current hyperd setup and we have plans to eol these containers by Q4 of this year.
  1. Containerd IO snapshotter – Overlayfs vs devicemapper for storage driver
  • Devicemapper gives better performance with kata
OverlayfsDevicemapper
1024000000 bytes (976.6MB) copied, 19.325605 seconds, 50.5MB/s1024000000 bytes (976.6MB) copied, 5.860671 seconds, 166.6MB/s
  1. Image stored in both sys-root and devicemapper volume, consuming both volume disk space 

Compatibility List

In order to use this feature, you will need these minimum versions:

Contributors

Thanks to the following contributors for making this feature possible:

Questions & Suggestions

We’d love to hear from you. If you have any questions, please feel free to reach out here. You can also visit us on Github and Slack.