Written by Lakshminarasimhan Parthasarathy
Introduction
Screwdriver is a scalable CI/CD solution that uses Kubernetes to manage user builds. Screwdriver build workers interface with Kubernetes using either “executor-k8s” or “executor-k8s-vm”, depending on the required build isolation.
executor-k8s runs builds directly as Kubernetes pods, while executor-k8s-vm uses HyperContainers along with Kubernetes for stricter build isolation with containerized Virtual Machines (VMs). This setup was ideal for running builds in an isolated, ephemeral, and lightweight environment. However, hyperd is now deprecated and unsupported, is based on an older Docker runtime, and requires a non-native Kubernetes setup for build execution. Therefore, it was time to find a new solution.
Why Kata Containers?
Kata Containers is an open source project and community that builds a standard implementation of lightweight virtual machines (VMs) that perform like containers but provide the workload isolation and security advantages of VMs. It combines the benefits of a hypervisor, such as enhanced security, with the container orchestration capabilities provided by Kubernetes. It comes from the same team behind HyperD, who merged the best parts of Intel Clear Containers with Hyper.sh RunV. As a Kubernetes runtime, Kata enables us to deprecate executor-k8s-vm and use executor-k8s exclusively for all Kubernetes-based builds.
Screwdriver Journey to Kata
As we faced growing instability with the existing HyperD setup, such as network and devicemapper issues and IP cleanup workarounds, we started our initial evaluation of Kata in early 2019 (https://github.com/screwdriver-cd/screwdriver/issues/818#issuecomment-482239236) and identified two major blockers to moving ahead with Kata:
1. Security concerns around privileged mode (required to run the Docker daemon in Kata)
2. Disk performance
We started reevaluating Kata in early 2020, based on a fix to “add flag to overload default privileged host device behaviour” provided by containerd/cri (https://github.com/containerd/cri/pull/1225). We still faced issues with disk performance, so we switched from overlayfs to devicemapper, which yielded a significant improvement. With our two major blockers resolved and initial tests with Kata looking promising, we moved ahead with Kata.
Screwdriver Build Architecture
Replacing Hyper with Kata led to a simpler build architecture. We were able to remove the custom build setup scripts that launched the Hyper VM and rely on the native Kubernetes setup instead.
Setup
To use Kata Containers for running user builds in a Screwdriver Kubernetes build cluster, a cluster admin needs to configure Kubernetes to use the containerd container runtime with the CRI plugin.
Components
Nodes in the Screwdriver build Kubernetes cluster (version 1.14 or later) must have the following components set up to use Kata Containers for user builds.
Containerd:
Containerd is a container runtime that helps with management of the complete lifecycle of the container.
Reference: https://containerd.io/docs/getting-started/
CRI-Containerd plugin:
Cri-Containerd is a containerd plugin that implements the Kubernetes Container Runtime Interface (CRI). The CRI plugin interacts with containerd to manage containers.
Reference: https://github.com/containerd/cri
Image credit: containerd / cri. Photo licensed under CC-BY-4.0.
Architecture:
Image credit: containerd / cri. Photo licensed under CC-BY-4.0
Installation:
Reference:
https://github.com/containerd/cri/blob/master/docs/installation.md
https://github.com/containerd/containerd/blob/master/docs/ops.md
Tarball: https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.3.3.linux-amd64.tar.gz
Crictl:
crictl is a command-line tool for debugging, inspecting, and managing pods, containers, and container images on a node.
Reference: https://github.com/containerd/cri/blob/master/docs/crictl.md
Kata:
Kata builds lightweight virtual machines that plug seamlessly into the container ecosystem.
Architecture:
Image credit: kata-containers Project licensed under Apache License Version 2.0
Installation:
- https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#run-kata-containers-with-kubernetes
- https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md
- https://github.com/kata-containers/documentation/blob/master/how-to/how-to-use-k8s-with-cri-containerd-and-kata.md
- https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#kubernetes-runtimeclass
- https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#configuration
Routing builds to Kata nodes in Screwdriver build cluster
Screwdriver uses Runtime Class to route builds to Kata nodes in Screwdriver build clusters. The Screwdriver plugin executor-k8s config handles this based on:
- Pod configuration:
    apiVersion: v1
    kind: Pod
    metadata:
      name: kata-pod
      namespace: sd-build-namespace
      labels:
        sdbuild: "sd-kata-build"
        app: screwdriver
        tier: builds
    spec:
      runtimeClassName: kata
      containers:
      - name: "sd-build-container"
        image: <<image>>
        imagePullPolicy: IfNotPresent
- Update the plugin to use k8s in your buildcluster-queue-worker configuration:
    ---
    executor:
      # Default executor
      plugin: k8s
      k8s:
        exclusion:
          - 'rhel6'
        weightage: 0
        options:
          kubernetes:
            # The host or IP of the kubernetes cluster
            host: kubernetes.default
            # Privileged mode, default restricted, set to true for trusted container runtime use-case
            privileged: false
            automountServiceAccountToken: false
            dockerFeatureEnabled: false
            resources:
              cpu:
                # Number of cpu cores
                micro: "0.5"
                low: 2
                high: 6
                turbo: 12
              memory:
                # Memory in GB
                micro: 1
                low: 2
                high: 12
                turbo: 16
            # Default build timeout for all builds in this cluster
            buildTimeout: 90
            # Default max build timeout
            maxBuildTimeout: 120
            # k8s node selectors for appropriate pod scheduling
            nodeSelectors: {"dedicated":"screwdriver-kata"}
            preferredNodeSelectors: {}
            annotations: {}
            # support for kata-containers-as-a-runtimeclass
            runtimeClass: "kata"
          # Launcher image to use
          launchImage: screwdrivercd/launcher
          # Container tags to use
          launchVersion: stable
          # Circuit breaker config
          fusebox:
            breaker:
              # in milliseconds
              timeout: 10000
          # requestretry configs
          requestretry:
            # in milliseconds
            retryDelay: 3000
            maxAttempts: 5
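To make the relationship between the executor options and the resulting pod more concrete, here is a minimal Python sketch of how a runtimeClass setting and node selectors could be turned into a pod manifest like the one above. This is not the executor-k8s implementation: the function name build_pod_manifest and its arguments are hypothetical, and mapping nodeSelectors onto the plain Kubernetes nodeSelector field is an assumption made for simplicity (the real plugin may schedule pods differently).

```python
import json


def build_pod_manifest(name, image, k8s_options):
    """Return a Pod manifest dict for one build (hypothetical helper)."""
    spec = {
        "containers": [
            {
                "name": "sd-build-container",
                "image": image,
                "imagePullPolicy": "IfNotPresent",
            }
        ]
    }

    # Only request a runtime class when one is configured; an empty value
    # leaves the pod on the cluster's default container runtime.
    runtime_class = k8s_options.get("runtimeClass")
    if runtime_class:
        spec["runtimeClassName"] = runtime_class

    # Assumption: node selectors map directly to the pod's nodeSelector field.
    node_selectors = k8s_options.get("nodeSelectors") or {}
    if node_selectors:
        spec["nodeSelector"] = dict(node_selectors)

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "namespace": "sd-build-namespace",
            "labels": {"app": "screwdriver", "tier": "builds"},
        },
        "spec": spec,
    }


if __name__ == "__main__":
    options = {
        "runtimeClass": "kata",
        "nodeSelectors": {"dedicated": "screwdriver-kata"},
    }
    print(json.dumps(build_pod_manifest("kata-pod", "node:12", options), indent=2))
```

Leaving runtimeClass empty in this sketch produces a pod with no runtimeClassName, which is how a cluster without Kata nodes would keep using its default runtime.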
Production rollout
- Test out the new setup with pilot users
- Route a percentage of traffic to Kata nodes using the weightage configuration
- Because the default Kata guest kernel does not support IA32 (32-bit) binaries, maintain a list of containers to exclude, and route builds to Kata nodes only when the container is not on the list
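The rollout steps above can be sketched as a small routing decision. This is an illustration only, not the Screwdriver router: the function name uses_kata is hypothetical, and the sketch assumes that weightage is a percentage from 0 to 100 and that each exclusion entry is a regular expression matched against the container image name.

```python
import random
import re


def uses_kata(image, exclusions, weightage, rng=random.random):
    """Decide whether a build goes to Kata nodes (hypothetical helper).

    image      -- container image name of the build
    exclusions -- regex patterns for images that must not run on Kata
    weightage  -- assumed percentage (0-100) of eligible builds sent to Kata
    rng        -- callable returning a float in [0, 1); injectable for tests
    """
    # Excluded containers (for example RHEL6 or 32-bit images) never go to Kata.
    if any(re.search(pattern, image) for pattern in exclusions):
        return False
    # Send roughly `weightage` percent of the remaining builds to Kata.
    return rng() * 100 < weightage


if __name__ == "__main__":
    exclusions = ["rhel6"]
    for image in ["node:12", "example/rhel6-base"]:
        target = "kata" if uses_kata(image, exclusions, weightage=50) else "hyperd"
        print(f"{image} -> {target}")
```

Starting with a low weightage and raising it as confidence grows matches the gradual rollout described above, while the exclusion list keeps incompatible containers on the old setup regardless of the percentage.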
Performance
The tables below compare build setup time and overall execution time for Kata and Hyper, with and without the image cached on the node.
Image: node12, with image cached on node | Kata (with 1 min wait in build) | Hyper (with 1 min wait in build) |
Setup time | 28 secs | 50 secs |
Overall execution time | 1 min 32 secs | 1 min 56 secs |

Image: node12, without image cached on node | Kata (with 1 min wait in build) | Hyper (with 1 min wait in build) |
Setup time | 51 secs | 1 min 32 secs |
Overall execution time | 1 min 55 secs | 2 min 40 secs |
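To put the timings above in relative terms, the following Python sketch computes how much less time Kata took compared with Hyper in each row. The helper names to_seconds and reduction_pct are introduced here for illustration; the input values are copied from the tables.

```python
def to_seconds(text):
    """Parse a duration such as '28 secs' or '1 min 32 secs' into seconds."""
    tokens = text.split()
    total = 0
    # Tokens alternate between a number and its unit ("min" or "secs").
    for value, unit in zip(tokens[::2], tokens[1::2]):
        total += int(value) * (60 if unit.startswith("min") else 1)
    return total


def reduction_pct(kata, hyper):
    """Percentage by which the Kata duration is shorter than the Hyper one."""
    kata_s, hyper_s = to_seconds(kata), to_seconds(hyper)
    return 100 * (hyper_s - kata_s) / hyper_s


# (cache state, measurement) -> (Kata, Hyper), taken from the tables above
timings = {
    ("cached", "setup"): ("28 secs", "50 secs"),
    ("cached", "overall"): ("1 min 32 secs", "1 min 56 secs"),
    ("not cached", "setup"): ("51 secs", "1 min 32 secs"),
    ("not cached", "overall"): ("1 min 55 secs", "2 min 40 secs"),
}

if __name__ == "__main__":
    for (cache, phase), (kata, hyper) in timings.items():
        print(f"{cache:10s} {phase:8s} Kata is {reduction_pct(kata, hyper):.0f}% faster")
```

Because every build in the comparison includes a fixed one-minute wait, the setup-time rows are the more direct measure of the difference between the two runtimes.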
Known problems
While the new Kata implementation offers a lot of advantages, there are some known problems, listed here with their fixes or workarounds:
- Containers based on RHEL6 images don’t start and exit immediately
  - Enable kernel_params = “vsyscall=emulate” if you have trouble running pre-2.15 glibc; see Kata issue https://github.com/kata-containers/runtime/issues/1916
- Yum install hangs forever
  - Enable kernel_params = “init=/usr/bin/kata-agent” for a better boot time and a smaller footprint; see Kata issue https://github.com/kata-containers/runtime/issues/1916
  Command (run under time in sh-4.1) | Before fix (real / user / sys) | After fix (real / user / sys) |
  yum remove wget -y | 6m22.190s / 2m38.387s / 3m38.619s | 0m4.774s / 0m0.783s / 0m0.123s |
  yum install wget -y | 6m23.407s / 2m39.387s / 3m42.606s | 0m2.169s / 0m1.760s / 0m0.298s |
- 32-bit executables cannot be loaded; see Kata issue https://github.com/kata-containers/runtime/issues/886
  - As a workaround, we maintain a container exclusion list and route those builds to the current hyperd setup. We plan to EOL these containers by Q4 of this year.
- Containerd IO snapshotter: Overlayfs vs devicemapper as the storage driver
  - Devicemapper gives better performance with Kata
  Overlayfs | Devicemapper |
  1024000000 bytes (976.6MB) copied, 19.325605 seconds, 50.5MB/s | 1024000000 bytes (976.6MB) copied, 5.860671 seconds, 166.6MB/s |
  - Images are stored in both the sys-root and the devicemapper volume, consuming disk space on both volumes
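The copy figures in the Overlayfs versus Devicemapper comparison above can be cross-checked with a few lines of Python. The sketch assumes that “MB” in the reported output means 1024 × 1024 bytes, which is consistent with 1024000000 bytes being shown as 976.6MB; the helper name throughput_mib_s is introduced here for illustration.

```python
MIB = 1024 * 1024  # assumption: the tool reports "MB" as 1024 * 1024 bytes


def throughput_mib_s(num_bytes, seconds):
    """Average copy throughput in MiB per second."""
    return num_bytes / MIB / seconds


# Byte counts and durations copied from the comparison above
overlayfs = throughput_mib_s(1_024_000_000, 19.325605)
devicemapper = throughput_mib_s(1_024_000_000, 5.860671)

if __name__ == "__main__":
    print(f"overlayfs:    {overlayfs:.1f} MiB/s")
    print(f"devicemapper: {devicemapper:.1f} MiB/s")
    print(f"speedup:      {devicemapper / overlayfs:.1f}x")
```

For this single copy test, devicemapper works out to a bit more than three times the overlayfs throughput.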
Compatibility List
In order to use this feature, you will need these minimum versions:
- API – v0.5.902
- UI – v1.0.515
- Build Cluster Queue Worker – v1.18.0
- Launcher – v6.0.71
Contributors
Thanks to the following contributors for making this feature possible:
- Lakshminarasimhan Parthasarathy
- Suresh Visvanathan
- Pritam Paul
- Chester Yuan
- Nandhakumar Venkatachalam
- Min Zhang
Questions & Suggestions
We’d love to hear from you. If you have any questions, please feel free to reach out here. You can also visit us on GitHub and Slack.