Blog

Introducing Our Newest CDF Ambassador – Steven Terrana

By Blog, Staff

Heyyo!

My name is Steven Terrana. It’s great to be here! I’m currently a DevSecOps & Platforms Engineer at Booz Allen Hamilton.

My day to day largely consists of working with teams to implement large-scale CI/CD pipelines using Jenkins, implementing DevSecOps principles, and adopting all the buzzwords :).

Through experiencing all of the pains associated with the “large-scale” pipeline development, I developed the Jenkins Templating Engine: a Jenkins plugin that allows users to stop copying and pasting Jenkinsfiles by creating tool-agnostic, pipeline templates that can be shared across teams enabling organizational governance while optimizing for developer autonomy. If that sounds cool, you can check out the Jenkins Online Meetup.

You can probably find me somewhere in the Jenkins community. I help drive the Pipeline Authoring SIG and contribute to community plugins and pipeline documentation where I can.

I’m excited to be a part of an organization in CDF that’s helping to establish best practices, propel the adoption of continuous delivery tooling, and facilitate interoperability across emerging technologies to streamline software delivery.

Oh, yeah, and I have two cats and a turtle. Meet James Bond, GG, and Sheldon:

Follow me on Twitter @steven_terrana

From Harness – Automating Enterprise Governance within Delivery Pipelines

By Blog, Member

Originally posted on the Harness.io blog by Tiffany Jachja (@tiffanyjachja)

In an organization where developers are continuously pushing code to production, managing risks can be difficult. In Measuring and Managing Information Risk: A FAIR Approach, Jack Freund and Jack Jones describe governance as a cost-effective approach to “govern the organization’s risk landscape.” You want to ensure your organization actively understands and manages risk, especially in heavily regulated industries expected to comply with governing authorities and standards (see compliance, or a blog post on measuring compliance).

Governance, risk management, and compliance (GRC) is an umbrella term covering an organization’s approach across these three practices. Freund and Jones describe the risk and compliance objectives as follows:

“This [the risk] objective is all about making better-informed risk decisions, which boils down to three things: (1) identifying ‘risks,’ (2) effectively rating and prioritizing ‘risks,’ and (3) making decisions about how to mitigate ‘risks’ that are significant enough to warrant mitigation.” 

“Of the three objectives, compliance management is the simplest—at least on the surface. On the surface, compliance is simply a matter of identifying the relevant expectations (e.g., requirements defined by Basel, Payment Card Industry (PCI), SOX, etc.), documenting and reporting on how the organization is (or is not) complying with those expectations, and tracking and reporting on activities to close any gaps.”

So if GRC is about aligning an organization to managing risks, what role do developers play?

From code commit to production

We discussed in the previous blog posts the importance of taking a systematic approach to developing software delivery processes. We shared practices like Value Stream Mapping, to give organizations the tools to better understand their value streams and to accelerate their DevOps journey. These DevOps practices indicate that every software delivery stakeholder is responsible for the value they deliver. But on the flip side, they also indicate that stakeholders are responsible for any risks that they create or introduce. 

The DevOps Automated Governance Reference Architecture, found here, shares how to further adopt a systems approach to delivery.

By looking at each stage in your delivery pipeline, you can define the inputs, outputs, actors, actions, risks, and control points related to that stage. 

The essential part of governance is that developers are aware of the risks at each stage. The reference architecture paper shares some of the common risks associated with code commits, such as unapproved changes and PII or credentials in source code. Likewise, for deploying to production, you can have risks such as low-quality code in production, lack of quality gates, and unexpected system behaviors in production. 

These risks help define areas of control points that help manage that risk. If you face the risk of unapproved changes, introduce a change approval process. Likewise, you can control risks through secrets management, application quality analysis, quality gate evaluation, and enforced deployment strategies.

Everyone involved in the process from code commits to production is responsible for mitigating risks. 

The pieces to Enterprise Governance

Now let’s discuss the components of a governance process for a cloud environment. The DevOps Automated Governance Reference Architecture, found here, shares an approach to navigating your automated governance journey. Many of the concepts discussed here are explained in detail in that reference paper.

Notes are metadata definitions. Occurrences are generated for each artifact or resource that needs this note. As an example, a note could provide details of a specific vulnerability, such as the impacts, names, and status. I would generate an occurrence for every container image with that security vulnerability. Similarly, I could have a note that defines a specific application deployment; as I promote the deployment across different environments, I would generate an occurrence for each. There is a one-to-many relationship between notes and occurrences.
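To make the one-to-many relationship concrete, here is a minimal Python sketch. The class names, fields, and the `registry.example` image references are assumptions made up for illustration; they are not the API of any real metadata service.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Note:
    """Metadata definition, e.g. the details of one vulnerability (hypothetical shape)."""
    note_id: str
    kind: str          # e.g. "VULNERABILITY" or "DEPLOYMENT"
    description: str


@dataclass(frozen=True)
class Occurrence:
    """One instance of a note attached to a specific artifact or resource."""
    note_id: str
    resource: str      # e.g. a container image reference


@dataclass
class MetadataStore:
    notes: Dict[str, Note] = field(default_factory=dict)
    occurrences: List[Occurrence] = field(default_factory=list)

    def add_note(self, note: Note) -> None:
        self.notes[note.note_id] = note

    def record(self, note_id: str, resource: str) -> Occurrence:
        # An occurrence must always point at an existing note.
        if note_id not in self.notes:
            raise KeyError(f"unknown note: {note_id}")
        occurrence = Occurrence(note_id, resource)
        self.occurrences.append(occurrence)
        return occurrence

    def resources_for(self, note_id: str) -> List[str]:
        # One note can map to many resources.
        return [o.resource for o in self.occurrences if o.note_id == note_id]


store = MetadataStore()
store.add_note(Note("vuln-example-1", "VULNERABILITY", "hypothetical library flaw"))
store.record("vuln-example-1", "registry.example/app:1.0")
store.record("vuln-example-1", "registry.example/api:2.3")
affected = store.resources_for("vuln-example-1")
print(affected)
```

A single note describes the vulnerability once, and each affected image gets its own occurrence, so a fix or status change only has to be described in one place.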

An attestation is a particular type of note that represents a verification that you’ve satisfied in your governance process. Attestations are tied to attestors, which hold the authority to verify a control point within your governance process. For example, determining you’ve passed a code review or a unit test is an attestation. Each attestation represents a control point within your governance process. 

A binary authorization policy uses a list of attestors to represent your governance as code. A binary auth policy acts as a series of gates so that you cannot get to the next stage of your software delivery before getting an attestation from each attestor. Therefore, it’s common practice to turn on binary authorization (BinAuthz) in your Kubernetes environment to ensure you are governing changes and deployments. In Google Kubernetes Engine (GKE), an Admission Controller checks for attestations whenever you interact with your environment. Here’s more information on how BinAuthz works for GKE.
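The gate behavior can be sketched in a few lines of Python. This is only an illustration of the idea, not the BinAuthz API: the `Policy` class, the attestor names, and the image references are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Policy:
    """List of attestors that must sign off before an image may be deployed."""
    required_attestors: List[str]


def missing_attestors(policy: Policy, attestations: Dict[str, Set[str]], image: str) -> List[str]:
    """Return the required attestors that have not yet attested to this image."""
    signed_by = attestations.get(image, set())
    return [a for a in policy.required_attestors if a not in signed_by]


def admit(policy: Policy, attestations: Dict[str, Set[str]], image: str) -> bool:
    """Admit the image only when every required attestor has signed off."""
    return not missing_attestors(policy, attestations, image)


policy = Policy(["code-review", "unit-tests", "vulnerability-scan"])
attestations = {
    "registry.example/app:1.0": {"code-review", "unit-tests", "vulnerability-scan"},
    "registry.example/app:1.1": {"code-review"},
}

for image in attestations:
    verdict = "admit" if admit(policy, attestations, image) else "deny"
    print(image, verdict, missing_attestors(policy, attestations, image))
```

Returning the list of missing attestors, rather than only a yes/no answer, tells a developer which control point still needs to be satisfied.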

If you’d like to learn more about designing control points for your governance process, Capital One also shared their pipeline design through a concept called “16 Gates” in a blog post called “Focusing on the DevOps Pipeline.”

Harnessing your governance process

Governance processes require automation to accelerate software delivery; otherwise, they can harm your velocity and time to market. A popular topic to emerge in the past year is automated pipeline governance, which gives enterprises the ability to attest to the integrity of assets in a delivery pipeline. Pipeline governance goes beyond traditional CI/CD, where developers simply automate delivery without truly mitigating risk. Continuous Integration and Continuous Delivery platforms can help heavily regulated industries manage their governance processes when developers and operations understand the organization’s risks.

From Harness – Value-Stream Mapping (VSM): Your Software Delivery

By Blog, Member

Originally posted on the Harness.io blog by Tiffany Jachja (@tiffanyjachja)

Value-stream mapping (VSM) is a lean manufacturing technique popularized in the 90s after its successful applications in the manufacturing industry with companies like Toyota. Since then, DevOps practitioners have shared these processes as they apply to software development. DevOps literature, such as the DevOps Handbook, suggests value stream maps inform the most critical areas of application for DevOps practices and technology. A VSM is also known as a material and information flow map. Using this map, you can identify areas of improvement and map your current state to your future state. If you’d like an in-depth look at VSM, I recommend “Value Stream Mapping: How to Visualize Work and Align Leadership for Organizational Transformation” written by Karen Martin and Mike Osterling. This blog post summarizes the key concepts of VSM and shares how you can use VSM within your IT organization.

Value Stream Map Symbols and Components

In a VSM, three types of flow deliver a product/service to a consumer: the flow of information, the flow of materials, and the flow of time. The flow of information goes from right to left in a VSM. In contrast, the flow of materials and time goes from left to right. The illustration below shows the major components of a VSM.

Within the information section, the factory symbol denotes the customer, supplier, and any other entities. All VSMs have a customer and supplier specified. A truck indicates the frequency of delivery. Within the materials section, the process boxes host additional space to include information about the resources needed to complete a particular process. It can be advantageous to also track inventory between processes. This is often denoted by a triangle. We also have the arrows which can denote nonlinear/sequential relationships between processes. Lastly, a VSM has a flow of time to showcase lead times in the flow of materials and information. 

These are the major components of a VSM. There may be additional VSM components and symbols that are helpful for more complex value stream exercises. This resource explains those VSM components in more detail.

Creating a Value Stream Map

Example Value Stream Map (edited from a Lucid Chart Template found here).

Here are the steps to creating a value stream map:

  1. Select the product or service you’d like to value stream map. Start with the most important / highest value proposition. Here I have Simple Service at the top center of my VSM.
  2. Start with your customers. The customer should drive the entire stream of value. In the example, a customer sends all their requests to a simple service.
  3. Define your start and end processes. Scope your VSM with a start and end trigger. Here I have prioritizing reported bugs as my starting event. Likewise, deploying the feature is my final event.
  4. Include information flow. Recall that information flows from right to left in a VSM. In my example, I have “Feature Requests” as the triggers to my start process. Once code has made it to production, I have “Ship the Service” as my final flow of information.
  5. Fill in the remaining processes. In my example, I have the components of a standard software lifecycle: develop, build, test, and deploy. Here I could also add in counts of inventory or average quantity between processes.
  6. Gather process data. I am including who is involved in each process along with the tools they use. Be objective in this step and the following one. The goal is to capture the current state. 
  7. Create the timeline. Here I am mapping current metrics on each process. There are three standard metrics used in VSM: Lead Time, Process Time, and Percent Complete & Accurate. Lead time is the elapsed time from initiating the phase to completing it. Process time is the amount of time it takes to handle the request. Percent Complete & Accurate (%C&A or PCA) is the percentage of occurrences where the finished output was correct according to the requirements of the customer of the process.
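As a rough illustration of how the three timeline metrics roll up across a stream, the Python sketch below sums lead and process times and multiplies the per-process %C&A values. The process names mirror the example lifecycle, but every number is invented, and the two summary figures (activity ratio and rolled %C&A) come from common VSM practice rather than from this post.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, List


@dataclass(frozen=True)
class Process:
    name: str
    lead_time_hours: float        # elapsed time from start to completion
    process_time_hours: float     # hands-on time spent on the work
    pct_complete_accurate: float  # %C&A as a fraction between 0 and 1


def summarize(processes: List[Process]) -> Dict[str, float]:
    total_lead = sum(p.lead_time_hours for p in processes)
    total_process = sum(p.process_time_hours for p in processes)
    if total_lead <= 0:
        raise ValueError("total lead time must be positive")
    return {
        "total_lead_time_hours": total_lead,
        "total_process_time_hours": total_process,
        # Share of the elapsed time in which work is actually being done.
        "activity_ratio": total_process / total_lead,
        # Chance that work passes through every process without rework.
        "rolled_pct_complete_accurate": prod(p.pct_complete_accurate for p in processes),
    }


# Invented numbers for the develop/build/test/deploy example.
stream = [
    Process("develop", 40.0, 16.0, 0.90),
    Process("build", 2.0, 0.5, 0.95),
    Process("test", 24.0, 6.0, 0.80),
    Process("deploy", 8.0, 1.5, 0.99),
]
summary = summarize(stream)
for metric, value in summary.items():
    print(f"{metric}: {value:.3f}")
```

A low activity ratio points at waiting time between processes, while a low rolled %C&A points at rework, so the two numbers suggest different kinds of improvement.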

Evaluating a Value Stream Map

Now that we have a VSM, let’s discuss how to analyze it. If you are not adding value to your end customer, you are adding to the cost of production. Value stream management practices encourage organizations to focus on the flow of value.

Organizations will look at a Value Stream Map to consider the performance of entire systems. Starting with a single stream of value can help ease other parts of your organization into adopting new ideas and practices. 

Flow refers to the flow of work in your value stream. Work can come from feature requests, requirements, defects, and/or support tickets. The goal is to ensure that the value stream is always moving forward. Some things that can affect flow include:

  • Changing priorities and task switching,
  • Lack of visibility around problems and processes,
  • Long development cycles,
  • And not getting code to higher environments

The book Team Topologies, authored by Matthew Skelton and Manuel Pais, shares a few more obstacles to flow. https://teamtopologies.com/

Here is a guide to improving the flow of work within a value stream. It involves creating a dedicated transformation team that’s held “accountable for achieving a clearly defined, measurable, system-level result (e.g., reduce the deployment lead time from “code committed into version control to successfully running in production” by 50%).” This dedicated team should have the resources and freedom to utilize the different DevOps practices to achieve the results.

There are also eight types of waste defined through the lean manufacturing movement. The diagram below describes these types of waste.

The 8 Types of Waste. (Image source: https://theleanway.net/The-8-Wastes-of-Lean)

If we’d like to minimize waste in our software delivery processes, we need to consider how to manage these findings.

Here are some more ways to look at a Value Stream Map.

  1. Optimize Value Streams through Process: Check whether your process has the following characteristics: valuable (does the customer need this), capable (quality results), available (minimal downtime), adequate (meets demand), flexible (can be switched or configured).
  2. Focus on flow: Is flow continuous or stagnant? Consider your quantity levels. In the earlier VSM example, the flow could stagnate if there are no feature requests. Likewise, if you have too many features developed, your QA process may need more scaling. If stagnation exists, identify the areas where your value stream is suffering.
  3. Analyze the flow of information: Is it a push or pull model? In push models, the supplier supplies customers with features and deliverables. In a pull model, the customer is requesting those features. It’s important to have the right balance to ensure you’re not overtaxing specific job functions and processes in your value stream.

A well-executed VSM workshop gives those in the room the opportunity to champion change. Maybe you’re using an outdated tool for a certain process. A VSM workshop invites critical stakeholders to come and challenge areas of your value stream.

Harness Your Value Stream

Every organization has a process for delivering their product or service to their customers. Value-stream mapping allows you to optimize your flow of materials and information by lowering costs and improving value adds. This blog post shares some tips for navigating your DevOps journey through value stream management techniques such as Value Stream Mapping.

Introducing Our Newest CDF Ambassador – Alex Jones

By Blog, Staff

Hello folks,

My name is Alex and I am excited to be a part of the Continuous Delivery Foundation community. I believe that the work we do here will help the world deliver faster through interoperability of technology and foundational component governance.

I am an open source contributor to many projects and also advocate for the work of the Cloud Native Computing Foundation. I hold it a great privilege to help give a platform for those without a voice and act as a force multiplier to create opportunities for others.

I have worked for the past thirteen years at Microsoft, BSkyB, Blinkbox, Beamery and more, as both an individual contributor and an engineering leader.

Now I am an engineering director at American Express.
I work on large scale serverless runtime observability, proliferating DevOps practices and hybrid-cloud cluster development on hyper-converged infrastructure.

It is my hope that I can bring real end-user community desires and feedback to the governance body of the CDF to help us understand and shape the activities that we invest time in. Equally I am looking forward to working as an ambassador to help the community gain tangible benefits of a collaborative vendor-neutral continuous delivery ecosystem.

– Alex

Near my home in the beautiful south of England.

Introducing Our Newest CDF Ambassador – Hector Calderon

By Blog, Staff

Hi All,

I am Hector Calderon, an engineer, architect, or insert open source loving nerd adjective here. I have worked with companies of all sizes from small to enterprise. With a vendor/cloud agnostic mentality, I have had the chance to work with many open source projects, including all of the graduate CDF projects. When I am not trying to automate myself out of a position I am out exploring the world with my dog, Piper. 

My delivery journey started a few years ago when I was assigned to optimize RPM libraries for modernizing mainframes. Yes, MAINFRAMES. It’s actually a lot cooler than it sounds and I had the opportunity to work with some of the best engineers I know to this day.

Over the years and various companies, I noticed the most important requirement for delivering success is largely based upon the mentality of the team and company (below is a meme we put on the back of our CI team shirts). Note, huge fan of Napoleon Hill’s QQMA formula – quality, quantity, and mental attitude. 

At the end of the day, everybody wants a high-quality product at a fast speed with the least amount of overhead.

I have not been much of a contributor in the past, but now I am looking to give back to the community that has taught me so much. My objective as Ambassador is to make delivery easier for everyone. Feel free to reach out.

Stay safe and stay tuned for my next post.

Join Us at Spinnaker Live TODAY!

By Announcement, Blog, Staff

Find out why 2020 is the year of Spinnaker at https://Spinnaker.Live on June 18th at 9:00am PDT. Learn how enterprises accelerate with open source Spinnaker at this Linux Foundation virtual conference co-hosted by the CD Foundation and Armory.

Register now!

“The CD Foundation seeks to improve the world’s capacity to deliver software with security and speed,” said Rosalind Benoit, Director of Community at Armory, and Chair of the CD Foundation Outreach Committee. “Spinnaker.Live speaks to everyone invested in software delivery collaboration and automation. Open source is powered by connections, and this event is to meet, connect, and hear great stories. Please bring your energy and ideas to this incredible global community!”

Spinnaker is a free and open source continuous delivery software platform developed by Netflix and Google to create tailor-made, collaborative continuous delivery pipelines. With unique multi-cloud building blocks, it integrates all the tools, approvals, and infrastructure needed to automate an enterprise software delivery lifecycle.

Spinnaker is housed under the CD Foundation umbrella at the Linux Foundation. It is a Founding Project of the CD Foundation. 

Continued Growth in 2020

Spinnaker is continuing to grow in 2020, boasting more contributors and more Pull Requests than ever before.

Key statistics for 2020

  • Q1 2020 was the first quarter since Spinnaker was open sourced that the project had at least 2 new contributors each week
  • Of the 1,183 contributors to Spinnaker in the last year, 464, or 40%, contributed in Q1 2020
  • Merged Pull Requests have skyrocketed in 2020. These are the code and documentation contributions that the project accepts and incorporates.
    • Average since open sourced: 399/month
    • Average in the last 12 months: 605/month
      • Previous high was 656 (March 2019, 1.6x the average since being open sourced)
    • February 674 (1.7x avg)
    • March 891 (2.2x avg)
    • April 962 (2.5x avg)
    • May 755 (1.9x avg)

Notable Amazon Support

Spinnaker has been widely adopted, with well-known companies like Adobe, Airbnb, Autodesk, Comcast, Salesforce, SAP, and many more using it to handle the software delivery life cycle. Of note, Amazon Web Services (AWS) has dramatically increased contributions to Spinnaker in 2020.

Up-to-date statistics are available on Devstats. They show a strong spike coming into 2020 in AWS contributions, with pull requests in recent months more than tripling 2019’s monthly highs. Amazon has stated publicly that they are backing Spinnaker due to strong enterprise customer demand. 

AWS will be prominently represented at Spinnaker.Live with a keynote, breakout session, panel, and use case talks from AWS experts and companies who deploy software to AWS. Don’t miss it!

Register Now!

Let’s Continuous CI/CD in China!

By Blog, Staff

We held the first CI/CD Meetup in China on February 29th, and more than 5,000 attendees joined us. It is time to do more CI/CD, so we will hold the second CI/CD Meetup in China on June 19th.

There will be four topics focused on: Pipeline, CI, SCM, and Automation Testing. 

1. Build a dedicated pipeline engine: Jenkins shared library in-depth design and practice

Combining real business needs, design patterns, and Jenkins features, this talk explains step by step how to support large-scale, complex pipelines built on a Jenkins shared library: from how native solutions address actual business needs, to a structured design that abstracts atoms (the minimum unit) and enhances basic capabilities, and then to dynamic refactoring to build a pipeline execution engine.

Gu Zheng

JD Software Architect

2. Construction of engineering efficiency data in Continuous Integration

Engineering performance data is an important tool for improving R&D efficiency. Through years of tool construction, Didi has built a relatively complete R&D tool chain, and the improvement of engineering efficiency has shifted from the original single-point capability optimization to the optimization of the entire process. Continuous integration and continuous release form the most important process in the entire R&D workflow, so understanding and improving it through data will be the key to continuously improving engineering efficiency.

Zhou Fan

Didi Chuxing, Development Tool Team Leader

Personal profile: Zhou Fan graduated from Beijing University of Posts and Telecommunications in 2007 with a bachelor’s degree and a master’s degree in communications engineering. After graduation, he joined Google, where he worked in the Beijing office and at the US headquarters for more than 10 years, gaining experience in development, testing, and operations and maintenance. In 2018, he joined Didi Chuxing as the head of the R&D tool team, leading the team to improve the company’s R&D tools and engineering efficiency.

3. Ways to improve software quality: code review

Code review is a familiar topic. Why do code review at all, how should it be done, how can it be made more efficient, which parts must be manual, which can be automated, and what details deserve attention? Let an industry veteran walk you through it.

Li Peng

Senior expert in SCM and DevOps

Personal profile

With more than 20 years of experience in the software industry, he has comprehensive experience in development, operation, maintenance and management. He has worked at many companies, including Xin’an Century, Motorola, Ericsson, Alpha Motors, and Horizon Robotics. He is familiar with the industry’s CMMI, TL9000, and ISO9001:2000 standards, and proficient in various development methodologies, software quality management, configuration management, CI/CD, automated testing and other key aspects of DevOps.

4. Test environment, practice and implementation of full-stack DevOps platform

An introduction to the development and testing side of Minsheng Bank, covering the DevOps platform architecture, core capabilities, and implementation. The DevOps platform currently has more than 4,000 active users and more than 240 supported projects, with full-stack support for back-end, front-end and client.

Hu Wenan

Minsheng Bank, DevOps architect

Personal profile:

Hu Wenan, DevOps Architect of Minsheng Bank, is responsible for organization-level R&D specifications and process formulation, as well as the planning and construction of the PaaS platform and DevOps platform.

Register Now (in Chinese): https://www.bagevent.com/event/6518004?bag_track=bagevent

Kata Containers in Screwdriver

By Blog, Project

Written by Lakshminarasimhan Parthasarathy

Introduction

Screwdriver is a scalable CI/CD solution which uses Kubernetes to manage user builds. Screwdriver build workers interface with Kubernetes using either “executor-k8s” or “executor-k8s-vm” depending on required build isolation.

executor-k8s runs builds directly as Kubernetes pods, while executor-k8s-vm uses HyperContainers along with Kubernetes for stricter build isolation with containerized Virtual Machines (VMs). This setup was ideal for running builds in an isolated, ephemeral, and lightweight environment. However, hyperd is now deprecated, has no support, is based on an older Docker runtime, and requires a non-native Kubernetes setup for build execution. Therefore, it was time to find a new solution.

Why Kata Containers?

Kata Containers is an open source project and community that builds a standard implementation of lightweight virtual machines (VMs) that perform like containers, but provide the workload isolation and security advantages of VMs. It combines the benefits of using a hypervisor, such as enhanced security, along with container orchestration capabilities provided by Kubernetes. It comes from the same team behind HyperD, who successfully merged the best parts of Intel Clear Containers with Hyper.sh RunV. As a Kubernetes runtime, Kata enables us to deprecate executor-k8s-vm and use executor-k8s exclusively for all Kubernetes based builds.

Screwdriver Journey to Kata

As we faced a growing number of instabilities with the current HyperD – like network and devicemapper issues and IP cleanup workarounds, we started our initial evaluation of Kata in early 2019 (https://github.com/screwdriver-cd/screwdriver/issues/818#issuecomment-482239236) and identified two major blockers to move ahead with Kata:

1. Security concern for privileged mode (required to run docker daemon in kata)

2. Disk performance. 

We started reevaluating Kata in early 2020 based on a fix to “add flag to overload default privileged host device behaviour” provided by Containerd/cri (https://github.com/containerd/cri/pull/1225). We still faced issues with disk performance, so we switched from overlayfs to devicemapper, which yielded significant improvement. With our two major blockers resolved and initial tests with Kata looking promising, we moved ahead with Kata.

Screwdriver Build Architecture

Replacing Hyper with Kata led to a simpler build architecture. We were able to remove the custom build setup scripts to launch Hyper VM and rely on native Kubernetes setup. 

Setup

To use Kata containers for running user builds in a Screwdriver Kubernetes build cluster, a cluster admin needs to configure Kubernetes to use Containerd container runtime with Cri-plugin.

Components

Screwdriver build Kubernetes cluster (minimum version: 1.14+) nodes must have the following components set up for using Kata containers for user builds. 

Containerd:

Containerd is a container runtime that helps with management of the complete lifecycle of the container.

Reference: https://containerd.io/docs/getting-started/

CRI-Containerd plugin:

Cri-Containerd is a containerd plugin which implements the Kubernetes Container Runtime Interface (CRI). The CRI plugin interacts with containerd to manage the containers.

Reference: https://github.com/containerd/cri

Image credit: containerd / cri. Photo licensed under CC-BY-4.0.

Architecture:

Image credit: containerd / cri. Photo licensed under CC-BY-4.0

Installation:

Reference: 

https://github.com/containerd/cri/blob/master/docs/installation.md

https://github.com/containerd/containerd/blob/master/docs/ops.md

Tarball: https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.3.3.linux-amd64.tar.gz

Crictl:

A command-line tool to debug, inspect, and manage pods, containers, and container images.

Reference: https://github.com/containerd/cri/blob/master/docs/crictl.md

Kata:

Builds lightweight virtual machines that seamlessly plug into the containers ecosystem.

Architecture:

Image credit: kata-containers Project licensed under Apache License Version 2.0

Installation:

  1. https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#run-kata-containers-with-kubernetes
  2. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md
  3. https://github.com/kata-containers/documentation/blob/master/how-to/how-to-use-k8s-with-cri-containerd-and-kata.md
  4. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#kubernetes-runtimeclass
  5. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#configuration

Routing builds to Kata nodes in Screwdriver build cluster

Screwdriver uses Runtime Class to route builds to Kata nodes in Screwdriver build clusters. The Screwdriver plugin executor-k8s config handles this based on: 

  1. Pod configuration:
apiVersion: v1
kind: Pod
metadata:
  name: kata-pod
  namespace: sd-build-namespace
  labels:
    sdbuild: "sd-kata-build"
    app: screwdriver
    tier: builds
spec:
  runtimeClassName: kata
  containers:
  - name: "sd-build-container"
    image: <<image>>
    imagePullPolicy: IfNotPresent

  2. Update the plugin to use k8s in your buildcluster-queue-worker configuration:

---
executor:
    # Default executor
    plugin: k8s
    k8s:
      exclusion:
        - 'rhel6'
      weightage: 0
      options:
        kubernetes:
            # The host or IP of the kubernetes cluster
            host: kubernetes.default
            # Privileged mode, default restricted, set to true for trusted container runtime use-case
            privileged: false
            automountServiceAccountToken: false
            dockerFeatureEnabled: false
            resources:
                cpu:
                    # Number of cpu cores
                    micro: "0.5"
                    low: 2
                    high: 6
                    turbo: 12
                memory:
                    # Memory in GB
                    micro: 1
                    low: 2
                    high: 12
                    turbo: 16
            # Default build timeout for all builds in this cluster
            buildTimeout: 90
            # Default max build timeout
            maxBuildTimeout: 120
            # k8s node selectors for appropriate pod scheduling
            nodeSelectors: {"dedicated":"screwdriver-kata"}
            preferredNodeSelectors: {}
            annotations: {}
            # support for kata-containers-as-a-runtimeclass
            runtimeClass: "kata"
        # Launcher image to use
        launchImage: screwdrivercd/launcher
        # Container tags to use
        launchVersion: stable
        # Circuit breaker config
        fusebox:
            breaker:
                # in milliseconds
                timeout: 10000
        # requestretry configs
        requestretry:
            # in milliseconds
            retryDelay: 3000
            maxAttempts: 5

Production rollout

  1. Test out the new setup with pilot users
  2. Route a percentage of traffic to Kata nodes using the weightage configuration
  3. Based on the limitation “Kata default guest kernel does not support IA32 bit binaries”, maintain a list of containers to exclude; only route builds to nodes with Kata when the container is not in the list
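The rollout steps above can be sketched as a small routing rule in Python. This is a hypothetical illustration, not Screwdriver’s implementation: the function name, the `kata_percent` parameter (standing in for the weightage setting, whose exact semantics are not described here), and the substring match against the exclusion list are all assumptions.

```python
import random
from typing import Iterable, Optional


def choose_runtime(image: str, exclusions: Iterable[str], kata_percent: int,
                   rng: Optional[random.Random] = None) -> str:
    """Return "kata" or "hyperd" for one build (hypothetical routing rule)."""
    if not 0 <= kata_percent <= 100:
        raise ValueError("kata_percent must be between 0 and 100")
    # Images on the exclusion list (e.g. rhel6) are never sent to Kata nodes.
    if any(token in image for token in exclusions):
        return "hyperd"
    rng = rng or random.Random()
    # Send roughly kata_percent of the remaining builds to Kata nodes.
    return "kata" if rng.random() * 100 < kata_percent else "hyperd"


exclusions = ["rhel6"]
for image in ["node:12", "rhel6:latest"]:
    print(image, "->", choose_runtime(image, exclusions, kata_percent=50))
```

Checking the exclusion list before applying the percentage means an excluded image stays on the old setup even when the rollout reaches 100%.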

Performance

The below tables compare build setup and overall execution time for Kata and Hyper when the image is pre-cached or not cached.

| Image: node12, with image cached in node | Kata (with 1 min wait in build) | Hyper (with 1 min wait in build) |
| --- | --- | --- |
| Setup time | 28 secs | 50 secs |
| Overall execution time | 1 min 32 secs | 1 min 56 secs |

| Image: node12, without image cached in node | Kata (with 1 min wait in build) | HyperD (with 1 min wait in build) |
| --- | --- | --- |
| Setup time | 51 secs | 1 min 32 secs |
| Overall time | 1 min 55 secs | 2 min 40 secs |
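To put the table values in relative terms, the short Python sketch below converts each pair of timings to seconds and computes the percentage reduction of Kata over Hyper. The timings are the ones reported in the tables; the helper function and dictionary layout are only for illustration.

```python
def percent_reduction(baseline_secs: int, new_secs: int) -> float:
    """Percentage by which new_secs is lower than baseline_secs."""
    return 100.0 * (baseline_secs - new_secs) / baseline_secs


# (scenario, metric): (Kata seconds, Hyper seconds), taken from the tables
timings = {
    ("image cached", "setup"): (28, 50),
    ("image cached", "overall"): (92, 116),
    ("image not cached", "setup"): (51, 92),
    ("image not cached", "overall"): (115, 160),
}

reductions = {
    key: percent_reduction(hyper, kata)
    for key, (kata, hyper) in timings.items()
}

for (scenario, metric), value in reductions.items():
    print(f"{scenario}, {metric}: Kata is {value:.1f}% faster than Hyper")
```

Because every build includes the same 1 minute wait, the relative gain in overall time is smaller than the gain in setup time.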

Known problems

While the new Kata implementation offers a lot of advantages, there are some known problems we are aware of with fixes or workarounds:

  1. Images based on RHEL6 containers don’t start and exit immediately
  2. Yum install will hang forever

| Command | Timing | Before fix | After fix |
| --- | --- | --- | --- |
| sh-4.1# time yum remove wget -y | real | 6m22.190s | 0m4.774s |
| | user | 2m38.387s | 0m0.783s |
| | sys | 3m38.619s | 0m0.123s |
| sh-4.1# time yum install wget -y | real | 6m23.407s | 0m2.169s |
| | user | 2m39.387s | 0m1.760s |
| | sys | 3m42.606s | 0m0.298s |

  3. 32-bit executables cannot be loaded; refer to Kata issue https://github.com/kata-containers/runtime/issues/886
    • As a workaround, we maintain a container exclusion list and route those builds to the current hyperd setup. We plan to EOL these containers by Q4 of this year.
  4. Containerd IO snapshotter: overlayfs vs devicemapper for storage driver
    • Devicemapper gives better performance with Kata

| Overlayfs | Devicemapper |
| --- | --- |
| 1024000000 bytes (976.6MB) copied, 19.325605 seconds, 50.5MB/s | 1024000000 bytes (976.6MB) copied, 5.860671 seconds, 166.6MB/s |

  5. Image stored in both sys-root and devicemapper volume, consuming disk space on both volumes
Compatibility List

In order to use this feature, you will need these minimum versions:

Contributors

Thanks to the following contributors for making this feature possible:

Questions & Suggestions

We’d love to hear from you. If you have any questions, please feel free to reach out here. You can also visit us on Github and Slack.