Skip to main content


Join Us at Spinnaker Live TODAY!

By Announcement, Blog, Staff

Find out why 2020 is the year of Spinnaker at https://Spinnaker.Live on June 18th at 9:00am PDT. Learn how enterprises accelerate with open source Spinnaker at this Linux Foundation virtual conference co-hosted by the CD Foundation and Armory.

Register now!

“The CD Foundation seeks to improve the world’s capacity to deliver software with security and speed,” said Rosalind Benoit, Director of Community at Armory, and Chair of the CD Foundation Outreach Committee. “Spinnaker.Live speaks to everyone invested in software delivery collaboration and automation. Open source is powered by connections, and this event is to meet, connect, and hear great stories. Please bring your energy and ideas to this incredible global community!”

Spinnaker is a free and open source continuous delivery software platform developed by Netflix and Google to create tailor-made, collaborative continuous delivery pipelines. With unique multi-cloud building blocks, it integrates all the tools, approvals, and infrastructure needed to automate an enterprise software delivery lifecycle.

Spinnaker is housed under the CD Foundation umbrella at the Linux Foundation. It is a Founding Project of the CD Foundation. 

Continued Growth in 2020

Spinnaker is continuing to grow in 2020, boasting more contributors and more Pull Requests than ever before.

Key statistics for 2020

  • Q1 2020 was the first quarter since Spinnaker was open sourced that the project had at least 2 new contributors each week
  • Of the 1,183 contributors to Spinnaker in the last year, 464, or 40%, contributed in Q1 2020
  • Merged Pull Requests have skyrocketed in 2020. These are the code and documentation contributions that the project accepts and incorporates.
    • Average since open sourced: 399/month
    • Average in the last 12 months: 605/month
      • Previous high was 656 (March 2019, 1.6x the average since being open sourced)
    • February 674 (1.7x avg)
    • March 891 (2.2x avg)
    • April 962 (2.5x avg)
    • May 755 (1.9x avg)

Notable Amazon Support

Spinnaker has been implemented widely with well known companies like Adobe, AirBnb, Autodesk, Comcast, Salesforce, SAP, and many more using Spinnaker to handle the software delivery life cycle. Of note, Amazon Web Service (AWS) has dramatically increased contributions to Spinnaker in 2020. 

Up-to-date statistics are available on Devstats. They show a strong spike coming into 2020 in AWS contributions, with pull requests in recent months more than tripling 2019’s monthly highs. Amazon has stated publicly that they are backing Spinnaker due to strong enterprise customer demand. 

AWS will be prominently represented at Spinnaker.Live with a keynote, breakout session, panel, and use case talks from AWS experts and companies who deploy software to AWS. Don’t miss it!

Register Now!

Let’s Continuous CI/CD in China!

By Blog, Staff

We held the first CI/CD Meetup in China on February 29th. There were more than 5000 attendees who joined us together.  It is time to do more CI/CD, so we will hold the second CI/CD Meetup in China on June 19th.

There will be four topics focused on: Pipeline, CI, SCM, and Automation Testing. 

1. Build a dedicated pipeline engine: Jenkins shared library in-depth design and practice

Combining actual business, design patterns, and Jenkins features step by step, tells how to support large-scale, complex pipelined practice based on Jenkins SharedLibrary expansion library: from how native solutions solve actual business needs to structured design to solve atoms. The minimum unit) abstraction and basic capabilities are enhanced, and then to dynamic refactoring to build a pipeline execution engine.

Gu Zheng

JD Software Architect

2. Construction of engineering efficiency data in Continuous Integration

Engineering performance data is an important tool for improving R&D efficiency. Through years of tool construction, Didi has already possessed a relatively complete R&D tool chain, and the improvement of engineering efficiency has shifted from the original single-point capability optimization to the optimization of the entire process. As the most important continuous integration and continuous release process in the entire R&D process, how to perceive and improve it through data driving will be the key to continuously improve the engineering efficiency.

Zhou Fan

Didi Chuxing, Development Tool Team Leader

Personal profile: Graduated from Beijing University of Posts and Telecommunications in 2007 with a bachelor’s degree and a master’s degree in communications engineering. After graduation, he joined Google directly. He has worked in the Beijing office and the US headquarters for more than 10 years. Gained experience in development, testing and operation and maintenance. In 2018, he joined Didi Chuxing as the head of the R&D tool team, leading the team to improve the company’s R&D tools and engineering efficiency.

3. Ways to improve software quality: code review

Code review is a common topic, so why exactly do code review, how to do it, how to improve efficiency, those that can be manual, those that can be automated, what detailed considerations, let industry veterans take you around.

Li Peng

Senior experts in SCM and DevOps

Personal profile

With more than 20 years of experience in the software industry, he has comprehensive experience in development, operation, maintenance and management. He has worked in many companies such as Xin’an Century, Motorola, Ericsson, Alpha Motors, Horizon Robotics and so on. Familiar with the industry’s CMMI, TL9000, ISO9001: 2000 standards, proficient in various development methodologies; proficient in software quality management, configuration management, CICD, automated testing and other key aspects of DevOps.

4. Test environment, practice and implementation of full-stack DevOps platform

Introduce the development and testing side of Minsheng Bank, the DevOps platform architecture, core capabilities, and implementation. There are currently more than 4,000 active users on the DevOps platform; more than 240 supported projects; full-stack support for back-end, front-end and client.

Hu Wenan

Minsheng Bank, DevOps architect

Personal profile:

Hu Wenan, DevOps Architect of Minsheng Bank, is responsible for organization-level R&D specifications and process formulation, as well as the planning and construction of the PaaS platform and DevOps platform.

Register Now (in Chinese):

Kata Containers in Screwdriver

By Blog, Project

Written by Lakshminarasimhan Parthasarathy


Screwdriver is a scalable CI/CD solution which uses Kubernetes to manage user builds. Screwdriver build workers interfaces with Kubernetes using either “executor-k8s” or “executor-k8s-vm” depending on required build isolation. 

executor-k8s runs builds directly as Kubernetes pods while executor-k8s-vm  uses HyperContainers along with Kubernetes for stricter build isolation with containerized Virtual Machines (VMs). This setup was ideal for running builds in an isolated, ephemeral, and lightweight environment. However, hyperd is now deprecated, has no support, is based on an older Docker runtime and it also required non-native Kubernetes setup for build execution. Therefore, it was time to find a new solution.

Why Kata Containers ?

Kata Containers is an open source project and community that builds a standard implementation of lightweight virtual machines (VMs) that perform like containers, but provide the workload isolation and security advantages of VMs. It combines the benefits of using a hypervisor, such as enhanced security, along with container orchestration capabilities provided by Kubernetes. It is the same team behind HyperD where they successfully merged the best parts of Intel Clear Containers with RunV. As a Kubernetes runtime, Kata enables us to deprecate executor-k8s-vm and use executor-k8s exclusively for all Kubernetes based builds.

Screwdriver Journey to Kata

As we faced a growing number of instabilities with the current HyperD – like network and devicemapper issues and IP cleanup workarounds, we started our initial evaluation of Kata in early 2019 ( and identified two major blockers to move ahead with Kata:

1. Security concern for privileged mode (required to run docker daemon in kata)

2. Disk performance. 

We recently started reevaluating Kata in early 2020 based on a fix to “add flag to overload default privileged host device behaviour” provided by Containerd/cri (, but still we faced issues with disk performance and switched from overlayfs to devicemapper, which yielded significant improvement. With our two major blockers resolved and initial tests with Kata looking promising, we moved ahead with Kata.

Screwdriver Build Architecture

Replacing Hyper with Kata led to a simpler build architecture. We were able to remove the custom build setup scripts to launch Hyper VM and rely on native Kubernetes setup. 


To use Kata containers for running user builds in a Screwdriver Kubernetes build cluster, a cluster admin needs to configure Kubernetes to use Containerd container runtime with Cri-plugin.


Screwdriver build Kubernetes cluster (minimum version: 1.14+) nodes must have the following components set up for using Kata containers for user builds. 


Containerd is a container runtime that helps with management of the complete lifecycle of the container.


CRI-Containerd plugin:

Cri-Containerd is a containerd plugin which implements Kubernetes container runtime interface. CRI plugin interacts with containerd to manage the containers.


Image credit: containerd / cri. Photo licensed under CC-BY-4.0.


Image credit: containerd / cri. Photo licensed under CC-BY-4.0





To debug, inspect, and manage their pods, containers, and container images.



Builds lightweight virtual machines that seamlessly plugin to the containers ecosystem.


Image credit: kata-containers Project licensed under Apache License Version 2.0



Routing builds to Kata nodes in Screwdriver build cluster

Screwdriver uses Runtime Class to route builds to Kata nodes in Screwdriver build clusters. The Screwdriver plugin executor-k8s config handles this based on: 

  1. Pod configuration:
apiVersion: v1
kind: Pod
  name: kata-pod
  namespace: sd-build-namespace
    sdbuild: "sd-kata-build"
    app: screwdriver
    tier: builds
  runtimeClassName: kata
  - name: "sd-build-container"
    image: <<image>>
    imagePullPolicy: IfNotPresent
  1. Update  the plugin to use k8s in your buildcluster-queue-worker configuration

    # Default executor
    plugin: k8s
        - 'rhel6'
      weightage: 0
            # The host or IP of the kubernetes cluster
            host: kubernetes.default
            # Privileged mode, default restricted, set to true for trusted container runtime use-case
            privileged: false
            automountServiceAccountToken: false
            dockerFeatureEnabled: false
                    # Number of cpu cores
                    micro: "0.5"
                    low: 2
                    high: 6
                    turbo: 12
                    # Memory in GB
                    micro: 1
                    low: 2
                    high: 12
                    turbo: 16
            # Default build timeout for all builds in this cluster
            buildTimeout: 90
            # Default max build timeout
            maxBuildTimeout: 120
            # k8s node selectors for approprate pod scheduling
            nodeSelectors: {"dedicated":"screwdriver-kata"}
            preferredNodeSelectors: {}
            annotations: {}
            # support for kata-containers-as-a-runtimeclass
            runtimeClass: "kata"
        # Launcher image to use
        launchImage: screwdrivercd/launcher
        # Container tags to use
        launchVersion: stable
        # Circuit breaker config
                # in milliseconds
                timeout: 10000
        # requestretry configs
            # in milliseconds
            retryDelay: 3000
            maxAttempts: 5

Production rollout

  1. Test out the new setup with pilot users
  2. Route a percentage of traffic to Kata nodes using the weightage configuration
  3. Based on the limitation “Kata default guest kernel does not support IA32 bit binaries”, maintain a list of containers to exclude; only route builds to nodes with Kata when the container is not in the list


The below tables compare build setup and overall execution time for Kata and Hyper when the image is pre-cached or not cached.

Image: node12with Image cached in nodeKata (with 1 min wait in build)Hyper (with 1 min wait in build)
Setup time28 secs50 secs
Overall execution time1 min 32 secs1 min 56 secs
Image: node12without Image cached in nodeKata (with 1 min wait in build)HyperD (with 1 min wait in build)
Setup time51 secs1 min 32 secs
Overall time1 min 55 secs2 min 40 secs

Known problems

While the new Kata implementation offers a lot of advantages, there are some known problems we are aware of with fixes or workarounds:

  1. Run images based on Rhel6 containers don’t start and immediately exit 
  1. Yum install will hang forever
Before fixAfter fix
sh-4.1# time yum remove wget -yreal 6m22.190suser 2m38.387ssys 3m38.619s
sh-4.1# time yum install wget -yreal 6m23.407suser 2m39.387ssys 3m42.606s
sh-4.1# time yum remove wget -yreal 0m4.774suser 0m0.783ssys 0m0.123s
sh-4.1# time yum install wget -yreal 0m2.169suser 0m1.760ssys 0m0.298s
  1. 32-bit executable cannot be loaded refer kata issue 
  • To workaround/mitigate we maintain a container exclusion list and route to current hyperd setup and we have plans to eol these containers by Q4 of this year.
  1. Containerd IO snapshotter – Overlayfs vs devicemapper for storage driver
  • Devicemapper gives better performance with kata
1024000000 bytes (976.6MB) copied, 19.325605 seconds, 50.5MB/s1024000000 bytes (976.6MB) copied, 5.860671 seconds, 166.6MB/s
  1. Image stored in both sys-root and devicemapper volume, consuming both volume disk space 

Compatibility List

In order to use this feature, you will need these minimum versions:


Thanks to the following contributors for making this feature possible:

Questions & Suggestions

We’d love to hear from you. If you have any questions, please feel free to reach out here. You can also visit us on Github and Slack.

Introducing Our Newest CDF Ambassador – Oscar Medina

By Blog, Staff

Hi Everyone,

My name is Oscar Medina, and I am thrilled to be part of this fantastic community. I have spent over 22 years in the technology industry, and have seen things come and go.

One thing that excites me these days (aside from the plethora of outdoor activities), is the paradigm shift I’ve seen throughout my career when it comes to systems architecture.  

Microservices and container orchestration is not going away as other things have. This is why I am committed to spreading the word and helping educate folks on what the Continuous Delivery Foundation is all about.

Over the past 4.5 years or so, I have spent a lot of time in the open-source world. I am currently a Developer Advocate for the Jenkins X project, which is also now under the CD Foundation umbrella along with other projects such as Spinnaker, Jenkins, Tekton, and Jenkins X, of course.

I look forward to meeting you at different organized events, virtual or hopefully in person in the future.

Standup Paddle Boarding on Lake Tahoe, California

My coding buddy, Ginger loves the outdoors too!


Introducing Our Newest CDF Ambassador – BMK Lakshminarayanan

By Blog, Staff

Hello CDF Members,

I am BMK Lakshminarayanan from New Zealand. I am excited to join on the other great ambassadors’ line-up with Continuous Delivery Foundation (CDF) as a newly appointed CDF Ambassador.

About me:

I am a passionate Solutions Architect over 20 years of ICT experience working with Bank of New Zealand. I am a hands-on engineer, architect and worked on various challenging assignments ranging from desktop applications to distributed systems.

I am #DevOps #ContinuousDelivery advocate and evangelist for modern engineering & developer practices including helping developer productive, effective and efficient at the same time simple methods, approaches to software architecture.

I am passionate about sharing and learning with the community. Outside of my work, I run community groups and host CNCF New Zealand #meetup for Cloud-Native enthusiasts and The Future ICT to help students, people returning to work or looking for career opportunities in ICT.

I am also CNCF & DevOps Institue ambassador with a commitment to connecting the Humans of DevOps and Modern IT to advance the Skills, Knowledge, Ideas & Learning (SKIL). I am a frequent speaker at local meetups and international in-person & virtual conferences and also the Core organiser of DevOpsDays New Zealand and co-chair Cloud-Native Summit Wellington conferences. I am passionate about engaging, connecting & learning with various community members & leaders.

I am excited to associate with CDF as an Ambassador in promoting core values of CDF, open-source and vendor-neutral CI/CD tools. I am enthusiastic about the opportunity of contributing to and supporting the global community of CD Foundation & growing continuous delivery ecosystem.

Let us learn, share, care and grow together.

Please feel free to reach me out on LinkedIn or Twitter: @LBMKRISHNA

Introducing Our Newest CDF Ambassador – Eduardo Arango

By Blog, Staff

Carlos Eduardo Arango Gutierrez – Red Hat (

Eduardo is a performance engineer at Red Hat, working on the OpenShift performance & latency sensitive applications (PSAP) . Eduardo is also a Computer Science PhD student at Universidad del Valle, Cali, Colombia, working on containerized distributed systems for research computing, with high focus on automated workflows and GitOps.

His research interests include High Performance Computing, Distributed systems, Dependency management, Linux containers and most recently, Container orchestration. 

Over the past 5 years Eduardo has focused on enabling researchers to build and deploy performance sensitive applications with containers on distributed environments, by creating tutorials, talks and meetups around how to bridge Research computing and Cloud Native ecosystems.

Introducing Our Newest CDF Ambassador – Helen Beal

By Blog, Staff

Hello there! I’m Helen Beal, a new CD Foundation Ambassador from Chichester in the UK. I’m also Chief Ambassador at DevOps Institute so you can tell I’m a huge fan of the power of community. I’m also a DevOps coach, writer and speaker and strategic advisor. Books and words are a huge part of my life – I read constantly and have also written several novels with more in the pipeline. I also love playing Scrabble and Bananagrams.

This is me at home in Chichester, in the beautiful Priory Park. Behind me is the priory where William Blake was tried for sedition in 1803.

I’m really excited to have this opportunity to work closely with the Continuous Delivery community as it’s such an integral part of what we do in DevOps and I’ve been working with the software development lifecycle for my whole career, starting with Lotus Notes (remember that?!) in 1995! I write and speak about many different aspects of DevOps – recently I’ve been really focused on neuroscience in the workplace and value stream management. I also just did my first talk on the relationship between community and capitalism for TechStrongCon. Here’s a beautiful visual rendition of a recent talk by the wonderful MindsEyeCCF.

When I’m not DevOpsing, I tend to be out enjoying the beautiful British countryside. I’m a volunteer warden at a local nature reserve, Kingley Vale, where I pick up litter, ask people to put their dogs on leads and monitor species like the Chalk Hill Blue butterfly and one of our two UK snake species, the (lightly venomous) adder. Here’s me with a baby tawny owl.

I’m excited to be contributing to this community and meeting new people and learning new things. You can follow me on InfoQ here and Medium here. Find me on LinkedIn here and Twitter here.

Introducing Our Newest CDF Ambassador – Alexander Raul

By Blog, Staff

My name is Alexander Raul – and I am extremely happy to join the Continuous Delivery Foundation as a Community Ambassador!  

I am the CEO of Rackner, which is a cloud native consultancy focused on Kubernetes and Open Source – so my day to day is really driven by projects in the ecosystem.  Continuous Delivery is a piece which doesn’t get as much credit as it deserves and where there’s still plenty of work to be done.  

I am looking forward to introducing projects like Spinnaker and Tekton to developers all over the globe – and let’s be clear, Continuous Delivery should make the developer’s job easier while improving operational capability.  If it only does one of the two, there’s probably a better solution.

A case for declarative configurations for ML training

By Blog, Community

Contributed by Benedikt Koller

Original article posted on May 17, 2020

No way around it: I am what you call an “Ops guy”. In my career I admin’ed more servers than I’ve written code. Over twelve years in the industry have left their permanent mark on me. For the last two of those I’m exposed to a new beast – Machine Learning. My hustle is bringing Ops-Knowledge to ML. These are my thoughts on that.

Deploying software into production

Hundreds of thousands of companies deploy software into production every day. Every deployment mechanism has someone who built it. Whoever it was (The Ops Guy™, SRE-Teams, “Devops Engineers”), all follow tried-and-true paradigms. After all, the goal is to ship code often, in repeatable and reliable ways. Let me give you a quick primer on two of those.

Infrastructure-as-code (IaC)

Infrastructure as code, or IaC, applies software engineering rules to infrastructure management. The goal is to avoid environment drift, and to ensure idempotent operations. In plain words, read the infrastructure configuration and you’ll know exactly how the resulting environment looks like. You can rerun the provisioning without side effects, and your infrastructure has a predictable state. IaC allows for version-controlled evolution of infrastructures and quick provisioning of extra resources. It does so through declarative configurations.

Famous tools for this paradigm are Terraform, and to a large degree Kubernetes itself.

Immutable infrastructure

In conjunction with IaC, immutable infrastructure ensures the provisioned state is maintained. Someone ssh’ed onto your server? Its tainted – you have no guarantee that it still is in the identical shape to the rest of your stack. Interaction between a provisioned infrastructure and new code happens only through automation. Infrastructure, e.g. a Kubernetes cluster, is never modified after it’s provisioned. Updates, fixes and modifications are only possible through new deployments of your infrastructure.

Operational efficiency requires thorough automation and handling of ephemeral data. Immutable infrastructure mitigates config drift and snowflake server woes entirely.

ML development

Developing machine learning models works in different ways. In a worst case scenario, new models begin their “life” in a Jupyter Notebook on someones laptop. Code is not checked into git, there is no requirements file, and cells can be executed in any arbitrary order. Data exploration and preprocessing are intermingled. Training happens on that one shared VM with the NVIDIA K80, but someone messed with the CUDA drivers. Ah, and does anyone remember where I put those matplotlib-screenshots that showed the AUROC and MSE?

Getting ML models into production reliably, repeatedly and fast remains a challenge, and large data sets become a multiplying factor. The solution? Learn from our Ops-brethren.

We can extract key learnings from the evolution of infrastructure management and software deployments:

  1. Automate processing and provisioning
  2. Version-control states and instructions
  3. Write declarative configs

How can we apply them to a ML training flow?

Fetching data

Automate fetching of data. Declaratively define the datasource, the subset of data to use and then persist the results. Repeated experiments on the same source and subset can use the cached results.

Thanks to automation, fetching data can be rerun at any time. The results are persisted, so data can be versioned. And by reading the input configuration everyone can clearly tell what went into the experiment.

Splitting (and preprocessing data)

Splitting data can be standardized into functions.

  • Splitting happens on a quota, e.g. 70% into train, 30% into eval. Data might be sorted on an index, data might be categorized.
  • Splitting happens based on features/colums. Data might be categorized, Data might be sorted on an index.
  • Data might require preprocessing / feature engineering (e.g. filling, standardization).
  • A wild mix of the above.

Given those, we can define an interface and invoke processing through parameters – and use a declarative config. Persist the results so future experiments can warm-start.

Implementation of interfaces makes automated processing possible. The resulting train/eval datasets are versionable, and my input config is the declarative authority on the resulting state of the input dataset.



Standardizing models is hard. Higher-level abstractions like Tensorflow and Keras already provide comprehensive APIs, but complex architectures need custom code injection.

A declarative config will, at least, state which version-controlled code was used. Re-runs on the same input will deliver the same results, re-runs on different inputs can be compared. Automation of training will yield a version-controllable artefact – the model – of a declared and therefore anticipatable shape.


Surprisingly, this is the hardest to fully automate. The dataset and individual usecase define the required evaluation metrics. However, we can stand on the shoulders of giants. Great tools like Tensorboard and the What-If-Tool go a long way. Our automation just needs to account for enough flexibility that a.) custom metrics for evaluation can be injected, and b.) raw training results are exposed for custom evaluation means.


Serving is caught between the worlds. It would be easy to claim that a trained model is a permanent artifact, like you might claim that a Docker container acts as an artifact of software development. We can borrow another learning from software developers – if you don’t understand where your code is run, you don’t understand your code.

Only by understanding how a model is served will a ML training flow ever be complete. For one, data is prone to change. A myriad of reasons might be the cause, but the result remains the same: Models need to be retrained to account for data drift. In short, continuous training is required. Through the declarative configuration of our ML flow so far we can reuse this configuration and inject new data – and iterate on those new results.

For another, preprocessing might need embedding with your model. Automation lets us apply the same preprocessing steps used in training to live data, guaranteeing identical shape of input data.


Outside academia, performance of machine learning models is measured through impact – economically, or by increased efficiency. Only reliable and consistent results are true measures for the success of applied ML. We as a new and still growing part of software engineering have to make sure of this success. And the reproducibility of success hinges on the repeatability of the full ML development lifecycle.