Skip to main content
Category

Project

Kata Containers in Screwdriver

By Blog, Project

Written by Lakshminarasimhan Parthasarathy

Introduction

Screwdriver is a scalable CI/CD solution which uses Kubernetes to manage user builds. Screwdriver build workers interfaces with Kubernetes using either “executor-k8s” or “executor-k8s-vm” depending on required build isolation. 

executor-k8s runs builds directly as Kubernetes pods while executor-k8s-vm  uses HyperContainers along with Kubernetes for stricter build isolation with containerized Virtual Machines (VMs). This setup was ideal for running builds in an isolated, ephemeral, and lightweight environment. However, hyperd is now deprecated, has no support, is based on an older Docker runtime and it also required non-native Kubernetes setup for build execution. Therefore, it was time to find a new solution.

Why Kata Containers ?

Kata Containers is an open source project and community that builds a standard implementation of lightweight virtual machines (VMs) that perform like containers, but provide the workload isolation and security advantages of VMs. It combines the benefits of using a hypervisor, such as enhanced security, along with container orchestration capabilities provided by Kubernetes. It is the same team behind HyperD where they successfully merged the best parts of Intel Clear Containers with Hyper.sh RunV. As a Kubernetes runtime, Kata enables us to deprecate executor-k8s-vm and use executor-k8s exclusively for all Kubernetes based builds.

Screwdriver Journey to Kata

As we faced a growing number of instabilities with the current HyperD – like network and devicemapper issues and IP cleanup workarounds, we started our initial evaluation of Kata in early 2019 (https://github.com/screwdriver-cd/screwdriver/issues/818#issuecomment-482239236) and identified two major blockers to move ahead with Kata:

1. Security concern for privileged mode (required to run docker daemon in kata)

2. Disk performance. 

We recently started reevaluating Kata in early 2020 based on a fix to “add flag to overload default privileged host device behaviour” provided by Containerd/cri (https://github.com/containerd/cri/pull/1225), but still we faced issues with disk performance and switched from overlayfs to devicemapper, which yielded significant improvement. With our two major blockers resolved and initial tests with Kata looking promising, we moved ahead with Kata.

Screwdriver Build Architecture

Replacing Hyper with Kata led to a simpler build architecture. We were able to remove the custom build setup scripts to launch Hyper VM and rely on native Kubernetes setup. 

Setup

To use Kata containers for running user builds in a Screwdriver Kubernetes build cluster, a cluster admin needs to configure Kubernetes to use Containerd container runtime with Cri-plugin.

Components

Screwdriver build Kubernetes cluster (minimum version: 1.14+) nodes must have the following components set up for using Kata containers for user builds. 

Containerd:

Containerd is a container runtime that helps with management of the complete lifecycle of the container.

Reference: https://containerd.io/docs/getting-started/

CRI-Containerd plugin:

Cri-Containerd is a containerd plugin which implements Kubernetes container runtime interface. CRI plugin interacts with containerd to manage the containers.

Reference: https://github.com/containerd/cri

Image credit: containerd / cri. Photo licensed under CC-BY-4.0.

Architecture:

Image credit: containerd / cri. Photo licensed under CC-BY-4.0

Installation:

Reference: 

https://github.com/containerd/cri/blob/master/docs/installation.md

https://github.com/containerd/containerd/blob/master/docs/ops.md

Tarball: https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.3.3.linux-amd64.tar.gz

Crictl:

To debug, inspect, and manage their pods, containers, and container images.

Reference: https://github.com/containerd/cri/blob/master/docs/crictl.md

Kata:

Builds lightweight virtual machines that seamlessly plugin to the containers ecosystem.

Architecture:

Image credit: kata-containers Project licensed under Apache License Version 2.0

Installation:

  1. https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#run-kata-containers-with-kubernetes
  2. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md
  3. https://github.com/kata-containers/documentation/blob/master/how-to/how-to-use-k8s-with-cri-containerd-and-kata.md
  4. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#kubernetes-runtimeclass
  5. https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#configuration

Routing builds to Kata nodes in Screwdriver build cluster

Screwdriver uses Runtime Class to route builds to Kata nodes in Screwdriver build clusters. The Screwdriver plugin executor-k8s config handles this based on: 

  1. Pod configuration:
apiVersion: v1
kind: Pod
metadata:
  name: kata-pod
  namespace: sd-build-namespace
  labels:
    sdbuild: "sd-kata-build"
    app: screwdriver
    tier: builds
spec:
  runtimeClassName: kata
  containers:
  - name: "sd-build-container"
    image: <<image>>
    imagePullPolicy: IfNotPresent
  1. Update  the plugin to use k8s in your buildcluster-queue-worker configuration

---
executor:
    # Default executor
    plugin: k8s
    k8s:
      exclusion:
        - 'rhel6'
      weightage: 0
      options:
        kubernetes:
            # The host or IP of the kubernetes cluster
            host: kubernetes.default
            # Privileged mode, default restricted, set to true for trusted container runtime use-case
            privileged: false
            automountServiceAccountToken: false
            dockerFeatureEnabled: false
            resources:
                cpu:
                    # Number of cpu cores
                    micro: "0.5"
                    low: 2
                    high: 6
                    turbo: 12
                memory:
                    # Memory in GB
                    micro: 1
                    low: 2
                    high: 12
                    turbo: 16
            # Default build timeout for all builds in this cluster
            buildTimeout: 90
            # Default max build timeout
            maxBuildTimeout: 120
            # k8s node selectors for approprate pod scheduling
            nodeSelectors: {"dedicated":"screwdriver-kata"}
            preferredNodeSelectors: {}
            annotations: {}
            # support for kata-containers-as-a-runtimeclass
            runtimeClass: "kata"
        # Launcher image to use
        launchImage: screwdrivercd/launcher
        # Container tags to use
        launchVersion: stable
        # Circuit breaker config
        fusebox:
            breaker:
                # in milliseconds
                timeout: 10000
        # requestretry configs
        requestretry:
            # in milliseconds
            retryDelay: 3000
            maxAttempts: 5

Production rollout

  1. Test out the new setup with pilot users
  2. Route a percentage of traffic to Kata nodes using the weightage configuration
  3. Based on the limitation “Kata default guest kernel does not support IA32 bit binaries”, maintain a list of containers to exclude; only route builds to nodes with Kata when the container is not in the list

Performance

The below tables compare build setup and overall execution time for Kata and Hyper when the image is pre-cached or not cached.

Image: node12with Image cached in nodeKata (with 1 min wait in build)Hyper (with 1 min wait in build)
Setup time28 secs50 secs
Overall execution time1 min 32 secs1 min 56 secs
Image: node12without Image cached in nodeKata (with 1 min wait in build)HyperD (with 1 min wait in build)
Setup time51 secs1 min 32 secs
Overall time1 min 55 secs2 min 40 secs

Known problems

While the new Kata implementation offers a lot of advantages, there are some known problems we are aware of with fixes or workarounds:

  1. Run images based on Rhel6 containers don’t start and immediately exit 
  1. Yum install will hang forever
Before fixAfter fix
sh-4.1# time yum remove wget -yreal 6m22.190suser 2m38.387ssys 3m38.619s
sh-4.1# time yum install wget -yreal 6m23.407suser 2m39.387ssys 3m42.606s
sh-4.1# time yum remove wget -yreal 0m4.774suser 0m0.783ssys 0m0.123s
sh-4.1# time yum install wget -yreal 0m2.169suser 0m1.760ssys 0m0.298s
  1. 32-bit executable cannot be loaded refer kata issue  https://github.com/kata-containers/runtime/issues/886 
  • To workaround/mitigate we maintain a container exclusion list and route to current hyperd setup and we have plans to eol these containers by Q4 of this year.
  1. Containerd IO snapshotter – Overlayfs vs devicemapper for storage driver
  • Devicemapper gives better performance with kata
OverlayfsDevicemapper
1024000000 bytes (976.6MB) copied, 19.325605 seconds, 50.5MB/s1024000000 bytes (976.6MB) copied, 5.860671 seconds, 166.6MB/s
  1. Image stored in both sys-root and devicemapper volume, consuming both volume disk space 

Compatibility List

In order to use this feature, you will need these minimum versions:

Contributors

Thanks to the following contributors for making this feature possible:

Questions & Suggestions

We’d love to hear from you. If you have any questions, please feel free to reach out here. You can also visit us on Github and Slack.

Jenkins & Spinnaker: Tale As Old As Screen Time

By Blog, Project

CDF Newsletter – May 2020 Article
Subscribe to the Newsletter

By Rosalind Benoit

Don’t worry. As long as you hit that wire with the connecting hook at precisely eighty-eight miles per hour the instant the lightning strikes the tower…everything will be fine.

– Dr. Emmett Brown, “Back To The Future”

If you’re reading this, you’ve probably experienced the feeling of your heart racing — hopefully with excitement, but more likely, with anxiety — as a result of your involvement in the software development lifecycle (SDLC). At most organizations, artifacts must traverse a complex network of teams, tools, and constraints to come into being and arrive in production. As software becomes more and more vital to social connection and economic achievement, we feel the pressure to deliver transformational user experiences.

No company has influenced human expectations for reliably delightful software experiences more than Netflix. After 10 years of supporting large-scale logistics workloads with its mail-order business, Netflix launched an addictive streaming service in 2007. It soon experienced SDLC transformation at an uncommonly rapid pace, and at massive scale. After pioneering a new entertainment standard, Netflix survived and innovated through all the learnings that come with growth.

We’ll soon have one more reason to be glad it did; Back to the Future arrives on Netflix May 1!

https://www.youtube.com/watch?v=KqYvQchlriY

Jenkins at Netflix

You may know Netflix as the birthplace of open source Spinnaker, but it is also a perennial Jenkins user. As early cloud adopters, Netflix teams quickly learned to automate build and test processes, and heavily leveraged Jenkins, evolving from “a single massive Jenkins master in our datacenter, to running 25 Jenkins masters in AWS” as of 2016. 

Jenkins changed the software development and delivery game by freeing teams from rigid, inflexible build processes and moving them into continuous integration. With test and build automation, “it works on my laptop” became a moot point. A critical leap for software-centric businesses like Netflix, this ignited a spark of the possible. 

As Jenkins became an open source standard, engineers leveraged it to prove the power of software innovation, and the difference that velocity makes to improving user experiences and business outcomes. This approachable automation still works, and most of us still use it, over 15 years after its first release. 

Over time, Netflix teams found it increasingly difficult to meet velocity, performance, and reliability demands when deploying their code to AWS with Jenkins alone. Too much technical debt had accumulated in their Jenkins and its scripts, and developers, feeling the anxiety, craved more deployment automation features. So, Netflix began to build the tooling that evolved into today’s Spinnaker. 

Spinnaker & Delegation

Much like what Jenkins did for testing and integration, Spinnaker has done for release automation. It allows us to stitch together the steps required to safely deliver updates and features to production; it delegates pipeline stages to systems across the toolchain, from build and test, to monitoring, compliance, and more. Spinnaker increasingly uses its plugin framework to integrate tools. However, its foundational Jenkins integration exists natively, using triggers to pick up artifacts from it, and stages to delegate tasks to it. With property files to pass data for use in variables further down the pipeline, and concepts like Jenkins’ “unstable build” built in, Spinnaker can leverage the power of existing Jenkins assets. 

Then, out of the box, Spinnaker adds the “secret sauce” pioneered by companies like Netflix to deliver the software experiences users now expect. With Spinnaker, you can skip change approval meetings by adding manual judgments to pipelines where human decisions are required. You can perform hotfixes with confidence and limit the blast radius of experiments by using automated canary analysis or your choice of deployment strategy. Enjoy these features when deploying code or functions to any cloud and/or Kubernetes, without maintaining custom scripts to architect pipelines. 

As a developer, I found that I had the best experience using Jenkins for less complicated jobs and pipelines; even with much of the process defined as code, I didn’t always have enough context to fully understand the progression of the artifact or debug. Since joining the Spinnaker community, I’ve learned to rely on Jenkins stages for discrete steps like applying a Chef cookbook or signalling a Puppet run. I can manage these steps from Spinnaker, where, along with deployment strategies and native infrastructure dashboards, I can also experiment with data visualization using tools like SumoLogic, and even run terraform code. 

It’s simple to get started with the integration. I use Spinnaker’s Halyard tool to add my Jenkins master, and boom:

If Jenkins is a Swiss Army knife, Spinnaker is a magnetic knife strip. Their interoperability story is the story of continuous delivery’s evolution, and allows us to use the right tool for the right job:

  • Jenkins: not only do I have all the logic and capability needed to perform your testing, integration, and deployment steps, I’m also an incredibly flexible tool with a plugin for every special need of every development team under the sun. I’m game for any job!
  • Spinnaker: not only can I give your Jenkins jobs a context-rich home, I also delegate to all your other SDLC tools, and visualize the status and output of each. My fancy automation around deployment verifications, windows, and strategies makes developers happy and productive!

My first real experience with DevOps was a Jenkins talk delivered by Tracy Ragan at a conference in Albuquerque, where I worked as an (anxious) sysadmin for learning management systems at UNM. It’s amazing to have come full circle and joined the CDF landscape as a peer from a fellow member company. I look forward to aiding the interoperability story as it unfolds in our open source ecosystem. We’re confident the tale will transform software delivery, yet again. 

Join Spinnaker Slack to connect with other DevOps professionals using Jenkins and Spinnaker to deliver software with safety and velocity!

From Jenkins – Google Summer of Code 2019 Report

By Blog, Project

Originally posted on the Jenkins blog by Martin d’AnjouJeff PearceOleg NenashevMarky Jackson

Google Summer of Code is much more than a summer internship program, it is a year-round effort for the organization and some community members. Now, after the DevOps World | Jenkins World conference in Lisbon and final retrospective meetings, we can say that GSoC 2019 is officially over. We would like to start by thanking all participants: students, mentors, subject matter experts and all other contributors who proposed project ideas, participated in student selection, in community bonding and in further discussions and reviews. Google Summer of Code is a major effort which would not be possible without the active participation of the Jenkins community.

In this blogpost we would like to share the results and our experience from the previous year.

Results

Five GSoC projects were successfully completed this year: Role Strategy Plugin Performance ImprovementsPlugins Installation Manager CLI Tool/LibraryWorking Hours Plugin – UI ImprovementsRemoting over Apache Kafka with Kubernetes featuresMulti-branch Pipeline support for Gitlab SCM. We will talk about the projects a little later in the document.

Highlights

Project details

We held the final presentations as Jenkins Online Meetups in late August and Google published the results on Sept 3rd. The final presentations can be found here: Part 1Part 2Part 3. We also presented the 2019 Jenkins GSoC report at the DevOps World | Jenkins World San Francisco and at the DevOps World | Jenkins World 2019 Lisbon conferences.

In the following sections, we present a brief summary of each project, links to the coding phase 3 presentations, and to the final products.

Role Strategy Plugin Performance Improvements

Role Strategy Plugin is one of the most widely used authorization plugins for Jenkins, but it has never been famous for performance due to architecture issues and regular expression checks for project roles. Abhyudaya Sharma was working on this project together with hist mentors: Oleg NenashevRunze Xia and Supun Wanniarachchi. He started the project from creating a new Micro-benchmarking Framework for Jenkins Plugins based on JMH, created benchmarks and achieved a 3501% improvement on some real-world scenarios. Then he went further and created a new Folder-based Authorization Strategy Plugin which offers even better performance for Jenkins instances where permissions are scoped to folders. During his project Abhyudaya also fixed the Jenkins Configuration-as-Code support in Role Strategy and contributed several improvements and fixes to the JCasC Plugin itself.

Role strategy performance improvements

Plugins Installation Manager CLI Tool/Library

Natasha Stopa was working on a new CLI tool for plugin management, which should unify features available in other tools like install-plugins.sh in Docker images. It also introduced many new features like YAML configuration format support, listing of available updates and security fixes. The newly created tool should eventually replace the previous ones. Natasha’s mentors: Kristin WhetstoneJon Brohauge and Arnab Banerjee. Also, many contributors from Platform SIG and JCasC plugin team joined the project as a key stakeholders and subject-matter experts.

Plugin Manager Tool YAML file

Working Hours Plugin – UI Improvements

Jenkins UI and frontend framework are a common topic in the Jenkins project, especially in recent months after the new UX SIG was established. Jack Shen was working on exploring new ways to build Jenkins Web UI together with his mentor Jeff Pearce. Jack updated the Working Hours Plugin to use UI controls provided by standard React libraries. Then he documented his experienced and created template for plugins with React-based UI.

Web UI controls in React

Remoting over Apache Kafka with Kubernetes features

Long Le Vu Nguyen was working on extended Kubernetes support in the Remoting over Apache Kafka Plugin. His mentors were Andrey Falco and Pham vu Tuan who was our GSoC 2018 student and the plugin creator. During this project Long has added a new agent launcher which provisions Jenkins agents in Kubernetes and connects them to the master. He also created a Cloud API implementation for it and a new Helm chart which can provision Jenkins as entire system in Kubernetes, with Apache Kafka enabled by default. All these features were released in Remoting over Apache Kafka Plugin 2.0.

Jenkins in Kubernetes with Apache Kafka

Multi-branch Pipeline support for Gitlab SCM

Parichay Barpanda was working on the new GitLab Branch Source Plugin with Multi-branch Pipeline Jobs and Folder Organisation support. His mentors were Marky Jackson-TauliaJustin HarringaZhao Xiaojie and Joseph Petersen. The plugin scans the projects, importing the pipeline jobs it identifies based on the criteria provided. After a project is imported, Jenkins immediately runs the jobs based on the Jenkinsfile pipeline script and notifies the status to GitLab Pipeline Status. This plugin also provides GitLab server configuration which can be configured in Configure System or via Jenkins Configuration as Code (JCasC). read more about this project in the GitLab Branch Source 1.0 announcement.

Gitlab Multi-branch Pipeline support

Projects which were not completed

Not all projects have been completed this year. We were also working on Artifact Promotion plugin for Jenkins Pipeline and on Cloud Features for External Workspace Manager Plugin, but unfortunately both projects were stopped after coding phase 1. Anyway, we got a lot of experience and takeaways in these areas (see linked Jira tickets!. We hope that these stories will be implemented by Jenkins contributors at some point. Google Summer of Code 2020 maybe?

Running the GSoC program at our organization level

Here are some of the things our organization did before and during GSoC behind the scenes. To prepare for the influx of students, we updated all our GSoC pages and wrote down all the knowledge we accumulated over the years of running the program. We started preparing in October 2018, long before the official start of the program. The main objective was to address the feedback we got during GSoC 2018 retrospectives.

Project ideas. We started gathering project ideas in the last months of 2018. We prepared a list of project ideas in a Google doc, and we tracked ownership of each project in a table of that document. Each project idea was further elaborated in its own Google doc. We find that when projects get complicated during the definition phase, perhaps they are really too complicated and should not be done.

Since we wanted all the project ideas to be documented the same way, we created a template to guide the contributors. Most of the project idea documents were written by org admins or mentors, but occasionally a student proposed a genuine idea. We also captured contact information in that document such as GitHub and Gitter handles, and a preliminary list of potential mentors for the project. We embedded all the project documents on our website.

Mentor and student guidelines. We updated the mentor information page with details on what we expect mentors to do during the program, including the number of hours that are expected from mentors, and we even have a section on preventing conflict of interest. When we recruit mentors, we point them to the mentor information page.

We also updated the student information page. We find this is a huge time saver as every student contacting us has the same questions about joining and participating in the program. Instead of re-explaining the program each time, we send them a link to those pages.

Application phase. Students started to reach out very early on as well, many weeks before GSoC officially started. This was very motivating. Some students even started to work on project ideas before the official start of the program.

Project selection. This year the org admin team had some very difficult decisions to make. With lots of students, lots of projects and lots of mentors, we had to request the right number of slots and try to match the projects with the most chances of success. We were trying to form mentor teams at the same time as we were requesting the number of slots, and it was hard to get responses from all mentors in time for the deadline. Finally we requested fewer slots than we could have filled. When we request slots, we submit two numbers: a minimum and a maximum. The GSoC guide states that:

  • The minimum is based on the projects that are so amazing they really want to see these projects occur over the summer,
  • and the maximum number should be the number of solid and amazing projects they wish to mentor over the summer.

We were awarded minimum. So we had to make very hard decisions: we had to decide between “amazing” and “solid” proposals. For some proposals, the very outstanding ones, it’s easy. But for the others, it’s hard. We know we cannot make the perfect decision, and by experience, we know that some students or some mentors will not be able to complete the program due to uncontrollable life events, even for the outstanding proposals. So we have to make the best decision knowing that some of our choices won’t complete the program.

Community Bonding. We have found that the community bonding phase was crucial to the success of each project. Usually projects that don’t do well during community bonding have difficulties later on. In order to get students involved in the community better, almost all projects were handled under the umbrella of Special Interest Groups so that there were more stakeholders and communications.

Communications. Every year we have students who contact mentors via personal messages. Students, if you are reading this, please do NOT send us personal messages about the projects, you will not receive any preferential treatment. Obviously, in open source we want all discussions to be public, so students have to be reminded of that regularly. In 2019 we are using Gitter chat for most communications, but from an admin point of view this is more fragmented than mailing lists. It is also harder to search. Chat rooms are very convenient because they are focused, but from an admin point of view, the lack of threads in Gitter makes it hard to get an overview. Gitter threads were added recently (Nov 2019) but do not yet work well on Android and iOS. We adopted Zoom Meetings towards the end of the program and we are finding it easier to work with than Google Hangouts.

Status tracking. Another thing that was hard was to get an overview of how all the projects were doing once they were running. We made extensive use of Google sheets to track lists of projects and participants during the program to rank projects and to track statuses of project phases (community bonding, coding, etc.). It is a challenge to keep these sheets up to date, as each project involves several people and several links. We have found it time consuming and a bit hard to keep these sheets up to date, accurate and complete, especially up until the start of the coding phase.

Perhaps some kind of objective tracking tool would help. We used Jenkins Jira for tracking projects, with each phase representing a separate sprint. It helped a lot for successful projects. In our organization, we try to get everyone to beat the deadlines by a couple of days, because we know that there might be events such as power outages, bad weather (happens even in Seattle!), or other uncontrolled interruptions, that might interfere with submitting project data. We also know that when deadlines coincide with weekends, there is a risk that people may forget.

Retrospective. At the end of our project, we also held a retrospective and captured some ideas for the future. You can find the notes here. We already addressed the most important comments in our documentation and project ideas for the next year.

Recognition

Last year, we wanted to thank everyone who participated in the program by sending swag. This year, we collected all the mailing addresses we could and sent to everyone we could the 15-year Jenkins special edition T-shirt, and some stickers. This was a great feel good moment. I want to personally thank Alyssa Tong her help on setting aside the t-shirt and stickers.

swag before shipping

Mentor summit

Each year Google invites two or more mentors from each organization to the Google Summer of Code Mentor Summit. At this event, hundreds of open-source project maintainers and mentors meet together and have unconference sessions targeting GSoC, community management and various tools. This year the summit was held in Munich, and we sent Marky Jackson and Oleg Nenashev as representatives there.

Apart from discussing projects and sharing chocolate, we also presented Jenkins there, conducted a lightning talk and hosted the unconference session about automation bots for GitHub. We did not make a team photo there, so try to find Oleg and Marky on this photo:

GSoC2019 Mentor summit

GSoC Team at DevOps World | Jenkins World

We traditionally use GSoC organization payments and travel grants to sponsor student trips to major Jenkins-related events. This year four students traveled to the DevOps World | Jenkins World conferences in San-Francisco and Lisbon. Students presented their projects at the community booth and at the contributor summits, and their presentations got a lot of traction in the community!

Thanks a lot to Google and CloudBees who made these trips possible. You can find a travel report from Natasha Stopa here, more travel reports are coming soon.

gsoc2019 team jw us
gsoc2019 team jw lisbon

Conclusion

This year, five projects were successfully completed. We find this to be normal and in line with what we hear from other participating organizations.

Taking the time early to update our GSoC pages saved us a lot of time later because we did not have to repeat all the information every time someone contacted us. We find that keeping track of all the mentors, the students, the projects, and the meta information is a necessary but time consuming task. We wish we had a tool to help us do that. Coordinating meetings and reminding participants of what needs to be accomplished for deadlines is part of the cheerleading aspect of GSoC, we need to keep doing this.

Lastly, I want to thank again all participants, we could not do this without you. Each year we are impressed by the students who do great work and bring great contributions to the Jenkins community.

GSoC 2020?

Yes, there will be Google Summer of Code 2020! We plan to participate, and we are looking for project ideas, mentors and students. Jenkins GSoC pages have been already updated towards the next year, and we invite everybody interested to join us next year!

Jenkins X and Kubernetes-native OSS Integration and Extension

By Blog, Project

CDF Newsletter – May 2020 Article
Subscribe to the Newsletter

By Kara de la Marck

Jenkins X is an automated CI/CD platform built on Kubernetes. Jenkins X enables users to harness the power of Kubernetes without needing to be Kubernetes experts. How does a CI/CD platform do this? Jenkins X forms an abstraction layer over Kubernetes, simplifying the developer experience of building, deploying, and running Kubernetes applications. Under the hood, Jenkins X combines best-of-breed open source tools, creating a Kubernetes-native CI/CD platform that facilitates developer and GitOps best practices. 

In this post, we’ll look at how Jenkins X uses Kubernetes Custom Resource Definitions (CRDs) and the Kubernetes API to bring together these best-of-breed open source projects, creating a cutting edge continuous delivery platform on Kubernetes. We’ll highlight two Kubernetes design principles that help us understand how Jenkins X natively extends Kubernetes:

  • Kubernetes API is declarative
  • Kubernetes has no hidden APIs

Kubernetes itself is decomposed into multiple components which interact through the Kubernetes API. Kubernetes’ declarative, API driven infrastructure enables it to be composable and extensible.

Kubernetes API is declarative

The Kubernetes API is declarative rather than imperative: as a user, you declare the desired state of your application and the Kubernetes system drives to make it so. One important benefit of this is automatic recovery. If something happens to your application, for example, a node crashes, then Kubernetes will restore the desired state.

Kubernetes has no hidden APIs

The Kubernetes API is exposed by the Kubernetes API server, which is a component of the Kubernetes control plane. The Kubernetes control plane is transparent in that there are no hidden internal APIs in Kubernetes: Kubernetes components interact through the same API that Kubernetes exposes to its users.

A declarative, API driven infrastructure

Kubernetes’ declarative, API driven infrastructure means that components, such as nodes, talk to the Kubernetes API server to figure out what their state ought to be. Instead of having the decision centralised and sent out, each node is responsible for its own health, and figuring out its desired behaviour. If a node fails and is brought back up, the newly created node can query the API server to figure out what it’s supposed to do.

The declarative way the Kubernetes API server communicates with remote nodes is in contrast to traditional client – server relationships, where the client tells the server what to do in an imperative manner and the server does it. Building the Kubernetes API server this way would have meant it grew as more functionality was added; the API server would have been brittle and difficult to extend.

Kubernetes is using a pattern called level triggered, which is generally opposed to edge triggered. In edge triggered systems the system responds to events, but if the system doesn’t receive an event, then the event needs to be replayed for the system to recover.

“If you are edge triggered you run risk of compromising your state and never being able to re-create the state. If you are level triggered the pattern is very forgiving, and allows room for components not behaving as they should to be rectified. This is what makes Kubernetes work so well.”

Joe Beda, as quoted in Cloud Native Infrastructure, by Justin Garrison and Kris Nova 

In Kubernetes, if any component goes down, when it comes back up, it requests the desired state from the Kubernetes API server and works to match that state. Components that can recover in this way tend to be more robust and the overall system is more reliable. This is especially true in distributed systems, where there are so many components in the system that the expectation is that there will always be components failing. Distributed systems need to be designed to tolerate the failure of components. If your system has one central manager component, which tells all the parts of the system what they should be doing, and that central manager component goes down, your system is down. Distributing that responsibility, so every component can figure out what it should be doing, makes the system more reliable. No longer is there a single point of failure. 

What happens when the Kubernetes API server, which acts as a central point, goes down? All the components will continue to operate on the last information they received. When the API server comes back up, the components will then operate on the new state if there were any changes. If any of the components go down, the other components can continue to function independently of that failure. When failed components come back up, they can read the state they should work towards from the API server.

These design choices make Kubernetes reliable. They also make Kubernetes very composable and extensible. Because all components use the same Kubernetes API as you do as an end user, you can replace any default component with your own. You can also add new components to enable new functionality. This extensibility has helped create a vibrant ecosystem of Kubernetes-native open source projects that like Jenkins X are built on Kubernetes using Kubernetes resources and the Kubernetes API machinery.

Custom Resource Definitions (CRDs)

Kubernetes is extended through Custom Resource Definitions (CRDs). A Kubernetes resource is an endpoint in the Kubernetes API that stores API objects of a certain type. Kubernetes uses API objects to represent the state of your cluster. 

To create your own custom Kubernetes API object type, define a new CRD of your type and define the schema. Then you can create your own objects against the Kubernetes API server. In this way, a custom resource extends the Kubernetes API: creating CRDs is like embedding your own APIs inside Kubernetes itself. To use the custom API objects you have created, you write your own custom controllers that act on the data contained in your custom object types. Kubernetes controllers are the mechanism by which Kubernetes reconciles the state state of your cluster to the state declared in the Kubernetes API.

How do CRDs relate to Kubernetes built-in types? Tim Hockin, co-founder of the Kubernetes project, has said, “If we had CRDs on day zero of Kubernetes there would be no built-in types.” If CRDs had existed from the start, pods and nodes and everything else would also be a CRD! 

If they weren’t part of the original design, why were CRDs created? CRDs were first created as a way to extend Kubernetes functionality to enable rapid prototyping. 

“That’s what fascinates me about CRD. It started as a prototyping tool. K8s API machinery was not intended to be a framework, but that is what shook out. If we did that intentionally we would have messed it up.”

– Tim Hockin, Twitter

It’s extremely interesting that CRDs, which started as a prototyping mechanism, are now the main resource definition mechanism in Kubernetes. This enables Kubernetes to be more modular, and many core Kubernetes functions are now built using custom resources. 

The Kubernetes API machinery is now distilled such that it can be used as API machinery for any project, not just Kubernetes. The extensible nature of the Kubernetes API enables higher level applications and platforms to be built on Kubernetes. Jenkins X  runs directly on Kubernetes, uses the Kubernetes API, and defines CRDs for its workflow. Moreover, the same Kubernetes API machinery that makes Kubernetes extensible also enables Kubernetes-native applications to integrate well with each other. Jenkins X both creates its own CRDs and integrates with other Kubernetes-native applications through the Kubernetes API to form a Kubernetes-native CI/CD platform.

Jenkins X High Level Architecture:

As seen in the diagram above, Jenkins X integrates with a number of open source projects such as TektonProw, and Vault, among others, to create an automated Kubernetes-native CI/CD platform. Jenkins X relies on CRDs to create new resources and extend the Kubernetes API. The Kubernetes API machinery enables Jenkins X to integrate with other open source projects through the Kubernetes API server.

Tekton, the pipeline execution engine for Jenkins X

Tekton is the pipeline execution engine for Jenkins X. Like Jenkins X, Tekton is Kubernetes-native and extends Kubernetes using CRDs. Jenkins X leverages Prow, or Jenkins X’s own Lighthouse, to signal to Tekton to run builds. Lighthouse is a lightweight webhook handler, which listens for Git webhook events and uses them to trigger Tekton PipelineRun CRDs for Tekton to use to perform builds. Tekton then generates a status update which Jenkins X communicates back to source code management providers, such as GitHub. 

The integration between Jenkins X as a CI/CD platform and Tekton as the execution engine for Jenkins X happens within Kubernetes using CRDs and the Kubernetes API. That both projects are Kubernetes-native enables them to seamlessly integrate using the Kubernetes API machinery.

“Tekton Pipelines lets us power Jenkins X’s execution and management of pipelines natively within Kubernetes.”

 – Andrew Bayer, Software Engineer, CloudBees, and creator of Jenkins X Pipeline Syntax

Tekton Pipelines for CD Interoperability

By Blog, Project

CDF Newsletter – May 2020 Article
Subscribe to the Newsletter

Contributed By Eric Sorenson

Tekton is a project that evolved from an internal Google tool that used Knative to build and deploy software. In 2018, it was spun out as an independent project and donated to the Continuous Delivery Foundation.

The core component, Tekton Pipelines, runs as a controller in a Kubernetes cluster. It registers several custom resource definitions which represent the basic Tekton objects with the Kubernetes API server, so the cluster knows to delegate requests containing those objects to Tekton. These primitives are fundamental to the way Tekton works. Tekton’s building block approach starts with the smallest atom of work, the Step, aggregates Steps together in Tasks, and aggregates Tasks together in Pipelines. 

If the nomenclature here feels confusing, don’t feel bad — it is complicated! Each tool in the space uses slightly different terms; this is something we’re working on standardizing in the CDF Interoperability SIG. We’d love your input – here’s how to participate! Tekton’s usage of these terms is clarified in the sig-interop Vocabulary definitions doc:

* *Step*: a specific function to perform.

* *Task*: is a collection of sequential steps you would want to run as part of your continuous integration flow. A task will run inside a pod on your cluster.

* *ClusterTask*: Similar to Task, but with a cluster scope.

* *Pipeline*: stateless, reusable, parameterized collection of tasks. Tasks are linked together in a Pipeline, which describes the end-to-end deployment for an application.

* *PipelineRun*: an instantiation of a Pipeline definition, filling in the Pipeline’s parameters with concrete values

* *Pipeline Resource*: objects that will be input to or output from tasks

* *Trigger*: Triggers is a Kubernetes Custom Resource Definition (CRD) controller that allows you to extract information from event payloads (a “trigger”) to create Kubernetes resources.

Notable omissions from the CRD list are “Steps”, which don’t have their own CRD because they’re the smallest unit of execution which are always contained inside a Task. The Conditions and Dashboard Extension CRDs are still optional and experimental — but very exciting!

Tekton’s approach is particularly interesting from a tool interoperability standpoint. By focusing on these building blocks and the concrete representation of them as declarative configuration, Tekton creates a standard platform for CD in the same way that Kubernetes provides a platform for application runtimes. This allows user-facing tools to build on the platform rather than reinventing these primitives. Several projects have already taken up this approach:

* Jenkins X uses Tekton as its execution engine. It’s been an option for a while now, but recently the project announced it was moving to using Tekton exclusively. Jenkins X provides pipeline definitions and gitops workflows that are tailored for cloud-native CD.

* Kabanero is a project that enables teams to develop and deploy applications on Kubernetes, so architects can provide pre-approved application stacks for developers to work from. It uses Tekton Pipelines and several associated projects like Tekton Dashboard and Triggers; indeed the developers building the Dashboard are largely working on Kabanero and the IBM Cloud Devops Pipeline product.

* Relay by Puppet is a hosted service that uses Tekton as the execution engine for event-triggered devops and deployment workflows. (Full disclosure, this is the product I am working on!) It provides a YAML dialect for building workflows that can be triggered by external events, via API, or manually, to automate tasks that need to stitch together different tools and services.

* TriggerMesh have integrated Tekton Pipelines into their TriggerMesh Cloud project and are working on a tool called Aktion to translate Github Actions into Tekton Pipelines.

* There are more, too! Check out the Tekton Friends repo for a longer list of projects and end users building on Tekton.

As exciting as this activity is, I think it’s important to note there’s still a lot of work to be done. There’s a distinct difference between two projects both using Tekton as a common upstream platform and achieving interoperability between them! It’s a big problem and it’s easy to get overwhelmed with the magnitude of the whole thing. One of my earliest lessons when I moved from SRE into product management was: focus first on solving the pain points which end users feel most acutely. That can be some combination of pervasiveness (what percent of the overall user base feels it?) and severity (how bad is each individual incident?) – ideally, fix the thing which is worst on both axes! From an end user’s standpoint, CD tools have a pretty steep learning curve with a bunch of pitfalls. A sampling of these severe-and-pervasive pitfalls I’ve heard from our users as we’ve been building Relay:

* How do I wrap my head around the terminology and technology so I can get started?

* How do I integrate the parts of the build/test/deploy toolchain my organization needs to continue using?

* How do I operate (upgrade, monitor, troubleshoot) the tool once it’s up and running?

Interoperability isn’t a cure-all, but there are definitely areas where it could work like a soothing balm on all of this pain. Industry-standard terminology or at a minimum, an authoritative Rosetta Stone for CD, could help. At the moment, there’s still pockets of debate on whether the “D” stands for Deployment or Delivery! (It’s “Delivery”, folks – when you mean “Deployment” you have to spell it out.)

Going deeper, it’d be hugely helpful help users integrate the tools they’re already using into a new framework. A wide ecosystem of steps that could be used by any of the containerized CD tools – not just those based on Tekton but, for example, Spinnaker and Keptn as well – would have a number of benefits. For end users, it would increase the amount of content available “out of the box”, meaning they would have less work to integrate the tools and services they need. Ideally, no end-user should have to create a step from scratch because there’s a vast, easily discoverable library of things that accomplish the job they have. There’s also a benefit to maintainers of services and tools that end-users want, like Kaniko, Gradle, and the cloud services, who have to build an integration with each execution framework themselves or rely on the community to do it. Building and maintaining one reusable implementation would reduce the maintenance burden and allow them to provide higher quality. 

To put on my Tekton advocate hat for a moment, its well-defined container contract makes it easy to use general-purpose containers in your pipeline. If you want to take advantage of more specialized features the framework provides, the Tekton Catalog has a number of high-quality examples to build from. There are improvements on the way to aid the discoverability and reuse parts of the problem, such as the exciting new Tekton Hub donated by Red Hat. 

The operability concerns are a real problem for CD pipeline tools, too. Although CD is usually associated with development, in many organizations the tool itself is considered a production service, because if there are problems committing, building, testing, and shipping code, the engineering organization isn’t delivering value. Troubleshooting byzantine failures in complex CI/CD pipelines is a specialized discipline requiring skills that span Quality Engineering, SRE, and Development. The more resilient the CD tools are architected, and the more standard their interfaces for reporting availability and performance metrics, the easier that troubleshooting becomes. 

Again, to address these from Tekton’s perspective, a huge benefit of running on Kubernetes is that the Tekton services that run in the cluster can take advantage of all the powerful k8s operability features. So fundamental capabilities that are highly valuable to operators and troubleshooters like log aggregation, in-place upgrades, error reporting, and scale-out all ride on top of the Kubernetes infrastructure. It’s not “for free” of course; nothing in distributed systems is ever truly “for free” and if anyone tries to tell you otherwise, the thing they’re selling you is probably *very* expensive. But it does mean that general-purpose Kubernetes skills and tooling goes a long way towards operating Tekton at scale, rather than having to relearn or reimplement them at the application layer.

In conclusion, I’m excited that the interoperability conversation is well underway at the CDF. There’s a long way to go, but the amount of activity and progress in the space is very encouraging. If you’re interested in pitching in to discuss and solve these kinds of problems, please feel free to join in #sig-interoperability channel on the CDF slack or check out the contribution information

From Jenkins – GitHub App authentication support released

By Blog, Project

Originally posted on the Jenkins blog by Tim Jacomb

I’m excited to announce support for authenticating as a GitHub app in Jenkins. This has been a long awaited feature by many users.

It has been released in GitHub Branch Source 2.7.0-beta1 which is available in the Jenkins experimental update center.

Authenticating as a GitHub app brings many benefits:

  • Larger rate limits – The rate limit for a GitHub app scales with your organization size, whereas a user based token has a limit of 5000 regardless of how many repositories you have.
  • User-independent authentication – Each GitHub app has its own user-independent authentication. No more need for ‘bot’ users or figuring out who should be the owner of 2FA or OAuth tokens.
  • Improved security and tighter permissions – GitHub Apps offer much finer-grained permissions compared to a service user and its personal access tokens. This lets the Jenkins GitHub app require a much smaller set of privileges to run properly.
  • Access to GitHub Checks API – GitHub Apps can access the the GitHub Checks API to create check runs and check suites from Jenkins jobs and provide detailed feedback on commits as well as code annotation

Getting started

Install the GitHub Branch Source plugin, make sure the version is at least 2.7.0-beta1. Installation guidelines for beta releases are available here

Configuring the GitHub Organization Folder

Follow the GitHub App Authentication setup guide. These instructions are also linked from the plugin’s README on GitHub.

Once you’ve finished setting it up, Jenkins will validate your credential and you should see your new rate limit. Here’s an example on a large org:

GitHub app rate limit

How do I get an API token in my pipeline?

In addition to usage of GitHub App authentication for Multi-Branch Pipeline, you can also use app authentication directly in your Pipelines. You can access the Bearer token for the GitHub API by just loading a ‘Username/Password’ credential as usual, the plugin will handle authenticating with GitHub in the background.

This could be used to call additional GitHub API endpoints from your pipeline, possibly the deployments api or you may wish to implement your own checks api integration until Jenkins supports this out of the box.

Note: the API token you get will only be valid for one hour, don’t get it at the start of the pipeline and assume it will be valid all the way through

Example: Let’s submit a check run to Jenkins from our Pipeline:

pipeline {
  agent any

  stages{
    stage('Check run') {
      steps {
        withCredentials([usernamePassword(credentialsId: 'githubapp-jenkins',
                                          usernameVariable: 'GITHUB_APP',
                                          passwordVariable: 'GITHUB_JWT_TOKEN')]) {
            sh '''
            curl -H "Content-Type: application/json" \
                 -H "Accept: application/vnd.github.antiope-preview+json" \
                 -H "authorization: Bearer ${GITHUB_JWT_TOKEN}" \
                 -d '{ "name": "check_run", \
                       "head_sha": "'${GIT_COMMIT}'", \
                       "status": "in_progress", \
                       "external_id": "42", \
                       "started_at": "2020-03-05T11:14:52Z", \
                       "output": { "title": "Check run from Jenkins!", \
                                   "summary": "This is a check run which has been generated from Jenkins as GitHub App", \
                                   "text": "...and that is awesome"}}' https://api.github.com/repos/<org>/<repo>/check-runs
            '''
        }
      }
    }
  }
}

What’s next

GitHub Apps authentication in Jenkins is a huge improvement. Many teams have already started using it and have helped improve it by giving pre-release feedback. There are more improvements on the way.

There’s a proposed Google Summer of Code project: GitHub Checks API for Jenkins Plugins. It will look at integrating with the Checks API, with a focus on reporting issues found using the warnings-ng plugin directly onto the GitHub pull requests, along with test results summary on GitHub. Hopefully it will make the Pipeline example below much simpler for Jenkins users 🙂 If you want to get involved with this, join the GSoC Gitter channel and ask how you can help.

Tekton: Wow! We’re beta now!

By Blog, Project

You may have heard that Tekton Pipelines is now beta! That’s not beta like the video format but beta like Kubernetes! Okay I’ll stop trying to make jokes, because compatibility is no laughing matter for folks who want to build on top of and use Tekton, and that’s why we’ve declared beta, so that you can feel more confident in using it.

What exactly does beta mean for Tekton?

So what does beta mean exactly? It means for Tekton what it means for Kubernetes, and it boils down to two things:

  1. Features that are beta will not be removed; they might change but you can count on the features themselves sticking around
  2. Backwards incompatible changes to the API will be avoided; if they do have to happen you will be given at least 9 months worth of releases to migrate to the new way of doing things

You might be wondering what “the API” means in this context – good question! It’s the specifications of the CRDs themselves and runtime details like the special directories that Tekton makes.

Not all of Tekton is beta however! Right now it’s just Tekton Pipelines and it’s only the following CRDs:

  • Tasks, ClusterTasks and TaskRuns
  • Pipelines and PipelineRuns


This means that other types that you might like, such as Conditions and PipelineResources (see the next section!) are still alpha and don’t (yet!) have the same beta level guarantees.

You can always refer to our API compatibility docs in our repo if you forget!

What about PipelineResources?

What about them indeed! If you are part of the Tekton community, you’ll know that we keep going back and forth on our love/hate-able PipelineResources – the feature you love until it doesn’t work.

A few months ago, our “difficult to understand, hard to debug” friend was challenged by the community: what would the Tekton world look like without PipelineResources? And when we went on that journey, we discovered features which PipelineResources gave us which were super useful on their own:

So we focused on adding those features and brought them to beta. In the meantime, we keep asking the question: do we still need PipelineResources? And what would they look like if redesigned with workspaces and results? We’re still asking those questions and that’s why PipelineResources aren’t beta (yet)!

We know some users really love them: “There are dozens of us,” – @dlorenc. So we haven’t given up on them yet, and there are some things that you just still can’t do well without them: for example, how do you consistently represent artifacts such as images moving through Pipelines? You can’t! So the investigation continues.

In the meantime, we’ve made Task equivalents of some of our PipelineResources in the Tekton catalog, such as PullRequests, GCS, and git.

Tekton Website is Live Now!

Hooray! Our shiny new site is live! Right this way -> https://tekton.dev/

Tekton Documentation is now hosted on the website at https://tekton.dev/docs/. And interactive tutorials are hosted at https://tekton.dev/try/. There is just one interactive tutorial hosted right now but more are in process to get published, so watch this space!

What’s coming up next?

We’re hard at work on more nifty Tekton stuff to make your CI/CD Pipelines more powerful and more portable by achieving Tekton’s mission:

Be the industry-standard, cloud-native CI/CD platform components and ecosystem.

Check out more on our mission and our 2020 roadmap in our community repo.

THANK YOU!!! ❤️

Thanks to all of the many amazing contributors who have gotten us to this point! The list below is people credited in Tekton Pipelines release notes, but for the complete list of everyone contributing to Tekton check out our devstats!

From Jenkins – Validating JCasC configuration files using Visual Studio Code

By Blog, Project

Originally posted on the Jenkins Blog by Sladyn Nunes

Configuration-as-code plugin

Problem Statement: Convert the existing schema validation workflow from the current scripting language in the Jenkins Configuration as Code Plugin to a Java based rewrite thereby enhancing its readablity and testability supported by a testing framework for the same. Enhance developer experience by developing a VSCode Plugin to facilitate autocompletion and validation which would help the developer write correct yaml files before application to a Jenkins Instance.

The Configuration as Code plugin has been designed as an opinionated way to configure Jenkins based on human-readable declarative configuration files. Writing such a file should be feasible without being a Jenkins expert, just translating into code a configuration process one is used to executing in the web UI. The plugin uses a schema to verify the files being applied to the Jenkins instance.

With the new JSON Schema being enabled developers can now test their yaml file against it. The schema checks the descriptors i.e. configuration that can be applied to a plugin or Jenkins core, the correct type is used and help text is provided in some cases. VSCode allows us to test out the schema right out of the box with some modifications. This project was built as part of the Community Bridge initiative which is a platform created by the Linux Foundation to empower developers — and the individuals and companies who support them — to advance sustainability, security, and diversity in open source technology. You can take a look at the Jenkins Community Bridge Project Page

Steps to Enable the Schema Validation

a) The first step includes installing the JCasC Plugin for Visual Studio Code and opening up the extension via the extension list. Shortcut for opening the extension list in VSCode editor using Ctrl + Shift + X.

b) In order to enable validation we need to include it in the workspace settings. Navigate to File and then Preference and then Settings. Inside settings search for json and inside settings.json include the following configuration.

{
"yaml.schemas": {
        "schema.json": "y[a]?ml"
    }
}

You can specify a glob pattern as the value for schema.json which is the file name for the schema. This would apply the schema to all yaml files. eg: .[y[a]?ml]

c) The following tasks can be done using VSCode:

a) Auto completion (Ctrl + Space):
  Auto completes on all commands.
b) Document Outlining (Ctrl + Shift + O):
Provides the document outlining of all completed nodes in the file.

d) Create a new file under the work directory called jenkins.yml. For example consider the following contents for the file:

jenkins:
  systemMessage: “Hello World”
  numExecutors: 2
  1. The above yaml file is valid according to the schema and vscode should provide you with validation and autocompletion for the same.

Screenshots

vscode
userDocs1
userDocs2

We are holding an online meetup on the 26th February regarding this plugin and how you could use it to validate your YAML configuration files. For any suggestions or dicussions regarding the schema feel free to join our gitter channel. Issues can be created on Github.