Skip to main content


From Red Hat – Part 1: Building Multiarch imageStream with the NFD Operator and OpenShift 4

By Blog, Member

Originally posted on the OpenShift 4.3 blog by Eduardo Arango


Using general available packages (in the form of container images) from an official source or a certified provider comes with a big caveat in relation to performance-sensitive workloads. 

These packages may provide ABI compatibility, but they are not optimized for our specialized hardware (like GPUs or high-performance NICs), nor our CPU chip architecture. The best way to address this is to compile your packages (build your images) on your own deployment.

OpenShift provides a way to seamlessly build images based on defined events called BUILDS. A build is the process of transforming input parameters into a resulting object. Most often, the process is used to transform input parameters or source code into a runnable image. A BuildConfig object is the definition of the entire build process.

The missing part to building hardware-specific images is to orchestrate the build process over the different available resources. In this post you will learn about the Node Feature Discovery (NFD) operator and how to tie it to OpenShift builds to have a hardware-specific image build.

The first part describes the NFD operator and how you can use it to manage the detection of hardware features in the cluster. The second part describes how to create an imageStream from a GitHub webhook and how to use the information from the NFD operator to schedule node-specific builds. The third part presents a sample app to test what you have learned.

The Node Feature Discovery Operator

The Node Feature Discovery Operator (NFD) manages the detection of hardware features and configuration in an OpenShift cluster by labeling the nodes with hardware-specific information. NFD will label the host with node-specific attributes, like PCI cards, kernel, or OS version, and many more. See (Upstream NFD) for more info.

The NFD operator can be found on the Operator Hub by searching for “Node Feature Discovery”:

After following the install steps, you can go to “Installed Operators” in the OpenShift  cluster and see:

Inside, a card instructs you to create an instance:

Click on “Create Instance” to get help from the OpenShift web console, which will auto-generate the needed YAML file and allow you to create the object from the console.

Once the NFD operator is deployed, you can go to a node dashboard and check all the Node_labels generated by the operator. Here is a sample excerpt of NFD labels applied to the node:

By reading the generated labels, you can understand hardware information of the OpenShift node; for example, we are running an amd64 architecture with multithreading enabled: (“”, “”)

Defining a BuildConfig

BuildConfig is a powerful tool in OpenShift. OpenShift Container Platform uses Kubernetes by creating containers from build images and pushing them to a container image registry.

The first step is to create a specific namespace to allocate the builds:

apiVersion: v1
kind: Namespace
	name: multiarch-build
	labels: "true"

For this example, you are pointing the builds to a repository on GitHub. First, you need to generate a secret for the generated GitHub hook:

kind: Secret
apiVersion: v1
  name: arch-dummy-github-webhook-secret
  namespace: multiarch-build
  WebHookSecretKey: bXVsdGlhcmNoLWJ1aWxk
kind: Secret
apiVersion: v1
  name: arch-dummy-generic-webhook-secret
  namespace: multiarch-build
  WebHookSecretKey: bXVsdGlhcmNoLWJ1aWxk

With the namespace and secret in place, you can now create the imageStream and BuildConfig to continuously watch for user-defined triggers to keep the image up to date. Image streams are part of the OpenShift extension APIs. Image streams are named references to container images. The OpenShift extension resources reference container images indirectly, using image streams.

The following YAML files can be generated via the OpenShift Developer web console. Once you have a generated imageStream and BuildConfig YAML, you need to make sure they look like the following:

kind: ImageStream
	app: arch-dummy
  name: arch-dummy
spec: {}
kind: BuildConfig
  name: arch-dummy
  namespace: multiarch-build
  selfLink: >-
	app: arch-dummy arch-dummy arch-dummy arch-dummy-app
  annotations: master ''
  nodeSelector: ""
  	cpu: "100m"
  	memory: "256Mi"
  	kind: ImageStreamTag
  	name: 'arch-dummy:latest'
  resources: {}
  successfulBuildsHistoryLimit: 3
  failedBuildsHistoryLimit: 3
	type: Docker
  	dockerfilePath: build/Dockerfile
  postCommit: {}
	type: Git
  	uri: ''
  	ref: master
	contextDir: /
	- type: ImageChange
  	ImageChange: {}
	- type: GitHub
      	name: arch-dummy-github-webhook-secret
	- type: ConfigChange
  runPolicy: Parallel

There are three lines worth noting in the above YAML (not auto-generated via the Developer web console), where you leverage the NFD operator labels to orchestrate the image builds on top of nodes with specific features, by using the nodeSelector key. For example, only schedule container builds on worker nodes with amd64 architecture:

  nodeSelector: ""

Now with the BuildConfig created, you can check out the GitHub URL hook: 

[eduardo@fedora-ws image_stream]$ oc describe bc/arch-dummy
Name:   	 arch-dummy
Namespace:    multiarch-build
Created:    5 days ago
Labels:   	 app=arch-dummy
Latest Version:    2

Strategy:   	 Docker
Ref:   		 master
ContextDir:   	 /
Dockerfile Path:    build/Dockerfile
Output to:   	 ImageStreamTag arch-dummy:latest

Build Run Policy:    Serial
Triggered by:   	 Config
Webhook Generic:
    AllowEnv:    false
Webhook GitHub:
Builds History Limit:
    Successful:    5
    Failed:   	 5

Build   	 Status   	 Duration    Creation Time
arch-dummy-1     complete     1m37s    	 2020-03-31 17:35:32 -0400 EDT

Events:    <none>


With this URL, you can then follow GitHub Webhook instructions for a ready-to-work imageStream..

To learn more about OpenShift Builds and more advanced use cases, you can go here.

Deploy an Example

To test what you just learned today, you can create a buildConfig of Arch-Dummy as a didactic confirmation that the feature-specific build is working. To do so, deploy the image as detailed on by selecting the built image “arch-dummy:latest”.

This image was built from the repo as seen in the imageStream.yaml.

This application generates a small API service with three endpoints:

/ -> Will retrieve information about the app

/version -> Will retrieve information about the app binary and where it was built


/ -> Will retrieve information about the app

/version -> Will retrieve information about the app binary and where it was built
{Git Commit:"6825a2f2a5b6a60278868260d8cdb51d192d9e63",CPU_arch:"Intel(R) Xeon(R) CPU E5-2686 v4 @",Built:"Tue Mar 31 21:43:15 UTC 2020",Go_version:"go1.12.8 linux/amd64"}

/cpu -> will retrieve information about the node on which the app is currently running

{name:"Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz",model:"79",family:"6"}

This dummy arch app will allow you to test whether the image was built correctly and orchestrated correctly.


Building hardware-specific images is easy by leveraging internal OpenShift tooling like imageStreams and coupling with the Node-Feature-Discovery Operator to manage the detection of hardware features and configuration in the OpenShift cluster. OpenShift simplicity allows developers to define the nodeSelector key to orchestrate image builds over target hardware. These could prove to be of great use when you consider image-build processes that involve AI/ML training requiring GPU and other special resources.

Future Work

On this blog post, you saw a quick example on how to tie together the Node-Feature-Discovery Operator and Openshift imageStreams for simple hardware-specific image builds. The following post goes deeper into OpenShift replacing the imageStream build with OpenShift Pipelines and another operator, the Special-resource-operator, to build more complex images and deploy them in the cluster.

From Armory – Safety Is No Accident

By Blog, Member

Originally posted on the Armory blog by Chad Tripod

Continuous Delivery and Deployment is changing the way organizations deliver software. Over the years, software delivery has morphed into a time consuming process.  With countless validations and approvals to ensure the code is safe to present to users. And with good reason, releasing bad software can severely impact a business’s brand, popularity, and even revenue. In this day and age, with customer sentiment immediately feeding back into public visibility, companies are taking even stricter measures to ensure the best software delivery and user experience.

When deploying software to production, we use words like “resilience” to talk about how the code runs in the wild. For the optimists, we use words such as “Availability Zones,” and for those more pessimistic about deployments, we say, “Failure Domains.” When I was architecting and deploying applications for Apple, eBay, and others, I always built for failure. I was always more interested in how things behave when we break things, and less so on the steady state. I’d relish in unleashing tools like Simian Army to wreak havoc on what we had built to ensure code and experience weren’t impaired. 

Nowadays, there is a much better approach to ensuring safety. Continuous Delivery (CD) has enabled organizations to shift left. Empowering developers with access to deploy directly to production, but with the guardrails needed to make sure safety isn’t compromised for speed. Luckily, the world-class engineers at Netflix and Google have built a platform, Spinnaker. Spinnaker addresses deployment resiliency concerns and empowers developers with toolsets to validate and verify as a built-in part of delivering code.

Now, let’s break down the modern model and review the tools available in the CI/CD workflow. 

Spinnaker – Spinnaker is a high scale multicloud continuous delivery (CD) tool.  While leveraging the years of software delivery best practices that Netflix and Google built into Spinnaker, users get to serialize and automate all the decisions that they have baked into their current software delivery process. Approvals, environments, testing, failures, feature flagging, ticketing, etc., are all completely automated and shared across the whole organization. The end result? Built-in safety that allows DevOps teams to deploy software with great velocity.

Continuous Verification – Leveraging real-time KPIs and log messages to dictate the health of code and environment. Spinnaker’s canary deployments ingest real-time metrics from data platforms including Datadog, NewRelic, Prometheus, Splunk, and Istio into a service called Kayenta. Kayenta runs these time series metrics into the Mann Whitney algorithm developed by Netflix and Google, and compares  release metrics to current production metrics. Spinnaker will then adjust or roll back deployments automatically based on success criteria. This allows math and data, rather than manual best-guesses, to dictate in real time if the user is getting the best experience from the service.

Chaos Engineering – Why wait for things to break in production to fix them? There are better ways. Chaos Engineering is the practice of breaking things in pre prod environments to understand how the code behaves when it’s exercised. What happens when a dependent service goes offline? How do the other services in the application behave? How does Kubernetes deal with it? What about shutting off a process in a service? These are the measures Gremlin and Chaos Monkey give your developers. Now testing is much more than what your CI Server does, it takes into consideration the environments in which they are deployed. 

Service Mesh – Service Meshes are a Kubernetes traffic management solution. Kubernetes applications can traverse many clusters, regions, and even clouds. Service Meshes are a way to manage traffic flows, traceability, and most importantly with ephemeral workloads, observability. There are many flavors of service meshes to choose from. Istio/Envoy has the most visibility, but you can also implement service meshes from nginx, or even get enterprise support from companies like F5/NGINX+ or Citrix, which offer elevated ingress features. Service Meshes in the context of software delivery provide a very granular canary release. Instead of blindly sending traffic to a canary version for testing, you can instead programmatically use layer 7 traffic characteristics such as URI, host, query, path, and cookie to steer traffic. This allows you to switch only certain users, business partners, or regions to new versions of software.

DevSecOps – In my years seeing changes in technology and how we deliver software to end users, one thing is for sure: security wants to understand the risks in what you’re doing. And with good reason. Security exploits can leak sensitive information or, even expose an organization to malicious hackers. Luckily this new deployment world allows security to process their scans and validations in an automated fashion. Solutions range from TwistlockArtifactory XrayAquaSignal Science, etc. There are many DevSecOps solutions, so it is a good thing to know that Spinnaker supports them all!

Spinnaker stages automate developer tools:

End Result – As you put together your new cloud native tool chain, there are many ways you can improve the way you release software. I urge you to deploy the tools you need for the service you are providing, not only based on what a vendor is saying. Over time, implementing guardrails will increase your innovation and time to market. For many this will be a competitive advantage against those who move slowly, and investing in these areas will, over time, improve the hygiene of your software code, which will provide stability in your future releases. By de-risking the release processes and improving safety, the end users are given the best possible experience with your software.

From Armory – The World of Jenkins: Better #withSpinnaker

By Blog, Member
Jenkins and Spinnaker, Better Together

Originally posted on the Armory blog by Rosalind Benoit

Folks in the DevOps community often ask me, “I’m already using Jenkins, so why should I use Spinnaker?” We’re hosting a virtual talk to address the question! Register here to join us 3/26 and learn how Jenkins and Spinnaker cooperate for safe, scalable, maintainable software delivery.

A delivery engineer I spoke with last week said it best:

“I came from a world of using Jenkins to deploy. It’s great but, you’re just modifying Jenkins jobs. It can do a lot, but it’s like that line in Jurassic Park – ‘Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should.’”

Many of us came from that world: we built delivery automation with scripts and tools like Jenkins, CircleCI, Bamboo, and TeamCity. We found configuration management, and used Puppet or Ansible to provision infrastructure in our pipelines as code. We became addicted to D.R.Y. (don’t repeat yourself), and there is no looking back.

Jenkins provides approachable automation of continuous integration steps. Spinnaker works with Jenkins to pick up and deliver build artifacts, and to delegate pipeline stages. As a true continuous delivery platform, Spinnaker codifies your unique software delivery culture and processes to your comfort level. It also adds production-ready value to your pipelines:

  • Turnkey automation of advanced delivery strategies such as canary deployments
  • One-click rollbacks
  • Single pane of glass to view deployments, applications, server groups, clusters, load balancers, security groups, and firewalls
  • Centralized API to automate and integrate across your toolchain

Jenkins taught us many lessons. It popularized the use of imperative pipelines to execute ordered steps in a SDLC. It taught us that centralizing delivery workflows into one platform makes strategic sense in scaling operations. At the same time, especially when used for deployments, it suffers from instability and maintenance overhead brought by unchecked plugin sprawl. It struggles to offer a scalable model for managing multiple jobs and distributed apps. But the way it consolidated SDLC tasks within a full-featured GUI empowered developer teams to start doing delivery.

In the new world of fast innovation through immutable infrastructure, Spinnaker has adapted that visibility to the realms of cloud and cloud native. It provides a centralized vantage point on all of your ephemerally-packaged applications, in their many variations.  Within your pipelines, its guardrails identify invalid or non-compliant infrastructure before deployment even happens. Spinnaker’s smart delivery workflows insulate customers and end-users from impact to their software experience.

This sense of safety is Jenkins’ missing ingredient. Jenkins introduced a world where developers could independently chain together a path to production. It enabled us to improve our efficiency and code quality through testing and build automation, with self-service. This giant technological shift sparked a move away from waterfall development and ITIL-style delivery.

But, culture cannot change overnight. Developers who exercised this newfound power struck terror in the hearts of those accountable for availability and software-driven business goals. Culture lagged behind tooling, sparking fear and risk aversion. That fear still permeates many organizations, allowing baggage-free startups and the most nimble companies to digitally disrupt the status quo. These innovators prove that delivering highly valued interactions through software means increased profit and influence. Enter Armory Spinnaker.

Watch Armory CTO Isaac Mosquera’s Supercharge Your Deployments With Spinnaker and Jenkins presentation at CD Summit, or check out the longer version with Q&A at DeliveryConf.

Stop spending time and talent knitting your toolchain together with pipeline steps that rely on brittle, expensive-to-maintain scripts and repetitive GUI fiddles! Attend “I have Jenkins; why do I need Spinnaker?” to learn more about how Spinnaker can free your developers and evolve your continuous delivery game.

From Armory – Scaling Spinnaker at Salesforce: The Life of a Cloud Ops Architect

By Blog, Member
Salesforce & Spinnaker

Originally posted on the Armory blog by Rosalind Benoit

I met Edgar Magana at Spinnaker Summit last year, when he spoke during Armory’s keynote as one of six Spinnaker champions. The energy and enthusiasm he brings to advocating for Spinnaker contrasts his intensity in approaching his role as operator of mega-scale cloud infrastructure. But, the more I get to know him, the more I understand that it’s one in the same. To enable Salesforce’s application owners to safely evolve software, he must ensure homogenous, predictable models for continuous delivery. Spinnaker has helped him make that a reality. 

“Across multiple environments, we have to enable different models for Spinnaker based on security requirements,” he explains as he shares his standardization strategy. 

Salesforce logo

“We templatize all the pipelines for consistency across services, with two types: EC2 instance deployments, and Kubernetes cluster pipelines. For Kubernetes, we require a lot of security hardening, and we need to use the same logging and monitoring mechanisms for all of our clusters.”

Every time Edgar’s team discovers a new configuration for Kubernetes, they upgrade that pipeline, which should trigger a service owner to relaunch the pipeline with new parameters, or sometimes, destroy their Kubernetes cluster and create a new one. “We make all these changes in development and staging first, of course,” he says, noting the dev, pre-prod, and prod Spinnaker instances his team maintains. 

A security requirement that artifacts not be created in the same place that deploys services imposes an added complexity. It required Edgar’s team to innovate further, and split the baking process across two different Spinnaker instances. Luckily, they could configure this out of the box with Spinnaker by overriding parameters; Edgar appreciates that Spinnaker doesn’t hard-code a lot of configuration, as rigidity wouldn’t support Salesforce’s unique requirements. That has also allowed his team to create a “heavy layer of automation on top of Spinnaker,” providing guardrails for application owners.

Our conversation turns to the Spinnaker Ops SIG (Special Interest Group), which Edgar recently founded. A solid kickoff meeting produced several action items to be completed before the next scheduled meeting on February 27th at 10 AM, (always a good sign).

Most importantly at this stage, Edgar says:

“We want to reach out to more operators, people who are either struggling or evaluating Spinnaker. We need operators in different stages — super experts who control everything, like those at Netflix and Airbnb, operators that are getting there, like us at Salesforce, and those in the initial evaluation stage. The goal of the SIG is to have a place where operators can exchange use cases, and have a unified voice, just like other Spinnaker SIGs, and a path for specific features we want incorporated into Spinnaker. The community needed a place to discuss how to operate Spinnaker better. As an operator of large-scale infrastructure, I don’t want to share this system with only a few companies. We want to welcome new users and operators, and facilitate their transition from the POC (proof-of-concept) environment to the real thing. This will help us understand what kinds of features are more important.”

Does the Ops SIG also provide a place to vent and empathize? I sure hope so! “That’s the life of a cloud operations architect,” Edgar says when he has to reschedule our meeting, “we get called all the time, from account issues, to Spinnaker and Kubernetes configuration,” and lots more; indeed, once when I ping Edgar about this blog, he’s in a “war room” (boy, I sure don’t miss my Ops days right now!)  But just like Armory, Edgar values empowering developers, and safely pushing control of applications and their infrastructure to the edge to fuel innovation. Better software is worth the hard work!

Another of Edgar’s goals for the Ops SIG: create reference architecture documentation for HA (high availability) and disaster recovery. “I want new users of Spinnaker to say, ‘I don’t need to reinvent the wheel; I’ll just follow these HA guidelines.” Architecture collateral will help Platforms and DevOps teams convince leadership that Spinnaker is a good investment for the company’s continuous delivery of software. This is where Edgar’s warm enthusiasm and operator’s intensity meet: empowering developers, empowering the community, empowering the planet. 

I look forward to working more with Edgar and his team at Salesforce as part of the Ops SIG, our April Spinnaker Gardening Days online hackathon, and more.  This is the kind of open-source heroism that will usher in the new industrial revolution!

Check out Edgar’s talk on Salesforce & Spinnaker from last winter’s Spinnaker Summit below, or hop over to the registration page for Armory’s upcoming, “I have Jenkins; why do I need Spinnaker?” webinar to reserve your spot!

View image on Twitter

712:23 PM – Nov 16, 2019Twitter Ads info and privacySee Edgar Magana’s other Tweets

Watch Edgar’s 8 minute talk, part of Isaac Mosquera’s keynote.:

From Armory – Kubernetes Native: Introducing Spinnaker Operator

By Blog, Member

Originally posted on the Armory blog by German Muzquiz

Spinnaker Operator is now Beta!

With Spinnaker Operator, define all the configurations of Spinnaker in native Kubernetes manifest files, as part of the Kubernetes kind “SpinnakerService” defined in its own Custom Resource Definition (CRD). With this approach, you can customize, save, deploy and generally manage Spinnaker configurations in a standard Kubernetes workflow for managing manifests. No need to learn a new CLI like Halyard, or worry about how to run that service.

The Spinnaker Operator has two flavors to choose from, depending on which Spinnaker you want to use: Open Source or Armory Spinnaker.

With the Spinnaker Operator, you can:

  • Use “kubectl” to install and configure a brand new Spinnaker (OSS or Armory Spinnaker).
  • Take over an existing Spinnaker installation deployed by Halyard and continue managing it with the operator going forward.
  • Use Kustomize or Helm Charts to manage different Spinnaker installations with slight variations.
  • Use Spinnaker profile files for providing service-specific configuration overrides (the equivalent of clouddriver-local.yml, gate-local.yml, etc.)
  • Use Spinnaker service settings files to tweak the way Deployment manifests for Spinnaker microservices are generated.
  • Use any raw files needed by configs in the SpinnakerService manifest (i.e. packer templates, support config files, etc.)
  • Safely store secret-free manifests under source control, since a SpinnakerService manifest can contain references to secrets stored in S3GCS or Vault (Vault is Armory Spinnaker only).

Additionally, Spinnaker Operator has some exclusive new features not available with other deployment methods like Halyard:

  • Spinnaker secrets can be read from Kubernetes secrets.
  • Spinnaker is automatically exposed with Kubernetes service load balancers (optional).
  • Experimental: Accounts can be provisioned and validated individually by using a different SpinnakerAccount manifest, so that adding new accounts involves creating a new manifest instead of having everything in a single manifest.

Let’s look at an example workflow.

Assuming you have stored SpinnakerService manifests under source control, you have a pipeline in Spinnaker to apply these manifests automatically on source control pushes (Spinnaker deploying Spinnaker) and you want to add a new Kubernetes account:

  1. Save the kubeconfig file of the new account in a Kubernetes secret, in the same namespace where Spinnaker is installed.
  2. Checkout Spinnaker config repository from source control.
  3. Add a new Kubernetes account to the SpinnakerService manifest file, referencing the kubeconfig in the secret:
kind: SpinnakerService
name: spinnaker
version: 2.17.1 # the version of Spinnaker to be deployed
persistentStoreType: s3
bucket: acme-spinnaker
rootFolder: front50
+ providers:
+ kubernetes:
+ enabled: true
+ accounts:
+ - name: kube-staging
+ requiredGroupMembership: []
+ providerVersion: V2
+ permissions: {}
+ dockerRegistries: []
+ configureImagePullSecrets: true
+ cacheThreads: 1
+ namespaces: []
+ omitNamespaces: []
+ kinds: []
+ omitKinds: []
+ customResources: []
+ cachingPolicies: []
+ oAuthScopes: []
+ onlySpinnakerManaged: false
+ kubeconfigFile: encryptedFile:k8s!n:spinnaker-secrets!k:kube-staging-kubeconfig # secret name: spinnaker-secrets, secret key: kube-staging-kubeconfig
+ primaryAccount: kube-staging
  1. Commit the changes and open a Pull Request for review.
  2. Pull Request approved and merged.
  3. Automatically a Spinnaker pipeline runs and applies the updated manifest.

We hope that the Spinnaker Operator will make installing, configuring and managing Spinnaker easier and more powerful. We’re enhancing Spinnaker iteratively, and welcome your feedback.

Get OSS Spinnaker Operator (documentation)

Get Armory Spinnaker Operator (documentation)

Interested in learning more about the Spinnaker Operator? Reach out to us here or on Spinnaker Slack – we’d love to chat!

From Armory – Why Armory Has a Remote Work Culture

By Blog, Member

Originally posted on the Armory blog by Andrew Backes

Armory Has a Remote Work Culture

At Armory, we are intensely focused on building our culture, not just building our product. Our culture is the operating system of our company, underpinning and supporting everything that we do.

From the get-go, we decided that Armory’s culture was going to be designed with intentionality to be remote work friendly. Armory is built on Spinnaker, an open source project created by Netflix and Google. One of the incredible features of open source software is that it is built through the collaboration of talented individuals and teams distributed all over the world. To align with the vibrant Spinnaker open source community, we doubled down on building a strong remote work culture. Today, more than half of our company works remotely, with the remainder working at our HQ in San Mateo, CA.

What Does this Mean in Practice?

For many companies, the experience of remote workers is a secondary concern, if it is even thought of at all. But remote workers face their own unique experiences, benefits, and challenges. At Armory, we acknowledge and welcome this unique experience, approach it with empathy and understanding, and create feedback mechanisms to ensure that our on-site and remote tribals are all having the best possible experience.

Some of the things we do at Armory include:

  • Set up Zoom stations all over the office to make it seamless to hop on a video chat and have synchronous, “face-to-face” communication
    • This includes the always-on camera and screen in the main section of our office so that we can all eat lunch “together”
  • Rotate meeting leadership between on-site and remote tribals, to continually ensure that the needs of remote tribals are being met
  • Provide a flexible expense policy for remote workers to ensure that they have the best, most productive remote setup for their individual needs
  • Organize frequent in-person events (team and company-wide offsites, holiday parties, the Spinnaker Summit, and others) to provide everyone with the opportunity for in-person relationship building
  • Default to having conversations in a public Slack channel to ensure that everyone can participate, voice their perspectives, and share in the knowledge
  • Create unstructured time for “high-bandwidth” communication (over Zoom and in-person) for people to get to know each other outside of a business context
    • Specifically within the engineering team, many of the scrum teams dedicate a chunk of their daily standup time to the type of general team syncing and catch-up that happens more naturally in-person in the office

A high-risk moment for Armory! Most of the engineering team in one single elevator, on the way to dinner during a 3-day offsite. Everything turned out fine, and dinner was delicious.

Why Do We Invest in the Remote Experience?

All of these initiatives take time and effort. Why do we bother, instead of leaving it up to remote tribals to conform to the working cadence of the tribals who are at HQ?

One simple answer is that it comes down to people. We want to work with the best people all over the world, not just the best people in the Bay Area. And that means embracing remote work and maximizing the remote work experience.

For me personally, my main role as the Head of Engineering is to empower the engineering team. That means creating the visibility and shared context so that engineers have the information that they need to make great decisions to positively impact the company and the Spinnaker community. Furthermore, continual growth and improvement is a core value at Armory. It is imperative that each of us continue to learn and grow, not just in our technical depth but also in understanding how each other work, how we can function as a better, more cohesive team, and how we can strive together for larger goals.

If I am not investing in the right tools, processes, and culture to foster that shared context and continual growth across the entire engineering team, then I am not doing my job.

Armory is hiring polyglot engineers, as well as tribals across the entire organization. Check out open roles and learn more about what life is like at Armory at We’d love to hear from you!

Insights by Harness – Common Challenges with Continuous Delivery

By Blog, Member

At Harness, we are here to champion Continuous Delivery for all. As a member of the Continuous Delivery Foundation [CDF], we’re happy to see that the industry has taken a collective step forward to better each other’s capabilities. We are pretty fortunate to have the opportunity to talk to many people along their Continuous Delivery journey. Part of what we do at Harness when engaging with a customer or prospect is help run a Continuous Delivery Capability Assessment (CDCA) to catalog and measure maturity. 

Over the past year, we have analyzed and aggregated the capability assessments that we have performed. We uncovered common challenges organizations faced in the past 12 months. In Continuous Delivery Insights 2020, we identified the time, effort, cost, and velocity associated with their current Continuous Delivery process. Trending data that we have around key metrics is that velocity is up but also complexity and cost are on the rise. 

Key Findings

We observed the following Continuous Delivery performance metrics across the sample of over 100 firms. Organizations that are looking towards strengthening or furthering their Continuous Delivery goals we noticed the following with median [middle] or average values. As sophisticated as organizations are, there is still a lot of effort to get features/fixes into production. 

In terms of deployment frequency, we define deployment frequency as the number of times a build is deployed to production. In terms of a microservices architecture, deployment frequency is usually increased as the number of services typically have a one to one relationship with the build. For the sample set we interviewed, the median deployment frequency is ten days which shows bi-monthly deployments are becoming more the norm.

These bi-monthly deployments might be on demand but the lead times can start to add up. Lead time is the amount of time needed to validate a deployment once the process has started. Through the sample, organizations typically require an average of eight hours; e.g eight hours of advance notice to allow validation and sign off of a deployment. 

If during those eight hours of lead time during the validation steps, if a decision is made to roll back, we saw that organizations in the sample averaged 60 mins of time to restore a service e.g roll back or roll forward. An hour might not seem too long to some but for engineers, every second can feel stressful as you race to restore your SLAs

Adding up all the effort from different team members during a deployment, getting an artifact into production represented an average of 25 human hours of work. Certainly, different team members will have varying levels of involvement throughout the build, deploy, and validation cycles but represents more than ½ a week of a full-time employee in total burden. 

Software development is full of unknowns; core to innovation we are trying and developing approaches and features for the first time. Expectation is there for iteration and learning from failures. We certainly have gotten better at deployment and testing methodologies and one way to measure is with change failure rate or the percentage of deployments that fail. Through the sample set, there was on average 11% of deployments failed. Not all doom and gloom, the Continuous Delivery Foundation is here to move the needle forward. 

Looking Forward 

The goal of the Continuous Delivery Foundation and Harness is to collectively raise the bar around software delivery. For organizations that are members of the Continuous Delivery Foundation, deploying via a canary deployment seems like second nature for safer deployments. If you are unfamiliar with a canary deployment, basically a safe approach to releasing where you send in a canary [release candidate] and incrementally replace the stable version until the canary has taken over.  As simple as this concept is to grasp, in practice can be difficult. In Continuous Delivery Insights 2020, only about four percent of organizations were taking a canary based approach somewhere in their organization. 
We are excited to start to track metrics on the overall challenges with Continuous Delivery year to year and work towards improving the metrics and adoption of Continous Delivery approaches for all. For greater insights, approaches, and breakdowns, feel free to grab your digital copy of Continuous Delivery Insights 2020 today!

From IBM – Build and Deliver Using Tekton-Enabled Pipelines

By Blog, Member

Originally posted to the IBM blog by Jerh O’Connor

We are delighted to announce the addition of the industry-leading Tekton capability to IBM Cloud Continuous Delivery pipelines.

With this new type of Continuous Delivery Pipeline, you define your pipeline as code using Tekton resource YAML stored in a Git repository. This lets you version and share your pipeline definitions while allowing you to configure, run, and view pipeline output in the familiar IBM Cloud Continuous Delivery DevOps experience.

Tekton Pipelines is an open source project, born out of Knative Build, that you can use to configure and run continuous integration/continuous delivery (CI/CD) pipelines within a Kubernetes cluster.

Tekton Pipelines are cloud native and run on Kubernetes using custom resource definitions specialized for executing CI/CD workflows. The Tekton Pipelines project is new and evolving and has support and active commitment from leading technology companies, including IBM, Red Hat, Google, and CloudBees. 

For a closer look at Tekton, see our video “What is Tekton?”:06:49

What is Tekton?

In addition to the benefits of Tekton, this new capability in IBM Continuous Delivery provides the following unique features:

  • dashboard to easily view in-flight and completed Pipeline Run information
  • Manual and Git triggering support
  • Integration into the rich Open Toolchain ecosystem of integrations
Delivery Pipeline Tekton Dashboard

How to get started

Getting started with CD Tekton-enabled pipelines requires a little setup. You will need the following:

Once you have these pieces in place, you need to configure the Tekton Pipeline to enable the running of your pipeline.

Click on the Tekton Pipeline tile on your toolchain dashboard to be taken to the configuration screen, where you will see notifications on what remains to be configured to use this new pipeline technology:

Click on the Tekton Pipeline tile on your toolchain dashboard to be taken to the configuration screen, where you will see notifications on what remains to be configured to use this new pipeline technology:

Select the required values in the Definitions tab, select your private worker in the Worker tab, and click Save.

You now need to set some trigger mappings via the Triggers tab so you can run the pipeline.

The simplest trigger is a manual trigger where you kick off a run yourself from the dashboard. You can also create Git triggers that fire based on git push and pull events as needed.

CD Tekton-enabled pipelines also provide a means of externalising reusable properties and provide a simple, secure method for storing sensitive information like apikeys via the Environment Properties tab.

Once the setup has been completed, the new Tekton Dashboard automatically updates to show information on in-flight and completed Pipeline Runs, showing the status of these runs and allowing the cancellation and deletion of runs.

In all cases, you can dig deeper into the logs and see the output of each Tekton Task and Step in the selected Pipeline run.


Caution: The Tekton Pipelines project is still in alpha and being actively developed. It is very likely that you will need to make changes to your pipeline definitions as the project continues to evolve

Watch the demo

Watch this short companion video for a demo of using our sample Tekton pipeline in IBM Cloud.


Now you can try it out yourself. For more information, see our documentation. If you have questions, engage our team via Slack by registering here and join the discussion in the #ask-your-question channel on our public Cloud DevOps @ IBM Slack.

We’d love to hear your feedback.

Learn more about Tekton by reading “Tekton: A Modern Approach to Continuous Delivery.”

From Armory – Kelsey Hightower on Spinnaker: Culture is what you DO

By Blog, Member

Originally posted on the Armory blog, by Rosalind Benoit

“Let Google’s CloudBuild handle building, testing, and pushing the artifact to your repository. #WithSpinnaker, you can go as fast as you want, whenever you’re ready.”

Calling all infrastructure nerds, SREs, platforms engineers, and the like: if you’ve never seen Kelsey Hightower speak in person, add it to your bucket list. Last week, he gave a talk at Portland’s first Spinnaker meetup, hosted at New Relic by the amazing PDX DevOps GroundUp. I cackled and cried at the world’s most poignant ‘Ops standup’ routine. Of course, he thrilled the Armory tribe with his praise of Spinnaker’s “decoupling of high level primitives,” and I can share some key benefits that Kelsey highlighted:

  • Even with many different build systems, you can consolidate deployments #withSpinnaker. Each can notify Spinnaker to pick up new artifacts as they are ready.
  • Spinnaker’s application-centric approach helps achieve continuous delivery buy-in. It gives application owners the control they crave, within automated guardrails that serialize your software delivery culture. 
  • Building manual judgements into heavy deployment automation is a “holy grail” for some. #WithSpinnaker, we can end the fallacy of “just check in code and everything goes to prod.” We can codify the steps in between as part of the pipeline. 
  • Spinnaker uses the perfect integration point. It removes the brittleness of scripting out the transition between a ‘ready-to-rock’ artifact and an application running in production. 

Kelsey’s words have profound impact. He did give some practical advice, like “Don’t run Spinnaker on the same cluster you’re deploying to,” and of course, keep separate build and deploy target environments. But the way Kelsey talked about culture struck a chord. We called the meetup, “Serializing culture into continuous delivery,” and in his story, Kelsey explained that culture is what you do: the actions you take as you work; your steps in approaching problems. 

Is Spinnaker The Hard Way coming?

Yes, please!

I’m reminded of working on a team struggling with an “agile transformation” through a series of long, circular discussions. I urged my team, “Scrum is just something that you do!” You go to standups, and do demos. You get better at pointing work over time. The ceremonies matter because you adapt by doing the work

Kelsey says his doing approach starts with raising his hand and saying, “I would like to own that particular problem,” and then figuring it out as he goes. Really owning a problem requires jumping in to achieve a deep understanding of it. It’s living it, and sharing with others who have lived it. We can BE our culture by learning processes hands-on, digging into the business reasons behind constraints, and using that knowledge to take ownership. Hiding behind culture talk doesn’t cut it, since you have to do it before you can change it. 

Tweets: Spinnaker is complex, but the return on investment is totally worth it

“The return on investment is totally worth it”

Another important way of doing: recognizing when you don’t know how to do it and need some help. Powerful open source projects like Kubernetes and Spinnaker can become incredibly complicated to implement in a way that faithfully serializes your culture. Responsible ownership means getting the help you need to execute.

I love how Kelsey juxtaposed the theatrics and hero mythology behind change management and outage “war rooms” with the stark truth of the business needs behind our vital services. As Kelsey shared his Ops origins story, I recalled my own – the rocket launch music that played in my head the first time I successfully restarted the java process for an LMS I held the pager for, contrasted with the sick feeling I got when reading the complaining tweets from university students who relied on the system and had their costly education disrupted by the outage. I knew the vast majority of our students worked full time and paid their own way, and that many had families to juggle as I do. This was the real story of our work. It drove home the importance of continuous improvement, and meant that our slow-moving software delivery culture frustrated the heck out of me. 

Kelsey's Deployment Guide Doc

Kelsey’s LOL simulation of the Word doc deployment guide at his first “real” job. Got a deployment horror story about a Word-copied command with an auto-replaced en-dash on a flag not triggered until after database modification scripts had already run? I do!

So what do you do if you’re Kelsey? You become an expert at serializing a company’s decisions around software delivery and telling them, as a quietly functioning story, with the best-in-class open source and Google tooling of the moment. He tells the story of his first automation journey: “So I started to serialize culture,” he says, when most of the IT department left him to fend for himself over the winter holidays. Without trying to refactor applications, he set to work codifying the software delivery practices he had come to understand through Ops. He automated processes, using tools his team felt comfortable with. 

He said, “We never walked around talking about all of our automation tools,” and that’s not a secrecy move, it’s his awareness of cognitive dissonance and cognitive overload. Because he had created a system based on application owners expectations, their comfort zone, he didn’t need to talk about it! It just worked (better and more efficiently over time, of course), and fulfilled the business case. Like Agile, this approach limits the scope of what one has to wrap their brain around to be excellent. Like Spinnaker, it empowers developers to focus on what they do best.

Instead of talking about the transformation you need, start by starting. Then change will begin.

Join Spinnaker Slack to learn more about Spinnaker and connect with folks who use and operate it. Read more about starting where you are, with what you have, or reach out to to set up a value stream mapping discovery day with experts from Armory and Continuity. 

Tweets: Spinnaker and Jenkins

Spinnaker and Jenkins can cooperate to deliver software if that’s what makes sense in your culture.

Tweet: I learned so much from Kelsey's talk on Spinnaker

Kelsey is a #legend for his techniques for getting people comfortable with new tools and automation!

From Puppet – How to Cut Unused EC2 Instances with AWS Lambda

By Blog, Member

Uh… Why is this month’s AWS bill so high?

Originally posted on the relay-sh blog, by Kenaz Kwa, Principal Product Manager @puppetize

Forgotten AWS EC2 instances have made everyone’s pockets hurt (including Puppet!). Take it from us ( team) — if you don’t proactively clean up unused EC2 instances, cloud spending can quickly get out of control. However, it can be tedious to routinely check which EC2 instances are still in use, track down the old ones, and remove them. Luckily — we know how to automate these tasks!

Our mission is to free you to do what robots can’t.

This post walks you through de-provisioning unused EC2 instances by using AWS Lambda and CloudFormation to deploy an EC2 reaper that uses simple Tags to cut down on spending.

To see the full code, check out this repo.

AWS EC2 Reaper overview

The AWS Reaper works by checking and enforcing tags that are set on the EC2 instances. All EC2 instances must be tagged with a lifetime or a termination_date. The termination_date defines a future date after which the EC2 instance will be terminated. Alternatively, the Reaper looks for a lifetime tag– if found, it calculates a new future date and adds that date as the termination_date tag for the EC2 instance.

First, let’s look at the The main reaper logic for handling instances is in the terminate_expired_instances function which lists instances and looks up the termination date tag for each instance:

Improperly Tagged Instances

If we find an instance that doesn’t have a termination_dateor we find the tag can’t be parsed, we stop it:

This enables us to stop the b̶l̶e̶e̶d̶i̶n̶g̶ billing while we contact the instance owner to see if it should still be kept around.

Expired Instances

For all instances we find that are expired, we destroy:

Deploying the EC2 reaper

Now, we could just run this python script against different AWS regions and we’d already be better off than doing this manually. However, we would rather not spend time babysitting scripts at all. We’re going to deploy this into AWS using CloudFormation Stacks.

Deploying the AWS reaper has two parts:

  • deploy_to_s3.yaml AWS CloudFormation template that places the lambda zip resources in S3 buckets in every region so that the deploy_reaper template can read them for Reaper deployment.
  • deploy_reaper.yaml AWS CloudFormation template that installs the reaper creates the IAM role and deploys the lambda function to perform the instance reaping.

deploy_to_s3 template

In order to use this template, you must first manually create an S3 bucket that contains the resources to copy across all regions. You will need to do this once per region; S3 resources can be read between accounts but not between regions for AWS Lambda. This only needs to be done one time for the administrative account.

  • Manually create an S3 bucket accessible from the administrative account. Zip up the two python reaper files, and and place them in the bucket, naming them and
  • From the administrative account, create a new stack set and use the deploy_to_s3 template. An example CLI invocation would look like:
$ aws cloudformation create-stack-set --stack-set-name reaper-assets --template-body 
file://path/to/deploy_to_s3.yaml --capabilities CAPABILITY_IAM --parameters
  • Deploy stack-set-instances for this stack set, one per region in the administrative account. Check the Amazon documentation for the most up-to-date region list. For example:
$ aws cloudformation create-stack-instances --stack-set-name --accounts 123456789012
--regions "us-west-1" "us-west-2" "eu-west-1" ...

deploy_reaper template

After the resources for the reaper have been distributed, you can use the deploy_reaper CloudFormation template to deploy the reaper into an account.

You will need to follow the steps below for each account you are deploying the reaper into.

  • First, create a stack set representing the account you wish to run the reaper in. Example invocation:
$ aws cloudformation create-stack-set --stack-set-name reaper-aws-account --template-body
file://path/to/deploy_reaper.yaml --capabilities CAPABILITY_IAM --parameters
ParameterKey=SLACKWEBHOOK,ParameterValue=1234567 ...
  • Deploy the reaper into the account.
$ aws cloudformation create-stack-instances --stack-set-name reaper-aws-account --accounts
098765432109 --regions "us-west-1" "us-west-2" "eu-west-1" ...

Turning on the EC2 Reaper

Once deployed, the EC2 Reaper will not reap anything unless the environment variable LIVEMODE is set to TRUE. It will only report what it would have done to Slack.

When the time comes to activate the Reaper, update the parameter value LIVEMODE to “TRUE”(the regex is case-insensitive).

$ aws cloudformation update-stack-set --stack-set-name reaper-aws-account
--use-previous-template --parameters ParameterKey=LIVEMODE,ParameterValue=TRUE --capabilities CAPABILITY_IAM


Now you have learned how to control costs on AWS by reaping old EC2 instances. To learn more about our mission and product, sign up for our updates on Our mission is to free you of tedious cloud-native workflows with event-driven automation! For more content like this, please follow our medium page at