Step v0.8.3: Federation and Root Rotation for step Certificates

Sebastian Tiedtke

We are excited to start the New Year off with a new release (v0.8.3) of step, the zero trust Swiss Army knife and powerful open source certificate management solution. Alongside the usual bug fixes, we've included some exciting new features! A common thread in the community feedback we’ve received since the introduction of step certificates, which brought world-class certificate management to step, has centered around the following set of related problems:
- How do we seamlessly rotate root certificates?
- How do we migrate between PKI/certificate management solutions?
- How do we enable secure cross-communication between autonomous internal PKIs?
Fundamentally, the solution to this set of problems is to have workloads (relying parties) trust multiple root certificates. This capability allows for transitioning from an old root cert to a new root cert (e.g. due to imminent expiry or to transition onto step) by trusting both for some transitional period. It also allows for cross-communication between two autonomous systems (with independent CAs) that trust each other's root certificates for an extended period. This more sophisticated trust model, with relying parties trusting multiple root certificates for the purpose of cross-communication, is called federation.
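To make "trusting multiple root certificates" concrete, here is a minimal sketch using only Go's standard library (not the step SDK). It generates two throwaway roots in memory and checks a leaf against a pool containing one root or both. The helper names (`newRoot`, `newLeaf`, `trusted`) and the subject names are made up for illustration.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newRoot creates a self-signed demo root certificate and its private key.
func newRoot(name string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// newLeaf issues a demo leaf certificate signed directly by the given root.
func newLeaf(name string, root *x509.Certificate, rootKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: name},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, root, &key.PublicKey, rootKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// trusted reports whether leaf chains to at least one of the given roots.
func trusted(leaf *x509.Certificate, roots ...*x509.Certificate) bool {
	pool := x509.NewCertPool()
	for _, r := range roots {
		pool.AddCert(r)
	}
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err == nil
}

func main() {
	cloud, _, err := newRoot("Cloud Root")
	if err != nil {
		panic(err)
	}
	k8s, k8sKey, err := newRoot("Kubernetes Root")
	if err != nil {
		panic(err)
	}
	leaf, err := newLeaf("k8s-workload", k8s, k8sKey)
	if err != nil {
		panic(err)
	}
	fmt.Println("trusted with Cloud root only:", trusted(leaf, cloud))
	fmt.Println("trusted with both roots:", trusted(leaf, cloud, k8s))
}
```

The only thing that changes between the two checks is the contents of the root pool, which is why distributing an updated root bundle is the heart of both federation and rotation.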
Federation & root certificate rotation
While trusting multiple roots sounds conceptually simple, the devil is in the details. Since the details of how TLS manages authentication are distributed -- each endpoint is configured separately and operates independently -- they're hard to change. To federate, a new bundle of root certificates must be distributed to every endpoint. Ideally, configuration management automation should do that for you. More often than not, though, config management is out of sync with reality or doesn’t reach every deployed endpoint. Root cert rotation is even trickier since it requires a series of carefully sequenced operations to introduce the new root, trust old and new for a transitional period, and then decommission the old root once it's no longer required.
We'd like to make all of this easier. We’re excited to announce the preview release (alpha) of root cert rotation and federation support in step (v0.8.3). This includes workflow support and a scalable automated approach to root cert bundle distribution to all secured endpoints. Since the intentions behind federation (long-lived trust of multiple roots) and root cert rotation (short-lived migration period) are different, we chose to implement them separately.
In this post we'll take a closer look into how federation works.
Securely communicate across clusters/clouds
The purpose of federation is to allow for secure communication across autonomous systems (e.g., across clouds or between Kubernetes clusters). Frequently, people try to solve this problem by relying on certificates issued by Web PKI certificate authorities (CAs) like Let's Encrypt. This approach seems attractive at first: you don't need to run a CA and your certificates are trusted by default by most TLS implementations.
Unfortunately, with a bit of scale and scrutiny, things start falling apart: the domain validation done by Let's Encrypt is a relatively weak form of authentication for sensitive service-to-service communication; the number of certificates you can have is limited by cost and/or rate limiting; and Web PKI was never designed for this use case! Check out "Everything you should know about certificates and PKI but are too afraid to ask" for a thorough overview.
Using your own internal PKI is a better approach.
Internal PKIs using online Certificate Authorities
You could manage cross-cluster communication by issuing certificates everywhere from a single CA. More often, autonomous clusters run their own independent CAs to meet architectural, security, or governance/compliance requirements. When autonomous systems with independent CAs need to communicate, federation is required.
To illustrate the problem, and how step certificates solves it, let's launch two autonomous CAs and try to communicate using mutual TLS (mTLS).
In order to run through the example make sure you have a Golang environment (go version go1.11.x) set up. We assume macOS but the example will work on Linux with minor modifications.
Bring up two autonomous CA instances
Let's launch two CAs in separate terminal windows called Cloud and Kubernetes.
Note: For demonstration purposes, we'll use pre-generated keys and configurations. Don't use these artifacts for your own PKI deployments. Check out the step certificates announcement for instructions on how to generate your own.
In terminal window 1 (password is password for the sake of simplicity):
In terminal window 2 (again, password is password):
Launch demonstration server
Once both online CAs are up and running, let's bring up the demo server with a cert from Cloud CA. This demo server leverages step certificates' SDK to handle cert enrollment using a one-time token and subsequent auto-renewals for our convenience. We let the server know which CA, Cloud or Kubernetes, to use by passing command line flags:
Secure service-to-service communication using mTLS
Our server uses mTLS, so if we try to connect to it without specifying the right root certificate and presenting a valid client certificate and key our requests will fail:
Let's grab a client cert for curl from Cloud CA and try again:
The server's successful (HTTP/2 200) response shows that the client was successfully authenticated.
Now let's try connecting with a client cert issued by Kubernetes CA:
As expected, the server rejects the client authentication attempt because it only trusts certificates issued by a single root: Cloud, not Kubernetes.
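The same behavior can be reproduced in-process with Go's standard library. The sketch below is not the demo server and does not use the step SDK: it spins up an `httptest` HTTPS server that requires client certificates, generates throwaway roots and client certs in memory, and shows that which clients get through depends only on which roots the server lists in `ClientCAs`. The helpers `issue`, `newServer`, and `get` are hypothetical names for this example.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net/http"
	"net/http/httptest"
	"time"
)

// issue creates a demo certificate. With a nil parent the result is a
// self-signed root; otherwise it is a client certificate signed by parent.
func issue(name string, parent *tls.Certificate) tls.Certificate {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(time.Now().UnixNano()),
		Subject:      pkix.Name{CommonName: name},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	signer, signerKey := tmpl, interface{}(key)
	if parent == nil {
		tmpl.IsCA, tmpl.BasicConstraintsValid = true, true
		tmpl.KeyUsage = x509.KeyUsageCertSign
	} else {
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth}
		signer, signerKey = parent.Leaf, parent.PrivateKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		panic(err)
	}
	leaf, err := x509.ParseCertificate(der)
	if err != nil {
		panic(err)
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key, Leaf: leaf}
}

// newServer starts a test HTTPS server that only accepts clients whose
// certificate chains to one of the given roots.
func newServer(roots ...tls.Certificate) *httptest.Server {
	pool := x509.NewCertPool()
	for _, r := range roots {
		pool.AddCert(r.Leaf)
	}
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello", r.TLS.PeerCertificates[0].Subject.CommonName)
	}))
	srv.TLS = &tls.Config{ClientAuth: tls.RequireAndVerifyClientCert, ClientCAs: pool}
	srv.StartTLS() // httptest supplies the server's own certificate
	return srv
}

// get sends one request to srv, presenting the given client certificate.
func get(srv *httptest.Server, client tls.Certificate) error {
	tr := srv.Client().Transport.(*http.Transport).Clone()
	tr.TLSClientConfig.Certificates = []tls.Certificate{client}
	defer tr.CloseIdleConnections()
	resp, err := (&http.Client{Transport: tr}).Get(srv.URL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	cloud := issue("Cloud Root", nil)
	k8s := issue("Kubernetes Root", nil)
	cloudClient := issue("cloud-client", &cloud)
	k8sClient := issue("k8s-client", &k8s)

	srv := newServer(cloud) // trusts the Cloud root only
	defer srv.Close()
	fmt.Println("cloud client error:", get(srv, cloudClient))
	fmt.Println("k8s client error:", get(srv, k8sClient))
}
```

Calling `newServer(cloud, k8s)` instead is the in-process analogue of a federated root bundle: the same Kubernetes-issued client certificate is then accepted without being reissued.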

Securely facilitate service-to-service communication across authorities
We could solve this by issuing two certificates to each client: one from Cloud CA and the other from Kubernetes CA. But that's complicated and it likely conflicts with our motivation for running independent CAs in the first place (e.g., to simplify governance & access management, or to prevent an online CA from being exposed to the public internet). Instead, let's federate. Here's how:
Terminate Cloud CA in your terminal window and restart it using its federated configuration file.
The only difference in this new configuration for Cloud CA is that the root cert of Kubernetes CA is listed under federatedRoots.
At this point we've configured Cloud CA to trust Kubernetes CA. This root bundle change can be easily distributed using the step CLI. Code that integrates with our SDK (like our example server) will pick up this change automatically the next time it renews its certificate. We're done.
So far, Cloud CA clients have been configured to trust Kubernetes CA, but not the other way around. We could stop here, but many real-world scenarios require mutual trust. So let's also terminate Kubernetes CA and restart it using its own federated configuration:
Our server will eventually pick up a new root bundle when its certificate renews. To speed things up we can restart our server. It will report that it now accepts certificates anchored in both the Cloud and Kubernetes CAs.
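For readers curious how a running Go server can start trusting a new root bundle without a restart, here is one possible approach using only the standard library. This is an illustrative sketch, not how the step SDK is implemented: `rootStore`, `poolOf`, and their methods are hypothetical names, and fetching the bundle and configuring the server's own certificate are left out.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"sync/atomic"
)

// rootStore holds the current client-CA bundle and allows it to be replaced
// while the server keeps running.
type rootStore struct {
	pool atomic.Value // always holds a *x509.CertPool
}

// poolOf builds a certificate pool from the given roots.
func poolOf(roots ...*x509.Certificate) *x509.CertPool {
	p := x509.NewCertPool()
	for _, r := range roots {
		p.AddCert(r)
	}
	return p
}

func newRootStore(initial *x509.CertPool) *rootStore {
	s := &rootStore{}
	s.pool.Store(initial)
	return s
}

// Swap installs a new bundle. Handshakes that begin afterwards use it;
// connections that are already established are unaffected.
func (s *rootStore) Swap(p *x509.CertPool) { s.pool.Store(p) }

// TLSConfig returns a server config that looks up the current bundle on every
// incoming handshake. base carries the remaining settings (such as the
// server's own certificate); nil means an empty config.
func (s *rootStore) TLSConfig(base *tls.Config) *tls.Config {
	if base == nil {
		base = &tls.Config{}
	}
	cfg := base.Clone()
	cfg.GetConfigForClient = func(*tls.ClientHelloInfo) (*tls.Config, error) {
		c := base.Clone()
		c.ClientAuth = tls.RequireAndVerifyClientCert
		c.ClientCAs = s.pool.Load().(*x509.CertPool)
		return c, nil
	}
	return cfg
}

func main() {
	store := newRootStore(poolOf()) // placeholder: start with an empty bundle
	cfg := store.TLSConfig(nil)

	federated := poolOf() // placeholder for a freshly fetched federated bundle
	store.Swap(federated)

	c, err := cfg.GetConfigForClient(nil)
	if err != nil {
		panic(err)
	}
	fmt.Println("handshakes now use the new bundle:", c.ClientCAs == federated)
}
```

Because the bundle is looked up per handshake, a periodic refresh (for example, alongside certificate renewal) only needs to call `Swap`.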
We can also use the step CLI to fetch the federated root bundle for use with curl:
Robust implementation with step's SDK for clients
While it's perfectly acceptable to manage certificates and root bundles on the filesystem, we also have a client SDK to go along with the server SDK we've been using here. Like the server SDK, the client SDK will generate certs, private/public keys, and fetch trusted (federated) root bundles automatically. It keeps keys in memory and automatically renews certificates before they expire, improving security and streamlining operations.
Summary
Amongst many other benefits, mutually authenticated TLS lends itself well to secure service-to-service communication between autonomous systems running everywhere. With these newly added preview features for federation & root rotation, the step toolkit expands robust identity bootstrapping beyond a single Kubernetes cluster, cloud, or VM without getting bogged down by operational challenges.
Try it out and let us know what you think. We're also looking forward to hearing about challenges you're running into deploying microservices into heterogeneous environments that span multiple clusters & clouds or any other questions you might have.