Embarrassingly easy private certificate management for VMs on AWS, GCP, and Azure
Mike Malone
Today’s release of step and step-ca (v0.11.0) adds support for cloud instance identity documents (IIDs), making it embarrassingly easy to get certificates to workloads running on public cloud virtual machines (VMs). If you want to use TLS to secure service-to-service communication in the cloud (and you should), this will make your life 100% easier.
See the release announcement to find out what else is new in v0.11.0. This post introduces IID-based authentication with step and step-ca, and notes some interesting architectural and security details. Here's how it works once everything is set up:
All of the nitty-gritty details of key pair & CSR generation, and IID-based authentication, are handled for you. With three simple commands you get a certificate and private key for use with TLS:
$ step certificate inspect --short foo.crt
X.509v3 TLS Certificate (ECDSA P-256) [Serial: 4555...1939]
  Subject:     foo.local
  Issuer:      My Intermediate CA
  Provisioner: AWS IID Provisioner [ID: 8074...3263]
  Valid from:  2019-07-08T21:39:40Z
          to:  2019-07-09T21:39:40Z
Background & motivation
step-ca is a private certificate authority (CA) you can run yourself. step is a step-ca API client. Together, they let you automate internal certificate management.
More concretely:
- step-ca is a server that issues certificates via a JSON/HTTPS API
- step integrates with step-ca's API to get, renew, and revoke certificates
They're both open source. See our getting started guide to get them up and running.
Certificates are the only broadly interoperable option for defense-in-depth and for connecting system components across the public internet. You can use certificates and TLS to connect to databases, authenticate & encrypt requests inside a VPC, or connect securely across the public internet without a VPN.
Running a private CA reduces reliance on trusted third parties and lets you control certificate details, satisfy SLAs, control naming, and generally own your internal security.
But running a CA and managing certificates is hard. step and step-ca fill a tooling gap to make these tasks much easier. IID-based authentication makes integrating these tools into cloud environments turnkey.
IID-based authentication
Instance identity documents (IIDs) are simply credentials that identify an instance’s name and owner. By presenting an IID in a request, a workload can prove that it's running on a VM instance that you own.
The major clouds have different names for IIDs: AWS calls them instance identity documents, GCP calls them instance identity tokens, and Azure calls them access tokens. The “metadata” included in an IID also differs between clouds, along with many other implementation details. Abstractly, though, they're all the same: signed bearer tokens that identify, minimally, the name and owner of a VM.
IIDs are returned from a metadata API exposed via a non-routable IP address (the link-local address 169.254.169.254). This magic is orchestrated by the hypervisor, which identifies the requesting VM and services IID requests locally. The upshot is: IIDs are very easy to get from within a VM via one unauthenticated HTTP request. Barring any security issues, they're impossible to get from anywhere else.
For fun, let's fetch an IID on GCP. Since GCP encodes IIDs as JWTs and makes their public keys available at a well-known endpoint, we can easily verify and decode a GCP IID on the command line using step:
$ curl -s -H "Metadata-Flavor: Google" \
    'http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=step-cli&format=full' \
  | step crypto jwt verify \
    --jwks https://www.googleapis.com/oauth2/v3/certs \
    --aud step-cli \
    --iss https://accounts.google.com
{
  "header": {
    "alg": "RS256",
    "kid": "afde80eb1edf9f3bf4486dd877c34ba46afbba1f",
    "typ": "JWT"
  },
  "payload": {
    "aud": "step-cli",
    "azp": "117354011164720418655",
    "email": "259112794408-compute@developer.gserviceaccount.com",
    "email_verified": true,
    "exp": 1562780090,
    "google": {
      "compute_engine": {
        "instance_creation_timestamp": 1557864897,
        "instance_id": "5404647754959331152",
        "instance_name": "foo",
        "project_id": "step-instance-identity-test",
        "project_number": 259112794408,
        "zone": "us-west1-b"
      }
    },
    "iat": 1562776490,
    "iss": "https://accounts.google.com",
    "sub": "117354011164720418655"
  },
  "signature": "AV8vZiNjOJNkWhWp5oy9R_WgGu3-tePyM4pKyHoela2SMVyWfpq4fPlSUxSPdzmfCT_akrUXrw-mDq7eLByqDOp3A4sGEZn9bY4Vmt5h9QYMVIo_60LRtC7c7QoBFZp2u3tNrPaI8ZhoINgHCsTdbfGEUDnCA8aH1mygd8b3kUEXcMCHrgUayPEVSMih8OYfmHUdecyTt0qOw6Ima16lX1jmM6lSoj8VNFmee36qFn58qULchB89lqviv-E0VzS5NthlqaM2_ukYNtKac-MdQdIlE86a-2YtgyXo4OVCpb87Svf2Rw9VaFsCKt4wFlRsnz4B3rx3I2bM2mXsQZY38Q"
}
Note that, since IIDs are signed by your cloud provider, verification doesn't require any additional API requests. So, in addition to being easy for clients, IID-based authentication is scalable, performant, and highly available.
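Because a JWT-encoded IID carries its claims inside the token itself, you can also peek at them without any tooling beyond a POSIX shell. The sketch below is illustrative only: jwt_payload is a hypothetical helper name, it assumes a base64 command that accepts -d (as GNU coreutils does), and it decodes the payload without verifying the signature, so it is no substitute for step crypto jwt verify.

```shell
# Hypothetical helper: print the (unverified) JSON payload of a JWT.
# Assumes `base64 -d` is available; does NOT check the signature.
jwt_payload() {
  # The payload is the second dot-separated segment, base64url-encoded.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the trailing '=' padding; put it back.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Usage (TOKEN holds an IID fetched from the metadata API):
#   jwt_payload "$TOKEN"
```

Treat the output as a debugging aid; only claims from a token whose signature has been verified should be trusted.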
In summary, by simply fetching an IID from the metadata API and presenting it in an HTTPS request header, a workload can prove that it’s running on a VM under your control. That’s precisely how step and step-ca use IIDs.
Getting a certificate from step-ca using IID-based authentication
The basic architecture of step-ca's IID-based authentication is pretty simple:
To get a certificate from step-ca using an IID we need to:
- Generate a key pair and a certificate signing request (CSR) with the workload's name
- Obtain an IID from the metadata API to authenticate to step-ca
- Submit the CSR & IID to step-ca via HTTPS POST to obtain a certificate
- Store the certificate and private key somewhere our workload can find it
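To make the first step concrete, here is roughly what generating a key pair and CSR looks like by hand. This is a sketch under stated assumptions, not how step is implemented: it uses the openssl command line tool, make_csr is a hypothetical helper name, and the file naming is arbitrary.

```shell
# Hypothetical helper: create an ECDSA P-256 key and a CSR for a workload name.
# Assumes the openssl CLI is installed; writes NAME.key and NAME.csr.
make_csr() {
  name=$1
  # Generate a P-256 private key.
  openssl ecparam -name prime256v1 -genkey -noout -out "$name.key" || return 1
  # Build a CSR whose subject common name is the workload's name.
  openssl req -new -key "$name.key" -subj "/CN=$name" -out "$name.csr"
}

# Usage:
#   make_csr foo.local
#   openssl req -in foo.local.csr -noout -subject
```

The remaining steps (fetching the IID, packaging it with the CSR, and posting both to the CA) are where most of the fiddly detail lives.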
While this is all standards-based and simple in theory, in practice there's a lot of arcane implementation detail to get right. Luckily, the step command line utility works seamlessly with step-ca to do all of this for us.
To demonstrate, assume we have step-ca running on AWS with hostname ca.local (see getting started).
To enable IID-based authentication we'll simply configure step-ca, adding an AWS-type provisioner.
Find your AWS account ID so you can restrict access to your own VMs:
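One way to look up the account ID, sketched here under a few assumptions: either you have the AWS CLI configured on a workstation, or you are on an EC2 instance and can read the instance identity document from the metadata API. account_id_from_iid is a hypothetical helper name, and the sed expression assumes the document is formatted with one field per line, as AWS returns it.

```shell
# From a workstation with the AWS CLI configured:
#   aws sts get-caller-identity --query Account --output text

# Hypothetical helper: read an AWS instance identity document (JSON) on
# stdin and print the value of its "accountId" field.
account_id_from_iid() {
  sed -n 's/.*"accountId"[[:space:]]*:[[:space:]]*"\([0-9]*\)".*/\1/p'
}

# From an EC2 instance:
#   curl -s http://169.254.169.254/latest/dynamic/instance-identity/document \
#     | account_id_from_iid
```

Either route should give the same 12-digit ID, which is the value passed to --aws-account when adding the provisioner.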
On the host running step-ca add an AWS provisioner to your configuration by running:
$ step ca provisioner add "AWS IID Provisioner" \
    --type AWS \
    --aws-account 123456789042
(Re)start step-ca to pick up this new configuration:
$ step-ca $(step path)/config/ca.json
With the step-ca server configured and running, let's use step ca bootstrap to configure a new VM instance to trust and connect to it:
$ step ca bootstrap \
    --ca-url https://ca.local \
    --fingerprint f501ed49263c1369bd490a85660ddd4388d4175e0337100a11d4e82eae496499
The root certificate has been saved in ~/.step/certs/root_ca.crt.
Your configuration has been saved in ~/.step/config/defaults.json.
The --fingerprint is for the root certificate used by step-ca. Find it by running step certificate fingerprint $(step path)/certs/root_ca.crt on your CA.
After bootstrapping we're ready to get a certificate. If we pass our AWS IID provisioner name to step ca certificate (via --provisioner), step will automatically use IID-based authentication to get a certificate:
$ step ca certificate foo.local foo.crt foo.key \
    --provisioner "AWS IID Provisioner"
✔ Key ID: AWS IID Provisioner (AWS)
✔ CA: https://ca.local
✔ Certificate: foo.crt
✔ Private Key: foo.key
The first positional argument to step ca certificate specifies our workload's name -- the certificate subject -- in this case, foo.local. The next two positional arguments specify the files to which the certificate and private key are written, respectively.
We can use step certificate inspect to check our work:
$ step certificate inspect --short foo.crt
X.509v3 TLS Certificate (ECDSA P-256) [Serial: 4555...1939]
  Subject:     foo.local
               ip-172-31-65-180.us-east-1.compute.internal
               172.31.65.180
  Issuer:      My Intermediate CA
  Provisioner: AWS IID Provisioner [ID: 8074...3263]
  Valid from:  2019-07-08T21:39:40Z
          to:  2019-07-09T21:39:40Z
Congratulations, that's it. Tell your workload to use foo.crt and foo.key, configure clients to trust certificates signed by $(step path)/certs/root_ca.crt, and you're good to go. You now have a strong standards-based mechanism to authenticate workloads and encrypt traffic using TLS.
Example configurations for GCP and Azure are available in the step certificates docs. Instead of account IDs, the GCP IID implementation restricts access by project and/or service account. Azure restricts access by tenant. The step CLI abstracts away the remaining differences.
Depending on your situation and religion, you might toss these commands into a startup script, bake them into an AMI, or use configuration management for the last mile of automation. Speaking of religion: if you're using Kubernetes then IID-based authentication isn't right for you. Use autocert or our cert-manager integration instead.
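For the startup-script route, the two commands from this walkthrough can be wrapped in a small function. This is a minimal sketch, not an official recipe: enroll_instance is a hypothetical name, the argument order and file naming are arbitrary choices, and it assumes step is installed on the image.

```shell
# Hypothetical startup-script helper built from the commands shown above.
# Arguments: CA URL, root fingerprint, workload name, provisioner name.
enroll_instance() {
  ca_url=$1
  fingerprint=$2
  name=$3
  provisioner=$4

  # Do nothing if this instance already has a certificate on disk.
  [ -f "$name.crt" ] && return 0

  # Trust the CA, then request a certificate using IID-based authentication.
  step ca bootstrap --ca-url "$ca_url" --fingerprint "$fingerprint" &&
    step ca certificate "$name" "$name.crt" "$name.key" \
      --provisioner "$provisioner"
}

# Usage:
#   enroll_instance https://ca.local \
#     f501ed49263c1369bd490a85660ddd4388d4175e0337100a11d4e82eae496499 \
#     foo.local "AWS IID Provisioner"
```

The early return keeps the script safe to run on every boot, which matters because the CA issues only one certificate per VM by default.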
A few more features
Subject Names
In the example above we got a certificate identifying our workload using the logical service name foo.local. Standard HTTPS implementations require this name to appear in the authority portion of a URL. That is, this certificate will only work if we connect to the service using a URL like https://foo.local/some/path.
The name foo.local doesn't appear anywhere in the IID. There's no check to ensure that this particular VM should be allowed to obtain a certificate with this particular name. In fact, by default, step-ca will allow your VMs to obtain a certificate binding any name. This is a pretty broad privilege -- a VM that's supposed to run bar.local could maliciously impersonate foo.local -- but this isn't a problem if you trust all of the processes and users that have access to your VMs.
If you'd rather, you can add "disableCustomSANs": true to your provisioner configuration to lock down step-ca so that it will only issue certificates binding a cloud-specific instance identifier (e.g., i-05e360db6e50b5eda), internal hostname (e.g., ip-172-31-65-180.us-east-1.compute.internal), and internal IP address (e.g., 172.31.65.180):
ubuntu@ip-172-31-65-180:~$ NAME=$(curl -s http://169.254.169.254/1.0/meta-data/instance-id)
ubuntu@ip-172-31-65-180:~$ step ca certificate $NAME vm.crt vm.key \
    --provisioner "AWS IID Provisioner"
✔ Key ID: AWS IID Provisioner (AWS)
✔ CA: https://ca.local
✔ Certificate: vm.crt
✔ Private Key: vm.key
Here, we've passed the instance ID (i.e., i-05e360db6e50b5eda) to step ca certificate as the certificate's subject (its common name), which we obtain via introspection from the metadata API. With custom subject alternative names (SANs) disabled, step-ca automatically extracts the instance's internal hostname and IP from the IID and includes them as SANs in the certificate:
ubuntu@ip-172-31-65-180:~$ step certificate inspect --short vm.crt
X.509v3 TLS Certificate (ECDSA P-256) [Serial: 9391...0237]
  Subject:     i-05e360db6e50b5eda
               ip-172-31-65-180.us-east-1.compute.internal
               172.31.65.180
  Issuer:      My Intermediate CA
  Provisioner: AWS IID Provisioner [ID: 8074...3263]
  Valid from:  2019-07-08T19:00:48Z
          to:  2019-07-09T19:00:48Z
With custom SANs disabled for an IID provisioner this is the only sort of certificate step-ca will sign. That is, step-ca will only sign certificates binding your VM's instance identifier, host name, and IP address. This is much more restrictive, and will require that you connect to workloads using URLs like https://ip-172-31-65-180.us-east-1.compute.internal or https://172.31.65.180.
If you're interested in more sophisticated enrollment and access control please open an issue or let us know so we can understand your requirements.
Multiple Certificates
If you try to get two certificates on the same VM you'll find that the CA returns Unauthorized:
$ step ca certificate foo.local foo.crt foo.key
✔ Key ID: AWS IID Provisioner (AWS)
✔ CA: https://ca.local
✔ Certificate: foo.crt
✔ Private Key: foo.key
$ step ca certificate bar.local bar.crt bar.key
✔ Key ID: AWS IID Provisioner (AWS)
✔ CA: https://ca.local
Unauthorized
This is another security feature. Since any user or process on a VM can obtain an IID, they can also obtain a certificate. To prevent malicious certificate requests the CA will only issue one certificate per VM by default. If you obtain a genuine certificate early in your VM's life cycle (e.g., in a startup script) any subsequent requests will fail. So this feature mitigates some threats.
Unfortunately, this also makes it impossible to use IIDs to issue different certificates to two or more workloads running on the same VM. To turn this feature off, set "disableTrustOnFirstUse": true in your provisioner configuration and restart step-ca.
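For orientation, here is roughly where the two provisioner options from this section sit in the CA configuration. Treat it as a sketch rather than a reference: only disableCustomSANs and disableTrustOnFirstUse come from this post, while the surrounding field names (type, name, accounts) are my assumption about how the AWS provisioner entry in ca.json is laid out, so check them against the step certificates docs.

```json
{
  "type": "AWS",
  "name": "AWS IID Provisioner",
  "accounts": ["123456789042"],
  "disableCustomSANs": true,
  "disableTrustOnFirstUse": false
}
```

With these values the CA would only bind instance-derived names and would keep the default limit of one certificate per VM; flipping either flag relaxes the corresponding restriction.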
Renewal & Revocation
You can renew and revoke a certificate that was issued using IID-based authentication the same way you would any other certificate issued by step-ca, with step ca renew and step ca revoke:
$ step ca renew --force foo.crt foo.key
Your certificate has been saved in foo.crt.
$ step ca revoke --cert foo.crt --key foo.key
Certificate with Serial Number 4555...1939 has been revoked.
$ step ca renew --force foo.crt foo.key
error renewing certificate: Unauthorized
This handy incantation of step ca renew can address a couple of obvious operational considerations, too:
$ step ca renew --daemon --exec "kill -HUP $SOME_PID" foo.crt foo.key
INFO: 2019/07/18 14:22:18 first renewal in 15h50m43s
Here we've told step to stay running, renew your certificate periodically, and notify $SOME_PID after each renewal by sending it a SIGHUP. If the CA goes down for a period of time and a renewal fails, step ca renew will simply try again.
For more thoughts and best practices around renewal & revocation see our blog post on passive revocation.
Summary
So there you have it. IID-based authentication makes it easier than ever to get certificates to cloud VMs. All you need to do is spin up an instance running step-ca and add a few shell commands to your CD pipeline. The hard stuff is handled for you by your cloud provider, step, and step-ca. You can get up and running in an afternoon.
Once you have certificates issued it's easy to use TLS (or mutual TLS) to authenticate workloads, for defense-in-depth, and to connect across clouds without a VPN. This is a powerful capability that every distributed system should have. Your only excuse -- that it's "too hard" -- is no longer true. Automated certificate management with step and step-ca is free, open, and embarrassingly easy. Give it a try and let us know how it goes!
Mike Malone has been working on making infrastructure security easy with Smallstep for six years as CEO and Founder. Prior to Smallstep, Mike was CTO at Betable. He is at heart a distributed systems enthusiast, making open source solutions that solve big problems in Production Identity and a published research author in the world of cybersecurity policy.