Secrets at Shopify - Introducing EJSON

This is a continuation of our series describing our evolution of Shopify toward a Docker-powered, containerized data centre. Read the last post in the series here.

One of the challenges along the road to containerization has been establishing a way to move application secrets like API keys, database passwords, and so on into the application in a secure way. This post explains our solution, and how you can use it with your own projects.

Motivation

In the previous iteration of our infrastructure, secrets were provisioned onto servers via Chef's encrypted data bags. This worked fine, but had a few issues:

  1. Adding and rotating secrets required the "keys to the kingdom", as Chef uses symmetric encryption. This led to a lot of unnecessary work for our operations team in setting secrets for other teams, and eventually resulted in more teams having access to higher-level credentials than was reasonable;

  2. Chef-client doesn't run synchronously with application deploys, so extra effort was required to add or rotate secrets when they were accompanied by code changes;

  3. The way Chef encrypts data bags makes line-by-line auditing of changes to secrets impossible with our normal code review process;

  4. We did not want to run Chef inside containers, or rely on externally provisioned files for container correctness.

The first three points had been minor annoyances for a while, but the fourth ended up being a deal breaker. We wanted a solution with the following properties:

  1. Asymmetric encryption, to give developers the power to add or rotate secrets, but not to decrypt them;
  2. Secrets that change synchronously with respect to application source;
  3. Trivial per-secret change auditing;
  4. Minimal host-side runtime dependencies while still keeping decrypted secrets out of Docker images.

EJSON

In order to meet our requirements, we created EJSON. EJSON is a simple library that encrypts all the values in a JSON file with a public key embedded in the file. The paired private key is stored on our container build server - we’ll get into how builds are handled a little later on. Developers can add plaintext secrets to an ejson(5) file, then run the ejson(1) command line utility to encrypt all the new plaintext values.

For example:

{
  "_public_key": "9ec06d06c63143bd3dd1c121326cc4c6536932a8ccf57f78cf34894c1c3e567a",
  "api_key": "some_password_or_token"
}

If we save this as secrets.ejson in our application and run ejson encrypt *.ejson, the file will be transformed into something like this:

{
  "_public_key": "9ec06d06c63143bd3dd1c121326cc4c6536932a8ccf57f78cf34894c1c3e567a",
  "new_secret": "EJ[1:xjJ/6phdt37CtSZjiriMiixogdguczl9tlcRtZAQ1Sk=:/UZUbD+9dphu4tC5vTq6lIVK4Cg7i2oI:YCz9c9q6lYACXgDPfART1OPHYJMMqt4=]"
}
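
In practice, the command line round trip looks roughly like this. This is a sketch: the key directory path mirrors the /opt/ejson/keys mount we use in production (described below), and you should check your ejson version's help output for the exact flags.

ejson keygen                  # generate a keypair; embed the public key as _public_key and
                              # store the private key in your key directory (e.g. /opt/ejson/keys)
ejson encrypt secrets.ejson   # encrypt any plaintext values in the file, in place
ejson decrypt secrets.ejson   # print the decrypted JSON (requires access to the private key)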

This can then be committed to the git repository, which allows for line-by-line auditing. We provision a new key for each project that uses EJSON. This allows us to hand the application developer the private key if the need arises, without compromising secrets in other applications.

Key Management

Now we have a secrets.ejson file in our repository, and we're building a Docker container. We build a container for each push to the master branch, which includes the source plus all dependencies for that ref. This means that each container includes a copy of the encrypted secrets - a strategy we like because it reduces the number of variables in play during deploys and rollbacks. Here are our criteria for decrypting secrets to make them available to the application at runtime:

  1. Decrypted secrets and decryption keys should never be present on the Docker registry;
  2. Application servers which run the containers should only require one decryption key;
  3. The container build server should be the only machine that has access to all the private keys.

Given the above, the process we use is as follows:

  1. When building the container, decrypt all the secrets in the repository (encrypted with various keys), and re-encrypt them with a single 'infrastructure' key (see the sketch after this list);
  2. Securely provision the 'infrastructure' key onto each server that runs containers. We use Chef for this - these servers are already provisioned using Chef, so this choice didn’t introduce any unnecessary complexity;
  3. When booting an image, mount in the infrastructure key, decrypt any secrets found inside the image, unmount the key volume, then initialize the application. This is accomplished by our custom init process.
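
To illustrate step 1, the build-time re-encryption can be sketched as a small shell script. This is hypothetical - our actual builder isn't shown in this post - and it assumes jq is available and that INFRA_PUBLIC_KEY holds the public half of the infrastructure keypair:

# hypothetical sketch of step 1: re-encrypt every ejson file to the infrastructure key
INFRA_PUBLIC_KEY="<infrastructure public key>"   # placeholder
for f in $(git ls-files '*.ejson'); do
  ejson decrypt "$f" > "$f.tmp"                                       # decrypt using the project's private key
  jq --arg k "$INFRA_PUBLIC_KEY" '._public_key = $k' "$f.tmp" > "$f"  # swap in the infrastructure public key
  rm "$f.tmp"
  ejson encrypt "$f"                                                  # re-encrypt all values to that key
done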

From a developer’s perspective, the end result of this process is that they put encrypted secrets into the repository at coding-time, which are then decrypted in-place before runtime. At no point does the application have access to the decryption key, since the decryption process takes place in our custom container init process, prior to dropping permissions and executing the application itself.

Key Negotiation

It turned out to be surprisingly difficult to securely insert the decryption key into the container while revoking access before the application starts. We wanted to mount the key in as a volume and then unmount it (i.e. call umount(2)), but Docker specifically drops the required capability (CAP_SYS_ADMIN) when it initializes the container.

We experimented with copying the key into a staging directory, which we would mount into the container as a read-write volume, removing the key file after it had been read. However, copying keys around on the host filesystem required a relatively complex dance to get a container running, and it also widened the attack surface.

Eventually we realized that rather than actually starting our containers with a low-permission user, we could start our containers with high permissions and have our init process drop them before executing the hosted application. From there, we arrived at our current process:

  1. docker run with -v /opt/ejson/keys:/opt/ejson/keys:ro, -u root and --cap-add=SYS_ADMIN
  2. (In the container init process) Decrypt the secrets in-place using the mounted key
  3. Unmount /opt/ejson/keys
  4. Drop permissions from root to our low-permission user (implicitly dropping CAP_SYS_ADMIN)
  5. Execute the application

This may sound a little complex, but it's very little additional complexity if, like us, you already have a custom init process.
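
For concreteness, the init-side portion of that sequence might look like the minimal shell sketch below. The file paths, the 'app' user, and the use of gosu to drop privileges are assumptions for illustration - our real init process is more involved - but the sequence of operations is the same:

#!/bin/sh
# hypothetical init sketch, run as the container entrypoint under the docker run
# invocation from step 1 (-u root, --cap-add=SYS_ADMIN, key volume mounted read-only)
set -e
ejson decrypt /app/config/secrets.ejson > /app/config/secrets.json  # step 2: decrypt using the mounted key
umount /opt/ejson/keys                                              # step 3: revoke access to the key (needs CAP_SYS_ADMIN)
exec gosu app "$@"                                                  # steps 4-5: drop to the unprivileged user and start the app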

Metadata

We've recently become interested in attaching metadata to each secret in our ejson files, so we added a feature to leave values unencrypted if their key begins with an underscore. This lets us build schemas like:

{
  "fooco_api_key": {
    "_description": "token for FooCo analytics service",
    "_urls": ["https://foo.co"],
    "_rotation": "log in to the console with user=shopify and the password below, click 'generate new token'",
    "console_password": "passw0rd",
    "secret": "my_api_token"
  }
}

In the example above, only console_password and secret would be encrypted. We specifically decided against building knowledge of schemas into ejson, but we're experimenting with enforcing them on applications via Continuous Integration checks.
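
As an example, a CI check along these lines could enforce part of such a schema. This is a hypothetical sketch assuming jq is available; it fails the build if any top-level entry other than _public_key is missing a _description:

# hypothetical CI check: every secret entry must be an object with a _description
for f in $(git ls-files '*.ejson'); do
  jq -e '
    to_entries
    | map(select(.key != "_public_key") | .value)
    | map(if type == "object" then has("_description") else false end)
    | all
  ' "$f" > /dev/null || { echo "schema violation in $f"; exit 1; }
done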

In summary:

  • We use ejson to write encrypted secrets directly into the application's git repo with a different keypair per project.
  • Our container builder re-encrypts the secrets to a single 'infrastructure' key present on all machines that run production containers.
  • The container's init process decrypts the secrets and discards the decryption key before the application initializes.
  • We're experimenting with schemas to track secret metadata and make rotation a little easier.

Of all the issues we had to work through while implementing Docker at Shopify, this has been the most unexpectedly complex. We hope the Docker ecosystem is able to come up with a more generic solution to this problem, but at the same time, we don't have any magic bullets. EJSON works well for us, and will work well for some others, but the requirement to add logic to the container's init process makes it too heavyweight a solution for many.

How do you handle application secrets in docker containers? Let us know in the comments, and stay tuned for more posts in this series.