May 12 | 8 min read

Introducing dynamic tenant creation

Dr. Jan-Philip Gehrcke

How to add a tenant and generate corresponding API tokens

Introduction: multi-tenancy and the authentication proof challenge

An Opstrace instance supports multiple secured tenants. The 'tenant' primitive—provided by the underlying projects Cortex and Loki—can be used to isolate data, dashboards, and API endpoints by team, environment, or by anything else important to you. Sometimes this isolation is used to address security concerns, but more generally, it helps keep things well-organized: after all, separation of concerns is a valuable best practice to keep in mind.

So, when installing an Opstrace instance in your account, you have to specify which tenants you want to provision. You list them by name in the cluster configuration document.

But what if, after that initial installation, you decide you would like to add another tenant? Until recently, you could not get such a tenant complete with a new public API authentication token.

One challenge associated with that task is creating API authentication tokens for the new tenant (as our APIs are secure by default). Let's take this step by step.

For each tenant, an Opstrace instance exposes certain API endpoints, with the tenant's name used in the corresponding DNS names. For example, the Cortex API for a tenant with the name prod in an Opstrace instance with the name nice-org is exposed securely via https://cortex.prod.nice-org.opstrace.io.
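The naming scheme is simple enough to capture in a tiny shell helper. This is only a sketch: the function name cortex_url is made up for illustration, and it assumes nothing beyond the Cortex naming pattern described above.

```shell
# Hypothetical helper (not part of the Opstrace CLI): build the Cortex API
# base URL from a tenant name and an Opstrace instance name.
cortex_url() {
  local tenant="$1" instance="$2"
  printf 'https://cortex.%s.%s.opstrace.io\n' "$tenant" "$instance"
}

cortex_url prod nice-org   # prints https://cortex.prod.nice-org.opstrace.io
```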

To use that API endpoint, a client has to provide authentication proof: a tenant-specific API token, signed by a private key whose corresponding public key needs to be trusted by the Opstrace instance.

In the past few weeks, we added functionality to the Opstrace CLI for said key management and token creation. We also started covering the complete user experience flow for tenant creation and corresponding token creation in our continuous integration. With this work, we can confidently announce the capability of dynamic tenant addition to a running Opstrace instance.

This blog post is primarily a tutorial, showing:

1. How to generate a new key pair with the Opstrace CLI.
2. How to generate an API token for a specific tenant using the private key of said key pair (also with the help of the Opstrace CLI).
3. How to add the public key of said key pair to a running Opstrace cluster.
4. How to finally add a tenant to a running Opstrace instance via the UI: we will wait for the corresponding API endpoints to become available, and then use the tenant-specific Cortex HTTP API with the previously generated API token.

Along the way, some of the core concepts are explained.

Step-by-step guide

Let's add a new tenant (name: newtenant) to a running Opstrace instance. The name of the existing (running) Opstrace instance in this guide is showdown.

1) Create a new RSA key pair

Create a new RSA key pair and store it in a file with this command:

./opstrace ta-create-keypair ./custom-keypair.pem

Note: The ta- prefix represents the idea of "tenant API authentication." All ta-* commands offered by the Opstrace CLI are new and should be thought of as experimental (command names and signatures are subject to potentially big changes in the future).

After running this command, you have a local file ./custom-keypair.pem in your file system, with locked-down file permissions. It is important to understand that this file contains a secret, the private key.

2) Create an authentication token

The following command creates a new authentication token, signed with the private key of the key pair generated in the first step:

./opstrace ta-create-token showdown newtenant custom-keypair.pem > token-showdown-newtenant.jwt

The token (emitted via stdout and captured in the file token-showdown-newtenant.jwt) is a standards-compliant JSON Web Token (JWT), implementing Opstrace-specific conventions:

in the JWT header, it encodes an ID of the public key via which it can be cryptographically validated.
in the JWT payload section, it encodes the name of the associated tenant.
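If you are curious, you can look at both sections yourself. A JWT consists of three base64url-encoded segments separated by dots (header, payload, signature), so a few lines of shell are enough to print the first two as JSON. A minimal sketch, assuming bash and a base64 tool that understands -d; jwt_segment is a name made up for this post, not an Opstrace CLI command:

```shell
# Hypothetical helper: print the decoded JSON of one JWT segment.
# Usage: jwt_segment <token> <index>   (1 = header, 2 = payload)
jwt_segment() {
  local token="$1" index="$2" seg
  # Pick the requested segment and map the base64url alphabet back to base64.
  seg=$(printf '%s' "$token" | cut -d. -f"$index" | tr '_-' '/+')
  # base64url drops the trailing '=' padding; restore it before decoding.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Example (assumes the token file created above exists):
#   jwt_segment "$(cat token-showdown-newtenant.jwt)" 1   # header
#   jwt_segment "$(cat token-showdown-newtenant.jwt)" 2   # payload
```

Decoding like this only reads the token. It does not verify the signature, so it says nothing about whether an Opstrace instance would accept it.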

For this token to become useful, an Opstrace instance needs to be configured with the public key corresponding to the private key that was used to sign the token. Let's do that.

3) Add the public key to a running Opstrace instance

You can think of this step as adding a new trust anchor to the trust store of a running Opstrace instance. Just like with X.509 certificates, trusting a public key means that authentication proof signed with the corresponding private key is accepted (reminder: in public-key cryptography, verifying authentication proof only ever needs the non-sensitive public key material and never the private key data; that's the beauty).

So, to make an existing Opstrace instance trust the authentication token generated in the previous step, we have to put the public key into the instance:

./opstrace ta-pubkeys-add aws showdown custom-keypair.pem

4) Create a new tenant with the UI

Visit https://showdown.opstrace.io/cluster/tenants and press the Add Tenant button:

Type the name of the new tenant (here: newtenant)

Warning: we do not yet do strict tenant name validation. To make sure things work, please keep the name lower case [a-z] for now.
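Until validation is built in, you can check a candidate name locally before typing it into the UI. The following sketch only encodes the [a-z] recommendation from the warning above; the function name is_safe_tenant_name is hypothetical, and the letters are spelled out so that the check does not depend on your locale.

```shell
# Hypothetical pre-check: succeed only for non-empty names made of a-z.
is_safe_tenant_name() {
  case "$1" in
    ''|*[!abcdefghijklmnopqrstuvwxyz]*) return 1 ;;  # empty, or has another character
    *) return 0 ;;
  esac
}

if is_safe_tenant_name "newtenant"; then
  echo "newtenant: ok"
fi
```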

After adding the tenant, the overview should look like this:

5) Wait for the corresponding API endpoints to become available

After creating the new tenant via the UI, the Opstrace controller running in the Opstrace instance starts a number of new components and initiates a DNS reconfiguration.

Effectively, we're now waiting for the DNS name cortex.newtenant.showdown.opstrace.io to become available.

We can probe that from our point of view with curl:

curl https://cortex.newtenant.showdown.opstrace.io/api/v1/labels

It should take about 5 minutes for DNS name resolution errors to disappear. Next up, expect an HTTP response with status code 401, showing the error message Authorization header missing in the response body.
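Instead of re-running curl by hand, you can let a small loop do the waiting. This is a sketch, assuming bash; wait_for_status and its defaults (60 attempts, 10 seconds apart) are arbitrary choices for illustration.

```shell
# Hypothetical helper: poll a URL until it answers with the expected HTTP status.
# Usage: wait_for_status <url> <expected-status> [attempts] [delay-seconds]
wait_for_status() {
  local url="$1" expected="$2" attempts="${3:-60}" delay="${4:-10}"
  local i=1 status=""
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
    # While DNS resolution still fails, curl prints 000 and exits non-zero.
    status=$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)
    if [ "$status" = "$expected" ]; then
      echo "got HTTP $status after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $attempts attempts (last status: $status)" >&2
  return 1
}

# Example: wait for the unauthenticated probe to be rejected with a 401.
#   wait_for_status https://cortex.newtenant.showdown.opstrace.io/api/v1/labels 401
```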

6) Test using the API, in an authenticated fashion

Let's add said header and make an example API call against the Cortex API for the new tenant:

$ curl -vH "Authorization: Bearer $(cat token-showdown-newtenant.jwt)" \
    https://cortex.newtenant.showdown.opstrace.io/api/v1/labels
...
< HTTP/2 200
...
{"status":"success","data":[]}

Getting a 200 response (and not a 401 response) means: the authentication token provided in the request was accepted. The so-called authenticator in the Opstrace instance extracted the public key ID from the token's header section, found a corresponding public key in its trust store (think: "set of public keys that I am configured to trust"), and then performed a cryptographic verification using that public key. It also confirmed that the tenant name encoded in the token matches the tenant associated with the API endpoint.
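To make the cryptographic verification step less abstract, here is how a signature check of that kind can be reproduced with openssl. This is not the Opstrace authenticator's code. It assumes that the token uses the RS256 algorithm (RSA with SHA-256; this post only states that the key pair is RSA) and that you have the public key as a standalone PEM file. Key lookup by ID and the tenant name comparison are left out, and the names b64url_decode and verify_rs256 are illustrative.

```shell
# Decode base64url data from stdin to raw bytes on stdout.
b64url_decode() {
  local s
  s=$(tr '_-' '/+')
  # Restore the '=' padding that base64url leaves out.
  case $(( ${#s} % 4 )) in
    2) s="${s}==" ;;
    3) s="${s}=" ;;
  esac
  printf '%s' "$s" | base64 -d
}

# Return 0 if the token's signature verifies against the given public key.
# Usage: verify_rs256 <token> <public-key.pem>   (assumes RS256)
verify_rs256() {
  local token="$1" pubkey_pem="$2"
  local signed="${token%.*}"   # "<header>.<payload>": the bytes that were signed
  local sig_file rc
  sig_file=$(mktemp)
  printf '%s' "${token##*.}" | b64url_decode > "$sig_file"
  if printf '%s' "$signed" |
    openssl dgst -sha256 -verify "$pubkey_pem" -signature "$sig_file" >/dev/null 2>&1
  then
    rc=0
  else
    rc=1
  fi
  rm -f "$sig_file"
  return "$rc"
}
```

Note that only the public key file is needed here, which is exactly why the private key never has to enter the Opstrace instance.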

You could now go ahead and configure serious API clients with this authentication token, such as a Prometheus instance that uses remote_write to send data to https://cortex.newtenant.showdown.opstrace.io/api/v1/push.
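For Prometheus, that configuration could look roughly like what the following helper writes. This sketch assumes a Prometheus version that supports the authorization block in remote_write (older versions use bearer_token_file instead); the token path and the helper name write_prometheus_config are placeholders.

```shell
# Hypothetical helper: write a minimal Prometheus remote_write config.
# Usage: write_prometheus_config <push-url> <token-file> <output-file>
write_prometheus_config() {
  local push_url="$1" token_file="$2" out="$3"
  cat > "$out" <<EOF
remote_write:
  - url: ${push_url}
    authorization:
      type: Bearer
      credentials_file: ${token_file}
EOF
}

# Example:
#   write_prometheus_config \
#     https://cortex.newtenant.showdown.opstrace.io/api/v1/push \
#     /path/to/token-showdown-newtenant.jwt \
#     ./prometheus.yml
```

Pointing Prometheus at the token file, rather than pasting the token into the config, keeps the secret out of a file that tends to get copied around.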

Additional notes about token revocation and private key material

For now, authentication tokens created via the process shown here practically never expire. In view of that, a good way to invalidate (revoke) tokens is to remove the corresponding public key (trust anchor) from an Opstrace instance. That can be done with the opstrace ta-pubkeys-remove ... command.

We deliberately decided that private key material does not enter the Opstrace instance. That means that the Opstrace instance itself cannot issue new authentication tokens. But who can? The owner of the private key.

After stepping through this guide, you have an important piece of information sitting on the disk of your local machine: the private key which can be used to craft authentication tokens that will be trusted by the Opstrace instance that holds the corresponding public key.

Having access to this key and having it available on your local machine can imply a lot of responsibility, depending on the security context of the Opstrace instance that now trusts matching tokens.

So, what to do with the private key? In serious environments, you may want to pick one of the following two options:

Delete the private key. Immediately and safely. That means: only the single authentication token derived from it so far is trusted by the Opstrace instance, and no further tokens can ever be generated.
Put the private key somewhere really safe. For example, put it into a password database. Or store it in an appropriate secret store. Annotate it with what its purpose is and how to use it.

After all, managing secrets is a task that needs to be taken seriously by the humans interacting with the secrets; it should always be subject to some kind of organization-dependent process.

Conclusion

An Opstrace instance supports multiple secured tenants to enable logical separation of concerns (for example, tenants for teams or environments). We leverage the generous primitives provided by Cortex and Loki to now dynamically provision tenants in a running Opstrace instance. Because the APIs exposed for each tenant are secure by default, the process requires careful orchestration and consideration. Hopefully, this blog post has detailed this process and its nuances so that you can safely provision new tenants as needed.

Security and privacy are core to what we do. We will continue to invest in these areas and hope you will join us along the way.
