Strongbox: Secrets management
Introduction
When compute and applications are distributed across hundreds or thousands of sites, it is critical that data is protected from security breaches, both when in storage, and when transported over the network. Physical theft is a real possibility.
Secrets such as crypto keys, certificates, and third-party access credentials should not be bundled into application images. Secrets should instead be accessed through fully authenticated and audited APIs that allow secrets to be updated, and crypto keys to remain hidden, without updating software images.
An intrusion at one site must not compromise data at other sites, and when a security breach occurs, it must be easy to isolate and remediate.
Strongbox is a system-wide distributed service in Avassa for managing secrets and policies. Secrets are automatically shared to sites where they are required, and policies are applied across the entire system. Sharing only occurs in one direction, from the management point and outward. Local secrets are not propagated upward to avoid poisoning from a potential breach.
Key Strongbox features include:
- Cryptographic isolation of secrets between tenants and sites, separate keys for each tenant and site
- One-step operation to block a tenant, a site, and a host
- Fine-grained control of how secrets are distributed to sites
- Local secrets storage is sealed until remotely unsealed
- Fully audit-logged access to secrets
- Centralized key management: key rotation, revocation, access
- Encrypt and decrypt services that allow use without access to actual crypto keys
- API for format-preserving encryption/decryption as well as masking of data before logging, storage, and transport
- Fully encrypted and authenticated communication both between sites (inter-site) and between hosts at a site (inter-host)
- Mount secrets as files in the file system of a container
The seal
The Strongbox application consists of a protected core process that handles the plain text data. No data is allowed to leave the core process un-encrypted. The data is protected as long as the memory of the core process cannot be accessed.
The state of the Strongbox application is encrypted using an AES-256-GCM cipher when it leaves the core process.
The key to the Strongbox state cipher (called the sealkey) is not stored locally; it has to be provided by some external means. The process of providing the sealkey is called unsealing. The Strongbox application is unusable until it has been unsealed since it cannot access its internal state.
The sealkey is generated when the site is first initialized. It is presented exactly once and must be stored securely outside the system. The sealkey is split up into five parts, using the Shamir Secret Sharing algorithm. To recover the sealkey at least three of the five parts must be provided. It is recommended that the parts are stored separately from each other.
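The 3-of-5 split described above can be illustrated with a minimal sketch of Shamir Secret Sharing over a prime field. This is not Strongbox's implementation; the field prime, the share encoding, and the function names `split_secret` and `recover_secret` are assumptions made for illustration only.

```python
import secrets

# Assumption: a prime field large enough for a 256-bit key.
# 2**521 - 1 is a Mersenne prime.
PRIME = 2**521 - 1


def split_secret(secret: int, shares: int = 5, threshold: int = 3):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


sealkey = secrets.randbits(256)          # stand-in for a 256-bit sealkey
parts = split_secret(sealkey)            # five parts, stored separately
restored = recover_secret(parts[1:4])    # any three parts are enough
```

Fewer than three parts reveal nothing about the key, which is why storing the parts separately matters.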
The Strongbox application runs on the controller nodes in an Avassa site. The nodes are connected using mutual TLS, i.e., TLS with client and server certificates. The entire site is unsealed as a unit. The individual nodes in the site are automatically unsealed as long as one node is unsealed.
Automatic unseal
The Control Tower must be unsealed by some external entity when restarted. Other sites request remote unseal from their parent site. Each site is globally unsealed. Individual tenants do not have to unseal; a site is either fully sealed or fully unsealed.
Automatic unseal is possible provided that a site has stored its sealkey with its parent site, and it has a token that allows it to access the parent site. The unseal secret is not stored in plain text at the parent, instead it is encrypted using a private-public key pair. The sealkey is encrypted using the public key before it is stored at the parent. In order to unseal, the sealkey needs to be decrypted using the private part of the key. The key-pair for this can be stored in a TPM to further secure the remote auto-unseal process.
When a site is created, a site-specific access token is created for that site. The use of this token is limited to storing and retrieving the unseal secret. The token is split up into X parts, where X is the number of controller nodes in the site, using the Shamir Secret Sharing algorithm. Each controller node in the site gets one part of this token secret.
In order for a site to request remote unseal from its parent site it needs to assemble the access token from the different parts. A majority of controller nodes are needed to recover the access token from the parts. This is to prevent the site from being vulnerable to theft of a single node.
Once the access token has been recovered, it can be used to retrieve the unseal secret from the parent. The unseal secret can then be decrypted using the private key, which makes unseal possible. It is possible to further tighten security by limiting the IP addresses allowed to access the unseal secret at the parent.
In case an edge site cannot reach its parent and request automatic unseal, it is possible to manually unseal the site. This operation is performed by providing the remote sealkey over a local craft interface. The remote sealkey can be accessed by an administrator using an action at the Control Tower.
Tenant secrets
In addition to encrypting the Strongbox state, each tenant's Strongbox data (e.g., stored secrets) is encrypted with a unique AES 256 key, different from the sealkey. The tenant keys are stored in the Strongbox state and only become available once the state has been unsealed.
Each tenant's data is stored separately and encrypted with a key unique to that tenant. The keys are also unique to each site. If an adversary gains access to a key for one tenant at one site, the data for other tenants remains secure, as does the data for the same tenant at other sites.
When the data is transferred between sites it is secured using mutual TLS and, in addition, the data is encrypted using a tenant specific transfer key. This transfer key is shared between all sites that a tenant has access to. In case of a security breach the transfer key is rotated to lock out the compromised site.
Tenant data is not kept in plain text in memory when it isn't needed. It is decrypted when needed for an operation, and then re-encrypted and stored.
Distribution
It is important to be able to control how secrets are distributed to different sites. There are two principles at play. First, distribution only occurs from top to bottom of the site tree. For example, a secret is never shared from an edge site to the Control Tower. Secondly, distribution only occurs when explicitly configured and only to sites that have been assigned to the tenant.
It is possible to configure how secrets should be distributed using the distribute setting. It may be a generic setting of the to leaf, which can have the value all (distribute to all sites a tenant is assigned to), none (do not distribute at all), or inherit (inherit the distribution setting from the parent); a list of deployments; or an explicit list of sites.
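The following sketch shows one way the distribution setting could be resolved into a concrete set of target sites. The dictionary layout (`to`, `deployments`, `sites` keys), the `deployment_sites` lookup table, and the function name `resolve_sites` are hypothetical and only illustrate the rules described above; they do not reflect the actual Strongbox data model.

```python
def resolve_sites(setting, assigned_sites, parent_setting=None, deployment_sites=None):
    """Return the set of sites a secret should be distributed to.

    setting          -- hypothetical dict: {"to": "all"|"none"|"inherit"},
                        {"deployments": [...]} or {"sites": [...]}
    assigned_sites   -- sites assigned to the tenant
    parent_setting   -- setting to fall back on when "to" is "inherit"
    deployment_sites -- hypothetical map: deployment name -> sites it runs on
    """
    deployment_sites = deployment_sites or {}
    assigned = set(assigned_sites)

    if "sites" in setting:
        targets = set(setting["sites"])
    elif "deployments" in setting:
        targets = set()
        for name in setting["deployments"]:
            targets |= set(deployment_sites.get(name, ()))
    else:
        to = setting["to"]
        if to == "all":
            targets = set(assigned)
        elif to == "none":
            targets = set()
        elif to == "inherit":
            if parent_setting is None:
                targets = set()  # assumption: nothing to inherit means no distribution
            else:
                return resolve_sites(parent_setting, assigned_sites, None, deployment_sites)
        else:
            raise ValueError(f"unknown distribution value: {to!r}")

    # Secrets are only ever distributed to sites assigned to the tenant.
    return targets & assigned


tenant_sites = ["stockholm", "berlin", "paris"]
explicit = resolve_sites({"sites": ["berlin", "oslo"]}, tenant_sites)
inherited = resolve_sites({"to": "inherit"}, tenant_sites, parent_setting={"to": "all"})
```

Intersecting with the tenant's assigned sites at the end keeps an explicit site list from reaching a site the tenant has no access to.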
When a site is initially started, or transitions from being blocked to becoming unblocked, it retrieves its initial state from its parent. This initial state contains all data that should be distributed to the site and becomes a starting point for incremental updates.
When a secret is modified (created, updated, or deleted) the changes are distributed according to the distribution setting for the secret. A minimal diff is calculated and sent downstream to the sites that should receive it. The diff is encrypted using the tenant's transfer key.
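As an illustration of the incremental update idea, the sketch below computes a minimal diff between two versions of a key/value dictionary and applies it on the receiving side. The diff layout (`set` and `delete`) and the function names are assumptions; the actual wire format is not described here, and encryption with the transfer key is left out.

```python
_MISSING = object()  # sentinel so that a stored value of None is handled correctly


def minimal_diff(old: dict, new: dict) -> dict:
    """Describe only what changed between `old` and `new`."""
    changed = {k: v for k, v in new.items() if old.get(k, _MISSING) != v}
    removed = sorted(k for k in old if k not in new)
    return {"set": changed, "delete": removed}


def apply_diff(state: dict, diff: dict) -> dict:
    """Apply a diff produced by minimal_diff to a copy of `state`."""
    result = dict(state)
    result.update(diff["set"])
    for key in diff["delete"]:
        result.pop(key, None)
    return result


upstream_old = {"user": "svc", "password": "old", "region": "eu"}
upstream_new = {"user": "svc", "password": "new"}

diff = minimal_diff(upstream_old, upstream_new)
downstream = apply_diff(upstream_old, diff)
```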
Audit logs
Audit logs are provided for all operations performed by a tenant. The log includes the access token, if provided, the operation, and any parameters provided.
However, all sensitive data has been hashed, using a tenant specific HMAC, before being logged.
To search for some specific sensitive data in the logs, for example, operations performed using a specific access token, a plain text version of the data is hashed in the same way. The audit log HMAC function can be accessed using the strongbox/audit/hmac action.
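The search procedure can be sketched as follows: hash the plain text value with the same keyed function and compare against the hashed values in the log. HMAC-SHA256 with hex output, the log entry layout, and the function names are assumptions for illustration; in practice the hash would be obtained from the strongbox/audit/hmac action rather than computed locally.

```python
import hashlib
import hmac


def audit_hmac(tenant_key: bytes, value: str) -> str:
    """Keyed hash of a sensitive value (assumed: HMAC-SHA256, hex encoded)."""
    return hmac.new(tenant_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def find_by_token(entries, tenant_key: bytes, token: str):
    """Return the log entries whose hashed token matches the plain text token."""
    wanted = audit_hmac(tenant_key, token)
    return [e for e in entries if hmac.compare_digest(e["token"], wanted)]


key = b"hypothetical-tenant-hmac-key"
log = [
    {"op": "read", "token": audit_hmac(key, "token-a")},
    {"op": "write", "token": audit_hmac(key, "token-b")},
    {"op": "delete", "token": audit_hmac(key, "token-a")},
]
matches = find_by_token(log, key, "token-a")
```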
Audit logs are streamed upwards from local sites to aggregation sites higher up in the tree. This allows audit logs to be inspected even if a site is compromised.
Core functionality
Strongbox provides a number of services available through REST APIs:
- Vault - Key/Value Store
- Crypto Functions (encryption, signing, hmac, etc)
- Transformation
- SSH CA
- SSL CA
- One Time Passwords
Most of these services have some state. That state is encrypted with the tenant specific secret (as described above), and is handled by a separate process for each tenant.
Vault - key/value store
Vault is an encrypted key-value store. The user may have multiple vaults, where each vault may have different settings in terms of how it should be distributed among sites.
Each vault, in turn, may have multiple secrets, where each secret stores a separate dictionary of keys and values. These dictionaries are treated as an atomic unit, i.e., they are read, written, updated, and deleted as a unit.
It is possible to mount a vault dictionary as a file image provided the mounting container's hash has been added to the allowed-image-access list of the secret.
Auto-mount
Vaults can be auto mounted when an application is started in two different ways.
- As files in a volume
- As environment variables
When mounting as files, a specific secret in a vault is mounted as a volume with files named after the keys in the secret. The content of each file is the value associated with that key.
When mounting as a variable, a vault, a secret, and a specific key have to be specified.
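The two mount styles can be pictured with the sketch below, which materializes a secret dictionary as files in a directory and exposes a single key as an environment variable. This only mimics the outcome an application would observe; the function names and the variable name are hypothetical, and the real mounting is done by the platform, not by application code.

```python
import tempfile
from pathlib import Path


def mount_as_files(secret: dict, directory) -> list:
    """Write one file per key; the file content is the associated value."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    written = []
    for key, value in secret.items():
        if "/" in key or key in ("", ".", ".."):
            raise ValueError(f"key cannot be used as a file name: {key!r}")
        path = directory / key
        path.write_text(value, encoding="utf-8")
        written.append(path)
    return written


def mount_as_variable(secret: dict, key: str, variable: str, env: dict) -> None:
    """Expose the value of a single key under the given variable name."""
    env[variable] = secret[key]


secret = {"username": "svc-db", "password": "s3cret"}

with tempfile.TemporaryDirectory() as volume:
    files = mount_as_files(secret, volume)
    names = sorted(p.name for p in files)
    password_on_disk = (Path(volume) / "password").read_text(encoding="utf-8")

container_env = {}
mount_as_variable(secret, "password", "DB_PASSWORD", container_env)
```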
Versioned vault - key/value store
There is a versioned variant of Vault that keeps a history of old values. It is possible to retrieve an old value of a key. There is a configurable maximum number of versions to keep at any given time. Old versions are removed once the max-versions threshold has been reached.
Cooperative locking can be achieved by requiring that the old version value is supplied when storing a new version. If the version does not match the stored value then the storage operation will be rejected.
Versions can be deleted, and later un-deleted. If the latest version is deleted then a read of the secret will return the latest live version.
Using the PATCH or merge operation on a versioned secret will result in a new version based on the previous version.
Writing a new value will result in a new version of the secret.
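The behaviour described in this section (a bounded history, cooperative locking on the version, delete and un-delete, and patch) can be modelled in a few lines. The class below is an in-memory sketch under assumed semantics, not the Strongbox API; the names `VersionedSecret`, `expected_version`, and the use of `ValueError` for a rejected write are all hypothetical.

```python
class VersionedSecret:
    """In-memory model of a versioned key/value secret."""

    def __init__(self, max_versions: int = 3):
        self.max_versions = max_versions
        self.versions = {}  # version number -> {"data": dict, "deleted": bool}
        self.latest = 0

    def write(self, data: dict, expected_version=None) -> int:
        # Cooperative locking: reject the write if the caller's view is stale.
        if expected_version is not None and expected_version != self.latest:
            raise ValueError("version mismatch, write rejected")
        self.latest += 1
        self.versions[self.latest] = {"data": dict(data), "deleted": False}
        # Drop the oldest versions once the max-versions threshold is passed.
        for old in [v for v in self.versions if v <= self.latest - self.max_versions]:
            del self.versions[old]
        return self.latest

    def patch(self, changes: dict, expected_version=None) -> int:
        # A patch creates a new version based on the previous one.
        base = self.versions[self.latest]["data"]
        return self.write({**base, **changes}, expected_version)

    def read(self):
        # Return the newest version that has not been deleted.
        for version in sorted(self.versions, reverse=True):
            if not self.versions[version]["deleted"]:
                return version, dict(self.versions[version]["data"])
        raise KeyError("no live version")

    def delete(self, version: int) -> None:
        self.versions[version]["deleted"] = True

    def undelete(self, version: int) -> None:
        self.versions[version]["deleted"] = False


secret = VersionedSecret(max_versions=2)
secret.write({"password": "one"})                      # version 1
secret.write({"password": "two"}, expected_version=1)  # version 2
secret.patch({"user": "svc"})                          # version 3, version 1 is dropped
```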
Crypto functions and transit keys
It is important to avoid including keys and other sensitive information in the applications that are distributed and instead access these functions through a fully access controlled API.
Strongbox provides a number of cryptographic functions under the transit path, many of which require some form of encryption key. The state associated with these functions is always encrypted when stored and distributed.
When a transit instance is created it is possible to import an existing key, or to generate a new one. Depending on the selected cipher type different operations are supported: encryption, decryption, signing, signature verification, key derivation, and convergent encryption. The supported ciphers are:
- aes128-gcm96, aes192-gcm96, aes256-gcm96. Supporting:
  - encryption
  - decryption
  - key derivation
  - additional auth data
  - convergent encryption
- chacha20-poly1305. Supporting:
  - encryption
  - decryption
  - key derivation
  - additional auth data
  - convergent encryption
- ed25519. Supporting:
  - signing
  - signature verification
  - key derivation
- ecdsa-p256, ecdsa-p384, ecdsa-p521. Supporting:
  - signing
  - signature verification
- rsa-2048, rsa-3072, rsa-4096. Supporting:
  - encryption
  - decryption
  - signing
  - signature verification
It is possible to keep a number of versions of cipher keys at the same time. Each ciphertext is tagged with the key version used to encrypt the data. This makes it possible to smoothly phase new versions of a key in and old versions out, i.e., to perform key rotation.
It is possible to specify a minimal version for both encryption and decryption. It might be desirable to phase out an old key by increasing the minimal encryption version, while keeping the minimal decryption version until all data has been migrated to the new version, or become irrelevant.
By default, all versions of a key are kept. The only way to remove keys is by using the trim operation. It allows all keys, up to a given version, to be removed. The specified version must be less than the minimal encryption/decryption versions.
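The interaction between key versions, the two minimal versions, and trim can be modelled as below. This is a sketch of the rules as described, not the transit implementation; the class and attribute names are hypothetical, no key material is handled, and the sketch assumes that "up to a given version" is inclusive.

```python
class KeyVersions:
    """Tracks which key versions exist and which may still be used."""

    def __init__(self):
        self.versions = [1]
        self.min_encryption_version = 1
        self.min_decryption_version = 1

    def rotate(self) -> int:
        """Add a new key version and return its number."""
        self.versions.append(self.versions[-1] + 1)
        return self.versions[-1]

    def can_encrypt(self, version: int) -> bool:
        return version in self.versions and version >= self.min_encryption_version

    def can_decrypt(self, version: int) -> bool:
        return version in self.versions and version >= self.min_decryption_version

    def trim(self, up_to: int) -> None:
        """Remove all versions up to and including `up_to` (assumed inclusive)."""
        if up_to >= min(self.min_encryption_version, self.min_decryption_version):
            raise ValueError("trim version must be below the minimal versions")
        self.versions = [v for v in self.versions if v > up_to]


keys = KeyVersions()
keys.rotate()                     # version 2
keys.rotate()                     # version 3
keys.min_encryption_version = 3   # new data must use the newest key
keys.min_decryption_version = 2   # data encrypted with version 2 is still readable
keys.trim(1)                      # version 1 is no longer needed
```

Raising the minimal decryption version only after all data has been re-encrypted avoids locking out data that still depends on an old key.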
Encryption
The encryption service allows a program to encrypt and decrypt data without having the encryption key in clear text in the program. By default the newest available key version is used when encrypting, but it is also possible to specify an earlier key version.
The resulting cipher text includes information about which key version was used to encrypt the data.
Encrypted data is in the format sbox:v<KeyVersion>:<Data>
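A client that needs to know which key version a stored value was encrypted with can split the cipher text on this format. The sketch below assumes that the version is a decimal number and that the data part is non-empty; the function name is hypothetical.

```python
import re

# sbox:v<KeyVersion>:<Data>
_CIPHERTEXT = re.compile(r"sbox:v(\d+):(.+)", re.DOTALL)


def parse_ciphertext(text: str):
    """Return (key_version, data) for a cipher text in the sbox format."""
    match = _CIPHERTEXT.fullmatch(text)
    if match is None:
        raise ValueError("not a Strongbox cipher text")
    return int(match.group(1)), match.group(2)


version, data = parse_ciphertext("sbox:v3:QUJD")
```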
There is a specific API operation to re-encrypt data using a new key, latest if not explicitly specified. This operation does not return the data in plaintext and can thus be delegated to non-privileged users.
All data that is encrypted is expected to be base64 encoded, and the result of decrypting a cipher text is base64 encoded plain text.
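In practice this means a caller encodes the plain text before sending it for encryption and decodes the result after decryption. A minimal sketch of the two helper steps, using only the standard library (the function names are hypothetical):

```python
import base64


def encode_plaintext(data: bytes) -> str:
    """Base64 encode raw bytes before passing them to the encryption API."""
    return base64.b64encode(data).decode("ascii")


def decode_plaintext(text: str) -> bytes:
    """Decode the base64 plain text returned by the decryption API."""
    return base64.b64decode(text, validate=True)


encoded = encode_plaintext(b"hello")   # "aGVsbG8="
decoded = decode_plaintext(encoded)    # b"hello"
```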
Signatures
Some key types, see above, can be used for signing and verifying signatures. A base64 encoded signature is created using the transit sign operation.
Derived keys
If the derived option is set for a key then the bcrypt pbkdf2 algorithm is used to calculate the key using the secret key component together with the provided context. This makes it possible to have a large number of keys without using more than one key definition (and without using any extra space). They are all rotated at the same time.
If the convergent_encryption option is also set then the IV (nonce) will also be calculated using the same bcrypt pbkdf2 algorithm with the key and context as initial input. The result is that the same plaintext input will always result in the same cipher text. This is useful when it is desirable to be able to compare values without decoding them.
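The idea of deriving many keys from one key definition can be sketched with a standard key derivation function. Note that the example uses PBKDF2-HMAC-SHA256 from the Python standard library as a stand-in, not the bcrypt pbkdf2 algorithm that Strongbox uses; the iteration count, the `key:` and `nonce:` labels, and the function names are assumptions for illustration.

```python
import hashlib


def derive_key(master_key: bytes, context: bytes, length: int = 32) -> bytes:
    """Derive a per-context key from the secret key component."""
    # Stand-in KDF: PBKDF2-HMAC-SHA256 with the context as salt.
    return hashlib.pbkdf2_hmac("sha256", master_key, b"key:" + context, 1000, dklen=length)


def derive_nonce(master_key: bytes, context: bytes, length: int = 12) -> bytes:
    """Derive a deterministic nonce from the same inputs (convergent encryption)."""
    return hashlib.pbkdf2_hmac("sha256", master_key, b"nonce:" + context, 1000, dklen=length)


master = b"hypothetical-master-key-material"

key_alice = derive_key(master, b"customer/alice")
key_bob = derive_key(master, b"customer/bob")
nonce_alice = derive_nonce(master, b"customer/alice")
```

Because both the key and the nonce depend only on the master key and the context, encrypting the same plain text under the same context gives the same cipher text, which is what makes comparison without decryption possible.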
Export, backup and restore
When large amounts of data are to be encoded or decoded, it is not recommended to use the provided APIs. Instead, the key should be configured as exportable (once this setting has been enabled it cannot be revoked), and the application should "check out" the key and use it for bulk encryption and decryption.
Provided that the key has been configured with allow_plaintext_backup, a plaintext representation of the entire key state can be extracted using transit backup. This is a base64 encoded internal representation of the state. This state can then be restored using transit restore, possibly under a different key name.
Generate data keys
It is also possible to use Strongbox to generate data keys. They are optionally returned in both plaintext (base64 encoded), and wrapped (encrypted using the indicated transit key). These keys are not stored.
Hash and HMAC
It is possible to use Strongbox to calculate a hash of some data using a specified algorithm. Supported algorithms are: sha (sha1, not recommended), sha224, sha256 (default), sha384, sha512, sha3_224, sha3_256, sha3_384, and sha3_512. The result can be returned in two different formats: base64 and hex.
Also, all key types can be used for calculating an HMAC together with an HMAC algorithm. Supported algorithms are sha (sha1, not recommended), sha224, sha256 (default), sha384, sha512, sha3_224, sha3_256, sha3_384, and sha3_512.
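For reference, the same hash and HMAC algorithms are available in the Python standard library, which can be handy for checking results locally. The sketch below maps the algorithm names listed above to `hashlib` names and supports both output formats; the function name and argument names are hypothetical, and this is not a call to the Strongbox API.

```python
import base64
import hashlib
import hmac

# Algorithm names as listed above, mapped to hashlib names.
ALGORITHMS = {
    "sha": "sha1", "sha224": "sha224", "sha256": "sha256",
    "sha384": "sha384", "sha512": "sha512",
    "sha3_224": "sha3_224", "sha3_256": "sha3_256",
    "sha3_384": "sha3_384", "sha3_512": "sha3_512",
}


def digest(data: bytes, algorithm: str = "sha256", output: str = "base64", key=None) -> str:
    """Hash `data`, or compute an HMAC when `key` is given."""
    name = ALGORITHMS[algorithm]
    if key is None:
        raw = hashlib.new(name, data).digest()
    else:
        raw = hmac.new(key, data, name).digest()
    if output == "hex":
        return raw.hex()
    if output == "base64":
        return base64.b64encode(raw).decode("ascii")
    raise ValueError(f"unknown output format: {output!r}")


empty_sha256 = digest(b"", "sha256", "hex")
tag = digest(b"message", "sha3_256", "base64", key=b"hypothetical-key")
```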
BCrypt
There is also an action for hashing passwords using the bcrypt password hashing function (version 2y), as well as an action for verifying bcrypt encoded passwords.
Transformation
It is sometimes desirable to hide data (mask or encrypt) in such a way that the original format of the data is preserved. For example, it might be desirable to protect a credit card number by encrypting it, while still being able to provide it to a sub-system that expects a value in the format of a credit card number.
There is a class of encryption algorithms that supports this called Format Preserving Encryption (FPE) algorithms. We use an Erlang implementation by Guilherme Andrade called erlffx of the algorithm described in the 2010 paper The FFX Mode of Operation for Format-Preserving Encryption by Bellare, Rogaway and Spies.
Data that has been encrypted can be restored by decryption. It is also possible to mask the data using masking. Masked data cannot be restored; however, this is sometimes desirable as well, for example, when logging.
Transformation setup
A new transform service is configured on a given path with a specified role name, together with parameters for:
Parameter | Default | Description |
---|---|---|
key-length | 16 | length of AES key used to perform encryption |
type | fpe | fpe or masking |
tweak-source | calculated | one of supplied and generated |
masking-character | * | character to use for masking text |
template | (no default) | creditcardnumber or a template specification |
A template consists of:
Parameter | Description |
---|---|
alphabet | One of numeric, alphalower, alphaupper, alphanumericlower, alphanumericupper, alphanumeric, or a custom alphabet specified as a binary of all alphabet elements. |
pattern | a regular expression pattern where each group is subject to encryption, for example \([a-zA-Z0-9]+\)-\([a-zA-Z0-9]+\) |
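To make the role of the pattern concrete, the sketch below masks every capture group of a pattern while leaving the rest of the value untouched. It illustrates the masking type only and is not the Strongbox implementation; the function name is hypothetical. Python writes groups as `(...)` rather than `\(...\)`, so the example pattern from the table is adapted accordingly.

```python
import re


def mask(value: str, pattern: str, masking_character: str = "*") -> str:
    """Replace the characters of every matched group with the masking character."""
    match = re.fullmatch(pattern, value)
    if match is None:
        raise ValueError("value does not conform to the pattern")
    chars = list(value)
    for index in range(1, match.re.groups + 1):
        start, end = match.span(index)
        if start == -1:          # optional group that did not participate
            continue
        chars[start:end] = masking_character * (end - start)
    return "".join(chars)


both_groups = mask("abc123-XYZ9", r"([a-zA-Z0-9]+)-([a-zA-Z0-9]+)")
middle_only = mask("1234-5678-9012", r"\d{4}-(\d{4})-\d{4}", masking_character="#")
```

Text outside the groups, such as the separator, is preserved, so the masked value keeps the shape of the original.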
The tweak is some data used together with the encrypted data as a type of salt. Typically the surrounding text is used as tweak data. If you, for example, want to encrypt the middle four digits of a larger number, it would be possible to guess the content of the middle digits by comparing different encrypted strings. However, if the surrounding text is used as the tweak when encrypting the middle digits, then the same digits are no longer encrypted to the same string, and become more difficult to guess.
The tweak character can be automatically calculated or provided for each encryption/decryption invocation. Note that the same tweak text must be supplied when decrypting as was supplied when encrypting.
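The effect of a tweak can be demonstrated with a toy format-preserving cipher for digit strings. The sketch below is a plain balanced Feistel network with HMAC-SHA256 as round function. It is not the FFX algorithm, not erlffx, and not suitable for protecting real data; it only shows that the output keeps the digit format and that the same tweak is needed to decrypt. All names and the round count are assumptions.

```python
import hashlib
import hmac

ROUNDS = 10  # arbitrary choice for the toy example


def _round_value(key: bytes, tweak: bytes, round_no: int, half: str, width: int) -> int:
    """Keyed, tweak-dependent round function reduced to `width` decimal digits."""
    message = bytes([round_no]) + len(tweak).to_bytes(2, "big") + tweak + half.encode("ascii")
    mac = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(mac, "big") % (10 ** width)


def _split(digits: str):
    if not digits.isdigit() or len(digits) % 2:
        raise ValueError("expected an even number of decimal digits")
    width = len(digits) // 2
    return digits[:width], digits[width:], width


def toy_encrypt(key: bytes, tweak: bytes, digits: str) -> str:
    left, right, width = _split(digits)
    for round_no in range(ROUNDS):
        mixed = (int(left) + _round_value(key, tweak, round_no, right, width)) % (10 ** width)
        left, right = right, str(mixed).zfill(width)
    return left + right


def toy_decrypt(key: bytes, tweak: bytes, digits: str) -> str:
    left, right, width = _split(digits)
    for round_no in reversed(range(ROUNDS)):
        # Undo one round: the current left half was the previous right half.
        previous_right = left
        previous_left = (int(right) - _round_value(key, tweak, round_no, previous_right, width)) % (10 ** width)
        left, right = str(previous_left).zfill(width), previous_right
    return left + right


key = b"hypothetical-transform-key"
# Encrypt the middle four digits; the surrounding digits serve as tweak.
number = "1234567890123456"
prefix, middle, suffix = number[:6], number[6:10], number[10:]
tweak = (prefix + suffix).encode("ascii")

protected = prefix + toy_encrypt(key, tweak, middle) + suffix
restored = toy_decrypt(key, tweak, protected[6:10])
```

Since the round function depends on the tweak, the same four digits embedded in a different surrounding number will in general encrypt to a different value.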
Encryption/decryption
Data can be encrypted or masked using the transform encrypt command. It has to conform to the pattern and alphabet specified for the transform. A tweak can optionally be specified, or automatically derived from the surrounding text.
Decryption is only possible if the value was encrypted using fpe and not masking. The original value will be restored as long as the same tweak is supplied.
SSH
Strongbox can function as an SSH CA and generate both host (server side) credentials and client credentials. Both SSH keys and SSH certificates can be generated.
It can also generate OTPs that can be used by a client to log on to a service, and by a server to authenticate a user.
SSH setup
When an SSH CA service is created it can be configured to generate a signing key, or a signing key can be supplied. Supported key types are rsa (key length 1024, 2048, 3072, and 4096), ed25519, and ecdsa (curves nistp256, nistp384, and nistp521). The public key is available as state.
SSH host certificates
Host certificates are used to identify a host to a client. Often the public key is used to identify the host. This is a security risk since most users tend to just accept the public key that the host provides and add it to their known_hosts file. A much better way is to use a host certificate. The client then adds an entry in the .ssh/known_hosts file to indicate that it trusts all certificates signed by a given CA. An entry in the client's .ssh/known_hosts file may look like this:
@cert-authority tio.avassa.io ecdsa-sha2-nistp256 AAAAE2VjZHN...
The host needs to configure the host certificate using the HostCertificate setting in /etc/ssh/sshd_config. For example in OpenSSH:
HostCertificate /etc/ssh/host_id_ecdsa-cert.pub
Client certificates
SSH keys may be used to facilitate authentication without passwords towards an SSH host. A problem with this is that clients need to install their public keys on all hosts they want to access. It is difficult to add keys to all hosts and, more importantly, when a client's access is revoked all entries need to be removed on all hosts.
A better solution is to use SSH client certificates. The client provides its public key to the CA, which signs it and returns a signed certificate with a limited validity time. The client can use this certificate to authenticate towards a host as long as the certificate has not expired.
The host needs to be configured to trust certificates signed by the CA. In OpenSSH this is done using the TrustedUserCAKeys setting. For example:
TrustedUserCAKeys /etc/ssh/ca.pub
The Strongbox CA can sign and generate SSH certificates. When generating an SSH certificate both the private and the public key will be generated, and the public key will be signed.
SSH one time passwords
As an alternative to SSH client certificates Strongbox can be used for generating and verifying OTPs. An OTP is issued for a specific user and a specific IP address. The OTP can be validated exactly once.
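The properties of such an OTP (bound to one user and one IP address, usable once) can be captured in a small model. The sketch below is an in-memory illustration, not the Strongbox OTP service; the class and method names are hypothetical, and it assumes that any validation attempt consumes the OTP, even a failed one.

```python
import secrets


class OtpStore:
    """Issues OTPs bound to a user and an IP address; each validates at most once."""

    def __init__(self):
        self._pending = {}  # otp -> (user, ip)

    def issue(self, user: str, ip: str) -> str:
        otp = secrets.token_urlsafe(16)
        self._pending[otp] = (user, ip)
        return otp

    def verify(self, otp: str, user: str, ip: str) -> bool:
        # Assumption: the OTP is removed on the first attempt, valid or not.
        binding = self._pending.pop(otp, None)
        return binding == (user, ip)


store = OtpStore()
otp = store.issue("alice", "192.0.2.10")
first = store.verify(otp, "alice", "192.0.2.10")    # accepted
second = store.verify(otp, "alice", "192.0.2.10")   # rejected, already used

other = store.issue("alice", "192.0.2.10")
wrong_ip = store.verify(other, "alice", "198.51.100.7")  # rejected, wrong address
```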
To verify an OTP the server is configured with a PAM module that invokes a program that performs the verification towards Strongbox. This may look like:
auth requisite pam_exec.so quiet expose_authtok log=/tmp/sboxssh.log /usr/local/bin/sbox-ssh-helper -dev -config=/etc/sbox-ssh-helper.d/config.hcl
auth optional pam_unix.so not_set_pass use_first_pass nodelay
The sshd_config file is modified to use PAM, i.e.,
ChallengeResponseAuthentication yes
PasswordAuthentication no
UsePAM yes
Roles
Different roles are configured for issuing certificates and OTPs. Each role instance can be configured with different limitations for the certificates and OTPs it can generate.
It is a good idea to create one role per user you want to issue OTPs for.
TLS
Strongbox can be set up to function as an SSL/TLS CA, either with a self-signed root certificate, or with a provided SSL/TLS CA certificate.
This functionality is primarily intended for securing communication between applications using, for example, mutual TLS, but it can also be used to secure Web traffic.
The certificates can be configured to be automatically rotated when they are about to expire.
Intermediate CA certificates
The Strongbox CA can issue intermediate CA certificates, which helps with setting up a distributed trust scheme.
Client and server certificates
Certificates are signed and issued by different TLS roles. Different roles can be configured to be allowed to issue certificates with different restrictions such as allow-client-certificates, allow-server-certificates, allowed-hosts, allowed-domains, ttl, allow-subdomains, etc.
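How a role could restrict what it issues is sketched below. The field names mirror some of the settings listed above, but the exact semantics (for example, treating ttl as an upper bound in seconds and matching subdomains by suffix) are assumptions made for this illustration, and `TlsRole` and `may_issue` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class TlsRole:
    allowed_domains: list = field(default_factory=list)
    allow_subdomains: bool = False
    allow_client_certificates: bool = True
    allow_server_certificates: bool = True
    ttl_seconds: int = 24 * 3600  # assumed: maximum lifetime a certificate may request


def may_issue(role: TlsRole, host: str, kind: str, ttl_seconds: int) -> bool:
    """Check a certificate request against the restrictions of a role."""
    if kind == "client":
        if not role.allow_client_certificates:
            return False
    elif kind == "server":
        if not role.allow_server_certificates:
            return False
    else:
        return False
    if ttl_seconds > role.ttl_seconds:
        return False
    host = host.lower().rstrip(".")
    for domain in role.allowed_domains:
        domain = domain.lower()
        if host == domain:
            return True
        if role.allow_subdomains and host.endswith("." + domain):
            return True
    return False


role = TlsRole(allowed_domains=["example.com"], allow_subdomains=True,
               allow_client_certificates=False, ttl_seconds=3600)

ok = may_issue(role, "api.example.com", "server", 1800)
too_long = may_issue(role, "api.example.com", "server", 7200)
client = may_issue(role, "api.example.com", "client", 1800)
lookalike = may_issue(role, "badexample.com", "server", 1800)
```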
The CA can create and sign RSA (1024, 2048, 3072, and 4096) and ECDSA (secp256r1, secp384r1, and secp521r1) certificates. Certificates can be created from a provided public key, or a proper key pair can be generated.
Revocation lists
The CA functionality can also keep track of revoked certificates, and generate properly signed revocation lists on demand.