Encryption at Rest: How We Designed and Built It in Notary
When released, Notary will be an open-source TLS certificate management tool that provides a secure, reliable, and simple way to manage X.509 certificates for applications and services.
It will support two workflows: uploading certificates signed by an external certificate authority (CA), or signing certificates directly in Notary, with Notary acting as the CA. Users will be able to upload certificate signing requests (CSRs), upload or sign certificates, and download them. Certificates, private keys, and user data are stored in Notary’s database. Some of this data is sensitive, such as private keys and user passwords, and in some cases even usernames may be considered confidential.
Notary is intended to be deployed in enterprise environments where organizations use it to manage certificates internally. One of the main security requirements enterprises have is that sensitive data must be protected at all times, both in transit and at rest, which in practice means it must be encrypted. This matters not only for preventing unauthorized access but also for complying with regulations. Notary can be configured to serve its API over HTTPS, so data is encrypted in transit. For secure data storage, we implemented encryption at rest. This post walks you through the steps we took to design and build encryption at rest in Notary.
Why Encryption at Rest Matters
Say a malicious actor gets temporary access to the database and is able to leak parts of it. Those leaked parts might include sensitive data that can be used in harmful ways, for example, a private key used by a CA to sign certificates, or the password of the admin’s account. Insider threats or accidental leaks pose similar risks, making encryption at rest a critical safeguard.
When approaching the problem of encryption at rest, there are two main questions to answer:
- How to encrypt data?
- How to protect the encryption key?
The first question is simpler. We must encrypt data at the time it is created and store only encrypted data in the database. When we need to read that data, we decrypt it and use the raw content. For this, we use one encryption key for both encryption and decryption. This is called symmetric encryption because the same key is used for both operations, unlike asymmetric encryption, which uses two keys: one for encryption and another for decryption. Asymmetric encryption is typically used when multiple parties are involved, and the message is encrypted in one place then decrypted elsewhere. For example, asymmetric encryption is often used to exchange keys securely or in digital signatures. But for local encryption and decryption, symmetric encryption is more suitable.
Choosing the Right Encryption Algorithm
There are many algorithms for symmetric encryption. Here, we highlight the key characteristics of symmetric encryption algorithms that influenced our design decisions in Notary.
A widely adopted, modern choice is the Advanced Encryption Standard (AES) with a 256-bit key in Galois/Counter Mode (GCM), usually written AES-256-GCM. It is widely used in protocols such as TLS.
This means we use a 256-bit key with the AES-GCM algorithm to encrypt and decrypt data. AES-GCM is efficient and addresses problems found in older AES modes of operation such as Electronic Codebook (ECB) and Cipher Block Chaining (CBC).
AES-ECB is the most straightforward mode. It splits the message into 16-byte blocks and encrypts each block independently with the same key. Decryption reverses the operation block by block. A major flaw in this approach is that it reveals patterns: identical plaintext blocks always produce identical ciphertext blocks. A well-known demonstration of this issue is the ECB penguin, an image that stays recognizable after ECB encryption because its repeated patterns survive in the ciphertext.
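AES-CBC solves the pattern leakage problem by introducing randomness through an initialization vector (IV). The IV is XORed with the first plaintext block before that block is encrypted, and each subsequent plaintext block is XORed with the previous ciphertext block, creating a chained effect. The IV is not secret and is typically stored alongside the ciphertext so it is available for decryption.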
However, problems remain with AES-CBC. Encryption is comparatively slow because each block depends on the previous ciphertext block, so blocks must be encrypted one after another and the work cannot be parallelized.
Another issue is authenticity. There’s no built-in way to verify if a ciphertext has been tampered with. For instance, if the original message was an email address and the ciphertext is altered, decrypting it will yield gibberish, making the tampering obvious. But in other cases, such as encrypting tokens or keys (which already appear random), a tampered ciphertext might still look valid after decryption, even though it isn’t. AES-CBC can be combined with a separate Hash-based Message Authentication Code (HMAC) for authentication, but this adds extra complexity.
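AES-GCM solves both issues. It encrypts in counter mode: a unique nonce combined with a block counter is used to generate a keystream, so blocks can be processed in parallel. It also produces an authentication tag alongside the ciphertext. If the ciphertext is altered, the tag check fails and decryption returns an error instead of producing incorrect output. The one strict requirement is that a nonce must never be reused with the same key.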
Protecting the Encryption Key
Now that we’ve encrypted our data with AES-GCM, we need to protect the encryption key itself. We must decide where to store it, how to protect it, and how to load it into Notary so it can be used for cryptographic operations.
We can either store the key in the same database or externally.
If we store it unencrypted in the same database, encryption still helps against partial leaks that miss the key, but anyone with full database access can retrieve the key and decrypt everything. So if we store it in the same database, we should encrypt it. But then, how do we protect the key that was used to encrypt our primary key?
Alternatively, we can store it in an external system and load it into Notary during startup. This keeps the key and the encrypted data in different locations, reducing the attack surface and delegating key management to the external system. The external backend must be able to store the key and either securely transfer it to Notary or provide an interface to encrypt and decrypt data. Such backends include managed secrets stores like HashiCorp Vault or cloud Key Management Services (KMS).
The downside is that if the backend is compromised and we need to generate a new key, we would have to decrypt all existing data and re-encrypt it, making key rotation expensive and complex.
We chose to combine the benefits of both approaches. We store the data encryption key in the database, but encrypted using a second key: the Key Encryption Key (KEK). Management of the KEK is offloaded to an external encryption backend.
We implemented two encryption backends: HashiCorp Vault and Hardware Security Modules (HSMs). On startup, Notary checks for its data encryption key. If it exists in the database, it uses the encryption backend to decrypt it. Otherwise, it generates a new key and uses the backend to encrypt it. Rotating the root encryption key (the KEK) then only requires re-encrypting Notary’s encryption key.
This design ensures the encryption backend is only needed at startup. The root key lives outside the database, and rotation becomes straightforward.
Hardware Security Modules (HSMs)
A Hardware Security Module (HSM) is a physical device, sometimes as small as a USB stick, that performs cryptographic operations without exposing the keys it holds. Most HSMs can be accessed through Public-Key Cryptography Standards #11 (PKCS#11), an industry-standard application programming interface (API) for cryptographic devices. In our case, the key operations we care about are encrypt and decrypt.
We developed and tested with a hardware device called the YubiHSM 2. At the time of implementation, we discovered it doesn’t support AES-GCM for encryption and decryption operations, so we used AES-CBC instead.
Despite AES-CBC’s limitations, this trade-off is acceptable in our scenario: the HSM is only used at startup to encrypt or decrypt Notary’s data encryption key, so efficiency is less critical. However, we still need to deal with the problem of authenticity.
What’s Next
For authenticity, we plan to use a Hash-based Message Authentication Code (HMAC) to verify the integrity of ciphertexts. This will help detect any tampering or corruption of encrypted data, which is especially important because some encryption modes like AES-CBC do not provide built-in authentication.