#symmetriccryptography — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #symmetriccryptography, aggregated by home.social.
-
Key Transparency and the Right to be Forgotten
This post is the first in a new series covering some of the reasoning behind decisions made in my project to build end-to-end encryption for direct messages on the Fediverse.
(Collectively, Fedi-E2EE.)
Although the reasons for specific design decisions should be immediately obvious from reading the relevant specification (and if they aren’t, I consider that a bug in the specification), I believe writing about them less formally will make the reasoning behind those decisions clearer.
In the inaugural post for this series, I’d like to focus on how the Fedi-E2EE Public Key Directory specification aims to provide Key Transparency and an Authority-free PKI for the Fediverse without making GDPR compliance logically impossible.
CMYKat‘s art, edited by me.

Background
Key Transparency
For a clearer background, I recommend reading my blog post announcing the focused effort on a Public Key Directory, and then my update from August 2024.
If you’re in a hurry, I’ll be brief:
The goal of Key Transparency is to ensure everyone in a network sees the same view of who has which public key.
How it accomplishes this is a little complicated: It involves Merkle trees, digital signatures, and a higher-level protocol of distinct actions that affect the state machine.
If you’re thinking “blockchain”, you’re in the right ballpark, but we aren’t propping up a cryptocurrency. Instead, we’re using a centralized publisher model (per Public Key Directory instance) with decentralized verification.
Add a bit of cross-signing and replication, and you can stitch together a robust network of Public Key Directories that can be queried to obtain the currently-trusted list of public keys (or other auxiliary data) for a given Fediverse user. This can then be used to build application-layer protocols (e.g., end-to-end encryption with an identity key more robust than “trust on first use”, thanks to the audit trail built into the Merkle trees).
I’m handwaving a lot of details here. The Architecture and Specification documents are both worth a read if you’re curious to learn more.
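To make the core idea concrete, here is a minimal sketch of Merkle tree inclusion proofs in Python, using SHA-256 with RFC 6962-style domain separation. This is illustrative only; the actual project builds on Sigsum, whose details differ.

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    # Domain-separated leaf hash (0x00 prefix), as in RFC 6962.
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # Domain-separated interior node hash (0x01 prefix).
    return hashlib.sha256(b"\x01" + left + right).digest()

def build_tree(leaves):
    """Return the tree as a list of levels, bottom-up.
    For simplicity, assumes a power-of-two leaf count."""
    level = [leaf_hash(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def inclusion_proof(levels, index):
    """Collect sibling hashes from leaf to root for the leaf at `index`."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling at this level
        index //= 2
    return proof

def verify_inclusion(root, data, index, proof):
    """Anyone holding only the root can verify a record's inclusion."""
    h = leaf_hash(data)
    for sibling in proof:
        h = node_hash(h, sibling) if index % 2 == 0 else node_hash(sibling, h)
        index //= 2
    return h == root

# Hypothetical directory records, just for illustration:
records = [b"alice:pk1", b"bob:pk2", b"carol:pk3", b"dave:pk4"]
levels = build_tree(records)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)
assert verify_inclusion(root, b"carol:pk3", 2, proof)
```

The key property: a verifier needs only the (widely replicated) root hash to check that a record is in the log, so a directory cannot silently rewrite history without breaking proofs.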
Art: Harubaki

Right To Be Forgotten
I am not a lawyer, nor do I play one on TV. This is not legal advice. Other standard disclaimers go here.
Okay, now that we’ve got that out of the way, Article 17 of the GDPR establishes a “Right to erasure” for Personal Data.
What this actually means in practice has not been consistently decided by the courts yet. However, a publicly readable, immutable ledger that maps public keys (which may be considered Personal Data) with Actor IDs (which includes usernames, which are definitely Personal Data) goes against the grain when it comes to GDPR.
It remains an open question whether there is a public interest in this data persisting in a read-only ledger ad infinitum, which could override the right to be forgotten. If there is, that’s for the courts to decide, not furry tech bloggers.
I know it can be tempting, especially as an American with no presence in the European Union, to shrug and say, “That seems like a them problem.” However, if other folks want to be able to use my designs within the EU, I would be remiss not to at least consider this potential pitfall and try to mitigate it in my designs.
So that’s exactly what I did.
Art: AJ

Almost Contradictory
At first glance, the privacy goals of both Key Transparency and the GDPR’s Right To Erasure are at odds.
- One creates an immutable, append-only history.
- The other establishes a right for EU citizens’ history to be selectively censored, which means history has to be mutable.
However, they’re not totally impossible to reconcile.
An untested legal theory circulating around large American tech companies is that “crypto shredding” is legally equivalent to erasure.
Crypto shredding is the act of storing encrypted data, and then when given a legal takedown request from an EU citizen, deleting the key instead of the data.
Art: AJ

This works from a purely technical perspective: If the data is encrypted, and you don’t know the key, to you it’s indistinguishable from someone who encrypted the same number of NUL bytes.
In fact, many security proofs for encryption schemes are satisfied by reaching this conclusion, so this isn’t a crazy notion.
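A toy sketch of crypto shredding, assuming a stand-in stream cipher built from HMAC-SHA256 in counter mode (the real spec uses AES-256-CTR + HMAC-SHA512; this substitute just keeps the example stdlib-only):

```python
import hashlib
import hmac
import secrets

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stream cipher: HMAC-SHA256 over (nonce || counter).
    # Illustrative stand-in for AES-256-CTR, NOT for production use.
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"),
                         hashlib.sha256).digest()
        out.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
record = b"@alice@example.social -> ed25519 public key ..."

# Publish only the ciphertext; retain the key privately.
ciphertext = keystream_xor(key, nonce, record)

# "Crypto shredding" = deleting the key, not the data. Without the key,
# the published ciphertext is indistinguishable from an encryption of
# the same number of NUL bytes.
key = None
```

The decryption is the same XOR operation, so anyone holding the key can recover the record until the key is wiped.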
Is Crypto Shredding Plausible?
In 2019, the European Parliamentary Research Service published a lengthy report titled Blockchain and the General Data Protection Regulation which states the following:
Before any examination of whether blockchain technology is capable of complying with Article 17 GDPR, it must be underscored that the precise meaning of the term ‘erasure’ remains unclear.
Article 17 GDPR does not define erasure, and the Regulation’s recitals are equally mum on how this term should be understood. It might be assumed that a common-sense understanding of this terminology ought to be embraced. According to the Oxford English Dictionary, erasure means ‘the removal of writing, recorded material, or data’ or ‘the removal of all traces of something: obliteration’.
From this perspective, erasure could be taken to equal destruction. It has, however, already been stressed that the destruction of data on blockchains, particularly these of a public and permissionless nature, is far from straightforward.
There are, however, indications that the obligation inherent to Article 17 GDPR does not have to be interpreted as requiring the outright destruction of data. In Google Spain, the delisting of information from research results was considered to amount to erasure. It is important to note, however, that in this case, this is all that was requested of Google by the claimant, who did not have control over the original data source (an online newspaper publication). Had the claimant wished to obtain the outright destruction of the relevant data it would have had to address the newspaper, not Google. This may be taken as an indication that what the GDPR requires is that the obligation resting on data controllers is to do all they can to secure a result as close as possible to the destruction of their data within the limits of [their] own factual possibilities.
Dr Michèle Finck, Blockchain and the General Data Protection Regulation, pp. 75-76
From this, we can kind of intuit that the courts aren’t pedantic: The cited Google Spain case was satisfied by merely delisting the content, not the erasure of the newspaper’s archives.
The report goes on to say:
As awareness regarding the tricky reconciliation between Article 17 GDPR and distributed ledgers grows, a number of technical alternatives to the outright destruction of data have been considered by various actors. An often-mentioned solution is that of the destruction of the private key, which would have the effect of making data encrypted with a public key inaccessible. This is indeed the solution that has been put forward by the French data protection authority CNIL in its guidance on blockchains and the GDPR. The CNIL has suggested that erasure could be obtained where the keyed hash function’s secret key is deleted together with information from other systems where it was stored for processing.
Dr Michèle Finck, Blockchain and the General Data Protection Regulation, pp. 76-77
That said, I cannot locate a specific court decision that affirms that crypto shredding is legally sufficient for complying with data erasure requests (nor any that affirms that it’s necessary).
I don’t have a crystal ball that can read the future on what government compliance will decide, nor am I an expert in legal matters.
Given the absence of a clear legal framework, I do think it’s totally reasonable to consider crypto-shredding equivalent to data erasure. Most experts would probably agree with this. But it’s also possible that the courts could rule totally stupidly on this one day.
Therefore, I must caution anyone that follows a similar path: Do not claim GDPR compliance just because you implement crypto-shredding in a distributed ledger. All you can realistically promise is that you’re not going out of your way to make compliance logically impossible. All we have to go by are untested legal hypotheses, and very little clarity (even if the technologists are near-unanimous on the topic!).
Towards A Solution
With all that in mind, let’s start with “crypto shredding” as the answer to the GDPR + transparency log conundrum.
This is only the start of our complications.
Art: CMYKat

Protocol Risks Introduced by Crypto Shredding
Before the introduction of crypto shredding, the job of the Public Key Directory was simple:
- Receive a protocol message.
- Validate the protocol message.
- Commit the protocol message to a transparency log (in this case, Sigsum).
- Retrieve the protocol message whenever someone requests it to independently verify its inclusion.
- Miscellaneous other protocol things (cross-directory checkpoint commitment, replication, etc.).
Point being: there was very little that the directory could do to be dishonest. If they lied about the contents of a record, it would invalidate the inclusion proofs of every successive record in the ledger.
In order to make a given record crypto-shreddable without breaking the inclusion proofs for every record that follows, we need to commit to the ciphertext, not the plaintext. (And then, when a takedown request comes in, wipe the key.)
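A minimal sketch of what committing to the ciphertext looks like, using a simple hash-based log as a stand-in for Sigsum:

```python
import hashlib

class ToyTransparencyLog:
    """Append-only log that commits to ciphertext hashes.
    A stand-in for Sigsum; real inclusion proofs use Merkle trees."""
    def __init__(self):
        self.entries = []

    def append(self, ciphertext: bytes) -> int:
        # Commit to the CIPHERTEXT, not the plaintext, so that later
        # deleting the encryption key doesn't invalidate this entry.
        self.entries.append(hashlib.sha256(ciphertext).digest())
        return len(self.entries) - 1

    def verify(self, index: int, ciphertext: bytes) -> bool:
        return self.entries[index] == hashlib.sha256(ciphertext).digest()

log = ToyTransparencyLog()
idx = log.append(b"...attribute ciphertext...")
# Wiping the key later does not disturb the commitment; the ciphertext
# still verifies against the log even though it can't be decrypted.
assert log.verify(idx, b"...attribute ciphertext...")
```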
Now, things get quite a bit more interesting.
Do you…
- …Distribute the encryption key alongside the ciphertext and let independent third parties decrypt it on demand?
…OR…
- Decrypt the ciphertext and serve plaintext through the public API, keeping the encryption key private so that it may be shredded later?
The first option seems simple, but runs into governance issues: How do you claim the data was crypto-shredded if countless individuals have a copy of the encryption key, and can therefore recover the plaintext from the ciphertext?
I don’t think that would stand up in court.
Art: CMYKat

Clearly, your best option is the second one.
Okay, so how does an end user know that the ciphertext that was committed to the transparency ledger decrypts to the specific plaintext value served by the Public Key Directory? How do users know it’s not lying?
Quick aside: This question is also relevant if you went with the first option and used a non-committing AEAD mode for the actual encryption scheme.
In that scenario, a hostile nation state adversary could pressure a Public Key Directory to selectively give one decryption key to targeted users, and another to the rest of the Internet, in order to perform a targeted attack against citizens they’d rather didn’t have civil rights.
My entire goal with introducing key transparency to my end-to-end encryption proposal is to prevent these sorts of attacks, not enable them.
There are a lot of avenues we could explore here, but it’s always worth outlining the specific assumptions and security goals of any design before you start perusing the literature.
Art: AJ

Assumptions
This is just a list of things we assume are true, and do not need to prove for the sake of our discussion here today. The first two are legal assumptions; the remainder are cryptographic.
Ask your lawyer if you want advice about the first two assumptions. Ask your cryptographer if you suspect any of the remaining assumptions are false.
- Crypto-shredding is a legally valid way to provide data erasure (as discussed above).
- EU courts will consider public keys to be Personal Data.
- The SHA-2 family of hash functions is secure (ignoring length-extension attacks, which won’t matter for how we’re using them).
- HMAC is a secure way to build a MAC algorithm out of a secure hash function.
- HKDF is a secure KDF if used correctly.
- AES is a secure 128-bit block cipher.
- Counter Mode (CTR) is a secure way to turn a block cipher into a stream cipher.
- AES-CTR + HMAC-SHA2 can be turned into a secure AEAD mode, if done carefully.
- Ed25519 is a digital signature algorithm that provides strong existential unforgeability under chosen-message attacks (SUF-CMA).
- Argon2id is a secure, memory-hard password KDF, when used with reasonable parameters. (You’ll see why in a moment.)
- Sigsum is a secure mechanism for building a transparency log.
This list isn’t exhaustive or formal, but should be sufficient for our purposes.
Security Goals
- The protocol messages stored in the Public Key Directory are accompanied by a Merkle tree proof of inclusion. This makes it append-only with an immutable history.
- The Public Key Directory cannot behave dishonestly about the decrypted plaintext for a given ciphertext without clients detecting the deception.
- Whatever strategy we use to solve this should be resistant to economic precomputation and brute-force attacks.
Can We Use Zero-Knowledge Proofs?
At first, this seems like an ideal situation for a succinct, non-interactive zero-knowledge proof.
After all, you’ve got some secret data that you hold, and you want to prove that a calculation is correct without revealing the data to the end user. This seems like the ideal setup for Schnorr’s identification protocol.
Art: CMYKat

Unfortunately, the second assumption (public keys being considered Personal Data by courts, even though they’re derived from random secret keys) makes implementing a Zero-Knowledge Proof here very challenging.
First, if you look at Ed25519 carefully, you’ll realize that it’s just a digital signature algorithm built atop a Schnorr proof, which requires some sort of public key (even an ephemeral one) to be managed.
Worse, if you try to derive this value solely from public inputs (rather than creating a key management catch-22), the secret scalar your system derives will have been calculated from the user’s Personal Data, which only strengthens a court’s argument that the public key is itself personally identifiable.
Art: CMYKat

There may be a more exotic zero-knowledge proof scheme appropriate for our needs, but I’m generally wary of fancy new cryptography.
Here are two rules I live by in this context:
- If I can’t get the algorithms out of the crypto module for whatever programming language I find myself working with, it may as well not even exist.
- Corollary: If libsodium bindings are available, that counts as “the crypto module” too.
- If a developer needs to reach for a generic Big Integer library (e.g., GMP) for any reason in the course of implementing a protocol, I do not trust their implementation.
Unfortunately, a lot of zero-knowledge proof designs fail one or both of these rules in practice.
(Sorry not sorry, homomorphic encryption enthusiasts! The real world hasn’t caught up to your ideas yet.)
What About Verifiable Random Functions (VRFs)?
It may be tempting to use VRFs (i.e., RFC 9381), but this runs into the same problem as zero-knowledge proofs: we’re assuming that an EU court would deem public keys Personal Data.
But even if that assumption turns out false, the lifecycle of a protocol message looks like this:
- User wants to perform an action (e.g., AddKey).
- Their client software creates a plaintext protocol message.
- Their client software generates a random 256-bit key for each potentially-sensitive attribute, so it can be shredded later.
- Their client software encrypts each attribute of the protocol message.
- The ciphertext and keys are sent to the Public Key Directory.
- For each attribute, the Public Key Directory decrypts the ciphertext with the key, verifies the contents, and then stores both. The ciphertext is used to generate a commitment on Sigsum (signed by the Public Key Directory’s keypair).
- The Public Key Directory serves plaintext to requestors, but does not disclose the key.
- In the future, the end user can demand a legal takedown, which just wipes the key.
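The lifecycle above can be sketched end-to-end in Python. The cipher here is a toy HMAC-based keystream standing in for the spec’s AES-256-CTR + HMAC-SHA512 construction, and the verification/commitment steps on the directory side are elided:

```python
import hashlib
import hmac
import secrets

def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy keystream (HMAC-SHA256 in counter mode); illustrative only.
    out = bytearray()
    i = 0
    while len(out) < len(data):
        out += hmac.new(key, nonce + i.to_bytes(8, "big"),
                        hashlib.sha256).digest()
        i += 1
    return bytes(a ^ b for a, b in zip(data, out))

class Directory:
    """Directory-side model of steps 6-8 in the lifecycle above."""
    def __init__(self):
        self.records = {}  # attribute name -> {"ct", "nonce", "key"}

    def publish(self, attr, ciphertext, nonce, key):
        # (Step 6) Verify contents and commit ciphertext to the
        # transparency log -- both elided in this sketch.
        self.records[attr] = {"ct": ciphertext, "nonce": nonce, "key": key}

    def serve(self, attr):
        # (Step 7) Serve plaintext to requestors; never disclose the key.
        r = self.records[attr]
        if r["key"] is None:
            return None  # crypto-shredded
        return xor_stream(r["key"], r["nonce"], r["ct"])

    def shred(self, attr):
        # (Step 8) Legal takedown: wipe only the key, never the ciphertext.
        self.records[attr]["key"] = None

# Client side (steps 1-5): a fresh random 256-bit key per attribute.
key, nonce = secrets.token_bytes(32), secrets.token_bytes(16)
ct = xor_stream(key, nonce, b"@alice@example.social")
d = Directory()
d.publish("actor-id", ct, nonce, key)
assert d.serve("actor-id") == b"@alice@example.social"
d.shred("actor-id")
assert d.serve("actor-id") is None
```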
Let’s assume I wanted to build a VRF out of Ed25519 (similar to what Signal does with VXEdDSA). Now I have a key management problem, which is pretty much what this project was meant to address in the first place.
VRFs are really cool, and more projects should use them, but I don’t think they will help me.
Art: CMYKat

Soatok’s Proposed Solution
If you want to fully understand the nitty-gritty implementation details, I encourage you to read the current draft specification, plus the section describing the encryption algorithm, and finally the plaintext commitment algorithm.
Now that we’ve established all that, I can begin to describe my approach to solving this problem.
First, we will encrypt each attribute of a protocol message, as follows:
- For subkey derivation, we use HKDF-HMAC-SHA512.
- For encrypting the actual plaintext, we use AES-256-CTR.
- For message authentication, we use HMAC-SHA512.
- Additional associated data (AAD) is accepted and handled securely; i.e., we don’t use YOLO as a hash construction.
This prevents an Invisible Salamander attack from being possible.
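Two of those ingredients can be sketched with the standard library alone: a minimal HKDF (RFC 5869) over HMAC-SHA512 for subkey derivation, and an HMAC tag that length-prefixes each field so the AAD/ciphertext boundary is unambiguous. The `info` label is hypothetical; the spec defines its own encoding.

```python
import hashlib
import hmac

def hkdf_sha512(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    # Minimal HKDF (RFC 5869) extract-then-expand with HMAC-SHA512.
    prk = hmac.new(salt, ikm, hashlib.sha512).digest()
    okm, block, i = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([i]),
                         hashlib.sha512).digest()
        okm += block
        i += 1
    return okm[:length]

def mac_with_aad(mac_key: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    # Length-prefix each field so (aad, ciphertext) boundaries can't be
    # shifted -- i.e., we don't use YOLO as a hash construction.
    msg = (len(aad).to_bytes(8, "big") + aad
           + len(ciphertext).to_bytes(8, "big") + ciphertext)
    return hmac.new(mac_key, msg, hashlib.sha512).digest()

# Split one input key into an encryption key and a MAC key.
# ("fedi-e2ee-example" is a made-up label, not the spec's.)
okm = hkdf_sha512(b"\x42" * 32, salt=b"\x00" * 64,
                  info=b"fedi-e2ee-example", length=64)
enc_key, mac_key = okm[:32], okm[32:]
tag = mac_with_aad(mac_key, b"attribute-name||commitment", b"...ct...")
```

Because the field lengths are authenticated, moving a byte from the AAD into the ciphertext (or vice versa) produces a different tag, which is what closes the door on Invisible Salamander-style ambiguity.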
This encryption is performed client-side, by each user, and the symmetric key for each attribute is shared with the Public Key Directory when publishing protocol messages.
If they later issue a legal request for erasure, they can be sure that the key used to encrypt the data they previously published isn’t secretly the same key used by every other user’s records.
They always know this because they selected the key, not the server. Furthermore, everyone can verify that the hash published to the Merkle tree matches a locally generated hash of the ciphertext they just emitted.
This provides a mechanism to keep everyone honest. If anything goes wrong, it will be detected.
Next, to prevent the server from being dishonest, we include a plaintext commitment hash, which is included as part of the AAD (alongside the attribute name).
(Implementing crypto-shredding is straightforward: simply wipe the encryption keys for the attributes of the records in scope for the request.)
If you’ve read this far, you’re probably wondering, “What exactly do you mean by plaintext commitment?”
Art by Scruff.

Plaintext Commitments
The security of a plaintext commitment is attained by the Argon2id password hashing function.
By using the Argon2id KDF, you can make an effective trapdoor that is easy to calculate if you know the plaintext, but economically infeasible to brute-force attack if you do not.
However, you need to do a little more work to make it safe.
Art: Harubaki

The details here matter a lot, so this section is unavoidably going to be a little dense.
Pass the Salt?
Argon2id expects both a password and a salt.
If you eschew the salt (i.e., zero it out), you open the door to precomputation attacks (see also: rainbow tables) that would greatly weaken the security of this plaintext commitment scheme.
You need a salt.
If you generate the salt randomly, this commitment property isn’t guaranteed by the algorithm. It would be difficult, but probably not impossible, to find two salts (s1, s2) such that Argon2id(m1, s1) = Argon2id(m2, s2) for two distinct messages m1 ≠ m2.
Deriving the salt from public inputs eliminates this flexibility.
By itself, though, this makes salts totally deterministic, which reintroduces the risk of precomputation attacks (the very thing that motivated the salt in the first place).
If you include the plaintext in this calculation, it could also create a crib that gives attackers a shortcut for bypassing the cost of password hashing.
Furthermore, any two encryptions operations that act over the same plaintext would, without any additional design considerations, produce an identical value for the plaintext commitment.
Art: CMYKat

Public Inputs for Salt Derivation
The initial proposal included the plaintext value for Argon2 salt derivation, and published the salt and Argon2 output next to each other.
Hacker News comex pointed out a flaw with this technique, so I’ve since revised how salts are selected to make them independent of the plaintext.
The public inputs for the Argon2 salt are now:
- The version identifier prefix for the ciphertext blob.
- The 256-bit random value used as a KDF salt (also stored in the ciphertext blob).
- A recent Merkle tree root.
- The attribute name (prefixed by its length).
These values are all hashed together with SHA-512, and then truncated to 128 bits (the length required by libsodium for Argon2 salts).
This salt is not stored, but can deterministically be calculated from public information.
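A sketch of that salt derivation in Python. The exact field ordering and framing here are illustrative; see the specification for the authoritative encoding.

```python
import hashlib

def derive_argon2_salt(version_prefix: bytes, kdf_random: bytes,
                       recent_merkle_root: bytes,
                       attribute_name: bytes) -> bytes:
    """Derive a deterministic Argon2 salt from public inputs."""
    h = hashlib.sha512()
    h.update(version_prefix)        # version identifier from the blob
    h.update(kdf_random)            # 256-bit KDF salt from the blob
    h.update(recent_merkle_root)    # a recent Merkle tree root
    h.update(len(attribute_name).to_bytes(8, "big"))  # length prefix
    h.update(attribute_name)
    # Truncate to 128 bits, the salt length libsodium's Argon2 expects.
    return h.digest()[:16]
```

Because every input is public (or stored in the ciphertext blob), any verifier can recompute the salt without the directory storing it anywhere.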
Crisis Averted?
This sure sounds like we’ve arrived at a solution, but let’s also consider another situation before we declare our job done.
High-traffic Public Key Directories may have multiple users push a protocol message with the same recent Merkle root.
This may happen if two or more users query the directory to obtain the latest Merkle root before either of them publish their updates.
Later, if both of these users issue a legal takedown, someone might observe that the recent-merkle-root is the same for two messages, but their commitments differ.
Is this enough leakage to distinguish plaintext records?
In my earlier design, we needed to truncate the salt and rely on understanding the birthday bound to reason about its security. This is no longer the case, since each salt is randomized by the same random value used in key derivation.
Choosing Other Parameters
As mentioned a second ago, we set the output length of the Argon2id KDF to 32 bytes (256 bits). We expect the security of this KDF to exceed 2^128, which to most users might as well be infinity.
With apologies to Filippo.

The other Argon2id parameters are a bit hand-wavey. Although the general recommendation for Argon2id is to use as much memory as possible, this code will inevitably run in some low-memory environments, so asking for several gigabytes isn’t reasonable.
For the first draft, I settled on 16 MiB of memory, 3 iterations, and a parallelism degree of 1 (for widespread platform support).
Plaintext Commitment Algorithm
With all that figured out, our plaintext commitment algorithm looks something like this:
- Calculate the SHA512 hash of:
- A domain separation constant
- The header prefix (stored in the ciphertext)
- The randomness used for key-splitting in encryption (stored in the ciphertext)
- Recent Merkle Root
- Attribute Name Length (64-bit unsigned integer)
- Attribute Name
- Truncate this hash to the rightmost 16 bytes (128 bits). This is the salt.
- Calculate Argon2id over the following inputs concatenated in this order, with an output length of 32 bytes (256 bits), using the salt from step 2:
- Recent Merkle Root Length (64-bit unsigned integer)
- Recent Merkle Root
- Attribute Name Length (64-bit unsigned integer)
- Attribute Name
- Plaintext Length (64-bit unsigned integer)
- Plaintext
The output (step 3) is included as the AAD in the attribute encryption step, so the authentication tag is calculated over both the randomness and the commitment.
To verify a commitment (which is extractable from the ciphertext), simply recalculate the commitment you expect (using the recent Merkle root specified by the record), and compare the two in constant-time.
If they match, then you know the plaintext you’re seeing is the correct value for the ciphertext value that was committed to the Merkle tree.
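Steps 3 and 4 can be sketched in Python. Since the standard library lacks Argon2id, `hashlib.scrypt` stands in as the memory-hard KDF (the parameters below also target roughly 16 MiB of memory, matching the draft; the spec itself mandates Argon2id). The salt is taken as a precomputed input here.

```python
import hashlib
import hmac

def plaintext_commitment(salt: bytes, merkle_root: bytes,
                         attribute_name: bytes, plaintext: bytes) -> bytes:
    """Step 3: memory-hard KDF over length-prefixed public inputs
    plus the plaintext. scrypt is a stand-in for Argon2id."""
    password = (
        len(merkle_root).to_bytes(8, "big") + merkle_root
        + len(attribute_name).to_bytes(8, "big") + attribute_name
        + len(plaintext).to_bytes(8, "big") + plaintext
    )
    # n=2**14, r=8 -> 128 * r * n = 16 MiB of memory; 32-byte output.
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1,
                          maxmem=2**26, dklen=32)

def verify_commitment(expected: bytes, salt: bytes, merkle_root: bytes,
                      attribute_name: bytes, plaintext: bytes) -> bool:
    """Recompute the commitment and compare in constant time."""
    recomputed = plaintext_commitment(salt, merkle_root,
                                      attribute_name, plaintext)
    return hmac.compare_digest(expected, recomputed)
```

A verifier recomputes the commitment from the served plaintext and the record’s public inputs; a directory serving a different plaintext for the committed ciphertext would fail this check.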
If the encryption key is shredded in the future, an attacker without knowledge of the plaintext will have an enormous uphill battle recovering it from the KDF output (and the salt will prove to be somewhat useless as a crib).
Art: AJ

Caveats and Limitations
Although this design does satisfy the specific criteria we’ve established, an attacker that already knows the correct plaintext can confirm that a specific record matches it via the plaintext commitment.
This cannot be avoided: If we are to publish a commitment of the plaintext, someone with the plaintext can always confirm the commitment after the fact.
Art: CMYKat

Whether this matters at all to the courts is a question for which I cannot offer any insight.
Remember, we don’t even know if any of this is actually necessary, or if “moderation and platform safety” is a sufficient reason to sidestep the right to erasure.
If the courts ever clarify this adequately, we can simply publish the mapping of Actor IDs to public keys and auxiliary data without any crypto-shredding at all.
Trying to attack it from the other direction (downloading a crypto-shredded record and trying to recover the plaintext without knowing it ahead of time) is the attack angle we’re actually concerned with.
Herd Immunity for the Forgotten
Another interesting implication that might not be obvious: The more Fediverse servers and users publish to a single Public Key Directory, the greater the anonymity pool available to each of them.
Consider the case where a user has erased their previous Fediverse account and used the GDPR to also crypto-shred the Public Key Directory entries containing their old Actor ID.
To guess the correct plaintext, you must not only brute-force guessing possible usernames, but also permute your guesses across all of the instances in scope.
The more instances there are, the higher the cost of the attack.
Art: CMYKat

Recap
I tasked myself with designing a Key Transparency solution that doesn’t make complying with Article 17 of the GDPR nigh-impossible. To that end, crypto-shredding seemed like the only viable way forward.
A serialized record containing ciphertext for each sensitive attribute would be committed to the Merkle tree. The directory would store the key locally and serve plaintext until a legal takedown was requested by the user who owns the data. Afterwards, the stored ciphertext committed to the Merkle tree is indistinguishable from random for any party that doesn’t already know the plaintext value.
I didn’t want to allow Public Key Directories to lie about the plaintext for a given ciphertext, given that they know the key and the requestor doesn’t.
After considering zero-knowledge proofs and finding them to not be a perfect fit, I settled on designing a plaintext commitment scheme based on the Argon2id password KDF. The KDF salts can be calculated from public inputs.
Altogether, this meets the requirements of enabling crypto-shredding while keeping the Public Key Directory honest. All known attacks for this design are prohibitively expensive for any terrestrial threat actors.
As an added bonus, I didn’t introduce anything fancy. You can build all of this with the cryptography available to your favorite programming language today.
Art: CMYKat

Closing Thoughts
If you’ve made it this far without being horribly confused, you’ve successfully followed my thought process for developing message attribute shreddability in my Public Key Directory specification.
This is just one component of the overall design proposal, but one that I thought my readers would enjoy exploring in greater detail than the specification needed to capture.
(This post was updated on 2024-11-22 to replace the incorrect term “PII” with “personal data”. Apologies for the confusion!)
#Argon2 #crypto #cryptography #E2EE #encryption #FederatedPKI #fediverse #passwordHashing #symmetricCryptography
-
Ever since the Invisible Salamanders paper was published, there has been a quiet renaissance within my friends and colleagues in applied cryptography for studying systems that use Authenticated Encryption with Associated Data (AEAD) constructions, understanding what implicit assumptions these systems make about the guarantees of the AEAD mode they chose to build upon, and the consequences of those assumptions being false.
I’ve discussed Invisible Salamanders several times throughout this blog, from my criticisms of AES-GCM and XMPP + OMEMO to my vulnerability disclosures in Threema.
Five years after Invisible Salamanders, it’s become clear to me that many software developers do not fully appreciate the underlying problem discussed in the Invisible Salamanders paper, even when I share trivial proof-of-concept exploits.
Background
Fast AEAD constructions based on polynomial MACs, such as AES-GCM and ChaCha20-Poly1305, were designed to provide confidentiality and integrity for the plaintext data, and optionally integrity for some additional associated data, in systems where both parties already negotiated one shared symmetric key.
The integrity goals of the systems that adopted these AEAD constructions were often accompanied by performance goals–usually to prevent Denial of Service (DoS) attacks in networking protocols. Verification needed to be very fast and consume minimal resources.
In this sense, AEAD constructions were an incredible success. So successful, in fact, that most cryptographers urge application developers to use one of the fast AEAD modes as the default suggestion without looking deeper at the problem being solved. This is a good thing, because most developers will choose something stupid like ECB mode in the absence of guidance from cryptographers, and AEAD modes are much, much safer than any hand-rolled block cipher modes.
The problem is that one tiny little assumption: that both parties (sender and recipient) in a communication have agreed on exactly one symmetric key for use in the protocol.
Fast MACs Are Not Key-Committing
Cryptographers have concluded that AEAD constructions based on polynomial MACs–while great for performance and rejection of malformed packets without creating DoS risks–tend to make the same assumption. This is even true of misuse-resistant modes like AES-GCM-SIV and extended-nonce constructions like XSalsa20-Poly1305.
When discussing this implicit assumption of only one valid key in the systems that use these AEAD modes, we say that the modes are not key-committing. This terminology is based on what happens when this assumption is false.
Consequently, you can take a single, specially crafted ciphertext (with an authentication tag) and decrypt it under multiple different keys. The authentication tags will be valid for all keys, and the plaintext will be different.
Art: Swizz

What does this look like in practice?
Consider my GCM exploit, which was written to generate puzzle ciphertexts for the DEFCON Furs badge challenge a few years ago. How it works is conceptually simple (although the actual mechanics behind step 4 are a bit technical):
- Generate two keys.
There’s nothing special about these keys or their relationship to each other; they can be totally random. They just can’t be identical, or the exploit is kind of pointless.
- Encrypt some blocks of plaintext with key1.
- Encrypt some more blocks of plaintext with key2.
- Calculate a collision block from the ciphertext in the previous two steps, which is just a bit of polynomial arithmetic in GF(2^128).
- Return the ciphertext (steps 2, 3, 4) and authentication tag calculated over them (which will collide for both keys).
A system that decrypts the output of this exploit under key1 will see some plaintext, followed by some garbage, followed by 1 final block of garbage.
If the same system decrypts under key2, it will see some garbage, followed by some plaintext, followed by 1 final block of garbage.
For many file formats, this garbage isn’t really a problem. Additionally, a bit more precomputation allows you to choose garbage that will be more advantageous to ensuring both outputs are accepted as “valid” by the target system.
For example, choosing two keys and a targeted nonce may allow both the valid plaintext and garbage blocks to begin with a PDF file header.
If you’re familiar with the file polyglot work of Ange Albertini, you can use this to turn the invisible salamanders problem into an artform.
Why is it called Invisible Salamanders?
The proof-of-concept used in the paper involved sending one picture (of a salamander) over an end-to-end encrypted messaging app, but when the recipient flagged it as abusive, the moderator saw a different picture.
https://www.youtube.com/watch?v=3M1jIO-jLHI
Thus, the salamander was invisible to the moderators of the encrypted messaging app.
As for the choice of a “salamander”: I’ve been told by friends familiar with the research that it was inspired by the original name of the Signal Protocol being “Axolotl”.
But, like, who cares about these details besides me? It’s a cute and memorable name.
What are the consequences of violating the “one key” assumption?
That depends entirely on what your system does!
In Database Cryptography Fur the Rest of Us, I discussed the use of AEAD modes to prevent confused deputy attacks. This works great, but if you’re building an application that supports multi-tenancy, you suddenly have to care about this issue again.
An earlier design for OPAQUE, a password authenticated key exchange algorithm, was broken by a partitioning oracle attack due to building atop AEAD modes that are not key-committing. This let an attacker recover passwords from Shadowsocks proxy servers with a complexity similar to a binary search algorithm.
These are two very different impacts from the same weakness, which I believe is a significant factor for why the Invisible Salamanders issue isn’t more widely understood.
Sometimes violating the “one key” assumption that went into fast AEAD modes based on Polynomial MACs completely destroys the security of your system.
Other times, it opens the door for a high-complexity but low-impact behavior that simply violates the principle of least astonishment but doesn’t buy the attacker anything useful.
They Just Don’t Get It
The Invisible Salamanders issue is relevant in any system that uses symmetric-key encryption where more than one key can be valid.
This includes, but is not limited to:
- Multi-tenant data warehouses
- Group messaging protocols
- Envelope encryption schemes with multiple wrapping keys
- Bearer tokens (such as JSON Web Tokens) in systems that utilize Key IDs
Systems can mitigate this issue by introducing an explicit key commitment scheme (based on a cryptographic hash rather than a polynomial MAC) or by using a committing cipher mode (such as AES + HMAC, if done carefully).
However, most of the time, this advice falls on deaf ears whenever this concern is brought up by a cryptography engineer who’s more aware of this issue.
“Abuse reporting? We don’t have no stinking abuse reporting!”
The most common misunderstanding is, “We don’t have a report abuse feature, so this issue doesn’t affect us.”
This is because the Invisible Salamanders talk and paper focused on how it could be leveraged to defeat abuse reporting tools and bypass content moderation.
In my experience, many security teams would read the paper and conclude that it only impacts abuse reporting features and not potentially all systems that allow multiple symmetric keys in a given context.
Another Exploit Scenario
Imagine you’re building a Data Loss Prevention product that integrates with corporate file-sharing and collaboration software (e.g. ownCloud) for small and medium businesses.
One day, someone decides to ship an end-to-end encryption feature to the file-sharing software that uses AES-GCM to encrypt files, and then encrypts the keys to each recipient’s public key. This is basically the envelope encryption use-case above.
In order to update your integration to act as another “user”, whose public key must be included in all E2EE transfers, and will block download of ciphertexts it cannot decrypt OR contains sensitive information.
And this works, until an insider threat clever enough to abuse the Invisible Salamanders issue comes along.
In order for said insider threat (e.g., a senior business analyst) to leak sensitive data (e.g., anything that would be useful for illegal insider trading) to another person that shouldn’t have access to it (e.g., a store clerk that’s talking to the press), they just have to do this:
- Encrypt the data they want to exfiltrate using key1.
- Encrypt some innocuous data that won’t trigger your DLP product, using key2.
- Ensure that both messages encrypt to the same ciphertext and authentication tag.
- Give their recipient key1, give everyone else (including your DLP software) key2.
Bam! File leaked, and everyone’s none the wiser, until it’s too late. Let’s actually imagine what happens next:
A random store clerk has leaked sensitive data to the press that only a few analysts had access to.
The only communication between the analyst and the store clerk is a file that was shared to all employees, using the E2EE protocol. No emails or anything else were identified.
Your DLP product didn’t identify any other communications between these two, but somehow the store clerk has the data on their desktop.
A detailed forensics analysis may eventually figure out what happened, but by then, the damage is done and your product’s reputation is irrecoverably damaged.
All because the hypothetical E2EE protocol didn’t include a key-commitment mechanism, and nobody identified this deficit in their designs.
This isn’t to endorse DLP solutions at all, but rather, to highlight one of the many ways that the Invisible Salamander issue can be used creatively by clever attackers.
Art: AJThe Lesson to Learn
If you’re building a network protocol that uses AEAD to encrypt data over an insecure network (e.g., WireGuard), keep up the good work.
If you’re doing anything more involved than that, at the application layer, pause for a moment and consider whether your system will ever need multiple valid symmetric keys at once.
And, if the answer is “yes”, then you should always explicitly add a key-commitment mechanism to your system design.
(Hire a cryptographer if you’re not sure how to proceed.)
In my opinion, hemming and hawing over whether there’s a significant impact to the Invisible Salamanders issue is a worse use of your time than just solving it directly.
Eventually, I expect a new generation of AEAD modes will be standardized that explicitly provide key-commitment.
When these new designs are standardized, widely supported, and sufficiently trusted by experts, feel free to update my advice to “prefer using those modes” instead.
Header art: Harubaki, CMYKat, and Brian Gratwicke
https://soatok.blog/2024/09/10/invisible-salamanders-are-not-what-you-think/
#AEAD #AESGCM #InvisibleSalamanders #randomKeyRobustness #symmetricCryptography
- Generate two keys.
-
Ever since the Invisible Salamanders paper was published, there has been a quiet renaissance among my friends and colleagues in applied cryptography: studying systems that use Authenticated Encryption with Associated Data (AEAD) constructions, understanding what implicit assumptions those systems make about the guarantees of the AEAD mode they chose to build upon, and exploring the consequences of those assumptions being false.
I’ve discussed Invisible Salamanders several times throughout this blog, from my criticisms of AES-GCM and XMPP + OMEMO to my vulnerability disclosures in Threema.
Five years after Invisible Salamanders, it’s become clear to me that many software developers do not fully appreciate the underlying problem discussed in the Invisible Salamanders paper, even when I share trivial proof-of-concept exploits.
Background
Fast AEAD constructions based on polynomial MACs, such as AES-GCM and ChaCha20-Poly1305, were designed to provide confidentiality and integrity for the plaintext data, and optionally integrity for some additional associated data, in systems where both parties already negotiated one shared symmetric key.
The integrity goals of the systems that adopted these AEAD constructions were often accompanied by performance goals–usually to prevent Denial of Service (DoS) attacks in networking protocols. Verification needed to be very fast and consume minimal resources.
In this sense, AEAD constructions were an incredible success. So successful, in fact, that most cryptographers suggest one of the fast AEAD modes as the default to application developers, without looking deeper at the problem being solved. This is a good thing, because most developers will choose something stupid like ECB mode in the absence of guidance from cryptographers, and AEAD modes are much, much safer than any hand-rolled block cipher modes.
The problem is that one tiny little assumption: that both parties (sender, recipient) in a communication have agreed on exactly one symmetric key for use in the protocol.
Fast MACs Are Not Key-Committing
Cryptographers have concluded that AEAD constructions based on polynomial MACs–while great for performance and rejection of malformed packets without creating DoS risks–tend to make the same assumption. This is even true of misuse-resistant modes like AES-GCM-SIV and extended-nonce constructions like XSalsa20-Poly1305.
When discussing this implicit assumption of only one valid key in the systems that use these AEAD modes, we say that the modes are not key-committing. This terminology is based on what happens when this assumption is false.
Consequently, you can take a single, specially crafted ciphertext (with an authentication tag) and decrypt it under multiple different keys. The authentication tags will be valid for all keys, and the plaintext will be different.
Art: Swizz
What does this look like in practice?
Consider my GCM exploit, which was written to generate puzzle ciphertexts for the DEFCON Furs badge challenge a few years ago. How it works is conceptually simple (although the actual mechanics behind step 4 is a bit technical):
- Generate two keys.
There’s nothing special about these keys or their relationship to each other; they can be totally random. They just can’t be identical, or the exploit is kind of pointless.
- Encrypt some blocks of plaintext with key1.
- Encrypt some more blocks of plaintext with key2.
- Calculate a collision block from the ciphertexts produced in the previous two steps–which is just a bit of polynomial arithmetic in GF(2^128).
- Return the ciphertext (steps 2, 3, 4) and authentication tag calculated over them (which will collide for both keys).
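To make step 4 concrete, here is a hedged sketch (in Python, purely for illustration) of the polynomial arithmetic. It models only the GHASH layer: the authenticator keys H1/H2 and tag masks E1/E2 below are arbitrary stand-ins for values that real AES-GCM would derive from AES under two unrelated keys, and associated data is omitted for brevity.

```python
def gf_mul(x, y):
    """Multiply two elements of GF(2^128) in GCM's bit convention."""
    R = 0xE1000000000000000000000000000000  # GCM reduction polynomial
    z, v = 0, x
    for i in range(127, -1, -1):  # walk the bits of y, MSB first
        if (y >> i) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def gf_inv(x):
    """Invert via x^(2^128 - 2); the identity element is 1 << 127."""
    r, base, e = 1 << 127, x, (1 << 128) - 2
    while e:
        if e & 1:
            r = gf_mul(r, base)
        base = gf_mul(base, base)
        e >>= 1
    return r

def ghash(h, blocks):
    y = 0
    for b in blocks:
        y = gf_mul(y ^ b, h)
    return y

# Arbitrary (non-identical) per-key values -- nothing special about them.
H1, E1 = 0x66E94BD4EF8A2C3B884CFA59CA342B2E, 0x0388DACE60B6A392F328C2B971B2FE78
H2, E2 = 0x58E2FCCEFA7E3061367F1D57A4E7455A, 0xAB6E47D42CEC13BDF53A67B21257BDDF

C = [0x11111111111111111111111111111111,  # ciphertext blocks shared by both keys
     0x22222222222222222222222222222222]
L = (len(C) + 1) * 128  # simplified length block (bit length incl. collision block)

# GHASH is affine in the collision block X, and X's coefficient is H^2
# (it sits immediately before the length block). Solve tag1 == tag2 for X:
a1 = ghash(H1, C + [0, L])
a2 = ghash(H2, C + [0, L])
X = gf_mul(a1 ^ a2 ^ E1 ^ E2, gf_inv(gf_mul(H1, H1) ^ gf_mul(H2, H2)))

tag1 = ghash(H1, C + [X, L]) ^ E1
tag2 = ghash(H2, C + [X, L]) ^ E2
assert tag1 == tag2  # one ciphertext + tag, valid under both keys
```

The equation is linear because field multiplication distributes over XOR, which is exactly why the collision block costs only "a bit of polynomial arithmetic" to find.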
A system that decrypts the output of this exploit under key1 will see some plaintext, followed by some garbage, followed by 1 final block of garbage.
If the same system decrypts under key2, it will see some garbage, followed by some plaintext, followed by 1 final block of garbage.
For many file formats, this garbage isn’t really a problem. Additionally, a bit more precomputation allows you to choose garbage that will be more advantageous to ensuring both outputs are accepted as “valid” by the target system.
For example, choosing two keys and a targeted nonce may allow both the valid plaintext and garbage blocks to begin with a PDF file header.
If you’re familiar with the file polyglot work of Ange Albertini, you can use this to turn the Invisible Salamanders problem into an artform.
And this is just the simple attack!
The Invisible Salamanders paper outlined a more advanced variant (with a proof of concept) in Section 3.2, which doesn’t suffer from nearly as much garbage data as the simple attack.
As Bruce Schneier often says, “Attacks only get better, they never get worse.”
Why is it called Invisible Salamanders?
The proof-of-concept used in the paper involved sending one picture (of a salamander) over an end-to-end encrypted messaging app, but when the recipient flagged it as abusive, the moderator saw a different picture.
https://www.youtube.com/watch?v=3M1jIO-jLHI
Thus, the salamander was invisible to the moderators of the encrypted messaging app.
As for the choice of a “salamander”, I’ve been told by friends familiar with the research that it was inspired by the Signal Protocol’s original name, “Axolotl”.
But, like, who cares about these details besides me? It’s a cute and memorable name.
What are the consequences of violating the “one key” assumption?
That depends entirely on what your system does!
In Database Cryptography Fur the Rest of Us, I discussed the use of AEAD modes to prevent confused deputy attacks. This works great, but if you’re building an application that supports multi-tenancy, you suddenly have to care about this issue again.
An earlier design for OPAQUE, a password authenticated key exchange algorithm, was broken by a partitioning oracle attack due to building atop AEAD modes that are not key-committing. This let an attacker recover passwords from Shadowsocks proxy servers with a complexity similar to a binary search algorithm.
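To see why the query complexity resembles binary search: if an attacker can craft a single “splitting” ciphertext that decrypts validly under any chosen subset of candidate keys, each oracle response rules out half the pool. The toy simulation below stubs out the crafting step with a hypothetical oracle; in the real attack, that step is built from multi-key polynomial-MAC collisions.

```python
def partitioning_oracle_search(candidates, oracle):
    """Recover the victim's key/password with ~log2(N) oracle queries."""
    pool = list(candidates)
    queries = 0
    while len(pool) > 1:
        half = pool[:len(pool) // 2]
        # In the real attack, a splitting ciphertext valid for exactly
        # `half` is crafted; here the oracle simulates its acceptance.
        queries += 1
        if oracle(set(half)):
            pool = half            # victim's key is in the tested half
        else:
            pool = pool[len(pool) // 2:]
    return pool[0], queries

passwords = [f'hunter{i}' for i in range(1024)]
victim = 'hunter42'
oracle = lambda key_set: victim in key_set  # "did decryption succeed?"
found, n = partitioning_oracle_search(passwords, oracle)
assert found == victim and n == 10  # 1024 candidates -> 10 queries
```

A key-committing AEAD kills this attack outright: a ciphertext can only ever be valid under one key, so there is nothing to split.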
These are two very different impacts from the same weakness, which I believe is a significant factor for why the Invisible Salamanders issue isn’t more widely understood.
Sometimes violating the “one key” assumption that went into fast AEAD modes based on Polynomial MACs completely destroys the security of your system.
Other times, it opens the door for a high-complexity but low-impact behavior that simply violates the principle of least astonishment but doesn’t buy the attacker anything useful.
They Just Don’t Get It
The Invisible Salamanders issue is relevant in any system that uses symmetric-key encryption where more than one key can be valid.
This includes, but is not limited to:
- Multi-tenant data warehouses
- Group messaging protocols
- It’s sometimes tempting to discount group messaging as a relevant consideration if your experience is “emulated groups atop 1-to-1 messaging”, but there are protocols that establish a Group Key (e.g., RFC 9420, MLS) and then use that for all group messages.
- Envelope encryption schemes with multiple wrapping keys
- Bearer tokens (such as JSON Web Tokens) in systems that utilize Key IDs
Systems can mitigate this issue by introducing an explicit key commitment scheme (based on a cryptographic hash rather than a polynomial MAC) or by using a committing cipher mode (such as AES + HMAC, if done carefully).
However, most of the time, this advice falls on deaf ears when a cryptography engineer who is aware of the issue raises the concern.
“Abuse reporting? We don’t have no stinking abuse reporting!”
The most common misunderstanding is, “We don’t have a report abuse feature, so this issue doesn’t affect us.”
This is because the Invisible Salamanders talk and paper focused on how it could be leveraged to defeat abuse reporting tools and bypass content moderation.
In my experience, many security teams would read the paper and conclude that it only impacts abuse reporting features and not potentially all systems that allow multiple symmetric keys in a given context.
Another Exploit Scenario
Imagine you’re building a Data Loss Prevention product that integrates with corporate file-sharing and collaboration software (e.g. ownCloud) for small and medium businesses.
One day, someone decides to ship an end-to-end encryption feature to the file-sharing software that uses AES-GCM to encrypt files, and then encrypts the keys to each recipient’s public key. This is basically the envelope encryption use-case above.
So, you dutifully update your integration to act as another “user” whose public key must be included in all E2EE transfers, and which blocks the download of any ciphertext that it either cannot decrypt or that contains sensitive information.
And this works, until an insider threat clever enough to abuse the Invisible Salamanders issue comes along.
In order for said insider threat (e.g., a senior business analyst) to leak sensitive data (e.g., anything that would be useful for illegal insider trading) to another person that shouldn’t have access to it (e.g., a store clerk that’s talking to the press), they just have to do this:
- Encrypt the data they want to exfiltrate using key1.
- Encrypt some innocuous data that won’t trigger your DLP product, using key2.
- Ensure that both messages encrypt to the same ciphertext and authentication tag.
- Give their recipient key1, give everyone else (including your DLP software) key2.
Bam! File leaked, and everyone’s none the wiser, until it’s too late. Let’s actually imagine what happens next:
A random store clerk has leaked sensitive data to the press that only a few analysts had access to.
The only communication between the analyst and the store clerk is a file that was shared to all employees, using the E2EE protocol. No emails or anything else were identified.
Your DLP product didn’t identify any other communications between these two, but somehow the store clerk has the data on their desktop.
A detailed forensic analysis may eventually figure out what happened, but by then, the damage is done and your product’s reputation is beyond repair.
All because the hypothetical E2EE protocol didn’t include a key-commitment mechanism, and nobody identified this deficit in their designs.
This isn’t to endorse DLP solutions at all, but rather, to highlight one of the many ways that the Invisible Salamander issue can be used creatively by clever attackers.
Art: AJ
“Couldn’t you do the same with steganography?”
No, the attack is very different from stego.
Stego is about hiding a message in plain sight, so that only the person that knows where/how to look can find it.
The Invisible Salamanders attack lets you send one ciphertext through a network then selectively decrypt it to one of two plaintexts, depending on which key you reveal to each participant.
In the Invisible Salamanders paper and talk, they used this to send “abusive” messages to a recipient that the moderator would not see. Thus, invisible.
In one, the message is always present for anyone who knows how to find it. In the other, the attacker selects which plaintext you see, even if you have mechanisms to ensure everyone is seeing the same ciphertext. It’s not a subtle difference.
Mitigation Techniques
There are multiple ways to mitigate the risk of Invisible Salamanders in a cryptosystem.
- Use HMAC, or (failing that) something built atop cryptographic hash functions, rather than a Polynomial MAC.
- Use an AEAD cipher designed with multi-recipient integrity as a security goal.
- Compute a non-invertible, one-way commitment of the encryption key.
A trivial mitigation looks like this:
class SoatokExampleEncryptor
{
    const NEW_ENCRYPT_KEY = 'myProtocol$encryptKey';
    const NEW_COMMITMENT = 'myProtocol$commitment';

    public function __construct(#[SensitiveParameter] private string $key) {}

    /**
     * Let's assume we're starting with a simple AES-GCM wrapper
     */
    public function legacyEncrypt(string $plaintext, string $assocData = ''): string
    {
        $nonce = random_bytes(12);
        $tag = '';
        $ciphertext = openssl_encrypt(
            $plaintext,
            'aes-256-gcm',
            $this->key,
            OPENSSL_RAW_DATA,
            $nonce,
            $tag,
            $assocData
        );
        return $nonce . $ciphertext . $tag;
    }

    /**
     * An improved function looks something like this
     */
    public function newEncrypt(string $plaintext, string $assocData = ''): string
    {
        // Avoid birthday bound issues with 256-bits of randomness
        $longerNonce = random_bytes(32);

        // Derive a subkey and synthetic nonce
        $tmp = hash_hkdf('sha512', $this->key, 44, self::NEW_ENCRYPT_KEY . $longerNonce);
        $encKey = substr($tmp, 0, 32);
        $nonce = substr($tmp, 32);

        // New: Key commitment
        $commitment = hash_hkdf('sha512', $this->key, 32, self::NEW_COMMITMENT . $longerNonce);

        // Most of this is unchanged
        $tag = '';
        $ciphertext = openssl_encrypt(
            $plaintext,
            'aes-256-gcm',
            $encKey,
            OPENSSL_RAW_DATA,
            $nonce,
            $tag,
            $assocData
        );
        return $longerNonce . $commitment . $ciphertext . $tag;
    }
}

And then the decryption logic would recalculate the commitment, and compare it with the stored value, in constant time.
It’s important that the commitment be stored with the ciphertext, rather than bundling it with the key.
(It may be worthwhile to also include the commitment in the associated data, to add a mechanism against downgrade attacks.)
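For completeness, here is what the matching decryption-side check might look like, sketched in Python with the AEAD step elided (any AES-GCM library would do for that part). The HKDF helper mirrors hash_hkdf()'s default empty salt (a block of zero bytes per RFC 5869), and the labels reuse the hypothetical constants from the example above.

```python
import hashlib
import hmac

NEW_COMMITMENT = b'myProtocol$commitment'

def hkdf_sha512(key: bytes, length: int, info: bytes) -> bytes:
    """HKDF extract-then-expand (RFC 5869), empty salt -> 64 zero bytes."""
    prk = hmac.new(b'\x00' * 64, key, hashlib.sha512).digest()
    okm, block, counter = b'', b'', 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha512).digest()
        okm += block
        counter += 1
    return okm[:length]

def check_commitment(key: bytes, blob: bytes) -> bytes:
    """Split the stored blob, recompute the commitment, compare in constant time."""
    longer_nonce, stored = blob[:32], blob[32:64]
    expected = hkdf_sha512(key, 32, NEW_COMMITMENT + longer_nonce)
    if not hmac.compare_digest(expected, stored):
        raise ValueError('key commitment mismatch: wrong key for this ciphertext')
    return blob[64:]  # remaining bytes: ciphertext || tag, for the AEAD layer

# A sender using `key` stores: longerNonce || commitment || ciphertext || tag
key = b'\x01' * 32
nonce = b'\x02' * 32
blob = nonce + hkdf_sha512(key, 32, NEW_COMMITMENT + nonce) + b'ciphertext-and-tag'
assert check_commitment(key, blob) == b'ciphertext-and-tag'
```

Only the holder of the correct key can produce a blob that passes this check, which is precisely the property the polynomial MAC alone does not give you.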
The Lesson to Learn
If you’re building a network protocol that uses AEAD to encrypt data over an insecure network (e.g., WireGuard), keep up the good work.
If you’re doing anything more involved than that, at the application layer, pause for a moment and consider whether your system will ever need multiple valid symmetric keys at once.
And, if the answer is “yes”, then you should always explicitly add a key-commitment mechanism to your system design.
(Hire a cryptographer if you’re not sure how to proceed.)
In my opinion, hemming and hawing over whether there’s a significant impact to the Invisible Salamanders issue is a worse use of your time than just solving it directly.
Eventually, I expect a new generation of AEAD modes will be standardized that explicitly provide key-commitment.
When these new designs are standardized, widely supported, and sufficiently trusted by experts, feel free to update my advice to “prefer using those modes” instead.
Header art: Harubaki, CMYKat, and a photo by Brian Gratwicke. Poorly photoshopped by myself.
https://soatok.blog/2024/09/10/invisible-salamanders-are-not-what-you-think/
#AEAD #AESGCM #InvisibleSalamanders #randomKeyRobustness #symmetricCryptography
- Generate two keys.
-
Ever since the Invisible Salamanders paper was published, there has been a quiet renaissance within my friends and colleagues in applied cryptography for studying systems that use Authenticated Encryption with Associated Data (AEAD) constructions, understanding what implicit assumptions these systems make about the guarantees of the AEAD mode they chose to build upon, and the consequences of those assumptions being false.
I’ve discussed Invisible Salamanders several times throughout this blog, from my criticisms of AES-GCM and XMPP + OMEMO to my vulnerability disclosures in Threema.
Five years after Invisible Salamanders, it’s become clear to me that many software developers do not fully appreciate the underlying problem discussed in the Invisible Salamanders paper, even when I share trivial proof-of-concept exploits.
Background
Fast AEAD constructions based on polynomial MACs, such as AES-GCM and ChaCha20-Poly1305, were designed to provide confidentiality and integrity for the plaintext data, and optionally integrity for some additional associated data, in systems where both parties already negotiated one shared symmetric key.
The integrity goals of the systems that adopted these AEAD constructions were often accompanied by performance goals–usually to prevent Denial of Service (DoS) attacks in networking protocols. Verification needed to be very fast and consume minimal resources.
In this sense, AEAD constructions were an incredible success. So successful, in fact, that most cryptographers urge application developers to use one of the fast AEAD modes as the default suggestion without looking deeper at the problem being solved. This is a good thing, because most developers will choose something stupid like ECB mode in the absence of guidance from cryptographers, and AEAD modes are much, much safer than any hand-rolled block cipher modes.
The problem is, that one tiny little assumption that both parties (sender, recipient) for a communication have agreed on exactly one symmetric key for use in the protocol.
Fast MACs Are Not Key-Committing
Cryptographers have concluded that AEAD constructions based on polynomial MACs–while great for performance and rejection of malformed packets without creating DoS risks–tend to make the same assumption. This is even true of misuse-resistant modes like AES-GCM-SIV and extended-nonce constructions like XSalsa20-Poly1305.
When discussing this implicit assumption of only one valid key in the systems that use these AEAD modes, we say that the modes are not key-committing. This terminology is based on what happens when this assumption is false.
Consequently, you can take a single, specially crafted ciphertext (with an authentication tag) and decrypt it under multiple different keys. The authentication tags will be valid for all keys, and the plaintext will be different.
Art: SwizzWhat does this look like in practice?
Consider my GCM exploit, which was written to generate puzzle ciphertexts for the DEFCON Furs badge challenge a few years ago. How it works is conceptually simple (although the actual mechanics behind step 4 is a bit technical):
- Generate two keys.
There’s nothing special about these keys, or their relationship to each other, and can be totally random. They just can’t be identical or the exploit is kind of pointless.
- Encrypt some blocks of plaintext with key1.
- Encrypt some more blocks of plaintext with key2.
- Calculate a collision block from the ciphertext in the previous two steps–which is just a bit of polynomial arithmetic in GF(2^128)
- Return the ciphertext (steps 2, 3, 4) and authentication tag calculated over them (which will collide for both keys).
A system that decrypts the output of this exploit under key1 will see some plaintext, followed by some garbage, followed by 1 final block of garbage.
If the same system decrypts under key2, it will see some garbage, followed by some plaintext, followed by 1 final block of garbage.
For many file formats, this garbage isn’t really a problem. Additionally, a bit more precomputation allows you to choose garbage that will be more advantageous to ensuring both outputs are accepted as “valid” by the target system.
For example, choosing two keys and a targeted nonce may allow both the valid plaintext and garbage blocks to begin with a PDF file header.
If you’re familiar with the file polyglot work of Ange Albertini, you can use this to turn the Invisible Salamanders problem into an artform.
And this is just the simple attack!
The Invisible Salamanders paper outlined a more advanced variant (with a proof of concept) in Section 3.2, which doesn’t suffer from nearly as much garbage data as the simple attack.
As Bruce Schneier often says, “Attacks only get better, they never get worse.”
Why is it called Invisible Salamanders?
The proof-of-concept used in the paper involved sending one picture (of a salamander) over an end-to-end encrypted messaging app, but when the recipient flagged it as abusive, the moderator saw a different picture.
https://www.youtube.com/watch?v=3M1jIO-jLHI
Thus, the salamander was invisible to the moderators of the encrypted messaging app.
As for the choice of a “salamander”, I’ve been told by friends familiar with the research that was inspired by the original name of the Signal Protocol being “Axolotl”.
But, like, who cares about these details besides me? It’s a cute and memorable name.
What are the consequences of violating the “one key” assumption?
That depends entirely on what your system does!
In Database Cryptography Fur the Rest of Us, I discussed the use of AEAD modes to prevent confused deputy attacks. This works great, but if you’re building an application that supports multi-tenancy, you suddenly have to care about this issue again.
An earlier design for OPAQUE, a password authenticated key exchange algorithm, was broken by a partitioning oracle attack due to building atop AEAD modes that are not key-committing. This let an attacker recover passwords from Shadowsocks proxy servers with a complexity similar to a binary search algorithm.
These are two very different impacts from the same weakness, which I believe is a significant factor for why the Invisible Salamanders issue isn’t more widely understood.
Sometimes violating the “one key” assumption that went into fast AEAD modes based on Polynomial MACs completely destroys the security of your system.
Other times, it opens the door for a high-complexity but low-impact behavior that simply violates the principle of least astonishment but doesn’t buy the attacker anything useful.
They Just Don’t Get It
The Invisible Salamanders issue is relevant in any system that uses symmetric-key encryption where more than one key can be valid.
This includes, but is not limited to:
- Multi-tenant data warehouses
- Group messaging protocols
- It’s sometimes tempting to discount group messaging as a relevant consideration if your experience is “emulated groups atop 1-to-1 messaging”, but there are protocols that establish a Group Key (i.e., RFC 9420) and then use that for all group messages.
- Envelope encryption schemes with multiple wrapping keys
- Bearer tokens (such as JSON Web Tokens) in systems that utilize Key IDs
Systems can mitigate this issue by introducing an explicit key commitment scheme (based on a cryptographic hash rather than a polynomial MAC) or by using a committing cipher mode (such as AES + HMAC, if done carefully).
However, most of the time, this advice falls on deaf ears whenever this concern is brought up by a cryptography engineer who’s more aware of this issue.
“Abuse reporting? We don’t have no stinking abuse reporting!”
The most common misunderstanding is, “We don’t have a report abuse feature, so this issue doesn’t affect us.”
This is because the Invisible Salamanders talk and paper focused on how it could be leveraged to defeat abuse reporting tools and bypass content moderation.
In my experience, many security teams would read the paper and conclude that it only impacts abuse reporting features and not potentially all systems that allow multiple symmetric keys in a given context.
Another Exploit Scenario
Imagine you’re building a Data Loss Prevention product that integrates with corporate file-sharing and collaboration software (e.g. ownCloud) for small and medium businesses.
One day, someone decides to ship an end-to-end encryption feature to the file-sharing software that uses AES-GCM to encrypt files, and then encrypts the keys to each recipient’s public key. This is basically the envelope encryption use-case above.
So, you dutifully update your integration to act as another “user”, whose public key must be included in all E2EE transfers, and will block download of ciphertexts it cannot decrypt OR contains sensitive information.
And this works, until an insider threat clever enough to abuse the Invisible Salamanders issue comes along.
In order for said insider threat (e.g., a senior business analyst) to leak sensitive data (e.g., anything that would be useful for illegal insider trading) to another person that shouldn’t have access to it (e.g., a store clerk that’s talking to the press), they just have to do this:
- Encrypt the data they want to exfiltrate using key1.
- Encrypt some innocuous data that won’t trigger your DLP product, using key2.
- Ensure that both messages encrypt to the same ciphertext and authentication tag.
- Give their recipient key1, give everyone else (including your DLP software) key2.
Bam! File leaked, and everyone’s none the wiser, until it’s too late. Let’s actually imagine what happens next:
A random store clerk has leaked sensitive data to the press that only a few analysts had access to.
The only communication between the analyst and the store clerk is a file that was shared to all employees, using the E2EE protocol. No emails or anything else were identified.
Your DLP product didn’t identify any other communications between these two, but somehow the store clerk has the data on their desktop.
A detailed forensics analysis may eventually figure out what happened, but by then, the damage is done and your product’s reputation is irrecoverably damaged.
All because the hypothetical E2EE protocol didn’t include a key-commitment mechanism, and nobody identified this deficit in their designs.
This isn’t to endorse DLP solutions at all, but rather, to highlight one of the many ways that the Invisible Salamander issue can be used creatively by clever attackers.
Art: AJ“Couldn’t you do the same with steganography?”
No, the attack is very different from stego.
Stego is about hiding a message in plain sight, so that only the person that knows where/how to look can find it.
The Invisible Salamanders attack lets you send one ciphertext through a network then selectively decrypt it to one of two plaintexts, depending on which key you reveal to each participant.
In the Invisible Salamanders paper and talk, they used this to send “abusive” messages to a recipient that the moderator would not see. Thus, invisible.
In one, the message is always emitted to anyone who knows how to find it. In the other, the attacker selects which you see, even if you have mechanisms to ensure you’re seeing the same ciphertext. It’s not a subtle difference.
Mitigation Techniques
There are multiple ways to mitigate the risk of Invisible Salamanders in a cryptosystem.
- Use HMAC, or (failing that) something built atop cryptographic hash functions, rather than a Polynomial MAC.
- Use an AEAD cipher designed with multi-recipient integrity as a security goal.
- Compute a non-invertible, one-way commitment of the encryption key.
A trivial mitigation looks like this:
class SoatokExampleEncryptor
{
    const NEW_ENCRYPT_KEY = 'myProtocol$encryptKey';
    const NEW_COMMITMENT = 'myProtocol$commitment';

    public function __construct(
        #[SensitiveParameter] private string $key
    ) {}

    /**
     * Let's assume we're starting with a simple AES-GCM wrapper
     */
    public function legacyEncrypt(string $plaintext, string $assocData = ''): string
    {
        $nonce = random_bytes(12);
        $tag = '';
        $ciphertext = openssl_encrypt(
            $plaintext,
            'aes-256-gcm',
            $this->key,
            OPENSSL_RAW_DATA,
            $nonce,
            $tag,
            $assocData
        );
        return $nonce . $ciphertext . $tag;
    }

    /**
     * An improved function looks something like this
     */
    public function newEncrypt(string $plaintext, string $assocData = ''): string
    {
        // Avoid birthday bound issues with 256 bits of randomness
        $longerNonce = random_bytes(32);

        // Derive a subkey and synthetic nonce
        $tmp = hash_hkdf('sha512', $this->key, 44, self::NEW_ENCRYPT_KEY . $longerNonce);
        $encKey = substr($tmp, 0, 32);
        $nonce = substr($tmp, 32);

        // New: Key commitment
        $commitment = hash_hkdf('sha512', $this->key, 32, self::NEW_COMMITMENT . $longerNonce);

        // Most of this is unchanged
        $tag = '';
        $ciphertext = openssl_encrypt(
            $plaintext,
            'aes-256-gcm',
            $encKey,
            OPENSSL_RAW_DATA,
            $nonce,
            $tag,
            $assocData
        );
        return $longerNonce . $commitment . $ciphertext . $tag;
    }
}

And then the decryption logic would recalculate the commitment, and compare it with the stored value, in constant time.
It’s important that the commitment be stored with the ciphertext, rather than bundling it with the key.
(It may be worthwhile to also include the commitment in the associated data, to add a mechanism against downgrade attacks.)
The Lesson to Learn
If you’re building a network protocol that uses AEAD to encrypt data over an insecure network (e.g., WireGuard), keep up the good work.
If you’re doing anything more involved than that, at the application layer, pause for a moment and consider whether your system will ever need multiple valid symmetric keys at once.
And, if the answer is “yes”, then you should always explicitly add a key-commitment mechanism to your system design.
(Hire a cryptographer if you’re not sure how to proceed.)
In my opinion, hemming and hawing over whether there’s a significant impact to the Invisible Salamanders issue is a worse use of your time than just solving it directly.
Eventually, I expect a new generation of AEAD modes will be standardized that explicitly provide key-commitment.
When these new designs are standardized, widely supported, and sufficiently trusted by experts, feel free to update my advice to “prefer using those modes” instead.
Header art: Harubaki, CMYKat, and a photo by Brian Gratwicke. Poorly photoshopped by myself.
https://soatok.blog/2024/09/10/invisible-salamanders-are-not-what-you-think/
#AEAD #AESGCM #InvisibleSalamanders #randomKeyRobustness #symmetricCryptography
-
Ever since the Invisible Salamanders paper was published, there has been a quiet renaissance among my friends and colleagues in applied cryptography: studying systems that use Authenticated Encryption with Associated Data (AEAD) constructions, understanding what implicit assumptions these systems make about the guarantees of the AEAD mode they chose to build upon, and examining the consequences of those assumptions being false.
I’ve discussed Invisible Salamanders several times throughout this blog, from my criticisms of AES-GCM and XMPP + OMEMO to my vulnerability disclosures in Threema.
Five years after Invisible Salamanders, it’s become clear to me that many software developers do not fully appreciate the underlying problem discussed in the Invisible Salamanders paper, even when I share trivial proof-of-concept exploits.
Background
Fast AEAD constructions based on polynomial MACs, such as AES-GCM and ChaCha20-Poly1305, were designed to provide confidentiality and integrity for the plaintext data, and optionally integrity for some additional associated data, in systems where both parties already negotiated one shared symmetric key.
The integrity goals of the systems that adopted these AEAD constructions were often accompanied by performance goals, usually to prevent Denial of Service (DoS) attacks in networking protocols. Verification needed to be very fast and consume minimal resources.
In this sense, AEAD constructions were an incredible success. So successful, in fact, that most cryptographers urge application developers to use one of the fast AEAD modes as the default suggestion without looking deeper at the problem being solved. This is a good thing, because most developers will choose something stupid like ECB mode in the absence of guidance from cryptographers, and AEAD modes are much, much safer than any hand-rolled block cipher modes.
The problem is that one tiny little assumption: that both parties (sender and recipient) in a communication have agreed on exactly one symmetric key for use in the protocol.
Fast MACs Are Not Key-Committing
Cryptographers have concluded that AEAD constructions based on polynomial MACs–while great for performance and rejection of malformed packets without creating DoS risks–tend to make the same assumption. This is even true of misuse-resistant modes like AES-GCM-SIV and extended-nonce constructions like XSalsa20-Poly1305.
When discussing this implicit assumption of only one valid key in the systems that use these AEAD modes, we say that the modes are not key-committing. This terminology is based on what happens when this assumption is false.
Consequently, you can take a single, specially crafted ciphertext (with an authentication tag) and decrypt it under multiple different keys. The authentication tags will be valid for all keys, and the plaintext will be different.
Art: Swizz
What does this look like in practice?
Consider my GCM exploit, which was written to generate puzzle ciphertexts for the DEFCON Furs badge challenge a few years ago. How it works is conceptually simple (although the actual mechanics behind step 4 are a bit technical):
- Generate two keys.
There’s nothing special about these keys or their relationship to each other; they can be totally random. They just can’t be identical, or the exploit is kind of pointless.
- Encrypt some blocks of plaintext with key1.
- Encrypt some more blocks of plaintext with key2.
- Calculate a collision block from the ciphertexts in the previous two steps, which is just a bit of polynomial arithmetic in GF(2^128).
- Return the ciphertext (steps 2, 3, 4) and authentication tag calculated over them (which will collide for both keys).
A system that decrypts the output of this exploit under key1 will see some plaintext, followed by some garbage, followed by 1 final block of garbage.
If the same system decrypts under key2, it will see some garbage, followed by some plaintext, followed by 1 final block of garbage.
For many file formats, this garbage isn’t really a problem. Additionally, a bit more precomputation allows you to choose garbage that will be more advantageous to ensuring both outputs are accepted as “valid” by the target system.
For example, choosing two keys and a targeted nonce may allow both the valid plaintext and garbage blocks to begin with a PDF file header.
If you’re familiar with the file polyglot work of Ange Albertini, you can use this to turn the invisible salamanders problem into an artform.
Why is it called Invisible Salamanders?
The proof-of-concept used in the paper involved sending one picture (of a salamander) over an end-to-end encrypted messaging app, but when the recipient flagged it as abusive, the moderator saw a different picture.
https://www.youtube.com/watch?v=3M1jIO-jLHI
Thus, the salamander was invisible to the moderators of the encrypted messaging app.
As for the choice of a “salamander”, I’ve been told by friends familiar with the research that it was inspired by the Signal Protocol’s original name, “Axolotl”.
But, like, who cares about these details besides me? It’s a cute and memorable name.
What are the consequences of violating the “one key” assumption?
That depends entirely on what your system does!
In Database Cryptography Fur the Rest of Us, I discussed the use of AEAD modes to prevent confused deputy attacks. This works great, but if you’re building an application that supports multi-tenancy, you suddenly have to care about this issue again.
An earlier design for OPAQUE, a password authenticated key exchange algorithm, was broken by a partitioning oracle attack due to building atop AEAD modes that are not key-committing. This let an attacker recover passwords from Shadowsocks proxy servers with a complexity similar to a binary search algorithm.
These are two very different impacts from the same weakness, which I believe is a significant factor for why the Invisible Salamanders issue isn’t more widely understood.
Sometimes violating the “one key” assumption that went into fast AEAD modes based on Polynomial MACs completely destroys the security of your system.
Other times, it opens the door for a high-complexity but low-impact behavior that simply violates the principle of least astonishment but doesn’t buy the attacker anything useful.
They Just Don’t Get It
The Invisible Salamanders issue is relevant in any system that uses symmetric-key encryption where more than one key can be valid.
This includes, but is not limited to:
- Multi-tenant data warehouses
- Group messaging protocols
- Envelope encryption schemes with multiple wrapping keys
- Bearer tokens (such as JSON Web Tokens) in systems that utilize Key IDs
Systems can mitigate this issue by introducing an explicit key commitment scheme (based on a cryptographic hash rather than a polynomial MAC) or by using a committing cipher mode (such as AES + HMAC, if done carefully).
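As a rough sketch of the hash-based option (the label and function names here are illustrative, not from any particular library), the commitment is an HMAC over the key and a per-message nonce, stored with the ciphertext and checked in constant time:

```python
import hashlib
import hmac
import os

# Hypothetical domain-separation label; any fixed protocol string works.
COMMITMENT_LABEL = b"example-protocol/commitment"

def commit_key(key: bytes, nonce: bytes) -> bytes:
    """One-way commitment to `key`, bound to this message's nonce.
    Store it alongside the ciphertext (never alongside the key)."""
    return hmac.new(key, COMMITMENT_LABEL + nonce, hashlib.sha256).digest()

def verify_commitment(key: bytes, nonce: bytes, stored: bytes) -> bool:
    """Constant-time check performed before accepting a decryption."""
    return hmac.compare_digest(commit_key(key, nonce), stored)

# Two distinct keys cannot (except with negligible probability) share a
# commitment, so a multi-key "salamander" ciphertext fails under key2:
nonce = os.urandom(32)
key1, key2 = os.urandom(32), os.urandom(32)
tag = commit_key(key1, nonce)
assert verify_commitment(key1, nonce, tag)
assert not verify_commitment(key2, nonce, tag)
```

Because the commitment is derived from a cryptographic hash rather than a polynomial MAC, an attacker cannot craft one ciphertext that verifies under two different keys.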
However, most of the time, this advice falls on deaf ears whenever this concern is brought up by a cryptography engineer who’s more aware of this issue.
“Abuse reporting? We don’t have no stinking abuse reporting!”
The most common misunderstanding is, “We don’t have a report abuse feature, so this issue doesn’t affect us.”
This is because the Invisible Salamanders talk and paper focused on how it could be leveraged to defeat abuse reporting tools and bypass content moderation.
In my experience, many security teams would read the paper and conclude that it only impacts abuse reporting features and not potentially all systems that allow multiple symmetric keys in a given context.
Another Exploit Scenario
Imagine you’re building a Data Loss Prevention product that integrates with corporate file-sharing and collaboration software (e.g. ownCloud) for small and medium businesses.
One day, someone decides to ship an end-to-end encryption feature to the file-sharing software that uses AES-GCM to encrypt files, and then encrypts the keys to each recipient’s public key. This is basically the envelope encryption use-case above.
To keep working, you update your integration to act as another “user” whose public key must be included in all E2EE transfers, and which will block the download of any ciphertext it cannot decrypt OR that contains sensitive information.
And this works, until an insider threat clever enough to abuse the Invisible Salamanders issue comes along.
In order for said insider threat (e.g., a senior business analyst) to leak sensitive data (e.g., anything that would be useful for illegal insider trading) to another person that shouldn’t have access to it (e.g., a store clerk that’s talking to the press), they just have to do this:
- Encrypt the data they want to exfiltrate using key1.
- Encrypt some innocuous data that won’t trigger your DLP product, using key2.
- Ensure that both messages encrypt to the same ciphertext and authentication tag.
- Give their recipient key1, give everyone else (including your DLP software) key2.
https://soatok.blog/2024/09/10/invisible-salamanders-are-not-what-you-think/
#AEAD #AESGCM #InvisibleSalamanders #randomKeyRobustness #symmetricCryptography
-
Four years ago, I wrote a (surprisingly popular) blog post about the notion of wear-out for symmetric encryption schemes.
Two years ago, I wrote a thing about extending the nonce used by AES-GCM without introducing foot-guns. This was very recently referenced in one of Filippo Valsorda’s Cryptography Dispatches, which referenced alternative designs in the same vein.
As I was reading Filippo’s newsletter, it occurred to me that a dedicated treatment to how cryptographers discuss the Birthday Bound for symmetric encryption would be beneficial.
Birthday Bound?
The Birthday Bound is rooted in the so-called Birthday Paradox in probability theory. It goes something like this:
How many people (chosen from a uniform random distribution) would you need to have in the same room in order for there to be a 50% or greater chance that two of them have the same birthday?
Even though there are 366 possible calendar days in the year (365 if you ignore leap years), the answer is only 23.
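The figure of 23 can be sanity-checked with the standard birthday approximation, p ≈ 1 − exp(−q(q−1)/2N), for q samples drawn from N possible values:

```python
import math

def collision_probability(samples: int, space: int) -> float:
    """Approximate chance of at least one collision among `samples`
    uniform draws from `space` possible values (birthday approximation)."""
    return 1.0 - math.exp(-samples * (samples - 1) / (2.0 * space))

# 23 people, 365 possible birthdays: roughly a coin flip.
assert 0.49 < collision_probability(23, 365) < 0.52
```

The same function applies to nonce spaces; just pass a much larger `space`.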
This observation can tell us something interesting about the collision risk in discrete uniformly random samples.
For example, consider the random number (called an IV in this case) used to encrypt a message with AES-CBC, which is a 128-bit random value. This means that there are 2^128 possible values. We can describe this situation generally for any distribution of N possible values; in this case, N = 2^128.
For any random distribution (i.e., random nonces or tweaks for an AEAD scheme) of N possible values, you expect to have a 50% chance of collision after about sqrt(N) queries. This is a loose, order-of-magnitude estimate.
From our earlier AES-CBC example, this means that after about sqrt(2^128) = 2^64 random IVs, we’d expect a 50% chance of two of them colliding.
My symmetric wear-out blog post went in-depth about the particulars for specific designs, if you’d like to know how the numbers play out.
Is 50/50 Good Enough?
A security policy that cuts off at a 50% chance of a nonce reuse is not fit for the real world. We would call that a YOLO security policy.
AES-GCM, which accepts a 96-bit nonce, is considered unsafe to use for more than 2^32 random nonces. At this point, the probability of a collision for each subsequent message is greater than or equal to 2^-32.
Consequently, many people consider 2^-32 the point at which the risk exceeds the Birthday Bound for a block cipher mode; after which, a new encryption key must be used.
I certainly don’t fault any security policy that keeps the risk of a bad outcome below the 2^-32 mark, but I think there’s another way to interpret what NIST did with AES-GCM. Namely, the fact that GCM’s risk of 2^-k is exceeded after about 2^k samples (for the fixed value k = 32) offers a nice symmetry to the risk analysis.
Three Birthday Bounds
(I was originally trying to find a wordplay that references the children’s story Six-Dinner Sid for this section, but ran out of steam and pressed publish before finding an appropriate parody.)
Soatok Editorial Note
This gives rise to three different possible values for a given random distribution of N possible values that can be considered whenever someone says the phrase “birthday bound”.
- A 50% collision risk after sqrt(N) samples, which I like to think of as the Hard Birthday Bound.
- A 2^-32 collision risk after sqrt(N) / 2^16 samples, which I like to call the Soft Birthday Bound.
- A collision risk of 1/M after M = N^(1/3) samples, which I like to call the Optimal Birthday Bound.
(These labels are not official; a better naming convention may be worth considering, should this framework for discussion prove useful at all.)
Since we’re still dealing with approximations, a useful technique for calculating the Optimal Birthday Bound is to just take the cube root of N (a.k.a., divide the exponent by 3).
In the case of AES-GCM, the Soft and Optimal values are equivalent.
In the case of Filippo’s XAES-256-GCM design? They differ quite a bit.
- Soft: 2^80 messages for a 2^-32 collision probability.
- Optimal: 2^64 messages for a 2^-64 collision probability.
Whereas the Soft and Optimal limits for a 96-bit nonce are the same, with a 192-bit nonce, they differ a lot: Soft is about 65,536 times the size of Optimal.
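Treating all three bounds as order-of-magnitude estimates, the exponent arithmetic above can be sketched in a few lines (a rough calculator, not from any standard):

```python
def birthday_bounds(nonce_bits: int) -> dict:
    """Order-of-magnitude message limits for a nonce space N = 2**nonce_bits.
    hard:    ~50% collision risk at sqrt(N)
    soft:    ~2^-32 collision risk at sqrt(N) / 2^16
    optimal: ~1/M collision risk after M = N^(1/3) messages"""
    return {
        "hard": 2 ** (nonce_bits // 2),
        "soft": 2 ** (nonce_bits // 2 - 16),
        "optimal": 2 ** (nonce_bits // 3),
    }

# AES-GCM (96-bit nonces): Soft and Optimal coincide at 2^32 messages.
assert birthday_bounds(96)["soft"] == birthday_bounds(96)["optimal"] == 2**32
# A 192-bit nonce: Soft (2^80) is 2^16 = 65,536 times Optimal (2^64).
assert birthday_bounds(192)["soft"] == 2**16 * birthday_bounds(192)["optimal"]
```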
Does Any Of This Matter?
If your accepted risk is hard-coded at 2^-32, then you don’t need to pass Go or collect $200, because your security policy saves you the headache of having to care about math nerd stuff.
However, I do think that the Optimal Birthday Bound is more useful for risk analysis for one simple reason: This is the point at which the probability of a collision is inversely proportional to the number of samples.
Being able to say, “Encrypting M messages under the same key incurs a probability of a nonce collision of approximately 1 in M,” is much more deeply satisfying than focusing on 2^-32 as an arbitrary safety limit.
Of course, to most people, 2^-32 and 2^-64 are both such ridiculously small numbers that they “might as well be zero”.
Whenever designing extended-nonce constructions atop AEAD block cipher modes, it may be worthwhile to discuss both the Soft and Optimal limits for nonce collisions.
So long as the people designing extended-nonce constructions at least consider doing this, my work here is done.
CMYKat made this
Can we get a useful table?
Of course!
Generally, you don’t want to consider a probability space smaller than 2^96 (which is the inflection point where Optimal becomes larger than Soft).
Probability Space | Hard Bound | Soft Bound | Optimal Bound
2^96              | 2^48       | 2^32       | 2^32
2^128             | 2^64       | 2^48       | ~2^42
2^160             | 2^80       | 2^64       | ~2^53
2^192             | 2^96       | 2^80       | 2^64
2^256             | 2^128      | 2^112      | ~2^85
The collision probability for any Optimal Bound is 1 divided by the number of messages. To use this table, select a probability space from the left column. Then, on the same row, find the cell corresponding to the type of bound you are interested in.
https://soatok.blog/2024/07/01/blowing-out-the-candles-on-the-birthday-bound/
#birthdayAttack #birthdayBound #blockCipherModes #symmetricCryptography
-
If you’ve never heard of NIST SP 800-108 before, or NIST Special Publications in general, here’s a quick primer:
Special Publications are a type of publication issued by NIST. Specifically, the SP 800-series reports on the Information Technology Laboratory’s research, guidelines, and outreach efforts in computer security, and its collaborative activities with industry, government, and academic organizations. These documents often support FIPS (Federal Information Processing Standards).
Via NIST.gov
One of the NIST 800-series documents concerned with Key Derivation using Pseudorandom Functions is NIST SP 800-108, first published in 2009.
In October 2021, NIST published a draft update to NIST SP 800-108 and opened a comment period until January 2022. This update mostly included Keccak-based Message Authentication Codes (KMAC) in addition to the incumbent standardized designs (HMAC and CMAC).
Upon reviewing a proposal for NIST SP 800-108 revision 1 after its comment period opened, Amazon’s cryptographers discovered a novel security issue with the standard.
I was a co-author of the public comment that disclosed this issue, along with Matthew Campagna, Panos Kampanakis, and Adam Petcher, but take no credit for its discovery.
Consequently, Section 6.7 was added to the final revision 1 of the standard to address Key Control Security.
This post examines the attack against the initial SP 800-108 design when AES-CMAC is used as the PRF in KDF Counter mode.
This meme is the TL;DR of this blog post
Preliminaries
(If you’re in a hurry, feel free to skip to the attack.)
NIST SP 800-108 specifies a “KDF in Counter Mode” that can be used with several PRFs, including AES-CMAC. It’s worth noting that this family of KDFs can be defined to use any arbitrary PRF, but only the PRFs approved by NIST for this use are recommended.
AES-CMAC is a one-key CBC-MAC construction. Some cryptographers, such as Matt Green, are famously not fond of CBC-MAC.
KDF Security and PRF Security
Yes, I will take any excuse to turn cryptography knowledge into wholesome memes.
KDF stands for “Key Derivation Function”.
PRF stands for “Pseudo-Random Function”.
The security notion for KDF Security is stronger than PRF Security.
PRFs require a uniformly-distributed secret key, while KDFs can tolerate a key that is not uniformly random.
This matters if you’re, say, trying to derive symmetric encryption keys from a Diffie-Hellman shared secret of some sort, where the output of your DH() function has some algebraic structure.
Realistically, the difference between the two security notions matters a lot less in scenarios where you’re deriving sub-keys from a primary uniformly random cryptographic secret.
However, it does make your proofs nicer to achieve KDF security instead of merely PRF security.
Key Control Security
Let’s pretend, for simplicity, we have a generic KDF() function that offers KDF Security. We don’t need to know how it works just yet.
Because KDFs are thought of as PRFs, but stronger, it seems perfectly reasonable that you could use KDF() in a setup where multiple inputs are provided, each from a different party, and the output would always be uniformly random.
Further, even if all other parties’ inputs are known, it should remain computationally infeasible for one of the parties to influence the output of KDF() to produce a specific value; e.g., a key with all bits zeroed.
The assumption that this result is computationally infeasible when working with KDF() is referred to as “Key Control Security”.
Loss of Key Control Security in NIST SP 800-108
You already know where this is going…
I’m going to explain the attack by way of example.
If you want a more formal treatment, I believe Appendix B of NIST SP 800-108 rev 1 has what you’re looking for.
Imagine that you’re designing an online two-party private messaging app. To ensure forward secrecy, you implement a forward-secure KDF ratchet, loosely inspired by Signal’s design.
For your KDF, you choose AES-CMAC in Counter Mode, because you’re designing for hardware that has accelerated AES instructions and want to avoid the overhead of hash functions.
(Aside: I guess this would also imply you’re most likely selecting AES-CCM for your actual message encryption.)
With each message, the sender commits some random bytes by encrypting them with their message. The recipient, after verifying the authentication tag and decrypting the message, possesses knowledge of the same random bytes.
Both parties then use the random bytes and the current symmetric key to ratchet forward to a new 128-bit symmetric key.
The million dollar question is: Is this ratcheting protocol secure?
In the case of KDF in Counter Mode with AES-CMAC, if you have more than 16 bytes of input material, the answer is simply: No.
How The Attack Works
A two-block output of this KDF is normally computed as follows (where M1 and M2 are the 16-byte input blocks, E(K, x) is AES encryption under the key K, and K1 is the CMAC finalization subkey derived from K):
- C1 := E(K, M1)
- T := E(K, C1 ⊕ M2 ⊕ K1)
- Return T
Don’t get intimidated by the notation. This is just AES encryption and XOR.
The input blocks M1 and M2 are defined in the KDF specification. In the scenario we sketched above, we assume the attacker can choose these arbitrarily.
To coerce a recipient to use an arbitrary 128-bit value (i.e., a chosen T’) as the derived key, all an attacker needs to do is:
- Calculate C1 := E(K, M1).
- Let some value X := D(K, T’), where D is AES decryption.
- Here, T’ is the target value.
- Force M2 := X ⊕ C1 ⊕ K1, so that T = E(K, C1 ⊕ M2 ⊕ K1) = E(K, X) = T’.
Notice that C1 is the result of encrypting M1, and yet the attacker’s goal in step 3 can be achieved solely by manipulating M2 (which exists independent of M1)?
That’s the vulnerability.
The public comments and Appendix B of the NIST document describe the actual steps of computing M2 to force a chosen output, which involve manipulating the structure of the KDF inputs to achieve this result.
Feel free to check out both documents if you’re interested in the finer details.
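To make the XOR-cancellation concrete without dragging in a real AES implementation, here is a toy demonstration that substitutes a random 8-bit permutation for the block cipher. The names (enc, dec, k1) are stand-ins, and this only mirrors the algebra of the attack, not real CMAC:

```python
import random

# Toy 8-bit "block cipher": a keyed pseudorandom permutation. This is NOT
# AES; it only lets us replay the algebra of CMAC's final block.
def make_cipher(seed: int):
    rng = random.Random(seed)
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for i, p in enumerate(perm):
        inv[p] = i
    return perm.__getitem__, inv.__getitem__

enc, dec = make_cipher(2024)  # enc ~ E(K, .), dec ~ D(K, .)
k1 = 0x37                     # stand-in for the CMAC finalization subkey

def toy_cmac_2block(m1: int, m2: int) -> int:
    """Two-block CMAC shape: E(K, E(K, M1) xor M2 xor K1)."""
    return enc(enc(m1) ^ m2 ^ k1)

# The attacker knows the key (they're a protocol participant) and wants
# the derived value to equal a chosen target:
target = 0xA5
m1 = 0x01                        # fixed by the protocol (counter, label, ...)
m2 = dec(target) ^ enc(m1) ^ k1  # attacker-controlled final block
assert toy_cmac_2block(m1, m2) == target
```

The final block M2 cancels both C1 and K1 by XOR, leaving the cipher to encrypt exactly the preimage of the target; the same cancellation works with AES-128 and 16-byte blocks.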
What Can An Attacker Actually Do With This?
If an attacker controls both M1 and M2…
Or if an attacker knows some M1 and can control M2…
…then they can force the final KDF output to equal whatever 128-bit value they want you to use.
The most straightforward application of the loss of key control security is to introduce a backdoor into an application.
If the Underhanded Crypto Contest were still running this year, NIST SP 800-108 using AES-CMAC in Counter Mode would be an excellent basis for a contestant.
Does Anyone Actually Use NIST SP 800-108 This Way?
I’m not aware of any specific products or services that use this KDF in this way. I will update this section if someone finds any.
Is This A Deliberate Backdoor in a NIST Standard?
No.
I understand that, in the wake of Dual_EC_DRBG, there is a lot of distrust for NIST’s work on standardized cryptography.
However, I have no specific knowledge to indicate this was placed deliberately in the standard.
It is inaccurate to describe the loss of key control security in this context as a backdoor. Instead, it’s an unexpected property of the algorithms that can be used to create a clever backdoor. These are wildly different propositions.
At least, that was the case until it was disclosed to NIST in January 2022. 🙂
(I’m including an answer to this question, preemptively, in case someone overreacts when I publish this blog post. I hope it proves unnecessary, but I figured some caution was warranted.)
Mitigation Options
If you care about Key Control Security and use NIST SP 800-108, you should use HMAC or KMAC instead of CMAC. Only CMAC is impacted.
Revision 1 of NIST SP 800-108 also outlines another mitigation that involves changing the inputs to include an additional (but reusable) PRF output for every block.
This tweak makes the KDF behave more like our intuition for PRFs, but in my opinion, it’s better to avoid using CMAC entirely for KDFs.
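For illustration, a CMAC-free ratchet step for the earlier messaging example might look like this sketch, with HMAC-SHA-256 standing in for an approved HMAC-based KDF (the label string is hypothetical):

```python
import hashlib
import hmac

RATCHET_LABEL = b"example-ratchet/v1"  # hypothetical domain separator

def ratchet_step(current_key: bytes, fresh_randomness: bytes) -> bytes:
    """Derive the next 256-bit ratchet key with HMAC-SHA-256 as the PRF.
    Unlike the CMAC construction, neither party can steer this output
    toward a chosen value, preserving key control security."""
    return hmac.new(current_key, RATCHET_LABEL + fresh_randomness,
                    hashlib.sha256).digest()

k0 = bytes(32)
k1 = ratchet_step(k0, b"randomness committed in message 1")
k2 = ratchet_step(k1, b"randomness committed in message 2")
assert k1 != k2 and len(k2) == 32
```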
Why Wasn’t This Widely Publicized?
As interesting and surprising as the loss of Key Control Security in a NIST standard is to cryptography nerds, it’s not exactly like Heartbleed or Log4shell.
That said, regardless of your personal feelings on NIST, if you’re interested in not having findings like this slip through the cracks in the future, it’s generally worthwhile to pay attention to what NIST is up to.
https://scottarc.blog/2024/06/04/attacking-nist-sp-800-108/
#cybersecurity #framework #KDF #KDFSecurity #KeyDerivationFunctions #NIST #NISTSP800108 #PRFSecurity #security #standards #symmetricCryptography
-
Heads up: This is a blog post about applied cryptography, with a focus on web and cloud applications that encrypt data at rest in a database or filesystem. While the lessons can be broadly applicable, the scope of the post is not.
One of the lessons I learned during my time at AWS Cryptography (and particularly as an AWS Crypto Bar Raiser) is that the threat model for Encryption At Rest is often undefined.
Prior to consulting cryptography experts, most software developers do not have a clear and concise understanding of the risks they’re facing, let alone how or why encrypting data at rest would help protect their customers.
Unsurprisingly, over the years I’ve heard a few infosec thought-leader types insist that encryption-at-rest is security theater. I disagree with this assessment in absolute terms, but there is a nugget of truth in that assertion.
The million dollar question.
Let’s explore this subject in a little more detail.
Why should we listen to you about this topic?
(If you don’t need any convincing, feel free to skip this section.)
Encryption at rest is a particular hobby horse of mine. I previously wrote on this blog about the under-celebrated design decisions in the AWS Database Encryption SDK and the need for key-committing AEAD modes in multi-tenant data lakes.
Before my time at Amazon, I had also designed a PHP library called CipherSweet that offers a limited type of Searchable Encryption. The goal of CipherSweet was to improve the cryptography used by SuiteCRM. (The library name is, of course, a pun.)
I’ve also contributed a ton of time making cryptography easy-to-use and hard to misuse outside of the narrow use-case that is at-rest data encryption. To that end, I designed PASETO as a secure-by-default alternative to JSON Web Tokens.
I also have a lot of skin in the game when it comes to developer comprehension: I was the first Stack Overflow user with a gold badge for both [security] and [encryption], largely due to the effort I put into cleaning up the bad cryptography advice for the PHP ecosystem.
I have spent the past decade or so trying to help teams avoid security disasters in one form or another.
Why should we not listen to you about this topic?
If you happen to know a cryptography expert you trust more than some Internet stranger with a blog, I implore you to listen to them if we disagree on any point. They may know something I don’t. (That said, I’m always happy to learn something new!)
I also do not have a college degree in Cryptography, nor have I published any papers in prestigious academic journals. If you care very much about this sort of pedigree, you will likely find my words easily discarded. If this describes your situation, no hard feelings.
Why and How to use Encryption At Rest to Protect Sensitive Data
Important: I’m chiefly interested in discussing one use-case, and not focusing on other use cases. Namely, I’m focusing on encryption-at-rest in the narrow context of web applications and/or cloud services.
This is not a comprehensive blog post covering every possible use case or threat model relating to encryption at rest. Those other use cases are certainly interesting, but this post is already long enough with a narrower focus.
In particular: I’m not talking about the threats faced by activists or whistleblowers. This is a software engineering and applied cryptography focused blog post.
If you’re only interested in compliance requirements, you can probably just enable Full Disk Encryption and call it a day. Then, if your server’s hard drive grows legs and walks out of the data center, your users’ most sensitive data will remain confidential.
Unfortunately, for the server-side encryption at rest use case, that’s basically all that Disk Encryption protects against.
If your application or database software is online and an attacker gains access to it (e.g., through SQL injection), then full disk encryption offers no protection: the data might as well be plaintext to an online attacker.
It do be like that with online attacks.
Therefore, if you find yourself reaching for Encryption At Rest to mitigate the impact of the kind of vulnerability that would leak the contents of your database or filesystem to an attacker, you’re probably unwittingly engaging in security theater.
Disk Encryption is important for disk disposal and mitigating hardware theft, not preventing data leakage to online attackers.
So the next logical thing to do is draw a box around the system or component that stores a lot of data and never let plaintext cross that boundary.
What Do You Mean By “Encryption At Rest”?
Encryption At Rest is best contrasted with Encrypted In Transit.
For Encryption-in-Transit, think TLS.
For Encryption-at-Rest, think of anything a web app or cloud service would do to encrypt data before storing it in… wherever the data is actually stored.
If there’s another usage of the term to mean something else, it’s not one that I’m familiar with.
Client-Side Encryption
Note: The naming here is a little imprecise. It is client-side encryption with respect to your data warehouse (i.e. SQL database), but not with respect to the user experience of a web application. In those cases, client-side would mean on the actual end user’s device.
Instead, client-side encryption is the generic buzz-word to mean that you’re encrypting data outside of the box you drew in your system architecture. Generally, this means that you have an application server that’s acting as the “client” for the purpose of bulk data encryption.
There are a lot of software projects that aim to provide client-side encryption for data stored in a database or filesystems; e.g., in Amazon S3 buckets.
This is a step in the right direction, but implementation details matter a lot.
Quick aside: For the remainder of this blog post, I’m going to assume an architecture that looks like a traditional web application, for simplicity.
The assumed architecture looks vaguely like this:
- User Agents (e.g., web browsers) that communicate with the application server.
- Application Server(s), which respond to HTTP requests from user agents, manage key material using a KMS, and encrypt/decrypt records stored in the database.
- Database Server(s) which store ciphertext on behalf of the application server.
This is an abstract design, so the actual implementation details you encounter in the real world may be simpler or more complex in different respects.
There are other interesting design considerations for OS-level end-user device encryption that I’m not going to explore today. For example: Adiantum is extremely cool.
I’m also not going to dive deep into laptop theft or the importance of Full Disk Encryption as a mechanism for ensuring data is erased from solid state hard drives, or the activities of hostile nation states. That’s a separate discussion entirely.
Security Considerations for Client-Side Encryption
The first question to answer when data is being encrypted is, “How are the keys being managed?” This is a very deep rabbit hole of complexity, but one good answer for a centralized service is, “a cloud-based key management service with audit logging”; e.g., AWS KMS, Google Cloud KMS, etc.
We could talk about key management for a very long time, but there are other things I want to focus on, so let’s revisit that in a future blog post.
Before we begin, you may find it helpful to read my previous blog post on a related matter: Lucid Multi-Key Deputies Require Commitment. It’s not strictly necessary, but some of the terminology used there may be helpful to understanding this one.
Next, you have to understand how the data is being encrypted in the first place.
Bulk Data Encryption Techniques
Bad answer: AES in CBC mode without HMAC.
Worse answer: AES in ECB mode.
Generally, you’re going to want to use an AEAD construction, such as AES-GCM or XChaCha20-Poly1305.
For those not in the loop: AEAD is an acronym that stands for Authenticated Encryption with Associated Data.
You’ll also want key-commitment if you’re storing data for multiple customers in the same hardware. You can get this property by stapling HKDF onto your protocol (once for key derivation, again for commitment). See also: PASETO v3 and v4, or Version 2 of the AWS Encryption SDK.
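To make the “staple HKDF on twice” idea concrete, here’s a minimal sketch in Python using the pyca/cryptography library. The function names and blob layout are my own invention (not taken from PASETO or the AWS Encryption SDK): one HKDF call derives the actual encryption key, a second derives a commitment value that gets checked before decryption.

```python
import hmac
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def _derive(key: bytes, salt: bytes, info: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=salt, info=info).derive(key)

def commit_encrypt(key: bytes, plaintext: bytes, aad: bytes = b"") -> bytes:
    salt = os.urandom(32)
    enc_key = _derive(key, salt, b"encryption-key")  # HKDF once: key derivation
    commitment = _derive(key, salt, b"commitment")   # HKDF again: key commitment
    nonce = os.urandom(12)
    return salt + commitment + nonce + AESGCM(enc_key).encrypt(nonce, plaintext, aad)

def commit_decrypt(key: bytes, blob: bytes, aad: bytes = b"") -> bytes:
    salt, commitment = blob[:32], blob[32:64]
    nonce, ciphertext = blob[64:76], blob[76:]
    # Reject any key that doesn't match the stored commitment before decrypting
    if not hmac.compare_digest(commitment, _derive(key, salt, b"commitment")):
        raise ValueError("key commitment check failed: wrong key")
    return AESGCM(_derive(key, salt, b"encryption-key")).decrypt(nonce, ciphertext, aad)
```

With something like this in place, a ciphertext crafted to decrypt under two different keys fails the commitment check for at least one of them, instead of silently producing two plaintexts.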
It may be tempting to build your own custom committing AEAD scheme out of, e.g., AES-CTR and HMAC. If you do this, take extra care that you don’t introduce canonicalization risks in your MAC.
Either way, using an AEAD mode is a significant improvement over using AES directly.
Is Your Deputy Confused?
Even if you’re using IND-CCA secure encryption and managing your keys securely, there is still a very stupid attack against many data-at-rest encryption schemes.
To understand the attack, first consider this sort of scenario:
Alice and Bob use the same health insurance provider, who is storing sensitive medical records for both parties. Bob works as a database administrator for the insurance company he and Alice both use. One day, he decides to snoop on her private medical history.
Fortunately, the data is encrypted at the web application, so all of the data Bob can access is indistinguishable from random. He can access his own account and see his data through the application, but he cannot see Alice’s data from his vantage point on the database server.
Here’s the stupid simple attack that works in far too many cases: Bob copies Alice’s encrypted data, and overwrites his records in the database, then accesses the insurance provider’s web app.
Bam! Alice’s plaintext recovered.
What’s happening here is simple: The web application has the ability to decrypt different records encrypted with different keys. If you pass records that were encrypted for Alice to the application to decrypt it for Bob, and you’re not authenticating your access patterns, Bob can read Alice’s data by performing this attack.
The cryptographic attack is literally copy and paste, from the database administrator’s perspective. It’s stupid but it works against too many encryption-at-rest software projects.
In this setup, the application is the Deputy, and you can easily confuse it by replaying an encrypted blob in the incorrect context.
The mitigation is simple: Use the AAD mechanism (part of the standard AEAD interface) to bind a ciphertext to its context. This can be a customer ID, each row’s value for the primary key of the database table, or something else entirely.
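As a concrete (if simplified) sketch of that mitigation, here’s what context binding might look like in Python with the pyca/cryptography library. The choice of table name plus primary key as context is illustrative, not prescriptive:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_field(key: bytes, plaintext: bytes, table: str, row_id: int) -> bytes:
    # Bind the ciphertext to where it lives; a real design should
    # length-prefix these components to avoid canonicalization issues.
    aad = f"{table}#{row_id}".encode()
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, plaintext, aad)

def decrypt_field(key: bytes, blob: bytes, table: str, row_id: int) -> bytes:
    aad = f"{table}#{row_id}".encode()
    # Raises InvalidTag if the ciphertext was copied in from another row
    return AESGCM(key).decrypt(blob[:12], blob[12:], aad)
```

Now, if Bob pastes Alice’s ciphertext into his own row, decryption is attempted with Bob’s row ID in the AAD, and it fails authentication instead of silently handing over Alice’s plaintext.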
If you’re using AWS KMS, you can also use Encryption Context for this exact purpose.
An Illustrative Example
Let’s say you have a simple web application that encrypts data before storing it in a SQL database.
Let’s also write it to use AES-GCM, since unauthenticated CBC mode is awful.
A quick and dirty implementation might look like this:
class User {
    public function __construct(
        public readonly string $username,
        public string $email,
        public string $fullName
    ) {}
}

class UserModel {
    public function __construct(protected Database $db) {}

    public function save(User $user): bool {
        return $this->db->upsert(
            'users',
            [ // set
                'full_name' => aes128gcm_encrypt($user->fullName),
                'email' => aes128gcm_encrypt($user->email)
                // encryption details abstracted
            ],
            [ // where
                'username' => $user->username
            ]
        );
    }

    public function fetch(string $username): User {
        $row = $this->db->fetch('users', ['username' => $username]);
        return new User(
            $username,
            aes128gcm_decrypt($row['email']),
            aes128gcm_decrypt($row['full_name'])
        );
    }
}

For the abstracted aes128gcm functions in the pseudocode above, just assume they’re getting the key from KMS during encryption and storing an encrypted data key in a place the ciphertext can reference later on decrypt. I didn’t want to complicate the pseudocode with a lot of boilerplate.

You might decide to prove the confused deputy risk by doing something like this:

$model = new UserModel($db);
$model->save(new User('alice', '[email protected]', 'Alice McWonderland'));
$model->save(new User('bob', '[email protected]', 'Bob BurgerMeister'));

// Fetch Alice's and Bob's raw rows directly from the database
$aliceData = $db->fetch('users', ['username' => 'alice']);
$bobData = $db->fetch('users', ['username' => 'bob']);

// This is the attack the database server can perform:
// Replace Bob's full_name with Alice's encrypted email
$db->upsert('users', [
    'full_name' => $aliceData['email']
], [
    'username' => 'bob'
]);
$badBob = $model->fetch('bob');

Now Bob’s full name is set to Alice’s email address.
Okay, So What?
Now imagine someone performs the same attack, but against salary fields in a payroll system.
The Curious Case of CipherSweet
My knowledge of this risk didn’t manifest itself in a vacuum. It was discovered over the years of maintaining an open source library.
The first release of CipherSweet mitigated most of this risk by construction: Each field uses a different encryption key, through a key derivation scheme.
In pseudocode, this construction looks something like this:
def encryptRow(self, records):
    for field, type in self.fieldsToEncrypt:
        key = self.getFieldSymmetricKey(self.table, field)
        records[field] = encryptField(key, records[field])
Since CipherSweet’s inception, if you try to replace Alice’s encrypted zip code with Alice’s encrypted social security number, the keys would be wrong, so it would lead to a decryption failure.
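In more concrete terms, per-field key derivation might be sketched with HKDF like so (an illustrative reconstruction in Python, not CipherSweet’s actual key hierarchy):

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def field_key(master_key: bytes, table: str, field: str) -> bytes:
    # Length-prefix each component so distinct (table, field) pairs
    # can never produce the same HKDF info string.
    info = b"field-key:"
    for part in (table, field):
        encoded = part.encode("utf-8")
        info += len(encoded).to_bytes(4, "big") + encoded
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None, info=info).derive(master_key)
```

Each (table, field) pair gets its own subkey, so a ciphertext transplanted from one column to another gets decrypted under the wrong key.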
Or so I thought!
As I mentioned in my blog post about multi-tenancy and confused deputy attacks, if your AEAD mode doesn’t commit to the key used, it’s possible to craft a single (ciphertext, tag) that decrypts to two different plaintext values under two different keys.
CipherSweet’s original ModernCrypto suite used XChaCha20-Poly1305, which is not key-committing, and therefore susceptible to this sort of misuse.

This violated the Principle of Least Astonishment and motivated the development of a new algorithm suite called BoringCrypto, which used BLAKE2b-MAC instead of Poly1305. This change was released in version 3.0.0 in June 2021.

However, even with BoringCrypto in 3.0.0, this only mitigated most of the issue by construction. The last mile of complexity here is that each field must also be bound to a primary key or foreign key.

Encrypting with AAD has been possible since a very early release of CipherSweet, but being possible to use securely is not sufficient. It should be easy to use securely.
CipherSweet Version 4.7.0, which was released last month, now only requires a code change that looks like this in order to mitigate confused deputies in an application:
  $multiRowEncryptor = new EncryptedMultiRows($engine);
  $multiRowEncryptor
+     ->setAutoBindContext(true)
+     ->setPrimaryKeyColumn('table2', 'id')
      ->addTextField('table1', 'field1');

This is in addition to the new Enhanced AAD feature, which allows for flexible and powerful context binding based on other fields and/or string literals.
(In fact, this new convenience feature actually uses Enhanced AAD under-the-hood.)
This doesn’t come for free, however: Users have to know the serial / primary key for a record prior to writing it, in order to use it as AAD when encrypting fields. However, that’s a much easier pill to swallow than expecting PHP devs to manage the complexity of context-binding themselves.
As you can see, mitigating confused deputies in an encryption library (without making it unwieldy) requires a painstaking attention to detail to get right.
As Avi Douglen says, “Security at the cost of usability comes at the cost of security.”
Given the prevalence of client-side encryption projects that just phone it in with insecure block cipher modes (or ECB, which is the absence of a block cipher mode entirely), it’s highly doubtful that most of them will ever address confused deputy attacks. Even I didn’t get it right at first when I made CipherSweet back in 2018.
What about non-databases?
Everything I mentioned in the previous section was focused on confused deputy attacks against client-side encryption for information that is stored in a database, but it’s a general problem with encrypting data at rest and storing the ciphertext “server-side”.
If you’re storing encrypted data in an S3 bucket, rather than in MySQL, you still need some form of context-binding mechanism to prevent the dumb and obvious attack from working against a deputy that reads data from said S3 bucket.
If you take nothing else away from this blog post, remember: Authenticate your access patterns.
Why aren’t things better already?
As with most things in software security, the problem is either not widely known, or is not widely understood.
Unknown unknowns tend to fester, untreated, across the entire ecosystem.
Misunderstood issues often lead to an incorrect solution.
In this case, at-rest encryption mostly falls into the “not widely understood” bucket, while confused deputy attacks mostly fall into the “not widely known” one.
The most pronounced consequence of this is, when tasked with building at-rest data encryption in an application, most software developers do not have a cohesive threat model in mind (let alone a formal one).
This leads to disagreement between stakeholders about what the security requirements actually are.
How can I help improve things somewhat?
Most importantly, spread awareness of the nuances of encryption at-rest.
This blog post is intended to be a good conversation starter, but there are other resources to consider, too. I’ve linked to many of them throughout this post already.
If you’re paying for software to encrypt data at rest, ask your vendor how they mitigate the risk of confused deputy attacks. Link them to this blog post if they’re not sure what you mean.
If said vendor responds, “this risk is outside of our threat model,” ask to see their formal threat model document. If it exists and doesn’t align with your application’s threat model, maybe consider alternative solutions that provide protection against more attack classes than Full Disk Encryption would.
Finally, gaining experience with threat modeling is a good use of every developer’s time. Adam Caudill has an excellent introductory blog post on the subject.
Closing Thoughts
Despite everything I’ve written here today, I do not claim to have all the answers for encryption at rest.
However, you can unlock a lot of value just by asking the right questions. My hope is that anyone that reads this post is now capable of asking those questions.
Addendum (2024-06-03)
After I published this, the r/netsec subreddit expressed disappointment that this blog post had “no mention of” consumer device theft or countries experiencing civil unrest and pulling hard drives from data centers.
You could make a congruent complaint that it also had no mention of Batman.
To be clear, I’m not saying that the use cases and risks Reddit cares about are off-topic to any discussion of full-disk encryption. They matter.
Rather, it’s that they’re not relevant to the specific point I am making: Even in the simplest use case, far from the annoying details of end user hardware or the whims of nation states, encryption-at-rest is poorly understood by most developers, and should be thought through carefully.
Your threat model is not my threat model, and vice versa.
I never advertised this blog post as a comprehensive and complete guide to the entire subject of encryption-at-rest. If you too felt under-served by this blog post for not addressing the corner cases that really matter to you, I hope this addendum makes it clearer why I didn’t cover them.
Finally, if you feel that there’s an aspect of the encryption-at-rest topic that really warrants further examination, I invite you to blog about it.
If your blog post is interesting enough, I’ll revise this post and link to it here.
https://scottarc.blog/2024/06/02/encryption-at-rest-whose-threat-model-is-it-anyway/
#Cryptography #cybersecurity #encryption #encryptionAtRest #security #symmetricCryptography #technology
-
Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but more so inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).
At the bottom they teased:
Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.
Cendyne
If you read Cendyne’s post, you may have nodded along with that remark and not appreciate the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.
Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.
I’m including art of my fursona in this post… as is tradition for furry blogs.
If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.
Thanks to CMYKat for the awesome stickers.
Contents
- Database Cryptography?
- Cryptography for Relational Databases
- The Perils of Built-in Encryption Functions
- Application-Layer Relational Database Cryptography
- Confused Deputies
- Canonicalization Attacks
- Multi-Tenancy
- Cryptography for NoSQL Databases
- NoSQL is Built Different
- Record Authentication
- Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
- Searchable Encryption
- Order-{Preserving, Revealing} Encryption
- Deterministic Encryption
- Homomorphic Encryption
- Searchable Symmetric Encryption (SSE)
- You Can Have Little a HMAC, As a Treat
- Intermission
- Case Study: MongoDB Client-Side Encryption
- MongoCrypt: The Good
- How is Queryable Encryption Implemented?
- MongoCrypt: The Bad
- MongoCrypt: The Ugly
- MongoCrypt: The Good
- Wrapping Up
Database Cryptography?
The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.
The consequences of this simple premise are anything but simple. Let me explain.
Art: ScruffKerfluff
The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible enough to support multiple types of workloads.
Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.
But we’re not settling for mere compliance on this blog. Furries have standards, after all.
So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:
- What database software is being used? Is it up to date?
- What data is being stored in which database software?
- How are databases oriented in the network of the overall system?
- Is your database properly firewalled from the public Internet?
- How does data flow throughout the network, and when do these data flows intersect with the database?
- Which applications talk to the database? What languages are they written in? Which APIs do they use?
- How will cryptography secrets be managed?
- Is there one key for everyone, one key per tenant, etc.?
- How are keys rotated?
- Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?
The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.
(This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)
The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from what you need for relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.
Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.
If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know it wasn’t tampered with at rest.”
My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:
- On one axis, we have the type of database.
- Flat-file
- Relational
- Schema-free
- On another, we have the basic confidentiality requirements of the data.
- Field encryption
- Row encryption
- Column encryption
- Unstructured record encryption
- Encrypting entire collections of records
- Finally, we have the integrity requirements of the data.
- Field authentication
- Row/column authentication
- Unstructured record authentication
- Collection authentication (based on e.g. Sparse Merkle Trees)
And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.
Why store data in a database if you have no way to index or search the data for fast retrieval?
Credit: Harubaki
If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.
Some just phone it in with AES_Encrypt() calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)

Which brings us to the meat of this blog post: The actual cryptography part.
Cryptography is the art of transforming information security problems into key management problems.
Former coworker
Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.
Cryptography for Relational Databases
Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased from the introduction.
You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?
But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.
So let’s start with a simple question: where and how are you performing your encryption?
The Perils of Built-in Encryption Functions
MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.
It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:
Art by Riley, encrypted with AES-ECB
The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).
ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.
A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.
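You can see the failure in a few lines of Python with the pyca/cryptography library (for illustration only; never use ECB for real data): under ECB, two identical plaintext blocks encrypt to two identical ciphertext blocks, so an observer learns about the plaintext without ever touching the key.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)
block = b"ATTACK AT DAWN!!"  # exactly one 16-byte AES block
encryptor = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
ciphertext = encryptor.update(block + block) + encryptor.finalize()

# The repetition in the plaintext is visible in the ciphertext:
assert ciphertext[:16] == ciphertext[16:32]
```

A semantically secure mode (GCM, XChaCha20-Poly1305, even plain CTR with unique nonces) would produce unrelated-looking blocks for the same repeated input.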
ECB art isn’t to be confused with ECB poetry, which looks like this:
Oh little one, you’re growing up
You’ll soon be writing C
You’ll treat your ints as pointers
You’ll nest the ternary
You’ll cut and paste from github
And try cryptography
But even in your darkest hour
Do not use ECB

CBC’s BEASTly when padding’s abused
And CTR’s fine til a nonce is reused
Some say it’s a CRIME to compress then encrypt
Or store keys in the browser (or use javascript)
Diffie Hellman will collapse if hackers choose your g
And RSA is full of traps when e is set to 3
Whiten! Blind! In constant time! Don’t write an RNG!
But failing all, and listen well: Do not use ECB

They’ll say “It’s like a one-time-pad!
The data’s short, it’s not so bad
the keys are long–they’re iron clad
I have a PhD!”
And then you’re front page Hacker News
Your passwords cracked–Adobe Blues.
Don’t leave your penguins showing through,
Do not use ECB

— Ben Nagy, PoC||GTFO 0x04:13
Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.
Also, SHOW processlist; leaks your encryption keys. Oops.
Credit: CMYKat
Application-Layer Relational Database Cryptography
Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.
And, yes, that’s part of the answer. But how you encrypt is important.
Credit: Harubaki
“I’ll encrypt with CBC mode.”
If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?
“Okay, fine, I’ll use an authenticated mode like GCM.”
Did you remember to make the table and column name part of your AAD? What about the primary key of the record?
“What on Earth are you talking about, Soatok?”
Welcome to the first footgun of database cryptography!
Confused Deputies
Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.
To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?
We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.
What, then, stops an attacker from copying ciphertexts around?
Credit: CMYKat
Let’s say I have a legitimate user account with ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.
All I would need to do is simply…
UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345;

…and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and it will just decrypt. Even if you’re using an authenticated mode.
We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.
The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)
- $addr = aes_gcm_encrypt($addr, $key);
+ $addr = aes_gcm_encrypt($addr, $key, canonicalize([
+     $tableName,
+     $columnName,
+     $primaryKey
+ ]));
Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
Canonicalization Attacks
In the previous section, I introduced a pseudocode function called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.

First, consider what happens if you don’t do anything to canonicalize your data, and just join strings together and call it a day…
function dumbCanonicalize(
    string $tableName,
    string $columnName,
    string|int $primaryKey
): string {
    return $tableName . '_' . $columnName . '#' . $primaryKey;
}

Consider these two inputs to this function:

dumbCanonicalize('customers', 'last_order_uuid', 123);
dumbCanonicalize('customers_last_order', 'uuid', 123);
In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
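One common fix is to length-prefix every component, so no two distinct input tuples can serialize to the same byte string. Here’s a sketch in Python (illustrative, not from any particular library):

```python
def canonicalize(*parts) -> bytes:
    # Prefix the component count, then length-prefix each component, so
    # ("customers", "last_order_uuid", 123) and
    # ("customers_last_order", "uuid", 123) can never collide.
    out = len(parts).to_bytes(4, "big")
    for part in parts:
        encoded = str(part).encode("utf-8")
        out += len(encoded).to_bytes(8, "big") + encoded
    return out
```

Because every component carries its own length, shifting bytes between adjacent components changes the serialization, and the AAD collision from the dumbCanonicalize() example disappears.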
In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.
Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.
Multi-Tenancy
Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.
Often times, however, software developers are tasked with building support for Bring Your Own Key (BYOK).
This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).
Other times, this is driven by a need to cut costs: storing different users’ data on the same database server, but encrypting it such that each tenant can only decrypt their own records.
Two things can happen when you introduce multi-tenancy into your database cryptography designs:
- Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
- Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.
So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.
Isn’t cryptography fun?
“What Are Invisible Salamanders?”
This refers to a fun property of AEAD modes based on polynomial MACs. Basically, if you:
- Encrypt one message under a specific key and nonce.
- Encrypt another message under a separate key and nonce.
…Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.
This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature: another attached file had the same authentication tag and ciphertext, and you could trick the system by disclosing the second key instead of the first. Thus, the salamander is invisible to the abuse reporting feature.
Art: CMYKat
We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.
Cryptography for NoSQL Databases
Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.
NoSQL is Built Different
The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.
What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.
Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.
Consider the following record structure, which stores a bunch of data stored with AES in CBC mode:
{
    "encrypted-data-key": "<blob>",
    "name": "<ciphertext>",
    "address": [
        "<ciphertext>",
        "<ciphertext>"
    ],
    "social-security": "<ciphertext>",
    "zip-code": "<ciphertext>"
}

If this record is decrypted with code that looks something like this:

$decrypted = [];
// ... snip ...
foreach ($record['address'] as $i => $addrLine) {
    try {
        $decrypted['address'][$i] = $this->decrypt($addrLine);
    } catch (Throwable $ex) {
        // You'd never deliberately do this, but it's for illustration
        $this->doSomethingAnOracleCanObserve($i);
        // This is more believable, of course:
        $this->logDecryptionError($ex, $addrLine);
        $decrypted['address'][$i] = '';
    }
}

Then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.

Art: Harubaki

This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.
Record Authentication
A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.
Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.
The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.
This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.
Except…
When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.
Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.
And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)
I don’t have any specific real-world examples here that I can point to of this problem being solved well.

Art: CMYKat
Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.
As I said with encryption before, but also true here:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
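To make that rule concrete, here's a hypothetical Python sketch that authenticates each field's fully qualified path alongside its value, using a length-prefixed encoding so no two distinct records serialize to the same MAC input (all names here are mine, not from any particular library):

```python
import hmac
import hashlib
import struct

def canon(pairs: list) -> bytes:
    # Length-prefix every component so that no two distinct lists of
    # (path, value) pairs serialize to the same byte string.
    out = b""
    for path, value in sorted(pairs):
        p = path.encode("utf-8")
        out += struct.pack(">Q", len(p)) + p
        out += struct.pack(">Q", len(value)) + value
    return out

def authenticate(key: bytes, pairs: list) -> bytes:
    return hmac.new(key, canon(pairs), hashlib.sha256).digest()

key = b"\x01" * 32

# The same ciphertext stored at two different locations yields two
# different tags: where the data lives is part of its identity.
a = authenticate(key, [("users.12345.address", b"ciphertext-a")])
b = authenticate(key, [("users.99999.address", b"ciphertext-a")])
assert a != b
```

The same structure works with a digital signature in place of HMAC; only the keyed primitive changes.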
Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
Art: Harubaki

Okay, how do you solve this problem so that you can perform updates and upgrades to your schema, but without enabling attackers to downgrade the security? Here’s one possible design.
Let’s say you have two metadata fields on each record:
- A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this `meta-auth`.
- A compressed binary string representing which of the authenticated fields should also be encrypted. This field is also authenticated. It is at most the same length as the first metadata field. Let’s call this `meta-enc`.

Furthermore, you will specify a canonical field ordering for both how data is fed into the signature algorithm as well as the field mappings in `meta-auth` and `meta-enc`.

```
{
  "example": {
    "credit-card": {
      "number": /* encrypted */,
      "expiration": /* encrypted */,
      "ccv": /* encrypted */
    },
    "superfluous": {
      "rewards-member": null
    }
  },
  "meta-auth": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true   /* meta-enc */
  ]),
  "meta-enc": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false  /* example.superfluous.rewards-member */
  ]),
  "signature": /* -- snip -- */
}
```

When you go to append data to an existing record, you’ll need to update `meta-auth` to include the mapping of fields, based on this canonical ordering, to ensure only the intended fields get validated.

When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:
- New records will have the additional field flagged as authenticated in `meta-auth` (and `meta-enc` will grow).
- Old records will not, but your code will still validate them successfully.
- To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.
You might think `meta-auth` gives an attacker some advantage, but it only encodes which fields fall inside the security boundary of the signature or MAC. This allows unauthenticated data to be appended for whatever operational purpose, without having to update signatures or expose signing keys to a wider part of the network.

```
{
  "example": {
    "credit-card": {
      "number": /* encrypted */,
      "expiration": /* encrypted */,
      "ccv": /* encrypted */
    },
    "superfluous": {
      "rewards-member": null
    }
  },
  "meta-auth": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true,  /* meta-enc */
    true   /* meta-version */
  ]),
  "meta-enc": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true   /* meta-version */
  ]),
  "meta-version": 0x01000000,
  "signature": /* -- snip -- */
}
```

If an attacker tries to use the `meta-auth` field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).

Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
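The `compress_bools()` helper in the examples above is left abstract; one straightforward way to realize it is a little-endian bitmap, eight flags per byte. A Python sketch (this is only one possible encoding):

```python
def compress_bools(flags: list) -> bytes:
    # Pack 8 flags per byte; bit i of byte j holds flag number 8*j + i.
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def decompress_bools(data: bytes, count: int) -> list:
    # The caller supplies the flag count, since trailing zero bits
    # are indistinguishable from padding.
    return [bool(data[i // 8] & (1 << (i % 8))) for i in range(count)]

flags = [True, True, True, False, True]  # mirrors the meta-auth example
packed = compress_bools(flags)
assert len(packed) == 1  # five flags fit in a single byte
assert decompress_bools(packed, len(flags)) == flags
```

Whatever encoding you pick, it must be canonical (one byte string per flag list), since `meta-enc` is fed into the signature.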
Searchable Encryption
If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.
Art: CMYKat

Okay, put away the pitchforks; I will explain.
Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.
And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.
Art: Sophie

So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:
- Encrypting the data, to protect it from unauthorized disclosure, but render it useless
- Doing anything useful with the data, but leaving it unencrypted in the database
This is especially annoying for business types that are all in on the Zero Trust buzzword.
Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.
Order-{Preserving, Revealing} Encryption
On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.
[D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.
Attack of the week: searchable encryption and the ever-expanding leakage function
The problem with these designs is that they leak so much information that they no longer provide semantic security.
From Grubbs, et al. (GLMP, 2019). Colors inverted to fit my blog’s theme better.

To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.
Order revealing
Reveals much more than order
Softcore ECB

Order preserving
Semantic security?
Only in your dreams

Haiku for your consideration
Deterministic Encryption
Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.
If you recall the `AES_{De,En}crypt()` functions built into MySQL that I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.

```sql
SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');
```

However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can safely encrypt a small number of distinct records before you’re no longer secure.

From Page 14 of the linked paper.

That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.
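The equality leakage of deterministic encryption is easy to see even without real AES. In this Python sketch, a keyed PRF stands in for the deterministic cipher, because from the database's point of view the leakage is identical:

```python
import hmac
import hashlib

def det_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Toy stand-in for deterministic encryption (e.g. AES-GCM-SIV with a
    # static nonce): same key + same plaintext always yields the same bytes.
    # This toy is NOT decryptable; it only models what the server observes.
    return hmac.new(key, plaintext, hashlib.sha256).digest()

key = b"\x02" * 32
rows = [b"alice", b"bob", b"alice", b"carol", b"bob"]
ciphertexts = [det_encrypt(key, r) for r in rows]

# The database never sees a plaintext, but it learns exactly which rows
# are equal to each other: 3 distinct plaintexts, 3 distinct blobs.
assert len(set(ciphertexts)) == 3
assert ciphertexts[0] == ciphertexts[2]
```

That equality pattern is precisely the leakage that makes deterministic encryption searchable, and also what makes it dangerous for low-entropy data.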
Homomorphic Encryption
In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated with serving Wikipedia pages in a reasonable amount of time.
Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.
If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
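The XOR-homomorphic property of stream ciphers like AES-CTR can be demonstrated with a one-time pad, which shares the same structure. A Python sketch:

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = b"pay $0100 to mallory"
keystream = secrets.token_bytes(len(plaintext))  # stands in for AES-CTR output
ciphertext = xor(plaintext, keystream)

# Anyone can XOR a delta into the ciphertext without knowing the key...
delta = xor(b"pay $0100", b"pay $9999") + b"\x00" * (len(plaintext) - 9)
tampered = xor(ciphertext, delta)

# ...and the change survives decryption: Dec(Enc(P) ^ D) == P ^ D.
# Homomorphic to XOR, which is also exactly why unauthenticated CTR
# mode is malleable.
assert xor(tampered, keystream) == b"pay $9999 to mallory"
```

Fully homomorphic schemes generalize this idea to richer operations, at a much steeper performance cost.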
Art: Harubaki

Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.
However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.
Just say “NO” to the copyright cartels.

Art: CMYKat
Searchable Symmetric Encryption (SSE)
Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?
Art: CMYKat

That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.
The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.
Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.
Some schemes only allow for searching a static set of records, while others can accommodate new data over time (with the trade-off between more leakage or worse performance).
If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.
You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”
Your guess is as good as mine, but I can think of a few reasons:
- The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (e.g. OpenSSL’s libcrypto or libsodium).
- Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
- Insufficient marketing and developer advocacy.
SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek, which used SSE to encrypt photos.
Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.
You Can Have Little a HMAC, As a Treat
Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.
The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.
The idea I’m alluding to goes like this:
- Encrypt your data, securely, using symmetric cryptography. (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
- Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.
When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?
Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?
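A minimal Python sketch of this "blind index" pattern (hypothetical names; assuming the index key is kept separate from the encryption keys):

```python
import hmac
import hashlib

INDEX_KEY = b"\x03" * 32  # static across the data set, never used for encryption

def blind_index(plaintext: bytes) -> bytes:
    # Deterministic keyed tag over the plaintext, stored next to the
    # ciphertext so that equality queries work without decryption.
    return hmac.new(INDEX_KEY, plaintext, hashlib.sha256).digest()

# At write time: store (ciphertext, index) per record.
table = [
    {"ciphertext": b"<ct-1>", "idx": blind_index(b"alice@example.com")},
    {"ciphertext": b"<ct-2>", "idx": blind_index(b"bob@example.com")},
]

# At query time: recompute the HMAC of the search term and match on it.
needle = blind_index(b"alice@example.com")
matches = [row for row in table if hmac.compare_digest(row["idx"], needle)]
assert len(matches) == 1 and matches[0]["ciphertext"] == b"<ct-1>"
```

Note that this full-length variant leaks equality between records, which is what the truncation discussed next is meant to blunt.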
Hooray!

However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.
How annoying!

Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.
Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.
“K-What Now?”
Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.
Each HMAC output corresponds 1:1 with a unique plaintext.
Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.
For example, an HMAC-SHA256 output might look like this:
Art: CMYKat

04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae

If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields `a8b1ae`), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.

Which means, if you’re only revealing a truncated HMAC tag to the database server (both when storing records and when retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.
These false positives give your data a degree of anonymity (called k-anonymity), which means an attacker with access to your database cannot:
- Distinguish between two encrypted records with the same short HMAC tag.
- Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
As with SSE above, this short HMAC technique exposes a trade-off to users.
- Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
- Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than with a full HMAC.
Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
If you’d like to play with this idea, here’s a quick and dirty demo script.
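In the same spirit, here's a tiny self-contained Python sketch of the truncation trade-off (names are mine). With 300 distinct plaintexts and a one-byte tag (256 possible values), the pigeonhole principle guarantees false positives:

```python
import hmac
import hashlib
from collections import Counter

INDEX_KEY = b"\x04" * 32

def truncated_tag(plaintext: bytes, keep_bytes: int = 1) -> bytes:
    full = hmac.new(INDEX_KEY, plaintext, hashlib.sha256).digest()
    return full[-keep_bytes:]  # keep only the trailing bytes

plaintexts = [f"user-{i}".encode() for i in range(300)]
buckets = Counter(truncated_tag(p) for p in plaintexts)

# 300 plaintexts into at most 256 buckets: some bucket MUST collide,
# so the server cannot map a truncated tag back to a unique plaintext.
assert max(buckets.values()) >= 2
assert len(buckets) <= 256
```

Tuning `keep_bytes` (really, a bit count in a production design) is how you trade query performance against anonymity, as described above.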
Intermission
If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard,” they’ve probably been long since put to rest by now.
Art: Harubaki

Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting, their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.
Thus, if you’ve read this far, I’d like to demonstrate the application of what I’ve covered thus far with a real-world case study of a database cryptography product.
Case Study: MongoDB Client-Side Encryption
MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.
Taken from the press release, but adapted for dark themes.

A statement at the bottom of their press release indicates that this isn’t clown-shoes:
Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.
If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.
So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.
MongoCrypt: The Good
MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).
MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.
How Is Queryable Encryption Implemented?
From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).
According to their press release:
The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.
Which means that most of the juicy details probably aren’t public yet.
These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.
There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.
Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.
However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.
MongoCrypt: The Bad
Every call to `do_encrypt()` includes at most the Key ID (but typically `NULL`) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.

However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.
First, observe this code which assembles the multi-part inputs into HMAC.
```c
/* Construct the input to the HMAC */
uint32_t num_intermediates = 0;
_mongocrypt_buffer_t intermediates[3];
// -- snip --
if (!_mongocrypt_buffer_concat (
       &to_hmac, intermediates, num_intermediates)) {
   CLIENT_ERR ("failed to allocate buffer");
   goto done;
}

if (hmac == HMAC_SHA_512_256) {
   uint8_t storage[64];
   _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};

   if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
      goto done;
   }

   // Truncate sha512 to first 256 bits.
   memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
} else {
   BSON_ASSERT (hmac == HMAC_SHA_256);
   if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
      goto done;
   }
}
```

The implementation of `_mongocrypt_buffer_concat()` can be found here.

If either the implementation of that function, or the code I snipped from my excerpt, had prefixed every segment of the AAD with the length of the segment (represented as a `uint64_t` to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.

Using TupleHash would also have prevented this issue.
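The hazard, and the length-prefix fix, fit in a few lines of Python. With naive concatenation, two different segment lists feed identical bytes into the MAC; with a `uint64` length prefix on every segment, they cannot:

```python
import struct

def naive_concat(segments: list) -> bytes:
    # What buffer concatenation does: segment boundaries are erased.
    return b"".join(segments)

def length_prefixed(segments: list) -> bytes:
    # Prefix each segment with its length as a big-endian uint64,
    # making the encoding injective: no two distinct lists collide.
    return b"".join(struct.pack(">Q", len(s)) + s for s in segments)

left = [b"AAAA", b"BBBB"]
right = [b"AAAAB", b"BBB"]

# Same bytes into the HMAC -> same tag for different logical inputs.
assert naive_concat(left) == naive_concat(right)

# The length prefixes disambiguate the boundary between segments.
assert length_prefixed(left) != length_prefixed(right)
```

This is the canonicalization issue in miniature; TupleHash bakes the same domain separation into the hash function itself.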
Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.
The first cryptographic flaw sort of cancels the second out.
If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.
MongoCrypt: The Ugly
MongoCrypt supports deterministic encryption.
If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.
We already discussed why this is bad above.
Wrapping Up
This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.
However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.
Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.
Art: CMYKathttps://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
#appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption
-
Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but moreso inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).
At the bottom they teased:
Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.
Cendyne
If you read Cendyne’s post, you may have nodded along with that remark without appreciating the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format, in the hopes that it might serve as an introduction to the specialization.
Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.
I’m including art of my fursona in this post… as is tradition for furry blogs.

If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.
Thanks to CMYKat for the awesome stickers.
Contents
- Database Cryptography?
- Cryptography for Relational Databases
- The Perils of Built-in Encryption Functions
- Application-Layer Relational Database Cryptography
- Confused Deputies
- Canonicalization Attacks
- Multi-Tenancy
- Cryptography for NoSQL Databases
- NoSQL is Built Different
- Record Authentication
- Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
- Searchable Encryption
- Order-{Preserving, Revealing} Encryption
- Deterministic Encryption
- Homomorphic Encryption
- Searchable Symmetric Encryption (SSE)
- You Can Have Little a HMAC, As a Treat
- Intermission
- Case Study: MongoDB Client-Side Encryption
- MongoCrypt: The Good
- How is Queryable Encryption Implemented?
- MongoCrypt: The Bad
- MongoCrypt: The Ugly
- Wrapping Up
Database Cryptography?
The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.
The consequences of this simple premise are anything but simple. Let me explain.
Art: ScruffKerfluff

The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data but not others, so your cryptography has to be flexible enough to support multiple types of workloads.
Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.
But we’re not settling for mere compliance on this blog. Furries have standards, after all.
So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:
- What database software is being used? Is it up to date?
- What data is being stored in which database software?
- How are databases oriented in the network of the overall system?
- Is your database properly firewalled from the public Internet?
- How does data flow throughout the network, and when do these data flows intersect with the database?
- Which applications talk to the database? What languages are they written in? Which APIs do they use?
- How will cryptography secrets be managed?
- Is there one key for everyone, one key per tenant, etc.?
- How are keys rotated?
- Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?
The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.
(This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)
The kind of cryptography protocol you need for, say, storing encrypted CSV files an S3 bucket is vastly different from relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.
Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.
If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know it wasn’t tampered with at rest.”
My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:
- On one axis, we have the type of database.
- Flat-file
- Relational
- Schema-free
- On another, we have the basic confidentiality requirements of the data.
- Field encryption
- Row encryption
- Column encryption
- Unstructured record encryption
- Encrypting entire collections of records
- Finally, we have the integrity requirements of the data.
- Field authentication
- Row/column authentication
- Unstructured record authentication
- Collection authentication (based on e.g. Sparse Merkle Trees)
And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.
Why store data in a database if you have no way to index or search the data for fast retrieval?
Credit: HarubakiIf you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.
Some just phone it in with `AES_Encrypt()` calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)

Which brings us to the meat of this blog post: The actual cryptography part.
Cryptography is the art of transforming information security problems into key management problems.
Former coworker
Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.
Cryptography for Relational Databases
Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased from the introduction.
You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?
But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.
So let’s start with a simple question: where and how are you performing your encryption?
The Perils of Built-in Encryption Functions
MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.
It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:
Art by Riley, encrypted with AES-ECB

The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).
ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.
A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.
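You can watch semantic security fail without touching AES at all. This Python sketch fakes a block cipher with a keyed PRF (a toy, not real ECB, but the deterministic per-block behavior is the same):

```python
import hmac
import hashlib

def toy_ecb(key: bytes, plaintext: bytes, block: int = 16) -> bytes:
    # Each block is "encrypted" independently and deterministically,
    # just like a real block cipher operating in ECB mode.
    assert len(plaintext) % block == 0
    out = b""
    for i in range(0, len(plaintext), block):
        chunk = plaintext[i:i + block]
        out += hmac.new(key, chunk, hashlib.sha256).digest()[:block]
    return out

key = b"\x05" * 32
msg = b"ATTACK AT DAWN!!" * 3  # three identical 16-byte blocks
ct = toy_ecb(key, msg)

# Identical plaintext blocks yield identical ciphertext blocks: an
# observer learns the structure of the message without the key.
assert ct[0:16] == ct[16:32] == ct[32:48]
```

That repeated-block structure is exactly what the ECB-encrypted art above makes visible.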
ECB art isn’t to be confused with ECB poetry, which looks like this:
Oh little one, you’re growing up
You’ll soon be writing C
You’ll treat your ints as pointers
You’ll nest the ternary
You’ll cut and paste from github
And try cryptography
But even in your darkest hour
Do not use ECB

CBC’s BEASTly when padding’s abused
And CTR’s fine til a nonce is reused
Some say it’s a CRIME to compress then encrypt
Or store keys in the browser (or use javascript)
Diffie Hellman will collapse if hackers choose your g
And RSA is full of traps when e is set to 3
Whiten! Blind! In constant time! Don’t write an RNG!
But failing all, and listen well: Do not use ECB

They’ll say “It’s like a one-time-pad!
The data’s short, it’s not so bad
the keys are long–they’re iron clad
I have a PhD!”
And then you’re front page Hacker News
Your passwords cracked–Adobe Blues.
Don’t leave your penguins showing through,
Do not use ECB

— Ben Nagy, PoC||GTFO 0x04:13
Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.
Also, `SHOW processlist;` leaks your encryption keys. Oops.

Credit: CMYKat

Application-Layer Relational Database Cryptography
Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.
And, yes, that’s part of the answer. But how you encrypt is important.
Credit: Harubaki

“I’ll encrypt with CBC mode.”

If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?

“Okay, fine, I’ll use an authenticated mode like GCM.”

Did you remember to make the table and column name part of your AAD? What about the primary key of the record?

“What on Earth are you talking about, Soatok?”

Welcome to the first footgun of database cryptography!

Confused Deputies
Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.
To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?
We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.
What, then, stops an attacker from copying ciphertexts around?
Credit: CMYKattLet’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.
All I would need to do is simply…
UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345;

…and then access the application through my legitimate account. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and they will just decrypt. Even if you’re using an authenticated mode.
We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.
The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)
- $addr = aes_gcm_encrypt($addr, $key);
+ $addr = aes_gcm_encrypt($addr, $key, canonicalize([
+     $tableName,
+     $columnName,
+     $primaryKey
+ ]));
Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
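A minimal Python sketch of this context binding might look like the following. (Everything here is illustrative: the hash-counter keystream is a toy stand-in for AES-GCM’s encryption, and Encrypt-then-HMAC plays the role of the authenticated mode. The part that matters is that the table, column, and primary key are mixed into the authentication tag.)

```python
import hashlib
import hmac
import os
import struct

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy hash-counter keystream; a real design would use AES-GCM or similar.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + struct.pack(">Q", counter)).digest()
        counter += 1
    return out[:length]

def _canon(parts) -> bytes:
    # Length-prefix each part so component boundaries can't be shifted.
    encoded = [str(p).encode() for p in parts]
    return b"".join(struct.pack(">Q", len(p)) + p for p in encoded)

def encrypt(key: bytes, plaintext: bytes, context) -> bytes:
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, _canon(context) + nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def decrypt(key: bytes, blob: bytes, context) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, _canon(context) + nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("decryption failed: wrong context")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

key = os.urandom(32)
blob = encrypt(key, b"123 Main St", ["customers", "addr_encrypted", 6789])

# Decrypting in the original row works fine...
assert decrypt(key, blob, ["customers", "addr_encrypted", 6789]) == b"123 Main St"

# ...but pasting the ciphertext into the attacker's row (id 12345) fails loudly.
try:
    decrypt(key, blob, ["customers", "addr_encrypted", 12345])
except ValueError as ex:
    print(ex)  # decryption failed: wrong context
```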
This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
Canonicalization Attacks
In the previous section, I introduced a pseudocode function called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.

First, suppose you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…

function dumbCanonicalize(
    string $tableName,
    string $columnName,
    string|int $primaryKey
): string {
    return $tableName . '_' . $columnName . '#' . $primaryKey;
}

Consider these two inputs to this function:
dumbCanonicalize('customers', 'last_order_uuid', 123);
dumbCanonicalize('customers_last_order', 'uuid', 123);
In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
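Here’s the collision in runnable Python, alongside one possible fix: length-prefixing every component so the boundaries can’t shift (a sketch, not the only valid encoding):

```python
import struct

def dumb_canonicalize(table: str, column: str, pk) -> str:
    # Naive string joining: component boundaries are ambiguous.
    return f"{table}_{column}#{pk}"

def safe_canonicalize(table: str, column: str, pk) -> bytes:
    # Prefix each component with its length so boundaries can't shift.
    parts = [table.encode(), column.encode(), str(pk).encode()]
    return b"".join(struct.pack(">Q", len(p)) + p for p in parts)

# Two different contexts, one and the same AAD value:
assert dumb_canonicalize("customers", "last_order_uuid", 123) == \
       dumb_canonicalize("customers_last_order", "uuid", 123)

# The length-prefixed encoding keeps them distinct:
assert safe_canonicalize("customers", "last_order_uuid", 123) != \
       safe_canonicalize("customers_last_order", "uuid", 123)
```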
In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.
Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.
Multi-Tenancy
Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.
Oftentimes, however, software developers are tasked with building support for Bring Your Own Key (BYOK).
This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).
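As a quick illustration of cryptographic shredding, here’s a Python sketch with a hypothetical per-tenant key table (the hash-counter keystream is a toy stand-in for a real cipher, and in production the keys would live in a KMS):

```python
import hashlib
import os
import struct

# Hypothetical per-tenant key table; in production this would live in a KMS.
tenant_keys = {"tenant-a": os.urandom(32)}

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy hash-counter keystream (stand-in for a real cipher).
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + struct.pack(">Q", counter)).digest()
        counter += 1
    return out[:length]

def encrypt_for(tenant: str, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    ks = _keystream(tenant_keys[tenant], nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, ks))

def decrypt_for(tenant: str, blob: bytes) -> bytes:
    key = tenant_keys[tenant]  # raises KeyError once the key is shredded
    nonce, ct = blob[:16], blob[16:]
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

record = encrypt_for("tenant-a", b"right to be forgotten")
assert decrypt_for("tenant-a", record) == b"right to be forgotten"

# "Erasure": the ciphertext may linger in backups, but without the key
# it's just noise, which is what satisfies the deletion requirement.
del tenant_keys["tenant-a"]
try:
    decrypt_for("tenant-a", record)
except KeyError:
    print("plaintext is unrecoverable")
```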
Other times, this is driven by a need to cut costs: storing different users’ data in the same database server, but encrypting it such that each tenant can only decrypt their own records.
Two things can happen when you introduce multi-tenancy into your database cryptography designs:
- Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
- Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.
So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.
Isn’t cryptography fun?
“What Are Invisible Salamanders?”
This refers to a fun property of AEAD modes based on polynomial MACs. Basically, if you:
- Encrypt one message under a specific key and nonce.
- Encrypt another message under a separate key and nonce.
…Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.
This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature: another attached file had the same authentication tag and ciphertext, and you could trick the system by disclosing the second key instead of the first. Thus, the salamander is invisible to moderators.
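The underlying algebra can be demonstrated in miniature. GCM’s GHASH is a polynomial MAC over GF(2^128); the sketch below uses ordinary modular arithmetic over a small prime field instead, with a single-block “ciphertext” and made-up key material, but the solving step has the same shape as the real attack:

```python
# A single-block polynomial MAC over a prime field: tag = c*k + s (mod p).
# (GCM's GHASH has this same structure, just over GF(2^128).)
p = 2**61 - 1

def poly_mac(key: int, pad: int, block: int) -> int:
    return (block * key + pad) % p

# The attacker controls BOTH keys (and the pads derived from them):
k1, s1 = 0x1234567, 0x89ABCDE
k2, s2 = 0x7654321, 0xFEDCBA9

# Solve c*k1 + s1 == c*k2 + s2 (mod p) for the colliding "ciphertext" block:
c = ((s2 - s1) * pow(k1 - k2, -1, p)) % p

# One ciphertext, one tag -- and it verifies under both keys.
assert poly_mac(k1, s1, c) == poly_mac(k2, s2, c)
print("same (ciphertext, tag) pair is valid under two different keys")
```

A key-committing mode closes this hole by additionally binding a commitment to the key (e.g. a hash of it) into the ciphertext, so that a second key can no longer validate.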
Art: CMYKat

We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.
Cryptography for NoSQL Databases
Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.
NoSQL is Built Different
The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.
What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.
Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.
Consider the following record structure, which stores several fields encrypted with AES in CBC mode:

{
    "encrypted-data-key": "<blob>",
    "name": "<ciphertext>",
    "address": [
        "<ciphertext>",
        "<ciphertext>"
    ],
    "social-security": "<ciphertext>",
    "zip-code": "<ciphertext>"
}

If this record is decrypted with code that looks something like this:

$decrypted = [];
// ... snip ...
foreach ($record['address'] as $i => $addrLine) {
    try {
        $decrypted['address'][$i] = $this->decrypt($addrLine);
    } catch (Throwable $ex) {
        // You'd never deliberately do this, but it's for illustration
        $this->doSomethingAnOracleCanObserve($i);
        // This is more believable, of course:
        $this->logDecryptionError($ex, $addrLine);
        $decrypted['address'][$i] = '';
    }
}

…then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.

Art: Harubaki

This isn’t to say that NoSQL is less secure than SQL, in the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.
Record Authentication
A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.
Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.
The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.
This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.
Except…
When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.
Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.
And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)
I don’t have any specific real-world examples here that I can point to of this problem being solved well.

Art: CMYKat
Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.
As I said with encryption before, but also true here:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
Art: Harubaki

Okay, how do you solve this problem so that you can perform updates and upgrades to your schema without enabling attackers to downgrade the security? Here’s one possible design.
Let’s say you have two metadata fields on each record:
- A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this meta-auth.
- A compressed binary string representing which of the authenticated fields should also be encrypted. This field is also authenticated. It is at most the same length as the first metadata field. Let’s call this meta-enc.
Furthermore, you will specify a canonical field ordering, used both for how data is fed into the signature algorithm and for the field mappings in meta-auth and meta-enc.

{
    "example": {
        "credit-card": {
            "number": /* encrypted */,
            "expiration": /* encrypted */,
            "ccv": /* encrypted */
        },
        "superfluous": {
            "rewards-member": null
        }
    },
    "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-enc */
    ]),
    "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false  /* example.superfluous.rewards-member */
    ]),
    "signature": /* -- snip -- */
}

When you go to append data to an existing record, you’ll need to update meta-auth to include the mapping of fields based on this canonical ordering, to ensure only the intended fields get validated.

When you update your code to add an additional field that is intended to be signed, you can roll that out for new records, and the record will continue to be self-describing:
- New records will have the additional field flagged as authenticated in meta-auth (and meta-enc will grow)
- Old records will not, but your code will still sign them successfully
- To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.
You might think that leaving meta-auth unauthenticated gives an attacker some advantage, but it only describes which fields are included in the security boundary of the signature or MAC. This allows unauthenticated data to be appended for whatever operational purpose, without having to update signatures or expose signing keys to a wider part of the network.

{
    "example": {
        "credit-card": {
            "number": /* encrypted */,
            "expiration": /* encrypted */,
            "ccv": /* encrypted */
        },
        "superfluous": {
            "rewards-member": null
        }
    },
    "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true,  /* meta-enc */
        true   /* meta-version */
    ]),
    "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-version */
    ]),
    "meta-version": 0x01000000,
    "signature": /* -- snip -- */
}

If an attacker tries to use the meta-auth field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).

Even if they keep all of the fields the same but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
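A Python sketch of this design might look like the following. (HMAC stands in for the signature algorithm, and the field names, canonical ordering, and encoding are all hypothetical choices for illustration.)

```python
import hashlib
import hmac
import json
import struct

# Hypothetical canonical field ordering, fixed ahead of time:
FIELD_ORDER = [
    "example.credit-card.number",
    "example.credit-card.expiration",
    "example.credit-card.ccv",
    "example.superfluous.rewards-member",
]

def _lp(b: bytes) -> bytes:
    # Length-prefix each piece to keep the signing input unambiguous.
    return struct.pack(">Q", len(b)) + b

def signing_input(record, meta_auth, meta_enc, version) -> bytes:
    out = b""
    for path, flag in zip(FIELD_ORDER, meta_auth):
        if not flag:
            continue  # field is outside the security boundary
        value = record
        for part in path.split("."):
            value = value[part]
        # The fully qualified path is authenticated along with the value:
        out += _lp(path.encode()) + _lp(json.dumps(value).encode())
    # meta-enc and the schema version are authenticated too (downgrade resistance):
    return out + _lp(bytes(meta_enc)) + _lp(struct.pack(">I", version))

def sign(key, record, meta_auth, meta_enc, version) -> bytes:
    return hmac.new(key, signing_input(record, meta_auth, meta_enc, version),
                    hashlib.sha256).digest()

record = {"example": {
    "credit-card": {"number": "ct1", "expiration": "ct2", "ccv": "ct3"},
    "superfluous": {"rewards-member": None},
}}
key = b"\x42" * 32
tag = sign(key, record, [1, 1, 1, 0], [1, 1, 1, 0], 0x01000000)

# Appending data to an unauthenticated field leaves the signature valid...
record["example"]["superfluous"]["rewards-member"] = "gold"
assert hmac.compare_digest(tag, sign(key, record, [1, 1, 1, 0], [1, 1, 1, 0], 0x01000000))

# ...but flipping meta-auth to exclude an authenticated field breaks it.
assert not hmac.compare_digest(tag, sign(key, record, [1, 1, 0, 0], [1, 1, 1, 0], 0x01000000))
```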
Searchable Encryption
If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.
Art: CMYKat

Okay, put away the pitchforks; I will explain.
Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.
And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.
Art: Sophie

So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:

- Encrypting the data, to protect it from unauthorized disclosure, but rendering it useless
- Doing anything useful with the data, but leaving it unencrypted in the database
This is especially annoying for business types that are all in on the Zero Trust buzzword.
Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.
Order-{Preserving, Revealing} Encryption
On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.
[D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.
Attack of the week: searchable encryption and the ever-expanding leakage function
The problem with these designs is that they leak enough information that they no longer provide semantic security.
From Grubbs, et al. (GLMP, 2019.)
Colors inverted to fit my blog’s theme better.

To put it another way: These designs are only marginally better than ECB mode, and probably deserve their own poems too.
Order revealing
Reveals much more than order
Softcore ECB

Order preserving
Semantic security?
Only in your dreams

Haiku for your consideration
Deterministic Encryption
Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.
If you recall the AES_{De,En}crypt() functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.

SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');

However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can safely encrypt only a bounded number of distinct records before you’re no longer secure.
From Page 14 of the linked paper. Full view.

That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.
Homomorphic Encryption
In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated by serving Wikipedia pages in a reasonable amount of time.
Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.
If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
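The RSA case is easy to verify with textbook (unpadded) RSA and toy-sized primes; the very malleability that padding schemes exist to suppress is the multiplicative homomorphism:

```python
# Textbook RSA with tiny primes -- no padding, which is exactly
# why the multiplicative homomorphism is visible.
p, q, e = 61, 53, 17
n = p * q                          # public modulus
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent

def enc(m: int) -> int:
    return pow(m, e, n)

def dec(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 11
# Multiplying ciphertexts multiplies the hidden plaintexts:
assert dec((enc(a) * enc(b)) % n) == (a * b) % n
print("Enc(a) * Enc(b) decrypts to a * b")
```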
Art: Harubaki

Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact, is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.
However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.
Just say “NO” to the copyright cartels.

Art: CMYKat
Searchable Symmetric Encryption (SSE)
Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?
Art: CMYKat

That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.
The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.
Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.
Some schemes only allow for searching a static set of records, while others can accommodate new data over time (with the trade-off between more leakage or worse performance).
If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.
You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”
Your guess is as good as mine, but I can think of a few reasons:
- The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (i.e. OpenSSL’s libcrypto or libsodium).
- Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
- Insufficient marketing and developer advocacy.
SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.
Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.
You Can Have Little a HMAC, As a Treat
Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.
The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.
The idea I’m alluding to goes like this:
- Encrypt your data, securely, using symmetric cryptography. (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
- Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.
When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?
Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?
Hooray!

However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme. How annoying!

Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.
Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.
“K-What Now?”
Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.
Each HMAC output corresponds 1:1 with a unique plaintext.
Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.
For example, an HMAC-SHA256 output might look like this:
Art: CMYKat

04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae

If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields a8b1ae), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.

This means that if you only reveal a truncated HMAC tag to the database server (both when storing records and when retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.
These false positives give your data a degree of anonymity (called k-anonymity), which means an attacker with access to your database cannot:
- Distinguish between two encrypted records with the same short HMAC tag.
- Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
As with SSE above, this short HMAC technique exposes a trade-off to users.
- Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
- Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.
Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
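To make the trade-off concrete, here’s a small Python simulation (the key, the 12-bit truncation, and the 50,000-row table are all arbitrary choices for illustration):

```python
import hashlib
import hmac
from collections import defaultdict

INDEX_KEY = b"static-indexing-key-kept-out-of-the-db"  # arbitrary demo key

def blind_index(plaintext: bytes, bits: int = 12) -> int:
    # Full HMAC-SHA256, truncated to the top `bits` bits.
    tag = hmac.new(INDEX_KEY, plaintext, hashlib.sha256).digest()
    return int.from_bytes(tag, "big") >> (256 - bits)

# Simulate 50,000 distinct plaintexts bucketed by a 12-bit truncated tag:
buckets = defaultdict(int)
for i in range(50_000):
    buckets[blind_index(str(i).encode())] += 1

# With 2^12 possible tags, each one matches ~12 plaintexts on average, so a
# lookup returns a handful of candidate rows to decrypt-then-discard.
avg = 50_000 / len(buckets)
print(f"{len(buckets)} distinct tags, ~{avg:.1f} plaintexts per tag")
```

Roughly, to target k candidates per tag, you keep about log2(N/k) bits of the tag, where N is the number of distinct plaintexts you anticipate; hence the logarithmic growth mentioned above.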
If you’d like to play with this idea, here’s a quick and dirty demo script.
Intermission
If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, they’ve probably been put to rest by the time you’ve made it this far.
Art: Harubaki

Conversely, anyone who specializes in this topic is probably waiting for me to say something novel or interesting, their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.
Thus, if you’ve read this far, I’d like to apply what I’ve covered so far to a real-world case study of a database cryptography product.
Case Study: MongoDB Client-Side Encryption
MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.
Taken from the press release, but adapted for dark themes.

A statement at the bottom of their press release indicates that this isn’t clown-shoes:
Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.
If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.
So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.
MongoCrypt: The Good
MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).
MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.
How Is Queryable Encryption Implemented?
From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).
According to their press release:
The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.
Which means that most of the juicy details probably aren’t public yet.
These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.
There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.
Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.
However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.
MongoCrypt: The Bad
Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.

However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.
First, observe this code which assembles the multi-part inputs into HMAC.
/* Construct the input to the HMAC */
uint32_t num_intermediates = 0;
_mongocrypt_buffer_t intermediates[3];
// -- snip --
if (!_mongocrypt_buffer_concat (
       &to_hmac, intermediates, num_intermediates)) {
   CLIENT_ERR ("failed to allocate buffer");
   goto done;
}

if (hmac == HMAC_SHA_512_256) {
   uint8_t storage[64];
   _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};

   if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
      goto done;
   }

   // Truncate sha512 to first 256 bits.
   memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
} else {
   BSON_ASSERT (hmac == HMAC_SHA_256);
   if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
      goto done;
   }
}

The implementation of _mongocrypt_buffer_concat() can be found here.

If either the implementation of that function, or the code I snipped from my excerpt, had prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.

Using TupleHash would also have prevented this issue.
Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.
The first cryptographic flaw sort of cancels the second out.
If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.
MongoCrypt: The Ugly
MongoCrypt supports deterministic encryption.
If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.
We already discussed why this is bad above.
Wrapping Up
This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.
However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.
Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.
Art: CMYKat

https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
#appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption
-
Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but moreso inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).
At the bottom they teased:
Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.
Cendyne
If you read Cendyne’s post, you may have nodded along with that remark and not appreciate the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.
Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.
I’m including art of my fursona in this post… as is tradition for furry blogs.

If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.
Thanks to CMYKat for the awesome stickers.
Contents
- Database Cryptography?
- Cryptography for Relational Databases
- The Perils of Built-in Encryption Functions
- Application-Layer Relational Database Cryptography
- Confused Deputies
- Canonicalization Attacks
- Multi-Tenancy
- Cryptography for NoSQL Databases
- NoSQL is Built Different
- Record Authentication
- Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
- Searchable Encryption
- Order-{Preserving, Revealing} Encryption
- Deterministic Encryption
- Homomorphic Encryption
- Searchable Symmetric Encryption (SSE)
- You Can Have Little a HMAC, As a Treat
- Intermission
- Case Study: MongoDB Client-Side Encryption
- MongoCrypt: The Good
- How is Queryable Encryption Implemented?
- MongoCrypt: The Bad
- MongoCrypt: The Ugly
- Wrapping Up
Database Cryptography?
The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.
The consequences of this simple premise are anything but simple. Let me explain.
Art: ScruffKerfluff

The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible enough to support multiple types of workloads.
Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.
But we’re not settling for mere compliance on this blog. Furries have standards, after all.
So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:
- What database software is being used? Is it up to date?
- What data is being stored in which database software?
- How are databases oriented in the network of the overall system?
- Is your database properly firewalled from the public Internet?
- How does data flow throughout the network, and when do these data flows intersect with the database?
- Which applications talk to the database? What languages are they written in? Which APIs do they use?
- How will cryptography secrets be managed?
- Is there one key for everyone, one key per tenant, etc.?
- How are keys rotated?
- Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?
The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.
(This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)
The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from what you need for relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.
Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.
If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know it wasn’t tampered with at rest.”
My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:
- On one axis, we have the type of database.
- Flat-file
- Relational
- Schema-free
- On another, we have the basic confidentiality requirements of the data.
- Field encryption
- Row encryption
- Column encryption
- Unstructured record encryption
- Encrypting entire collections of records
- Finally, we have the integrity requirements of the data.
- Field authentication
- Row/column authentication
- Unstructured record authentication
- Collection authentication (based on e.g. Sparse Merkle Trees)
And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.
Why store data in a database if you have no way to index or search the data for fast retrieval?
Credit: Harubaki

If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.
Some just phone it in with `AES_Encrypt()` calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)

Which brings us to the meat of this blog post: The actual cryptography part.
Cryptography is the art of transforming information security problems into key management problems.
Former coworker
Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.
Cryptography for Relational Databases
Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased from the introduction.
You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?
But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.
So let’s start with a simple question: where and how are you performing your encryption?
The Perils of Built-in Encryption Functions
MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.
It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:
Art by Riley, encrypted with AES-ECB

The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).
ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.
A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.
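To see the failure concretely, here’s a stdlib-only toy sketch. A keyed hash stands in for a real block cipher (it is emphatically not one); the point is only that any ECB-style construction, which enciphers each block independently, reveals which plaintext blocks are equal:

```python
import hashlib

def toy_ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy 16-byte-block 'ECB' mode: each block is enciphered independently.

    A keyed hash stands in for AES purely to illustrate determinism;
    this is not a real cipher and must never be used as one.
    """
    assert len(plaintext) % 16 == 0
    out = b""
    for i in range(0, len(plaintext), 16):
        block = plaintext[i:i + 16]
        out += hashlib.sha256(key + block).digest()[:16]
    return out

key = b"sixteen byte key"
# Two identical 16-byte blocks, then a different one:
ct = toy_ecb_encrypt(key, b"A" * 16 + b"A" * 16 + b"B" * 16)
print(ct[0:16] == ct[16:32])  # True: equal plaintext blocks leak equality
print(ct[0:16] == ct[32:48])  # False
```

That leaked equality relation is exactly what semantic security forbids: an observer learns something about the plaintext beyond its length.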
ECB art isn’t to be confused with ECB poetry, which looks like this:
Oh little one, you’re growing up
You’ll soon be writing C
You’ll treat your ints as pointers
You’ll nest the ternary
You’ll cut and paste from github
And try cryptography
But even in your darkest hour
Do not use ECB

CBC’s BEASTly when padding’s abused
And CTR’s fine til a nonce is reused
Some say it’s a CRIME to compress then encrypt
Or store keys in the browser (or use javascript)
Diffie Hellman will collapse if hackers choose your g
And RSA is full of traps when e is set to 3
Whiten! Blind! In constant time! Don’t write an RNG!
But failing all, and listen well: Do not use ECB

They’ll say “It’s like a one-time-pad!
The data’s short, it’s not so bad
the keys are long–they’re iron clad
I have a PhD!”
And then you’re front page Hacker News
Your passwords cracked–Adobe Blues.
Don’t leave your penguins showing through,
Do not use ECB

— Ben Nagy, PoC||GTFO 0x04:13
Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.
Also, `SHOW processlist;` leaks your encryption keys. Oops.

Credit: CMYKat

Application-Layer Relational Database Cryptography
Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.
And, yes, that’s part of the answer. But how you encrypt is important.
Credit: Harubaki

“I’ll encrypt with CBC mode.”

If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?

“Okay, fine, I’ll use an authenticated mode like GCM.”

Did you remember to make the table and column name part of your AAD? What about the primary key of the record?

“What on Earth are you talking about, Soatok?”

Welcome to the first footgun of database cryptography!

Confused Deputies
Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.
To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?
We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.
What, then, stops an attacker from copying ciphertexts around?
Credit: CMYKat

Let’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.
All I would need to do is simply…
```sql
UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345
```

…and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and they will just decrypt. Even if you’re using an authenticated mode.
We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.
The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)
```diff
- $addr = aes_gcm_encrypt($addr, $key);
+ $addr = aes_gcm_encrypt($addr, $key, canonicalize([
+     $tableName,
+     $columnName,
+     $primaryKey
+ ]));
```
Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
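Here’s a minimal stdlib-only sketch of the idea. The Encrypt-then-MAC layout and the hash-based toy stream cipher are stand-ins for a real AEAD (don’t use this as one); the point is only that binding the table, column, and primary key into the authenticated data makes a moved ciphertext fail to decrypt:

```python
import hashlib
import hmac
import os

def toy_aead_encrypt(enc_key: bytes, mac_key: bytes, msg: bytes, aad: bytes) -> bytes:
    # Toy hash-based keystream standing in for AES-CTR (illustration only).
    nonce = os.urandom(16)
    stream = hashlib.sha256(enc_key + nonce).digest()
    ct = bytes(m ^ s for m, s in zip(msg, stream))
    # Encrypt-then-MAC, with the AAD bound into the authentication tag:
    tag = hmac.new(mac_key, aad + nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def toy_aead_decrypt(enc_key: bytes, mac_key: bytes, blob: bytes, aad: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, aad + nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("decryption failed")
    stream = hashlib.sha256(enc_key + nonce).digest()
    return bytes(c ^ s for c, s in zip(ct, stream))

ek, mk = os.urandom(32), os.urandom(32)
# AAD = where the data lives (table | column | primary key):
blob = toy_aead_encrypt(ek, mk, b"123 Main St", b"customers|addr_encrypted|12345")
assert toy_aead_decrypt(ek, mk, blob, b"customers|addr_encrypted|12345") == b"123 Main St"
# The same ciphertext pasted into the attacker's row 99999 no longer decrypts:
try:
    toy_aead_decrypt(ek, mk, blob, b"customers|addr_encrypted|99999")
except ValueError:
    print("rejected")  # prints "rejected"
```

(Naively joining the AAD components with a delimiter, as above, has its own pitfalls, which the next section covers.)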
This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
Canonicalization Attacks
In the previous section, I introduced a pseudocode function called `canonicalize()`. This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.

First, consider you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…

```php
function dumbCanonicalize(
    string $tableName,
    string $columnName,
    string|int $primaryKey
): string {
    return $tableName . '_' . $columnName . '#' . $primaryKey;
}
```

Consider these two inputs to this function:

```php
dumbCanonicalize('customers', 'last_order_uuid', 123);
dumbCanonicalize('customers_last_order', 'uuid', 123);
```
In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
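One common fix (an assumption on my part, not the only valid design) is to make the encoding injective: length-prefix every component, so no concatenation of inputs can be reinterpreted as a different split of the same bytes. A quick sketch:

```python
def canonicalize(*parts) -> bytes:
    """Unambiguous encoding: prefix the part count, then length-prefix
    every part, so no two distinct tuples of inputs can ever serialize
    to the same byte string."""
    out = len(parts).to_bytes(4, "big")
    for part in parts:
        encoded = str(part).encode("utf-8")
        out += len(encoded).to_bytes(8, "big") + encoded
    return out

# The two colliding inputs to dumbCanonicalize() now differ:
a = canonicalize("customers", "last_order_uuid", 123)
b = canonicalize("customers_last_order", "uuid", 123)
print(a == b)  # False
```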
In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.
Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.
Multi-Tenancy
Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.
Often times, however, software developers are tasked with building support for Bring Your Own Key (BYOK).
This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).
Other times, this is driven by a need to cut costs: Storing different users’ data in the same database server, but encrypting it such that each tenant can only decrypt their own records.
Two things can happen when you introduce multi-tenancy into your database cryptography designs:
- Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
- Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.
So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.
Isn’t cryptography fun?
“What Are Invisible Salamanders?”
This refers to a fun property of AEAD modes based on polynomial MACs. Basically, if you:
- Encrypt one message under a specific key and nonce.
- Encrypt another message under a separate key and nonce.
…Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.
This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature because another attached file had the same authentication tag and ciphertext, and you could trick the system if you disclosed the second key instead of the first. Thus, the salamander was invisible to the moderators.
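One folklore mitigation, sketched below, is to store a key-commitment tag alongside each ciphertext: a MAC computed over a fixed label under the encryption key. This is not a full key-committing AEAD mode (and the label and layout here are my own illustrative assumptions), but it shows the shape of the check: a record opened under a different key fails before any plaintext is accepted.

```python
import hashlib
import hmac
import os

def commit_to_key(key: bytes) -> bytes:
    # Derive a commitment from the key with a domain-separated label.
    # Stored next to the ciphertext; verified before decryption.
    return hmac.new(key, b"key-commitment-v1", hashlib.sha256).digest()

k1, k2 = os.urandom(32), os.urandom(32)
stored_commitment = commit_to_key(k1)

# The correct key opens the commitment; any other key is rejected,
# so two different keys can no longer "validly" open the same record:
print(hmac.compare_digest(stored_commitment, commit_to_key(k1)))  # True
print(hmac.compare_digest(stored_commitment, commit_to_key(k2)))  # False
```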
Art: CMYKat

We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.
Cryptography for NoSQL Databases
Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.
NoSQL is Built Different
The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.
What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.
Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.
Consider the following record structure, which stores a bunch of data encrypted with AES in CBC mode:

```json
{
  "encrypted-data-key": "<blob>",
  "name": "<ciphertext>",
  "address": [
    "<ciphertext>",
    "<ciphertext>"
  ],
  "social-security": "<ciphertext>",
  "zip-code": "<ciphertext>"
}
```

If this record is decrypted with code that looks something like this:

```php
$decrypted = [];
// ... snip ...
foreach ($record['address'] as $i => $addrLine) {
    try {
        $decrypted['address'][$i] = $this->decrypt($addrLine);
    } catch (Throwable $ex) {
        // You'd never deliberately do this, but it's for illustration
        $this->doSomethingAnOracleCanObserve($i);
        // This is more believable, of course:
        $this->logDecryptionError($ex, $addrLine);
        $decrypted['address'][$i] = '';
    }
}
```

Then you can keep appending rows to the `"address"` field to reduce the number of writes needed to exploit a padding oracle attack against any of the `<ciphertext>` fields.

Art: Harubaki

This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.
Record Authentication
A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.
Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.
The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.
This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.
Except…
When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.
Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.
And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)
I don’t have any specific real-world examples here that I can point to of this problem being solved well.

Art: CMYKat
Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.
As I said with encryption before, but also true here:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
Art: Harubaki

Okay, how do you solve this problem so that you can perform updates and upgrades to your schema but without enabling attackers to downgrade the security? Here’s one possible design.
Let’s say you have two metadata fields on each record:
- A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this `meta-auth`.
- A compressed binary string representing which of the authenticated fields should also be encrypted. This field is also authenticated. This is at most the same length as the first metadata field. Let’s call this `meta-enc`.
Furthermore, you will specify a canonical field ordering, both for how data is fed into the signature algorithm and for the field mappings in `meta-auth` and `meta-enc`.

```
{
  "example": {
    "credit-card": {
      "number": /* encrypted */,
      "expiration": /* encrypted */,
      "ccv": /* encrypted */
    },
    "superfluous": {
      "rewards-member": null
    }
  },
  "meta-auth": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true   /* meta-enc */
  ]),
  "meta-enc": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false  /* example.superfluous.rewards-member */
  ]),
  "signature": /* -- snip -- */
}
```

When you go to append data to an existing record, you’ll need to update `meta-auth` to include the mapping of fields, based on this canonical ordering, to ensure only the intended fields get validated.

When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:
- New records will have the additional field flagged as authenticated in `meta-auth` (and `meta-enc` will grow)
- Old records will not, but your code will still sign them successfully
- To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.
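To make the self-describing layout concrete, here’s a hypothetical sketch. The field paths, ordering, and the use of HMAC as a stand-in for the signature algorithm are all my own assumptions; the point is that each authenticated field contributes its full path alongside its value, and the schema version is always part of the signed input:

```python
import hashlib
import hmac
import os

# Hypothetical canonical field ordering; must match meta-auth's bit order.
FIELD_ORDER = [
    "example.credit-card.number",
    "example.credit-card.ccv",
]

def signing_input(record: dict, meta_auth: list, schema_version: int) -> bytes:
    # The schema version is always authenticated, to block downgrade attacks.
    out = schema_version.to_bytes(4, "big")
    for path, authenticated in zip(FIELD_ORDER, meta_auth):
        if not authenticated:
            continue
        # Length-prefix path AND value: where data lives is part of its identity.
        for piece in (path.encode(), str(record.get(path, "")).encode()):
            out += len(piece).to_bytes(8, "big") + piece
    return out

mac_key = os.urandom(32)
record = {
    "example.credit-card.number": "enc:aaaa",
    "example.credit-card.ccv": "enc:bbbb",
}
sig = hmac.new(mac_key, signing_input(record, [True, True], 1), hashlib.sha256).digest()

# Swapping two authenticated values between paths changes the signing input,
# so the stored signature no longer verifies:
swapped = {
    "example.credit-card.number": "enc:bbbb",
    "example.credit-card.ccv": "enc:aaaa",
}
print(signing_input(record, [True, True], 1) != signing_input(swapped, [True, True], 1))  # True
```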
You might think `meta-auth` gives an attacker some advantage, but it only encodes which fields are included in the security boundary of the signature or MAC. This allows unauthenticated data to be appended for whatever operational purpose, without having to update signatures or expose signing keys to a wider part of the network.

```
{
  "example": {
    "credit-card": {
      "number": /* encrypted */,
      "expiration": /* encrypted */,
      "ccv": /* encrypted */
    },
    "superfluous": {
      "rewards-member": null
    }
  },
  "meta-auth": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true,  /* meta-enc */
    true   /* meta-version */
  ]),
  "meta-enc": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true   /* meta-version */
  ]),
  "meta-version": 0x01000000,
  "signature": /* -- snip -- */
}
```

If an attacker tries to use the `meta-auth` field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).

Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
Searchable Encryption
If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.
Art: CMYKat

Okay, put away the pitchforks; I will explain.
Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.
And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.
Art: Sophie

So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:
- Encrypting the data, to protect it from unauthorized disclosure, but render it useless
- Doing anything useful with the data, but leaving it unencrypted in the database
This is especially annoying for business types that are all in on the Zero Trust buzzword.
Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.
Order-{Preserving, Revealing} Encryption
On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.
[D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.
Attack of the week: searchable encryption and the ever-expanding leakage function
The problem with these designs is that they leak so much information that they no longer provide semantic security.
From Grubbs, et al. (GLMP, 2019.) Colors inverted to fit my blog’s theme better.

To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.
Order revealing
Reveals much more than order
Softcore ECB

Order preserving
Semantic security?
Only in your dreams

Haiku for your consideration
Deterministic Encryption
Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.
If you recall the `AES_{De,En}crypt()` functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.

```sql
SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');
```

However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can only safely encrypt a small number of distinct records before you’re no longer secure.

From Page 14 of the linked paper.

That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.
Homomorphic Encryption
In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated with serving Wikipedia pages in a reasonable amount of time.
Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.
If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
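For intuition, RSA’s multiplicative homomorphism can be demonstrated in a few lines of textbook RSA (tiny primes, no padding; wildly insecure and purely illustrative):

```python
# Textbook RSA with toy parameters (no padding -- never do this for real).
p, q = 61, 53
n, e = p * q, 17                    # n = 3233
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent via modular inverse

encrypt = lambda m: pow(m, e, n)
decrypt = lambda c: pow(c, d, n)

a, b = 12, 7
# Multiplying two ciphertexts multiplies the plaintexts underneath (mod n):
print(decrypt((encrypt(a) * encrypt(b)) % n))  # 84
```

A stream cipher exhibits the analogous property for XOR: flipping bits in the ciphertext flips the same bits in the plaintext, which is exactly what chosen-ciphertext attacks exploit.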
Art: Harubaki

Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact, is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.
However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.
Just say “NO” to the copyright cartels.

Art: CMYKat
Searchable Symmetric Encryption (SSE)
Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?
Art: CMYKat

That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.
The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.
Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.
Some schemes only allow for searching a static set of records, while others can accommodate new data over time (with the trade-off between more leakage or worse performance).
If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.
You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”
Your guess is as good as mine, but I can think of a few reasons:
- The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (e.g. OpenSSL’s libcrypto or libsodium).
- Every known security risk in SSE is the product of trade-offs, rather than there being a single winner for all use cases that developers can feel comfortable picking.
- Insufficient marketing and developer advocacy.
SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.
Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.
You Can Have Little a HMAC, As a Treat
Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.
The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.
The idea I’m alluding to goes like this:
- Encrypt your data, securely, using symmetric cryptography. (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
- Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.
When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?
Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?
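A minimal sketch of the pattern (the names and in-memory “table” are illustrative assumptions, not a vetted design):

```python
import hashlib
import hmac
import os

# Static indexing key, kept separate from the (rotatable) encryption key.
index_key = os.urandom(32)

def blind_index(plaintext: str) -> str:
    return hmac.new(index_key, plaintext.encode(), hashlib.sha256).hexdigest()

# On INSERT: store blind_index(email) in an indexed column next to the ciphertext.
table = {blind_index("alice@example.com"): "<ciphertext>"}

# On SELECT: recompute the HMAC of the query and do a literal lookup.
print(blind_index("alice@example.com") in table)  # True
print(blind_index("bob@example.com") in table)    # False
```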
Hooray!

However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.
How annoying!

Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.
Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.
“K-What Now?”
Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.
Each HMAC output corresponds 1:1 with a unique plaintext.
Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.
For example, an HMAC-SHA256 output might look like this:

Art: CMYKat

```
04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae
```

If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields `a8b1ae`), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.

Which means, if you’re only revealing a truncated HMAC tag to the database server (both when storing and retrieving records), you can now expect false positives due to collisions in your truncated HMAC tag.
These false positives give your data a degree of anonymity (called k-anonymity), which means an attacker with access to your database cannot:
- Distinguish between two encrypted records with the same short HMAC tag.
- Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
As with SSE above, this short HMAC technique exposes a trade-off to users.
- Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
- Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.
Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
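To make the trade-off concrete, here’s a stdlib-only sketch (the tag width and dataset size are arbitrary choices of mine) showing how truncation forces many plaintexts into each bucket:

```python
import hashlib
import hmac
import os
from collections import Counter

index_key = os.urandom(32)

def short_tag(plaintext: str, bits: int = 8) -> int:
    digest = hmac.new(index_key, plaintext.encode(), hashlib.sha256).digest()
    # Keep only the top `bits` bits of the 256-bit tag.
    return int.from_bytes(digest, "big") >> (256 - bits)

# 10,000 distinct plaintexts can land in at most 2^8 = 256 buckets:
buckets = Counter(short_tag(f"user-{i}@example.com") for i in range(10_000))
print(len(buckets) <= 256)        # True
print(max(buckets.values()) > 1)  # True: many plaintexts share each tag
```

Widening `bits` shrinks the buckets (fewer false positives, faster queries, less anonymity); narrowing it does the opposite.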
If you’d like to play with this idea, here’s a quick and dirty demo script.
Intermission
If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, by now they’ve probably been long since put to rest.
Art: Harubaki

Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting; their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.
Thus, if you’ve read this far, I’d like to demonstrate how what I’ve covered thus far applies to a real-world case study of a database cryptography product.
Case Study: MongoDB Client-Side Encryption
MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.
Taken from the press release, but adapted for dark themes.
A statement at the bottom of their press release indicates that this isn’t clown-shoes:
Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.
If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.
So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.
MongoCrypt: The Good
MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).
MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.
How Is Queryable Encryption Implemented?
From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).
According to their press release:
The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.
Which means that most of the juicy details probably aren’t public yet.
These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.
There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.
Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.
However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.
MongoCrypt: The Bad
Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.
However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.
First, observe this code which assembles the multi-part inputs into HMAC.
```c
/* Construct the input to the HMAC */
uint32_t num_intermediates = 0;
_mongocrypt_buffer_t intermediates[3];
// -- snip --
if (!_mongocrypt_buffer_concat (
       &to_hmac, intermediates, num_intermediates)) {
   CLIENT_ERR ("failed to allocate buffer");
   goto done;
}
if (hmac == HMAC_SHA_512_256) {
   uint8_t storage[64];
   _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};
   if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
      goto done;
   }
   // Truncate sha512 to first 256 bits.
   memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
} else {
   BSON_ASSERT (hmac == HMAC_SHA_256);
   if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
      goto done;
   }
}
```

The implementation of _mongocrypt_buffer_concat() can be found here.
If either the implementation of that function, or the code I snipped from my excerpt, had contained code that prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.
Using TupleHash would also have prevented this issue.
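To make the length-prefixing idea concrete, here's a sketch of such an encoding in PHP (a simplified pre-authentication encoding in the spirit of PASETO's PAE; the function name is hypothetical):

```php
<?php
// Illustrative sketch: length-prefixed concatenation of multi-part
// inputs before MACing. Each piece is prefixed with its length as an
// unsigned 64-bit little-endian integer, so ["foo", "bar"] and
// ["foob", "ar"] can never serialize to the same byte string.
function pre_auth_encode(string ...$pieces): string
{
    $out = pack('P', count($pieces)); // number of pieces, uint64 LE
    foreach ($pieces as $piece) {
        $out .= pack('P', strlen($piece)); // length prefix
        $out .= $piece;
    }
    return $out;
}
```

Feeding the output of something like this to HMAC removes the ambiguity that makes canonicalization attacks possible.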
Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.
The first cryptographic flaw sort of cancels the second out.
If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.
MongoCrypt: The Ugly
MongoCrypt supports deterministic encryption.
If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.
We already discussed why this is bad above.
Wrapping Up
This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.
However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.
Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.
Art: CMYKat
https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
#appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption
-
If you’re ever tasked with implementing a cryptography feature–whether a high-level protocol or a low-level primitive–you will have to take special care to ensure you’re not leaking secret information through side-channels.
The descriptions of algorithms you learn in a classroom or textbook are not sufficient for real-world use. (Yes, that means your toy RSA implementation based on GMP from your computer science 101 class isn’t production-ready. Don’t deploy it.)
But what are these elusive side-channels exactly, and how do you prevent them? And in cases where you cannot prevent them, how can you mitigate the risk to your users?
Art by Swizz.
Contents
- Cryptographic Side-Channels
- Timing Leaks
- Power Usage
- Electromagnetic Emissions
- Side-Channel Prevention and Mitigation
- Prevention vs. Mitigation
- What is Constant-Time?
- Malicious Environments and Algorithmic Constant-Time
- Mitigation with Blinding Techniques
- Design Patterns for Algorithmic Constant-Time Code
- Constant-Time String Comparison
- Alternative: “Double HMAC” String Comparison
- Constant-Time Conditional Select
- Constant-Time String Inequality Comparison
- Constant-Time Integer Multiplication
- Constant-Time Integer Division
- Constant-Time Modular Inversion
- Constant-Time Null-Byte Trimming
- Further Reading and Online Resources
- Errata
Cryptographic Side-Channels
The concept of a side-channel isn’t inherently cryptographic, as Taylor Hornby demonstrates, but a side-channel can be a game over vulnerability in a system meant to maintain confidentiality (even if only for its cryptography keys).
Cryptographic side-channels allow an attacker to learn secret data from your cryptography system. To accomplish this, the attacker doesn’t necessarily study the system’s output (i.e. ciphertext); instead, they observe some other measurement, such as how much time or power was spent performing an operation, or what kind of electromagnetic radiation was emitted.
Important: While being resistant to side-channels is a prerequisite for implementations to be secure, it isn’t in and of itself sufficient for security. The underlying design of the primitives, constructions, and high-level protocols needs to be secure first, and that requires a clear and specific threat model for what you’re building.
Constant-time ECDSA doesn’t help you if you reuse k-values like it’s going out of style, but variable-time ECDSA still leaks your secret key to anyone who cares to probe your response times. Secure cryptography is very demanding.
Art by Riley.
Timing Leaks
Timing side-channels leak secrets through how much time it takes for an operation to complete.
There are many different flavors of timing leakage, including:
- Fast-failing comparison functions (memcmp() in C)
- Cache-timing vulnerabilities (e.g. software AES)
- Memory access patterns
- Conditional branches controlled by secrets
The bad news about timing leaks is that they’re almost always visible to an attacker over the network (including over the Internet (PDF)).
The good news is that most of them can be prevented or mitigated in software.
Art by Kyume.
Power Usage
Different algorithms or processor operations may require different amounts of power.
For example, squaring a large number may take less power than multiplying two different large numbers. This observation has led to the development of power analysis attacks against RSA.
Power analysis is especially relevant for embedded systems and smart cards, which are easier to extract a meaningful signal from than your desktop computer.
Some information leakage through power usage can be prevented through careful engineering (for example: BearSSL, which uses Montgomery multiplication instead of square-and-multiply).
But that’s not always an option, so generally these risks are mitigated.
My reaction when I first learned of power leaks: WATT (Art by Swizz)
Electromagnetic Emissions
Your computer is a reliable source of electromagnetic emissions (such as radio waves). Some of these emissions may reveal information about your cryptographic secrets, especially to an attacker with physical proximity to your device.
The good news is that research into EM emission side-channels isn’t as mature as side-channels through timing leaks or power usage. The bad news is that mitigations for breakthroughs will generally require hardware (e.g. electromagnetic shielding).
Aren’t computers terrifying? (Art by Swizz)
Side-Channel Prevention and Mitigation
Now that we’ve established a rough sense of some of the types of side-channels that are possible, we can begin to identify what causes them and aspire to prevent the leaks from happening–and where we can’t, to mitigate the risk to a reasonable level.
Note: To be clear, I didn’t cover all of the types of side-channels.
Prevention vs. Mitigation
Preventing a side-channel means eliminating the conditions that allow the information leak to occur in the first place. For timing leaks, this means making all algorithms constant-time.
There are entire classes of side-channel leaks that aren’t possible or practical to mitigate in software. When you encounter one, the best you can hope to do is mitigate the risk.
Ideally, you want to make the attack more expensive to pull off than the reward an attacker will gain from it.
What is Constant-Time?
Toto, I don’t think we’re in Kansas anymore.
When an implementation is said to be constant-time, what we mean is that the execution time of the code is not a function of its secret inputs.
Vulnerable AES uses table look-ups to implement the S-Box. Constant-time AES is either implemented in hardware, or is bitsliced.
Malicious Environments and Algorithmic Constant-Time
One of the greatest challenges with writing constant-time code is distinguishing between algorithmic constant-time and provably constant-time. The main difference between the two is that you cannot trust your compiler (especially a JIT compiler), which may attempt to optimize your code in a way that reintroduces the side-channel you aspired to remove.
A sufficiently advanced compiler optimization is indistinguishable from an adversary.
John Regehr, possibly with apologies to Arthur C. Clarke
For compiled languages, this is a tractable but expensive problem to solve: You simply have to formally verify everything from the source code to the compiler to the silicon chips that the code will be deployed on, and then audit your supply chain to prevent malicious tampering from going undetected.
For interpreted languages (e.g. PHP and JavaScript), this formal verification strategy isn’t really an option, unless you want to formally verify the runtime that interprets scripts and prove that the operations remain constant-time on top of all the other layers of distrust.
Is this level of paranoia really worth the effort?
For our cases, anyway! (Art by Khia.)
For that reason, we’re going to assume that algorithmic constant-time is adequate for the duration of this blog post.
If your threat model prevents you from accepting this assumption, feel free to put in the extra effort yourself and tell me how it goes. After all, as a furry who writes blog posts in my spare time for fun, I don’t exactly have the budget for massive research projects in formal verification.
Mitigation with Blinding Techniques
The best mitigation for some side-channels is called blinding: Obfuscating the inputs with some random data, then deobfuscating the outputs with the same random data, such that your keys are not revealed.
Two well-known examples include RSA decryption and Elliptic Curve Diffie-Hellman. I’ll focus on the latter, since it’s not as widely covered in the literature (although several cryptographers I’ve talked with were somehow knowledgeable about it; I suspect gatekeeping is involved).
Blinded ECDH Key Exchange
In typical ECDH implementations, you will convert a point on a Weierstrass curve (x, y) to a Jacobian coordinate system (X, Y, Z).
The exact conversion formula is (X = xZ², Y = yZ³). The conversion almost makes intuitive sense.
Where does Z come from though?
Art by circuitslime
It turns out, the choice for Z is totally arbitrary. Libraries typically set it equal to 1 (for best performance), but you can also set it to a random number. (You cannot set it to 0, however, for obvious reasons.)
Choosing a random number means the calculations performed over Jacobian coordinates will be obscured by a randomly chosen factor (and thus, if Z is only used once per scalar multiplication, the bitwise signal the attackers rely on will be lost).
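To sketch the blinding step itself, here's an illustrative conversion of an affine point to Jacobian coordinates with a random Z, using PHP's GMP extension (names are mine; a real ECC library would do this internally as part of scalar multiplication):

```php
<?php
// Sketch of randomized-Z Jacobian conversion for an affine point (x, y)
// over a prime field p. With a fresh random Z per scalar multiplication,
// intermediate values are blinded by an attacker-unknown factor.
function to_jacobian_blinded(\GMP $x, \GMP $y, \GMP $p): array
{
    // Z is drawn from [1, p - 1], so it is never zero.
    $z = gmp_random_range(1, gmp_sub($p, 1));
    $z2 = gmp_mod(gmp_mul($z, $z), $p);  // Z^2 mod p
    $z3 = gmp_mod(gmp_mul($z2, $z), $p); // Z^3 mod p
    return [
        'X' => gmp_mod(gmp_mul($x, $z2), $p), // X = x * Z^2 mod p
        'Y' => gmp_mod(gmp_mul($y, $z3), $p), // Y = y * Z^3 mod p
        'Z' => $z,
    ];
}
```

Converting back (x = X/Z², y = Y/Z³) recovers the original point regardless of which Z was chosen, which is exactly why the choice is free.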
Blinding techniques are cool. (Art by Khia.)
I think it’s really cool how one small tweak to the runtime of an algorithm can make it significantly harder to attack.
Design Patterns for Algorithmic Constant-Time Code
Mitigation techniques are cool, but preventing side-channels is a better value-add for most software.
To that end, let’s look at some design patterns for constant-time software. Some of these are relatively common; others, not so much.
Art by Scout Pawfoot.
If you prefer TypeScript / JavaScript, check out Soatok’s constant-time-js library on GitHub / NPM.
Constant-Time String Comparison
Rather than using string comparison (== in most programming languages, memcmp() in C), you want to compare cryptographic secrets and/or calculated integrity checks with a secure compare algorithm, which looks like this:
- Initialize a variable (let’s call it D) to zero.
- For each byte of the two strings:
- Calculate (lefti XOR righti)
- Bitwise OR the current value of D with the result of the XOR, store the output in D
- When the loop has concluded, D will be equal to 0 if and only if the two strings are equal.
In code form, it looks like this:
```php
<?php
function ct_compare(string $left, string $right): bool
{
    $d = 0;
    $length = mb_strlen($left, '8bit');
    if (mb_strlen($right, '8bit') !== $length) {
        return false; // Lengths differ
    }
    for ($i = 0; $i < $length; ++$i) {
        $leftCharCode = unpack('C', $left[$i])[1];
        $rightCharCode = unpack('C', $right[$i])[1];
        $d |= ($leftCharCode ^ $rightCharCode);
    }
    return $d === 0;
}
```

In this example, I’m using PHP’s unpack() function to avoid cache-timing leaks with ord() and chr(). Of course, you can simply use hash_equals() instead of writing it yourself (PHP 5.6.0+).
Alternative: “Double HMAC” String Comparison
If the previous algorithm won’t work (i.e. because you’re concerned your JIT compiler will optimize it away), there is a popular alternative to consider. It’s called “Double HMAC” because it was traditionally used with Encrypt-Then-HMAC schemes.
The algorithm looks like this:
- Generate a random 256-bit key, K. (This can be cached between invocations, but it should be unpredictable.)
- Calculate HMAC-SHA256(K, left).
- Calculate HMAC-SHA256(K, right).
- Return true if the outputs of step 2 and 3 are equal.
This is provably secure, so long as HMAC-SHA256 is a secure pseudo-random function and the key K is unknown to the attacker.
In code form, the Double HMAC compare function looks like this:
```php
<?php
function hmac_compare(string $left, string $right): bool
{
    static $k = null;
    if (!$k) $k = random_bytes(32);
    return (
        hash_hmac('sha256', $left, $k)
        ===
        hash_hmac('sha256', $right, $k)
    );
}
```

Constant-Time Conditional Select
I like to imagine a conversation between a cryptography engineer and a Zen Buddhist, that unfolds like so:
- CE: “I want to eliminate branching side-channels from my code.”
- ZB: “Then do not have branches in your code.”
And that is precisely what we intend to do with a constant-time conditional select: Eliminate branches by conditionally returning between one of two strings, without an IF statement.
Mind. Blown. (Art by Khia.)
This isn’t as tricky as it sounds. We’re going to use XOR and two’s complement to achieve this.
The algorithm looks like this:
- Convert the selection bit (TRUE/FALSE) into a mask value (-1 for TRUE, 0 for FALSE). Bitwise, -1 looks like 111111111…1111111111, while 0 looks like 00000000…00000000.
- Copy the right string into a buffer, call it tmp.
- Calculate left XOR right, call it x.
- Return (tmp XOR (x AND mask)).
Once again, in code this algorithm looks like this:
```php
<?php
function ct_select(
    bool $returnLeft,
    string $left,
    string $right
): string {
    $length = mb_strlen($left, '8bit');
    if (mb_strlen($right, '8bit') !== $length) {
        throw new Exception('ct_select() expects two strings of equal length');
    }
    // Mask byte
    $mask = (-$returnLeft) & 0xff;
    // X
    $x = (string) ($left ^ $right);
    // Output = Right XOR (X AND Mask)
    $output = '';
    for ($i = 0; $i < $length; $i++) {
        $rightCharCode = unpack('C', $right[$i])[1];
        $xCharCode = unpack('C', $x[$i])[1];
        $output .= pack(
            'C',
            $rightCharCode ^ ($xCharCode & $mask)
        );
    }
    return $output;
}
```

You can test this code for yourself here. The function was designed to read intuitively like a ternary operator.
A Word of Caution on Cleverness
In some languages, it may seem tempting to use the bitwise trickery to swap out pointers instead of returning a new buffer. But do not fall for this Siren song.
If, instead of returning a new buffer, you just swap pointers, what you’ll end up doing is creating a timing leak through your memory access patterns. This can culminate in a timing vulnerability, but even if your data is too big to fit in a processor’s cache line (I dunno, Post-Quantum RSA keys?), there’s another risk to consider.
Virtual memory addresses are just beautiful lies. Where your data lives on the actual hardware memory is entirely up to the kernel. You can have two blobs with contiguous virtual memory addresses that live on separate memory pages, or even separate RAM chips (if you have multiple).
If you’re swapping pointers around, and they point to two different pieces of hardware, and one is slightly faster to read from than the other, you can introduce yet another timing attack through which pointer is being referenced by the processor.
It’s timing leaks all the way down! (Art by Swizz)
If you’re swapping between X and Y before performing a calculation, where:
- X lives on RAM chip 1, which takes 3 ns to read
- Y lives on RAM chip 2, which takes 4 ns to read
…then the subsequent use of the swapped pointers reveals whether you’re operating on X or Y in the timing: It will take slightly longer to read from Y than from X.
The best way to mitigate this problem is to never design your software to have it in the first place. Don’t be clever on this one.
Constant-Time String Inequality Comparison
Sometimes you don’t just need to know if two strings are equal, you also need to know which one is larger than the other.
To accomplish this in constant-time, we need to maintain two state variables:
- gt (initialized to 0, will be set to 1 at some point if left > right)
- eq (initialized to 1, will be set to 0 at some point if left != right)
Endian-ness will dictate the direction our algorithm goes, but we’re going to perform two operations in each cycle:
- gt should be bitwise ORed with (eq AND ((right - left) right-shifted 8 times))
- eq should be bitwise ANDed with (((right XOR left) - 1) right-shifted 8 times)
If right and left are ever different, eq will be set to 0.
If the first time they’re different the value of lefti is greater than the value of righti, then the subtraction (righti - lefti) will produce a negative number. Right-shifting a negative number 8 places yields all one-bits; bitwise ANDing that with eq (which is 1 until the first differing byte, and 0 thereafter) gives 1, and ORing that into gt sets gt to 1. Thus, if (righti - lefti) is negative, gt will be set to 1. Otherwise, it remains 0.
At the end of this loop, return (gt + gt + eq) – 1. This will result in the following possible values:
- left < right: -1
- left == right: 0
- left > right: 1
The arithmetic based on the possible values of gt and eq should be straightforward.
- Different (eq == 0) but not greater (gt == 0) means left < right, -1.
- Different (eq == 0) and greater (gt == 1) means left > right, 1.
- If eq == 1, no bytes ever differed, so left == right, 0.
A little endian implementation is as follows:
```php
<?php
function str_compare(string $left, string $right): int
{
    $length = mb_strlen($left, '8bit');
    if (mb_strlen($right, '8bit') !== $length) {
        throw new Exception('str_compare() expects two strings of equal length');
    }
    $gt = 0;
    $eq = 1;
    $i = $length;
    while ($i > 0) {
        --$i;
        $leftCharCode = unpack('C', $left[$i])[1];
        $rightCharCode = unpack('C', $right[$i])[1];
        $gt |= (($rightCharCode - $leftCharCode) >> 8) & $eq;
        $eq &= (($rightCharCode ^ $leftCharCode) - 1) >> 8;
    }
    return ($gt + $gt + $eq) - 1;
}
```

Demo for this function is available here.
Constant-Time Integer Multiplication
Multiplying two integers is one of those arithmetic operations that should be constant-time. But on many older processors, it isn’t.
Of course there’s a microarchitecture timing leak! (Art by Khia.)
Fortunately, there is a workaround. It involves an algorithm called Ancient Egyptian Multiplication in some places, or Peasant Multiplication in others.
Multiplying two numbers x and y this way looks like this:
- Determine the number of operations you need to perform. Generally, this is either known ahead of time or the number of bits in y.
- Set z to 0.
- Until the operation count reaches zero:
- If the lowest bit of y is set, add x to z.
- Left shift x by 1.
- Right shift y by 1.
- Return z.
The main caveat here is that you want to use bitwise operators in step 3.1 to remove the conditional branch.
Rather than bundle example code in our blog post, please refer to the implementation in sodium_compat (a pure PHP polyfill for libsodium).
For big number libraries, implementing Karatsuba on top of this integer multiplying function should be faster than attempting to multiply bignums this way.
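The branchless trick in step 3.1 can be sketched like so (my own illustrative code, not the sodium_compat implementation), using 16-bit operands so the product fits comfortably in a native PHP integer:

```php
<?php
// Illustrative branchless Egyptian/peasant multiplication for 16-bit
// unsigned operands. The two's-complement mask replaces the
// "if lowest bit of y is set" conditional branch.
function ct_multiply(int $x, int $y): int
{
    $z = 0;
    for ($i = 0; $i < 16; ++$i) {
        $mask = -($y & 1);  // -1 (all ones) if low bit set, else 0
        $z += $x & $mask;   // conditionally add x without branching
        $x <<= 1;
        $y >>= 1;
    }
    return $z;
}
```

The loop always runs a fixed 16 iterations, so the operation count depends only on the operand width, never the operand values.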
Constant-Time Integer Division
Although some cryptography algorithms call for integer division, division isn’t usually expected to be constant-time.
However, if you look up a division algorithm for unsigned integers with a remainder, you’ll likely encounter this algorithm, which is almost constant-time:
```
if D = 0 then error(DivisionByZeroException) end
Q := 0                 -- Initialize quotient and remainder to zero
R := 0
for i := n - 1 .. 0 do -- Where n is number of bits in N
  R := R << 1          -- Left-shift R by 1 bit
  R(0) := N(i)         -- Set the least-significant bit of R equal to bit i of the numerator
  if R >= D then
    R := R - D
    Q(i) := 1
  end
end
```

If we use the tricks we learned from implementing constant-time string inequality with constant-time conditional selection, we can implement this algorithm without timing leaks.
Our constant-time version of this algorithm looks like this:
```
if D = 0 then error(DivisionByZeroException) end
Q := 0                 -- Initialize quotient and remainder to zero
R := 0
for i := n - 1 .. 0 do -- Where n is number of bits in N
  R := R << 1          -- Left-shift R by 1 bit
  R(0) := N(i)         -- Set the least-significant bit of R equal to bit i of the numerator

  compared := ct_compare(R, D) -- Use constant-time inequality
  -- if R > D then compared == 1, swap = 1
  -- if R == D then compared == 0, swap = 1
  -- if R < D then compared == -1, swap = 0
  swap := (1 - ((compared >> 31) & 1))

  -- R' = R - D
  -- Q' = Q, Q[i] = 1
  Rprime := R - D
  Qprime := Q
  Qprime(i) := 1       -- The i'th bit is set to 1

  -- Replace (R with R', Q with Q') if swap == 1
  R = ct_select(swap, Rprime, R)
  Q = ct_select(swap, Qprime, Q)
end
```

It’s approximately twice as slow as the original, but it’s constant-time.
(Art by Khia.)
Constant-Time Modular Inversion
Modular inversion is the calculation of x⁻¹ mod p for some prime p. This is used in a lot of places, but especially in elliptic curve cryptography and RSA.
Daniel J. Bernstein and Bo-Yin Yang published a paper on fast constant-time GCD and Modular Inversion in 2019. The algorithm in question is somewhat straightforward to implement (although determining whether or not that implementation is safe is left as an exercise to the rest of us).
A simpler technique is to use Fermat’s Little Theorem: x^(p-2) ≡ x⁻¹ (mod p) for some prime p. This only works with prime fields, and is slower than a Binary GCD (which isn’t necessarily constant-time, as OpenSSL discovered).
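The Fermat approach is short enough to illustrate with PHP's GMP extension (a sketch of the math only, not a hardened implementation, since GMP's modular exponentiation is not guaranteed to run in constant time):

```php
<?php
// Fermat inversion: x^(p-2) mod p equals x^-1 mod p when p is prime.
// Caveat: gmp_powm() is not a constant-time primitive, so treat this
// as an illustration of the identity rather than production code.
function fermat_invert(\GMP $x, \GMP $p): \GMP
{
    return gmp_powm($x, gmp_sub($p, 2), $p);
}
```

Multiplying the result back by x modulo p should always yield 1, which makes this easy to sanity-check.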
BearSSL provides an implementation (and accompanying documentation) for a constant-time modular inversion algorithm based on Binary GCD.
(In the future, I may update this section of this blog post with an implementation in PHP, using the GMP extension.)
Constant-Time Null-Byte Trimming
Shortly after this guide first went online, security researchers published the Raccoon Attack, which used a timing leak in the number of leading 0 bytes in the pre-master secret–combined with a lattice attack to solve the hidden number problem–to break TLS-DH(E).
To solve this, you need two components:
- A function that returns a slice of an array without timing leaks.
- A function that counts the number of significant bytes (i.e. ignores leading zero bytes, counts from the first non-zero byte).
A timing-safe array resize function needs to do two things:
- Touch every byte of the input array once.
- Touch every byte of the output array at least once, linearly. The constant-time division algorithm is useful here (to calculate x mod n for the output array index).
- Conditionally select between input[x] and the existing output[x_mod_n], based on whether x >= target size.
I’ve implemented this in my constant-time-js library.
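As a sketch of the second component above (counting significant bytes), here's one way to do it in PHP without branching on the data (my own illustrative code, not the constant-time-js implementation):

```php
<?php
// Count the significant bytes of a buffer (i.e. its length ignoring
// leading zero bytes) while touching every byte exactly once and
// never branching on secret data.
function ct_count_significant_bytes(string $buf): int
{
    $count = 0;
    $seen = 0; // flips to 1 at the first nonzero byte, then stays 1
    $length = strlen($buf);
    for ($i = 0; $i < $length; ++$i) {
        $byte = unpack('C', $buf[$i])[1];
        // 1 if $byte is nonzero, 0 otherwise (no data-dependent branch)
        $nonzero = (($byte | -$byte) >> 8) & 1;
        $seen |= $nonzero;
        $count += $seen;
    }
    return $count;
}
```

The ($byte | -$byte) trick sets the sign bit exactly when the byte is nonzero, which the arithmetic right shift then smears into a usable 0-or-1 flag.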
Further Reading and Online Resources
If you’re at all interested in cryptographic side-channels, your hunger for knowledge probably won’t be sated by a single blog post. Here’s a collection of articles, papers, books, etc. worth reading.
- BearSSL’s Documentation on Constant-Time Code — A must-read for anyone interested in this topic
- Cryptographically Secure PHP Development — How to write secure cryptography in languages that cryptographers largely neglect
- CryptoCoding — A style guide for writing secure cryptography code in C (with example code!)
- CryptoGotchas — An overview of the common mistakes one can make when writing cryptography code (which is a much wider scope than side-channels)
- Meltdown and Spectre — Two vulnerabilities that placed side-channels in the scope of most of infosec that isn’t interested in cryptography
- Serious Cryptography — For anyone who lacks the background knowledge to fully understand what I’m talking about on this page
Errata
- 2020-08-27: The original version of this blog post incorrectly attributed Jacobian coordinate blinding to ECDSA hardening, rather than ECDH hardening. This error was brought to my attention by Thai Duong. Thanks Thai!
- 2020-08-27: Erin correctly pointed out that omitting memory access timing was a disservice to developers, who might not be aware of the risks involved. I’ve updated the post to call this risk out specifically (especially in the conditional select code, which some developers might try to implement with pointer swapping without knowing the risks involved). Thanks Erin!
I hope you find this guide to side-channels helpful.
Thanks for reading! Follow my blog for more Defense Against the Bark Arts posts in the future.
https://soatok.blog/2020/08/27/soatoks-guide-to-side-channel-attacks/
#asymmetricCryptography #constantTime #cryptography #ECDH #ECDSA #ellipticCurveCryptography #RSA #SecurityGuidance #sideChannels #symmetricCryptography