#encryption-at-rest — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #encryption-at-rest, aggregated by home.social.
-
React-like functional webcomponents, but with vanilla HTML, JS and CSS
Introducing Dim – a new #Framework that brings #ReactJS-like functional #JSX-syntax with #VanillaJS. Check it out here:
🔗 Project: https://github.com/positive-intentions/dim
🔗 Website: https://dim.positive-intentions.com
My journey with #WebComponents started with Lit, and while I appreciated its native browser support (less #Tooling!), coming from #ReactJS, the class components felt like a step backward. The #FunctionalProgramming approach in React significantly improved my #DeveloperExperience and debugging flow.
So, I set out to build a thin, functional wrapper around #Lit, and Dim is the result! It's a #ProofOfConcept right now, with "main" #Hooks similar to React, plus some custom ones like useStore for #EncryptionAtRest. (Note: #StateManagement for encryption-at-rest is still unstable and currently uses a hardcoded password while I explore #Passwordless options like #WebAuthn/#Passkeys).
You can dive deeper into the #Documentation and see how it works here:
📚 Dim Docs: https://positive-intentions.com/docs/category/dim
This #OpenSource project is still in its early stages and very #Unstable, so expect #BreakingChanges. I've already received valuable #Feedback on some functions regarding #Security, and I'm actively investigating those. I'm genuinely open to all feedback as I continue to develop it!
#FrontendDev #JSFramework #Innovation #Coding #Programmer #Tech
-
Heads up: This is a blog post about applied cryptography, with a focus on web and cloud applications that encrypt data at rest in a database or filesystem. While the lessons can be broadly applicable, the scope of the post is not.
One of the lessons I learned during my time at AWS Cryptography (and particularly as an AWS Crypto Bar Raiser) is that the threat model for Encryption At Rest is often undefined.
Prior to consulting cryptography experts, most software developers do not have a clear and concise understanding of the risks they’re facing, let alone how or why encrypting data at rest would help protect their customers.
Unsurprisingly, I’ve heard a few infosec thought leader types insist over the years that encryption-at-rest is security theater. I disagree with this assessment in absolute terms, but there is a nugget of truth in that assertion.
The million dollar question.
Let’s explore this subject in a little more detail.
Why should we listen to you about this topic?
(If you don’t need any convincing, feel free to skip this section.)
Encryption at rest is a particular hobby horse of mine. I previously wrote on this blog about the under-celebrated design decisions in the AWS Database Encryption SDK and the need for key-committing AEAD modes in multi-tenant data lakes.
Before my time at Amazon, I had also designed a PHP library called CipherSweet that offers a limited type of Searchable Encryption. The goal of CipherSweet was to improve the cryptography used by SuiteCRM. (The library name is, of course, a pun.)
I’ve also contributed a ton of time making cryptography easy-to-use and hard to misuse outside of the narrow use-case that is at-rest data encryption. To that end, I designed PASETO as a secure-by-default alternative to JSON Web Tokens.
I also have a lot of skin in the game when it comes to developer comprehension: I was the first Stack Overflow user with a gold badge for both [security] and [encryption], largely due to the effort I put into cleaning up the bad cryptography advice for the PHP ecosystem.
I have spent the past decade or so trying to help teams avoid security disasters in one form or another.
Why should we not listen to you about this topic?
If you happen to know a cryptography expert you trust more than some Internet stranger with a blog, I implore you to listen to them if we disagree on any point. They may know something I don’t. (That said, I’m always happy to learn something new!)
I also do not have a college degree in Cryptography, nor have I published any papers in prestigious academic journals. If you care very much about this sort of pedigree, you will likely find my words easily discarded. If this describes your situation, no hard feelings.
Why and How to use Encryption At Rest to Protect Sensitive Data
Important: I’m chiefly interested in discussing one use-case, and not focusing on other use cases. Namely, I’m focusing on encryption-at-rest in the narrow context of web applications and/or cloud services.
This is not a comprehensive blog post covering every possible use case or threat model relating to encryption at rest. Those other use cases are certainly interesting, but this post is already long enough with a narrower focus.
In particular: I’m not talking about the threats faced by activists or whistleblowers. This is a software engineering and applied cryptography focused blog post.
If you’re only interested in compliance requirements, you can probably just enable Full Disk Encryption and call it a day. Then, if your server’s hard drive grows legs and walks out of the data center, your users’ most sensitive data will remain confidential.
Unfortunately, for the server-side encryption at rest use case, that’s basically all that Disk Encryption protects against.
If your application or database software is online and an attacker gains access to it (e.g., through SQL injection), data protected only by full disk encryption might as well be plaintext to that attacker.
It do be like that with online attacks.
Therefore, if you find yourself reaching for Encryption At Rest to mitigate the impact of the kind of vulnerability that would leak the contents of your database or filesystem to an attacker, you’re probably unwittingly engaging in security theater.
Disk Encryption is important for disk disposal and mitigating hardware theft, not preventing data leakage to online attackers.
So the next logical thing to do is draw a box around the system or component that stores a lot of data and never let plaintext cross that boundary.
What Do You Mean By “Encryption At Rest”?
Encryption At Rest is best contrasted with Encrypted In Transit.
For Encryption-in-Transit, think TLS.
For Encryption-at-Rest, think of anything a web app or cloud service would do to encrypt data before storing it in… wherever the data is actually stored.
If there’s another usage of the term to mean something else, it’s not one that I’m familiar with.
Client-Side Encryption
Note: The naming here is a little imprecise. It is client-side encryption with respect to your data warehouse (i.e. SQL database), but not with respect to the user experience of a web application. In those cases, client-side would mean on the actual end user’s device.
Instead, client-side encryption is the generic buzz-word to mean that you’re encrypting data outside of the box you drew in your system architecture. Generally, this means that you have an application server that’s acting as the “client” for the purpose of bulk data encryption.
There are a lot of software projects that aim to provide client-side encryption for data stored in a database or filesystem; e.g., in Amazon S3 buckets.
This is a step in the right direction, but implementation details matter a lot.
Quick aside: For the remainder of this blog post, I’m going to assume an architecture that looks like a traditional web application, for simplicity.
The assumed architecture looks vaguely like this:
- User Agents (e.g., web browsers) that communicate with the application server.
- Application Server(s) that respond to HTTP requests from user agents, manage key material using KMS, and encrypt / decrypt records stored in the database.
- Database Server(s) which store ciphertext on behalf of the application server.
This is an abstract design, so the actual implementation details you encounter in the real world may be simpler or more complex in different respects.
There are other interesting design considerations for OS-level end-user device encryption that I’m not going to explore today. For example: Adiantum is extremely cool.
I’m also not going to dive deep into laptop theft or the importance of Full Disk Encryption as a mechanism for ensuring data is erased from solid state hard drives, or the activities of hostile nation states. That’s a separate discussion entirely.
Security Considerations for Client-Side Encryption
The first question to answer when data is being encrypted is, “How are the keys being managed?” This is a very deep rabbit hole of complexity, but one good answer for a centralized service is, “Cloud-based key management service with audit logging”; e.g., AWS KMS, Google Cloud KMS, etc.
We could talk about key management for a very long time, but there are other things I want to focus on, so let’s revisit that in a future blog post.
Before we begin, you may find it helpful to read my previous blog post on a related matter: Lucid Multi-Key Deputies Require Commitment. It’s not strictly necessary, but some of the terminology used there may be helpful to understanding this one.
Next, you have to understand how the data is being encrypted in the first place.
Bulk Data Encryption Techniques
Bad answer: AES in CBC mode without HMAC.
Worse answer: AES in ECB mode.
Generally, you’re going to want to use an AEAD construction, such as AES-GCM or XChaCha20-Poly1305.
For those not in the loop: AEAD is an acronym that stands for Authenticated Encryption with Associated Data.
You’ll also want key-commitment if you’re storing data for multiple customers on the same hardware. You can get this property by stapling HKDF onto your protocol (once for key derivation, again for commitment). See also: PASETO v3 and v4, or Version 2 of the AWS Encryption SDK.
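To make the "HKDF twice" idea a little more concrete, here is a minimal Python sketch that uses only the standard library. The function names (`hkdf_sha256`, `derive_keys`, `commitment_matches`) and the two info labels are assumptions made up for illustration; this is not the wire format used by PASETO or the AWS Encryption SDK.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_keys(data_key: bytes, salt: bytes) -> tuple[bytes, bytes]:
    """Run HKDF twice: once for the encryption key, once for the commitment."""
    # Hypothetical labels; any two distinct, fixed info strings would do.
    enc_key = hkdf_sha256(data_key, salt, b"example-encryption-key")
    commitment = hkdf_sha256(data_key, salt, b"example-key-commitment")
    return enc_key, commitment


def commitment_matches(data_key: bytes, salt: bytes, stored: bytes) -> bool:
    """Recompute the commitment and compare it in constant time."""
    _, expected = derive_keys(data_key, salt)
    return hmac.compare_digest(expected, stored)
```

In this sketch, the writer stores the salt and the commitment next to the ciphertext. The reader calls `commitment_matches` with the key it intends to use and refuses to decrypt when the check fails, so a ciphertext can only be opened under the key it was actually written with.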
It may be tempting to build your own custom committing AEAD scheme out of, e.g., AES-CTR and HMAC. If you do this, take extra care that you don’t introduce canonicalization risks in your MAC.
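The canonicalization risk is easiest to see with a toy example. The Python sketch below (standard library only) shows how naively concatenating the MAC inputs lets two different (context, ciphertext) pairs share one tag, and how length-prefixing each part avoids that. The function names are assumptions for illustration, and the encoding is only similar in spirit to PASETO's pre-authentication encoding.

```python
import hashlib
import hmac
import struct


def naive_mac(key: bytes, *parts: bytes) -> bytes:
    """Ambiguous: the boundaries between parts are lost when concatenating."""
    return hmac.new(key, b"".join(parts), hashlib.sha256).digest()


def encode_parts(*parts: bytes) -> bytes:
    """Unambiguous encoding: part count, then each part with its length."""
    out = struct.pack("<Q", len(parts))
    for part in parts:
        out += struct.pack("<Q", len(part)) + part
    return out


def canonical_mac(key: bytes, *parts: bytes) -> bytes:
    """MAC over the length-prefixed encoding instead of a raw concatenation."""
    return hmac.new(key, encode_parts(*parts), hashlib.sha256).digest()
```

With `naive_mac`, the inputs `(b"ab", b"c")` and `(b"a", b"bc")` produce the same tag because both concatenate to `b"abc"`. With `canonical_mac`, the two inputs encode differently, so the tags differ.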
Either way, using an AEAD mode is a significant improvement over using AES directly.
Is Your Deputy Confused?
Even if you’re using IND-CCA secure encryption and managing your keys securely, there is still a very stupid attack against many data-at-rest encryption schemes.
To understand the attack, first consider this sort of scenario:
Alice and Bob use the same health insurance provider, who is storing sensitive medical records for both parties. Bob works as a database administrator for the insurance company he and Alice both use. One day, he decides to snoop on her private medical history.
Fortunately, the data is encrypted at the web application, so all of the data Bob can access is indistinguishable from random. He can access his own account and see his data through the application, but he cannot see Alice’s data from his vantage point on the database server.
Here’s the stupid simple attack that works in far too many cases: Bob copies Alice’s encrypted data, and overwrites his records in the database, then accesses the insurance provider’s web app.
Bam! Alice’s plaintext recovered.
What’s happening here is simple: The web application has the ability to decrypt different records encrypted with different keys. If you pass records that were encrypted for Alice to the application to decrypt it for Bob, and you’re not authenticating your access patterns, Bob can read Alice’s data by performing this attack.
The cryptographic attack is literally copy and paste, from the database administrator’s perspective. It’s stupid but it works against too many encryption-at-rest software projects.
In this setup, the application is the Deputy, and you can easily confuse it by replaying an encrypted blob in the incorrect context.
The mitigation is simple: Use the AAD mechanism (part of the standard AEAD interface) to bind a ciphertext to its context. This can be a customer ID, each row’s value for the primary key of the database table, or something else entirely.
If you’re using AWS KMS, you can also use Encryption Context for this exact purpose.
An Illustrative Example
Let’s say you have a simple web application that encrypts data before storing it in a SQL database.
Let’s also write it to use AES-GCM, since unauthenticated CBC mode is awful.
A quick and dirty implementation might look like this:
class User
{
    public function __construct(
        public readonly string $username,
        public string $email,
        public string $fullName
    ) {}
}

class UserModel
{
    public function __construct(protected Database $db) {}

    public function save(User $user): bool
    {
        return $this->db->upsert(
            'users',
            [ // set
                'full_name' => aes128gcm_encrypt($user->fullName),
                'email' => aes128gcm_encrypt($user->email)
                // encryption details abstracted
            ],
            [ // where
                'username' => $user->username
            ]
        );
    }

    public function fetch(string $username): User
    {
        $row = $this->db->fetch('users', ['username' => $username]);
        return new User(
            $username,
            aes128gcm_decrypt($row['email']),
            aes128gcm_decrypt($row['full_name'])
        );
    }
}

For the abstracted aes128gcm functions in the pseudocode above, just assume they’re getting the key from KMS during encryption and storing an encrypted data key in a place the ciphertext can reference later on decrypt. I didn’t want to complicate the pseudocode with a lot of boilerplate.
You might decide to prove the confused deputy risk by doing something like this:

$model = new UserModel($db);
$model->save(new User('alice', '[email protected]', 'Alice McWonderland'));
$model->save(new User('bob', '[email protected]', 'Bob BurgerMeister'));

// Fetch the raw (still encrypted) rows for Alice and Bob
$aliceData = $db->fetch('users', ['username' => 'alice']);
$bobData = $db->fetch('users', ['username' => 'bob']);

// This is the attack the database server can perform:
// Replace Bob's full_name with Alice's email
$db->upsert('users', [
    'full_name' => $aliceData['email']
], ['username' => 'bob']);

$badBob = $model->fetch('bob');

Now Bob’s full name is set to Alice’s email address.
Okay, So What?
Now imagine someone performs the same attack, but against salary fields in a payroll system.
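The AAD mitigation described earlier can be sketched in a few lines of Python. Loud caveat: to stay dependency-free, the `seal` and `unseal` functions below are a toy encrypt-then-MAC construction built from HMAC-SHA256. They are a stand-in for a real AEAD such as AES-GCM or XChaCha20-Poly1305 and should not be used in production. All names here, including `context`, are assumptions for illustration.

```python
import hashlib
import hmac
import os
import struct


def _pack(*parts: bytes) -> bytes:
    """Length-prefix every part so the encoding is unambiguous."""
    return b"".join(struct.pack("<Q", len(p)) + p for p in parts)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy HMAC-SHA256 counter-mode keystream (illustration only)."""
    blocks = [
        hmac.new(key, nonce + struct.pack("<Q", i), hashlib.sha256).digest()
        for i in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]


def _subkeys(key: bytes) -> tuple[bytes, bytes]:
    """Split one key into separate encryption and MAC keys."""
    enc_key = hmac.new(key, b"toy-enc", hashlib.sha256).digest()
    mac_key = hmac.new(key, b"toy-mac", hashlib.sha256).digest()
    return enc_key, mac_key


def seal(key: bytes, plaintext: bytes, aad: bytes) -> bytes:
    """Encrypt, then authenticate the ciphertext together with its context."""
    enc_key, mac_key = _subkeys(key)
    nonce = os.urandom(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(mac_key, _pack(aad, nonce, ciphertext), hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(key: bytes, blob: bytes, aad: bytes) -> bytes:
    """Verify the tag against the caller's context before decrypting."""
    if len(blob) < 48:
        raise ValueError("blob too short")
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    enc_key, mac_key = _subkeys(key)
    expected = hmac.new(mac_key, _pack(aad, nonce, ciphertext), hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: wrong key, context, or tampered data")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))


def context(table: str, column: str, row_id: str) -> bytes:
    """Build the AAD that binds a ciphertext to one table, column, and row."""
    return _pack(table.encode(), column.encode(), row_id.encode())
```

If Alice's email is sealed with `context("users", "email", "alice")` and the database administrator copies that blob into Bob's `full_name` column, the application will call `unseal` with `context("users", "full_name", "bob")`. The tag no longer verifies, so the replay produces an error instead of Alice's plaintext.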
The Curious Case of CipherSweet
My knowledge of this risk didn’t manifest itself in a vacuum. It was discovered over the years of maintaining an open source library.
The first release of CipherSweet mitigated most of this risk by construction: Each field uses a different encryption key, through a key derivation scheme.
In pseudocode, this construction looks something like this:
def encryptRow(self, records):
    for field, type in self.fieldsToEncrypt:
        # Each (table, field) pair gets its own derived key
        key = self.getFieldSymmetricKey(self.table, field)
        records[field] = encryptField(key, records[field])
Since CipherSweet’s inception, if you try to replace Alice’s encrypted zip code with Alice’s encrypted social security number, the keys would be wrong, so it would lead to a decryption failure.
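As a rough illustration of per-field key derivation, here is a small Python sketch using only the standard library. It is not CipherSweet's actual key derivation scheme; `derive_field_key` and its length-prefixed input are assumptions chosen to show the general idea.

```python
import hashlib
import hmac
import struct


def derive_field_key(root_key: bytes, table: str, field: str) -> bytes:
    """Derive a distinct 256-bit key for each (table, field) pair."""
    # Length-prefix each label so ("ab", "c") and ("a", "bc") never collide.
    info = b"".join(
        struct.pack("<Q", len(part)) + part
        for part in (table.encode(), field.encode())
    )
    return hmac.new(root_key, info, hashlib.sha256).digest()
```

Because the zip code field and the social security number field end up with unrelated keys, a ciphertext moved from one field to the other is decrypted under the wrong key.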
Or so I thought!
As I mentioned in my blog post about multi-tenancy and confused deputy attacks, if your AEAD mode doesn’t commit to the key used, it’s possible to craft a single (ciphertext, tag) that decrypts to two different plaintext values under two different keys.
CipherSweet’s original ModernCrypto suite used XChaCha20-Poly1305, which is not key-committing, and therefore susceptible to this sort of misuse.
This violated the Principle of Least Astonishment and motivated the development of a new algorithm suite called BoringCrypto, which used BLAKE2b-MAC instead of Poly1305. This change was released in version 3.0.0 in June 2021.
However, even with BoringCrypto in 3.0.0, this only mitigated most of the issue by construction. The last mile of complexity here is that each field must also be bound to a primary key or foreign key.
Encrypting with AAD has been possible since a very early release of CipherSweet, but being possible to use securely is not sufficient. It should be easy to use securely.
CipherSweet Version 4.7.0, which was released last month, now only requires a code change that looks like this in order to mitigate confused deputies in an application:
  $multiRowEncryptor = new EncryptedMultiRows($engine);
  $multiRowEncryptor
+     ->setAutoBindContext(true)
+     ->setPrimaryKeyColumn('table2', 'id')
      ->addTextField('table1', 'field1')

This is in addition to the new Enhanced AAD feature, which allows for flexible and powerful context binding based on other fields and/or string literals.
(In fact, this new convenience feature actually uses Enhanced AAD under-the-hood.)
This doesn’t come for free, however: Users have to know the serial / primary key for a record prior to writing it, in order to use it as AAD when encrypting fields. However, that’s a much easier pill to swallow than expecting PHP devs to manage the complexity of context-binding themselves.
As you can see, mitigating confused deputies in an encryption library (without making it unwieldy) requires a painstaking attention to detail to get right.
As Avi Douglen says, “Security at the cost of usability comes at the cost of security.”
Given the prevalence of client-side encryption projects that just phone it in with insecure block cipher modes (or ECB, which is the absence of a block cipher mode entirely), it’s highly doubtful that most of them will ever address confused deputy attacks. Even I didn’t get it right at first when I made CipherSweet back in 2018.
What about non-databases?
Everything I mentioned in the previous section was focused on confused deputy attacks against client-side encryption for information that is stored in a database, but it’s a general problem with encrypting data at rest and storing the ciphertext “server-side”.
If you’re storing encrypted data in an S3 bucket, rather than in MySQL, you still need some form of context-binding mechanism to prevent the dumb and obvious attack from working against a deputy that reads data from said S3 bucket.
If you take nothing else away from this blog post, remember: Authenticate your access patterns.
Why aren’t things better already?
As with most things in software security, the problem is either not widely known, or is not widely understood.
Unknown unknowns tend to fester, untreated, across the entire ecosystem.
Misunderstood issues often lead to an incorrect solution.
In this case, at-rest encryption is mostly in Column B, and confused deputy attacks are mostly in Column A.
The most pronounced consequence of this is, when tasked with building at-rest data encryption in an application, most software developers do not have a cohesive threat model in mind (let alone a formal one).
This leads to disagreement between stakeholders about what the security requirements actually are.
How can I help improve things somewhat?
Most importantly, spread awareness of the nuances of encryption at-rest.
This blog post is intended to be a good conversation starter, but there are other resources to consider, too. I’ve linked to many of them throughout this post already.
If you’re paying for software to encrypt data at rest, ask your vendor how they mitigate the risk of confused deputy attacks. Link them to this blog post if they’re not sure what you mean.
If said vendor responds, “this risk is outside of our threat model,” ask to see their formal threat model document. If it exists and doesn’t align with your application’s threat model, maybe consider alternative solutions that provide protection against more attack classes than Full Disk Encryption would.
Finally, gaining experience with threat modeling is a good use of every developer’s time. Adam Caudill has an excellent introductory blog post on the subject.
Closing Thoughts
Despite everything I’ve written here today, I do not claim to have all the answers for encryption at rest.
However, you can unlock a lot of value just by asking the right questions. My hope is that anyone who reads this post is now capable of asking those questions.
Addendum (2024-06-03)
After I published this, the r/netsec subreddit expressed disappointment that this blog post had “no mention of” consumer device theft or countries experiencing civil unrest and pulling hard drives from data centers.
You could make a congruent complaint that it also had no mention of Batman.
To be clear, I’m not saying that the use cases and risks Reddit cares about are off-topic to any discussion of full-disk encryption. They matter.
Rather, it’s that they’re not relevant to the specific point I am making: Even in the simplest use case, far from the annoying details of end user hardware or the whims of nation states, encryption-at-rest is poorly understood by most developers, and should be thought through carefully.
Your threat model is not my threat model, and vice versa.
I never advertised this blog post as a comprehensive and complete guide to the entire subject of encryption-at-rest. If you too felt under-served by this blog post for not addressing the corner cases that really matter to you, I hope this addendum makes it clearer why I didn’t cover them.
Finally, if you feel that there’s an aspect of the encryption-at-rest topic that really warrants further examination, I invite you to blog about it.
If your blog post is interesting enough, I’ll revise this post and link to it here.
https://scottarc.blog/2024/06/02/encryption-at-rest-whose-threat-model-is-it-anyway/
#Cryptography #cybersecurity #encryption #encryptionAtRest #security #symmetricCryptography #technology