home.social

#aesgcm — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #aesgcm, aggregated by home.social.

  1. Don’t Use Session (Signal Fork)

    Last year, I outlined the specific requirements that an app needs to have in order for me to consider it a Signal competitor.

    Afterwards, I had several people ask me what I think of a Signal fork called Session. My answer then is the same thing I’ll say today:

    Don’t use Session.

    The main reason I said to avoid Session, all those months ago, was simply due to their decision to remove forward secrecy (which is an important security property of cryptographic protocols they inherited for free when they forked libsignal).

    Lack of forward secrecy puts you in the scope of Key Compromise Impersonation (KCI) attacks, which serious end-to-end encryption apps should prevent if they want to sit at the adults’ table. This is why I don’t recommend Tox.

    And that observation alone should have been enough for anyone to run, screaming, in the other direction from Session. After all, removing important security properties from a cryptographic security protocol is exactly the sort of thing a malicious government would do (especially if the cover story for such a change involves the introduction of swarms and “onion routing”, which computer criminals might think sounds attractive due to their familiarity with the Tor network).

    Unfortunately, some people love to dig their heels in about messaging apps. So let’s take a closer look at Session.

    I did not disclose this blog post privately to the Session developers before pressing publish.

    I do not feel that cryptographic issues always require coordinated disclosure with the software vendor. As Bruce Schneier argues, full disclosure of security vulnerabilities is a “damned good idea”.

    I have separated this blog post into two sections: Security Issues and Gripes.

    Security Issues

    1. Insufficient Entropy in Ed25519 Keys
    2. In-Band Negotiation for Message Signatures
    3. Using Public Keys as AES-GCM Keys

    Insufficient Entropy in Ed25519 Keys

    One of the departures of Session from Signal is the use of Ed25519 rather than X25519 for everything.

    Ed25519 keypairs generated from their KeyPairUtilities object only have 128 bits of entropy, rather than the ~253 bits (after clamping) you’d expect from an Ed25519 seed.

    fun generate(): KeyPairGenerationResult {
        val seed = sodium.randomBytesBuf(16)
        try {
            return generate(seed)
        } catch (exception: Exception) {
            return generate()
        }
    }

    fun generate(seed: ByteArray): KeyPairGenerationResult {
        val padding = ByteArray(16) { 0 }
        val ed25519KeyPair = sodium.cryptoSignSeedKeypair(seed + padding)

    As an implementation detail, they encode a recovery key as a “mnemonic” (see also: a gripe about their mnemonic decoding).
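For contrast, a correct generator draws the full seed from a CSPRNG. Here is a minimal sketch using the JDK's built-in Ed25519 support (Java 15+), with a hypothetical function name of my own, not Session's actual sodium-based API:

```kotlin
import java.security.KeyPairGenerator

// Sketch only (JDK 15+ Ed25519; not Session's code): the platform keypair
// generator seeds Ed25519 keys from a full-strength CSPRNG, so the seed
// carries the expected ~256 bits of entropy instead of 128.
fun generateFullEntropyKeyPair(): java.security.KeyPair =
    KeyPairGenerator.getInstance("Ed25519").generateKeyPair()

fun main() {
    val keyPair = generateFullEntropyKeyPair()
    check(keyPair.public != null && keyPair.private != null)
}
```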

    Does This Matter?

    You might think that clearing the highest 128 bits of the Ed25519 seed is fine for one of the following reasons:

    1. It’s hashed with SHA-512 before clamping.
    2. Ed25519 only offers 128 bits of security.
    3. Some secret third (and possibly unreasonable) argument.

    It’s true that Ed25519 targets the 128-bit security level, if you’re focused on the security of the Elliptic Curve Discrete Logarithm Problem (ECDLP).

    Achieving 128 bits of security in this model requires 256-bit secrets, since the best attack against the ECDLP finds a discrete logarithm in roughly the square root of the key space: about 2^128 guesses for a 256-bit secret.

    Additionally, having 256-bit secrets makes the multi-user security of the scheme easy to reason about, whereas 128-bit secrets make it a lot harder. (This mostly comes up in criticism of AES, which has a 128-bit block size.)

    When your secret only has 2^128 possible values, your multi-user security is no longer as strong as Ed25519 expects.

    Additionally, you can shove the SHA-512 + clamping into your attack script (thus negating the first objection) and find the corresponding secret key in about 2^64 queries if you know the top 128 bits were initialized to 0, using a modified version of Pollard’s rho for discrete logarithms.

    This means that Session’s KeyPairUtilities class only provides 64 bits of ECDLP security.

    CMYKat

    What does 64 bits of ECDLP Security actually mean?

    I provided a technical definition already, but that’s probably not meaningful to most people outside computer security.

    What this means is that a distributed computing effort can find the secret key for a given Ed25519 public key generated from this algorithm in only about 2^64 queries.

    For flavor, 2^64 queries is approximately the attack cost of finding a SHA-1 collision, which we know is possible and economical.

    Based on this attack, the authors projected that a collision attack on SHA-1 may cost between US$75K and US$120K by renting GPU computing time on Amazon EC2 using spot-instances, which is significantly lower than Schneier’s 2012 estimates.

    — from the Shattered paper, page 2.

    I don’t know if this was mere stupidity or an intentional NOBUS backdoor that only well-resourced adversaries can crack. (I also don’t have hundreds of thousands of dollars lying around to test this myself.)

    How would you exploit this in practice?

    If you’re not familiar with Pollard’s rho, then this section might be a bit abstract and difficult to follow.

    Instead of directly passing a full 256-bit value to your oracle with each iteration (like you do with a standard Pollard’s rho implementation), you would need to mutate the output the same way Session does (n.b., replace 128 bits of the seed with zeroes), hash and clamp that, and then perform the scalar multiplication.

    It should be a bit more expensive than a raw ECDLP attack against a 128-bit curve (due to the hashing), but the strategy should succeed in the expected number of queries (average case).
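If the square-root scaling feels abstract, here is a toy demonstration in a small multiplicative group. It uses baby-step giant-step rather than Pollard's rho (easier to follow, same ~√n cost profile), and a tiny prime instead of Curve25519; it is purely illustrative, not the actual attack:

```kotlin
import java.math.BigInteger

// Toy illustration (NOT the real Ed25519 attack): recover x from h = g^x mod p
// using baby-step giant-step, which needs only ~sqrt(n) group operations.
// The same square-root scaling reduces a 128-bit seed space to ~2^64 work.
fun babyStepGiantStep(g: BigInteger, h: BigInteger, p: BigInteger, n: Long): Long? {
    val m = Math.ceil(Math.sqrt(n.toDouble())).toLong()
    // Baby steps: build a table of g^j for j in [0, m)
    val table = HashMap<BigInteger, Long>()
    var cur = BigInteger.ONE
    for (j in 0 until m) {
        table.putIfAbsent(cur, j)
        cur = cur.multiply(g).mod(p)
    }
    // Giant steps: walk h * (g^-m)^i until we collide with the table,
    // then x = i*m + j satisfies g^x = h
    val factor = g.modPow(BigInteger.valueOf(m), p).modInverse(p)
    var gamma = h.mod(p)
    for (i in 0..m) {
        table[gamma]?.let { j -> return i * m + j }
        gamma = gamma.multiply(factor).mod(p)
    }
    return null
}

fun main() {
    val p = BigInteger.valueOf(1000003) // small prime standing in for the curve group
    val g = BigInteger.valueOf(2)
    val h = g.modPow(BigInteger.valueOf(123456), p)
    val x = babyStepGiantStep(g, h, p, 1000003L)
    // Sanity check: the recovered exponent reproduces h
    check(x != null && g.modPow(BigInteger.valueOf(x), p) == h)
}
```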

    Although this makes the attack totally feasible for a nation state, I do not have the resources to build and test a proof of concept against a candidate keypair. If anyone does, get in touch, it would make for a fun research project.

    CMYKat

    Alternatively, Pollard’s kangaroo might be a better cryptanalysis technique for Session’s setup.

    Note: If there is any classified government algorithm especially suited for cracking Ed25519 keys constructed exactly like Session does, it’s not one I’ve ever heard of. I don’t have any security clearances, nor do I want one.

    However, ECDLP security of elliptic curve-based protocols is extremely well-understood in the cryptography literature.

    In-Band Negotiation for Message Signatures

    If you thought the previous issue was mitigated by the use of Ed25519 signatures on each message, don’t worry, the Session developers screwed this up too!

    // 2.) Get the message parts
    val signature = plaintextWithMetadata.sliceArray(plaintextWithMetadata.size - signatureSize until plaintextWithMetadata.size)
    val senderED25519PublicKey = plaintextWithMetadata.sliceArray(plaintextWithMetadata.size - (signatureSize + ed25519PublicKeySize) until plaintextWithMetadata.size - signatureSize)
    val plaintext = plaintextWithMetadata.sliceArray(0 until plaintextWithMetadata.size - (signatureSize + ed25519PublicKeySize))
    // 3.) Verify the signature
    val verificationData = (plaintext + senderED25519PublicKey + recipientX25519PublicKey)
    try {
        val isValid = sodium.cryptoSignVerifyDetached(signature, verificationData, verificationData.size, senderED25519PublicKey)
        if (!isValid) { throw Error.InvalidSignature }
    } catch (exception: Exception) {
        Log.d("Loki", "Couldn't verify message signature due to error: $exception.")
        throw Error.InvalidSignature
    }

    What this code is doing (after decryption):

    1. Grab the public key from the payload.
    2. Grab the signature from the payload.
    3. Verify that the signature on the rest of the payload is valid… for the public key that was included in the payload.

    Congratulations, Session, you successfully reduced the utility of Ed25519 to that of a CRC32!
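To make the failure concrete, here is a hedged sketch (JDK 15+ Ed25519 APIs; hypothetical scenario, not Session's code) of why verifying against an in-band key proves nothing: an attacker signs the forged payload with a fresh key of their own and ships that key alongside it.

```kotlin
import java.security.KeyPairGenerator
import java.security.Signature

// Sketch (JDK 15+ Ed25519; not Session's code): if the verification key
// travels inside the payload, an attacker can replace both the message
// and the key, and "verification" still succeeds.
fun main() {
    val attacker = KeyPairGenerator.getInstance("Ed25519").generateKeyPair()
    val forgedMessage = "attacker-controlled plaintext".toByteArray()

    // Attacker signs the forged message with their own key...
    val signer = Signature.getInstance("Ed25519")
    signer.initSign(attacker.private)
    signer.update(forgedMessage)
    val forgedSig = signer.sign()

    // ...and ships (forgedMessage, attacker.public, forgedSig) in-band.
    // A recipient following Session's logic verifies against the public
    // key found in the payload -- which of course passes.
    val verifier = Signature.getInstance("Ed25519")
    verifier.initVerify(attacker.public)
    verifier.update(forgedMessage)
    check(verifier.verify(forgedSig)) // "valid", despite proving nothing
}
```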

    Art: AJ

    Using Public Keys As AES-GCM Keys

    I wasn’t entirely sure whether this belongs in the “gripes” section or not, because it’s so blatantly stupid that there’s basically no way Quarkslab would miss it if it mattered.

    When encrypting payloads for onion routing, Session uses the X25519 public key… as a symmetric key, for AES-GCM. See encryptPayloadForDestination().

    val result = AESGCM.encrypt(plaintext, x25519PublicKey)
    deferred.resolve(result)

    Session also does this inside of encryptHop().

    val plaintext = encode(previousEncryptionResult.ciphertext, payload)
    val result = AESGCM.encrypt(plaintext, x25519PublicKey)

    In case you thought, maybe, that this is just a poorly named HPKE wrapper… nope!

    /**
     * Sync. Don't call from the main thread.
     */
    internal fun encrypt(plaintext: ByteArray, symmetricKey: ByteArray): ByteArray {
        val iv = Util.getSecretBytes(ivSize)
        synchronized(CIPHER_LOCK) {
            val cipher = Cipher.getInstance("AES/GCM/NoPadding")
            cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(symmetricKey, "AES"), GCMParameterSpec(gcmTagSize, iv))
            return ByteUtil.combine(iv, cipher.doFinal(plaintext))
        }
    }

    This obviously doesn’t encrypt the payload such that only the recipient (who owns the secret key corresponding to the public key) can decrypt it. Instead, anyone who knows the public key can decrypt the message.
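A minimal sketch of the problem (standard JCA APIs and a made-up placeholder value, not Session's code): any observer who knows the public value decrypts just fine.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

// Sketch (hypothetical values, not Session's code): if the AES-GCM "key"
// is a public value, anyone who knows that value can decrypt the payload.
fun main() {
    val publicValue = ByteArray(32) { it.toByte() } // stands in for an X25519 public key
    val iv = ByteArray(12).also { SecureRandom().nextBytes(it) }

    val enc = Cipher.getInstance("AES/GCM/NoPadding")
    enc.init(Cipher.ENCRYPT_MODE, SecretKeySpec(publicValue, "AES"), GCMParameterSpec(128, iv))
    val ciphertext = enc.doFinal("onion-routed payload".toByteArray())

    // An eavesdropper who knows the public value recovers the plaintext:
    val dec = Cipher.getInstance("AES/GCM/NoPadding")
    dec.init(Cipher.DECRYPT_MODE, SecretKeySpec(publicValue, "AES"), GCMParameterSpec(128, iv))
    check(String(dec.doFinal(ciphertext)) == "onion-routed payload")
}
```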

    I wonder if this impacts their onion routing assumptions?

    Why should I trust session?

    (…)

    When using Session, your messages are sent to their destinations through a decentralised onion routing network similar to Tor (with a few key differences) (…)

    Session FAQs

    Gripes

    Some of these aren’t really security issues, but are things I found annoying as a security engineer that specializes in applied cryptography.

    1. Mnemonic Decoding Isn’t Constant-Time
    2. Unsafe Use of SecureRandom on Android

    Mnemonic Decoding Isn’t Constant-Time

    The way mnemonics are decoded involves the modulo operator, which implicitly uses integer division (which neither Java nor Kotlin nor Swift implement in constant-time).

    return wordIndexes.windowed(3, 3) { (w1, w2, w3) ->
        val x = w1 + n * ((n - w1 + w2) % n) + n * n * ((n - w2 + w3) % n)
        if (x % n != w1.toLong()) throw DecodingError.Generic
        val string = "0000000" + x.toString(16)
        swap(string.substring(string.length - 8 until string.length))
    }.joinToString(separator = "") { it }

    This isn’t a real security problem, but I did find it annoying to see in an app evangelized as “better than Signal” on privacy forums.

    Unsafe Use of SecureRandom on Android

    The recommended way to get secure random numbers on Android (or any Java or Kotlin software, really) is simply new SecureRandom(). If you’re running a service in a high-demand environment, you can take extra care to make a thread-local instance of SecureRandom. But a local RNG for a single user isn’t a high-demand environment.

    What does Session do? They use SHA1PRNG, of course.

    public static byte[] getSecretBytes(int size) {
      try {
        byte[] secret = new byte[size];
        SecureRandom.getInstance("SHA1PRNG").nextBytes(secret);
        return secret;
      } catch (NoSuchAlgorithmException e) {
        throw new AssertionError(e);
      }
    }

    And again here.

    SecureRandom secureRandom = SecureRandom.getInstance("SHA1PRNG");

    Why would anyone care about this?

    On modern Android devices, this isn’t a major concern, but the use of SHA1PRNG used to be a source of vulnerabilities in Android apps. (See also: this slide deck.)
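For comparison, a minimal sketch of the recommended pattern described above (my own illustrative helper, mirroring the shape of Session's getSecretBytes):

```kotlin
import java.security.SecureRandom

// Recommended pattern: construct SecureRandom() directly and let the
// platform pick its strongest CSPRNG, instead of pinning "SHA1PRNG".
fun getSecretBytes(size: Int): ByteArray =
    ByteArray(size).also { SecureRandom().nextBytes(it) }

fun main() {
    val secret = getSecretBytes(32)
    check(secret.size == 32)
    // two independent draws should (overwhelmingly) differ
    check(!secret.contentEquals(getSecretBytes(32)))
}
```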

    Closing Thoughts

    Many of Session’s design decisions are poorly specified in their whitepaper, and I didn’t look at all of them. For example, how group messaging keys are managed.

    When I did try to skim that part of the code, I did find a component where you can coerce Android clients into running a moderately expensive Argon2 KDF by simply deleting the nonce from the message.

    val isArgon2Based = (intermediate["nonce"] == null)
    if (isArgon2Based) {
        // Handle old Argon2-based encryption used before HF16

    That’s hilarious.

    Cryptography nerds should NOT be finding the software that activists trust with their privacy hilarious.

    CMYKat

    So if you were wondering what my opinion on Session is, now you know: Don’t use Session. Don’t let your friends use Session.

    If you’re curious about the cryptography used by other messaging apps, please refer to this page that collects my blogs about this topic.

    #AESGCM #Android #asymmetricCryptography #cryptography #E2EE #Ed25519 #Java #Kotlin #messagingApps #OnlinePrivacy #privateMessaging #Session #Signal #SignalAlternatives #vuln

  2. Chinese Cryptography: Analyzing WeChat’s Proprietary MMTLS Protocol

    Image from the MMTLS protocol documentation

    The Citizen Lab academic research group at the University of Toronto has conducted the first public security and privacy analysis of the MMTLS encryption protocol. It is the main protocol of the WeChat application, which is used by more than 1.2 billion people (34% of mobile traffic in China in 2018). As it turns out, MMTLS is a modified version of TLS 1.3, and many of the changes its developers made introduced weaknesses. Moreover, alongside MMTLS, an even less secure protocol is used, also proprietary, containing numerous vulnerabilities, including deterministic initialization vectors in AES-GCM and a lack of forward secrecy. Below it is referred to as Business-layer encryption.

    habr.com/ru/companies/globalsi

    #TLS #MMTLS #WeChat #DH #AESGCM #AESCBC #MD5 #HKDF #encryption #initializationVectors

  3. Ever since the Invisible Salamanders paper was published, there has been a quiet renaissance within my friends and colleagues in applied cryptography for studying systems that use Authenticated Encryption with Associated Data (AEAD) constructions, understanding what implicit assumptions these systems make about the guarantees of the AEAD mode they chose to build upon, and the consequences of those assumptions being false.

    I’ve discussed Invisible Salamanders several times throughout this blog, from my criticisms of AES-GCM and XMPP + OMEMO to my vulnerability disclosures in Threema.

    Five years after Invisible Salamanders, it’s become clear to me that many software developers do not fully appreciate the underlying problem discussed in the Invisible Salamanders paper, even when I share trivial proof-of-concept exploits.

    Background

    Fast AEAD constructions based on polynomial MACs, such as AES-GCM and ChaCha20-Poly1305, were designed to provide confidentiality and integrity for the plaintext data, and optionally integrity for some additional associated data, in systems where both parties already negotiated one shared symmetric key.

    The integrity goals of the systems that adopted these AEAD constructions were often accompanied by performance goals, usually to prevent Denial of Service (DoS) attacks in networking protocols. Verification needed to be very fast and consume minimal resources.

    In this sense, AEAD constructions were an incredible success. So successful, in fact, that most cryptographers urge application developers to use one of the fast AEAD modes as the default suggestion without looking deeper at the problem being solved. This is a good thing, because most developers will choose something stupid like ECB mode in the absence of guidance from cryptographers, and AEAD modes are much, much safer than any hand-rolled block cipher modes.

    The problem is that one tiny little assumption: that both parties (sender, recipient) in a communication have agreed on exactly one symmetric key for use in the protocol.

    Fast MACs Are Not Key-Committing

    Cryptographers have concluded that AEAD constructions based on polynomial MACs, while great for performance and rejection of malformed packets without creating DoS risks, tend to make the same assumption. This is even true of misuse-resistant modes like AES-GCM-SIV and extended-nonce constructions like XSalsa20-Poly1305.

    When discussing this implicit assumption of only one valid key in the systems that use these AEAD modes, we say that the modes are not key-committing. This terminology is based on what happens when this assumption is false.

    Consequently, you can take a single, specially crafted ciphertext (with an authentication tag) and decrypt it under multiple different keys. The authentication tags will be valid for all keys, and the plaintext will be different.

    Art: Swizz

    What does this look like in practice?

    Consider my GCM exploit, which was written to generate puzzle ciphertexts for the DEFCON Furs badge challenge a few years ago. How it works is conceptually simple (although the actual mechanics behind step 4 are a bit technical):

    1. Generate two keys.

      There’s nothing special about these keys or their relationship to each other; they can be totally random. They just can’t be identical, or the exploit is kind of pointless.

    2. Encrypt some blocks of plaintext with key1.
    3. Encrypt some more blocks of plaintext with key2.
    4. Calculate a collision block from the ciphertext in the previous two steps, which is just a bit of polynomial arithmetic in GF(2^128).
    5. Return the ciphertext (steps 2, 3, 4) and authentication tag calculated over them (which will collide for both keys).
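The polynomial arithmetic behind step 4 can be sketched as follows (simplified: ignoring associated data and folding the length block into the sum). With $H_i = E_{K_i}(0^{128})$ and $J_0$ the nonce-derived counter block, GCM's tag under key $K_i$ is

$$
T_i = E_{K_i}(J_0) \oplus \sum_{j=1}^{n} C_j \cdot H_i^{\,n+1-j}.
$$

Setting $T_1 = T_2$ and collecting the single free ciphertext block $C_k$ (with exponent $e_k = n+1-k$) gives

$$
C_k = \Bigl( E_{K_1}(J_0) \oplus E_{K_2}(J_0) \oplus \sum_{j \neq k} C_j \bigl( H_1^{\,n+1-j} \oplus H_2^{\,n+1-j} \bigr) \Bigr) \cdot \bigl( H_1^{\,e_k} \oplus H_2^{\,e_k} \bigr)^{-1},
$$

which is solvable whenever $H_1^{e_k} \neq H_2^{e_k}$, since every nonzero element of $\mathrm{GF}(2^{128})$ is invertible.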

    A system that decrypts the output of this exploit under key1 will see some plaintext, followed by some garbage, followed by 1 final block of garbage.

    If the same system decrypts under key2, it will see some garbage, followed by some plaintext, followed by 1 final block of garbage.

    For many file formats, this garbage isn’t really a problem. Additionally, a bit more precomputation allows you to choose garbage that will be more advantageous to ensuring both outputs are accepted as “valid” by the target system.

    For example, choosing two keys and a targeted nonce may allow both the valid plaintext and garbage blocks to begin with a PDF file header.

    If you’re familiar with the file polyglot work of Ange Albertini, you can use this to turn the invisible salamanders problem into an artform.

    Why is it called Invisible Salamanders?

    The proof-of-concept used in the paper involved sending one picture (of a salamander) over an end-to-end encrypted messaging app, but when the recipient flagged it as abusive, the moderator saw a different picture.

    https://www.youtube.com/watch?v=3M1jIO-jLHI

    Thus, the salamander was invisible to the moderators of the encrypted messaging app.

    As for the choice of a “salamander”, I’ve been told by friends familiar with the research that it was inspired by the original name of the Signal Protocol being “Axolotl”.

    But, like, who cares about these details besides me? It’s a cute and memorable name.

    What are the consequences of violating the “one key” assumption?

    That depends entirely on what your system does!

    In Database Cryptography Fur the Rest of Us, I discussed the use of AEAD modes to prevent confused deputy attacks. This works great, but if you’re building an application that supports multi-tenancy, you suddenly have to care about this issue again.

    An earlier design for OPAQUE, a password-authenticated key exchange algorithm, was broken by a partitioning oracle attack because it was built atop AEAD modes that are not key-committing. The same class of attack let an attacker recover passwords from Shadowsocks proxy servers with a complexity similar to a binary search.

    These are two very different impacts from the same weakness, which I believe is a significant factor for why the Invisible Salamanders issue isn’t more widely understood.

    Sometimes violating the “one key” assumption that went into fast AEAD modes based on Polynomial MACs completely destroys the security of your system.

    Other times, it opens the door for a high-complexity but low-impact behavior that simply violates the principle of least astonishment but doesn’t buy the attacker anything useful.

    They Just Don’t Get It

    The Invisible Salamanders issue is relevant in any system that uses symmetric-key encryption where more than one key can be valid.

    This includes, but is not limited to:

    • Multi-tenant data warehouses
    • Group messaging protocols
    • Envelope encryption schemes with multiple wrapping keys
    • Bearer tokens (such as JSON Web Tokens) in systems that utilize Key IDs

    Systems can mitigate this issue by introducing an explicit key commitment scheme (based on a cryptographic hash rather than a polynomial MAC) or by using a committing cipher mode (such as AES + HMAC, if done carefully).
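One way to sketch such an explicit commitment (an illustrative construction of my own, not a standardized scheme): derive a commitment tag from the key with HMAC-SHA-256 under a fixed label, store it alongside the ciphertext, and require it to match before attempting AEAD decryption.

```kotlin
import java.security.MessageDigest
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Illustrative key-commitment sketch (not a standardized scheme): publish
// commitKey(key) next to the ciphertext. A ciphertext crafted to decrypt
// under two different keys can only carry one commitment, so the second
// key fails this check before any AEAD decryption happens.
fun commitKey(key: ByteArray): ByteArray {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    return mac.doFinal("key-commitment-v1".toByteArray()) // domain-separation label
}

fun verifyCommitment(key: ByteArray, commitment: ByteArray): Boolean =
    MessageDigest.isEqual(commitKey(key), commitment) // constant-time comparison

fun main() {
    val key1 = ByteArray(32) { 1 }
    val key2 = ByteArray(32) { 2 }
    val commitment = commitKey(key1)
    check(verifyCommitment(key1, commitment))
    check(!verifyCommitment(key2, commitment)) // wrong key is rejected up front
}
```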

    However, most of the time, this advice falls on deaf ears whenever this concern is brought up by a cryptography engineer who’s more aware of this issue.

    “Abuse reporting? We don’t have no stinking abuse reporting!”

    The most common misunderstanding is, “We don’t have a report abuse feature, so this issue doesn’t affect us.”

    This is because the Invisible Salamanders talk and paper focused on how it could be leveraged to defeat abuse reporting tools and bypass content moderation.

    In my experience, many security teams would read the paper and conclude that it only impacts abuse reporting features and not potentially all systems that allow multiple symmetric keys in a given context.

    Another Exploit Scenario

    Imagine you’re building a Data Loss Prevention product that integrates with corporate file-sharing and collaboration software (e.g. ownCloud) for small and medium businesses.

    One day, someone decides to ship an end-to-end encryption feature to the file-sharing software that uses AES-GCM to encrypt files, and then encrypts the keys to each recipient’s public key. This is basically the envelope encryption use-case above.

    To keep up, your integration is updated to act as another “user” whose public key must be included in all E2EE transfers, and it will block the download of any ciphertext that it either cannot decrypt or that contains sensitive information.

    And this works, until an insider threat clever enough to abuse the Invisible Salamanders issue comes along.

    In order for said insider threat (e.g., a senior business analyst) to leak sensitive data (e.g., anything that would be useful for illegal insider trading) to another person that shouldn’t have access to it (e.g., a store clerk that’s talking to the press), they just have to do this:

    1. Encrypt the data they want to exfiltrate using key1.
    2. Encrypt some innocuous data that won’t trigger your DLP product, using key2.
    3. Ensure that both messages encrypt to the same ciphertext and authentication tag.
    4. Give their recipient key1, give everyone else (including your DLP software) key2.

    Bam! File leaked, and everyone’s none the wiser, until it’s too late. Let’s actually imagine what happens next:

    A random store clerk has leaked sensitive data to the press that only a few analysts had access to.

    The only communication between the analyst and the store clerk is a file that was shared to all employees, using the E2EE protocol. No emails or anything else were identified.

    Your DLP product didn’t identify any other communications between these two, but somehow the store clerk has the data on their desktop.

    A detailed forensics analysis may eventually figure out what happened, but by then, the damage is done and your product’s reputation is irrecoverably damaged.

    All because the hypothetical E2EE protocol didn’t include a key-commitment mechanism, and nobody identified this deficit in their designs.

    This isn’t to endorse DLP solutions at all, but rather to highlight one of the many ways that the Invisible Salamanders issue can be used creatively by clever attackers.

    Art: AJ

    The Lesson to Learn

    If you’re building a network protocol that uses AEAD to encrypt data over an insecure network (e.g., WireGuard), keep up the good work.

    If you’re doing anything more involved than that, at the application layer, pause for a moment and consider whether your system will ever need multiple valid symmetric keys at once.

    And, if the answer is “yes”, then you should always explicitly add a key-commitment mechanism to your system design.

    (Hire a cryptographer if you’re not sure how to proceed.)

    In my opinion, hemming and hawing over whether there’s a significant impact to the Invisible Salamanders issue is a worse use of your time than just solving it directly.

    Eventually, I expect a new generation of AEAD modes will be standardized that explicitly provide key-commitment.

    When these new designs are standardized, widely supported, and sufficiently trusted by experts, feel free to update my advice to “prefer using those modes” instead.

    Header art: Harubaki, CMYKat, and Brian Gratwicke

    https://soatok.blog/2024/09/10/invisible-salamanders-are-not-what-you-think/

    #AEAD #AESGCM #InvisibleSalamanders #randomKeyRobustness #symmetricCryptography

  4. Ever since the Invisible Salamanders paper was published, there has been a quiet renaissance within my friends and colleagues in applied cryptography for studying systems that use Authenticated Encryption with Associated Data (AEAD) constructions, understanding what implicit assumptions these systems make about the guarantees of the AEAD mode they chose to build upon, and the consequences of those assumptions being false.

    I’ve discussed Invisible Salamanders several times throughout this blog, from my criticisms of AES-GCM and XMPP + OMEMO to my vulnerability disclosures in Threema.

    Five years after Invisible Salamanders, it’s become clear to me that many software developers do not fully appreciate the underlying problem discussed in the Invisible Salamanders paper, even when I share trivial proof-of-concept exploits.

    Background

    Fast AEAD constructions based on polynomial MACs, such as AES-GCM and ChaCha20-Poly1305, were designed to provide confidentiality and integrity for the plaintext data, and optionally integrity for some additional associated data, in systems where both parties already negotiated one shared symmetric key.

    The integrity goals of the systems that adopted these AEAD constructions were often accompanied by performance goals–usually to prevent Denial of Service (DoS) attacks in networking protocols. Verification needed to be very fast and consume minimal resources.

    In this sense, AEAD constructions were an incredible success. So successful, in fact, that most cryptographers urge application developers to use one of the fast AEAD modes as the default suggestion without looking deeper at the problem being solved. This is a good thing, because most developers will choose something stupid like ECB mode in the absence of guidance from cryptographers, and AEAD modes are much, much safer than any hand-rolled block cipher modes.

    The problem is, that one tiny little assumption that both parties (sender, recipient) for a communication have agreed on exactly one symmetric key for use in the protocol.

    Fast MACs Are Not Key-Committing

    Cryptographers have concluded that AEAD constructions based on polynomial MACs–while great for performance and rejection of malformed packets without creating DoS risks–tend to make the same assumption. This is even true of misuse-resistant modes like AES-GCM-SIV and extended-nonce constructions like XSalsa20-Poly1305.

    When discussing this implicit assumption of only one valid key in the systems that use these AEAD modes, we say that the modes are not key-committing. This terminology is based on what happens when this assumption is false.

    Consequently, you can take a single, specially crafted ciphertext (with an authentication tag) and decrypt it under multiple different keys. The authentication tags will be valid for all keys, and the plaintext will be different.

    Art: Swizz

    What does this look like in practice?

    Consider my GCM exploit, which was written to generate puzzle ciphertexts for the DEFCON Furs badge challenge a few years ago. How it works is conceptually simple (although the actual mechanics behind step 4 is a bit technical):

    1. Generate two keys.

      There’s nothing special about these keys, or their relationship to each other, and can be totally random. They just can’t be identical or the exploit is kind of pointless.

    2. Encrypt some blocks of plaintext with key1.
    3. Encrypt some more blocks of plaintext with key2.
    4. Calculate a collision block from the ciphertext in the previous two steps–which is just a bit of polynomial arithmetic in GF(2^128)
    5. Return the ciphertext (steps 2, 3, 4) and authentication tag calculated over them (which will collide for both keys).

    A system that decrypts the output of this exploit under key1 will see some plaintext, followed by some garbage, followed by 1 final block of garbage.

    If the same system decrypts under key2, it will see some garbage, followed by some plaintext, followed by 1 final block of garbage.

    For many file formats, this garbage isn’t really a problem. Additionally, a bit more precomputation allows you to choose garbage that will be more advantageous to ensuring both outputs are accepted as “valid” by the target system.

    For example, choosing two keys and a targeted nonce may allow both the valid plaintext and garbage blocks to begin with a PDF file header.

    If you’re familiar with the file polyglot work of Ange Albertini, you can use this to turn the Invisible Salamanders problem into an artform.

    And this is just the simple attack!

    The Invisible Salamanders paper outlined a more advanced variant (with a proof of concept) in Section 3.2, which doesn’t suffer from nearly as much garbage data as the simple attack.

    As Bruce Schneier often says, “Attacks only get better, they never get worse.”

    Why is it called Invisible Salamanders?

    The proof-of-concept used in the paper involved sending one picture (of a salamander) over an end-to-end encrypted messaging app, but when the recipient flagged it as abusive, the moderator saw a different picture.

    https://www.youtube.com/watch?v=3M1jIO-jLHI

    Thus, the salamander was invisible to the moderators of the encrypted messaging app.

    As for the choice of a “salamander”, I’ve been told by friends familiar with the research that was inspired by the original name of the Signal Protocol being “Axolotl”.

    But, like, who cares about these details besides me? It’s a cute and memorable name.

    What are the consequences of violating the “one key” assumption?

    That depends entirely on what your system does!

    In Database Cryptography Fur the Rest of Us, I discussed the use of AEAD modes to prevent confused deputy attacks. This works great, but if you’re building an application that supports multi-tenancy, you suddenly have to care about this issue again.

    An earlier design for OPAQUE, a password authenticated key exchange algorithm, was broken by a partitioning oracle attack due to building atop AEAD modes that are not key-committing. This let an attacker recover passwords from Shadowsocks proxy servers with a complexity similar to a binary search algorithm.

    These are two very different impacts from the same weakness, which I believe is a significant factor for why the Invisible Salamanders issue isn’t more widely understood.

    Sometimes violating the “one key” assumption that went into fast AEAD modes based on Polynomial MACs completely destroys the security of your system.

    Other times, it opens the door for a high-complexity but low-impact behavior that simply violates the principle of least astonishment but doesn’t buy the attacker anything useful.

    They Just Don’t Get It

    The Invisible Salamanders issue is relevant in any system that uses symmetric-key encryption where more than one key can be valid.

    This includes, but is not limited to:

    • Multi-tenant data warehouses
    • Group messaging protocols
      • It’s sometimes tempting to discount group messaging as a relevant consideration if your experience is “emulated groups atop 1-to-1 messaging”, but there are protocols that establish a Group Key (e.g., RFC 9420) and then use that for all group messages.
    • Envelope encryption schemes with multiple wrapping keys
    • Bearer tokens (such as JSON Web Tokens) in systems that utilize Key IDs
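    For the envelope-encryption case in the list above, the hazardous shape is easy to see in a structural sketch (plain Python, no real cryptography; all names and byte values are illustrative): one shared ciphertext, many independently wrapped keys, and nothing binding the blob to a single content key:

```python
# Structural sketch (no real crypto) of envelope encryption with
# multiple wrapping keys. Every recipient verifies the SAME ciphertext
# and tag, but each unwraps their own copy of the content key; with a
# non-committing AEAD, nothing proves those copies are the same key.
from dataclasses import dataclass, field

@dataclass
class Envelope:
    ciphertext: bytes                      # single shared AEAD output
    tag: bytes                             # single shared auth tag
    wrapped_keys: dict = field(default_factory=dict)
    # recipient id -> content key encrypted to that recipient's public key

env = Envelope(ciphertext=b"shared-blob", tag=b"shared-tag")
env.wrapped_keys["analyst"] = b"wrap(key1)"   # one recipient gets key1...
env.wrapped_keys["dlp-bot"] = b"wrap(key2)"   # ...while a scanner gets key2
```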

    Systems can mitigate this issue by introducing an explicit key commitment scheme (based on a cryptographic hash rather than a polynomial MAC) or by using a committing cipher mode (such as AES + HMAC, if done carefully).

    However, most of the time, this advice falls on deaf ears when a cryptography engineer who is aware of the issue raises the concern.

    “Abuse reporting? We don’t have no stinking abuse reporting!”

    The most common misunderstanding is, “We don’t have a report abuse feature, so this issue doesn’t affect us.”

    This is because the Invisible Salamanders talk and paper focused on how it could be leveraged to defeat abuse reporting tools and bypass content moderation.

    In my experience, many security teams would read the paper and conclude that it only impacts abuse reporting features and not potentially all systems that allow multiple symmetric keys in a given context.

    Another Exploit Scenario

    Imagine you’re building a Data Loss Prevention product that integrates with corporate file-sharing and collaboration software (e.g. ownCloud) for small and medium businesses.

    One day, someone decides to ship an end-to-end encryption feature to the file-sharing software that uses AES-GCM to encrypt files, and then encrypts the keys to each recipient’s public key. This is basically the envelope encryption use-case above.

    So, you dutifully update your integration to act as another “user” whose public key must be included in all E2EE transfers, and which blocks the download of any ciphertext it cannot decrypt or that contains sensitive information.

    And this works, until an insider threat clever enough to abuse the Invisible Salamanders issue comes along.

    In order for said insider threat (e.g., a senior business analyst) to leak sensitive data (e.g., anything that would be useful for illegal insider trading) to another person that shouldn’t have access to it (e.g., a store clerk that’s talking to the press), they just have to do this:

    1. Encrypt the data they want to exfiltrate using key1.
    2. Encrypt some innocuous data that won’t trigger your DLP product, using key2.
    3. Ensure that both messages encrypt to the same ciphertext and authentication tag.
    4. Give their recipient key1, give everyone else (including your DLP software) key2.

    Bam! File leaked, and everyone’s none the wiser, until it’s too late. Let’s actually imagine what happens next:

    A random store clerk has leaked sensitive data to the press that only a few analysts had access to.

    The only communication between the analyst and the store clerk is a file that was shared to all employees, using the E2EE protocol. No emails or anything else were identified.

    Your DLP product didn’t identify any other communications between these two, but somehow the store clerk has the data on their desktop.

    A detailed forensic analysis may eventually figure out what happened, but by then, the damage is done and your product’s reputation is irrecoverably tarnished.

    All because the hypothetical E2EE protocol didn’t include a key-commitment mechanism, and nobody identified this deficit in their designs.

    This isn’t to endorse DLP solutions at all, but rather, to highlight one of the many ways that the Invisible Salamanders issue can be used creatively by clever attackers.

    Art: AJ

    “Couldn’t you do the same with steganography?”

    No, the attack is very different from stego.

    Stego is about hiding a message in plain sight, so that only the person that knows where/how to look can find it.

    The Invisible Salamanders attack lets you send one ciphertext through a network then selectively decrypt it to one of two plaintexts, depending on which key you reveal to each participant.

    In the Invisible Salamanders paper and talk, they used this to send “abusive” messages to a recipient that the moderator would not see. Thus, invisible.

    With stego, the hidden message is always there, emitted to anyone who knows how to find it. With Invisible Salamanders, the attacker selects which plaintext you see, even if you have mechanisms to ensure you’re seeing the same ciphertext as everyone else. It’s not a subtle difference.

    Mitigation Techniques

    There are multiple ways to mitigate the risk of Invisible Salamanders in a cryptosystem.

    1. Use HMAC, or (failing that) something built atop cryptographic hash functions, rather than a Polynomial MAC.
    2. Use an AEAD cipher designed with multi-recipient integrity as a security goal.
    3. Compute a non-invertible, one-way commitment of the encryption key.

    A trivial mitigation looks like this:

    class SoatokExampleEncryptor
    {
        const NEW_ENCRYPT_KEY = 'myProtocol$encryptKey';
        const NEW_COMMITMENT = 'myProtocol$commitment';

        public function __construct(#[SensitiveParameter] private string $key)
        {}

        /**
         * Let's assume we're starting with a simple AES-GCM wrapper
         */
        public function legacyEncrypt(string $plaintext, string $assocData = ''): string
        {
            $nonce = random_bytes(12);
            $tag = '';
            $ciphertext = openssl_encrypt(
                $plaintext,
                'aes-256-gcm',
                $this->key,
                OPENSSL_RAW_DATA,
                $nonce,
                $tag,
                $assocData
            );
            return $nonce . $ciphertext . $tag;
        }

        /**
         * An improved function looks something like this
         */
        public function newEncrypt(string $plaintext, string $assocData = ''): string
        {
            // Avoid birthday bound issues with 256-bits of randomness
            $longerNonce = random_bytes(32);

            // Derive a subkey and synthetic nonce
            $tmp = hash_hkdf('sha512', $this->key, 44, self::NEW_ENCRYPT_KEY . $longerNonce);
            $encKey = substr($tmp, 0, 32);
            $nonce = substr($tmp, 32);

            // New: Key commitment
            $commitment = hash_hkdf('sha512', $this->key, 32, self::NEW_COMMITMENT . $longerNonce);

            // Most of this is unchanged
            $tag = '';
            $ciphertext = openssl_encrypt(
                $plaintext,
                'aes-256-gcm',
                $encKey,
                OPENSSL_RAW_DATA,
                $nonce,
                $tag,
                $assocData
            );
            return $longerNonce . $commitment . $ciphertext . $tag;
        }
    }

    And then the decryption logic would recalculate the commitment, and compare it with the stored value, in constant-time.
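    A sketch of that decrypt-side check (in Python for brevity; the hkdf() helper mirrors PHP's hash_hkdf() with its default empty salt, and the constant name is carried over from the encrypt-side example as an assumption):

```python
# Decrypt-side commitment verification: recompute the commitment from
# the key and the long nonce, then compare in constant time before
# attempting any decryption.
import hashlib
import hmac

NEW_COMMITMENT = b"myProtocol$commitment"

def hkdf(ikm: bytes, length: int, info: bytes, hash_name: str = "sha512") -> bytes:
    # RFC 5869 HKDF with an empty salt (zero-filled per the RFC),
    # matching PHP hash_hkdf() defaults.
    hash_len = hashlib.new(hash_name).digest_size
    prk = hmac.new(b"\x00" * hash_len, ikm, hash_name).digest()
    okm, block = b"", b""
    for counter in range(1, -(-length // hash_len) + 1):
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
    return okm[:length]

def verify_commitment(key: bytes, longer_nonce: bytes, stored: bytes) -> bool:
    expected = hkdf(key, 32, NEW_COMMITMENT + longer_nonce)
    # Constant-time comparison, so the check itself leaks nothing.
    return hmac.compare_digest(expected, stored)

# Usage: recompute and compare before decrypting.
key, nonce = b"k" * 32, b"n" * 32
stored = hkdf(key, 32, NEW_COMMITMENT + nonce)
assert verify_commitment(key, nonce, stored)
assert not verify_commitment(b"x" * 32, nonce, stored)  # wrong key fails
```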

    It’s important that the commitment be stored with the ciphertext, rather than bundling it with the key.

    (It may be worthwhile to also include the commitment in the associated data, to add a mechanism against downgrade attacks.)

    The Lesson to Learn

    If you’re building a network protocol that uses AEAD to encrypt data over an insecure network (e.g., WireGuard), keep up the good work.

    If you’re doing anything more involved than that, at the application layer, pause for a moment and consider whether your system will ever need multiple valid symmetric keys at once.

    And, if the answer is “yes”, then you should always explicitly add a key-commitment mechanism to your system design.

    (Hire a cryptographer if you’re not sure how to proceed.)

    In my opinion, hemming and hawing over whether there’s a significant impact to the Invisible Salamanders issue is a worse use of your time than just solving it directly.

    Eventually, I expect a new generation of AEAD modes will be standardized that explicitly provide key-commitment.

    When these new designs are standardized, widely supported, and sufficiently trusted by experts, feel free to update my advice to “prefer using those modes” instead.

    Header art: Harubaki, CMYKat, and a photo by Brian Gratwicke. Poorly photoshopped by myself.

    https://soatok.blog/2024/09/10/invisible-salamanders-are-not-what-you-think/

    #AEAD #AESGCM #InvisibleSalamanders #randomKeyRobustness #symmetricCryptography


  6. Ever since the Invisible Salamanders paper was published, there has been a quiet renaissance within my friends and colleagues in applied cryptography for studying systems that use Authenticated Encryption with Associated Data (AEAD) constructions, understanding what implicit assumptions these systems make about the guarantees of the AEAD mode they chose to build upon, and the consequences of those assumptions being false.

    I’ve discussed Invisible Salamanders several times throughout this blog, from my criticisms of AES-GCM and XMPP + OMEMO to my vulnerability disclosures in Threema.

    Five years after Invisible Salamanders, it’s become clear to me that many software developers do not fully appreciate the underlying problem discussed in the Invisible Salamanders paper, even when I share trivial proof-of-concept exploits.

    Background

    Fast AEAD constructions based on polynomial MACs, such as AES-GCM and ChaCha20-Poly1305, were designed to provide confidentiality and integrity for the plaintext data, and optionally integrity for some additional associated data, in systems where both parties already negotiated one shared symmetric key.

    The integrity goals of the systems that adopted these AEAD constructions were often accompanied by performance goals–usually to prevent Denial of Service (DoS) attacks in networking protocols. Verification needed to be very fast and consume minimal resources.

    In this sense, AEAD constructions were an incredible success. So successful, in fact, that most cryptographers urge application developers to use one of the fast AEAD modes as the default suggestion without looking deeper at the problem being solved. This is a good thing, because most developers will choose something stupid like ECB mode in the absence of guidance from cryptographers, and AEAD modes are much, much safer than any hand-rolled block cipher modes.

    The problem is, that one tiny little assumption that both parties (sender, recipient) for a communication have agreed on exactly one symmetric key for use in the protocol.

    Fast MACs Are Not Key-Committing

    Cryptographers have concluded that AEAD constructions based on polynomial MACs–while great for performance and rejection of malformed packets without creating DoS risks–tend to make the same assumption. This is even true of misuse-resistant modes like AES-GCM-SIV and extended-nonce constructions like XSalsa20-Poly1305.

    When discussing this implicit assumption of only one valid key in the systems that use these AEAD modes, we say that the modes are not key-committing. This terminology is based on what happens when this assumption is false.

    Consequently, you can take a single, specially crafted ciphertext (with an authentication tag) and decrypt it under multiple different keys. The authentication tags will be valid for all keys, and the plaintext will be different.

    Art: Swizz

    What does this look like in practice?

    Consider my GCM exploit, which was written to generate puzzle ciphertexts for the DEFCON Furs badge challenge a few years ago. How it works is conceptually simple (although the actual mechanics behind step 4 is a bit technical):

    1. Generate two keys.

      There’s nothing special about these keys, or their relationship to each other, and can be totally random. They just can’t be identical or the exploit is kind of pointless.

    2. Encrypt some blocks of plaintext with key1.
    3. Encrypt some more blocks of plaintext with key2.
    4. Calculate a collision block from the ciphertext in the previous two steps–which is just a bit of polynomial arithmetic in GF(2^128)
    5. Return the ciphertext (steps 2, 3, 4) and authentication tag calculated over them (which will collide for both keys).

    A system that decrypts the output of this exploit under key1 will see some plaintext, followed by some garbage, followed by 1 final block of garbage.

    If the same system decrypts under key2, it will see some garbage, followed by some plaintext, followed by 1 final block of garbage.

    For many file formats, this garbage isn’t really a problem. Additionally, a bit more precomputation allows you to choose garbage that will be more advantageous to ensuring both outputs are accepted as “valid” by the target system.

    For example, choosing two keys and a targeted nonce may allow both the valid plaintext and garbage blocks to begin with a PDF file header.

    If you’re familiar with the file polyglot work of Ange Albertini, you can use this to turn the invisible salamanders problem into an artform.

    Why is it called Invisible Salamanders?

    The proof-of-concept used in the paper involved sending one picture (of a salamander) over an end-to-end encrypted messaging app, but when the recipient flagged it as abusive, the moderator saw a different picture.

    https://www.youtube.com/watch?v=3M1jIO-jLHI

    Thus, the salamander was invisible to the moderators of the encrypted messaging app.

    As for the choice of a “salamander”, I’ve been told by friends familiar with the research that was inspired by the original name of the Signal Protocol being “Axolotl”.

    But, like, who cares about these details besides me? It’s a cute and memorable name.

    What are the consequences of violating the “one key” assumption?

    That depends entirely on what your system does!

    In Database Cryptography Fur the Rest of Us, I discussed the use of AEAD modes to prevent confused deputy attacks. This works great, but if you’re building an application that supports multi-tenancy, you suddenly have to care about this issue again.

    An earlier design for OPAQUE, a password authenticated key exchange algorithm, was broken by a partitioning oracle attack due to building atop AEAD modes that are not key-committing. This let an attacker recover passwords from Shadowsocks proxy servers with a complexity similar to a binary search algorithm.

    These are two very different impacts from the same weakness, which I believe is a significant factor for why the Invisible Salamanders issue isn’t more widely understood.

    Sometimes violating the “one key” assumption that went into fast AEAD modes based on Polynomial MACs completely destroys the security of your system.

    Other times, it opens the door for a high-complexity but low-impact behavior that simply violates the principle of least astonishment but doesn’t buy the attacker anything useful.

    They Just Don’t Get It

    The Invisible Salamanders issue is relevant in any system that uses symmetric-key encryption where more than one key can be valid.

    This includes, but is not limited to:

    • Multi-tenant data warehouses
    • Group messaging protocols
    • Envelope encryption schemes with multiple wrapping keys
    • Bearer tokens (such as JSON Web Tokens) in systems that utilize Key IDs

    Systems can mitigate this issue by introducing an explicit key commitment scheme (based on a cryptographic hash rather than a polynomial MAC) or by using a committing cipher mode (such as AES + HMAC, if done carefully).

    However, most of the time, this advice falls on deaf ears whenever this concern is brought up by a cryptography engineer who’s more aware of this issue.

    “Abuse reporting? We don’t have no stinking abuse reporting!”

    The most common misunderstanding is, “We don’t have a report abuse feature, so this issue doesn’t affect us.”

    This is because the Invisible Salamanders talk and paper focused on how it could be leveraged to defeat abuse reporting tools and bypass content moderation.

    In my experience, many security teams would read the paper and conclude that it only impacts abuse reporting features and not potentially all systems that allow multiple symmetric keys in a given context.

    Another Exploit Scenario

    Imagine you’re building a Data Loss Prevention product that integrates with corporate file-sharing and collaboration software (e.g. ownCloud) for small and medium businesses.

    One day, someone decides to ship an end-to-end encryption feature to the file-sharing software that uses AES-GCM to encrypt files, and then encrypts the keys to each recipient’s public key. This is basically the envelope encryption use-case above.

    In order to update your integration to act as another “user”, whose public key must be included in all E2EE transfers, and will block download of ciphertexts it cannot decrypt OR contains sensitive information.

    And this works, until an insider threat clever enough to abuse the Invisible Salamanders issue comes along.

    In order for said insider threat (e.g., a senior business analyst) to leak sensitive data (e.g., anything that would be useful for illegal insider trading) to another person that shouldn’t have access to it (e.g., a store clerk that’s talking to the press), they just have to do this:

    1. Encrypt the data they want to exfiltrate using key1.
    2. Encrypt some innocuous data that won’t trigger your DLP product, using key2.
    3. Ensure that both messages encrypt to the same ciphertext and authentication tag.
    4. Give their recipient key1, give everyone else (including your DLP software) key2.
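    The steps above hinge on one ciphertext meaning different things under different keys. The real attack additionally forges a GCM authentication tag that validates under both keys (the polynomial-MAC collision from the Invisible Salamanders paper); the toy Go sketch below drops authentication entirely and uses a bare XOR "cipher" just to show the core ambiguity the insider exploits. All values are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// xor combines equal-length byte slices; it stands in for a stream
// cipher's keystream XOR. There is no authentication in this toy.
func xor(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

func main() {
	secret := []byte("insider numbers!")    // what the store clerk gets
	innocuous := []byte("cat pics attachd") // what the DLP software sees

	// Step 1: encrypt the real payload under key1.
	key1 := bytes.Repeat([]byte{0x42}, len(secret))
	ct := xor(secret, key1)

	// Step 2: solve for a second key that decrypts the SAME bytes
	// to the innocuous plaintext.
	key2 := xor(ct, innocuous)

	fmt.Printf("key1 sees: %s\n", xor(ct, key1))
	fmt.Printf("key2 sees: %s\n", xor(ct, key2))
}
```

    An unauthenticated XOR cipher makes this trivial; AES-GCM makes it only slightly harder, because its tag commits to the ciphertext but not to the key.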

    Bam! File leaked, and everyone’s none the wiser, until it’s too late. Let’s actually imagine what happens next:

    A random store clerk has leaked sensitive data to the press that only a few analysts had access to.

    The only communication between the analyst and the store clerk is a file that was shared to all employees, using the E2EE protocol. No emails or anything else were identified.

    Your DLP product didn’t identify any other communications between these two, but somehow the store clerk has the data on their desktop.

    A detailed forensic analysis may eventually figure out what happened, but by then, the damage is done and your product’s reputation is irrecoverably damaged.

    All because the hypothetical E2EE protocol didn’t include a key-commitment mechanism, and nobody identified this deficit in their designs.

    This isn’t to endorse DLP solutions at all, but rather, to highlight one of the many ways that the Invisible Salamanders issue can be used creatively by clever attackers.

    Art: AJ

    The Lesson to Learn

    If you’re building a network protocol that uses AEAD to encrypt data over an insecure network (e.g., WireGuard), keep up the good work.

    If you’re doing anything more involved than that, at the application layer, pause for a moment and consider whether your system will ever need multiple valid symmetric keys at once.

    And, if the answer is “yes”, then you should always explicitly add a key-commitment mechanism to your system design.

    (Hire a cryptographer if you’re not sure how to proceed.)

    In my opinion, hemming and hawing over whether there’s a significant impact to the Invisible Salamanders issue is a worse use of your time than just solving it directly.

    Eventually, I expect a new generation of AEAD modes will be standardized that explicitly provide key-commitment.

    When these new designs are standardized, widely supported, and sufficiently trusted by experts, feel free to update my advice to “prefer using those modes” instead.

    Header art: Harubaki, CMYKat, and Brian Gratwicke

    https://soatok.blog/2024/09/10/invisible-salamanders-are-not-what-you-think/

    #AEAD #AESGCM #InvisibleSalamanders #randomKeyRobustness #symmetricCryptography

  7. New progress note (W51) available at goffi.org/b/JLPTTHfRnqhxNozHjR

    This one is about Kivy, infinite scroll and other changes, and OMEMO with e2e file sharing

    #salutatoi #XMPP #progress_note #kivy #OMEMO #e2ee #e2e #encryption #aesgcm