home.social

Search

1000 results for “Meat_Bucket”

  1. CW: Booze

    It's not that I haven't had a daiquiri in a long time, but I usually make one with syrup. This way, with powdered sugar in lime juice, is a variant I haven't tried in a while. The sugar is powdered now, so I will have more!

    ht @Meat_Bucket mstdn.social/@Meat_Bucket/1149

    #BoozeHounds #Cocktails #Drinkstodon

  2. It’s great to see more advocacy for proper #Daiquiri construction with granulated sugar. Personally I always keep a blend of super fine (white) sugar and Demerara sugar (that I’ve Vitamixed to the same superfine consistency) on hand for Daiquiris. Dissolves much more quickly.
    #Cocktail #Cocktails

    Deep Diving on the Daiquiri - Imbibe Magazine imbibemagazine.com/deep-diving

  3. #Cocktail #Cocktails #Gin
    Bee’s Knees

    I may or may not have had a couple of these while watching My Man Godfrey.
    Hadn’t seen it in ages, and it’s as fantastic as ever. Really glad not to have watched anything Oscar nominated tonight.
    #WilliamPowell #CarolLombard #MyManGodfrey #Cimemastodon

  5. #Sidecar
    My personal recipe is:
    2 oz Pierre Ferrand 1840
    3/4 oz Cointreau
    3/4 oz lemon juice
    1/4 oz @dsoneil’s caramel syrup
    2 drops saline
    #Cocktail #Cocktails #Cognac #Cointreau

  6. #Cocktail #Cocktails #Aperitivi
    Last night’s Better Late Than Never, which is Sam Ross’s Too Soon? with the base swapped for bourbon, and tonight’s Spritz.

    Recipes in the alt tags.

  10. Nightcap is a #Stinger tonight. This one with Pierre Ferrand 1840, and a mix of Get 31 and Branca Menta.

    Good, but I think I prefer just the Get 31 if the ratios are right.

    #Cocktail #Cocktails #Brandy #Cognac #CremeDeMenthe

  11. A Manhattan sort of night… started off with John deBary’s Shark, half of which ended up in my lap after tangling with my winter sofa blanket, then another Shark, perhaps better than the first, PDT style tots and a 2Pok, finishing with Max Green’s Freeride. #PDT #AmorYAmargo #Cocktail #Cocktails #Rum #Tiki #HotDogs #AreNotSandwiches #ForTheRecord

  15. @SazeracLA my favorite substitutions with Stiggins Fancy are a #Sazerac, an #ElPresidente, or half the base of a #Sidecar. Or really any drink. Put some Stiggins in there and it will be better!

  16. CW: Food Preparation - Uncooked Meat

    Shake off excess water over sink. Don't touch it with anything, especially a drying cloth!

    Immediately stir meat around in bucket with whisk taking care not to get hands in curing solution.

    This minimises the potential for harmful bacteria transfer from you into the curing solution which is essential for safe results!

  17. A little farm and garden update for the homies:

    One of my favorite things about the #farm is that every day is pretty much a different experiment.

    The mason jar filled with flowers is my attempt at making lilac oil using the #enfleurage technique. After steeping for a few weeks, it should be ready to go.

    The meat photo? #capicolla! It's going to be vacuum sealed for a week, and then it shall go hang in my cellar with the #bresaola I've had hanging for two weeks now.

    And the garden-- I've got everything in this photo planted except those buckets along the back wall. I'm building a raised bed for watermelons back there, but right now the buckets are back there as storage because I don't know what I'm going to use them for (or plant in them) yet. They'll probably end up going between the back fence and the big bed in the middle. If you have any ideas, let me know!

    The next step aside from the watermelon bed is to finish laying cardboard out to kill weeds, and then cover it with wood chips and level everything, so it doesn't look, well, like it does in the photo.

    Also, if you haven't figured it out yet... I'm in my cottage core era now.

    #gardening #gardens #homestead #homesteading #homesteader #dailyexperiments #experiments #experimenting #foodscience #food #selfreliant #selfreliance #selfsufficient #selfsufficiency #howdoesyourgardengrow #foodforest #wisconsin #southeasternwisconsin #farmtotable #healthy #growyourownproduce #growyourownfood #knowwhatsinyourfood #supportfarmers #supportyourlocalfarmer #supportyourlocalfarmers #federated #fediverse #outdoors

  18. LIVING NOW, PLANNING LATER: LESSONS FROM DIE WITH ZERO

    I recently finished reading Die With Zero by Bill Perkins. The central idea is that the only guaranteed moment is the one you are living in right now, so the real question becomes: how do you maximize life, experiences, and finances in a way that still leaves your future self supported?

    It is a provocative idea because the tension between enjoying the present and preparing for the future is real and constant.  

    Over time, I’ve learned that for me, balance begins with understanding who you are, where you are, what you value, and how those values show up in your financial choices. When those pieces are unclear, any system, even the good ones, starts to crumble. 

    1. One Person’s Meat Is Another Person’s Poison 

    Many of us grow up in environments that blend shame, guilt, and unspoken expectations. These forces shape our relationship with money long before we ever earn a paycheque. When the messages around us tell us what we should want, how we should behave, or what a “responsible adult” spends on, it becomes difficult to know what we genuinely like, want, or value. This disconnection breaks our internal compass, leading to misalignment and autopilot habits we would never choose intentionally. 

    The first step in creating balance is naming without shame what a good life looks like for you. Not the version you think you should want, or the version your family, friends, or social media celebrate, but the one that fits your actual personality and priorities. The next step is accepting the cost of that vision.  

    Every meaningful life has a cost, financial or otherwise. When you accept those costs, you reduce the likelihood of reaching the end of life filled with regret about the things you never allowed yourself to do. 

    Ideally, you should look at your bank statement and clearly see your values reflected. Instead, many of us spend and save unintentionally. We follow generic financial scripts—save a million dollars, retire at sixty-five, invest aggressively, deny yourself now and reward yourself later—without pausing to evaluate whether those goals align with the life we want. 

    Consider:  

    • How would I live if I had one day left? One month? One year? Twenty years? 

    Your answers will shift, and the shifts matter. 

    • What spending brings guilt or shame? 

    • Does that feeling come from misalignment with your own goals, or from messages inherited from others? 

    • What trade-offs am I making when I choose to save, spend, or prioritize something? 

    Your honest answers shape your financial behavior more than anything else.  

    2. Know your numbers (without obsessing) 

    My sister once walked into the bank because every month her credit card bill shocked her. She was convinced some digital creature was secretly siphoning her money. The bank representative pulled up her statements, and she was forced to confirm that, yes, she had authorized every single purchase.  

    She did not even feel guilty about how she spent—she simply was not in control. Many of those purchases were not intentional, so she did not enjoy them. 

    You need to know where your money goes, but it does not need to be complicated. Start with the basics. List your fixed costs: rent, groceries, insurance, debt payments. Then estimate your variable costs: entertainment, dining, and hobbies.  

    Pull three to six months of bank and credit card statements and find your averages. This gives you a realistic baseline for what “normal” looks like.  

    You do not need to track every cent. Once you understand your baseline, you can make calmer and more informed decisions. 
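
    The baseline step above could be sketched in a few lines of Python. This is only an illustration: the (month, category, amount) tuples, the sample figures, and the function name `monthly_baseline` are all hypothetical, and a real statement export would need its own parsing.

```python
from collections import defaultdict


def monthly_baseline(transactions):
    """Average monthly spend per category.

    `transactions` is a list of (month, category, amount) tuples, a
    hypothetical format standing in for a real statement export.
    """
    months = {month for month, _, _ in transactions}
    totals = defaultdict(float)
    for _, category, amount in transactions:
        totals[category] += amount
    # Divide by the number of months covered, so a category with no
    # spending in a given month still counts that month as zero.
    return {category: total / len(months) for category, total in totals.items()}


# Made-up sample covering three months.
sample = [
    ("2024-01", "rent", 1000.0),
    ("2024-01", "dining", 60.0),
    ("2024-02", "rent", 1000.0),
    ("2024-02", "dining", 90.0),
    ("2024-03", "rent", 1000.0),
]
baseline = monthly_baseline(sample)
```

    Averaging over the months covered, rather than over the number of purchases, is what makes the result a picture of a "normal" month.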

    3. There are seasons 

    A major takeaway from Die With Zero is the idea of time buckets: recognizing that life has seasons and that each season allows for different types of experiences. As you age, your energy, health, and interests shift. I still enjoy traveling, but the idea of shared bathrooms and cold meals now feels miserable. In my twenties, I tolerated it easily, sometimes even enjoyed it. Instead of traveling frugally with friends during that season, I spent much of that time working toward money goals that ultimately felt hollow.

    Money only matters when it’s connected to a purpose. For me, that purpose now includes being in community, helping my family, and making art I love. 

    If your income barely covers essentials, your priority might be building breathing room, a small emergency fund, paying off a high-interest debt, or creating a side stream of income. If you already have savings, then the question becomes whether your spending aligns with your actual goals or whether it’s driven by fear or habit. 

    Instead of a lifelong bucket list, create decade lists: experiences and goals for your twenties, thirties, forties, and beyond. This brings clarity and makes financial decisions more meaningful. 

    4. Time is an important ingredient 

    This is where my perspective diverges slightly from Perkins. He describes a friend who borrows money to travel in his twenties. Perkins argues that “borrowing from your future self” can enhance early adulthood if the debt is modest and not high interest.  

    I understand the logic, but my bias is different: for nonessential expenses, if you cannot afford something, you should generally avoid borrowing to make it happen. 

    Many people already struggle with debt and low savings, and catching up later becomes unrealistic. Time itself is powerful. The earlier you start saving or investing, the less you have to contribute because compounding does most of the work.  

    For example, someone who saves $200 per month from age 25 to 35 and then stops can, if returns are high enough, end up with more at retirement than someone who starts at 35 and saves until 65. Time multiplies effort in a way nothing else can.
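
    The comparison above can be checked with a small simulation. This is a rough sketch under stated assumptions: monthly compounding at a fixed annual return, deposits made at the end of each month, and no fees, taxes, or inflation. The function names and the return rates tried are illustrative, not taken from the book.

```python
def balance_at_retirement(monthly, annual_return, start_age, stop_age, retire_age=65):
    """Grow monthly deposits made from start_age to stop_age until retire_age."""
    monthly_rate = annual_return / 12  # assumption: fixed return, compounded monthly
    saving_months = (stop_age - start_age) * 12
    balance = 0.0
    for month in range((retire_age - start_age) * 12):
        balance *= 1 + monthly_rate  # growth on what is already saved
        if month < saving_months:
            balance += monthly  # deposit at the end of the month
    return balance


def compare(annual_return):
    """Return (early saver, late saver) balances at age 65."""
    early = balance_at_retirement(200, annual_return, start_age=25, stop_age=35)
    late = balance_at_retirement(200, annual_return, start_age=35, stop_age=65)
    return early, late


for rate in (0.05, 0.07):
    early, late = compare(rate)
    print(f"{rate:.0%}: early saver {early:,.0f} vs late saver {late:,.0f}")
```

    Under these assumptions the early saver finishes ahead at an assumed 7% annual return but behind at 5%, so the outcome depends heavily on the return you assume.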

    5. Once you set your intention, automate it 

    James Clear’s Atomic Habits emphasizes removing friction. If your goal is to save, automate your transfers so money moves into savings or investments before you can touch it. Automation protects you from indecision, forgetfulness and emotional decision-making. You don’t need to check accounts constantly or react to every market dip. Review your plan once or twice a year, or when major life changes occur, but avoid endless tinkering. Consistency, not perfection, is what makes the difference.  

    6. Make space for joy 

    The goal is not money itself but what money allows: time with family, meaningful experiences, generosity, creativity or peace. Plan joy with the same seriousness you plan savings. Budget for dinner, the trip, the celebration. Remove the guilt and the outside opinions. You can live in the present and prepare for the future if you define what “enough” looks like for both. 

    #billPerkins #CHIDINMAMBANEFO #Column #dieWithZero #fixedCosts #personalFinances #poison #selfImprovement #severance #spareChange

  19. The thread about David Allan’s watercolours of Edinburgh workers in the 18th century; what they looked like and what their jobs entailed

    This thread was originally written and published in October 2020.

    Show me a fireman that’s as dashing and dapper as an 18th century Edinburgh fireman.

    Edinburgh Fireman, David Allan, 1785. CC-by-NC National Galleries Scotland

    The firemark on the bucket and helmet identify him as being in the employ of the “Sun Fire Office”, founded in 1710 in London and one of the first such organisations. His helmet is probably painted leather, and over his shoulder he carries a length of water pipe. The municipal Edinburgh Fire Establishment was not formed until 1824. Owners of buildings could subscribe to a Fire Insurance Office, and in return for payment would receive a plaque to put on the wall of their property. If it were to catch fire, they could call on the firefighters of their insurance to deal with the blaze. In Edinburgh, the Sun Fire Office was at “The Exchange”, the building which would later become the City Chambers.

    This beautiful watercolour sketch is by David Allan, and fortunately there’s more where it came from. Allan was born in Alloa in 1744. The young David was expelled from school when only 10 years old for drawing a caricature of his teacher. After attending the Academy of Art in Glasgow, he honed his craft in Rome, where he lived and studied for 10 years. He established himself in Edinburgh after a spell in London that had left him ill and unrecognised, finding success as a painter of family portraits and as a book illustrator (clients included Allan Ramsay). More importantly as far as I am concerned, he also produced a large and interesting body of his own work: documentary sketches of characters and workers around the town. In this respect he seems to be heavily influenced by Paul Sandby, the “Father of English Watercolour”, who started his artistic professional life as a military draughtsman and cartographer in Edinburgh in the 1750s, and shared an interest in illustrating the everyday scenes of the town.

    The man below is a caddie (a porter). He wears his licence as a badge on his coat, the lowland garb of knee length breeches, stockings, a hodden overcoat and on his head the ubiquitous blue bonnet. His carrying basket rests on the wall behind him. Caddies were of special importance in the town, and were regulated by the council, hence the badge. They were expected to know everything, everyone and everywhere. If a visitor to the town was important enough, they would be allocated a caddy of their own as a porter, message runner and local guide. The term is from the French cadet, and has been applied to the game of golf, where the caddy is the player’s porter and guide.

    An Edinburgh Porter or Caddie, David Allan, 1785. CC-by-NC National Galleries Scotland

    Coalmen at work. The lad on the cart wears a hodden jacket, breeches and the blue bonnet. He loads large lumps of coal onto the back of the other, who wears a short military redcoat. Note the load is taken around the forehead by a leather strap; the same would be the case for the caddie’s basket. Coal was an essential but expensive item; although it was plentiful in the Lothians, the cost of transporting it even a few miles was high due to a combination of poor roads and it being at the mercy of carters. Colliers and coal haulers were at this time still bonded labourers, living an existence very close to slavery. However, something about this pair suggests to me they were possibly working as coal merchants; perhaps they were a father and son team? If you read down to the Salt Seller you can find out more about the sorry existence of the bonded labourers in Scotland before 1799.

    Coalman. David Allan, 1785. CC-by-NC National Galleries Scotland

    Chimney sweeps. Their attire is perhaps more genteel looking than you might imagine, but then again this was a very important trade in a crowded and flammable city with labyrinthine chimney flues. Again they are in the ubiquitous breeches and stockings, hodden overcoats and blue bonnets. The iron ball on the rope would be lowered down the lum to dislodge stoor, and they worked in pairs; one up top and the other down at the hearth, calling to each other up and down the lum. Things aren’t that much different these days for sweeps, although they may have refinements like overalls and nylon ropes.

    Two Edinburgh Chimney Sweeps. David Allan, 1785. CC-by-NC National Galleries Scotland

    Two chairmen and their sedan, one of the earliest forms of “public” transport (if you could afford it) in the Old Town; its streets were too narrow and precipitous to make even small horse traps much use. These men are quite smartly turned out, as they would be serving a certain class of clientèle. Long coats with coloured facings, breeches, and at least one blue bonnet. The man with the brimmed hat has highland-style stockings woven on the bias in a chequered pattern – many if not most chairmen were of the Gàidhealtachd. The two appear to be sharing snuff. The poles of the chair carry a lantern. There was an alternative occupation called a “link man”, who was usually a boy and whose job it was to carry that lantern ahead of the chair (for a fee).

    An Edinburgh Sedan Chair with Two Porters. David Allan, 1785. CC-by-NC National Galleries Scotland

    Sedan chairs first appeared in Edinburgh in the 17th century; there were six public chairs in 1687. A century later in 1779, not that long before Allan painted this sketch, there were 180. The wealthy who had need of frequent transportation around town might keep their own private chairs. Most chairs plied their trade from the Tron Kirk, but a servant would usually be sent to fetch one to the exact location where it was required. The trade and its fares were subject to regulation, as “Hackney Chairs”; in 1768 the basic fee within the city was 6d, rising to 1/6d for up to half a mile outside the city, up to 4/- to hire the chair for the day. There were a number of storage sheds for them around the town, one of which survives, off Tweeddale Court.

    https://www.flickr.com/photos/davids_leicas/44751274650

    A water carrier. Although the supply in Edinburgh by this time was relatively good, it was drawn from a limited number of public wells. Daughters were often sent to fetch water, but if you could afford it you paid a water carrier to bring it to you, which was very useful considering that in the Old Town the wealthy usually lived a couple of floors up, removed from a bit of the noise and stench of the street. This image is clearly of an ex-military man; he retains his redcoat, and his Kilmarnock-style bonnet is of the sort worn by Highland regiments at this time. Based on the basic blue bonnet, at one time it would have been decorated with feathers, but these have long since expired. He has a padded leather apron and harness.

    An Edinburgh Water Carrier. David Allan, 1785. CC-by-NC National Galleries Scotland

    Women worked too of course. A Newhaven fishwife in her distinctive uniform which was directly inspired by the Dutch and Flemish tradition; a bright and voluminous set of striped skirts, a cape tied below the neck and a fancy linen cap. The ankles were reputedly always on show. With a heavy creel of fish on her back, supported by a leather strap, and a basket of oysters under her arm, the fishwives would each day make the 2.5 mile, 250ft climb up the hill from Newhaven to the City to sell their wares on the streets or door to door. For the Fisherrow women, who had their own garb, it was a 5 mile walk! Their Scots refrain of Wha’ll o my caller oo? and Wha’ll o my caller herrin? translates as “who’s for fresh oysters?” (or herring). The island of Inchkeith is clearly visible in the background, so she is standing somewhere on the foreshore between Newhaven and Leith.

    An Edinburgh Fishwife. David Allan, 1785. CC-by-NC National Galleries Scotland

    The pioneering photographers of everyday life, Hill & Adamson, made a number of studies of Newhaven fishwives 60 years after Allan’s painting. You can see almost nothing had changed in that time.

    Two Newhaven Fishwives, perhaps Mrs Elizabeth (Johnstone) Hall on the right. 1843. Hill & Adamson. CC-by-NC National Galleries Scotland

    Indeed little more changed in the following 100 years. Up until the 1950s, a dwindling number of Newhaven fishwives, some of the older ones still dressed in this manner, still took their wares into the city to sell door to door, although by this time they allowed themselves the luxury of travelling by tram.

    Fishwives travelling by tram, c. 1920s-30s. From NLS Mackinnon Collection

    A lacemaker. An older woman carrying a “distaff” – a pole from which the strands could be spun. She has an apron over at least 2 layers of plaid, a shawl around her shoulders and a cowled bonnet on her head.

    Lacemaker with a Distaff. David Allan, 1784. CC-by-NC National Galleries Scotland

    And one of Allan’s most unusual and intriguing portraits, not because it shows a servant girl or a milkmaid, but because she is a black woman. This is one of the earliest images of a black woman in Scotland and is clear and compelling evidence that she was part of the town community. This picture was only secured by the National Gallery of Scotland in 2021, at which time the press release said “Looking directly at the viewer, she is shown in working dress, going about her daily duties and set against the backdrop of an elegant Edinburgh street. Her name and life story is unknown, but it is likely that she was a servant, a milkmaid, as suggested by the large vessel or butter churn shown beside her.” Unlike Allan’s other sketches of workers, which are always in a fly-on-the-wall style, this one is clearly posed.

    Milkmaid with Butter Churn. David Allan, 1785. CC-by-NC National Galleries Scotland

    An officer of the town, perhaps a bailie, reads a proclamation (“God save the King!”), accompanied by two town guardsmen with drums. The officer has a luxurious blue velvet coat, the guardsmen are in simpler red coats with blue facings and tricorne military hats. We can see the spire of St. Giles in the background. Allan has at least four different sketches of town officers in this uniform; the badge on the chest of the coat is clearly the civic arms of the city.

    Town Officer and Drummers. David Allan, 1785. CC-by-NC National Galleries Scotland

    A member of the town guard, in a red military uniform with blue facings and red breeches. He wears a tricorne hat and carries a “Lochaber axe”, a long-handled pole weapon. The hook on the end was reputed to be for dismounting horsemen but just as likely was to hang the weapon up in the guard room when not in use. The “town rats” or “black banditti” were another class of citizens largely drawn from the Highlands and their nickname gives you an idea of how popular and respected they were by the general public.

    A town guardsman. David Allan, 1785. CC-by-NC National Galleries Scotland

    Not just any old beggar, this unfortunate man’s blue cloak and the prominent tin badge on his breast identify him as a Jockie. The Jockies were King’s Bedesmen, or Blue Gowns; they were a class of beggars by Royal appointment, first licensed by King James VI. Every birthday of the reigning monarch, each Bluegown received a new cloak, their tin badge with the motto “pass and repass”, a Scots shilling for every year of the monarch’s age and their dinner. “Pass and repass” referred to the holder being allowed to pass freely through the land, not being subject to local begging laws or charges of vagrancy. They had a lodge house outside the city; the Jockies Lodge – this is where the neighbourhood of Jock’s Lodge takes its name from.

    “Charity”. A beggar with donkey and children. David Allan, 1785. CC-by-NC National Galleries Scotland

    A salt seller. Again the load is carried in a basket held with a leather strap around the forehead. A cloth in the basket prevents the salt escaping and could be closed to protect it from rain. Salt was produced along the Forth coast wherever there was good access to coal to evaporate seawater; at Joppa pans, Pinkie pans, Prestonpans etc. Salt was vitally important for everyday life as one of the few preservatives available for meat and fish. After the Act of Union in 1707, a favourable tax regime meant boom times for Scottish producers; this favour definitely did not trickle down to the work force however. The trades of coal miners, coal carriers and salters were of vital importance to the national economy, and although the work was highly skilled it was excruciating labour; in recognition the Scottish government forced them into being permanently bondaged labour in 1606. New workers were not subject to this after 1775, but it was not until an Act of 1799 that the last were freed from their obligations. In 1785, a worker of the age shown in the sketch could well have been a bondaged labourer. Changes to the salt taxation regime and imports of cheap European and later English mined rock salt largely killed off the Scottish sea salt industry in the first half of the 19th century.

    A salter. David Allan, 1785. CC-by-NC National Galleries Scotland

    Note to readers: unfortunately in April 2026, a third-party plug-in more than exceeded its authority and broke many of the image links on this site. No images were lost but I will have to restore them page-by-page, which may take some time. In the meantime please bear with me while I go about rectifying this issue.

    If you have found this site useful, informative or amusing then you can help contribute towards its running costs by supporting me on ko-fi. This includes my commitment to keeping it 100% advert and AI free for all time coming, and in helping to find further unusual stories to bring you by acquiring books and paying for research.
    Or please do just share this post on social media or amongst friends and like-minded people, sites like this thrive on being shared.

    These threads © 2017-2026, Andy Arthur.

    NO AI TRAINING: Any use of the contents of this website to “train” generative artificial intelligence (AI) technologies to generate text is expressly prohibited. The author reserves all rights to license uses of this work for generative AI training and development of machine learning language models.

    #Lochend #Logan #Restalrig #StMargaret
  20. Empire of Dragons @dragonshortstories.wordpress.com@dragonshortstories.wordpress.com

    Eternally, Tea

    “Try again.”

    Jourdain hit the floor hard, his elbow hitting before the rest of his body. He cried out in pain, but the instructor urged him to get up. She stood before him, ready for his next move.

    “When you’re in the field, you have to disregard pain. The more hits you take, the stronger you become. Your species operates in this manner.”

    Jourdain sighs, “I know. I know.”

    “Then act like it.”

    The instructor comes at him with another round of blows. He blocks as best as he can but reacts slowly when she jabs him on his right side, and again to the left side of his face.

    Jourdain hit the ground head first, immobilized and knocked out.

    “God, this one is a pain to train,” she sighs. Another day wasted.

    From above, someone was watching the entire training session. They made their assessment mentally, stepping away from the window.

    * * *

    At the end of the training week, students could spend their weekly allowance on food. All of them gathered in the corridor where their food could be ordered. Jourdain walks up to one of the many touchscreen ordering stations, scrolling through the menu. He wanted a slice of strawberry shortcake; the most expensive item. Unfortunately for him, he didn’t eat enough, and he had to eat this week and the next week, and the week after.

    He sighs heavily, opting for another bucket of leafy greens and spicy chicken. No cake this week.

    He sits alone at a table, playing with his food. If he’s hungry enough, he’ll eat it all eventually.

    A redhead girl takes a seat next to him. His friend, Ruvi. “How are you feeling?”

    He shakes his head, “I’m tired and I hurt.”

    Ruvi leaves her seat and gets behind him, gently wrapping her hands around him and brushing her face against his, hoping to ease his sorrows from training.

    “I foresee you will feel better tomorrow.”

    It’s cute, Jourdain will admit that. It brings a smile to his face. He closes his eyes, absorbing all of the good energy he can feel.

    “Thank you, Ruvi.”

    “No problem, now eat!”

    She hops back to her seat while he douses his spinach in salad dressing.

    “How was training for you?” he asked, finally taking a bite of the lettuce and fish.

    “I did goooood,” Ruvi flexes her arm and pats her muscle, “I got this. Imma be top dog in no time.”

    “Yeah,” his smile is warm, but his thoughts hinge on sadness. ‘We’ll be separated.’

    Being a good student meant going into the field early. Ruvi was proving her worth. Soon she’ll be out in the top class’s barracks with her own team. Away from him.

    He plays with his food again, taking the longest to eat.

    Ten minutes pass.

    Twenty.

    Ruvi had finished her food. She sat there playing on her phone, tap tap tapping away.

    Twenty-five minutes.

    “Ay, are you okay?”

    “Huh?” He looks up, noticing the concern on her face.

    “Oh- uhh…”

    It takes him a moment to muster the courage to tell a half truth.

    “I don’t want us to be separated.”

    Ruvi pauses, her cheeks turning a delicate shade of red.

    ‘Me leaving has caused him this much pain?’

    Her eyes twinkle with welled tears, she gently places her hands atop his, “Jourdain.”

    His eyes met hers, noticing the tears that fell.

    “…”

    “I’m sorry you’ve been suffering this whole time over me.”

    “I… uh…”

    “If you want us to stay together, you have to try as hard as you can to be the best.”

    He nods, staying silent.

    “Now, eat so we can sleep and be prepared for tomorrow.”

    He feels a little better, finally being able to eat just a little faster.

    * * *

    Another training session, more pain. This week it was more of the same with minimal improvement to his reaction time. The instructor would beat the shit out of him and he would be knocked out for hours.

    The next week? The same thing. The week after, hmm… there’s some improvement.

    He orders fish and lettuce for both weeks.

    The week after? Typical. Leafy greens with a side of meat. Oh—and pain from training.

    The next month? More leafy greens with fish or meat.

    And his worst fear.

    Ruvi and several other students were given a graduation ceremony in the atrium.

    Jourdain watched as she was onstage, shaking hands with the Director of Research and the overseer of this entire program.

    When she looked to the audience, her gaze settled on him and only him. She took the microphone, intent on leaving a message meant for him.

    “I know it’s hard, but we are dragons. We persevere. Our species has struggled and struggled on this planet, but these struggles always make us strong. So please, let it make you stronger,” she holds out her hand to the audience, to him, “and join me in the top class.”

    Jourdain’s whole face flushed red. He knew she was speaking to him. He knew that time had run out. No more would he have her good energy, her happy vibes, her presence and her jokes to pull him through.

    Seeing her again is his goal. He lets it become the fuel that keeps him pushing through the intense training.

    His reaction time improves slightly, but he uses his intelligence to avoid his trainer and strike her at her weak points, like grabbing her tail and BITING IT.

    “WHAT THE HELL?”

    She tries to yoink him off but the pain brings her to her knees.

    “I won this time,” he huffs, his teeth and chin covered in her blood.

    She grits her teeth. He did indeed win.

    The person watching them made another mental note. Perhaps it was time to personally talk to this boy.

    Jourdain was preparing to leave when The Director stopped him.

    “You there, CT79, you’ve finally found an inventive way to complete a training session.”

    “Uh, yeah. I guess. I really need to wash my mouth out.”

    “I’ve been keeping an eye on you. Don’t let me down.”

    “I won’t, sir.”

    “Good. You are dismissed.”

    Jourdain kept training, using his wits to climb to the top spot. The desire to meet Ruvi again is what pulls him forward. Over time, he desires that cake less and less. It becomes a footnote in the grand scale of his plans.

    He remembers those old days as he stares down the strawberry cake in front of him. Here, dressed in these elegant clothes and mingling with Blavatsky’s club members, he wonders should he follow through with his mission and betray her. Should he become like the other girls and eat the cake, giving in to the dark powers and become a magical girl?

    The club is in chaos. Those who chose to stay with Blavatsky are fighting for their lives against sugar-powered magical heathens. He hears the screaming, he hears the whispers of the cake, but he would rather remain as a vegetable magical girl than to betray the person who gave him a new family.

    He turns away from the cake and arms himself with a broccoli staff and a purple cabbage, both manifesting from his power.

    He takes one last glance back at the cake.

    “Another time. We shall meet again.”

    This story is for a bigger story that I think was called Miss Blavatsky’s High Society Tea Club. It’s about an aristocratic black woman who saves girls who are vulnerable to the influence of sweets and brings them to her club to learn the ways of fruits and vegetables. Lately her club has gotten the attention of someone who wants to bring it down and turn all of the veggie girls into ‘Magical Sweets Girls.’

    The story came to me when I was listening to La Fee Verte by ALI PROJECT and I imagined faceless girls who loved vegetables slowly fall prey to sweets, even fighting over it, while their leader tries to keep the remaining ones away from them. The idea was further developed when I listened to the newer version of Strawberry Pie O Otabe, also by ALI PROJECT.

    Lady Jourdain is a later addition. She is actually a boy, a femboy posing as a vegetable girl. He planned to betray Blavatsky but as you can see, he just couldn’t.

    Note: CT means cat tail. Refers to Jourdain’s species. Cat Tail dragon. 79 means there were 79 cat tail dragons before him.

    #books #Dragons #Fantasy #Femboy #Fiction #ScienceFiction #ShortStories #shortStory #Writing
  21. Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but moreso inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).

    At the bottom they teased:

    Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.

    Cendyne

    If you read Cendyne’s post, you may have nodded along with that remark and not appreciated the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.

    Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.

    I’m including art of my fursona in this post… as is tradition for furry blogs.

    If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.

    Thanks to CMYKat for the awesome stickers.

    Contents

    • Database Cryptography?
    • Cryptography for Relational Databases
      • The Perils of Built-in Encryption Functions
      • Application-Layer Relational Database Cryptography
        • Confused Deputies
        • Canonicalization Attacks
        • Multi-Tenancy
    • Cryptography for NoSQL Databases
      • NoSQL is Built Different
      • Record Authentication
        • Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
    • Searchable Encryption
      • Order-{Preserving, Revealing} Encryption
      • Deterministic Encryption
      • Homomorphic Encryption
      • Searchable Symmetric Encryption (SSE)
      • You Can Have Little a HMAC, As a Treat
    • Intermission
    • Case Study: MongoDB Client-Side Encryption
      • MongoCrypt: The Good
        • How is Queryable Encryption Implemented?
      • MongoCrypt: The Bad
      • MongoCrypt: The Ugly
    • Wrapping Up

    Database Cryptography?

    The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.

    The consequences of this simple premise are anything but simple. Let me explain.

    Art: ScruffKerfluff

    The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible to support multiple types of workloads.

    Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.

    But we’re not settling for mere compliance on this blog. Furries have standards, after all.

    So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:

    1. What database software is being used? Is it up to date?
    2. What data is being stored in which database software?
    3. How are databases oriented in the network of the overall system?
      • Is your database properly firewalled from the public Internet?
    4. How does data flow throughout the network, and when do these data flows intersect with the database?
      • Which applications talk to the database? What languages are they written in? Which APIs do they use?
    5. How will cryptography secrets be managed?
      • Is there one key for everyone, one key per tenant, etc.?
      • How are keys rotated?
      • Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?

    The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.

    (This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)

    The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.

    Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.

    If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know they weren’t tampered with at rest.”

    My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:

    1. On one axis, we have the type of database.
      • Flat-file
      • Relational
      • Schema-free
    2. On another, we have the basic confidentiality requirements of the data.
      • Field encryption
      • Row encryption
      • Column encryption
      • Unstructured record encryption
      • Encrypting entire collections of records
    3. Finally, we have the integrity requirements of the data.
      • Field authentication
      • Row/column authentication
      • Unstructured record authentication
      • Collection authentication (based on e.g. Sparse Merkle Trees)

    And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.

    Why store data in a database if you have no way to index or search the data for fast retrieval?

    Credit: Harubaki

    If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.

    Some just phone it in with AES_Encrypt() calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)

    Which brings us to the meat of this blog post: The actual cryptography part.

    Cryptography is the art of transforming information security problems into key management problems.

    Former coworker

    Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.

    Cryptography for Relational Databases

    Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased from the introduction.

    You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?

    But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.

    So let’s start with a simple question: where and how are you performing your encryption?

    The Perils of Built-in Encryption Functions

    MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.

    It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:

    Art by Riley, encrypted with AES-ECB

    The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).

    ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.

    A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.

    ECB art isn’t to be confused with ECB poetry, which looks like this:

    Oh little one, you’re growing up
    You’ll soon be writing C
    You’ll treat your ints as pointers
    You’ll nest the ternary
    You’ll cut and paste from github
    And try cryptography
    But even in your darkest hour
    Do not use ECB

    CBC’s BEASTly when padding’s abused
    And CTR’s fine til a nonce is reused
    Some say it’s a CRIME to compress then encrypt
    Or store keys in the browser (or use javascript)
    Diffie Hellman will collapse if hackers choose your g
    And RSA is full of traps when e is set to 3
    Whiten! Blind! In constant time! Don’t write an RNG!
    But failing all, and listen well: Do not use ECB

    They’ll say “It’s like a one-time-pad!
    The data’s short, it’s not so bad
    the keys are long–they’re iron clad
    I have a PhD!”
    And then you’re front page Hacker News
    Your passwords cracked–Adobe Blues.
    Don’t leave your penguins showing through,
    Do not use ECB

    — Ben Nagy, PoC||GTFO 0x04:13

    Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.

    Also, SHOW processlist; leaks your encryption keys. Oops.

    Credit: CMYKatt

    Application-layer Relational Database Cryptography

    Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.

    And, yes, that’s part of the answer. But how you encrypt is important.

    Credit: Harubaki

    “I’ll encrypt with CBC mode.”
    If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?

    “Okay, fine, I’ll use an authenticated mode like GCM.”
    Did you remember to make the table and column name part of your AAD? What about the primary key of the record?

    “What on Earth are you talking about, Soatok?”
    Welcome to the first footgun of database cryptography!

    Confused Deputies

    Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.

    To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?

    We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.

    What, then, stops an attacker from copying ciphertexts around?

    Credit: CMYKatt

    Let’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.

    All I would need to do is simply…

    UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345

    …and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and it will just decrypt. Even if you’re using an authenticated mode.

    We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.

    The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)

    - $addr = aes_gcm_encrypt($addr, $key);
    + $addr = aes_gcm_encrypt($addr, $key, canonicalize([
    +     $tableName,
    +     $columnName,
    +     $primaryKey
    + ]));

    Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.

    This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:

    Where your data lives is part of its identity, and MUST be authenticated.

    Soatok’s Rule of Database Cryptography

    Canonicalization Attacks

    In the previous section, I introduced a pseudocode function called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.

    First, consider you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…

    function dumbCanonicalize(
        string $tableName,
        string $columnName,
        string|int $primaryKey
    ): string {
        return $tableName . '_' . $columnName . '#' . $primaryKey;
    }

    Consider these two inputs to this function:

    1. dumbCanonicalize('customers', 'last_order_uuid', 123);
    2. dumbCanonicalize('customers_last_order', 'uuid', 123);

    In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
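
    To make the collision concrete, here is a minimal sketch in Python (chosen only so that it runs with the standard library; the snippets above are PHP-flavored pseudocode). It contrasts the naive join with one common fix, which is to length-prefix every component. The function names and the little-endian 64-bit length encoding are assumptions made for illustration, not something this post prescribes.

```python
import struct


def dumb_canonicalize(table: str, column: str, primary_key) -> str:
    """Naive join, mirroring dumbCanonicalize() above."""
    return f"{table}_{column}#{primary_key}"


def canonicalize(parts) -> bytes:
    """Length-prefixed encoding: count of parts, then (length, bytes) per part.

    Assumption: every part is rendered with str(), so the integer 123 and
    the string "123" encode identically. Encode the type too if that matters.
    """
    out = struct.pack("<Q", len(parts))
    for part in parts:
        data = str(part).encode("utf-8")
        out += struct.pack("<Q", len(data)) + data
    return out


a = ("customers", "last_order_uuid", 123)
b = ("customers_last_order", "uuid", 123)

print(dumb_canonicalize(*a) == dumb_canonicalize(*b))  # True: the AADs collide
print(canonicalize(a) == canonicalize(b))              # False: boundaries are unambiguous
```

    Because each component carries its own length, no choice of table or column name can shift the boundary between components, which is exactly what the underscore join fails to guarantee.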

    In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.

    Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.

    Multi-Tenancy

    Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.

    Oftentimes, however, software developers are tasked with building support for Bring Your Own Key (BYOK).

    This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).

    Other times, this is driven by a need to cut costs: Storing different users’ data in the same database server, but encrypting it such that they can only decrypt their own records.

    Two things can happen when you introduce multi-tenancy into your database cryptography designs:

    1. Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
    2. Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.

    So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.

    Isn’t cryptography fun?

    “What Are Invisible Salamanders?”

    This refers to a fun property of AEAD modes based on Polynomial MACs. Basically, if you:

    1. Encrypt one message under a specific key and nonce.
    2. Encrypt another message under a separate key and nonce.

    …Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.

    This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature because another attached file had the same authentication tag and ciphertext, and you could trick the system if you disclosed the second key instead of the first. Thus, the salamander is invisible to attackers.

    Art: CMYKat

    We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.

    Cryptography for NoSQL Databases

    Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.

    NoSQL is Built Different

    The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.

    What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.

    Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.

    Consider the following record structure, which stores a bunch of data encrypted with AES in CBC mode:

    {
      "encrypted-data-key": "<blob>",
      "name": "<ciphertext>",
      "address": [
        "<ciphertext>",
        "<ciphertext>"
      ],
      "social-security": "<ciphertext>",
      "zip-code": "<ciphertext>"
    }

    If this record is decrypted with code that looks something like this:

    $decrypted = [];
    // ... snip ...
    foreach ($record['address'] as $i => $addrLine) {
        try {
            $decrypted['address'][$i] = $this->decrypt($addrLine);
        } catch (Throwable $ex) {
            // You’d never deliberately do this, but it’s for illustration
            $this->doSomethingAnOracleCanObserve($i);

            // This is more believable, of course:
            $this->logDecryptionError($ex, $addrLine);
            $decrypted['address'][$i] = '';
        }
    }

    Then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.

    Art: Harubaki

    This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.

    Record Authentication

    A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.

    Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.

    The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.

    This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.

    Except…

    When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.

    Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.

    And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)

    I don’t have any specific real-world examples here that I can point to of this problem being solved well.

    Art: CMYKat

    Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.

    As I said with encryption before, but also true here:

    Where your data lives is part of its identity, and MUST be authenticated.

    Soatok’s Rule of Database Cryptography

    This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
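
    As a rough illustration of binding each value to its location, here is a small Python sketch using HMAC-SHA256 from the standard library. The dotted path notation, the helper names, and the length-prefixed encoding are all assumptions for the sake of the example; an asymmetric signature such as EdDSA would consume the same canonical input.

```python
import hashlib
import hmac
import struct


def _lp(data: bytes) -> bytes:
    """Length-prefix a byte string so field boundaries stay unambiguous."""
    return struct.pack("<Q", len(data)) + data


def authenticate_fields(key: bytes, fields: dict) -> bytes:
    """HMAC-SHA256 over (path, value) pairs, taken in sorted-path order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for path in sorted(fields):
        mac.update(_lp(path.encode("utf-8")))
        mac.update(_lp(fields[path].encode("utf-8")))
    return mac.digest()


def verify_fields(key: bytes, fields: dict, tag: bytes) -> bool:
    """Constant-time comparison of the recomputed tag against the stored one."""
    return hmac.compare_digest(authenticate_fields(key, fields), tag)


key = b"\x00" * 32  # placeholder only; use a randomly generated key in practice
record = {"billing.zip-code": "90210", "shipping.zip-code": "10001"}
tag = authenticate_fields(key, record)

# Same two values, but each one moved to the other path.
swapped = {"billing.zip-code": "10001", "shipping.zip-code": "90210"}

print(verify_fields(key, record, tag))   # True
print(verify_fields(key, swapped, tag))  # False
```

    If only the values were authenticated, the swapped record would carry exactly the same set of authenticated bytes; including the path is what makes the relocation detectable.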

    Bonus: A Maximally Schema-Free, Upgradeable Authentication Design

    Art: Harubaki

    Okay, how do you solve this problem so that you can perform updates and upgrades to your schema but without enabling attackers to downgrade the security? Here’s one possible design.

    Let’s say you have two metadata fields on each record:

    1. A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this meta-auth.
    2. A compressed binary string representing which of the authenticated fields should also be encrypted. Unlike meta-auth, this field is itself authenticated. This is at most the same length as the first metadata field. Let’s call this meta-enc.

    Furthermore, you will specify a canonical field ordering for both how data is fed into the signature algorithm as well as the field mappings in meta-auth and meta-enc.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-enc */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false  /* example.superfluous.rewards-member */
      ]),
      "signature": /* -- snip -- */
    }

    When you go to append data to an existing record, you’ll need to update meta-auth to include the mapping of fields based on this canonical ordering to ensure only the intended fields get validated.

    When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:

    • New records will have the additional field flagged as authenticated in meta-auth (and meta-enc will grow)
    • Old records will not, but your code will still sign them successfully
    • To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.

    You might think meta-auth gives an attacker some advantage, but this only includes which fields are included in the security boundary of the signature or MAC, which allows unauthenticated data to be appended for whatever operational purpose without having to update signatures or expose signing keys to a wider part of the network.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true,  /* meta-enc */
        true   /* meta-version */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-version */
      ]),
      "meta-version": 0x01000000,
      "signature": /* -- snip -- */
    }

    If an attacker tries to use the meta-auth field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).

    Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
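
    Here is one way the design above might be sketched in Python, using HMAC-SHA256 from the standard library in place of a signature. The bit-packing layout of compress_bools(), the flattened dotted paths, and the mac_record() helper are assumptions of this sketch; the post only fixes the idea of a canonical ordering and a bitmap of authenticated fields.

```python
import hashlib
import hmac
import struct

# Canonical field ordering (an assumption of this sketch). The bits in
# meta-auth are interpreted in exactly this order.
FIELD_ORDER = [
    "example.credit-card.number",
    "example.credit-card.expiration",
    "example.credit-card.ccv",
    "example.superfluous.rewards-member",
    "meta-enc",
    "meta-version",
]


def compress_bools(flags) -> bytes:
    """Pack booleans into bytes, most significant bit first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def decompress_bools(packed: bytes, count: int) -> list:
    """Inverse of compress_bools() for the first `count` flags."""
    return [bool(packed[i // 8] & (0x80 >> (i % 8))) for i in range(count)]


def _lp(data: bytes) -> bytes:
    """Length-prefix a byte string so field boundaries stay unambiguous."""
    return struct.pack("<Q", len(data)) + data


def mac_record(key: bytes, flat: dict, meta_auth: bytes) -> bytes:
    """MAC only the fields flagged in meta-auth, each bound to its path."""
    flags = decompress_bools(meta_auth, len(FIELD_ORDER))
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for path, flagged in zip(FIELD_ORDER, flags):
        if flagged:
            mac.update(_lp(path.encode("utf-8")))
            mac.update(_lp(flat[path]))
    return mac.digest()


key = b"\x01" * 32  # placeholder only; use a randomly generated key in practice
flat = {
    "example.credit-card.number": b"<ciphertext>",
    "example.credit-card.expiration": b"<ciphertext>",
    "example.credit-card.ccv": b"<ciphertext>",
    "example.superfluous.rewards-member": b"",
    "meta-enc": compress_bools([True, True, True, False, True]),
    "meta-version": struct.pack(">I", 0x01000000),
}
meta_auth = compress_bools([True, True, True, False, True, True])
tag = mac_record(key, flat, meta_auth)

# An unauthenticated field can change without touching the tag.
flat["example.superfluous.rewards-member"] = b"gold"
print(hmac.compare_digest(mac_record(key, flat, meta_auth), tag))  # True

# Clearing the meta-version bit changes the MAC input, so the old tag fails.
stripped = compress_bools([True, True, True, False, True, False])
print(hmac.compare_digest(mac_record(key, flat, stripped), tag))   # False
```

    The second check is the downgrade case: an attacker who edits meta-auth to drop a field from the security boundary would also need to produce a fresh tag, which requires the key.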

    Searchable Encryption

    If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.

    Art: CMYKat

    Okay, put away the pitchforks; I will explain.

    Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.

    And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.

    Art: Sophie

    So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:

    1. Encrypting the data, to protect it from unauthorized disclosure, but render it useless
    2. Doing anything useful with the data, but leaving it unencrypted in the database

    This is especially annoying for business types that are all in on the Zero Trust buzzword.

    Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.

    Order-{Preserving, Revealing} Encryption

    On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.

    [D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.

    Attack of the week: searchable encryption and the ever-expanding leakage function

    The problem with these designs is that they leak enough information that they no longer provide semantic security.

    From Grubbs, et al. (GLMP, 2019.)
    Colors inverted to fit my blog’s theme better.

    To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.

    Order revealing
    Reveals much more than order
    Softcore ECB

    Order preserving
    Semantic security?
    Only in your dreams

    Haiku for your consideration

    Deterministic Encryption

    Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.

    If you recall the AES_{De,En}crypt() functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.

     SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');

    However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can encrypt a small number of distinct records safely before you’re no longer secure.

    From Page 14 of the linked paper.

    That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.

    Homomorphic Encryption

    In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated by serving Wikipedia pages in a reasonable amount of time.

    Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.

    If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
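
    To make the “homomorphic to multiplication” and “homomorphic to XOR” claims concrete, here’s a minimal Python sketch. It assumes textbook RSA with toy parameters (tiny primes, no padding), and it uses a SHA-256-based keystream as a stand-in for AES-CTR because Python’s standard library has no AES. The helper names (rsa_encrypt, keystream, xor_bytes) are made up for illustration, and none of this is safe for real use.

```python
import hashlib

# Textbook RSA with tiny, insecure parameters (illustration only).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)


def rsa_encrypt(m: int) -> int:
    return pow(m, e, n)


def rsa_decrypt(c: int) -> int:
    return pow(c, d, n)


# Multiplying two ciphertexts multiplies the underlying plaintexts (mod n).
c1, c2 = rsa_encrypt(7), rsa_encrypt(6)
rsa_result = rsa_decrypt((c1 * c2) % n)


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in for AES-CTR: hash (key, nonce, counter) to produce a keystream."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# XORing a difference into the ciphertext XORs it into the plaintext,
# and the party doing so never needs the key.
key, nonce = b"k" * 32, b"n" * 12
original = b"pay $100"
ciphertext = xor_bytes(original, keystream(key, nonce, len(original)))
delta = xor_bytes(b"pay $100", b"pay $900")
tampered = xor_bytes(ciphertext, delta)
decrypted = xor_bytes(tampered, keystream(key, nonce, len(tampered)))
```

    Here rsa_result comes out as the product of the two plaintexts, and decrypted reads “pay $900” even though whoever flipped the bits never saw the key. That’s the same property that makes unauthenticated stream ciphers malleable.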

    Art: Harubaki

    Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.

    However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.

    Just say “NO” to the copyright cartels.

    Art: CMYKat

    Searchable Symmetric Encryption (SSE)

    Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?

    Art: CMYKat

    That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.

    The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.

    Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.

    Some schemes only allow for searching a static set of records, while others can accommodate new data over time (at the cost of either more leakage or worse performance).

    If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.

    You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”

    Your guess is as good as mine, but I can think of a few reasons:

    1. The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (e.g. OpenSSL’s libcrypto or libsodium).
    2. Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
    3. Insufficient marketing and developer advocacy.
      SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.

    Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.

    You Can Have a Little HMAC, As a Treat

    Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.

    The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.

    The idea I’m alluding to goes like this:

    1. Encrypt your data, securely, using symmetric cryptography.
      (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
    2. Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.

    When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?

    Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
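
    As a rough sketch of the idea (not a drop-in implementation), here’s what the HMAC index looks like in Python. The blind_index and find_by_email names, the in-memory rows list, and the placeholder ciphertext strings are all assumptions for illustration; in practice the index would live in its own database column and the ciphertext would come from your actual encryption code.

```python
import hashlib
import hmac


def blind_index(index_key: bytes, plaintext: str) -> str:
    """HMAC-SHA256 of the plaintext under a key used only for indexing."""
    return hmac.new(index_key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for the demo; a real one comes from your key management.
index_key = bytes.fromhex("00" * 32)

# Each row stores the ciphertext (opaque here) next to its index value.
rows = [
    {"email_ciphertext": "<ciphertext 1>",
     "email_index": blind_index(index_key, "alice@example.com")},
    {"email_ciphertext": "<ciphertext 2>",
     "email_index": blind_index(index_key, "bob@example.com")},
]


def find_by_email(rows: list, index_key: bytes, email: str) -> list:
    """Recompute the HMAC of the query and return rows with a matching index."""
    wanted = blind_index(index_key, email)
    return [row for row in rows if row["email_index"] == wanted]


matches = find_by_email(rows, index_key, "alice@example.com")
```

    The lookup never touches the encryption key, which is why the index survives encryption key rotation as long as index_key stays put.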

    Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?

    Hooray!

    However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.

    How annoying!

    Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.

    Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.

    “K-What Now?”

    Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.

    Each HMAC output corresponds 1:1 with a unique plaintext.

    Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.

    For example, an HMAC-SHA256 output might look like this: 04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae

    Art: CMYKat

    If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields a8b1ae), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.

    Which means if you’re only revealing a truncated HMAC tag to the database server (both when storing records and when retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.

    These false positives give your data a discrete set of anonymity (called k-anonymity), which means an attacker with access to your database cannot:

    1. Distinguish between two encrypted records with the same short HMAC tag.
    2. Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
    Art: CMYKat

    As with SSE above, this short HMAC technique exposes a trade-off to users.

    • Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
    • Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.

    Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
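
    For a feel of what truncating to a number of bits looks like, here’s a small Python sketch. The truncated_index name, the demo key, and the choice of keeping the low-order bits (mirroring the “last 3 bytes” example above) are assumptions for illustration. It doesn’t attempt to pick the right number of bits for your dataset.

```python
import hashlib
import hmac


def truncated_index(index_key: bytes, plaintext: str, bits: int) -> int:
    """Keep only the lowest `bits` bits of HMAC-SHA256(index_key, plaintext)."""
    if not 1 <= bits <= 256:
        raise ValueError("bits must be between 1 and 256")
    digest = hmac.new(index_key, plaintext.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


# The example tag from the text: keeping the last 24 bits (3 bytes).
example = bytes.fromhex(
    "04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae"
)
last_24_bits = int.from_bytes(example, "big") & ((1 << 24) - 1)

# With only 8 bits there are 256 possible tags, so 1000 distinct plaintexts
# are guaranteed to collide (the false positives that provide k-anonymity).
index_key = b"\x01" * 32  # demo key only
tags = {truncated_index(index_key, f"user{i}@example.com", 8) for i in range(1000)}
```

    Since a query now returns every record sharing the short tag, the application still has to decrypt the candidates and discard the ones that don’t match.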

    If you’d like to play with this idea, here’s a quick and dirty demo script.

    Intermission

    If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, by making it to this point, they’ve probably been long since put to rest.

    Art: Harubaki

    Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting; their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.

    Thus, if you’ve read this far, I’d like to apply what I’ve covered so far to a real-world case study of a database cryptography product.

    Case Study: MongoDB Client-Side Encryption

    MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.

    Taken from the press release, but adapted for dark themes.

    A statement at the bottom of their press release indicates that this isn’t clown-shoes:

    Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.

    If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.

    So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.

    MongoCrypt: The Good

    MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).

    MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.

    How Is Queryable Encryption Implemented?

    From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).

    According to their press release:

    The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.

    MongoDB Queryable Encryption Announcement

    Which means that most of the juicy details probably aren’t public yet.

    These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.

    There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.

    Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.

    However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.

    MongoCrypt: The Bad

    Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.

    However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.

    First, observe this code which assembles the multi-part inputs into HMAC.

    /* Construct the input to the HMAC */
    uint32_t num_intermediates = 0;
    _mongocrypt_buffer_t intermediates[3];
    // -- snip --
    if (!_mongocrypt_buffer_concat (
          &to_hmac, intermediates, num_intermediates)) {
       CLIENT_ERR ("failed to allocate buffer");
       goto done;
    }
    if (hmac == HMAC_SHA_512_256) {
       uint8_t storage[64];
       _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};
       if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
          goto done;
       }
       // Truncate sha512 to first 256 bits.
       memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
    } else {
       BSON_ASSERT (hmac == HMAC_SHA_256);
       if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
          goto done;
       }
    }

    The implementation of _mongocrypt_buffer_concat() can be found here.

    If either the implementation of that function, or the code I snipped from my excerpt, had contained code that prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.

    Using TupleHash would also have prevented this issue.
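
    To illustrate the length-prefixing idea in general terms (this is not libmongocrypt’s code, and the function names are invented for the example), here’s a Python sketch that compares a MAC over plainly concatenated segments with a MAC over segments that each carry a 64-bit length prefix. Prefixing the segment count as well is an extra assumption on my part, not something the text above requires.

```python
import hashlib
import hmac
import struct


def naive_mac(key: bytes, parts: list) -> bytes:
    """MAC over plain concatenation: segment boundaries are not authenticated."""
    return hmac.new(key, b"".join(parts), hashlib.sha256).digest()


def length_prefixed(parts: list) -> bytes:
    """Encode the segment count, then each segment preceded by its length (uint64 LE)."""
    out = struct.pack("<Q", len(parts))
    for part in parts:
        out += struct.pack("<Q", len(part)) + part
    return out


def safe_mac(key: bytes, parts: list) -> bytes:
    return hmac.new(key, length_prefixed(parts), hashlib.sha256).digest()


key = b"\x02" * 32  # demo key only
split_one = [b"aad", b"iv-and-ciphertext"]
split_two = [b"aadiv-", b"and-ciphertext"]

# Both splits concatenate to the same bytes, so the naive tags are identical.
naive_collides = naive_mac(key, split_one) == naive_mac(key, split_two)
# With length prefixes, the two splits encode to different MAC inputs.
safe_collides = safe_mac(key, split_one) == safe_mac(key, split_two)
```

    The naive construction can’t tell where one segment ends and the next begins, which is exactly the ambiguity a canonicalization attack leans on.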

    Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.

    The first cryptographic flaw sort of cancels the second out.

    If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.

    MongoCrypt: The Ugly

    MongoCrypt supports deterministic encryption.

    If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.

    MongoDB documentation

    We already discussed why this is bad above.

    Wrapping Up

    This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.

    However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.

    Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.

    Art: CMYKat

    https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/

    #appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption

  22. Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but more so inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).

    At the bottom they teased:

    Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.

    Cendyne

    If you read Cendyne’s post, you may have nodded along with that remark and not appreciated the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.

    Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.

    I’m including art of my fursona in this post… as is tradition for furry blogs.

    If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.

    Thanks to CMYKat for the awesome stickers.

    Contents

    • Database Cryptography?
    • Cryptography for Relational Databases
      • The Perils of Built-in Encryption Functions
      • Application-Layer Relational Database Cryptography
        • Confused Deputies
        • Canonicalization Attacks
        • Multi-Tenancy
    • Cryptography for NoSQL Databases
      • NoSQL is Built Different
      • Record Authentication
        • Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
    • Searchable Encryption
      • Order-{Preserving, Revealing} Encryption
      • Deterministic Encryption
      • Homomorphic Encryption
      • Searchable Symmetric Encryption (SSE)
      • You Can Have a Little HMAC, As a Treat
    • Intermission
    • Case Study: MongoDB Client-Side Encryption
      • MongoCrypt: The Good
        • How is Queryable Encryption Implemented?
      • MongoCrypt: The Bad
      • MongoCrypt: The Ugly
    • Wrapping Up

    Database Cryptography?

    The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.

    The consequences of this simple premise are anything but simple. Let me explain.

    Art: ScruffKerfluff

    The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible to support multiple types of workloads.

    Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.

    But we’re not settling for mere compliance on this blog. Furries have standards, after all.

    So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:

    1. What database software is being used? Is it up to date?
    2. What data is being stored in which database software?
    3. How are databases oriented in the network of the overall system?
      • Is your database properly firewalled from the public Internet?
    4. How does data flow throughout the network, and when do these data flows intersect with the database?
      • Which applications talk to the database? What languages are they written in? Which APIs do they use?
    5. How will cryptography secrets be managed?
      • Is there one key for everyone, one key per tenant, etc.?
      • How are keys rotated?
      • Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?

    The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.

    (This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)

    The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.

    Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.

    If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know they weren’t tampered with at rest.”

    My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:

    1. On one axis, we have the type of database.
      • Flat-file
      • Relational
      • Schema-free
    2. On another, we have the basic confidentiality requirements of the data.
      • Field encryption
      • Row encryption
      • Column encryption
      • Unstructured record encryption
      • Encrypting entire collections of records
    3. Finally, we have the integrity requirements of the data.
      • Field authentication
      • Row/column authentication
      • Unstructured record authentication
      • Collection authentication (based on e.g. Sparse Merkle Trees)

    And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.

    Why store data in a database if you have no way to index or search the data for fast retrieval?

    Credit: Harubaki

    If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.

    Some just phone it in with AES_Encrypt() calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)

    Which brings us to the meat of this blog post: The actual cryptography part.

    Cryptography is the art of transforming information security problems into key management problems.

    Former coworker

    Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.

    Cryptography for Relational Databases

    Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased in the introduction.

    You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?

    But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.

    So let’s start with a simple question: where and how are you performing your encryption?

    The Perils of Built-in Encryption Functions

    MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.

    It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:

    Art by Riley, encrypted with AES-ECB

    The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).

    ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.

    A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.

    ECB art isn’t to be confused with ECB poetry, which looks like this:

    Oh little one, you’re growing up
    You’ll soon be writing C
    You’ll treat your ints as pointers
    You’ll nest the ternary
    You’ll cut and paste from github
    And try cryptography
    But even in your darkest hour
    Do not use ECB

    CBC’s BEASTly when padding’s abused
    And CTR’s fine til a nonce is reused
    Some say it’s a CRIME to compress then encrypt
    Or store keys in the browser (or use javascript)
    Diffie Hellman will collapse if hackers choose your g
    And RSA is full of traps when e is set to 3
    Whiten! Blind! In constant time! Don’t write an RNG!
    But failing all, and listen well: Do not use ECB

    They’ll say “It’s like a one-time-pad!
    The data’s short, it’s not so bad
    the keys are long–they’re iron clad
    I have a PhD!”
    And then you’re front page Hacker News
    Your passwords cracked–Adobe Blues.
    Don’t leave your penguins showing through,
    Do not use ECB

    — Ben Nagy, PoC||GTFO 0x04:13

    Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.

    Also, SHOW processlist; leaks your encryption keys. Oops.

    Credit: CMYKat

    Application-layer Relational Database Cryptography

    Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.

    And, yes, that’s part of the answer. But how you encrypt is important.

    Credit: Harubaki

    “I’ll encrypt with CBC mode.”
    If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?

    “Okay, fine, I’ll use an authenticated mode like GCM.”
    Did you remember to make the table and column name part of your AAD? What about the primary key of the record?

    “What on Earth are you talking about, Soatok?”
    Welcome to the first footgun of database cryptography!

    Confused Deputies

    Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.

    To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?

    We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.

    What, then, stops an attacker from copying ciphertexts around?

    Credit: CMYKat

    Let’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.

    All I would need to do is simply…

    UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345

    …and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and it will just decrypt. Even if you’re using an authenticated mode.

    We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.

    The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)

    - $addr = aes_gcm_encrypt($addr, $key);
    + $addr = aes_gcm_encrypt($addr, $key, canonicalize([
    +     $tableName,
    +     $columnName,
    +     $primaryKey
    + ]));

    Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
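
    Here’s a runnable Python sketch of that behavior. Big caveat: Python’s standard library has no AES-GCM, so seal() and open_sealed() below are a toy Encrypt-then-MAC stand-in (SHA-256 keystream plus HMAC-SHA256) that exists only to show how the AAD binds a ciphertext to its location. Don’t ship it. The context() helper, which JSON-encodes the table, column, and primary key, is likewise just one assumed way to get an unambiguous encoding.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real cipher."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def _subkeys(key: bytes):
    """Derive separate encryption and MAC keys from the master key."""
    return (hmac.new(key, b"enc", hashlib.sha256).digest(),
            hmac.new(key, b"mac", hashlib.sha256).digest())


def _tag(mac_key: bytes, aad: bytes, nonce: bytes, ct: bytes) -> bytes:
    # The AAD is length-prefixed so it cannot bleed into the nonce/ciphertext.
    msg = len(aad).to_bytes(8, "big") + aad + nonce + ct
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def seal(key: bytes, plaintext: bytes, aad: bytes) -> bytes:
    enc_key, mac_key = _subkeys(key)
    nonce = os.urandom(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(p ^ k for p, k in zip(plaintext, stream))
    return nonce + ct + _tag(mac_key, aad, nonce, ct)


def open_sealed(key: bytes, blob: bytes, aad: bytes) -> bytes:
    enc_key, mac_key = _subkeys(key)
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, _tag(mac_key, aad, nonce, ct)):
        raise ValueError("authentication failed")
    return bytes(c ^ k for c, k in zip(ct, _keystream(enc_key, nonce, len(ct))))


def context(table: str, column: str, primary_key) -> bytes:
    """Hypothetical canonical encoding of where the ciphertext lives."""
    return json.dumps([table, column, primary_key], separators=(",", ":")).encode()


key = os.urandom(32)
victim_blob = seal(key, b"123 Example St", context("users", "addr_encrypted", 67890))

# The attacker copies the victim's ciphertext into their own row (id 12345).
try:
    open_sealed(key, victim_blob, context("users", "addr_encrypted", 12345))
    copy_detected = False
except ValueError:
    copy_detected = True
```

    Decrypting in the original location still works; only the relocated copy fails, because the tag was computed over a different table/column/primary key.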

    This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:

    Where your data lives is part of its identity, and MUST be authenticated.

    Soatok’s Rule of Database Cryptography

    Canonicalization Attacks

    In the previous section, I introduced a pseudocode function called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.

    First, consider you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…

    function dumbCanonicalize(
        string $tableName,
        string $columnName,
        string|int $primaryKey
    ): string {
        return $tableName . '_' . $columnName . '#' . $primaryKey;
    }

    Consider these two inputs to this function:

    1. dumbCanonicalize('customers', 'last_order_uuid', 123);
    2. dumbCanonicalize('customers_last_order', 'uuid', 123);

    In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
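
    If you’d rather see the collision than take it on faith, here’s a direct Python port of dumbCanonicalize() (the snake_case name is just the Python spelling):

```python
def dumb_canonicalize(table_name: str, column_name: str, primary_key) -> str:
    """Same naive string-joining logic as the PHP version above."""
    return f"{table_name}_{column_name}#{primary_key}"


first = dumb_canonicalize("customers", "last_order_uuid", 123)
second = dumb_canonicalize("customers_last_order", "uuid", 123)

# Two different (table, column) pairs, one identical AAD string.
same_aad = first == second
```

    Both calls produce customers_last_order_uuid#123, because the underscore separator is also a legal character inside the table and column names.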

    In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.

    Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.

    Multi-Tenancy

    Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.

    Oftentimes, however, software developers are tasked with building support for Bring Your Own Key (BYOK).

    This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).

    Other times, this is driven by a need to cut costs: Storing different users’ data in the same database server, but encrypting it such that they can only decrypt their own records.

    Two things can happen when you introduce multi-tenancy into your database cryptography designs:

    1. Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
    2. Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.

    So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.

    Isn’t cryptography fun?

    “What Are Invisible Salamanders?”

    This refers to a fun property of AEAD modes based on Polynomial MACs. Basically, if you:

    1. Encrypt one message under a specific key and nonce.
    2. Encrypt another message under a separate key and nonce.

    …Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.

    This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature because another attached file had the same authentication tag and ciphertext, and you could trick the system if you disclosed the second key instead of the first. Thus, the salamander is invisible to moderators.

    Art: CMYKat

    We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.

    Cryptography for NoSQL Databases

    Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.

    NoSQL is Built Different

    The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.

    What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.

    Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.

    Consider the following record structure, which stores a bunch of data encrypted with AES in CBC mode:

    {
      "encrypted-data-key": "<blob>",
      "name": "<ciphertext>",
      "address": [
        "<ciphertext>",
        "<ciphertext>"
      ],
      "social-security": "<ciphertext>",
      "zip-code": "<ciphertext>"
    }

    If this record is decrypted with code that looks something like this:

    $decrypted = [];
    // ... snip ...
    foreach ($record['address'] as $i => $addrLine) {
        try {
            $decrypted['address'][$i] = $this->decrypt($addrLine);
        } catch (Throwable $ex) {
            // You'd never deliberately do this, but it's for illustration
            $this->doSomethingAnOracleCanObserve($i);

            // This is more believable, of course:
            $this->logDecryptionError($ex, $addrLine);
            $decrypted['address'][$i] = '';
        }
    }

    Then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.

    Art: Harubaki

    This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.

    Record Authentication

    A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.

    Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.

    The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.

    This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.

    Except…

    When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.

    Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.

    And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)

    I don’t have any specific real-world examples here that I can point to of this problem being solved well.

    Art: CMYKat

    Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.

    As I said with encryption before, but also true here:

    Where your data lives is part of its identity, and MUST be authenticated.

    Soatok’s Rule of Database Cryptography

    This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
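To make the rule concrete, here is a minimal sketch of binding a field's location to its authentication tag with HMAC-SHA256. The function names, the dotted path format, and the little-endian uint64 length prefix are all assumptions chosen for illustration, not something the systems discussed here prescribe.

```python
import hashlib
import hmac
import struct


def encode_part(data: bytes) -> bytes:
    """Prefix a segment with its length as a little-endian uint64."""
    return struct.pack("<Q", len(data)) + data


def authenticate_field(key: bytes, path: str, value: bytes) -> bytes:
    """HMAC-SHA256 over the fully qualified path, then the value."""
    message = encode_part(path.encode("utf-8")) + encode_part(value)
    return hmac.new(key, message, hashlib.sha256).digest()


demo_key = bytes(range(32))  # demo only; generate real keys randomly
tag = authenticate_field(demo_key, "patients.17.lab_result", b"negative")

# The same value relocated to another record produces a different tag,
# so copying (value, tag) from record 17 to record 18 fails verification.
relocated = authenticate_field(demo_key, "patients.18.lab_result", b"negative")
same = hmac.compare_digest(tag, relocated)
```

Because the path is part of the MAC input, an attacker who copies a valid value and tag into a different record or field ends up with a tag that no longer verifies.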

    Bonus: A Maximally Schema-Free, Upgradeable Authentication Design

    Art: Harubaki

    Okay, how do you solve this problem so that you can perform updates and upgrades to your schema but without enabling attackers to downgrade the security? Here’s one possible design.

    Let’s say you have two metadata fields on each record:

    1. A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this meta-auth.
    2. A compressed binary string representing which of the authenticated fields should also be encrypted. Unlike meta-auth, this field is authenticated. This is at most the same length as the first metadata field. Let’s call this meta-enc.

    Furthermore, you will specify a canonical field ordering for both how data is fed into the signature algorithm as well as the field mappings in meta-auth and meta-enc.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-enc */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false  /* example.superfluous.rewards-member */
      ]),
      "signature": /* -- snip -- */
    }
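The compress_bools() helper in the example above is never defined, so here is one hypothetical way it could work: pack the flags into bytes, most significant bit first. The one-byte count prefix is an assumption of this sketch, added so that padding bits in the last byte can't be mistaken for flags.

```python
from __future__ import annotations


def compress_bools(flags: list[bool]) -> bytes:
    """Pack booleans into bytes, first flag in the most significant bit.

    The leading count byte is an assumption of this sketch (max 255 flags).
    """
    if len(flags) > 255:
        raise ValueError("too many flags for a one-byte count")
    out = bytearray([len(flags)])
    for start in range(0, len(flags), 8):
        byte = 0
        for offset, flag in enumerate(flags[start:start + 8]):
            if flag:
                byte |= 0x80 >> offset
        out.append(byte)
    return bytes(out)


def decompress_bools(packed: bytes) -> list[bool]:
    """Inverse of compress_bools()."""
    count = packed[0]
    return [bool(packed[1 + i // 8] & (0x80 >> (i % 8))) for i in range(count)]


# The meta-auth flags from the example record: 1110 1000 -> 0xE8
meta_auth = compress_bools([True, True, True, False, True])
```

A real design would probably want a wider count or a proper bitset format; the point is only that the mapping from canonical field order to bits has to be unambiguous.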

    When you go to append data to an existing record, you’ll need to update meta-auth to include the mapping of fields based on this canonical ordering to ensure only the intended fields get validated.

    When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:

    • New records will have the additional field flagged as authenticated in meta-auth (and meta-enc will grow)
    • Old records will not, but your code will still sign them successfully
    • To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.

    You might think meta-auth gives an attacker some advantage, but this only includes which fields are included in the security boundary of the signature or MAC, which allows unauthenticated data to be appended for whatever operational purpose without having to update signatures or expose signing keys to a wider part of the network.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true,  /* meta-enc */
        true   /* meta-version */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-version */
      ]),
      "meta-version": 0x01000000,
      "signature": /* -- snip -- */
    }

    If an attacker tries to use the meta-auth field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).

    Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
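Here is a rough sketch of how verification under this design might look, using HMAC-SHA256 in place of a digital signature. The flat path-to-bytes record layout, the CANONICAL_PATHS list, and the function names are hypothetical; meta-auth is passed as an uncompressed list of booleans to keep things short.

```python
from __future__ import annotations

import hashlib
import hmac
import struct

# Hypothetical canonical field ordering; new fields are only ever appended.
CANONICAL_PATHS = [
    "example.credit-card.number",
    "example.credit-card.expiration",
    "example.credit-card.ccv",
    "example.superfluous.rewards-member",
    "meta-enc",
    "meta-version",
]


def lp(data: bytes) -> bytes:
    """Length-prefix a segment (little-endian uint64)."""
    return struct.pack("<Q", len(data)) + data


def record_tag(key: bytes, fields: dict[str, bytes], meta_auth: list[bool]) -> bytes:
    """MAC every flagged field together with its path, in canonical order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    # zip() stops at the shorter input, so old records with a shorter
    # meta-auth simply don't cover fields added later.
    for path, flagged in zip(CANONICAL_PATHS, meta_auth):
        if flagged:
            mac.update(lp(path.encode("utf-8")))
            mac.update(lp(fields[path]))
    return mac.digest()


def verify(key: bytes, fields: dict[str, bytes], meta_auth: list[bool], tag: bytes) -> bool:
    try:
        expected = record_tag(key, fields, meta_auth)
    except KeyError:  # a flagged field is missing from the record
        return False
    return hmac.compare_digest(expected, tag)


key = bytes(range(32))  # demo only
fields = {
    "example.credit-card.number": b"<ciphertext>",
    "example.credit-card.expiration": b"<ciphertext>",
    "example.credit-card.ccv": b"<ciphertext>",
    "meta-enc": b"<packed flags>",
    "meta-version": struct.pack(">I", 0x01000000),
}
meta_auth = [True, True, True, False, True, True]
tag = record_tag(key, fields, meta_auth)

# Appending an unauthenticated field leaves the tag valid.
fields["example.superfluous.rewards-member"] = b"gold"
appended_ok = verify(key, fields, meta_auth, tag)

# Downgrading the version, or un-flagging it in meta-auth, does not.
downgraded = {**fields, "meta-version": struct.pack(">I", 0)}
downgrade_ok = verify(key, downgraded, meta_auth, tag)
unflagged_ok = verify(key, fields, [True, True, True, False, True, False], tag)
```

In this sketch, messing with meta-auth only changes what gets fed into the MAC, so the stored tag stops matching and the record is rejected.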

    Searchable Encryption

    If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.

    Art: CMYKat

    Okay, put away the pitchforks; I will explain.

    Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.

    And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.

    Art: Sophie

    So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:

    1. Encrypting the data to protect it from unauthorized disclosure, but rendering it useless
    2. Doing anything useful with the data, but leaving it unencrypted in the database

    This is especially annoying for business types that are all in on the Zero Trust buzzword.

    Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.

    Order-{Preserving, Revealing} Encryption

    On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.

    [D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.

    Attack of the week: searchable encryption and the ever-expanding leakage function

    The problem with these designs is that they leak enough information that they no longer provide semantic security.

    From Grubbs, et al. (GLMP, 2019.)
    Colors inverted to fit my blog’s theme better.

    To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.

    Order revealing
    Reveals much more than order
    Softcore ECB

    Order preserving
    Semantic security?
    Only in your dreams

    Haiku for your consideration

    Deterministic Encryption

    Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.

    If you recall the AES_{De,En}crypt() functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.

     SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');

    However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can encrypt a small number of distinct records safely before you’re no longer secure.

    From Page 14 of the linked paper. Full view.

    That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.

    Homomorphic Encryption

    In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated by serving Wikipedia pages in a reasonable amount of time.

    Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.

    If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
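Both of those properties can be seen with toy numbers. The sketch below uses textbook RSA with tiny, insecure parameters (p = 61, q = 53) and a random keystream standing in for the output of AES-CTR; neither is how you'd deploy these primitives, and the names are made up for illustration.

```python
import os

# Textbook RSA, no padding: n = 61 * 53, e * d = 1 mod (60 * 52).
N, E, D = 3233, 17, 2753


def rsa_encrypt(m: int) -> int:
    return pow(m, E, N)


def rsa_decrypt(c: int) -> int:
    return pow(c, D, N)


# Multiplying ciphertexts multiplies the plaintexts (mod N).
product_ct = (rsa_encrypt(7) * rsa_encrypt(3)) % N
product_pt = rsa_decrypt(product_ct)  # 7 * 3 = 21


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# A stream cipher like AES-CTR XORs the plaintext with a keystream.
# Random bytes stand in for that keystream here.
keystream = os.urandom(7)
ciphertext = xor_bytes(b"PAY 100", keystream)

# XORing a difference into the ciphertext XORs it into the plaintext.
delta = xor_bytes(b"PAY 100", b"PAY 900")
tampered = xor_bytes(ciphertext, delta)
recovered = xor_bytes(tampered, keystream)  # b"PAY 900"
```

The second half is also why unauthenticated CTR mode is malleable: whoever holds the ciphertext can compute on the plaintext without ever learning the key.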

    Art: Harubaki

    Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.

    However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.

    Just say “NO” to the copyright cartels.

    Art: CMYKat

    Searchable Symmetric Encryption (SSE)

    Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?

    Art: CMYKat

    That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.

    The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.

    Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.

    Some schemes only allow for searching a static set of records, while others can accommodate new data over time (with the trade-off between more leakage or worse performance).

    If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.

    You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”

    Your guess is as good as mine, but I can think of a few reasons:

    1. The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (e.g. OpenSSL’s libcrypto or libsodium).
    2. Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
    3. Insufficient marketing and developer advocacy.
      SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.

    Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.

    You Can Have Little a HMAC, As a Treat

    Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.

    The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.

    The idea I’m alluding to goes like this:

    1. Encrypt your data, securely, using symmetric cryptography.
      (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
    2. Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.

    When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?

    Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
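A minimal sketch of that lookup flow, using only Python's standard library. The row layout, the column names, and the placeholder ciphertexts are assumptions for illustration; the actual encryption step is out of scope here.

```python
import hashlib
import hmac


def blind_index(index_key: bytes, plaintext: str) -> str:
    """Full HMAC-SHA256 of the plaintext under a dedicated indexing key."""
    return hmac.new(index_key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


index_key = bytes(range(32))  # demo only; keep this separate from encryption keys

# Each row stores the ciphertext alongside the index of its plaintext.
rows = [
    {"email_ct": b"<ciphertext 1>", "email_idx": blind_index(index_key, "alice@example.com")},
    {"email_ct": b"<ciphertext 2>", "email_idx": blind_index(index_key, "bob@example.com")},
]


def find_by_email(email: str) -> list:
    """Recompute the HMAC of the query and match it against stored indexes."""
    needle = blind_index(index_key, email)
    return [row for row in rows if hmac.compare_digest(row["email_idx"], needle)]


matches = find_by_email("bob@example.com")
```

In a real database, the comparison would be a plain `WHERE email_idx = ?` against an indexed column, which is where the fast literal lookups come from.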

    Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?

    Hooray!

    However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.

    How annoying!

    Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.

    Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.

    “K-What Now?”

    Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.

    Each HMAC output corresponds 1:1 with a unique plaintext.

    Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.

    For example, an HMAC-SHA256 output might look like this: 04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae

    Art: CMYKat

    If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields a8b1ae), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.

    Which means if you’re only revealing a truncated HMAC tag to the database server (both when storing records or retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.

    These false positives give your data a discrete set of anonymity (called k-anonymity), which means an attacker with access to your database cannot:

    1. Distinguish between two encrypted records with the same short HMAC tag.
    2. Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
    Art: CMYKat

    As with SSE above, this short HMAC technique exposes a trade-off to users.

    • Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
    • Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.

    Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
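To show what truncating in bits looks like, here is a small sketch. It keeps the trailing bits of the tag, matching the "last 3 bytes" example above. The helper names are made up, and the estimate assumes tags are spread uniformly across all possible truncated values.

```python
import hashlib
import hmac


def truncate_tag(full_tag: bytes, bits: int) -> int:
    """Keep only the last `bits` bits of a tag, as an integer."""
    if not 1 <= bits <= len(full_tag) * 8:
        raise ValueError("bits out of range for this tag")
    return int.from_bytes(full_tag, "big") & ((1 << bits) - 1)


def truncated_index(index_key: bytes, plaintext: bytes, bits: int) -> int:
    """Truncated HMAC-SHA256 of the plaintext under the indexing key."""
    full_tag = hmac.new(index_key, plaintext, hashlib.sha256).digest()
    return truncate_tag(full_tag, bits)


def expected_plaintexts_per_tag(distinct_plaintexts: int, bits: int) -> float:
    """Average number of distinct plaintexts sharing one truncated tag."""
    return distinct_plaintexts / (2 ** bits)


# The example tag from earlier, cut down to its last 24 bits (3 bytes).
example = bytes.fromhex(
    "04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae"
)
short_tag = truncate_tag(example, 24)  # 0xa8b1ae

# One million distinct values at 16 bits: about 15 plaintexts per tag.
crowd = expected_plaintexts_per_tag(1_000_000, 16)
```

Each extra bit halves the expected crowd size, which is why the right length tracks the logarithm of the number of distinct values.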

    If you’d like to play with this idea, here’s a quick and dirty demo script.

    Intermission

    If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, by making it to this point, they’ve probably been long since put to rest.

    Art: Harubaki

    Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting; their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.

    Thus, if you’ve read this far, I’d like to apply what I’ve covered so far to a real-world case study of a database cryptography product.

    Case Study: MongoDB Client-Side Encryption

    MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.

    Taken from the press release, but adapted for dark themes.

    A statement at the bottom of their press release indicates that this isn’t clown-shoes:

    Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.

    If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.

    So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.

    MongoCrypt: The Good

    MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).

    MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.

    How Is Queryable Encryption Implemented?

    From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).

    According to their press release:

    The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.

    MongoDB Queryable Encryption Announcement

    Which means that most of the juicy details probably aren’t public yet.

    These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.

    There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.

    Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.

    However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.

    MongoCrypt: The Bad

    Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.

    However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.

    First, observe this code which assembles the multi-part inputs into HMAC.

    /* Construct the input to the HMAC */
    uint32_t num_intermediates = 0;
    _mongocrypt_buffer_t intermediates[3];
    // -- snip --
    if (!_mongocrypt_buffer_concat (
          &to_hmac, intermediates, num_intermediates)) {
       CLIENT_ERR ("failed to allocate buffer");
       goto done;
    }
    if (hmac == HMAC_SHA_512_256) {
       uint8_t storage[64];
       _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};
       if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
          goto done;
       }
       // Truncate sha512 to first 256 bits.
       memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
    } else {
       BSON_ASSERT (hmac == HMAC_SHA_256);
       if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
          goto done;
       }
    }

    The implementation of _mongocrypt_buffer_concat() can be found here.

    If either the implementation of that function, or the code I snipped from my excerpt, had contained code that prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.

    Using TupleHash would also have prevented this issue.
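The difference between plain concatenation and length-prefixed concatenation is easy to see in a few lines. This is a generic Python sketch of the idea, not libmongocrypt's code, and the function names are made up for illustration.

```python
import struct


def naive_concat(parts: list) -> bytes:
    """Plain concatenation: segment boundaries are lost."""
    return b"".join(parts)


def length_prefixed_concat(parts: list) -> bytes:
    """Prefix each segment with its length as a little-endian uint64."""
    return b"".join(struct.pack("<Q", len(part)) + part for part in parts)


# Two different splits of the same bytes...
split_a = [b"ab", b"c"]
split_b = [b"a", b"bc"]

# ...collide when simply concatenated, so their MACs would collide too.
naive_collides = naive_concat(split_a) == naive_concat(split_b)

# With length prefixes, the encodings (and therefore the MACs) differ.
prefixed_collides = length_prefixed_concat(split_a) == length_prefixed_concat(split_b)
```

Some encodings also prefix the number of segments. That matters when the segment count can vary, and it costs almost nothing to add.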

    Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.

    The first cryptographic flaw sort of cancels the second out.

    If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.

    MongoCrypt: The Ugly

    MongoCrypt supports deterministic encryption.

    If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.

    MongoDB documentation

    We already discussed why this is bad above.

    Wrapping Up

    This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.

    However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.

    Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.

    Art: CMYKat

    https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/

    #appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption

  23. Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but moreso inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).

    At the bottom they teased:

    Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.

    Cendyne

    If you read Cendyne’s post, you may have nodded along with that remark and not appreciated the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.

    Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.

    I’m including art of my fursona in this post… as is tradition for furry blogs.

    If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.

    Thanks to CMYKat for the awesome stickers.

    Contents

    • Database Cryptography?
    • Cryptography for Relational Databases
      • The Perils of Built-in Encryption Functions
      • Application-Layer Relational Database Cryptography
        • Confused Deputies
        • Canonicalization Attacks
        • Multi-Tenancy
    • Cryptography for NoSQL Databases
      • NoSQL is Built Different
      • Record Authentication
        • Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
    • Searchable Encryption
      • Order-{Preserving, Revealing} Encryption
      • Deterministic Encryption
      • Homomorphic Encryption
      • Searchable Symmetric Encryption (SSE)
      • You Can Have Little a HMAC, As a Treat
    • Intermission
    • Case Study: MongoDB Client-Side Encryption
      • MongoCrypt: The Good
        • How is Queryable Encryption Implemented?
      • MongoCrypt: The Bad
      • MongoCrypt: The Ugly
    • Wrapping Up

    Database Cryptography?

    The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.

    The consequences of this simple premise are anything but simple. Let me explain.

    Art: ScruffKerfluff

    The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible to support multiple types of workloads.

    Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.

    But we’re not settling for mere compliance on this blog. Furries have standards, after all.

    So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:

    1. What database software is being used? Is it up to date?
    2. What data is being stored in which database software?
    3. How are databases oriented in the network of the overall system?
      • Is your database properly firewalled from the public Internet?
    4. How does data flow throughout the network, and when do these data flows intersect with the database?
      • Which applications talk to the database? What languages are they written in? Which APIs do they use?
    5. How will cryptography secrets be managed?
      • Is there one key for everyone, one key per tenant, etc.?
      • How are keys rotated?
      • Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?

    The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.

    (This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)

    The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.

    Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.

    If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know they weren’t tampered with at rest.”

    My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:

    1. On one axis, we have the type of database.
      • Flat-file
      • Relational
      • Schema-free
    2. On another, we have the basic confidentiality requirements of the data.
      • Field encryption
      • Row encryption
      • Column encryption
      • Unstructured record encryption
      • Encrypting entire collections of records
    3. Finally, we have the integrity requirements of the data.
      • Field authentication
      • Row/column authentication
      • Unstructured record authentication
      • Collection authentication (based on e.g. Sparse Merkle Trees)

    And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.

    Why store data in a database if you have no way to index or search the data for fast retrieval?

    Credit: Harubaki

    If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.

    Some just phone it in with AES_Encrypt() calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)

    Which brings us to the meat of this blog post: The actual cryptography part.

    Cryptography is the art of transforming information security problems into key management problems.

    Former coworker

    Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.

    Cryptography for Relational Databases

    Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased in the introduction.

    You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?

    But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.

    So let’s start with a simple question: where and how are you performing your encryption?

    The Perils of Built-in Encryption Functions

    MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.

    It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:

    Art by Riley, encrypted with AES-ECB

    The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).

    ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.

    A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.

    ECB art isn’t to be confused with ECB poetry, which looks like this:

    Oh little one, you’re growing up
    You’ll soon be writing C
    You’ll treat your ints as pointers
    You’ll nest the ternary
    You’ll cut and paste from github
    And try cryptography
    But even in your darkest hour
    Do not use ECB

    CBC’s BEASTly when padding’s abused
    And CTR’s fine til a nonce is reused
    Some say it’s a CRIME to compress then encrypt
    Or store keys in the browser (or use javascript)
    Diffie Hellman will collapse if hackers choose your g
    And RSA is full of traps when e is set to 3
    Whiten! Blind! In constant time! Don’t write an RNG!
    But failing all, and listen well: Do not use ECB

    They’ll say “It’s like a one-time-pad!
    The data’s short, it’s not so bad
    the keys are long–they’re iron clad
    I have a PhD!”
    And then you’re front page Hacker News
    Your passwords cracked–Adobe Blues.
    Don’t leave your penguins showing through,
    Do not use ECB

    — Ben Nagy, PoC||GTFO 0x04:13

    Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.

    Also, SHOW processlist; leaks your encryption keys. Oops.

    Credit: CMYKatt

    Application-layer Relational Database Cryptography

    Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.

    And, yes, that’s part of the answer. But how you encrypt is important.

    Credit: Harubaki

    “I’ll encrypt with CBC mode.”
    If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?

    “Okay, fine, I’ll use an authenticated mode like GCM.”
    Did you remember to make the table and column name part of your AAD? What about the primary key of the record?

    “What on Earth are you talking about, Soatok?”
    Welcome to the first footgun of database cryptography!

    Confused Deputies

    Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.

    To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?

    We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.

    What, then, stops an attacker from copying ciphertexts around?

    Credit: CMYKatt

    Let’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.

    All I would need to do is simply…

    UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345

    …and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and it will just decrypt. Even if you’re using an authenticated mode.

    We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.

    The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)

    - $addr = aes_gcm_encrypt($addr, $key);
    + $addr = aes_gcm_encrypt($addr, $key, canonicalize([
    +     $tableName,
    +     $columnName,
    +     $primaryKey
    + ]));

    Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
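    To make the copy-and-paste failure concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: toy_encrypt(), toy_decrypt() and context() are hypothetical names, and the "AEAD" is a stdlib-only Encrypt-then-MAC stand-in built from HMAC-SHA256, not AES-GCM. Real code should call a vetted AEAD from a cryptography library and pass the same kind of context as its AAD.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """HMAC-SHA256 in counter mode; a stand-in stream cipher for this demo only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _tag(mac_key: bytes, aad: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    # Length-prefix the AAD so the AAD/nonce/ciphertext boundaries are unambiguous.
    msg = len(aad).to_bytes(8, "big") + aad + nonce + ciphertext
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def toy_encrypt(plaintext: bytes, enc_key: bytes, mac_key: bytes, aad: bytes) -> bytes:
    nonce = os.urandom(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    return nonce + ciphertext + _tag(mac_key, aad, nonce, ciphertext)


def toy_decrypt(blob: bytes, enc_key: bytes, mac_key: bytes, aad: bytes) -> bytes:
    if len(blob) < 48:
        raise ValueError("message too short")
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    # Verify the tag (which covers the AAD) before touching the ciphertext.
    if not hmac.compare_digest(tag, _tag(mac_key, aad, nonce, ciphertext)):
        raise ValueError("authentication failed: wrong key, wrong context, or tampering")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))


def context(table: str, column: str, primary_key: int) -> bytes:
    # Stand-in for canonicalize(): a JSON array keeps the three parts distinct.
    return json.dumps([table, column, primary_key]).encode()


enc_key, mac_key = os.urandom(32), os.urandom(32)
victim = toy_encrypt(b"123 Main St", enc_key, mac_key, context("users", "addr_encrypted", 67890))

# Decrypting in the row it was written for works.
in_place = toy_decrypt(victim, enc_key, mac_key, context("users", "addr_encrypted", 67890))

# Pasting the same blob into row 12345 changes the AAD, so decryption fails.
try:
    toy_decrypt(victim, enc_key, mac_key, context("users", "addr_encrypted", 12345))
    copied_ok = True
except ValueError:
    copied_ok = False
```

    The only thing that differs between the two decrypt calls is the primary key in the context, and that alone is enough to turn a silent leak into an error.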

    This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:

    Where your data lives is part of its identity, and MUST be authenticated.

    Soatok’s Rule of Database Cryptography

    Canonicalization Attacks

    In the previous section, I introduced a pseudocode function called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.

    First, consider you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…

    function dumbCanonicalize(
        string $tableName,
        string $columnName,
        string|int $primaryKey
    ): string {
        return $tableName . '_' . $columnName . '#' . $primaryKey;
    }

    Consider these two inputs to this function:

    1. dumbCanonicalize('customers', 'last_order_uuid', 123);
    2. dumbCanonicalize('customers_last_order', 'uuid', 123);

    In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
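    A short Python sketch of the collision, plus one common way to avoid it: prefix every part with its length. The names dumb_canonicalize() and length_prefixed() are hypothetical, and the 64-bit little-endian length encoding is an assumption rather than anything this article prescribes.

```python
import struct


def dumb_canonicalize(table: str, column: str, primary_key: int) -> bytes:
    # Mirrors the naive string join: the separators can also appear inside names.
    return f"{table}_{column}#{primary_key}".encode()


def length_prefixed(*parts: str) -> bytes:
    """Encode the part count, then each part preceded by its byte length (uint64)."""
    out = struct.pack("<Q", len(parts))
    for part in parts:
        raw = part.encode("utf-8")
        out += struct.pack("<Q", len(raw)) + raw
    return out


a = ("customers", "last_order_uuid", 123)
b = ("customers_last_order", "uuid", 123)

# The naive join maps both inputs to b"customers_last_order_uuid#123".
collides = dumb_canonicalize(*a) == dumb_canonicalize(*b)

# With length prefixes, the boundary between table and column is part of the encoding.
distinct = length_prefixed(a[0], a[1], str(a[2])) != length_prefixed(b[0], b[1], str(b[2]))
```

    Note that converting the primary key with str() still treats the integer 123 and the string "123" as the same value; if that distinction matters in your schema, encode the type as well.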

    In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.

    Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.

    Multi-Tenancy

    Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.

    Oftentimes, however, software developers are tasked with building support for Bring Your Own Key (BYOK).

    This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).

    Other times, this is driven by a need to cut costs: Storing different users’ data in the same database server, but encrypting it such that they can only decrypt their own records.

    Two things can happen when you introduce multi-tenancy into your database cryptography designs:

    1. Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
    2. Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.

    So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.

    Isn’t cryptography fun?

    “What Are Invisible Salamanders?”

    This refers to a fun property of AEAD modes based on polynomial MACs. Basically, if you:

    1. Encrypt one message under a specific key and nonce.
    2. Encrypt another message under a separate key and nonce.

    …Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.

    This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature because another attached file had the same authentication tag and ciphertext, and you could trick the system if you disclosed the second key instead of the first. Thus, the salamander is invisible to attackers.

    Art: CMYKat

    We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.

    Cryptography for NoSQL Databases

    Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.

    NoSQL is Built Different

    The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.

    What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.

    Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.

    Consider the following record structure, which stores a bunch of data encrypted with AES in CBC mode:

    {
      "encrypted-data-key": "<blob>",
      "name": "<ciphertext>",
      "address": [
        "<ciphertext>",
        "<ciphertext>"
      ],
      "social-security": "<ciphertext>",
      "zip-code": "<ciphertext>"
    }

    If this record is decrypted with code that looks something like this:

    $decrypted = [];
    // ... snip ...
    foreach ($record['address'] as $i => $addrLine) {
        try {
            $decrypted['address'][$i] = $this->decrypt($addrLine);
        } catch (Throwable $ex) {
            // You'd never deliberately do this, but it's for illustration
            $this->doSomethingAnOracleCanObserve($i);

            // This is more believable, of course:
            $this->logDecryptionError($ex, $addrLine);
            $decrypted['address'][$i] = '';
        }
    }

    Then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.

    Art: Harubaki

    This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.

    Record Authentication

    A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.

    Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.

    The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.

    This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.

    Except…

    When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.

    Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.

    And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)

    I don’t have any specific real-world examples here that I can point to of this problem being solved well.

    Art: CMYKat

    Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.

    As I said with encryption before, but also true here:

    Where your data lives is part of its identity, and MUST be authenticated.

    Soatok’s Rule of Database Cryptography

    This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
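    Here is a minimal Python sketch of what "include the fully qualified path" can look like with HMAC-SHA256. The helper names flatten() and record_tag(), the JSON encoding of each [path, value] pair, and the 8-byte length prefix are all assumptions made for illustration, not a description of any particular product.

```python
import hashlib
import hmac
import json


def flatten(doc, prefix=()):
    """Yield (path, value) for every leaf; path is a tuple of keys and list indexes."""
    if isinstance(doc, dict) and doc:
        for key in sorted(doc):
            yield from flatten(doc[key], prefix + (key,))
    elif isinstance(doc, list) and doc:
        for index, item in enumerate(doc):
            yield from flatten(item, prefix + (index,))
    else:
        # Scalars and empty containers are leaves.
        yield prefix, doc


def record_tag(key: bytes, doc: dict) -> bytes:
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for path, value in flatten(doc):
        # Each leaf goes in as length-prefixed JSON of [path, value], so a value
        # cannot move to a different path without changing the MAC input.
        item = json.dumps([list(path), value]).encode()
        mac.update(len(item).to_bytes(8, "big") + item)
    return mac.digest()


key = b"\x01" * 32
original = {"name": "Alice", "billing": {"zip": "90210"}, "shipping": {"zip": "10001"}}
swapped = {"name": "Alice", "billing": {"zip": "10001"}, "shipping": {"zip": "90210"}}

tag = record_tag(key, original)
still_valid = hmac.compare_digest(tag, record_tag(key, original))
swap_detected = not hmac.compare_digest(tag, record_tag(key, swapped))
```

    Both records contain exactly the same set of values; only where they live differs, and that is what the path in the MAC input catches.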

    Bonus: A Maximally Schema-Free, Upgradeable Authentication Design

    Art: Harubaki

    Okay, how do you solve this problem so that you can perform updates and upgrades to your schema but without enabling attackers to downgrade the security? Here’s one possible design.

    Let’s say you have two metadata fields on each record:

    1. A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this meta-auth.
    2. A compressed binary string representing which of the authenticated fields should also be encrypted. This field is also authenticated. This is at most the same length as the first metadata field. Let’s call this meta-enc.

    Furthermore, you will specify a canonical field ordering for both how data is fed into the signature algorithm as well as the field mappings in meta-auth and meta-enc.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-enc */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false  /* example.superfluous.rewards-member */
      ]),
      "signature": /* -- snip -- */
    }

    When you go to append data to an existing record, you’ll need to update meta-auth to include the mapping of fields based on this canonical ordering to ensure only the intended fields get validated.

    When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:

    • New records will have the additional field flagged as authenticated in meta-auth (and meta-enc will grow)
    • Old records will not, but your code will still verify them successfully
    • To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.

    You might think meta-auth gives an attacker some advantage, but this only includes which fields are included in the security boundary of the signature or MAC, which allows unauthenticated data to be appended for whatever operational purpose without having to update signatures or expose signing keys to a wider part of the network.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true,  /* meta-enc */
        true   /* meta-version */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        false  /* meta-version (plaintext, authenticated only) */
      ]),
      "meta-version": 0x01000000,
      "signature": /* -- snip -- */
    }

    If an attacker tries to use the meta-auth field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).

    Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
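    The compress_bools() helper in the examples above is never defined, so here is one possible shape for it in Python. The bit layout (a one-byte count followed by bits packed most-significant first), the FIELD_ORDER list, and the helper names are assumptions for illustration; any encoding works as long as writers and verifiers agree on the canonical field ordering.

```python
# Canonical field ordering shared by every reader and writer (hypothetical).
FIELD_ORDER = [
    "example.credit-card.number",
    "example.credit-card.expiration",
    "example.credit-card.ccv",
    "example.superfluous.rewards-member",
    "meta-enc",
    "meta-version",
]


def compress_bools(flags):
    """Pack booleans into bytes, most significant bit first, after a 1-byte count."""
    if len(flags) > 255:
        raise ValueError("this toy encoding supports at most 255 flags")
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes([len(flags)]) + bytes(packed)


def decompress_bools(blob):
    count, packed = blob[0], blob[1:]
    return [bool(packed[i // 8] & (0x80 >> (i % 8))) for i in range(count)]


def authenticated_fields(meta_auth):
    """Map the bitmap back to field paths; fields beyond the bitmap are unauthenticated."""
    flags = decompress_bools(meta_auth)
    return [name for name, flag in zip(FIELD_ORDER, flags) if flag]


# A new-style record flags meta-version; an old-style record's bitmap simply stops earlier.
new_meta_auth = compress_bools([True, True, True, False, True, True])
old_meta_auth = compress_bools([True, True, True, False, True])

new_fields = authenticated_fields(new_meta_auth)
old_fields = authenticated_fields(old_meta_auth)
```

    The verifier would then feed exactly these fields, each with its path, into the signature or MAC check in this order. Because the stored count says how many flags a record carries, old and new records stay self-describing.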

    Searchable Encryption

    If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.

    Art: CMYKat

    Okay, put away the pitchforks; I will explain.

    Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.

    And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.

    Art: Sophie

    So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:

    1. Encrypting the data, to protect it from unauthorized disclosure, but render it useless
    2. Doing anything useful with the data, but leaving it unencrypted in the database

    This is especially annoying for business types that are all in on the Zero Trust buzzword.

    Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.

    Order-{Preserving, Revealing} Encryption

    On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.

    [D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.

    Attack of the week: searchable encryption and the ever-expanding leakage function

    The problem with these designs is that they leak enough information that they no longer provide semantic security.

    From Grubbs, et al. (GLMP, 2019.)
    Colors inverted to fit my blog’s theme better.

    To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.

    Order revealing
    Reveals much more than order
    Softcore ECB

    Order preserving
    Semantic security?
    Only in your dreams

    Haiku for your consideration

    Deterministic Encryption

    Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.

    If you recall the AES_{De,En}crypt() functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.

     SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');

    However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can encrypt a small number of distinct records safely before you’re no longer secure.

    From Page 14 of the linked paper. Full view.

    That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.

    Homomorphic Encryption

    In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated by serving Wikipedia pages in a reasonable amount of time.

    Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.

    If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
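    Both of those partial homomorphisms fit in a few lines of Python. This is a sketch under loud assumptions: the RSA parameters are the tiny textbook values p=61 and q=53 with no padding (completely insecure), and a random byte string stands in for the AES-CTR keystream since only the XOR step matters here.

```python
import os

# Textbook RSA with toy parameters: n = 61 * 53, e * d = 1 mod 3120.
n, e, d = 3233, 17, 2753


def rsa_enc(m: int) -> int:
    return pow(m, e, n)


def rsa_dec(c: int) -> int:
    return pow(c, d, n)


# Multiplying ciphertexts multiplies the plaintexts (mod n).
m1, m2 = 7, 3
product_ct = (rsa_enc(m1) * rsa_enc(m2)) % n
rsa_result = rsa_dec(product_ct)  # (7 * 3) % 3233


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Stream ciphers such as CTR mode: XORing the ciphertext XORs the plaintext.
keystream = os.urandom(5)          # stand-in for the AES-CTR keystream
ct = xor(b"PAY 1", keystream)
delta = xor(b"PAY 1", b"PAY 9")    # the change the attacker wants to make
tampered_pt = xor(xor(ct, delta), keystream)
```

    In both cases whoever holds the ciphertext gets to compute on the plaintext without knowing the key, which is the feature in homomorphic encryption and the bug in unauthenticated encryption.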

    Art: Harubaki

    Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.

    However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.

    Just say “NO” to the copyright cartels.

    Art: CMYKat

    Searchable Symmetric Encryption (SSE)

    Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?

    Art: CMYKat

    That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.

    The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.

    Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.

    Some schemes only allow for searching a static set of records, while others can accommodate new data over time (with the trade-off between more leakage or worse performance).

    If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.

    You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”

    Your guess is as good as mine, but I can think of a few reasons:

    1. The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (e.g. OpenSSL’s libcrypto or libsodium).
    2. Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
    3. Insufficient marketing and developer advocacy.
      SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.

    Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.

    You Can Have a Little HMAC, As a Treat

    Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.

    The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.

    The idea I’m alluding to goes like this:

    1. Encrypt your data, securely, using symmetric cryptography.
      (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
    2. Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.

    When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?

    Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
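    A minimal Python sketch of the idea, using only the standard library. The blind_index() name, the in-memory list standing in for a table, and the placeholder ciphertext strings are assumptions for illustration; the encryption step itself is out of scope here.

```python
import hashlib
import hmac
import os


def blind_index(index_key: bytes, plaintext: str) -> str:
    """HMAC-SHA256 of the plaintext under a key used only for indexing."""
    return hmac.new(index_key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


index_key = os.urandom(32)  # separate from, and rotated independently of, encryption keys

# Each row stores the ciphertext plus the index; the plaintext never reaches the database.
table = [
    {"id": 1, "email_ct": "<ciphertext>", "email_idx": blind_index(index_key, "alice@example.com")},
    {"id": 2, "email_ct": "<ciphertext>", "email_idx": blind_index(index_key, "bob@example.com")},
]


def find_by_email(email: str):
    # Equivalent in spirit to: SELECT id FROM users WHERE email_idx = ?
    wanted = blind_index(index_key, email)
    return [row["id"] for row in table if row["email_idx"] == wanted]


hits = find_by_email("bob@example.com")
misses = find_by_email("carol@example.com")
```

    The database only ever compares opaque tags, so the lookup can use an ordinary index on the email_idx column.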

    Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?

    Hooray!

    However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.

    How annoying!

    Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.

    Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.

    “K-What Now?”

    Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.

    Each HMAC output corresponds 1:1 with a unique plaintext.

    Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.

    For example, an HMAC-SHA256 output might look like this: 04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae

    Art: CMYKat

    If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields a8b1ae), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.

    Which means if you’re only revealing a truncated HMAC tag to the database server (both when storing records and when retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.

    These false positives give your data a discrete set of anonymity (called k-anonymity), which means an attacker with access to your database cannot:

    1. Distinguish between two encrypted records with the same short HMAC tag.
    2. Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
    Art: CMYKat

    As with SSE above, this short HMAC technique exposes a trade-off to users.

    • Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
    • Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.

    Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
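    To put rough numbers on that trade-off, here is a Python sketch. It assumes tags are uniformly distributed, so N distinct plaintexts and a b-bit tag give about N / 2^b plaintexts per tag. The names short_index() and bits_for() are hypothetical, and keeping the leading bits (rather than the trailing bytes used in the example above) is an arbitrary choice.

```python
import hashlib
import hmac
import math


def short_index(index_key: bytes, plaintext: str, bits: int) -> int:
    """Keep only the leading `bits` bits of HMAC-SHA256, as an integer bucket ID."""
    if not 1 <= bits <= 256:
        raise ValueError("bits must be between 1 and 256")
    digest = hmac.new(index_key, plaintext.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def bits_for(distinct_plaintexts: int, wanted_per_bucket: int) -> int:
    """Largest bit count whose expected bucket occupancy N / 2**bits stays >= k."""
    return max(1, math.floor(math.log2(distinct_plaintexts / wanted_per_bucket)))


# One million distinct values, aiming for at least 16 candidates per tag.
bits = bits_for(1_000_000, 16)
expected_per_bucket = 1_000_000 / 2 ** bits

key = b"\x02" * 32
bucket = short_index(key, "alice@example.com", bits)
```

    Multiplying the dataset by a thousand only adds about ten bits, which is the logarithmic growth mentioned above.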

    If you’d like to play with this idea, here’s a quick and dirty demo script.

    Intermission

    If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, by making it to this point, they’ve probably been long since put to rest.

    Art: Harubaki

    Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting; their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.

    So, if you’ve read this far, I’d like to apply what I’ve covered to a real-world case study of a database cryptography product.

    Case Study: MongoDB Client-Side Encryption

    MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.

    Taken from the press release, but adapted for dark themes.

    A statement at the bottom of their press release indicates that this isn’t clown-shoes:

    Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.

    If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.

    So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.

    MongoCrypt: The Good

    MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).

    MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.

    How Is Queryable Encryption Implemented?

    From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).

    According to their press release:

    The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.

    MongoDB Queryable Encryption Announcement

    Which means that most of the juicy details probably aren’t public yet.

    These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.

    There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.

    Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.

    However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.

    MongoCrypt: The Bad

    Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.

    However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.

    First, observe this code which assembles the multi-part inputs into HMAC.

    /* Construct the input to the HMAC */
    uint32_t num_intermediates = 0;
    _mongocrypt_buffer_t intermediates[3];
    // -- snip --
    if (!_mongocrypt_buffer_concat (
          &to_hmac, intermediates, num_intermediates)) {
       CLIENT_ERR ("failed to allocate buffer");
       goto done;
    }
    if (hmac == HMAC_SHA_512_256) {
       uint8_t storage[64];
       _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};
       if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
          goto done;
       }
       // Truncate sha512 to first 256 bits.
       memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
    } else {
       BSON_ASSERT (hmac == HMAC_SHA_256);
       if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
          goto done;
       }
    }

    The implementation of _mongocrypt_buffer_concat() can be found here.

    If either the implementation of that function, or the code I snipped from my excerpt, had contained code that prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.

    Using TupleHash would also have prevented this issue.

    Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.

    The first cryptographic flaw sort of cancels the second out.

    If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.

    MongoCrypt: The Ugly

    MongoCrypt supports deterministic encryption.

    If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.

    MongoDB documentation

    We already discussed why this is bad above.

    Wrapping Up

    This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.

    However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.

    Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.

    Art: CMYKat

    https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/

    #appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption

  24. #TheMetalDogArticleList
    #guitarworld
    “He’d call me at four in the morning and leave a 15-minute guitar solo on my voicemail”: Serj Tankian on his collaborations with the enigmatic Buckethead – and the time they played a high school battle of the bands together
    From taxidermy-inspired music videos to playing a high school battle of the bands as fully grown adults, the System of a Down frontman recalls his artistic camaraderie with Buckethead

    guitarworld.com/news/serj-tank

    #SerjTankian #Buckethead

  25. This Thursday I'm back on hosting duties for monthly comedy night SIPS & GIGGLES in Wakefield 🍹

    9 comedians (and me) at RBT VIDEO on Northgate - mic goes hot at 8pm, free in but bring cash to chuck in the bucket for the acts.

    This month I'm welcoming Rachel Cracknell, Jamie Mcauley, Lewis Costello, Lucy Holbrook, Rachel Selkirk, Louis Etinne, Perry Martins, Annabelle Devey and Tom Douglas.

    #comedy #comedyClub #comedyNight #wakefield #westYorkshire #Yorkshire

  26. My friend, Jane, from Lake Cowichan, was the first senior citizen activist who was arrested in 2021, at Fairy Creek Blockades. She messaged before heading out to meet up with me at HQ camp. She told me she's never been arrested for protesting anything before but, due to seeing too much ecocide still happening in her golden years, declared, "Getting arrested for protecting Mother Nature is now on my bucket list" 🌲💦 💗🦅🌲🦉🌲 Jane stayed at HQ camp for 2 nights & on her 3rd day, headed out to chain herself/walker to a tripod hardblock. Jane was arrested, along with 12 others that day, after several hours. Jane is a wonderful human being, a musician, a fibers artist, a Mother & Grandmother, who loves nature very much.

    #RadicalSeniors #AwesomeElders #ClimateAction #Activists #AsianMastodon #AncientForestDefenders #BritishColumbia #StopDeforestation #StopEcocide #blockade #FairyCreekBlockade #SaveOldGrowth #WorthMoreStanding #VancouverIsland #VanIsle #PacificNorthwest #OneEarth #TreesOverGreed #PNW #StandEarth #EcoJustice #AbolishRCMPCIRG #BCpoli #BCNDP #BCForestryReform #environmentalists #ecological #ClimateChange #CarbonSink #BCOldGrowth #ProtectTrees #SilentSunday #Cascadia #RadicalSeniorCitizens #DirectAction #Resistance

  27. The Uncertainty Of It All, Heisenberg You Were Right

    As I walked out one evening,
    Walking down Urdu Gulli
    (I know friends are already suspecting
    Since I was in 9th class
    That I steal lines or wholesale,
    So let me confess those two lines
    Are mods of WH Auden’s poem
    Not telling which poem
    Do some homework).

    Returning from Minerva Coffee Shop
    After Hot and Sour Soup
    And Grilled Cheese Sandwich
    With a few chips as garnishment
    Waiters nowadays have become very caring
    He asked me if I wanted more helping of chips
    I smiled and pointed to my belly
    He gave a hearty chuckle
    This young guy who is in far better health
    Of the body, not finances,
    I ask him to get the bill,
    And then maître d’hôtel walks up to me
    And asks “Tea?”
    I say why not, and he who sees me
    At the coffee shop, alone, often,
    I know him better than he does me
    He only knows that I take tea in the end
    I know how much he earns,
    How long he has been working here
    How old he is
    He is still unmarried, etc.

    Now you are wondering,
    Where is the uncertainty in this all buddy
    All too predictable—the cheese sandwich—
    You who are not exactly slim,
    I start to protest I was once and stop
    Realizing I have to justify the title,
    Ok, read the last stanza.

    As I walked out one evening,
    Walking back down Urdu Gulli,
    I stepped on a banana peel
    With a little bit of its pulp inside
    I almost fall, and let out
    A mild gasp, either Oh My
    Or Oh God, must have been Oh God
    Because God is very much on my mind
    For too long now, in fact so long
    I keep remembering Samuel Beckett’s play
    Not telling which play, look up
    Do your homework, my God you guys are lazy
    See how God has made his appearance again
    But I digress, and upon hearing My God
    The male of the couple behind me
    Asked, “What happened Sir?”
    By the time I look up
    I find the girl/woman (his girlfriend or wife?)
    Ahead of him by a few meters
    He has lagged behind
    Does that happen these days
    The women outpacing we men
    Anyway, I digress again,
    So I point to the banana peel
    And shrug and say,
    “Can’t even sue anyone”
    He gives a chuckle
    Perhaps wondering
    Suing? What’s going on?
    And I add, maybe there’s a CCTV camera
    He says unnecessarily, “Evidence”
    And wanting to have the last word
    I say, “Yeah, the smokin’ gun”,
    And walk back to my apartment
    “Shaken and stirred” (sorry, Bond).

    #AsIWalkedOutOneEvening #MinervaCoffeeShop #Poem #Poetry #Uncertainty #WHAuden