What To Use Instead of PGP
It’s been more than five years since The PGP Problem was published, and I still hear from people who believe that using PGP (whether GnuPG or another OpenPGP implementation) is a thing they should be doing.
It isn’t.
I don’t blame individual Internet users for this confusion. There is a lot of cargo-culting around communication tools in the software community, and the evangelists for the various projects muddy the waters for the rest of us.
The part of the free and open source software community that thinks PGP is just dandy, and therefore evangelizes the hell out of it to unsuspecting people, consists of the same kind of people who happily use XMPP+OMEMO, Matrix, or weird Signal forks that remove forward secrecy and think it’s fine.
Not to mince words: The same people who believe PGP is good are also famously not great at cryptography engineering.
If you’re going to outsource your opinions on privacy technology to someone else, make sure it’s someone who has actually found vulnerabilities in cryptographic software before. Most evangelists have not.
I’m not here to litigate the demerits of PGP. The Latacora article I linked above makes the same arguments I would make today, and is a more entertaining read.
It is my opinion, as a security engineer who specializes in applied cryptography, that nobody should use PGP, because there’s virtually always a better tool for the job you want to use PGP for.
(And for the uncommon use cases, offering a secure, purpose-built replacement is a work-in-progress.)
Note: I’m deliberately being blunt in this post because literally more than a decade of softspokenness from cryptography experts has done nothing to talk users off the PGP cliff. Being direct seems more effective than being tactful.
If you want a gentler touch, ask your cryptographer. If you don’t have a cryptographer, hire one.
If you can accept that every billionaire is the result of a failed system, that’s how cryptographers feel about people using PGP.
Instead, let’s examine the “use cases” of PGP and what you should be using instead. (Some of this is redundant with the Latacora article, but I’m also writing it 5 years later, so some things have changed.)
Instead of PGP, Use This
This section contains specific tools to solve the same problems that PGP tries to solve, but better.
What makes these recommendations better than PGP?
Simply, they don’t make cryptographers want to run the other way screaming when they look under the hood. PGP does.
Some people are forced to use PGP because they work for a government that legally requires them to use PGP. In that corner case, your hands are tied by lawyers, so you don’t need to bother with what cryptographers recommend.
Signing Software Distributions
Use Sigstore.
Note that this is an ecosystem-wide consideration, not something that specific individuals must manually opt into for each of their hobby projects. The only downside to Sigstore is it hasn’t been widely adopted yet.
If you’re a Python developer, you can just use PEP 740 to get attestations with Trusted Publishers, which gives you Sigstore for free. For most developers, this is as simple as setting up a GitHub Action to publish to PyPI.
This is a developing trend: Other programming language and package management ecosystems are following suit. I expect to see Sigstore attestations baked into NPM and Maven before the next US presidential election. With any luck, your favorite programming language could be on this list too.
Sigstore doesn’t just give you a signature that you check with a long-lived public key, nor does it require you to do the Web Of Trust rigamarole.
Rather, Sigstore gives you a lot for free. Sigstore was designed around ephemeral signing certificates rather than a long-lived private key. It was purpose-built for preventing supply-chain attacks against open source software.
Combined with Reproducible Builds, Sigstore solves the triangle of secure code delivery.
Alternatively, use minisign. If your package ecosystem doesn’t support Sigstore yet, you can get by with minisign (which is signify-compatible) until they modernize.
You can also use SSH signatures, if you’d prefer. (More on that below.)
Signing Git Tags/Commits
Use SSH Signatures, not PGP signatures.
With Ed25519. Stop using RSA.
Sending Files Between Computers
Use Magic Wormhole.
You could also use SSH + rsync to do this job. That’s fine too.
Encrypting Backups
Tarsnap is the usual recommendation here.
There are a lot of other encrypted backup tools that work fine, if you don’t want to give Colin Percival your business. I don’t have a financial stake in any of them, nor have I audited them thoroughly.
Borg uses reasonable cryptography, but I haven’t had the time to review it carefully.
Kopia looks fine, but I really hate that they misuse “zero knowledge” to describe an encryption protocol (rather than a proof system). We should not reward this misbehavior by marketers.
The point is: You’ve got options.
Too many options, in my opinion, to settle for PGP.
Encrypting Application Data
Avoid: OpenPGP, OpenSSL and its competitors.
Not a lot to say here. I’ve written a lot about this over the years. Misuse-resistant cryptography libraries–especially ones that make key management less painful for users–are the way to go.
Encrypting Files
Use age.
Age is what PGP file encryption would be if PGP didn’t suck shit.
Age has two modes: Public-key encryption, and password-based key derivation.
Here’s a quick comparison table between what age offers, and what PGP uses in the installed base:
| | age | PGP |
|---|---|---|
| Data encryption mode | AEAD (ChaPoly) | CAST5 (64-bit block cipher) in CFB mode with a strippable SHA1 “MDC” |
| Key-commitment | Yes (via the header) | Pah! You wish! Dream on. PGP isn’t even AEAD. |
| Password KDF memory hard? | Yes, with scrypt. | No. |
| Vulnerable to chosen-ciphertext attacks? | No. | Yes, but PGP proponents stupidly consider this a good thing. |
| Supports 90’s-era cryptography? | No. | Yes. |
| Releases unauthenticated plaintext? | No. | Yes. |
| Uses versioned protocols rather than “cipher agility”? | Yes. | No. See: 90’s era cryptography. |
| Most common implementations are memory-safe? | Yes (Go, Rust). | No (C). |

Like, it’s not even close.
Some PGP proponents will insist that AEAD is possible now, but as long as the installed base of PGP remains backwards compatible with the lowest common denominator, that’s what your software uses.
Just use age. Or rage, if you’re a Rust enthusiast.
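To make the "memory hard" row of the comparison concrete, here is a minimal Python sketch of deriving a key from a password with scrypt, using only the standard library. This is not age's file format or its actual work factor; the function name `derive_key` and the parameters (`log_n=14`, `r=8`, `p=1`, a 16-byte salt) are assumptions picked for illustration.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, log_n: int = 14) -> bytes:
    """Derive a 32-byte key from a password with scrypt.

    The cost parameters here are illustrative assumptions, not age's settings.
    """
    n = 1 << log_n  # CPU/memory cost; must be a power of two
    r = 8           # block size
    p = 1           # parallelism
    # scrypt needs roughly 128 * n * r bytes of memory. Allow twice that so
    # the underlying OpenSSL call does not reject the parameters.
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=n,
        r=r,
        p=p,
        maxmem=256 * n * r,
        dklen=32,
    )


if __name__ == "__main__":
    salt = os.urandom(16)  # fresh random salt per encrypted file
    key = derive_key("correct horse battery staple", salt)
    print(len(key), "byte key derived")
```

The point of the memory requirement is that every password guess costs an attacker roughly that much RAM as well, which makes large-scale guessing on GPUs or custom hardware far more expensive than it is against a plain iterated hash.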
(And if you have concerns about “which age key should I trust?”, I’m already planning an age-v1 extension for the Public Key Directory project. More on that below.)
Private Messaging
Use Signal.
Security teams around the world insist that they need PGP for bug bounty submissions or security operations, but Signal does this job better than PGP ever did.
Once upon a time, you needed to give people a phone number to use Signal, but that hasn’t been the case for a long time. Still, many people have missed that memo and think it’s a requirement.
My Signal username is soatok.45. Go ahead and message me. You won’t learn my phone number that way.
In the near future, I plan on developing end-to-end encryption for direct messages on the Fediverse (including Mastodon). This is what motivated my work on the Public Key Directory to begin with.
But this is not intended to be a Signal competitor by any measure. It’s a bar-raising activity, nothing more.
I understand some people don’t like or trust Signal for whatever reason, but every single alternative that’s been suggested to Signal has offered inferior cryptography to Signal’s. So I will continue to recommend Signal.
Miscellaneous PGP Alternatives
This section contains things people think they need PGP for.
Identity Verification
I’m actively working on something better!
If you want the ability to vend a transparently verifiable public key for a given user, that’s one of the use cases for the Public Key Directory I’m designing in order to build end-to-end encryption for the Fediverse.
Although this is purpose-built for the Fediverse, I’ve deliberately included support for Auxiliary Data messages, whose formats will be specified by protocol extensions.
Rather than trying to grok the Web-of-Trust, you can simply have your software check that multiple independent Public Key Directories have verified the record, since its inclusion is published in an append-only transparency log, secured by a Merkle tree.
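If "inclusion in an append-only log, secured by a Merkle tree" sounds abstract, here is a small Python sketch of what checking an inclusion proof looks like. To be clear, this is not the Public Key Directory's wire format or API. The tree layout (unpaired nodes get promoted), the domain-separation prefixes, and every function name below are assumptions made for illustration.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Prefix leaves and interior nodes differently so one can't pose as the other.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(records):
    """Return every level of the tree, from leaf hashes up to the root."""
    level = [leaf_hash(r) for r in records]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node_hash(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(record: bytes, proof, root: bytes) -> bool:
    """Recompute the root from a record and its proof, then compare."""
    h = leaf_hash(record)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root


if __name__ == "__main__":
    log = [b"alice:key1", b"bob:key1", b"carol:key1", b"alice:key2", b"dave:key1"]
    levels = build_levels(log)
    root = levels[-1][0]
    proof = inclusion_proof(levels, 3)
    print(verify_inclusion(b"alice:key2", proof, root))
```

A client only needs the record, a handful of sibling hashes, and the published root to run the check, so doing it against several independent directories is cheap.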
My design doesn’t preclude any manual key verification, or key-signing parties, or whatever other PGP cultural weirdness you want to do with these public keys. It just establishes a baseline trustworthiness even if you’re not a paranoid computer nerd.
My project isn’t finished yet. In the meantime, you can manually check public keys when using the other recommendations on this page.
Encrypted Email
Don’t encrypt email. From the Latacora article:
Email is insecure. Even with PGP, it’s default-plaintext, which means that even if you do everything right, some totally reasonable person you mail, doing totally reasonable things, will invariably CC the quoted plaintext of your encrypted message to someone else (we don’t know a PGP email user who hasn’t seen this happen). PGP email is forward-insecure. Email metadata, including the subject (which is literally message content), are always plaintext.
There isn’t a recommendation for encrypted email because that’s not a thing people should be doing.
Now, there exists a minority of extremely technical computer users for whom Signal is a nonstarter (because you need a smartphone and valid phone number to enroll in the first place).
Because those people are generally not the highest priority of cryptographers (who are commonly focused on the privacy of common folk–including people in poor and developing countries where smartphones are more common than desktop computers), there presently isn’t really a good recommendation for private messaging that meets their constraints.
Certainly not PGP, either.
What PGP offers here is security theater: the illusion of safety. But it’s not actually a robust private communication mechanism, as Latacora argues.
“I insist that I need encrypted email!”
If you find someone insisting that they “need” encrypted email, read up on the XY Problem. In a lot of cases, that’s what’s happening here.
Do they ipso facto need email (as in, specifically the email protocols and email software)?
And do they care more about this constraint, or the privacy of their communications?
Because if their goal is just to communicate privately, see above.
If the tool they’re using being email is more important than privacy, they should consider sending empty messages with an attachment, and use age to encrypt the actual message before attaching it.
That’s serviceable, just beware that everything Latacora wrote about encrypted emails still applies to your use case, so expect someone to CC or forward your message as plaintext.
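As a rough sketch of the "empty message plus age-encrypted attachment" workflow, the Python below builds the command line for the `age` CLI (using only its `-r` recipient and `-o` output flags) and wraps already-encrypted bytes in an email with no body and no subject. The helper names, file names, and addresses are all hypothetical. Actually running `age` and sending the mail are left out.

```python
from email.message import EmailMessage


def age_encrypt_argv(recipient: str, plaintext_path: str, ciphertext_path: str):
    """Build the argv for encrypting a file to one age recipient.

    Pass the result to subprocess.run(...) on a machine with age installed.
    """
    return ["age", "-r", recipient, "-o", ciphertext_path, plaintext_path]


def build_message(sender: str, to: str, ciphertext: bytes,
                  filename: str = "message.age") -> EmailMessage:
    """Wrap age ciphertext in an email that carries nothing else.

    No body and no Subject header are set, because both would travel
    as plaintext.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg.add_attachment(
        ciphertext,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg


if __name__ == "__main__":
    print(age_encrypt_argv("age1examplerecipient", "note.txt", "note.txt.age"))
    msg = build_message("me@example.com", "you@example.com", b"(ciphertext bytes)")
    print(msg.get_content_type())
```

Note what this does not fix: the sender, recipient, and timing are still visible to every mail server along the way, and nothing stops the recipient from replying with the decrypted text.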
(Unless you’re legally required to use PGP because of a government regulation… in which case, why do you care about my recommendations if you’re chained by the ankle to your government’s bad technology choices?)
Finally, miss me with the “but someone can screenshot Signal” genre of objections.
As Latacora noted, people accidentally fuck up PGP all the time! It’s very easy to do.
Conversely, you have to deliberately leak something from Signal. There is no plaintext mode.
That’s the fucking bar you need to meet to compete with Signal.
PGP fails to be a Signal competitor, in ways that are worse than Threema, Matrix, or OMEMO.
Watch This Space
With all that said, I am actually designing an encrypted messaging protocol that will have an email-like user experience, except:
- Everything is always end-to-end encrypted, with forward secrecy.
- It’s not backwards compatible with insecure email.
- It doesn’t use PGP, or any 1990’s era cryptography.
I can’t promise a release date yet. I’m prioritizing end-to-end encryption for the Fediverse before I write the specification for that project (tentatively called AWOO, but the cryptography underpinning both projects should be similar).
Maybe 2026? We’ll see!
If someone beats me to the punch, and their design is actually good, I’ll update the post and replace this with a specific recommendation.
Against PGP
I don’t know how to get the message out louder or clearer about how cryptographers feel about PGP than what I wrote here.
Latacora wrote their criticism in 2019. As I write this, 2024 is almost over. When will the PGP-induced madness end?
Experts are not divided here. There is no controversy to teach.
Every time a cryptographer has talked about PGP, it’s been to complain about how bad it is and opine that people shouldn’t be using it.
If you’ve read this far, you already know what you should be using instead.
Header art credits: CMYKat and the GnuPG logo.
Update (2024-11-16)
Someone tried to use their Fediverse software to submit an anti-furry comment to this blog post.
Therefore, I’ve added more furry art to it.
#alternatives #codeSigning #digitalSignatures #encryption #PGP #security #SecurityGuidance #signing
-
By Carcharodon
15 years ago, on May 19, 2009, Angry Metal Guy spoke. For the very first time as AMG. And he had opinions: Very Important Opinions™. The post attracted relatively little attention at the time, but times change and, over the decade and a half since then, AMG Industries has grown into the blog you know today. Now with a staff of around 25 overrating overwriters (and an entirely non-suspicious graveyard for writers on permanent, all-expenses-paid sabbaticals), we have written more than 9,100 posts, comprising over seven million words. Over the site’s lifetime, we’ve had more than 107 million visits and now achieve well over a million hits each and every month. Through this, we’ve built up a fantastic community of readers drawn from every corner of the globe, whom we have (mostly) loved getting to know in the more than 360,000 comments posted on the site.
We have done this under the careful (if sternly authoritarian) stewardship of our eponymous leader Angry Metal Guy and his iron enforcer, Steel Druhm, while adhering to strict editorial policies and principles. We have done this by simply offering honest (and occasionally brutal) takes, and without running a single advert or taking a single cent from anyone. Ever. Mistakes have undoubtedly been made and we may be a laughing stock in the eyes of music intellectuals, socialites and critics everywhere but we are incredibly proud of what AMG Industries represents. In fact, we believe it may be the best metal blog, with the best community of readers, on the internet.
Now join us as the people responsible for making AMG a reality reflect on what the site means to them and why they would willingly work for a blog that pays in the currency of deadlines, abuse, and hobo wine. Welcome to the 15th Birthdaynalia.
Thou Shalt Have No Other Blogs!
Steel Druhm
AMG and me
I stumbled into the world of AMG Inc. by chance, one day in early 2010 and just never got around to leaving. To put a finer point on it, I’ve been slaving in the AMG salt mines so long, even the extremely sabbaticalized Happy Metal Guy thinks my mind is gone. Over time, I’ve evolved from unpaid assistant to the Founding Overlord Himself to become site overseer and brvtal enforcer of deadlines, and morale (still unpaid). The journey has been a wild one, full of moments I’ll always cherish. It’s also introduced me to a collection of loveable oddballs I care about, even though I want to murderize them most of the time (you would too if you had to deal with their outrageous bullshit daily).
The site and the extensive work that goes into it have provided me with a satisfaction that my real job often lacks, and even helped me find my soulmate. In short, AMG means the world to me and that’s why I’ve given so much of myself to this little blog these last 14 years. Looking back, I regret nothing (except the staff’s penchant for wildly overrating complete garbage) and I’d do it all again in a heartbeat. Thank you to the writers past and present who helped make the site possible, and thanks to the readers who make it worth the effort, even though most of you are woefully deficient in the good taste department. Here’s to 15 more years of this burning shitshow of a trainwreck!
AMG gave to me
As I’ve been a part of AMG since the early days, it’s nearly impossible to come up with just three albums the site gave me because it’s given me so many. Instead, I’ll enumerate the biggest non-musical gifts AMG has bestowed upon me over the years.
Madam X // Be My (Pri)Mate / Down with the Steelness – The best thing AMG gave me by far was the chance to meet my best friend, soulmate and life partner, Madam X. She had read some of my early reviews for AMG and by chance, we happened to run into each other on a now-defunct Facebook metal fan page. She reached out to discuss my reviews and get some recommendations, we started chatting, and the rest, as they say, is history. I’m the luckiest guy in the world to have her and, since she lived in South Africa and I in New York, I highly doubt we ever would have found one another were it not for AMG. For this reason alone, I’ll cherish this little blog until my rusty metal heart explodes in my hairy ape chest. Fun fact: I never had a girlfriend that liked metal, and now I have a wife who listens to stuff that’s so extreme and out there, I end up sounding like my parents and saying shit like “This isn’t music, it’s just crazy noise!” Life is funny sometimes.
The Sadistic Pleasure that Comes from Unicorning Kvlt Strangeo Bands // You Axed for It – One cold, gloomy day back in February 2015, I was reviewing a cold, gloomy release by Danish doom/death act Dwell. Their Vermin and Ashes album didn’t especially thrill me, and I was annoyed that they had opted not to include a band photo in the promo materials. Sure, I get it. They wanted to be dark and mysterious. Who doesn’t? I searched online for a suitable image of them but there were none to be found. I became quite vexed. Where the inspiration came from I cannot say but I decided to bestow upon them a bright, mega-cheesy unicorn image, in place of the non-existent band shot. As I contemplated how the vomit of rainbow colors clashed with the murky gray malaise of the album cover, it looked so wrong that it felt so right! And so a blog protocol was born. Send band photos or face extreme unicorn judgment!
The Joys of Initiating Unsuspecting n00bs into the AMG Meatgrinder // Taste the Skull Pit, Poser – When I joined AMG back in its embryonic, protoplasmic stage, there was no probationary period or brutal abuse (aside from assigning me metalcore albums). Things changed as the blog grew and we started bringing on new writers. Soon, a system of impressment, indoctrination and re-education was put in place, and ruthlessly weaponized in service of internet “fame” and “glory.” Each carefully selected wannabe writer, eyes glistening with the ghosts of their past, would serve a tumultuous probationary term, working in complete isolation under the iron thumbs of AMG management. If they somehow survived this experiment in terror, they would be cast into the general population in the Skull Pit, with a besotted cadre of jaded, glassy-eyed veteran staffers. That’s when the real initiation would begin! Imagine Lord of the Flies mixed with The Hunger Games and The Devil’s Rejects, and you get the general idea. Through ritualized humiliation, unreasonable deadlines, and confrontational teaching methods, we slowly transform these sniveling amateurs into barely functional hack reviewers. Believe in the system or be buried by it me.
I wish I had written …
White Wizzard – The Devil’s Cut Review. Yes, the infamous review that’s hung around our necks like a rotting albatross ever since it saw the light of day in 2013. Had I been tasked with doing the review, I would have given it the rating it truly deserved, which is a big, fat, greasy 3.0. Just like the album that came before, and the one that followed. Now, I have nothing against White Wizzard and I enjoy the retro 80s metal style they play, but let’s face it, nothing they ever did came anywhere near a 5.0 (whether in its “Perfect” or “Iconic” guise). My common sense, real-world review would have spared us all a great deal of embarrassment, as well as saving the effort and bleach it took to scrub the office down after the First Grand Sabbaticaling. If only…
I wish I could do over …
Amon Amarth – Surtur Rising Review. As a relatively new reviewer, I got the unexpected chance to weigh in on a new Amon Amarth platter, while I was at the peak of my feverish AA fanboyism. This proved a deadly combination and, before my better angels could caution restraint and moderation, I stamped this thing with a 4.5, and got the album cover tattooed on my dog. With time (and much hobo wine), I realized that I let the moment get the better of me. Despite the presence of a few killer cuts like “War of the Gods” and “Destroyer of the Universe,” Surtur Rising is far from Amon Amarth’s best work. I dutifully submitted a groveling apology in a Contrite Metal Guy piece and tried to move on with my life. 13 years on, this one still stands as my biggest rating misadventure and a source of bitter regret. I blame society (AKA: you, the reader).
I wish more people had read …
Retro-spective Review: Hall Aflame – Guaranteed Forever. The side project of Metal Church’s Kurdt Vanderhoof, Hall Aflame saw but one release in 1991. But what a party this thing was and still is! Adopting a style somewhere between The Cult and The Four Horsemen, Hall Aflame roar through a collection of wildly catchy, burly rockers, making for a highly replay-able album, with only occasional reminders it’s made by the brain behind Metal Church. Cuts like “Shake the Pain,” “Child of Medicine,” and “Money” are absolute monsters, and “Another Heartbeat” is one of my favorite songs of all time across all genres. The hugely ass-kicking vocals by completely unknown (then and now) frontman Ron Lowd alone are worth the effort it will take to track down this rare gem. The world continues to sleep on this killer, as evidenced by my retro-spective review scoring exactly ZERO comments. Don’t let this injustice continue. You need to hear this thing, especially with the recent news that Vanderhoof is releasing the long-awaited (by me at least) follow-up in May. You have my word as a Viking ape that satisfaction is Guaranteed Forever.
AMG is Now a Good Capitalist! In this gap-filler post from 2015, I posited the concept of AMG building a merch empire based upon goods of questionable quality (see our branded Uni-Friend and Sabbatical Sausage Maker pictured above). It got reads but, since I found the concept amusing, I wanted MOAR clicks. I credit this piece with motivating me to finally get a batch of actual AMG t-shirts printed up for the undeserving staff. If you see someone wearing one of these rare treasures and kill them, you take their place in the Skull Pit forevermore. It’s just like The Santa Clause, but much, much worse.
Dr. A.N. Grier
AMG and me
Back in the day, we’d be lucky to get two reviews a day at AMG. This led to me refreshing the site every few hours hoping for a bonus review for the day. I was obsessed with the writing and these gems I would never have found otherwise. Before I began writing here, I would do that regularly from 2010-2011. One morning I left the lab of my failing start-up and walked into my office to do some work. The post that morning wasn’t a review. Instead, it was instructions on how to apply to be an AMG writer. Without thinking, because I’d been up for roughly 40 straight hours, I submitted a review of 1349’s lackluster Demonoir. Weeks later, I was a n00b in these decrepit halls. And I’m still here regretting that decision, almost ten years to the day since I submitted my first review. It’s funny, now that I’ve gathered everything for this piece, that I found those early days the fondest of times. Those days when I still loved the writers, the readers, writing about metal, and well… music. Now I’m a broken soul, stalking the halls as a sex-depraved ghost, avoiding eye contact with Steel because his ape eyes make my pants tight.
But, in all seriousness, it’s been a wild ride and it’s odd to be one of the lucky few who have contributed to two-thirds of AMG’s existence. I’m proud to have kept the output so rounded, delivering correct scores and takes, and providing X-rated content for the younger generations. So, join me in celebrating AMG’s birthday, as I travel back to those early years when I became part of the family and discovered records that shaped the man known, for today at least, as Dr. All. Nostalgic. Grier.
AMG gave to me …
Mors Principium Est // Dawn of the 5th Era – As a n00b, Angry Metal Guy‘s review of Mors Principium Est’s Dawn of the 5th Era made me realize two things: I needed this band in my life and never release an album in December. Thankfully, AMG caught it (while everyone else was busting their asses to write their year-end lists) because it’s a stunning achievement. From that point on, I consider myself one of MPE’s biggest fans. That continuation of the At the Gates sound results in incredible performances and riff after massive riff. Not a single song on this album goes stale and I’ve been listening to it regularly for ten fucking years. I can never seem to find a melodeath group whose entire catalog I march through from beginning to end. But MPE is one of them. And, because you might be wondering, … And Death Said Live is their best album.
Voices // London – Back in 2014, I ranked an album I never reviewed. Weird, right? Not only was it a great album, but it was one of my favorite reviews from the illustrious Jean-Luc Ricard, who opened his thoughts with: “If you’re anything like me, you’re super awesome.” Still makes me laugh my ass off. Beyond that, Ricard conveyed the absolute nightmare that you experience when you listen to London. Though Akercocke has since reunited, Voices was an incredible substitute, which takes you through a journey that, somehow, Ricard was able to describe; because I sure as hell can’t. I was doing an oil change on my truck the first time I span it. Never have I taken so long to do that work but I constantly found myself staring off into space, literally frightened by the sounds erupting in my ears. The band has never been able to top London, but that’s OK. It’s one of the beautiful aspects of music—it’s permanent and will be there forever when you need it.
Trials // This Ruined World – When I joined AMG and worked side-by-side with Dr. Fisting, we hit it off. I love the guy and consider him a close friend (though he might not feel the same). When I found out that he started a band called Trials, I had to check it out. With two decent albums under his belt, 2014 saw the release of Trials’ best—and final—album, This Ruined World. I was hooked. And to imagine that without knowing about this band or this person, I might never have experienced his work in Bear Mace and the (to me, at least) incredible Black Sites. Though I don’t return to Trials often, mostly because I can’t pull myself away from Fisting‘s current work, I have a special place in my heart for This Ruined World. It introduced me to a fantastic musician and a good friend.
I wish I had written …
Origin – Omnipresent Review. When you join the crew, the hope is that you get to write that review for a big band. Those bands you grew up with, that released something at that point in your life, or which have such popularity that every other site overrates them. But, at AMG, you kinda have to earn that. Unless it’s, somehow, a popular dungeon synth group; you can just have that. So, when my most-anticipated album of 2014 dropped, I wanted it. But, there wasn’t a chance in hell I would get my hands on Origin’s Omnipresent. I bet you didn’t know I liked tech death, much less Origin. But, I do. I just know there are other, more qualified writers to cover that material. Thankfully, our wise and wonderful Kronos scored it correctly and wrote a fantastic review that describes it perfectly. Since then, I haven’t been as enamored with their material (mostly because this place has turned me into a hateful prick), but that album holds up and still gets many a spin.
I wish I could do over …
Resumed – Alienations Review. I remember when the review for Resumed’s Alienations was published. It was Thanksgiving 2014 and I was already six sheets to the wind when I realized what I was reading: the first double review in AMG history. It wasn’t a record that merited a double but Steel fucked up and double-booked it, thereby unintentionally beginning a trend. Though I couldn’t believe I wasted my time on this thing and subjected myself to uncalled-for ridicule, it started one of our most popular segments. Hell, it even led to our Unsigned Band Rodeö pieces. So, for better or worse (and by worse, I mean that year’s burned turkey), we can thank this worthless piece for contributing to AMG lore.
I wish more people had read …
Thine – The Dead City Blueprint [Things You Might Have Missed 2014]. In the process of writing the review for The Deathtrip’s stellar 2014 release, Deep Drone Master, Metal Archives led me to a release we never received. In walks Thine, a progressive rock outfit led by the same person who convinced Aldrahn to come back from retirement to front Deep Drone Master, not to mention drummer Dan Mullins, who returned for My Dying Bride’s newest release. Representing my first ever Things You Might Have Missed piece, I continue to return to this band’s swansong release: it’s beautiful and engaging, and is everything I ever wanted from an album of this caliber. My unpopularity as a n00b, combined with the new year beginning and everyone moving on to January releases, meant no one seemed to care. But I cared. I care so much, in fact, that I’m dropping Thine’s name again, in the hope that Bandcamp credits will be put to good use. You’re welcome.
Dr. Fisting
AMG and me
As a reader of the site’s earliest incarnation, the first thing that stood out to me was that AMG’s writers were clearly educated. Even back then, the reviews were extremely well-written. I don’t mean just in terms of spelling and grammar, but being able to express ideas coherently. If you’ve ever visited any other metal-related sites, you know that these qualities are rare. More importantly, AMG was clearly an independent operation, with no reliance on ad revenue or cozy relationships with record labels. This meant the site was free to post brutally honest reviews, which occasionally resulted in battles against the metal media’s narrative and even the fans themselves. I always enjoyed when some huge band would put out a half-assed album that got rave reviews everywhere else, and then the AMG writeup would take a well-deserved shit on it.
When I started writing for the site a couple of years later, I did my best to uphold those standards. Eventually, as my life and priorities changed, I chose to step back from reviewing to focus on other things. But it was an honor to ride with these guys for as long as I did. I got to review some fantastic records, talk shit about some terrible ones, and make some friends that I am still in contact with to this day.
AMG gave to me …
Pain of Salvation // Road Salt Pt. 1 – I don’t remember if I discovered this record from reading the site or from The Angry One Himself sending it to me (“here, you’ll like this”), but Road Salt Pt. 1 was a complete game-changer. At a time when I was completely bored of “modern metal” and its trappings, I related strongly to PoS’s new direction, in which chug riffs and rapping were replaced by analog ’70s tones and memorable songs. This record was in heavy rotation in the Fisting household, and became a significant influence on my own music.
Satan // Life Sentence – Having missed out on Satan’s original run, I was unaware of their comeback album until the AMG review heaped praise upon it. Lucky for me it did because Life Sentence is full of intelligent lyrics, clever riffs, and memorable hooks. The band has since made three more records, all of which have been varying degrees of excellent. More importantly, discovering Life Sentence sent me on a path to revisit the band’s earlier works, including the highly influential Court in the Act.
Anacrusis // Screams and Whispers – Anacrusis is another band I was completely oblivious to during their lifespan, but discovered much later via Grymm‘s excellent retrospective writeup. This album is incredibly ambitious for its time (1993), pushing thrash metal into new and more introspective territory. There are hints of industrial influence, occasional goth-y keyboards, and some very angular guitar work, even by 1990s standards. This is a classic record from metal’s lost years, and more people should hear it.
I wish I had written …
King’s X – Three Sides of One Review. Not to suggest that Huck didn’t do a fantastic job on the review, because he absolutely nailed it, but King’s X has held a special place in my cold black heart for many years. I should’ve been there for this. There is no good reason why I didn’t do this review (or the related Angry Metal Primer) other than my own laziness and poor time management. Life gets in the way sometimes.
I wish I could do over …
I regret nothing.
I wish more people had read …
Various reviews of Voivod and Failure albums. As several readers noticed, I made it a personal mission to preach the virtues of Voivod and Failure. I consider both bands to be absolutely brilliant and worthy of greater attention (particularly Failure, whom I suspect most AMG readers are unfamiliar with). I don’t know how many people read those reviews, but whatever that number is, it needed to be more.
#2024 #AMGTurns15 #AmonAmarth #Anacrusis #BlogPost #BlogPosts #Failure #HallAflame #KingsX #MorsPrincipiumEst #Origin #PainOfSalvation #Resumed #Satan #Thine #Trials #Voices #Voivod #WhiteWizzard
-
By Carcharodon
15 years ago, on May 19, 2009, Angry Metal Guy spoke. For the very first time as AMG. And he had opinions: Very Important Opinions™. The post attracted relatively little attention at the time, but times change and, over the decade and a half since then, AMG Industries has grown into the blog you know today. Now with a staff of around 25 overrating overwriters (and an entirely non-suspicious graveyard for writers on permanent, all-expenses-paid sabbaticals), we have written more than 9,100 posts, comprising over seven million words. Over the site’s lifetime, we’ve had more than 107 million visits and now achieve well over a million hits each and every month. Through this, we’ve built up a fantastic community of readers drawn from every corner of the globe, whom we have (mostly) loved getting to know in the more than 360,000 comments posted on the site.
We have done this under the careful (if sternly authoritarian) stewardship of our eponymous leader Angry Metal Guy and his iron enforcer, Steel Druhm, while adhering to strict editorial policies and principles. We have done this by simply offering honest (and occasionally brutal) takes, and without running a single advert or taking a single cent from anyone. Ever. Mistakes have undoubtedly been made and we may be a laughing stock in the eyes of music intellectuals, socialites and critics everywhere but we are incredibly proud of what AMG Industries represents. In fact, we believe it may be the best metal blog, with the best community of readers, on the internet.
Now join us as the people responsible for making AMG a reality reflect on what the site means to them and why they would willingly work for a blog that pays in the currency of deadlines, abuse, and hobo wine. Welcome to the 15th Birthdaynalia.
Thou Shalt Have No Other Blogs!
Steel Druhm
AMG and me
I stumbled into the world of AMG Inc. by chance one day in early 2010 and just never got around to leaving. To put a finer point on it, I’ve been slaving in the AMG salt mines so long, even the extremely sabbaticalized Happy Metal Guy thinks my mind is gone. Over time, I’ve evolved from unpaid assistant to the Founding Overlord Himself to become site overseer and brvtal enforcer of deadlines and morale (still unpaid). The journey has been a wild one, full of moments I’ll always cherish. It’s also introduced me to a collection of loveable oddballs I care about, even though I want to murderize them most of the time (you would too if you had to deal with their outrageous bullshit daily).
The site and the extensive work that goes into it have provided me with a satisfaction that my real job often lacks, and even helped me find my soulmate. In short, AMG means the world to me and that’s why I’ve given so much of myself to this little blog these last 14 years. Looking back, I regret nothing (except the staff’s penchant for wildly overrating complete garbage) and I’d do it all again in a heartbeat. Thank you to the writers past and present who helped make the site possible, and thanks to the readers who make it worth the effort, even though most of you are woefully deficient in the good taste department. Here’s to 15 more years of this burning shitshow of a trainwreck!
AMG gave to me
As I’ve been a part of AMG since the early days, it’s nearly impossible to come up with just three albums the site gave me because it’s given me so many. Instead, I’ll enumerate the biggest non-musical gifts AMG has bestowed upon me over the years.
Madam X // Be My (Pri)Mate / Down with the Steelness – The best thing AMG gave me by far was the chance to meet my best friend, soulmate and life partner, Madam X. She had read some of my early reviews for AMG and by chance, we happened to run into each other on a now-defunct Facebook metal fan page. She reached out to discuss my reviews and get some recommendations, we started chatting, and the rest, as they say, is history. I’m the luckiest guy in the world to have her and, since she lived in South Africa and I in New York, I highly doubt we ever would have found one another were it not for AMG. For this reason alone, I’ll cherish this little blog until my rusty metal heart explodes in my hairy ape chest. Fun fact: I never had a girlfriend that liked metal, and now I have a wife who listens to stuff that’s so extreme and out there, I end up sounding like my parents and saying shit like “This isn’t music, it’s just crazy noise!” Life is funny sometimes.
The Sadistic Pleasure that Comes from Unicorning Kvlt Strangeo Bands // You Axed for It – One cold, gloomy day back in February 2015, I was reviewing a cold, gloomy release by Danish doom/death act Dwell. Their Vermin and Ashes album didn’t especially thrill me, and I was annoyed that they had opted not to include a band photo in the promo materials. Sure, I get it. They wanted to be dark and mysterious. Who doesn’t? I searched online for a suitable image of them but there were none to be found. I became quite vexed. Where the inspiration came from I cannot say but I decided to bestow upon them a bright, mega-cheesy unicorn image, in place of the non-existent band shot. As I contemplated how the vomit of rainbow colors clashed with the murky gray malaise of the album cover, it looked so wrong that it felt so right! And so a blog protocol was born. Send band photos or face extreme unicorn judgment!
The Joys of Initiating Unsuspecting n00bs into the AMG Meatgrinder // Taste the Skull Pit, Poser – When I joined AMG back in its embryonic, protoplasmic stage, there was no probationary period or brutal abuse (aside from assigning me metalcore albums). Things changed as the blog grew and we started bringing on new writers. Soon, a system of impressment, indoctrination and re-education was put in place, and ruthlessly weaponized in service of internet “fame” and “glory.” Each carefully selected wannabe writer, eyes glistening with the ghosts of their past, would serve a tumultuous probationary term, working in complete isolation under the iron thumbs of AMG management. If they somehow survived this experiment in terror, they would be cast into the general population in the Skull Pit, with a besotted cadre of jaded, glassy-eyed veteran staffers. That’s when the real initiation would begin! Imagine Lord of the Flies mixed with The Hunger Games and The Devil’s Rejects, and you get the general idea. Through ritualized humiliation, unreasonable deadlines, and confrontational teaching methods, we slowly transform these sniveling amateurs into barely functional hack reviewers. Believe in the system or be buried by it me.
I wish I had written …
White Wizzard – The Devil’s Cut Review. Yes, the infamous review that’s hung around our necks like a rotting albatross ever since it saw the light of day in 2013. Had I been tasked with doing the review, I would have given it the rating it truly deserved, which is a big, fat, greasy 3.0. Just like the album that came before, and the one that followed. Now, I have nothing against White Wizzard and I enjoy the retro 80s metal style they play, but let’s face it, nothing they ever did came anywhere near a 5.0 (whether in its “Perfect” or “Iconic” guise). My common sense, real-world review would have spared us all a great deal of embarrassment, as well as saving the effort and bleach it took to scrub the office down after the First Grand Sabbaticaling. If only…
I wish I could do over …
Amon Amarth – Surtur Rising Review. As a relatively new reviewer, I got the unexpected chance to weigh in on a new Amon Amarth platter, while I was at the peak of my feverish AA fanboyism. This proved a deadly combination and, before my better angels could caution restraint and moderation, I stamped this thing with a 4.5, and got the album cover tattooed on my dog. With time (and much hobo wine), I realized that I let the moment get the better of me. Despite the presence of a few killer cuts like “War of the Gods” and “Destroyer of the Universe,” Surtur Rising is far from Amon Amarth’s best work. I dutifully submitted a groveling apology in a Contrite Metal Guy piece and tried to move on with my life. 13 years on, this one still stands as my biggest rating misadventure and a source of bitter regret. I blame society (AKA: you, the reader).
I wish more people had read …
Retro-spective Review: Hall Aflame – Guaranteed Forever. The side project of Metal Church’s Kurdt Vanderhoof, Hall Aflame saw but one release in 1991. But what a party this thing was and still is! Adopting a style somewhere between The Cult and The Four Horsemen, Hall Aflame roar through a collection of wildly catchy, burly rockers, making for a highly replay-able album, with only occasional reminders it’s made by the brain behind Metal Church. Cuts like “Shake the Pain,” “Child of Medicine,” and “Money” are absolute monsters, and “Another Heartbeat” is one of my favorite songs of all time across all genres. The hugely ass-kicking vocals by completely unknown (then and now) frontman Ron Lowd alone are worth the effort it will take to track down this rare gem. The world continues to sleep on this killer, as evidenced by my retro-spective review scoring exactly ZERO comments. Don’t let this injustice continue. You need to hear this thing, especially with the recent news that Vanderhoof is releasing the long-awaited (by me at least) follow-up in May. You have my word as a Viking ape that satisfaction is Guaranteed Forever.
AMG is Now a Good Capitalist! In this gap-filler post from 2015, I posited the concept of AMG building a merch empire based upon goods of questionable quality (see our branded Uni-Friend and Sabbatical Sausage Maker pictured above). It got reads but, since I found the concept amusing, I wanted MOAR clicks. I credit this piece with motivating me to finally get a batch of actual AMG t-shirts printed up for the undeserving staff. If you see someone wearing one of these rare treasures and kill them, you take their place in the Skull Pit forevermore. It’s just like The Santa Clause, but much, much worse.
Dr. A.N. Grier
AMG and me
Back in the day, we’d be lucky to get two reviews a day at AMG. This led to me refreshing the site every few hours hoping for a bonus review for the day. I was obsessed with the writing and these gems I would never have found otherwise. Before I began writing here, I would do that regularly from 2010-2011. One morning I left the lab of my failing start-up and walked into my office to do some work. The post that morning wasn’t a review. Instead, it was instructions on how to apply to be an AMG writer. Without thinking—because I’d been up for roughly 40 straight hours—I submitted a review of 1349’s lackluster Demonoir. Weeks later, I was a n00b in these decrepit halls. And I’m still here regretting that decision, almost ten years to the day since I submitted my first review. It’s funny, now that I’ve gathered everything for this piece, that I found those early days the fondest of times. Those days when I still loved the writers, the readers, writing about metal, and well… music. Now I’m a broken soul, stalking the halls as a sex-depraved ghost, avoiding eye contact with Steel because his ape eyes make my pants tight.
But, in all seriousness, it’s been a wild ride and it’s odd to be one of the lucky few who have contributed to two-thirds of AMG’s existence. I’m proud to have kept the output so rounded, delivering correct scores and takes, and providing X-rated content for the younger generations. So, join me in celebrating AMG’s birthday, as I travel back to those early years when I became part of the family and discovered records that shaped the man known, for today at least, as Dr. All. Nostalgic. Grier.
AMG gave to me …
Mors Principium Est // Dawn of the 5th Era – As a n00b, Angry Metal Guy‘s review of Mors Principium Est’s Dawn of the 5th Era made me realize two things: I needed this band in my life, and you should never release an album in December. Thankfully, AMG caught it (while everyone else was busting their asses to write their year-end lists) because it’s a stunning achievement. From that point on, I have considered myself one of MPE’s biggest fans. That continuation of the At the Gates sound results in incredible performances and riff after massive riff. Not a single song on this album goes stale and I’ve been listening to it regularly for ten fucking years. I can never seem to find a melodeath group whose entire catalog I march through from beginning to end. But MPE is one of them. And, because you might be wondering, … And Death Said Live is their best album.
Voices // London – Back in 2014, I ranked an album I never reviewed. Weird, right? Not only was it a great album, but it was one of my favorite reviews from the illustrious Jean-Luc Ricard, who opened his thoughts with: “If you’re anything like me, you’re super awesome.” Still makes me laugh my ass off. Beyond that, Ricard conveyed the absolute nightmare that you experience when you listen to London. Though Akercocke has since reunited, Voices was an incredible substitute, which takes you through a journey that, somehow, Ricard was able to describe; because I sure as hell can’t. I was doing an oil change on my truck the first time I spun it. Never have I taken so long to do that work but I constantly found myself staring off into space, literally frightened by the sounds erupting in my ears. The band has never been able to top London, but that’s OK. It’s one of the beautiful aspects of music—it’s permanent and will be there forever when you need it.
Trials // This Ruined World – When I joined AMG and worked side-by-side with Dr. Fisting, we hit it off. I love the guy and consider him a close friend (though he might not feel the same). When I found out that he started a band called Trials, I had to check it out. With two decent albums under his belt, 2014 saw the release of Trials’ best—and final—album, This Ruined World. I was hooked. And to imagine that without knowing about this band or this person, I might never have experienced his work in Bear Mace and the (to me, at least) incredible Black Sites. Though I don’t return to Trials often, mostly because I can’t pull myself away from Fisting‘s current work, I have a special place in my heart for This Ruined World. It introduced me to a fantastic musician and a good friend.
I wish I had written …
Origin – Omnipresent Review. When you join the crew, the hope is that you get to write that review for a big band. Those bands you grew up with, that released something at that point in your life, or which have such popularity that every other site overrates them. But, at AMG, you kinda have to earn that. Unless it’s, somehow, a popular dungeon synth group; you can just have that. So, when my most-anticipated album of 2014 dropped, I wanted it. But, there wasn’t a chance in hell I would get my hands on Origin’s Omnipresent. I bet you didn’t know I liked tech death, much less Origin. But, I do. I just know there are other, more qualified writers to cover that material. Thankfully, our wise and wonderful Kronos scored it correctly and wrote a fantastic review that describes it perfectly. Since then, I haven’t been as enamored with their material (mostly because this place has turned me into a hateful prick), but that album holds up and still gets many a spin.
I wish I could do over …
Resumed – Alienations Review. I remember when the review for Resumed’s Alienations was published. It was Thanksgiving 2014 and I was already six sheets to the wind when I realized what I was reading: the first double review in AMG history. It wasn’t a record that merited a double but Steel fucked up and double-booked it, thereby unintentionally beginning a trend. Though I couldn’t believe I wasted my time on this thing and subjected myself to uncalled-for ridicule, it started one of our most popular segments. Hell, it even led to our Unsigned Band Rodeö pieces. So, for better or worse (and by worse, I mean that year’s burned turkey), we can thank this worthless piece for contributing to AMG lore.
I wish more people had read …
Thine – The Dead City Blueprint [Things You Might Have Missed 2014]. In the process of writing the review for The Deathtrip’s stellar 2014 release, Deep Drone Master, Metal Archives led me to a release we never received. In walks Thine, a progressive rock outfit led by the same person who convinced Aldrahn to come back from retirement to front Deep Drone Master, not to mention drummer Dan Mullins, who returned for My Dying Bride’s newest release. Representing my first ever Things You Might Have Missed piece, I continue to return to this band’s swansong release: it’s beautiful and engaging, and is everything I ever wanted from an album of this caliber. My unpopularity as a n00b, combined with the new year beginning and everyone moving on to January releases, meant no one seemed to care. But I cared. I care so much, in fact, that I’m dropping Thine’s name again, in the hope that Bandcamp credits will be put to good use. You’re welcome.
Dr. Fisting
AMG and me
As a reader of the site’s earliest incarnation, the first thing that stood out to me was that AMG’s writers were clearly educated. Even back then, the reviews were extremely well-written. I don’t mean just in terms of spelling and grammar, but being able to express ideas coherently. If you’ve ever visited any other metal-related sites, you know that these qualities are rare. More importantly, AMG was clearly an independent operation, with no reliance on ad revenue or cozy relationships with record labels. This meant the site was free to post brutally honest reviews, which occasionally resulted in battles against the metal media’s narrative and even the fans themselves. I always enjoyed when some huge band would put out a half-assed album that got rave reviews everywhere else, and then the AMG writeup would take a well-deserved shit on it.
When I started writing for the site a couple of years later, I did my best to uphold those standards. Eventually, as my life and priorities changed, I chose to step back from reviewing to focus on other things. But it was an honor to ride with these guys for as long as I did. I got to review some fantastic records, talk shit about some terrible ones, and make some friends that I am still in contact with to this day.
AMG gave to me …
Pain of Salvation // Road Salt Pt. 1 – I don’t remember if I discovered this record from reading the site or from The Angry One Himself sending it to me (“here, you’ll like this”), but Road Salt Pt. 1 was a complete game-changer. At a time when I was completely bored of “modern metal” and its trappings, I related strongly to PoS’s new direction, in which chug riffs and rapping were replaced by analog ’70s tones and memorable songs. This record was in heavy rotation in the Fisting household, and became a significant influence on my own music.
Satan // Life Sentence – Having missed out on Satan’s original run, I was unaware of their comeback album until the AMG review heaped praise upon it. Lucky for me it did because Life Sentence is full of intelligent lyrics, clever riffs, and memorable hooks. The band has since made three more records, all of which have been varying degrees of excellent. More importantly, discovering Life Sentence sent me on a path to revisit the band’s earlier works, including the highly influential Court in the Act.
Anacrusis // Screams and Whispers – Anacrusis is another band I was completely oblivious to during their lifespan, but discovered much later via Grymm‘s excellent retrospective writeup. This album is incredibly ambitious for its time (1993), pushing thrash metal into new and more introspective territory. There are hints of industrial influence, occasional goth-y keyboards, and some very angular guitar work, even by 1990s standards. This is a classic record from metal’s lost years, and more people should hear it.
I wish I had written …
King’s X – Three Sides of One Review. Not to suggest that Huck didn’t do a fantastic job on the review, because he absolutely nailed it, but King’s X has held a special place in my cold black heart for many years. I should’ve been there for this. There is no good reason why I didn’t do this review (or the related Angry Metal Primer) other than my own laziness and poor time management. Life gets in the way sometimes.
I wish I could do over …
I regret nothing.
I wish more people had read …
Various reviews of Voivod and Failure albums. As several readers noticed, I made it a personal mission to preach the virtues of Voivod and Failure. I consider both bands to be absolutely brilliant and worthy of greater attention (particularly Failure, whom I suspect most AMG readers are unfamiliar with). I don’t know how many people read those reviews, but whatever that number is, it needed to be more.
#2024 #AMGTurns15 #AmonAmarth #Anacrusis #BlogPost #BlogPosts #Failure #HallAflame #KingsX #MorsPrincipiumEst #Origin #PainOfSalvation #Resumed #Satan #Thine #Trials #Voices #Voivod #WhiteWizzard
-
Hehehe.... I suppose this is my opportunity to plug Joshua's free domain name service here that's been a trusted mainstay for over 20 years :p
Perhaps one of the best parts is that you can see how many days (years) the domain has been part of the service, to dissuade concerns over whether there's a likelihood of it suddenly disappearing :)
And.... who doesn't love #FreeBSD?
Aside from lo-tech #Sneakernet schemas, there are also several FOSS based community driven initiatives. There's been a lot of real-world, ad-hoc development since my years of participation in the IRTF/IRSG DTNRG. Full disclosure, I was also formerly employed by Semtech. The notion that there's a use case for communications that can take months, even decades to arrive (or never at all) is a valid concern for many practical applications.
More immediate, and relevant communications systems for most folks here on Terra Firma include projects like Lokinet, which has some great info HERE, and also aspires to the same level of, um... disconnectivity (sneakernet-like) or operability that @silverpill mentions above - the value of having a client/server architecture that is prepared to exploit this out of the box is much more relevant than one might think.
Some of these semi-production or production ready real world initiatives and communities are:
- CellSol - It is noteworthy to mention that Texas is the only state in the USA that sports an (actually only mostly) autonomous electricity grid (not dependent upon the national grid), although it's not publicly managed, and has been criticized as such due in part to nearly 1000 deaths occurring in the winter of 2021. The repo is HERE. Here's a PoC for one such use case from the PoV of a native Texan: Apologies for directly linking to an article in the monolithic silo space.
During my years living quite literally off-grid in the wilderness of the forested mountains in Northern California, I had the privilege of meeting and contracting for several farms and individuals deeply steeped in what is generally referred to as the prepper movement. These weren't cray cay militants (at least not most of them) or paranoiacs calling for revolution or believing the end is nigh, but rather, farmers and families who were, rightfully so, extremely concerned with security and safety for their small communities and loved ones. To survive in places like that, which exist all over the world, one must begin with self-sufficiency that covers 4 seasons; beyond that, protecting the 'me and mine' aspects of your assets and property are very real considerations.
The work I focused on led me to developing microwave surveillance systems using inexpensive, solar powered Ubiquiti Nanostations with ranges capable of exceeding 10KM, strategically placed in almost inaccessible locations overlooking entire valleys as well as within small perimeters of their farms and households. This included off the shelf PTZ cameras, many of which were capable of license plate and facial recognition, but more importantly, being able to determine the difference between things like Bears, Deer, and Humans - false positive intrusions detected are quite frustrating, lolz.
All of this was coupled together with Shinobi, which can be monitored and controlled from anywhere, on any device. The repo is here.
With the extreme threat levels of thievery and other concerns in those regions, and continuous incidents of such, a comprehensive #FOSS based, #solar powered communications and surveillance system is an in-demand market. Internet access is, of course, problematic in such regions, which makes WISPs operating in the unlicensed microwave bands a high-demand commodity as well.
This is merely demonstrative of the need for another niche type computing arena - community networks completely unconnected to the Internet:
- A BLE LRMN project - The git repo for the dual-licensed software is [HERE](https://github.com/mwaylabs/fruitymesh).
- Many more BLE projects (many in Rust) here
- More BLE for IoT
- Off-Grid Mesh Networks using LoRa and/or BLE
- BSD licensed OpenThread
- Message Queuing Telemetry Transport - NOTE: For XMPP, ejabberd has native broker services.
- Freeing up Zigbee devices w/MQTT
- OSS-7 / Dash-7 Lora/LoraWAN technologies are sub Ghz.
- More (dry) IEEE related Mesh stuffs
All of these projects, protocols, and initiatives have solution based choices for the various kinds of Delay Tolerant Networking standards and communities actively developing for connectionless, intermittently connected systems, or autonomous networks that aren't necessarily interdependent upon a classic, traditional, Internet connection.
It's not sufficient to just eschew the deprecated, privacy disrespecting monolithic silos; it's also not prudent to depend upon clearnet aspects of the Internet either. In practice, it's possible to take pretty much any platform technology that listens for packets and fashion the ability to be accessible and available via I2P, Tor, IPFS, Yggdrasil, and other IP routed constructs. Yet the majority of people only consider intercommunication in terms of the IP routed packet switched network we call the Internet (powered almost entirely by Cisco IOS and the like), without due consideration given to the fact that this single common denominator is also a single choke point. Kludgy platforms like masto that can't even keep up with the contemporary movements in the social networking landscape aren't going to fare well when it comes to the expanding horizons opening up with movements like those above, while others like Streams, Mitra, and perhaps protocols such as Nostr that exhibit the ambitions to explore and exploit emerging technologies in communication will fare much better, adapting (and embracing layer 1 & 2 networking) along the way.
#tallship #DNS #nomadic_identity #sneakernet #PTZ #WISP
⛵
.
-
In brief:...this wasn’t a malicious scraper trying to crawl the network and make money off of people’s posts. It was some guy’s hobby project to build a service similar to micro.blog. ...
Sascha was building a free ActivityPub library for his project. While he managed to get the basic concepts down, there were still a lot of missing pieces that are essential for participating in the modern Fediverse. Unfortunately, a lot of those resources are not readily available to anyone. ...
...community members decided to have a little bit of extra fun, attempting to “make the crawler crash“, send angry emails to the service operator, and more. After some study of how the site worked, one person had the malicious idea to send a remote post containing child pornography to the site, before getting someone else to report Content Nation for Child Sexual Abuse Material. ...
Sascha’s life could have been turned upside down for absolutely nothing. Say what you will about how his website looked, or how his platform functioned: none of these things warranted such a disgusting level of abuse. Somebody basically tried to send a fledgling platform developer to prison, because they didn’t like what he was doing. ...
Look. I know this is not a majority of Mastodon users. But this is evil stuff. This isn't what the Fediverse is for.
In politics, a lot has been said about the results of tolerating intolerance. Well, the same principles apply to social networking.
And I'm going to point out again, just in case any Mastodon users are actually listening to me, that you have MANY Fediverse options besides Mastodon. If that tool is not serving your needs, and if the culture is becoming polluted, please look at the alternatives.
[email protected] wrote the following post Sat, 02 Mar 2024 16:19:00 -0800
Yesterday, Mastodon was abuzz regarding a strange new scraper that seemed to be pulling people’s profiles and content streams into a platform designed around monetization. Dubbed Content Nation, the site’s combination of strange design, stock images, and focus on getting paid for posts raised more than a few eyebrows. Indeed, the site visually resembles something akin to a domain parking page, with an eye-watering visual layout and strange mix of posts that don’t seem to fit anywhere.
- Holy stock art, Batman!
- I can’t make heads or tails of this.
A lot of people were angry, and readily pulled out the torches and pitchforks. The sad truth, though, was that this wasn’t a malicious scraper trying to crawl the network and make money off of people’s posts. It was some guy’s hobby project to build a service similar to micro.blog.
What is Content Nation?
Content Nation is, essentially, a feed reader for a small publishing community that just happened to be experimenting with ActivityPub. It’s a project developed by Sascha Nitsch, a backend developer and management consultant from Germany. Sascha is a relative outsider to the Fediverse that heard about the network, loved the idea behind it, and tried to integrate his site into the network.
The site is, understandably, somewhat jarring in its appearance, because Sascha is primarily a backend developer, not a frontend designer. He was more interested in building out a robust list of features prior to doing any visual design work, because the platform was still taking shape. As a one-man operation, this kind of approach made the most sense to him.
“The site was and is free,” Sascha wrote, “no ads, no cookies nor tracking. I did not make any money with the federation; it’s a service for users on the platform. And it was never intended to be to make money with those content.”
How did the Fediverse React?
Several people came forward to point out to Sascha that his platform interoperates very, very poorly with Mastodon, and that Sascha did not do sufficient research prior to launching his service. Until recently, Content Nation didn’t properly make use of a User Agent in its request, so it was easy to mistake for a scraper.
Compounding things further, people didn’t realize that the site implemented Webfinger in its search function, allowing people to load remote content by putting in an address into a search field. People would go to Content Nation, search for themselves, and inadvertently kick off a fetch request, leading them to believe they had just been scraped. In reality, this is how 99% of Fediverse servers operate by default.
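To make the mechanics concrete, here is a minimal sketch of what a handle typed into a search box typically turns into: a WebFinger lookup (RFC 7033) against the remote server, sent with a User-Agent that identifies the service. The function name, the User-Agent string and the example handle are all hypothetical; this is not Content Nation's code, and the request is only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical User-Agent: a real service would name itself and link to a contact page,
# so that remote admins can tell it apart from an anonymous scraper.
USER_AGENT = "ExampleReader/0.1 (+https://reader.example/about)"


def webfinger_request(handle: str) -> Request:
    """Build (but do not send) the WebFinger lookup for a handle such as 'alice@example.social'."""
    user, _, host = handle.lstrip("@").partition("@")
    if not user or not host:
        raise ValueError(f"not a user@host handle: {handle!r}")
    # RFC 7033: the account is passed as an 'acct:' URI in the 'resource' query parameter.
    query = urlencode({"resource": f"acct:{user}@{host}"})
    url = f"https://{host}/.well-known/webfinger?{query}"
    return Request(url, headers={"User-Agent": USER_AGENT, "Accept": "application/jrd+json"})
```

The point of the sketch is that the fetch is triggered by whoever types the handle, which is why searching for yourself on such a site looks, from the remote server's side, like the site fetching your profile.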
When users sent GDPR takedowns, Sascha would comply, but the system had nothing in place to block anything. Those same users were distraught to once again search for themselves, only to find their own data all over again.
The High Barrier of Entry for Fediverse Development
The shortcomings described above paint a picture: Sascha was building a free ActivityPub library for his project. While he managed to get the basic concepts down, there were still a lot of missing pieces that are essential for participating in the modern Fediverse. Unfortunately, a lot of those resources are not readily available to anyone.
Here’s the thing: if you were to take the ActivityPub specification from the W3C, and implement it as specified, you’d end up with something that wouldn’t correctly talk to any service in use today. Mastodon, and platforms designed to talk to it, have a dozen or so behaviors that are not actually in the spec at all: Webfinger, SharedInbox, Privacy Scopes, and Opt-Out for Search are just a few of them.
Many of these things are almost completely undocumented, and can only be developed by lengthy conversations with people who already built those things. Even Mastodon’s own specs say very little. The majority of people dismissed Content Nation as simply being a malicious attempt to slurp up their public and private content for profit. Even when Sascha tried to defend himself, he was ridiculed and mocked.
From Bad to Worse
Aside from simply blocking the domain and moving on, community members decided to have a little bit of extra fun, attempting to “make the crawler crash“, send angry emails to the service operator, and more. After some study of how the site worked, one person had the malicious idea to send a remote post containing child pornography to the site, before getting someone else to report Content Nation for Child Sexual Abuse Material.
To be clear: someone searched a list of known illegal material, loaded that remote content onto Content Nation locally, and then put up a red flag for someone to file a report. Given the server’s jurisdiction being in Germany, this could have been catastrophic: Germany’s laws regarding CSAM stipulate a one-year prison term minimum for possession of this kind of material. Just look at the recent case of a teacher who found out that a pornographic video was circulating about one of her students. When she tried to turn in evidence to the police, she was arrested.
It’s a case that causes people to shake their heads: A teacher wanted to help a student whose intimate video was circulating at school and now has to answer in court for, among other things, distributing child pornography.
Following a complaint from the public prosecutor’s office, the Koblenz regional court overturned the decision of the Montabaur district court not to open main proceedings in this case. “The regional court, like the public prosecutor, considers the behavior of the accused to be criminal in principle,” said senior public prosecutor Thomas Büttinghaus. The educator is currently facing at least a year in prison – and with it the loss of her civil servant status.
Sascha’s life could have been turned upside down for absolutely nothing. Say what you will about how his website looked, or how his platform functioned: none of these things warranted such a disgusting level of abuse. Somebody basically tried to send a fledgling platform developer to prison, because they didn’t like what he was doing. A series of assumptions and misunderstandings escalated to this point.
Why is this Important?
Over the years, Mastodon’s user culture has become incredibly insular and hostile towards outsiders. Despite repeated claims of “People are just nicer here!” and “Everyone is just so welcoming!”, often those preaching about privacy and consent are the first to harass anyone doing something they don’t like. Reactions have extended to doxxing, death threats, DDoS attacks, and apparently, distribution of CSAM. Just the other week, Mastodon users were harassing a guy who built a protocol bridge that hadn’t even been enabled yet.
Neither of these things is a first occurrence, either. People in the past have tried to build tooling for the Fediverse, from search engines to disposable account services for testing to indexes of verified accounts. People like Wil Wheaton were harassed off the network for their ignorance of nuances about who was on a given blocklist that they shared. Some Lemmy instances have been flooded with CSAM as part of a community retaliation effort from other instances.
Mastodon’s user community has also long looked down its nose at other platforms such as Pleroma, due to a combination of platform rivalry, cultural clashes, personal squabbles, and an “us vs them” mentality. It wasn’t so long ago that simply using Pleroma was considered a valid reason for blocking someone on sight, because good people only used Mastodon.
Source: FediDB.org
Mastodon still makes up the majority of the Fediverse at this point, and acts as a de facto standard for ActivityPub. Many parts of the Mastodon community still threaten to block, doxx, or harass people simply because they expressed a thought or opinion that stands in contrast to what the hive mind demands.
Even Damon, at one point, has received death threats from total strangers for his perspective on FediPact and Threads that other people didn’t agree with. He’s told me on several occasions that the Fediverse doesn’t feel like it was made for people like him, and a good portion of it is due to Mastodon’s user culture.
Whatever this thing is, it’s not sustainable. A big aspect of Mastodon’s norms centers around a type of Puritanical culture that, half the time, isn’t even consistent with itself. We can’t advocate for a space and say that it’s so much better than everywhere else, when so many people are subjected to this.
The Aftermath
A report was filed with Content Nation’s host, Hetzner, due to the presence of CSAM being detected. However, Sascha’s platform was only set up to cache remote content for an hour prior to purging it. The best conclusion we can draw from this, at the moment, is that someone willingly set Content Nation up.
“I’m not sure it even was CSAM,” Sascha writes in a private chat, “I never saw the pictures, as they had already been deleted. The data was already removed from the cache, and the original server was down, so it wasn’t refreshed [on Content Nation].”
“My flat could have been raided, and I would not have an electronic device left to write this,” he added.
As of this writing, Content Nation has turned off all Fediverse integrations, and Sascha has been turned off of having anything to do with the network after having this experience. He has been effectively bullied off the network.
How can we avoid this happening again?
Having researched this article and situation, I think there are several things that really, really need to change for the better. Operating in the modern Fediverse involves a long list of internal knowledge that’s not really written down anywhere. No part of the ActivityPub spec or Mastodon’s documentation talks about how to implement their special pieces, so that people writing new servers can be good actors.
As it stands today, no singular piece of Fediverse software includes instructions to load a “worst of the worst” blocklist when setting up an instance, or to put a Webfinger search form behind a login page. What seems like common sense to some people is literally a new concept to others.
Culturally, we need to accept that most people coming into the community for the first time are operating with a lack of prior knowledge. We can’t simply cross our arms and say “You should have known better”, and socially punish people, when in fact there was no way for them to learn about it.
https://wedistribute.org/2024/03/contentnation-mastodons-toxicity/
#CSAM #Harassment #TrustSafety
-
I was going to leave things at Part 3 blog-wise, and just get on with filling in the gaps in code now, but I’ve come back to add a few more notes. But this is likely to be the final part now.
Recall so far, I have:
- Part 1 where I work out how to build Synth_Dexed using the Pico SDK and get some sounds coming out.
- Part 2 where I take a detailed look at the performance, with a diversion into the workings of the pico_audio library and floating point maths on the Pico on the way.
- Part 3 where I managed to get up to 16-note polyphony, by overclocking, and some basic serial MIDI support.
This is building on the last part and includes notes on how I’ve implemented the following:
- Fuller MIDI support, including control change, program change and pitch bend messages.
- Voice and voice banks, selectable over MIDI.
- MIDI SysEx messages for voice parameters.
- USB MIDI device support.
The latest code can be found on GitHub here: https://github.com/diyelectromusic/picodexed
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
If you are new to microcontrollers, see the Getting Started pages.
MIDI Support
I’m not going to walk through all the details of how I’ve added MIDI but suffice to say that once again the implementation owes a lot to MiniDexed and the Arduino MIDI Library.
At the time of writing the following are all supported as they were already supported in Synth_Dexed, so I just needed to glue the bits together.
Channel Voice Messages (only channel 1 at present)
0x80  MIDI Note Off       note=0..127, vel=0..127
0x90  MIDI Note On        note=0..127, vel=0..127
0xA0  Channel Aftertouch  note=0..127, val=0..127
0xB0  Control Change      See below
0xC0  Program Change      0..31 (if used with BANKSEL); 0..127 (if used independently)
0xE0  Pitch Bend          0..16383 (in LSB/MSB 2×7-bit format)
Channel Control Change Messages
0    Bank Select (MSB)  0
1    Modulation         0..127
2    Breath Control     0..127
4    Foot Control       0..127
7    Channel Volume     0..127
32   Bank Select (LSB)  0..8
64   Sustain            <=63 Off, 64=> On
65   Portamento         <=63 Off, 64=> On
95   Master Tune        0..127 *
120  All Sound Off      0
123  All Notes Off      0
126  Mono Mode          0 **
127  Poly Mode          0
* There is a bug with the master tuning. It ought to accept -99 to 99 I believe, but only 0..99 will actually register and there is no way to send -99 via MIDI at the moment. I need to read up on what is going on here and what it ought to do!
** The Mono Mode parameter has the option for specifying how many of the playable voices can be dedicated to mono mode (at least I think that is what it is saying). I only support a value of 0 which I believe is meant to mean “all available voices”.
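The "LSB/MSB 2×7-bit format" in the Pitch Bend row of the Channel Voice Messages table is the standard MIDI encoding: two data bytes of seven bits each, least significant first, giving a 14-bit value with 8192 as the centre. A minimal sketch of the conversion follows; the function names are made up for illustration and this is not code taken from PicoDexed.

```python
PITCH_BEND_CENTRE = 8192  # 0x2000, i.e. LSB=0x00, MSB=0x40: no bend


def pitch_bend_value(lsb: int, msb: int) -> int:
    """Combine the two 7-bit data bytes of a Pitch Bend message into 0..16383."""
    if not (0 <= lsb <= 127 and 0 <= msb <= 127):
        raise ValueError("MIDI data bytes must be in the range 0..127")
    return (msb << 7) | lsb


def pitch_bend_bytes(value: int) -> tuple[int, int]:
    """Split a 14-bit pitch bend value back into (LSB, MSB) in wire order."""
    if not 0 <= value <= 16383:
        raise ValueError("pitch bend must be in the range 0..16383")
    return value & 0x7F, (value >> 7) & 0x7F
```

For example, the message bytes E0 00 40 carry LSB=0x00 and MSB=0x40, which combine to the centre value.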
System Messages
0xF0..0xF7  Start/End System Exclusive  See below
0xFE        Active Sensing              Filtered out
0xFn        Other system messages       Ignored
System Exclusive Messages
Any valid Yamaha (DX) system exclusive messages are passed straight into Synth_Dexed. A Yamaha (DX) message has the following format (see the “DX7IIFD/D Supplemental Booklet: Advanced MIDI Data and Charts”):
F0 - start SysEx message
43 - Yamaha manufacturer ID
sd - s=substatus (command class:0,1,2); d=device ID (0..F)
.. data ..
F7 - end SysEx message
The device ID can be set using the UI on a real DX7 to a value between 1 and 16, which becomes a value between 0 and 15 (0..F) as part of the SysEx message (see “DX7IIFD/D Supplemental Booklet: Advanced MIDI Applications, Section 8”). It is a System Exclusive value analogous to the MIDI channel for regular channel messages.
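The framing above can be checked with a few lines of code. This is only a sketch under the format just described (F0, 43, then one byte holding the substatus in the upper nibble and the device ID in the lower nibble); the class and function names are hypothetical and it does not reproduce how Synth_Dexed itself validates messages.

```python
from typing import NamedTuple


class YamahaSysEx(NamedTuple):
    substatus: int  # command class (0, 1 or 2)
    device: int     # device ID as sent on the wire, 0..15
    data: bytes     # everything between the "sd" byte and the closing F7


def parse_yamaha_sysex(msg: bytes) -> YamahaSysEx:
    """Split a complete Yamaha SysEx message into substatus, device ID and payload."""
    if len(msg) < 4 or msg[0] != 0xF0 or msg[-1] != 0xF7:
        raise ValueError("not a complete SysEx message")
    if msg[1] != 0x43:
        raise ValueError("not a Yamaha message (manufacturer ID is not 0x43)")
    sd = msg[2]
    return YamahaSysEx(substatus=(sd >> 4) & 0x07, device=sd & 0x0F, data=bytes(msg[3:-1]))
```

A device ID of 0 on the wire corresponds to device 1 as shown in the DX7 UI.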
There are a range of SysEx parameter settings that are passed on to Synth_Dexed as follows:
Mono Mode               0..1
Pitch Bend Range        0..12
Pitch Bend Step         0..12
Portamento Mode         0..1
Portamento Glissando    0..1
Portamento Time         0..99
Mod Wheel Range         0..99
Mod Wheel Target        0..7
Foot Control Range      0..99
Foot Control Target     0..7
Breath Control Range    0..99
Breath Control Target   0..7
Aftertouch Range        0..99
Aftertouch Target       0..7
Voice Dump Load         <156 bytes of voice data>
Voice Parameter Set     Parameter=0..155; Data=0..99
At this stage, all of the MIDI support is on a “it’s probably something like this” basis, so it will evolve as I find out what it is meant to be doing!
Voice and Bank Loading
Banks of voices are programmed directly into the code. There is a python script from Synth_Dexed that will take a .syx format voice bank and generate a block of C code. I’ve included a script to download the main 8 banks of standard DX voices and run the script:
#!/bin/sh
# Get voices from
# https://yamahablackboxes.com/collection/yamaha-dx7-synthesizer/patches/
mkdir -p voices
DIR="https://yamahablackboxes.com/patches/dx7/factory"
wget -c "${DIR}"/rom1a.syx -O voices/rom1a.syx
wget -c "${DIR}"/rom1b.syx -O voices/rom1b.syx
wget -c "${DIR}"/rom2a.syx -O voices/rom2a.syx
wget -c "${DIR}"/rom2b.syx -O voices/rom2b.syx
wget -c "${DIR}"/rom3a.syx -O voices/rom3a.syx
wget -c "${DIR}"/rom3b.syx -O voices/rom3b.syx
wget -c "${DIR}"/rom4a.syx -O voices/rom4a.syx
wget -c "${DIR}"/rom4b.syx -O voices/rom4b.syx
./synth_dexed/Synth_Dexed/tools/sysex2c.py voices/* > src/voices.h
This only needs to be run once to create the src/voices.h file which is then included in the build.
Voices have the following format:
uint8_t progmem_bank[8][32][128] PROGMEM =
{
  { // Bank 1
    {<--128 bytes of packed voice data-->}, // Voice 1
    ...
    {<--128 bytes of packed voice data-->}  // Voice 32
  },
  { // Bank 2
    ...
  },
  ...
  { // Bank 8
    {<--128 bytes of packed voice data-->}, // Voice 1
    ...
    {<--128 bytes of packed voice data-->}  // Voice 32
  }
};
The system assumes 8 banks of 32 voices each, in the “packed” SYX header format, meaning each voice consists of 128 bytes.
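For anyone wanting to inspect a bank before converting it, the following is a rough sketch of how the 32 packed voices can be pulled out of a .syx file. It assumes the usual DX7 32-voice bulk dump layout (a 6-byte header, 32 × 128 bytes of packed voices, one checksum byte and a closing F7, 4104 bytes in total) and that the voice name occupies the last 10 bytes of each packed voice. It is not the sysex2c.py script, the names are invented for illustration, and the checksum is deliberately not verified.

```python
VOICES_PER_BANK = 32
PACKED_VOICE_SIZE = 128
SYX_HEADER_SIZE = 6  # assumed header: F0 43 0n 09 20 00
SYX_BANK_SIZE = SYX_HEADER_SIZE + VOICES_PER_BANK * PACKED_VOICE_SIZE + 2  # + checksum + F7


def split_bank(syx: bytes) -> list[bytes]:
    """Return the 32 packed 128-byte voices from a 32-voice bulk dump (checksum not checked)."""
    if len(syx) != SYX_BANK_SIZE or syx[0] != 0xF0 or syx[1] != 0x43 or syx[-1] != 0xF7:
        raise ValueError("not a DX7 32-voice bank dump")
    body = syx[SYX_HEADER_SIZE:SYX_HEADER_SIZE + VOICES_PER_BANK * PACKED_VOICE_SIZE]
    return [body[i:i + PACKED_VOICE_SIZE] for i in range(0, len(body), PACKED_VOICE_SIZE)]


def voice_name(packed: bytes) -> str:
    """Read the 10-character name stored in bytes 118..127 of a packed voice."""
    return packed[118:128].decode("ascii", errors="replace").rstrip()
```

Typical use would be `split_bank(open("voices/rom1a.syx", "rb").read())` followed by `voice_name()` on each entry to list what is in the bank.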
MIDI Bank and Voice Selection
As there are only 8 banks, only BANKSEL (LSB) values 0..7 are valid. Program Change will work in two ways however:
- 0..31 will select voices 1 to 32 in the current bank.
- 32..127 will select voices from the following three adjacent banks.
To select any voice in all 8 banks thus requires the following sequence:
BANKSEL MSB = 0
BANKSEL LSB = 0..7
PROG CHANGE = 0..31
But if bank selection is skipped, then Program Change messages can still be used to select one of the first 128 voices across four consecutive banks.
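The two Program Change behaviours reduce to one piece of arithmetic: divide the program number by 32 to get a bank offset and a voice index. The sketch below shows that mapping; the function name is hypothetical, and what happens when the offset runs past bank 8 is not stated here, so the wrap-around in this sketch is purely an assumption.

```python
NUM_BANKS = 8
VOICES_PER_BANK = 32


def select_voice(current_bank: int, program: int) -> tuple[int, int]:
    """Map a Program Change value to zero-based (bank, voice) indices."""
    if not 0 <= current_bank < NUM_BANKS:
        raise ValueError("bank must be in the range 0..7")
    if not 0 <= program <= 127:
        raise ValueError("program must be in the range 0..127")
    # 0..31 stays in the current bank; 32..127 spills into the next three banks.
    bank_offset, voice = divmod(program, VOICES_PER_BANK)
    bank = (current_bank + bank_offset) % NUM_BANKS  # assumption: wrap rather than clamp
    return bank, voice
```

With no bank selected (bank 0), program 32 lands on the first voice of the second bank and program 127 on the last voice of the fourth.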
USB MIDI
The Raspberry Pi Pico SDK uses the TinyUSB protocol stack to implement USB device or host modes and there is an additional option to implement a second USB host port using the Pico’s PIO.
However, USB MIDI appears to only be supported for USB devices at the time of writing, so I’m just using the built-in USB port as a USB device, based on the code provided as part of the TinyUSB examples (more details of how to get basic USB MIDI running here).
TinyUSB MIDI supports two interfaces for reading data, and this wasn’t immediately obvious from the example as that is only sending data and ignores anything coming in.
- USB MIDI Stream mode: this will fill a provided buffer with MIDI data received over USB.
- USB MIDI Packet mode: this will return each 4-byte USB packet individually.
From what I can see of the USB MIDI Spec, all MIDI messages are turned into 4-byte packets for transferring over USB. All normal MIDI messages will consist of 1, 2 or 3 byte messages, and so will fit in a packet each – any unused bytes are padded with 0.
However SysEx messages are a little more complicated and have to be split across multiple packets.
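A USB MIDI Event Packet (see the “Universal Serial Bus Device Class Definition for MIDI Devices”, Release 1.0) is four bytes long: the first byte holds the cable number in its upper four bits and the code index number in its lower four bits, and the remaining three bytes hold the MIDI data.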
The code index number is an indication of the contents of each packet. For channel messages, this is basically a repeat of the MIDI command, so a MIDI Note On message might look something like the following:
09 92 3C 64
Cable 0
Code Index Number 9
MIDI Cmd 0x90 (Note On)
MIDI Channel 3 (0x0=1; 0x1=2; 0x2=3; ... 0xF=16)
Note 0x3C (60 = C4)
Velocity 0x64 (100)
But things get a little more complex with System Common or System Exclusive messages which have their own set of codes, depending on the chunking of the packets required.
The critical ones for SysEx are CIN=4,5,6,7 which correspond to SysEx start and then various versions of continuation or end packets. So a larger SysEx message might look something like the following:
04 F0 43 10 -- SysEx Start or Continuation
04 34 44 4D -- SysEx Start or Continuation
06 3E F7 00 -- SysEx End after two bytes
Complete message: F0 43 10 34 44 4D 3E F7
So, if I opt to use the packet interface to TinyUSB MIDI then I have to sort all this out myself in user code. However, the streaming interface will take care of all this for me and just return a buffer full of “traditional” MIDI messages.
Note that there is no concept of Running Status in USB MIDI. Even the oldest USB standard protocol speeds are an order of magnitude, or more, higher than serial MIDI so it isn’t necessary. Every MIDI message will either be a complete 1,2,3 byte message in a single USB packet, or a SysEx multi-packet message as described above.
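To illustrate what the packet interface would leave to user code, here is a small sketch that splits a 4-byte event packet into cable number, code index number and MIDI bytes, and then stitches a multi-packet SysEx message back together using the CIN=4,5,6,7 rules (three bytes and continue, or end after one, two or three bytes). The function names are invented for this example and it is not how TinyUSB or PicoDexed implement it.

```python
# Number of valid MIDI bytes carried by each SysEx-related code index number (CIN).
SYSEX_CIN_LENGTHS = {0x4: 3, 0x5: 1, 0x6: 2, 0x7: 3}


def decode_packet(packet: bytes) -> tuple[int, int, bytes]:
    """Split a 4-byte USB MIDI event packet into (cable, CIN, three MIDI bytes incl. padding)."""
    if len(packet) != 4:
        raise ValueError("a USB MIDI event packet is exactly 4 bytes")
    return packet[0] >> 4, packet[0] & 0x0F, bytes(packet[1:4])


def reassemble_sysex(packets: list[bytes]) -> bytes:
    """Join consecutive SysEx packets into one message, stopping at the first end packet."""
    message = bytearray()
    for packet in packets:
        _cable, cin, midi = decode_packet(packet)
        if cin not in SYSEX_CIN_LENGTHS:
            raise ValueError(f"CIN {cin:#x} is not a SysEx packet")
        message += midi[:SYSEX_CIN_LENGTHS[cin]]  # drop the zero padding
        if cin != 0x4:  # 5, 6 and 7 all mark the end of the message
            break
    return bytes(message)
```

Feeding it the Note On packet 09 92 3C 64 gives cable 0 and CIN 9, and the channel is the low nibble of the status byte plus one, so 0x92 is channel 3.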
The basic structure of the USB MIDI handler is as follows:
Init:
  Initialise TinyUSB MIDI stack
Process:
  Run the TinyUSB MIDI task
  IF TinyUSB says MIDI data available:
    Call the stream API to fill our RX buffer
  WHILE data in the RX buffer:
    Call the MIDIParser which reads from the RX buffer
    IF MIDI messages found:
      Call the MIDI Message Handler
Read:
  Grab the next byte from the RX buffer
I’ve actually split this over two files: usbmidi.cpp is the companion to serialmidi.cpp and provides the class that inherits from MIDIDevice (which provides the parser and message handler); usbtask.c provides the interface into the TinyUSB C driver code.
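The "WHILE data in the RX buffer" step boils down to cutting a byte stream into complete messages. As a rough illustration only (the real parser lives in the MIDIDevice class and is not reproduced here), the sketch below does this for a stream without Running Status, using the standard MIDI message lengths. The function names are hypothetical, and unlike the handler described earlier it does not filter out Active Sensing.

```python
def message_length(status: int) -> int:
    """Total length in bytes of a non-SysEx MIDI message that starts with this status byte."""
    if 0x80 <= status <= 0xEF:
        # Program Change (0xC0) and Channel Pressure (0xD0) carry one data byte, the rest two.
        return 2 if (status & 0xF0) in (0xC0, 0xD0) else 3
    if status in (0xF1, 0xF3):  # MTC Quarter Frame, Song Select
        return 2
    if status == 0xF2:          # Song Position Pointer
        return 3
    return 1                    # remaining system messages are a single byte


def split_stream(rx: bytes) -> list[bytes]:
    """Cut an RX buffer into complete MIDI messages; an incomplete tail is left unparsed."""
    messages = []
    i = 0
    while i < len(rx):
        status = rx[i]
        if status < 0x80:
            i += 1  # stray data byte with no status in front of it: skip
            continue
        if status == 0xF0:
            end = rx.find(b"\xF7", i)
            if end < 0:
                break  # SysEx not finished yet; a real handler would keep it for the next read
            length = end - i + 1
        else:
            length = message_length(status)
            if i + length > len(rx):
                break
        messages.append(bytes(rx[i:i + length]))
        i += length
    return messages
```

Each returned message would then be handed to whatever plays the role of the MIDI Message Handler.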
I haven’t done anything special with a USB manufacturer/vendor and device ID yet – so at some point I should see what TinyUSB is using by default and find something unique to PicoDexed (assuming I take it forward in any useful way).
Closing Thoughts
I have a fairly complete implementation now, which is quite nice. I do need to find some way to properly exercise the voice loading over SysEx and it would be good to get some idea of the performance when I throw a MIDI file at it over USB!
I’ve tested some of the parameter changes using the PC version of Dexed. When configured correctly, this can be used to send voice parameter changes to PicoDexed, but I haven’t found a way to download the entire voice as yet.
It’s a shame I can’t just plug in a USB MIDI controller and play it now, but I’ll work on some kind of interface board that should allow me to do it. It will need to be independently powered to act as a USB host anyway.
This is probably going to be my last blog post on PicoDexed for now, but I plan to keep tinkering away at the GitHub repository to see how things go. There are still a couple of limitations, the main one being that everything has to be hard-coded in at present. It would be nice to be able to have some kind of system configuration facility for the MIDI channel if nothing else.
At some point it would also be nice to have a build on the GitHub so others can try it too. And I still need to decide how best to manage the changes I needed to make to Synth_Dexed.
Kevin
https://diyelectromusic.wordpress.com/2024/02/16/raspberry-pi-pico-synth_dexed-part-4/
-
For a while I’ve wanted to get Synth_Dexed up and running on a Raspberry Pi Pico. This is the library written by Holger Wirtz for use with the Teensy microcontroller, and the core synth engine used for MiniDexed.
This is how I got everything up and running although it is too early to know if this is a worthwhile activity or not!
Note: this is just the very first set of tests to see if anything is even feasible, so don’t expect a playable, working synth! It is just making some hardcoded sounds for the time being.
And the performance isn’t what anyone would call stellar… I can currently manage 4-note polyphony if the sample rate is dropped. That is about it right now!
- Part 1: Building Synth_Dexed for the Pico.
- Part 2: Assessing the performance and analysis of the Pico audio library.
- Part 3: MIDI and some basic usability (finally).
- Part 4: More MIDI; Bank and Voice loading; SysEx support; and USB-MIDI.
- Part 5: Details of how to build the hardware.
This is based on ideas and information found from examining the following:
- Synth_Dexed library by Holger Wirtz.
- MiniDexed initial port and code by Rene Stange and Holger Wirtz based on an idea by probonopd discussed here: https://github.com/rsta2/circle/issues/274.
- Earle Philhower’s Raspberry Pi Pico Arduino core for all RP2040 boards.
- Chris Hockuba’s Raspberry Pi Pico version of CMSIS 5.
Reference material:
- Getting Started with Raspberry Pi Pico
- Raspberry Pi Pico C/C++ SDK
- cmake documentation – reference manual and learning resources
- CMSIS 5
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
If you are new to microcontrollers, see the Getting Started pages.
Intro and Hardware Requirements
The core aim is to be able to run an instance of Synth_Dexed on one of the cores of a Raspberry Pi Pico, feed it information that controls how it generates sound, and then play the samples back out over some kind of audio interface.
Eventually it should receive this information from MIDI or some kind of built-in interface. Sound output ideally would support an I2S audio DAC – I’ve started with the Pimoroni Pico Audio Pack – but I’d like to add PWM and possibly others.
It remains to be seen if the performance maps a single instance to a core or allows for several instances to be running concurrently and at what level of polyphony.
Other requirements that I’m chewing over:
- Polyphonic to some level.
- MIDI.
- Should allow for flexible separation of interface and synth engine.
- Should allow for several Picos to act together to provide multiple synth engines.
- Supports voice bank loading from the Pico’s onboard flash memory “disk”.
- Use the Synth_Dexed library with the minimum necessary changes – ideally none!
Getting started and experimenting will eventually require the following hardware:
- Raspberry Pi Pico.
- Means of getting audio out – initially it’s only doing I2S, but PWM could be added.
- Eventually some means of getting control signals in (USB or serial MIDI or hard-coded).
The following GPIO pins are being used:
- I2S: I2S_DATA = GP9; I2S_CLOCK = GP10
- Debug UART: TX = GP0; RX = GP1
- MIDI (eventually): TX = GP4; RX = GP5
Note that the Pimoroni Audio Pack also makes use of GP22 as a mute switch, but I’m not using that.
The Environment
I set up the Raspberry Pi C/C++ SDK according to my previous set of instructions from here: Getting Started with the Raspberry Pi Pico C/C++ SDK and TinyUSB MIDI.
Once again I’m doing all this in the Ubuntu virtual machine.
I then created a picodexed project with the following structure:
picodexed
  +--- build
  +--- cmsis
  +--- src
  +--- synth_dexed
I’m not going through the whole discovery process of how I got to this point, but here are a few notes of things I learned on the way in case I need to refer back to them!
CMSIS is required to support some of the ARM DSP functions used by Synth_Dexed, but not all of it is necessary. After hunting around for how to build this into the Raspberry Pi Pico (the SDK only has some very basic interface definitions, not the whole proper thing) I eventually stumbled across Chris Hockuba’s Raspberry Pi Pico CMSIS – cmsis-pi-pico – on Gitlab so I cloned that into my cmsis area and used that.
A really useful clue as to what is required to get Synth_Dexed running on a different architecture to the original Teensy can be found from examining the following files created as part of the MiniDexed discussion:
- Synth_Dexed/src/synth.h – contains the following useful functions/macros required by the library, but re-implemented for circle: constrain, signed_saturate_rshift, millis.
- MiniDexed/src/Synth_Dexed.mk – contains a list of what has to be built to create the synth engine, including which bits of CMSIS are required.
- MiniDexed/src/dexedadaptor.h – a shallow wrapper around Dexed to control access to its key functions.
- Synth_Dexed/examples/SimplePlay – a sample application of Dexed built for the Teensy using its Audio libraries.
In terms of approach, I basically created the most basic main.cpp I could and created CMakeLists.txt files for the main application and the Synth_Dexed library and then just kept trying to build it to see what was still broken.
I had a frustrating diversion for a while, not noticing that I’d created a main.c and not a main.cpp and that when it included dexed.h from Synth_Dexed it couldn’t find <cstdlib> – the C++ standard library… I spent ages trying to work out why my compiler couldn’t find its own libraries…
One thing that appeared to be missing totally though, was a definition of boolean. Now I could have gone through Synth_Dexed changing boolean to bool, but in the end I implemented a “filler” or “wrapper” header file with everything extra that Synth_Dexed needed but was missing – dexed_if_common.h – and stuck the following in there:
typedef bool boolean;
So far, the only change I’ve had to make to the library itself is to add the following line into Synth_Dexed/src/dexed.h and then make sure the path for include files can find it during compilation:
#include "dexed_if_common.h"
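For illustration, a shim header of this kind might look something like the sketch below. Only the boolean typedef is confirmed by the text; the constrain() helper is an assumption, modelled on the Arduino-style function that MiniDexed is said to re-implement, and the real dexed_if_common.h may well differ.

```cpp
// dexed_if_common.h (sketch; only the typedef is confirmed by the text)
#ifndef DEXED_IF_COMMON_SKETCH_H
#define DEXED_IF_COMMON_SKETCH_H

#include <cstdint>

// Arduino-style alias that Synth_Dexed uses but the Pico SDK does not define.
typedef bool boolean;

// Assumption: an Arduino-style clamp, written as a template instead of a macro
// so that its arguments are only evaluated once.
template <typename T>
static inline T constrain(T value, T low, T high) {
    return (value < low) ? low : (value > high) ? high : value;
}

#endif  // DEXED_IF_COMMON_SKETCH_H
```

Collecting these shims in one header keeps the change to the library down to a single #include line.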
I still haven’t quite got to grips with the whole cmake infrastructure required for building Raspberry Pi Pico projects, so the CMakeLists.txt files I’ve created are almost certainly sub-optimal at present. I still need to get my head around PRIVATE, PUBLIC, INTERFACE qualifiers for example and probably have more directories defined for INCLUDE paths than is strictly necessary.
But it builds and everything I have so far can be found on GitHub. This includes a simple script to take the basic repository and add in CMSIS and Synth_Dexed and hack in that single change to dexed.h. This means that to build what I have so far requires the following:
- Install the Pico C/C++ SDK, toolchain and libraries (as I described here) including setting PICO_SDK_PATH to the location of the Pico SDK installation.
- Clone the picodexed repository.
- Run the getsubmod.sh script (only run this once for a fresh repository).
- Then build:
kevin@ubuntu:~/src/picodexed$ cd build
kevin@ubuntu:~/src/picodexed/build$ cmake ..
kevin@ubuntu:~/src/picodexed/build$ make
kevin@ubuntu:~/src/picodexed/build$
Adding Audio
The Pico has no in-built audio capability, but there is an audio library that can be found in the “pico_extras” repository that supports I2S audio interfaces or PWM audio output on GPIO pins, both implemented using the Pico’s PIO subsystem.
There is an example in the “pico_playground” that shows how to output a sine wave and Pimoroni also have a sample mini-synth application for their I2S Pico Audio Pack.
Links:
- https://github.com/raspberrypi/pico-extras
- https://github.com/raspberrypi/pico-playground
- https://github.com/pimoroni/pimoroni-pico/tree/main/examples/pico_audio
In order to build for use with the pico_audio library, the pico_extras repository needs to be cloned into the source area and the location specified using PICO_EXTRAS_PATH.
There is another cmake file that can then be copied into your own project area to include the library.
cp $PICO_EXTRAS_PATH/external/pico_extras_import.cmake .
And then it can be included in the project’s CMakeLists.txt file, pico_audio_i2s can be added to the list of link libraries, and a compilation definition added to define USE_AUDIO_I2S.
include(pico_extras_import.cmake)
target_link_libraries(picodexed PUBLIC synth_dexed pico_stdlib tinyusb_device tinyusb_board pico_audio_i2s)
target_compile_definitions(picodexed PRIVATE
  USE_AUDIO_I2S=1
)
The actual examples for audio output are quite complicated and so far the documentation I’ve found for the pico_audio libraries is pretty minimal.
The Pimoroni build has abstracted most of the functionality out into a separate audio.hpp file with the following interface:
// init_audio initialises the audio library and returns a pointer
// to an object representing the audio buffers.
struct audio_buffer_pool *init_audio(uint32_t sample_rate, uint8_t pin_data, uint8_t pin_bclk, uint8_t pio_sm=0, uint8_t dma_ch=0);
// update_buffer will fill the audio buffer with samples returned
// by the provided callback function.
void update_buffer(struct audio_buffer_pool *ap, buffer_callback cb);
The callback function that is meant to provide samples to the audio subsystem has the following definition:
int16_t getNextSample (void);
Dexed has a getSamples function that can fill a buffer with the next set of samples in either float or int16_t format, but the callback function only returns a single sample at a time.
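To make the mismatch concrete, here is a minimal sketch of how a block-based generator could be squeezed behind a one-sample-at-a-time callback. It is not taken from the post or from the Pimoroni code: renderBlock, kBlockSize and the staging buffer are hypothetical stand-ins, and the "engine" is just a rising counter so the ordering is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a block-based engine such as Dexed: each call
// writes `count` consecutive samples. Here the "audio" is a rising counter.
static int16_t g_next_value = 0;

void renderBlock(int16_t *out, std::size_t count) {
    assert(out != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = g_next_value++;
    }
}

// Assumed block size for the staging buffer (hypothetical choice).
constexpr std::size_t kBlockSize = 128;
static int16_t g_block[kBlockSize];
// Starting at kBlockSize marks the staging buffer as empty, forcing a refill
// on the first call.
static std::size_t g_read_pos = kBlockSize;

// Same shape as the per-sample callback: int16_t getNextSample(void).
int16_t getNextSample(void) {
    if (g_read_pos == kBlockSize) {
        renderBlock(g_block, kBlockSize);  // render one whole block at once
        g_read_pos = 0;
    }
    return g_block[g_read_pos++];          // hand out one sample per call
}
```

Calling getNextSample() repeatedly returns the generated samples in order across block boundaries, at the price of an extra copy and a branch per sample.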
Rather than attempt to make that fit (I did start off that way) I wrote my own “update_buffer” function to fill the Dexed buffer:
void fillSampleBuffer(struct audio_buffer_pool *ap) {
    // Block until the audio library hands back a free buffer.
    struct audio_buffer *buffer = take_audio_buffer(ap, true);
    int16_t *samples = (int16_t *) buffer->buffer->bytes;
    // Let Dexed render a whole buffer of samples in one call.
    dexed.getSamples(samples, buffer->max_sample_count);
    buffer->sample_count = buffer->max_sample_count;
    // Queue the filled buffer for output.
    give_audio_buffer(ap, buffer);
}
One slight complication is that the Dexed getSamples function is a protected function. To interface into Dexed I’ve thus created a Dexed Adaptor based on the MiniDexed dexedadaptor. I don’t know how many of these functions will really need to be a wrapper around Dexed, but the capability is there if required.
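The adaptor idea can be sketched in a few lines. This is only an illustration of the C++ mechanism, not the actual Dexed or MiniDexed code: Engine and EngineAdaptor are hypothetical names, and the stand-in engine just writes a constant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the synth engine: sample generation is protected,
// so code outside the class hierarchy cannot call it directly.
class Engine {
protected:
    void getSamples(int16_t *buffer, std::size_t count) {
        assert(buffer != nullptr);
        for (std::size_t i = 0; i < count; ++i) {
            buffer[i] = 1000;  // placeholder "audio"
        }
    }
};

// The adaptor derives from the engine and re-exposes the protected call
// through a public wrapper with the same name.
class EngineAdaptor : public Engine {
public:
    void getSamples(int16_t *buffer, std::size_t count) {
        Engine::getSamples(buffer, count);
    }
};
```

With this in place, calling getSamples on an EngineAdaptor compiles, whereas the same call on a plain Engine would be rejected. A public using-declaration (using Engine::getSamples;) would do the same job; an explicit wrapper leaves room to add locking or logging later.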
Typically with embedded audio applications, there is a regular “internal buffer out to audio hardware” layer that has to run at the sample rate and be reliably regular with its timing. This would usually be driven from a timer interrupt or similar. Then there will be a non-timing-critical part of code responsible for filling the buffer itself with samples to be played.
The Pico’s audio library is quite different. Unfortunately, between the complex C buffer management code, the use of the Pico’s DMA and the use of the PIO hardware for the I2S protocol itself, I can’t quite see what the timing expectations of the pico_audio library are or how the work should be split up. From what I can gather, the basic idea is as follows:
- PIO is automatically pulling data from a set of buffers to send over I2S.
- DMA is providing that data somehow at a rate independent of the CPU, but maybe driven by interrupts linked to data requests and the PIO somehow.
- All I have to do is keep the buffer that can be accessed by the DMA full of audio samples and magic “just happens”.
So for the time being, I’m just calling my update_buffer equivalent, fillSampleBuffer, in a relatively fast loop.
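As a mental model of that loop, here is a small desktop simulation of a buffer pool with a "free" queue and a "ready" queue. It is a sketch of the general producer/consumer pattern only, not the pico_audio implementation: BufferPool, takeFree, giveFilled, consumeOne and fillOne are all hypothetical names, and consumeOne stands in for whatever the DMA and PIO do in hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

struct Buffer {
    std::vector<int16_t> samples;
    std::size_t sample_count = 0;
};

class BufferPool {
public:
    BufferPool(std::size_t buffers, std::size_t samples_per_buffer)
        : storage_(buffers) {
        for (auto &b : storage_) {
            b.samples.resize(samples_per_buffer);
            free_.push_back(&b);  // every buffer starts out free
        }
    }
    // Producer side: non-blocking here, returns nullptr when nothing is free.
    Buffer *takeFree() {
        if (free_.empty()) return nullptr;
        Buffer *b = free_.front();
        free_.pop_front();
        return b;
    }
    void giveFilled(Buffer *b) {
        assert(b != nullptr);
        ready_.push_back(b);
    }
    // Consumer side: models the hardware finishing one buffer and freeing it.
    bool consumeOne() {
        if (ready_.empty()) return false;  // underrun: nothing to play
        Buffer *b = ready_.front();
        ready_.pop_front();
        b->sample_count = 0;
        free_.push_back(b);
        return true;
    }
    std::size_t readyCount() const { return ready_.size(); }

private:
    std::vector<Buffer> storage_;  // never resized, so pointers stay valid
    std::deque<Buffer *> free_;
    std::deque<Buffer *> ready_;
};

// One pass of the main loop: fill a buffer if one is free, otherwise skip.
bool fillOne(BufferPool &pool, int16_t value) {
    Buffer *b = pool.takeFree();
    if (b == nullptr) return false;
    for (auto &s : b->samples) s = value;
    b->sample_count = b->samples.size();
    pool.giveFilled(b);
    return true;
}
```

With a pool of three buffers, three calls to fillOne succeed and a fourth fails until consumeOne releases a buffer. The loop therefore only has to keep the ready queue from running dry; it does not need precise timing of its own.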
Unfortunately my first attempts all resulted in some kind of horrid growl. But it did turn on and off when the Dexed.keydown and Dexed.keyup calls were made, so I was making progress!
Things that eventually fixed that:
- I updated the init_audio function to allow me to pass in the sample buffer size and chose a buffer size of 128 and then ensured this was set to the same in Dexed itself.
- I saw the sine example was using a sample rate of 24000 for I2S so I used that sample rate here too.
- The thing that eventually fixed everything was realising that Dexed is returning a mono stream of data but the default I2S interface is assuming a stereo set of samples. Setting the following in CMakeLists.txt told the I2S library to use mono:
target_compile_definitions(picodexed PRIVATE
    PICO_AUDIO_I2S_MONO_OUTPUT=1
    PICO_AUDIO_I2S_MONO_INPUT=1
    USE_AUDIO_I2S=1
)
The sine example seems to get away with only defining MONO_OUTPUT, but I was getting compilation errors, so I set MONO_INPUT too, even though I’m not using I2S input.
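For comparison, the other way to reconcile a mono source with a stereo consumer is to duplicate each sample into a left/right pair. The post did not do this (it changed the compile definitions instead); the sketch below is only there to show what an interleaved stereo buffer looks like, and monoToInterleavedStereo is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand `frames` mono samples into interleaved stereo: L0 R0 L1 R1 ...
// with the same value on both channels.
std::vector<int16_t> monoToInterleavedStereo(const int16_t *mono,
                                             std::size_t frames) {
    assert(mono != nullptr || frames == 0);
    std::vector<int16_t> stereo(frames * 2);
    for (std::size_t i = 0; i < frames; ++i) {
        stereo[2 * i] = mono[i];      // left channel
        stereo[2 * i + 1] = mono[i];  // right channel
    }
    return stereo;
}
```

If a consumer expects this layout but is handed plain mono data, each pair of consecutive mono samples is presumably read as one left/right frame, which would be consistent with the distorted output described above. Duplicating samples costs twice the buffer memory, so telling the library about the mono format is the cheaper fix.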
At this point I was getting a nice note playing via Dexed, so I grabbed the voice parameters for Brass 1 out of MiniDexed and dropped them into an array here and with a call to Dexed.loadVoiceParameters, I have a Raspberry Pi Pico playing the Dexed Brass 1 sound!
In terms of sample rates and polyphony, well, things are a little basic! I can currently achieve the following:
- 2-note polyphony at 44100.
- 4-note polyphony at 24000.
- 6-note polyphony, possibly if you squint at it and ask it really nicely, at 12000.
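One way to read those figures is as a time budget: with the 128-sample buffer mentioned above, each buffer lasts about 2.9 ms at 44100, 5.3 ms at 24000 and 10.7 ms at 12000, and all active voices have to be rendered within that time. The helpers below are a sketch of that arithmetic; the names are hypothetical, and the per-voice figure assumes the cost scales purely with the number of voices, which is a simplification.

```cpp
#include <cassert>
#include <cstdint>

// Time, in microseconds, that one buffer of samples lasts at a given rate.
double bufferPeriodUs(uint32_t samples_per_buffer, uint32_t sample_rate_hz) {
    assert(sample_rate_hz != 0);
    return 1000000.0 * samples_per_buffer / sample_rate_hz;
}

// Rough share of that period available to each voice, assuming (as a
// simplification) that rendering cost is split evenly across voices.
double perVoiceBudgetUs(uint32_t samples_per_buffer, uint32_t sample_rate_hz,
                        uint32_t voices) {
    assert(voices != 0);
    return bufferPeriodUs(samples_per_buffer, sample_rate_hz) / voices;
}
```

Under that assumption, the three configurations work out to roughly 1.3 to 1.8 ms per voice per buffer, which suggests the engine is the limiting factor and not the audio output path.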
So it isn’t going to win any prizes for quality or quantity, but this feels like a good point to push this post out and stick the progress up on GitHub.
Next, I want some simple MIDI control to act as a real synth. Other things on the “todo” list:
- See what scope there is for optimisation to increase the polyphony, ideally at 44.1kHz.
- Specifically look into how it is currently doing the floating point maths.
- Investigate the loading of the core to see how the performance is looking more generally.
- Get some additional voices in there!
Removing Dependency on CMSIS
It turns out that only a few functions are required from the CMSIS/DSP area, so in order to examine them more closely I created my own version of arm_math.h with a corresponding arm_math.c containing the following functions:
void arm_float_to_q15()
void arm_fill_f32()
void arm_sub_f32()
void arm_scale_f32()
void arm_offset_f32()
void arm_mult_f32()
void arm_biquad_cascade_df1_f32()
void arm_biquad_cascade_df1_init_f32()
with any additional definitions they require. Turns out it isn’t very many. The only thing left is to mirror how Synth_Dexed builds for the Teensy and ensure that the compressor isn’t included. This requires some conditional compilation in dexed.h, dexed.cpp and compressor.h.
There doesn’t appear to be a single obvious way to spot a build for a Raspberry Pi Pico, but there is a definition of RASPBERRYPI_PICO in <boards/pico.h> so I used that.
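To give a feel for how small these helpers are, here is a plain C++ sketch of three of them. It is not the CMSIS source and not necessarily what the arm_math.c in the repository contains: the functions live in a hypothetical sketch namespace, use standard types in place of the CMSIS typedefs, and the float-to-Q15 conversion truncates, which may differ from the rounding used elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Convert floats in [-1.0, 1.0) to Q15: scale by 32768 and saturate to the
// signed 16-bit range. NaN inputs are not handled in this sketch.
void float_to_q15(const float *src, int16_t *dst, std::size_t count) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        float scaled = src[i] * 32768.0f;
        if (scaled > 32767.0f) scaled = 32767.0f;
        if (scaled < -32768.0f) scaled = -32768.0f;
        dst[i] = static_cast<int16_t>(scaled);  // truncates toward zero
    }
}

// Fill a float block with a constant value.
void fill_f32(float value, float *dst, std::size_t count) {
    assert(dst != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = value;
    }
}

// Multiply every element by a scalar; src and dst may be the same block.
void scale_f32(const float *src, float scale, float *dst, std::size_t count) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i] * scale;
    }
}

}  // namespace sketch
```

Feeding 0.0, 0.5, 1.0, -1.0 and -2.0 through float_to_q15 gives 0, 16384, 32767, -32768 and -32768, which shows the saturation at both ends of the range.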
Find it on GitHub
The current state of progress can be found on GitHub here: https://github.com/diyelectromusic/picodexed
Closing Thoughts
I first wanted to do this when I first installed the Pico SDK back in August 2022! Yes it really has taken me that long to even attempt it. Mostly because my C is still quite rusty; my C++ is very much “learn as you go”; I’ve not worked with a CMSIS ARM embedded system in detail before; and the cmake infrastructure for a Raspberry Pi Pico seems so very complicated for what it is.
But you’ve got to start somewhere as they say, and I learn best from having a reason to do something. So even if this doesn’t amount to anything, it is finally making me learn about all these things in a useful way that will hopefully end up with something I’ll have some fun playing with.
The big limitation might be floating point maths. The Pico does have some built-in fast floating point routines, but I don’t know if any of these are enabled and running at the moment. I suspect the CMSIS DSP code is probably doing “soft” floating point, so that is something to look into. But if it is already using the faster routines, there may be little left to gain, and the polyphony may well stay so low that it isn’t particularly practical to use. To be continued…
But once again, my renewed respect to all the above-mentioned people who have essentially already done most of the hard work for projects like this that allow someone like me to bumble along and join the various bits together!
By the way, seeing as MiniDexed exists for the Raspberry Pi and MicroDexed exists for the Teensy, PicoDexed seemed the obvious choice!
Kevin
https://diyelectromusic.wordpress.com/2024/01/09/raspberry-pi-pico-synth_dexed/
-
A native path out of the mess people make on the #openweb
The Open Governance Body (#OGB) describes a permissionless process/structure that is open and allows the group that forms using the tools to decide who is a part of the group or not. This process can divide into a web of connecting instances of governance as a natural human process of group formation. The #OGB emphasizes that there is no exclusion and always diversity, making it a natural fit for the #fediverse.
The #OGB also shows that if people are stupid and focused on individualism, each governance instance will have one member and no power. To gain power, people have to work together, which is built into the code. The #OGB emphasizes that hoarding power is limited, and it flows through the community, energizing and solidifying the community and building horizontal power to challenge/change vertical power.
The #OGB focuses on keeping things simple (#KISS) and accepts that some people will try to push for existing power structures before democracy. As the process is permissionless, it is not possible to stop them from doing this. The #OGB emphasizes the need to do better, and being native to the #fediverse is a big help in this regard.
The #OGB emphasizes the importance of recognizing where power comes from in the context of the #fediverse. The fediverse operates differently from corporations, governments, courts, and police, and it is important to think and build with this difference rather than trying to drag the fediverse back to the #mainstreaming path.
The #OGB builds on the fact that the #fediverse works because it is different, and it is easy to forget this important thing when #mainstreaming agendas grab and hold. The #OGB suggests that the missing question in almost all conversations is “who are we empowering,” and emphasizes the need to do better in alt-tech.
The #OGB notes that there are problems in alt-tech and suggests that starting with the #4opens would remove 90% of the mess, revealing the real potential for good outcomes. The #OGB highlights that doing better in alt-tech would involve using shovels to make compost and planting seeds of the world we want to see.
The #OGB describes the process scaffolding for the governance body as a default effect, where the decisions on how things work will be up to the members of the body. The power of the governance body is only the power of default, and the #OGB is about removing all hard default choices and building in a small number of KISS tools, then letting the body members work out themselves how to use them.
The #OGB uses the example of #Couchsurfing, where the website redesign removed the #DIY tools active Couchsurfers had used to self-organize, leading to disappointment among members. The #OGB argues that letting members make their own process, open vs. closed, is necessary to overcome the #geekproblem and have hope for alt-tech.
The #OGB builds governance from the way rules, norms, and actions are structured, sustained, regulated, and held accountable. This is to mediate the fact that the #Fediverse currently has a “herding cats” governance: a futile attempt to control or organize a class of entities that are inherently uncontrollable.
The #OGB codebase is not just a tool for the #Fediverse, but it can be used to democratically run any structures that have stakeholders.
The #OGB provides an example of how the codebase can be used to run a local street market, with each stallholder as a stakeholder, people who shop at the market as users, and the local council, events company, and shop owner’s association as affiliate groups. The #OGB approach and codebase will scale sideways, with street markets governed city-wide, and each of the markets becoming a stakeholder, users as users, and city-wide orgs and groups as affiliate groups.
The #OGB shaping of the “body” comes from a long history/experience of horizontal activism, where “those who do the work have more say.” https://www.noisebridge.net/wiki/Do-ocracy
The #OGB pushes that the bulk of the voice comes from those who run the #Fediverse, the people who run/support the instances. The people who build the tools also get a say, as do support orgs and events, and the users who will be spread widely get a say, but their power is diluted by the much larger numbers involved.
This working practice comes from 30 years of building from The Tyranny of Structurelessness tick box list https://unite.openworlds.info/Open-Media-Network/openwebgovernancebody/wiki/03.-The-Tyranny-of-Stucturelessness Code being quite “anti-human” is an interesting challenge, and it’s important to figure out how to get the humane “mess” into a coding process that is based on being “exact” and in control #OGB
The #OGB project is grounded in lived experience, and it’s a way out of this mess. We cannot keep using traditional institutions. We have to stop the #techchurn if we are going to use #openweb tech for social/ecological change/challenge, and we need to think about this now.
The #OGB project is about developing better ways of having “trust” based conversations and “trust” based “governance” in the #openweb. The project is built from hundreds of years of on-the-ground organizing that has shaped every “freedom” we enjoy and is done in a #KISS approach. The #OGB is a #fediverse native way of working, NOT a #mainstreaming way, and it comes from directly working, setting up, and solving recurring problems at hundreds of direct action protest camps.
The #OGB focuses on what we know works, as at the moment, almost nothing works for social good. The #OGB project is what is needed, a voluntary cooperative and collaborative alliance that is native to the #fediverse.
The thinking is that we need to put a stop to the #techchurn as we have piles of #techshit already to compost; #nothingnew is the hashtag for this.
It’s not the goal of the #OGB project to create an organization that tells everyone what protocols and standards to use in the #fediverse. The #OGB project is about developing better ways of having good “trust” based conversations and “trust” based “governance” in the #openweb.
To sum up, the current working models of “governance” in open-source projects are monarchy, aristocracy and oligarchy. This is the rock star developer, the coders and the funders. It should be obvious to anyone that 99.99% of people are missing from this feudalistic ideal of “governance”.
Democracy is the basic foundation of our shared modernity.
WHY DO WE PUT UP WITH THIS MESS IN TECH?
Let’s take a different path, please #OGB
Q. that is an optimistic projection
A. I have no illusions: the normal shitty behaviour of fucking people over and being a prat will happen, but the codebase is designed to mediate this crap behaviour for better outcomes 🙂
#OGB “permissionless” is an important word that needs some thought. The body is made up of three different, balanced groups: stakeholders, users, and affiliate stakeholders. Anybody can become a stakeholder by setting up and running an active instance, and users are self-explanatory. Affiliate stakeholders are a little more complex and are treated differently, and it’s up to the body itself to decide if they play an active and useful role.
Nothing in this is top-down, elitist, discriminatory, or undemocratic, and it’s #KISS and looks safe to the “normal world” while being native to the #fediverse and its roots. All the coding is #4opens, based on #activertypub.
With #OGB, it’s important not to get lost in the #processgeeks and their dogmatic love of #formalconsensus, as that’s a dead end and has been for the 30 years of activism and coding tech. It’s important to keep the #OGB both #KISS and human, understandable. The #OGB is native “governance” and federates in the same way as the projects it “governs”. This approach is counterintuitive to mainstream ideas and “common sense,” but that’s not necessarily a bad thing.
This approach has worked to some extent, as seen in the “#Fediverse” as a living example, working to scale small to bigger. There will be lots of “smoke,” and help is needed to keep the project clear of this mess. We have to overcome our #stupidindividualism to have a hope of a better world.
#OGB A reminder that the need for “governance” came out of a practical problem: the #activitypub community is made up of “cats” who were doing seminar outreach to powerful #EU Eurocrats on why they should be interested in #activertypub. The #OGB is designed to be messy and not tidy; it’s a “governance” of a disorganization, not a traditional power structure. This “governance” can cooperate with more formal models of governance like traditional cooperatives.
-
A native path out of the mess people make on the #openweb
The Open Governance Body (#OGB) is a permissionless, open process/structure: the group that forms by using the tools decides who is part of the group and who is not. This process can divide into a web of connected governance instances, following the natural human process of group formation. The #OGB emphasizes that there is no exclusion and always diversity, making it a natural fit for the #fediverse.
The #OGB also shows that if people are stupid and focused on individualism, each governance instance will have one member and no power. To gain power, people have to work together, which is built into the code. The #OGB emphasizes that the hoarding of power is limited and that power flows through the community, energizing and solidifying it and building horizontal power to challenge/change vertical power.
The #OGB focuses on keeping things simple (#KISS) and accepts that some people will try to push existing power structures ahead of democracy. As the process is permissionless, it is not possible to stop them from doing this. The #OGB emphasizes the need to do better, and being native to the #fediverse is a big help in this regard.
The #OGB emphasizes the importance of recognizing where power comes from in the context of the #fediverse. The fediverse operates differently from corporations, governments, courts, and police, and it is important to think and build with this difference rather than trying to drag the fediverse back to the #mainstreaming path.
The #OGB builds on the fact that the #fediverse works because it is different, and it is easy to forget this when #mainstreaming agendas grab and hold. The #OGB suggests that the missing question in almost all conversations is “who are we empowering,” and emphasizes the need to do better in alt-tech.
The #OGB notes that there are problems in alt-tech and suggests that starting with the #4opens would remove 90% of the mess, revealing the real potential for good outcomes. The #OGB highlights that doing better in alt-tech would involve using shovels to make compost and planting seeds of the world we want to see.
The #OGB describes the process scaffolding for the governance body as a default effect, where the decisions on how things work are up to the members of the body. The power of the governance body is only the power of default, and the #OGB is about removing all hard default choices and building in a small number of #KISS tools, then letting the body members work out for themselves how to use them.
The #OGB uses the example of #Couchsurfing, where the website redesign removed the #DIY tools active Couchsurfers had used to self-organize, leading to disappointment among members. The #OGB argues that letting members make their own process, open vs. closed, is necessary to overcome the #geekproblem and have hope for alt-tech.
The #OGB builds governance from the way rules, norms, and actions are structured, sustained, regulated, and held accountable. This is to mediate the fact that the #Fediverse currently has a “herding cats” governance: a futile attempt to control or organize a class of entities that are inherently uncontrollable.
The #OGB codebase is not just a tool for the #Fediverse; it can be used to democratically run any structure that has stakeholders.
The #OGB provides an example of how the codebase can be used to run a local street market, with each stallholder as a stakeholder, people who shop at the market as users, and the local council, events company, and shop owners’ association as affiliate groups. The #OGB approach and codebase will scale sideways, with street markets governed city-wide, and each of the markets becoming a stakeholder, users as users, and city-wide orgs and groups as affiliate groups.
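The street market example can be pictured as a small data model. What follows is only a rough sketch, not the #OGB codebase: the class name `Body`, its fields, and the helper functions are all hypothetical, chosen to show one way a market-level body could itself become a stakeholder in a city-wide body.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Body:
    """One governance body. All names here are hypothetical, not from the #OGB code."""
    name: str
    # A stakeholder is either a named participant or a whole nested body.
    stakeholders: List[Union["Body", str]] = field(default_factory=list)
    users: List[str] = field(default_factory=list)
    affiliates: List[str] = field(default_factory=list)


def stakeholder_names(body: Body) -> List[str]:
    """Names of the direct stakeholders, whether participants or nested bodies."""
    return [s.name if isinstance(s, Body) else s for s in body.stakeholders]


def all_users(body: Body) -> List[str]:
    """Users of this body plus the users of every nested stakeholder body."""
    found = list(body.users)
    for s in body.stakeholders:
        if isinstance(s, Body):
            found.extend(all_users(s))
    return found


# Two local markets: stallholders are stakeholders, shoppers are users.
market_a = Body("Market A", stakeholders=["stall-1", "stall-2"],
                users=["shopper-1"], affiliates=["local council"])
market_b = Body("Market B", stakeholders=["stall-3"],
                users=["shopper-2", "shopper-3"], affiliates=["events company"])

# Scaling sideways: each market becomes a stakeholder of the city-wide body.
city = Body("City markets", stakeholders=[market_a, market_b],
            affiliates=["city-wide traders group"])
```

Here `stakeholder_names(city)` lists the two markets, and `all_users(city)` gathers the shoppers from both, which is the "users as users" part of the example. How the real codebase stores or federates these relationships is not stated in the text above.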
The #OGB shaping of the “body” comes from a long history/experience of horizontal activism, where “those who do the work have more say.” https://www.noisebridge.net/wiki/Do-ocracy
The #OGB holds that the bulk of the voice should come from those who run the #Fediverse: the people who run/support the instances. The people who build the tools also get a say, as do support orgs and events. The users, who will be spread widely, get a say too, but their power is diluted by the much larger numbers involved.
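The dilution effect is simple arithmetic, and a toy calculation makes it concrete. This sketch assumes each group holds an equal share of the body's voice (one reading of "balanced") and that the share is split evenly among the group's members. Neither rule is confirmed by the text, and the function name and numbers are made up for illustration.

```python
from fractions import Fraction
from typing import Dict


def per_member_voice(group_sizes: Dict[str, int]) -> Dict[str, Fraction]:
    """Voice of one member in each group.

    Assumption (not from the #OGB code): every non-empty group gets an
    equal share of the whole, divided evenly among its members.
    """
    active = {g: n for g, n in group_sizes.items() if n > 0}
    group_share = Fraction(1, len(active))
    return {g: group_share / n for g, n in active.items()}


# Hypothetical sizes: 20 people running instances, 5000 users.
voice = per_member_voice({"instance_runners": 20, "users": 5000})
ratio = voice["instance_runners"] / voice["users"]
```

With these made-up sizes, one instance runner carries 250 times the voice of one user, even though the two groups as wholes are balanced. Any real weighting is for the body itself to decide.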
This working practice comes from 30 years of building from the Tyranny of Structurelessness tick-box list: https://unite.openworlds.info/Open-Media-Network/openwebgovernancebody/wiki/03.-The-Tyranny-of-Stucturelessness Code being quite “anti-human” is an interesting challenge, and it’s important to figure out how to get the humane “mess” into a coding process that is based on being “exact” and in control. #OGB
The #OGB project is grounded in lived experience, and it’s a way out of this mess. We cannot keep using traditional institutions. We have to stop the #techchurn if we are going to use #openweb tech for social/ecological change/challenge, and we need to think about this now.
The #OGB project is about developing better ways of having “trust” based conversations and “trust” based “governance” in the #openweb. The project is built from hundreds of years of on-the-ground organizing that has shaped every “freedom” we enjoy, and it takes a #KISS approach. The #OGB is a #fediverse native way of working, NOT a #mainstreaming way, and it comes from directly working, setting up, and solving recurring problems at hundreds of direct action protest camps.
The #OGB focuses on what we know works, because at the moment almost nothing works for social good. The #OGB project is what is needed: a voluntary cooperative and collaborative alliance that is native to the #fediverse.
The thinking is that we need to put a stop to the #techchurn, as we already have piles of #techshit to compost; #nothingnew is the hashtag for this.
It’s not the goal of the #OGB project to create an organization that tells everyone what protocols and standards to use in the #fediverse. The #OGB project is about developing better ways of having good “trust” based conversations and “trust” based “governance” in the #openweb.
To sum up, the current working models of “governance” in open-source projects are monarchy, aristocracy, and oligarchy: respectively, the rock-star developer, the coders, and the funders. It should be obvious to anyone that 99.99% of people are missing from this feudalistic ideal of “governance”.
Democracy is the basic foundation of our shared modernity.
WHY DO WE PUT UP WITH THIS MESS IN TECH?
Let’s take a different path, please #OGB
Q. That is an optimistic projection.
A. I have no illusions: the normal shitty behaviour of fucking people over and being a prat will happen, but the codebase is designed to mediate this crap behaviour for better outcomes 🙂
#OGB “permissionless” is an important word that needs some thought. The body is made up of three different, balanced groups: stakeholders, users, and affiliate stakeholders. Anybody can become a stakeholder by setting up and running an active instance, and users are self-explanatory. Affiliate stakeholders are a little more complex and are treated differently; it’s up to the body itself to decide whether they play an active and useful role.
Nothing in this is top-down, elitist, discriminatory, or undemocratic. It’s #KISS and looks safe to the “normal world” while being native to the #fediverse and its roots. All the coding is #4opens, based on #activitypub.
With #OGB, it’s important not to get lost in the #processgeeks and their dogmatic love of #formalconsensus, as that’s a dead end and has been for 30 years of activism and tech coding. It’s important to keep the #OGB #KISS, human, and understandable. The #OGB is native “governance” and federates in the same way as the projects it “governs”. This approach is counterintuitive to mainstream ideas and “common sense,” but that’s not necessarily a bad thing.
This approach has worked to some extent, as seen in the “#Fediverse” as a living example, working to scale from small to bigger. There will be lots of “smoke,” and help is needed to keep the project clear of this mess. We have to overcome our #stupidindividualism to have a hope of a better world.
#OGB A reminder that the need for “governance” came out of a practical problem: the #activitypub community is made up of “cats” who were doing seminar outreach to powerful #EU Eurocrats on why they should be interested in #activitypub. #OGB is designed to be messy rather than tidy; it’s a “governance” of a disorganization, not a traditional power structure. This “governance” can cooperate with more formal models of governance, such as traditional cooperatives.
-
A native path out of the mess people make on the #openweb
The Open Governance Body (#OGB) describes a permissionless process/structure that is open and allows the group that forms using the tools to decide who is a part of the group or not. This process can divide into a web of connecting instances of governance as a natural human process of group formation. The #OGB emphasizes that there is no exclusion and always diversity, making it a natural fit for the #fediverse.
The #OGB also shows that if people are stupid and focused on individualism, each governance instance will have one member and no power. To gain power, people have to work together, which is built into the code. The #OGB emphasizes that hoarding power is limited, and it flows through the community, energizing and solidifying the community and building horizontal power to challenge/change vertical power.
The #OGB focus is on the importance of keeping things simple (#KISS) and that some people will try to push for existing power structures before democracy. However, as the process is permissionless, it is not possible to stop them from doing this. The #OGB emphasizes the need to do better, and that being native to the #fediverse is a big help in this regard.
The #OGB emphasizes the importance of recognizing where power comes from in the context of the #fediverse. The fediverse operates differently from corporations, governments, courts, and police, and it is important to think and build with this difference rather than trying to drag the fediverse back to the #mainstreaming path.
The #OGB builds from the #fediverse works because it is different, and it is easy to forget this important thing when #mainstreaming agendas grab and hold. The #OGB suggests that the missing question in almost all conversations is “who are we empowering,” and emphasizes the need to do better in alt-tech.
The #OGB notes that there are problems in alt-tech and suggests that starting with the #4opens would remove 90% of the mess, revealing the real potential for good outcomes. The #OGB highlights that doing better in alt-tech would involve using shovels to make compost and planting seeds of the world we want to see.
The #OGB describes the process scaffolding for the governance body as a default effect, where the decisions on how things work will be up to the members of the body. The power of the governance body is only the power of default, and the #OGB is about removing all hard default choices and building in a small number of KISS tools, then letting the body members work out themselves how to use them.
The #OGB uses the example of #Couchsurfing, where the website redesign removed the #DIY tools active Couchsurfers had used to self-organize, leading to disappointment among members. The #OGB argues that letting members make their own process, open vs. closed, is necessary to overcome the #geekproblem and have hope for alt-tech.
The #OGB builds governance with the way, rules, norms, and actions are structured, sustained, regulated, and held accountable. this is to mediate that the #Fediverse currently has a “herding cats” governance, denoting a futile attempt to control or organize a class of entities that are inherently uncontrollable.
The #OGB codebase is not just a tool for the #Fediverse, but it can be used to democratically run any structures that have stakeholders.
The #OGB provides an example of how the codebase can be used to run a local street market, with each stallholder as a stakeholder, people who shop at the market as users, and the local council, events company, and shop owner’s association as affiliate groups. The #OGB approach and codebase will scale sideways, with street markets governed city-wide, and each of the markets becoming a stakeholder, users as users, and city-wide orgs and groups as affiliate groups.
The #OGB shaping of the “body” comes from a long history/experience of horizontal activism, where “those who do the work have more say.” https://www.noisebridge.net/wiki/Do-ocracy
The #OGB pushes that the bulk of the voice comes from those who run the #Fediverse, the people who run/support the instances. The people who build the tools also get a say, as do support orgs and events, and the users who will be spread widely get a say, but their power is diluted by the much larger numbers involved.
This working practice comes from 30 years of building from The Tyranny of Structureless tick box list https://unite.openworlds.info/Open-Media-Network/openwebgovernancebody/wiki/03.-The-Tyranny-of-Stucturelessness That code being quite “anti-human” is an interesting challenge, and it’s important to figure out how to get the humane “mess” in a coding process that is based on being “exact” and in control #OGB
The #OGB project is grounded in lived experience, and it’s a way out of this mess. We cannot keep using traditional institutions. We have to stop the #techcurn if we are going to use #openweb tech for social/ecological change/challenge, and we need to think about this now.
The #OGB project is about developing better ways of having “trust” based conversations and “trust” based “governance” in the #openweb. The project is built from hundreds of years of on the ground organizing that has shaped every “freedom” we enjoy and is done in a #KISS approach. The #OGB is a #fedivers native way of working, NOT a #mainstreaming way, and it comes from directly working, setting up, and solving recurring problems at hundreds of direct action protest camps.
The #OGB focus on what we know works, as at the moment, almost nothing works for social good. The #OGB project is what is needed, a voluntary cooperative and collaborative alliance that is native to the #fediverse.
The thinking is that we need to put a stop to the #techchurn as we have piles of #techshit already to compost, that #nothingnew is a hashtag for this.
It’s not the goal of the #OGB project to create an organization that tells everyone what protocols and standards to use in the #fediverse. The #OGB project is about developing better ways of having good “trust” based conversations and “trust” based “governance” in the #openweb
To sum up, the current working models of “governance” in open-source projects are monarchy, aristocracy and oligarchy. This is the rock star developer, the coders and the funders. It should be obverse to anyone that 99.99% of people are missing from this feudalistic ideal of “governance”.
Democracy is the basic foundation of our shared modernity.
WHY DO WE PUT UP WITH THIS MESS IN TECH?
Let’s take a different path, please #OGB
Q. that is an optimistic projection
A. I have no illusion that the normal shitty behaver of fucking people over and being a prat will happen, but the codebase is designed to mediate this crap behaver for better outcomes 🙂
#OGB “permissionless” is an important word that needs some thought. The body is made up of three different, balanced groups: stakeholders, users, and affiliate stakeholders. Anybody can become a stakeholder by setting up and running an active instance, and users are self-explanatory. That affiliate stakeholders are a little more complex and are treated differently, and it’s up to the body itself to decide if they play an active and useful role.
That nothing in this is top-down, elitist, discriminatory, or undemocratic, and it’s #KISS and looks safe to the “normal world” while being native to the #fediverse and its roots. All the coding is #4opens, based on #activertypub.
With #OGB, it’s important not to get lost in the #processgeeks and their dogmatic love of #formalconsensus, as that’s a dead end and has been for the 30 years of activism and coding tech. It’s important to keep the #OGB both #KISS and human, understandable. The #OGB is native “governance” and federates in the same way as the projects it “governs”. That this approach is counterintuitive to mainstream ideas and “common sense,” but that’s not necessarily a bad thing.
This approach has worked to some extent, as seen in the “#Fediverse” as a living example, working to scale small to bigger. There will be lots of “smoke,” and help is needed to keep the project clear of this mess. We have to overcome our #stupidindividualism to have a hope of a better world.
#OGB To remind you that the need for “governance” came out of a practical problem where the #activitypub community is made up of “cats” who were doing seminars outreach to powerful #EU Eurocrats on why they should be interested in #activertypub. #OGB is designed to be messy and not tidy, and it’s a “governance” of a disorganization, not a traditional power structure. “governance” can cooperate with more formal models of governance like traditional cooperatives.
-
A native path out of the mess people make on the #openweb
The Open Governance Body (#OGB) describes a permissionless process/structure that is open and allows the group that forms using the tools to decide who is a part of the group or not. This process can divide into a web of connecting instances of governance as a natural human process of group formation. The #OGB emphasizes that there is no exclusion and always diversity, making it a natural fit for the #fediverse.
The #OGB also shows that if people are stupid and focused on individualism, each governance instance will have one member and no power. To gain power, people have to work together, which is built into the code. The #OGB emphasizes that hoarding power is limited, and it flows through the community, energizing and solidifying the community and building horizontal power to challenge/change vertical power.
The #OGB focus is on the importance of keeping things simple (#KISS) and that some people will try to push for existing power structures before democracy. However, as the process is permissionless, it is not possible to stop them from doing this. The #OGB emphasizes the need to do better, and that being native to the #fediverse is a big help in this regard.
The #OGB emphasizes the importance of recognizing where power comes from in the context of the #fediverse. The fediverse operates differently from corporations, governments, courts, and police, and it is important to think and build with this difference rather than trying to drag the fediverse back to the #mainstreaming path.
The #OGB builds from the #fediverse works because it is different, and it is easy to forget this important thing when #mainstreaming agendas grab and hold. The #OGB suggests that the missing question in almost all conversations is “who are we empowering,” and emphasizes the need to do better in alt-tech.
The #OGB notes that there are problems in alt-tech and suggests that starting with the #4opens would remove 90% of the mess, revealing the real potential for good outcomes. The #OGB highlights that doing better in alt-tech would involve using shovels to make compost and planting seeds of the world we want to see.
The #OGB describes the process scaffolding for the governance body as a default effect, where the decisions on how things work will be up to the members of the body. The power of the governance body is only the power of default, and the #OGB is about removing all hard default choices and building in a small number of KISS tools, then letting the body members work out themselves how to use them.
The #OGB uses the example of #Couchsurfing, where the website redesign removed the #DIY tools active Couchsurfers had used to self-organize, leading to disappointment among members. The #OGB argues that letting members make their own process, open vs. closed, is necessary to overcome the #geekproblem and have hope for alt-tech.
The #OGB builds governance with the way, rules, norms, and actions are structured, sustained, regulated, and held accountable. this is to mediate that the #Fediverse currently has a “herding cats” governance, denoting a futile attempt to control or organize a class of entities that are inherently uncontrollable.
The #OGB codebase is not just a tool for the #Fediverse, but it can be used to democratically run any structures that have stakeholders.
The #OGB provides an example of how the codebase can be used to run a local street market, with each stallholder as a stakeholder, people who shop at the market as users, and the local council, events company, and shop owner’s association as affiliate groups. The #OGB approach and codebase will scale sideways, with street markets governed city-wide, and each of the markets becoming a stakeholder, users as users, and city-wide orgs and groups as affiliate groups.
The #OGB shaping of the “body” comes from a long history/experience of horizontal activism, where “those who do the work have more say.” https://www.noisebridge.net/wiki/Do-ocracy
The #OGB pushes that the bulk of the voice comes from those who run the #Fediverse, the people who run/support the instances. The people who build the tools also get a say, as do support orgs and events, and the users who will be spread widely get a say, but their power is diluted by the much larger numbers involved.
This working practice comes from 30 years of building from The Tyranny of Structureless tick box list https://unite.openworlds.info/Open-Media-Network/openwebgovernancebody/wiki/03.-The-Tyranny-of-Stucturelessness That code being quite “anti-human” is an interesting challenge, and it’s important to figure out how to get the humane “mess” in a coding process that is based on being “exact” and in control #OGB
The #OGB project is grounded in lived experience, and it’s a way out of this mess. We cannot keep using traditional institutions. We have to stop the #techcurn if we are going to use #openweb tech for social/ecological change/challenge, and we need to think about this now.
The #OGB project is about developing better ways of having “trust” based conversations and “trust” based “governance” in the #openweb. The project is built from hundreds of years of on the ground organizing that has shaped every “freedom” we enjoy and is done in a #KISS approach. The #OGB is a #fedivers native way of working, NOT a #mainstreaming way, and it comes from directly working, setting up, and solving recurring problems at hundreds of direct action protest camps.
The #OGB focus on what we know works, as at the moment, almost nothing works for social good. The #OGB project is what is needed, a voluntary cooperative and collaborative alliance that is native to the #fediverse.
The thinking is that we need to put a stop to the #techchurn as we have piles of #techshit already to compost, that #nothingnew is a hashtag for this.
It’s not the goal of the #OGB project to create an organization that tells everyone what protocols and standards to use in the #fediverse. The #OGB project is about developing better ways of having good “trust” based conversations and “trust” based “governance” in the #openweb
To sum up, the current working models of “governance” in open-source projects are monarchy, aristocracy and oligarchy. This is the rock star developer, the coders and the funders. It should be obverse to anyone that 99.99% of people are missing from this feudalistic ideal of “governance”.
Democracy is the basic foundation of our shared modernity.
WHY DO WE PUT UP WITH THIS MESS IN TECH?
Let’s take a different path, please #OGB
Q. that is an optimistic projection
A. I have no illusion that the normal shitty behaver of fucking people over and being a prat will happen, but the codebase is designed to mediate this crap behaver for better outcomes 🙂
#OGB “permissionless” is an important word that needs some thought. The body is made up of three different, balanced groups: stakeholders, users, and affiliate stakeholders. Anybody can become a stakeholder by setting up and running an active instance, and users are self-explanatory. That affiliate stakeholders are a little more complex and are treated differently, and it’s up to the body itself to decide if they play an active and useful role.
That nothing in this is top-down, elitist, discriminatory, or undemocratic, and it’s #KISS and looks safe to the “normal world” while being native to the #fediverse and its roots. All the coding is #4opens, based on #activertypub.
With #OGB, it’s important not to get lost in the #processgeeks and their dogmatic love of #formalconsensus, as that’s a dead end and has been for the 30 years of activism and coding tech. It’s important to keep the #OGB both #KISS and human, understandable. The #OGB is native “governance” and federates in the same way as the projects it “governs”. That this approach is counterintuitive to mainstream ideas and “common sense,” but that’s not necessarily a bad thing.
This approach has worked to some extent, as seen in the “#Fediverse” as a living example, working to scale small to bigger. There will be lots of “smoke,” and help is needed to keep the project clear of this mess. We have to overcome our #stupidindividualism to have a hope of a better world.
#OGB To remind you that the need for “governance” came out of a practical problem where the #activitypub community is made up of “cats” who were doing seminars outreach to powerful #EU Eurocrats on why they should be interested in #activertypub. #OGB is designed to be messy and not tidy, and it’s a “governance” of a disorganization, not a traditional power structure. “governance” can cooperate with more formal models of governance like traditional cooperatives.
-
A native path out of the mess people make on the #openweb
The Open Governance Body (#OGB) describes a permissionless process/structure that is open and allows the group that forms using the tools to decide who is a part of the group or not. This process can divide into a web of connecting instances of governance as a natural human process of group formation. The #OGB emphasizes that there is no exclusion and always diversity, making it a natural fit for the #fediverse.
The #OGB also shows that if people are stupid and focused on individualism, each governance instance will have one member and no power. To gain power, people have to work together, which is built into the code. The #OGB emphasizes that hoarding power is limited, and it flows through the community, energizing and solidifying the community and building horizontal power to challenge/change vertical power.
The #OGB focus is on the importance of keeping things simple (#KISS) and that some people will try to push for existing power structures before democracy. However, as the process is permissionless, it is not possible to stop them from doing this. The #OGB emphasizes the need to do better, and that being native to the #fediverse is a big help in this regard.
The #OGB emphasizes the importance of recognizing where power comes from in the context of the #fediverse. The fediverse operates differently from corporations, governments, courts, and police, and it is important to think and build with this difference rather than trying to drag the fediverse back to the #mainstreaming path.
The #OGB builds from the #fediverse works because it is different, and it is easy to forget this important thing when #mainstreaming agendas grab and hold. The #OGB suggests that the missing question in almost all conversations is “who are we empowering,” and emphasizes the need to do better in alt-tech.
The #OGB notes that there are problems in alt-tech and suggests that starting with the #4opens would remove 90% of the mess, revealing the real potential for good outcomes. The #OGB highlights that doing better in alt-tech would involve using shovels to make compost and planting seeds of the world we want to see.
The #OGB describes the process scaffolding for the governance body as a default effect, where the decisions on how things work will be up to the members of the body. The power of the governance body is only the power of default, and the #OGB is about removing all hard default choices and building in a small number of KISS tools, then letting the body members work out themselves how to use them.
The #OGB uses the example of #Couchsurfing, where the website redesign removed the #DIY tools active Couchsurfers had used to self-organize, leading to disappointment among members. The #OGB argues that letting members make their own process, open vs. closed, is necessary to overcome the #geekproblem and have hope for alt-tech.
The #OGB builds governance with the way, rules, norms, and actions are structured, sustained, regulated, and held accountable. this is to mediate that the #Fediverse currently has a “herding cats” governance, denoting a futile attempt to control or organize a class of entities that are inherently uncontrollable.
The #OGB codebase is not just a tool for the #Fediverse, but it can be used to democratically run any structures that have stakeholders.
The #OGB provides an example of how the codebase can be used to run a local street market, with each stallholder as a stakeholder, people who shop at the market as users, and the local council, events company, and shop owner’s association as affiliate groups. The #OGB approach and codebase will scale sideways, with street markets governed city-wide, and each of the markets becoming a stakeholder, users as users, and city-wide orgs and groups as affiliate groups.
The #OGB shaping of the “body” comes from a long history/experience of horizontal activism, where “those who do the work have more say.” https://www.noisebridge.net/wiki/Do-ocracy
The #OGB pushes that the bulk of the voice comes from those who run the #Fediverse, the people who run/support the instances. The people who build the tools also get a say, as do support orgs and events, and the users who will be spread widely get a say, but their power is diluted by the much larger numbers involved.
This working practice comes from 30 years of building from The Tyranny of Structureless tick box list https://unite.openworlds.info/Open-Media-Network/openwebgovernancebody/wiki/03.-The-Tyranny-of-Stucturelessness That code being quite “anti-human” is an interesting challenge, and it’s important to figure out how to get the humane “mess” in a coding process that is based on being “exact” and in control #OGB
The #OGB project is grounded in lived experience, and it’s a way out of this mess. We cannot keep using traditional institutions. We have to stop the #techcurn if we are going to use #openweb tech for social/ecological change/challenge, and we need to think about this now.
The #OGB project is about developing better ways of having “trust” based conversations and “trust” based “governance” in the #openweb. The project is built from hundreds of years of on the ground organizing that has shaped every “freedom” we enjoy and is done in a #KISS approach. The #OGB is a #fedivers native way of working, NOT a #mainstreaming way, and it comes from directly working, setting up, and solving recurring problems at hundreds of direct action protest camps.
The #OGB focus on what we know works, as at the moment, almost nothing works for social good. The #OGB project is what is needed, a voluntary cooperative and collaborative alliance that is native to the #fediverse.
The thinking is that we need to put a stop to the #techchurn as we have piles of #techshit already to compost, that #nothingnew is a hashtag for this.
It’s not the goal of the #OGB project to create an organization that tells everyone what protocols and standards to use in the #fediverse. The #OGB project is about developing better ways of having good “trust” based conversations and “trust” based “governance” in the #openweb
To sum up, the current working models of “governance” in open-source projects are monarchy, aristocracy and oligarchy. This is the rock star developer, the coders and the funders. It should be obvious to anyone that 99.99% of people are missing from this feudalistic ideal of “governance”.
Democracy is the basic foundation of our shared modernity.
WHY DO WE PUT UP WITH THIS MESS IN TECH?
Let’s take a different path, please #OGB
Q. that is an optimistic projection
A. I have no illusions: the normal shitty behavior of fucking people over and being a prat will happen, but the codebase is designed to mediate this crap behavior for better outcomes 🙂
#OGB “permissionless” is an important word that needs some thought. The body is made up of three different, balanced groups: stakeholders, users, and affiliate stakeholders. Anybody can become a stakeholder by setting up and running an active instance, and users are self-explanatory. Affiliate stakeholders are a little more complex and are treated differently, and it’s up to the body itself to decide if they play an active and useful role.
Nothing in this is top-down, elitist, discriminatory, or undemocratic, and it’s #KISS and looks safe to the “normal world” while being native to the #fediverse and its roots. All the coding is #4opens, based on #activitypub.
With #OGB, it’s important not to get lost in the #processgeeks and their dogmatic love of #formalconsensus, as that’s a dead end and has been for the 30 years of activism and coding tech. It’s important to keep the #OGB both #KISS and human, understandable. The #OGB is native “governance” and federates in the same way as the projects it “governs”. This approach is counterintuitive to mainstream ideas and “common sense,” but that’s not necessarily a bad thing.
This approach has worked to some extent, as seen in the “#Fediverse” as a living example, working to scale from small to bigger. There will be lots of “smoke,” and help is needed to keep the project clear of this mess. We have to overcome our #stupidindividualism to have a hope of a better world.
#OGB A reminder: the need for “governance” came out of a practical problem, where the #activitypub community is made up of “cats” who were doing seminars and outreach to powerful #EU Eurocrats on why they should be interested in #activitypub. The #OGB is designed to be messy and not tidy; it’s “governance” of a disorganization, not a traditional power structure. This “governance” can cooperate with more formal models of governance like traditional cooperatives.
-
Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but moreso inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).
At the bottom they teased:
Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.
Cendyne
If you read Cendyne’s post, you may have nodded along with that remark and not appreciated the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.
Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.
I’m including art of my fursona in this post… as is tradition for furry blogs. If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.
Thanks to CMYKat for the awesome stickers.
Contents
- Database Cryptography?
- Cryptography for Relational Databases
- The Perils of Built-in Encryption Functions
- Application-Layer Relational Database Cryptography
- Confused Deputies
- Canonicalization Attacks
- Multi-Tenancy
- Cryptography for NoSQL Databases
- NoSQL is Built Different
- Record Authentication
- Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
- Searchable Encryption
- Order-{Preserving, Revealing} Encryption
- Deterministic Encryption
- Homomorphic Encryption
- Searchable Symmetric Encryption (SSE)
- You Can Have Little a HMAC, As a Treat
- Intermission
- Case Study: MongoDB Client-Side Encryption
- MongoCrypt: The Good
- How is Queryable Encryption Implemented?
- MongoCrypt: The Bad
- MongoCrypt: The Ugly
- MongoCrypt: The Good
- Wrapping Up
Database Cryptography?
The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.
The consequences of this simple premise are anything but simple. Let me explain.
Art: ScruffKerfluff
The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible to support multiple types of workloads.
Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.
But we’re not settling for mere compliance on this blog. Furries have standards, after all.
So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:
- What database software is being used? Is it up to date?
- What data is being stored in which database software?
- How are databases oriented in the network of the overall system?
- Is your database properly firewalled from the public Internet?
- How does data flow throughout the network, and when do these data flows intersect with the database?
- Which applications talk to the database? What languages are they written in? Which APIs do they use?
- How will cryptography secrets be managed?
- Is there one key for everyone, one key per tenant, etc.?
- How are keys rotated?
- Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?
The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.
(This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)
The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.
Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.
If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know they weren’t tampered with at rest.”
My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:
- On one axis, we have the type of database.
- Flat-file
- Relational
- Schema-free
- On another, we have the basic confidentiality requirements of the data.
- Field encryption
- Row encryption
- Column encryption
- Unstructured record encryption
- Encrypting entire collections of records
- Finally, we have the integrity requirements of the data.
- Field authentication
- Row/column authentication
- Unstructured record authentication
- Collection authentication (based on e.g. Sparse Merkle Trees)
And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.
Why store data in a database if you have no way to index or search the data for fast retrieval?
Credit: Harubaki
If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.
Some just phone it in with AES_Encrypt() calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)
Which brings us to the meat of this blog post: The actual cryptography part.
Cryptography is the art of transforming information security problems into key management problems.
Former coworker
Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.
Cryptography for Relational Databases
Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased from the introduction.
You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?
But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.
So let’s start with a simple question: where and how are you performing your encryption?
The Perils of Built-in Encryption Functions
MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.
It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:
Art by Riley, encrypted with AES-ECB
The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).
ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.
A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.
ECB art isn’t to be confused with ECB poetry, which looks like this:
Oh little one, you’re growing up
You’ll soon be writing C
You’ll treat your ints as pointers
You’ll nest the ternary
You’ll cut and paste from github
And try cryptography
But even in your darkest hour
Do not use ECB

CBC’s BEASTly when padding’s abused
And CTR’s fine til a nonce is reused
Some say it’s a CRIME to compress then encrypt
Or store keys in the browser (or use javascript)
Diffie Hellman will collapse if hackers choose your g
And RSA is full of traps when e is set to 3
Whiten! Blind! In constant time! Don’t write an RNG!
But failing all, and listen well: Do not use ECB

They’ll say “It’s like a one-time-pad!
The data’s short, it’s not so bad
the keys are long–they’re iron clad
I have a PhD!”
And then you’re front page Hacker News
Your passwords cracked–Adobe Blues.
Don’t leave your penguins showing through,
Do not use ECB

— Ben Nagy, PoC||GTFO 0x04:13
Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.
Also, SHOW processlist; leaks your encryption keys. Oops.
Credit: CMYKatt
Application-layer Relational Database Cryptography
Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.
And, yes, that’s part of the answer. But how you encrypt is important.
Credit: Harubaki
“I’ll encrypt with CBC mode.”
If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?
“Okay, fine, I’ll use an authenticated mode like GCM.”
Did you remember to make the table and column name part of your AAD? What about the primary key of the record?
“What on Earth are you talking about, Soatok?”
Welcome to the first footgun of database cryptography!
Confused Deputies
Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.
To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?
We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.
What, then, stops an attacker from copying ciphertexts around?
Credit: CMYKatt
Let’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.
All I would need to do is simply…
UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345
…and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and it will just decrypt. Even if you’re using an authenticated mode.
We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.
The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)
- $addr = aes_gcm_encrypt($addr, $key);
+ $addr = aes_gcm_encrypt($addr, $key, canonicalize([
+     $tableName,
+     $columnName,
+     $primaryKey
+ ]));
Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
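To see the rule in action without pulling in an AEAD library, here is a rough Python sketch that models only the authentication half of the problem. HMAC-SHA256 stands in for the tag an authenticated mode would compute, and the ciphertext is treated as opaque bytes. The helper names (encode_context, tag_for, verify), the table and column names, and the placeholder key are all assumptions made for illustration; a real system would pass the same context as the AAD of its AEAD instead.

```python
import hashlib
import hmac
import struct


def encode_context(*parts) -> bytes:
    # Length-prefix every part so distinct contexts never encode identically.
    out = b""
    for part in parts:
        raw = str(part).encode("utf-8")
        out += struct.pack("<Q", len(raw)) + raw
    return out


def tag_for(mac_key: bytes, ciphertext: bytes, table: str, column: str, pk) -> bytes:
    # The tag covers where the ciphertext lives, not just the ciphertext itself.
    context = encode_context(table, column, pk)
    message = struct.pack("<Q", len(context)) + context + ciphertext
    return hmac.new(mac_key, message, hashlib.sha256).digest()


def verify(mac_key: bytes, ciphertext: bytes, tag: bytes, table: str, column: str, pk) -> bool:
    expected = tag_for(mac_key, ciphertext, table, column, pk)
    return hmac.compare_digest(expected, tag)


mac_key = b"\x01" * 32                   # placeholder key, for the demo only
victim_ct = b"opaque ciphertext bytes"   # stands in for the victim's encrypted address
victim_tag = tag_for(mac_key, victim_ct, "users", "addr_encrypted", 67890)

# Read back from the row it was written to: accepted.
in_place = verify(mac_key, victim_ct, victim_tag, "users", "addr_encrypted", 67890)
# Copied into the attacker's row (id 12345): rejected.
moved = verify(mac_key, victim_ct, victim_tag, "users", "addr_encrypted", 12345)
```

With the context bound in, the copy-and-paste trick turns into a verification failure rather than a silent decryption.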
Canonicalization Attacks
In the previous section, I introduced a pseudocode called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.
First, consider you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…
function dumbCanonicalize(
    string $tableName,
    string $columnName,
    string|int $primaryKey
): string {
    return $tableName . '_' . $columnName . '#' . $primaryKey;
}
Consider these two inputs to this function:
dumbCanonicalize('customers', 'last_order_uuid', 123);
dumbCanonicalize('customers_last_order', 'uuid', 123);
In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.
Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.
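For a concrete feel of the difference, here is a small Python sketch. dumb_canonicalize mirrors the naive join from this section, and canonicalize is one hypothetical way to make the encoding unambiguous: prefix the number of parts, then prefix each part with its byte length. The exact format (64-bit little-endian integers) is an assumption chosen for illustration, not reference code from any library.

```python
import struct


def dumb_canonicalize(table: str, column: str, primary_key) -> bytes:
    # Naive join: the separators can also appear inside the inputs.
    return f"{table}_{column}#{primary_key}".encode("utf-8")


def canonicalize(parts) -> bytes:
    # Hypothetical unambiguous encoding: a 64-bit little-endian count of
    # parts, then each part as (64-bit little-endian byte length || bytes).
    # Note: parts are stringified, so 123 and "123" encode the same way.
    out = struct.pack("<Q", len(parts))
    for part in parts:
        raw = str(part).encode("utf-8")
        out += struct.pack("<Q", len(raw)) + raw
    return out


a = ("customers", "last_order_uuid", 123)
b = ("customers_last_order", "uuid", 123)

naive_collides = dumb_canonicalize(*a) == dumb_canonicalize(*b)
prefixed_collides = canonicalize(a) == canonicalize(b)
```

The naive join produces identical bytes for both inputs, while the length-prefixed version keeps them distinct because the boundaries between parts are part of the encoding.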
Multi-Tenancy
Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.
Oftentimes, however, software developers are tasked with building support for Bring Your Own Key (BYOK).
This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).
Other times, this is driven by a need to cut costs: Storing different users’ data in the same database server, but encrypting it such that they can only decrypt their own records.
Two things can happen when you introduce multi-tenancy into your database cryptography designs:
- Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
- Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.
So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.
Isn’t cryptography fun?
“What Are Invisible Salamanders?”
This refers to a fun property of AEAD modes based on polynomial MACs. Basically, if you:
- Encrypt one message under a specific key and nonce.
- Encrypt another message under a separate key and nonce.
…Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.
This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature because another attached file had the same authentication tag and ciphertext, and you could trick the system if you disclosed the second key instead of the first. Thus, the salamander is invisible to whoever reviews the abuse report.
Art: CMYKat
We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.
Cryptography for NoSQL Databases
Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.
NoSQL is Built Different
The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.
What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.
Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.
Consider the following record structure, which stores a bunch of data encrypted with AES in CBC mode:
{
    "encrypted-data-key": "<blob>",
    "name": "<ciphertext>",
    "address": [
        "<ciphertext>",
        "<ciphertext>"
    ],
    "social-security": "<ciphertext>",
    "zip-code": "<ciphertext>"
}
If this record is decrypted with code that looks something like this:
$decrypted = [];
// ... snip ...
foreach ($record['address'] as $i => $addrLine) {
    try {
        $decrypted['address'][$i] = $this->decrypt($addrLine);
    } catch (Throwable $ex) {
        // You'd never deliberately do this, but it's for illustration
        $this->doSomethingAnOracleCanObserve($i);
        // This is more believable, of course:
        $this->logDecryptionError($ex, $addrLine);
        $decrypted['address'][$i] = '';
    }
}
Then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.
Art: Harubaki
This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.
Record Authentication
A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.
Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.
The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.
This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.
Except…
When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.
Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.
And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)
I don’t have any specific real-world examples here that I can point to of this problem being solved well.
Art: CMYKat
Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.
As I said with encryption before, but also true here:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
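As a rough illustration of authenticating selected fields together with their fully qualified paths, here is a Python sketch using HMAC-SHA256. The dotted-path convention, the helper names (lp, get_path, mac_record), the sample record, and the placeholder key are all hypothetical, and the list of signed paths is assumed to be fixed by the application rather than taken from the record itself.

```python
import hashlib
import hmac
import json
import struct


def lp(raw: bytes) -> bytes:
    # Length-prefix a byte string so field boundaries are unambiguous.
    return struct.pack("<Q", len(raw)) + raw


def get_path(record: dict, path: str):
    # Walk a dotted path such as "patient.lab.result".
    node = record
    for key in path.split("."):
        node = node[key]
    return node


def mac_record(key: bytes, record: dict, signed_paths) -> bytes:
    mac = hmac.new(key, digestmod=hashlib.sha256)
    # Sort the paths so signer and verifier feed fields in the same order.
    for path in sorted(signed_paths):
        value = json.dumps(get_path(record, path), sort_keys=True).encode("utf-8")
        # Each value is authenticated together with the path it lives at.
        mac.update(lp(path.encode("utf-8")) + lp(value))
    return mac.digest()


key = b"\x02" * 32   # placeholder key, for the demo only
paths = ["patient.id", "patient.lab.result"]
record = {"patient": {"id": 7, "lab": {"result": "negative"}}, "notes": "unsigned"}
tag = mac_record(key, record, paths)

# Another system appends to a field outside the signed set: tag still verifies.
record["notes"] = "appended by another system"
valid_after_append = hmac.compare_digest(tag, mac_record(key, record, paths))

# Someone changes a signed field: verification fails.
record["patient"]["lab"]["result"] = "positive"
valid_after_tamper = hmac.compare_digest(tag, mac_record(key, record, paths))
```

Swapping HMAC for an asymmetric signature would change the key handling but not the shape of what gets fed into the algorithm.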
Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
Art: Harubaki
Okay, how do you solve this problem so that you can perform updates and upgrades to your schema but without enabling attackers to downgrade the security? Here’s one possible design.
Let’s say you have two metadata fields on each record:
- A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this meta-auth.
- A compressed binary string representing which of the authenticated fields should also be encrypted. This field is also authenticated. This is at most the same length as the first metadata field. Let’s call this meta-enc.
Furthermore, you will specify a canonical field ordering for both how data is fed into the signature algorithm as well as the field mappings in meta-auth and meta-enc.
{
    "example": {
        "credit-card": {
            "number": /* encrypted */,
            "expiration": /* encrypted */,
            "ccv": /* encrypted */
        },
        "superfluous": {
            "rewards-member": null
        }
    },
    "meta-auth": compress_bools([
        true, /* example.credit-card.number */
        true, /* example.credit-card.expiration */
        true, /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true /* meta-enc */
    ]),
    "meta-enc": compress_bools([
        true, /* example.credit-card.number */
        true, /* example.credit-card.expiration */
        true, /* example.credit-card.ccv */
        false /* example.superfluous.rewards-member */
    ]),
    "signature": /* -- snip -- */
}
When you go to append data to an existing record, you’ll need to update meta-auth to include the mapping of fields based on this canonical ordering to ensure only the intended fields get validated.
When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:
- New records will have the additional field flagged as authenticated in meta-auth (and meta-enc will grow)
- Old records will not, but your code will still sign them successfully
- To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.
You might think meta-auth gives an attacker some advantage, but this only includes which fields are included in the security boundary of the signature or MAC, which allows unauthenticated data to be appended for whatever operational purpose without having to update signatures or expose signing keys to a wider part of the network.
{
    "example": {
        "credit-card": {
            "number": /* encrypted */,
            "expiration": /* encrypted */,
            "ccv": /* encrypted */
        },
        "superfluous": {
            "rewards-member": null
        }
    },
    "meta-auth": compress_bools([
        true, /* example.credit-card.number */
        true, /* example.credit-card.expiration */
        true, /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true, /* meta-enc */
        true /* meta-version */
    ]),
    "meta-enc": compress_bools([
        true, /* example.credit-card.number */
        true, /* example.credit-card.expiration */
        true, /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true /* meta-version */
    ]),
    "meta-version": 0x01000000,
    "signature": /* -- snip -- */
}
If an attacker tries to use the meta-auth field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).
Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
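The compress_bools() call in the records above is pseudocode, so here is one hypothetical way it could work, sketched in Python. The encoding (a 2-byte big-endian flag count followed by the flags packed eight per byte, most significant bit first) and the expand_bools helper are assumptions made for illustration. The field order is copied from the example record.

```python
def compress_bools(flags) -> bytes:
    # Hypothetical encoding: 2-byte big-endian flag count, then the flags
    # packed eight per byte, most significant bit first.
    flags = list(flags)
    out = bytearray(len(flags).to_bytes(2, "big"))
    out.extend(bytes((len(flags) + 7) // 8))
    for i, flag in enumerate(flags):
        if flag:
            out[2 + i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def expand_bools(blob: bytes) -> list:
    # Inverse of compress_bools: recover the original list of booleans.
    count = int.from_bytes(blob[:2], "big")
    return [bool(blob[2 + i // 8] & (0x80 >> (i % 8))) for i in range(count)]


# Canonical field ordering shared by writers and verifiers.
FIELD_ORDER = [
    "example.credit-card.number",
    "example.credit-card.expiration",
    "example.credit-card.ccv",
    "example.superfluous.rewards-member",
    "meta-enc",
    "meta-version",
]

meta_auth = compress_bools([True, True, True, False, True, True])

# The verifier expands meta-auth to learn which paths the signature covers.
authenticated_paths = [
    path for path, flagged in zip(FIELD_ORDER, expand_bools(meta_auth)) if flagged
]
```

Storing the flag count up front means a record written under an older, shorter field ordering can still be expanded unambiguously.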
Searchable Encryption
If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.
Art: CMYKat
Okay, put away the pitchforks; I will explain.
Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.
And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.
Art: Sophie
So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:
- Encrypting the data, to protect it from unauthorized disclosure, but rendering it useless
- Doing anything useful with the data, but leaving it unencrypted in the database
This is especially annoying for business types that are all in on the Zero Trust buzzword.
Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.
Order-{Preserving, Revealing} Encryption
On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.
[D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.
Attack of the week: searchable encryption and the ever-expanding leakage function
The problem with these designs is that they leak enough information that they no longer provide semantic security.
From Grubbs, et al. (GLMP, 2019.)
Colors inverted to fit my blog’s theme better.
To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.
Order revealing
Reveals much more than order
Softcore ECB

Order preserving
Semantic security?
Only in your dreams

Haiku for your consideration
Deterministic Encryption
Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.
If you recall the AES_{De,En}crypt() functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.
SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');
However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can encrypt a small number of distinct records safely before you’re no longer secure.
From Page 14 of the linked paper.
That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.
Homomorphic Encryption
In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated by serving Wikipedia pages in a reasonable amount of time.
Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.
If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
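Both of those properties are easy to poke at in a few lines of Python. The sketch below uses textbook RSA (no padding) with toy-sized parameters, and a SHA-256 digest as a stand-in for a stream cipher keystream, since the XOR property holds for any keystream, AES-CTR included. The numbers, strings, and helper names are assumptions chosen for illustration only; none of this is safe to deploy.

```python
import hashlib

# Textbook RSA (no padding) with toy parameters, for illustration only.
n, e = 61 * 53, 17


def rsa_encrypt(m: int) -> int:
    return pow(m, e, n)


a, b = 7, 11
product_of_ciphertexts = (rsa_encrypt(a) * rsa_encrypt(b)) % n
ciphertext_of_product = rsa_encrypt((a * b) % n)
# Multiplying ciphertexts multiplies the underlying plaintexts.
rsa_is_multiplicative = product_of_ciphertexts == ciphertext_of_product


def xor(x: bytes, y: bytes) -> bytes:
    # XOR two byte strings, truncating to the shorter one.
    return bytes(i ^ j for i, j in zip(x, y))


# Any stream cipher computes ciphertext = plaintext XOR keystream.
keystream = hashlib.sha256(b"stand-in for an AES-CTR keystream").digest()
plaintext = b"pay $100"
ciphertext = xor(plaintext, keystream)

# XORing a difference into the ciphertext XORs it into the plaintext,
# without the party doing the XOR ever learning the keystream.
delta = xor(b"pay $100", b"pay $900")
tampered = xor(ciphertext, delta)
decrypted = xor(tampered, keystream)
```

The same property that makes these schemes "homomorphic" is what makes unauthenticated ciphertexts malleable, which is why the resemblance to chosen-ciphertext attacks is no coincidence.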
Art: Harubaki
Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.
However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.
Just say “NO” to the copyright cartels.
Art: CMYKat
Searchable Symmetric Encryption (SSE)
Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?
Art: CMYKat
That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.
The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.
Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.
Some schemes only allow for searching a static set of records, while others can accommodate new data over time (with the trade-off between more leakage or worse performance).
If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.
You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”
Your guess is as good as mine, but I can think of a few reasons:
- The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (e.g. OpenSSL’s libcrypto or libsodium).
- Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
- Insufficient marketing and developer advocacy.
SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.
Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.
You Can Have a Little HMAC, As a Treat
Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.
The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.
The idea I’m alluding to goes like this:
- Encrypt your data, securely, using symmetric cryptography. (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
- Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.
When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?
Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?
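To make the idea concrete, here’s a minimal sketch in Python using only the standard library. The names `INDEX_KEY`, `blind_index()`, `find_by_email()` and the `rows` list are all made up for illustration, and the ciphertext values are placeholders rather than real encryption output. It only shows the indexing half of the design.

```python
import hashlib
import hmac
import secrets

# Hypothetical key, used only for indexing (never for encryption).
INDEX_KEY = secrets.token_bytes(32)


def blind_index(plaintext: str, key: bytes = INDEX_KEY) -> str:
    """Return a full HMAC-SHA256 tag to store next to the ciphertext."""
    return hmac.new(key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


# A stand-in for a table: each row keeps a ciphertext and its index tag.
# The ciphertext strings are placeholders; real rows would hold AEAD output.
rows = [
    {"email_ciphertext": "<ciphertext 1>", "email_index": blind_index("alice@example.com")},
    {"email_ciphertext": "<ciphertext 2>", "email_index": blind_index("bob@example.com")},
]


def find_by_email(email: str) -> list:
    """Recompute the tag for the query and return rows whose tag matches."""
    tag = blind_index(email)
    return [row for row in rows if hmac.compare_digest(row["email_index"], tag)]


matches = find_by_email("alice@example.com")
```

In a real database the comparison would be a plain indexed `WHERE email_index = ?` lookup; the list comprehension just stands in for that.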
Hooray!
However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.
How annoying!
Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.
Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.
“K-What Now?”
Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.
Each HMAC output corresponds 1:1 with a unique plaintext.
Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.
For example, an HMAC-SHA256 output might look like this:
Art: CMYKat
04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae
If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields a8b1ae), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.
Which means if you’re only revealing a truncated HMAC tag to the database server (both when storing records or retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.
These false positives give your data a discrete set of anonymity (called k-anonymity), which means an attacker with access to your database cannot:
- Distinguish between two encrypted records with the same short HMAC tag.
- Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
As with SSE above, this short HMAC technique exposes a trade-off to users.
- Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
- Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.
Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
If you’d like to play with this idea, here’s a quick and dirty demo script.
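Independently of the demo script mentioned above, here’s a rough Python sketch of the truncation step, so you can watch the buckets fill up as you change the number of bits. Everything in it (`INDEX_KEY`, `truncate_tag()`, `truncated_index()`, `bucket_sizes()`, the fake email list) is an assumption made for illustration, and the fixed key is only there to make runs repeatable.

```python
import hashlib
import hmac
from collections import Counter

# Fixed demo key so runs are repeatable; use a random secret key in practice.
INDEX_KEY = b"\x01" * 32


def truncate_tag(digest: bytes, bits: int) -> int:
    """Keep only the last `bits` bits of a digest, as an integer."""
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


def truncated_index(plaintext: str, bits: int, key: bytes = INDEX_KEY) -> int:
    """Truncated HMAC-SHA256 tag for one plaintext."""
    digest = hmac.new(key, plaintext.encode("utf-8"), hashlib.sha256).digest()
    return truncate_tag(digest, bits)


def bucket_sizes(plaintexts, bits: int) -> Counter:
    """Count how many distinct plaintexts share each truncated tag."""
    return Counter(truncated_index(p, bits) for p in set(plaintexts))


values = [f"user{i}@example.com" for i in range(10_000)]
for bits in (8, 16, 24):
    sizes = bucket_sizes(values, bits)
    print(bits, "bits:", len(sizes), "distinct tags, largest bucket:", max(sizes.values()))
```

Fewer bits means bigger buckets (more false positives to decrypt and discard); more bits pushes you back toward one plaintext per tag. Note that the truncation is done in bits, not whole bytes, which is why the helper masks an integer instead of slicing the digest.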
Intermission
If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, by making it to this point, they’ve probably been long since put to rest.
Art: Harubaki
Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting; their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.
Thus, if you’ve read this far, I’d like to apply what I’ve covered so far to a real-world case study of a database cryptography product.
Case Study: MongoDB Client-Side Encryption
MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.
Taken from the press release, but adapted for dark themes.
A statement at the bottom of their press release indicates that this isn’t clown-shoes:
Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.
If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.
So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.
MongoCrypt: The Good
MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).
MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.
How Is Queryable Encryption Implemented?
From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).
According to their press release:
The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.
Which means that most of the juicy details probably aren’t public yet.
These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.
There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.
Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.
However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.
MongoCrypt: The Bad
Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.
However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.
First, observe this code which assembles the multi-part inputs into HMAC.
/* Construct the input to the HMAC */
uint32_t num_intermediates = 0;
_mongocrypt_buffer_t intermediates[3];
// -- snip --
if (!_mongocrypt_buffer_concat (
       &to_hmac, intermediates, num_intermediates)) {
   CLIENT_ERR ("failed to allocate buffer");
   goto done;
}
if (hmac == HMAC_SHA_512_256) {
   uint8_t storage[64];
   _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};
   if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
      goto done;
   }
   // Truncate sha512 to first 256 bits.
   memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
} else {
   BSON_ASSERT (hmac == HMAC_SHA_256);
   if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
      goto done;
   }
}
The implementation of _mongocrypt_buffer_concat() can be found here.
If either the implementation of that function, or the code I snipped from my excerpt, had contained code that prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.
Using TupleHash would also have prevented this issue.
Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.
The first cryptographic flaw sort of cancels the second out.
If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.
MongoCrypt: The Ugly
MongoCrypt supports deterministic encryption.
If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.
We already discussed why this is bad above.
Wrapping Up
This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.
However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.
Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.
Art: CMYKat
https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
#appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption
-
Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but moreso inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).
At the bottom they teased:
Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.
Cendyne
If you read Cendyne’s post, you may have nodded along with that remark and not appreciated the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.
Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.
I’m including art of my fursona in this post… as is tradition for furry blogs.
If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.
Thanks to CMYKat for the awesome stickers.
Contents
- Database Cryptography?
- Cryptography for Relational Databases
- The Perils of Built-in Encryption Functions
- Application-Layer Relational Database Cryptography
- Confused Deputies
- Canonicalization Attacks
- Multi-Tenancy
- Cryptography for NoSQL Databases
- NoSQL is Built Different
- Record Authentication
- Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
- Searchable Encryption
- Order-{Preserving, Revealing} Encryption
- Deterministic Encryption
- Homomorphic Encryption
- Searchable Symmetric Encryption (SSE)
- You Can Have a Little HMAC, As a Treat
- Intermission
- Case Study: MongoDB Client-Side Encryption
- MongoCrypt: The Good
- How is Queryable Encryption Implemented?
- MongoCrypt: The Bad
- MongoCrypt: The Ugly
- Wrapping Up
Database Cryptography?
The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.
The consequences of this simple premise are anything but simple. Let me explain.
Art: ScruffKerfluff
The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible to support multiple types of workloads.
Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.
But we’re not settling for mere compliance on this blog. Furries have standards, after all.
So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:
- What database software is being used? Is it up to date?
- What data is being stored in which database software?
- How are databases oriented in the network of the overall system?
- Is your database properly firewalled from the public Internet?
- How does data flow throughout the network, and when do these data flows intersect with the database?
- Which applications talk to the database? What languages are they written in? Which APIs do they use?
- How will cryptography secrets be managed?
- Is there one key for everyone, one key per tenant, etc.?
- How are keys rotated?
- Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?
The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.
(This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)
The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.
Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.
If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know they weren’t tampered with at rest.”
My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:
- On one axis, we have the type of database.
- Flat-file
- Relational
- Schema-free
- On another, we have the basic confidentiality requirements of the data.
- Field encryption
- Row encryption
- Column encryption
- Unstructured record encryption
- Encrypting entire collections of records
- Finally, we have the integrity requirements of the data.
- Field authentication
- Row/column authentication
- Unstructured record authentication
- Collection authentication (based on e.g. Sparse Merkle Trees)
And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.
Why store data in a database if you have no way to index or search the data for fast retrieval?
Credit: Harubaki
If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.
Some just phone it in with AES_Encrypt() calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)
Which brings us to the meat of this blog post: The actual cryptography part.
Cryptography is the art of transforming information security problems into key management problems.
Former coworker
Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.
Cryptography for Relational Databases
Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased from the introduction.
You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?
But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.
So let’s start with a simple question: where and how are you performing your encryption?
The Perils of Built-in Encryption Functions
MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.
It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:
Art by Riley, encrypted with AES-ECB
The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).
ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.
A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.
ECB art isn’t to be confused with ECB poetry, which looks like this:
Oh little one, you’re growing up
You’ll soon be writing C
You’ll treat your ints as pointers
You’ll nest the ternary
You’ll cut and paste from github
And try cryptography
But even in your darkest hour
Do not use ECB

CBC’s BEASTly when padding’s abused
And CTR’s fine til a nonce is reused
Some say it’s a CRIME to compress then encrypt
Or store keys in the browser (or use javascript)
Diffie Hellman will collapse if hackers choose your g
And RSA is full of traps when e is set to 3
Whiten! Blind! In constant time! Don’t write an RNG!
But failing all, and listen well: Do not use ECB

They’ll say “It’s like a one-time-pad!
The data’s short, it’s not so bad
the keys are long–they’re iron clad
I have a PhD!”
And then you’re front page Hacker News
Your passwords cracked–Adobe Blues.
Don’t leave your penguins showing through,
Do not use ECB
— Ben Nagy, PoC||GTFO 0x04:13
Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.
Also, SHOW processlist; leaks your encryption keys. Oops.
Credit: CMYKatt
Application-layer Relational Database Cryptography
Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.
And, yes, that’s part of the answer. But how you encrypt is important.
Credit: Harubaki
“I’ll encrypt with CBC mode.”
If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?
“Okay, fine, I’ll use an authenticated mode like GCM.”
Did you remember to make the table and column name part of your AAD? What about the primary key of the record?
“What on Earth are you talking about, Soatok?”
Welcome to the first footgun of database cryptography!
Confused Deputies
Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.
To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?
We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.
What, then, stops an attacker from copying ciphertexts around?
Credit: CMYKatt
Let’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.
All I would need to do is simply…
UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345
…and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and it will just decrypt. Even if you’re using an authenticated mode.
We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.
The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)
- $addr = aes_gcm_encrypt($addr, $key);
+ $addr = aes_gcm_encrypt($addr, $key, canonicalize([
+     $tableName,
+     $columnName,
+     $primaryKey
+ ]));
Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
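Here’s a small Python sketch of that failure mode. To stay within the standard library it uses an HMAC-SHA256 tag as a stand-in for the authentication half of a real AEAD mode, so nothing in it is actually encrypted; it only shows how binding the context makes a copied value fail verification. The helpers `context()`, `seal()` and `open_sealed()`, the demo key, and the table and column names are all invented for this example, and the JSON array is just one convenient unambiguous encoding.

```python
import hashlib
import hmac
import json

KEY = b"\x02" * 32  # demo key only


def context(table: str, column: str, primary_key) -> bytes:
    """Unambiguous encoding of where a value lives (stand-in for canonicalize())."""
    return json.dumps([table, column, str(primary_key)]).encode("utf-8")


def _tag(value: bytes, aad: bytes, key: bytes) -> bytes:
    """HMAC over the length-prefixed context followed by the value."""
    message = len(aad).to_bytes(8, "little") + aad + value
    return hmac.new(key, message, hashlib.sha256).digest()


def seal(value: bytes, aad: bytes, key: bytes = KEY) -> bytes:
    """Append a tag covering both the context and the value. No encryption here."""
    return value + _tag(value, aad, key)


def open_sealed(blob: bytes, aad: bytes, key: bytes = KEY) -> bytes:
    """Return the value if the tag matches this context, else raise ValueError."""
    value, tag = blob[:-32], blob[-32:]
    if not hmac.compare_digest(tag, _tag(value, aad, key)):
        raise ValueError("authentication failed: wrong context or tampered data")
    return value


victim = seal(b"123 Main St", context("users", "addr_encrypted", 67890))

# The attacker copies the victim's blob into their own row (id 12345).
try:
    open_sealed(victim, context("users", "addr_encrypted", 12345))
    moved_ok = True
except ValueError:
    moved_ok = False
```

With a real AEAD library you would pass the same context bytes as the AAD argument instead of rolling a tag by hand.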
This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
Canonicalization Attacks
In the previous section, I introduced a pseudocode function called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.
First, consider what happens if you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…
function dumbCanonicalize(
    string $tableName,
    string $columnName,
    string|int $primaryKey
): string {
    return $tableName . '_' . $columnName . '#' . $primaryKey;
}
Consider these two inputs to this function:
dumbCanonicalize('customers', 'last_order_uuid', 123);
dumbCanonicalize('customers_last_order', 'uuid', 123);
In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
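If you want to see the collision for yourself, here’s a quick Python port of the naive function next to one possible fix: prefix the number of parts, then prefix each part with its length as a little-endian 64-bit integer. The `canonicalize()` below is my own sketch of that idea, not code from any particular library.

```python
import struct


def dumb_canonicalize(table: str, column: str, primary_key) -> bytes:
    """Python port of dumbCanonicalize(): plain joining with separators."""
    return f"{table}_{column}#{primary_key}".encode("utf-8")


def canonicalize(parts) -> bytes:
    """Prefix the part count and each part's byte length as little-endian uint64."""
    out = struct.pack("<Q", len(parts))
    for part in parts:
        # str() means 123 and "123" encode identically, as in the PHP version.
        data = str(part).encode("utf-8")
        out += struct.pack("<Q", len(data)) + data
    return out


a = ("customers", "last_order_uuid", 123)
b = ("customers_last_order", "uuid", 123)

print("naive encodings collide:", dumb_canonicalize(*a) == dumb_canonicalize(*b))
print("length-prefixed encodings collide:", canonicalize(a) == canonicalize(b))
```

The naive version produces identical bytes for both inputs. The length-prefixed version cannot, because the boundary between the table name and the column name is part of the output.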
In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.
Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.
Multi-Tenancy
Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.
Often times, however, software developers are tasked with building support for Bring Your Own Key (BYOK).
This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).
Other times, this is driven by a need to cut costs: Storing different users’ data in the same database server, but encrypting it such that they can only decrypt their own records.
Two things can happen when you introduce multi-tenancy into your database cryptography designs:
- Invisible Salamanders become a risk, due to multiple keys being possible for any given encrypted record.
- Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.
So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.
Isn’t cryptography fun?
“What Are Invisible Salamanders?”
This refers to a fun property of AEAD modes based on Polynomial MACs. Basically, if you:
- Encrypt one message under a specific key and nonce.
- Encrypt another message under a separate key and nonce.
…Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.
This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature because another attached file had the same authentication tag and ciphertext, and you could trick the system if you disclosed the second key instead of the first. Thus, the salamander is invisible to attackers.
Art: CMYKat
We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.
Cryptography for NoSQL Databases
Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.
NoSQL is Built Different
The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.
What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.
Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.
Consider the following record structure, which holds a bunch of data encrypted with AES in CBC mode:
{
  "encrypted-data-key": "<blob>",
  "name": "<ciphertext>",
  "address": [
    "<ciphertext>",
    "<ciphertext>"
  ],
  "social-security": "<ciphertext>",
  "zip-code": "<ciphertext>"
}
If this record is decrypted with code that looks something like this:
$decrypted = [];
// ... snip ...
foreach ($record['address'] as $i => $addrLine) {
    try {
        $decrypted['address'][$i] = $this->decrypt($addrLine);
    } catch (Throwable $ex) {
        // You'd never deliberately do this, but it's for illustration
        $this->doSomethingAnOracleCanObserve($i);
        // This is more believable, of course:
        $this->logDecryptionError($ex, $addrLine);
        $decrypted['address'][$i] = '';
    }
}
Then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.
Art: Harubaki
This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.
Record Authentication
A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.
Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.
The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.
This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.
Except…
When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.
Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.
And you need to think about how to model that in a way that doesn’t prohibit schema upgrades nor allow attackers to perform downgrade attacks. (See below.)
I don’t have any specific real-world examples here that I can point to of this problem being solved well.
Art: CMYKat
Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.
As I said with encryption before, but also true here:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
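As a rough illustration of “authenticate the path along with the value,” here’s a Python sketch that walks a nested record and feeds every leaf’s fully qualified path and value into a single HMAC-SHA256. The helper names (`flatten()`, `record_tag()`), the demo key, and the use of `repr()` to serialize values are all simplifying assumptions of mine; a real design would pin down an exact value encoding.

```python
import hashlib
import hmac
import struct

KEY = b"\x03" * 32  # demo key only


def flatten(record: dict, prefix: tuple = ()):
    """Yield (path, value) pairs for every leaf, with paths as tuples of keys."""
    for name in sorted(record):
        value = record[name]
        if isinstance(value, dict):
            yield from flatten(value, prefix + (name,))
        else:
            yield prefix + (name,), value


def _lp(data: bytes) -> bytes:
    """Length-prefix a byte string with a little-endian uint64."""
    return struct.pack("<Q", len(data)) + data


def record_tag(record: dict, key: bytes = KEY) -> str:
    """HMAC-SHA256 over every leaf's fully qualified path and value."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for path, value in flatten(record):
        mac.update(struct.pack("<Q", len(path)))  # number of path segments
        for segment in path:
            mac.update(_lp(segment.encode("utf-8")))
        mac.update(_lp(repr(value).encode("utf-8")))  # simplistic value encoding
    return mac.hexdigest()


original = {"billing": {"zip": "12345"}, "shipping": {"zip": "99999"}}
swapped = {"billing": {"zip": "99999"}, "shipping": {"zip": "12345"}}
print(record_tag(original) == record_tag(swapped))
```

Both records contain the same two values, but because each value is bound to its path, swapping them between `billing` and `shipping` changes the tag.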
Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
Art: Harubaki
Okay, how do you solve this problem so that you can perform updates and upgrades to your schema but without enabling attackers to downgrade the security? Here’s one possible design.
Let’s say you have two metadata fields on each record:
- A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this meta-auth.
- A compressed binary string representing which of the authenticated fields should also be encrypted. This field is also authenticated. This is at most the same length as the first metadata field. Let’s call this meta-enc.
Furthermore, you will specify a canonical field ordering for both how data is fed into the signature algorithm as well as the field mappings in meta-auth and meta-enc.
{
  "example": {
    "credit-card": {
      "number": /* encrypted */,
      "expiration": /* encrypted */,
      "ccv": /* encrypted */
    },
    "superfluous": {
      "rewards-member": null
    }
  },
  "meta-auth": compress_bools([
    true, /* example.credit-card.number */
    true, /* example.credit-card.expiration */
    true, /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true /* meta-enc */
  ]),
  "meta-enc": compress_bools([
    true, /* example.credit-card.number */
    true, /* example.credit-card.expiration */
    true, /* example.credit-card.ccv */
    false /* example.superfluous.rewards-member */
  ]),
  "signature": /* -- snip -- */
}
When you go to append data to an existing record, you’ll need to update meta-auth to include the mapping of fields based on this canonical ordering to ensure only the intended fields get validated.
When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:
- New records will have the additional field flagged as authenticated in meta-auth (and meta-enc will grow)
- Old records will not, but your code will still sign them successfully
- To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.
You might think meta-auth gives an attacker some advantage, but this only includes which fields are included in the security boundary of the signature or MAC, which allows unauthenticated data to be appended for whatever operational purpose without having to update signatures or expose signing keys to a wider part of the network.
{
  "example": {
    "credit-card": {
      "number": /* encrypted */,
      "expiration": /* encrypted */,
      "ccv": /* encrypted */
    },
    "superfluous": {
      "rewards-member": null
    }
  },
  "meta-auth": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true,  /* meta-enc */
    true   /* meta-version */
  ]),
  "meta-enc": compress_bools([
    true,  /* example.credit-card.number */
    true,  /* example.credit-card.expiration */
    true,  /* example.credit-card.ccv */
    false, /* example.superfluous.rewards-member */
    true   /* meta-version */
  ]),
  "meta-version": 0x01000000,
  "signature": /* -- snip -- */
}
If an attacker tries to use the meta-auth field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).
Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
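The compress_bools() helper in the records above is pseudocode. As a rough sketch of what it could do, here is one possible bit-packing in Python. The function names, the bit order (least significant bit first), and the idea of tracking the field count separately are all assumptions made for illustration, not part of any published design.

```python
def compress_bools(flags):
    """Pack a list of booleans into bytes, 8 flags per byte, LSB first."""
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def expand_bools(packed, count):
    """Inverse of compress_bools(); `count` is the number of fields in the
    canonical ordering, which the caller must know from the record."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(count)]


# Flags in canonical field order, mirroring the first example record:
# number, expiration, ccv, rewards-member, meta-enc
meta_auth_flags = [True, True, True, False, True]
meta_auth = compress_bools(meta_auth_flags)
round_trip = expand_bools(meta_auth, len(meta_auth_flags))
```

Whatever encoding you pick, the packed value itself has to be covered by the signature or MAC, otherwise flipping a bit would silently move a field outside the security boundary.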
Searchable Encryption
If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.
Art: CMYKat
Okay, put away the pitchforks; I will explain.
Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.
And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.
Art: Sophie
So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:
- Encrypting the data, to protect it from unauthorized disclosure, but rendering it useless
- Doing anything useful with the data, but leaving it unencrypted in the database
This is especially annoying for business types that are all in on the Zero Trust buzzword.
Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.
Order-{Preserving, Revealing} Encryption
On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.
[D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.
Attack of the week: searchable encryption and the ever-expanding leakage function
The problem with these designs is that they leak enough information that they no longer provide semantic security.
From Grubbs, et al. (GLMP, 2019.)
Colors inverted to fit my blog’s theme better.
To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.
Order revealing
Reveals much more than order
Softcore ECB

Order preserving
Semantic security?
Only in your dreams

Haiku for your consideration
Deterministic Encryption
Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.
If you recall the AES_{De,En}crypt() functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.
SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');
However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can encrypt a small number of distinct records safely before you’re no longer secure.
From Page 14 of the linked paper.
That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.
Homomorphic Encryption
In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated by serving Wikipedia pages in a reasonable amount of time.
Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.
If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
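To make the two homomorphisms mentioned above concrete, here is a small Python sketch. It uses textbook RSA with tiny, insecure parameters chosen for illustration, and a random keystream as a stand-in for the AES-CTR keystream (the standard library has no AES). None of this is a usable cryptosystem; it only shows the algebra.

```python
import secrets

# Textbook RSA (no padding) with toy parameters. Illustration only.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+


def rsa_encrypt(m):
    return pow(m, e, n)


def rsa_decrypt(c):
    return pow(c, d, n)


# Multiplying two ciphertexts multiplies the underlying plaintexts.
m1, m2 = 7, 12
c_product = (rsa_encrypt(m1) * rsa_encrypt(m2)) % n
rsa_product = rsa_decrypt(c_product)  # equals (m1 * m2) % n


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


# Stream-cipher style encryption: ciphertext = plaintext XOR keystream.
plaintext = b"PAY 0100"
keystream = secrets.token_bytes(len(plaintext))  # stand-in for AES-CTR output
ciphertext = xor_bytes(plaintext, keystream)

# XORing a difference into the ciphertext XORs it into the plaintext,
# without the party doing the XOR ever learning the keystream.
delta = xor_bytes(b"PAY 0100", b"PAY 9100")
stream_result = xor_bytes(xor_bytes(ciphertext, delta), keystream)
```

The same property that lets a server compute on data it cannot read is what lets an attacker tamper with unauthenticated ciphertext, which is why the chosen-ciphertext comparison is apt.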
Art: Harubaki
Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.
However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.
Just say “NO” to the copyright cartels.
Art: CMYKat
Searchable Symmetric Encryption (SSE)
Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?
Art: CMYKat
That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.
The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.
Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.
Some schemes only allow for searching a static set of records, while others can accommodate new data over time (with the trade-off between more leakage or worse performance).
If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.
You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”
Your guess is as good as mine, but I can think of a few reasons:
- The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (i.e. OpenSSL’s libcrypto or libsodium).
- Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
- Insufficient marketing and developer advocacy.
SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.
Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.
You Can Have a Little HMAC, As a Treat
Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.
The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.
The idea I’m alluding to goes like this:
- Encrypt your data, securely, using symmetric cryptography.
  (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
- Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.
When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?
Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?
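A minimal sketch of the HMAC index described above, in Python. The key handling, the in-memory "table", the column names, and the placeholder ciphertext are all hypothetical; in a real system the ciphertext would come from your authenticated encryption routine and the index key from your key management.

```python
import hashlib
import hmac
import secrets

# Hypothetical indexing key. It is used only for the index, never for
# encryption, and stays static even when encryption keys rotate.
INDEX_KEY = secrets.token_bytes(32)


def blind_index(plaintext, index_key=INDEX_KEY):
    """HMAC-SHA256 of the plaintext, stored next to the ciphertext."""
    return hmac.new(index_key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


rows = []  # stand-in for a database table


def insert(email):
    rows.append({
        "email_ciphertext": b"<output of your real encryption goes here>",
        "email_index": blind_index(email),
    })


def find(email):
    """Recompute the HMAC of the query and return the matching row numbers."""
    wanted = blind_index(email)
    return [i for i, row in enumerate(rows)
            if hmac.compare_digest(row["email_index"], wanted)]


insert("alice@example.com")
insert("bob@example.com")
matches = find("bob@example.com")
misses = find("carol@example.com")
```

In an actual database the comparison would be an ordinary indexed equality lookup on the email_index column; the linear scan here only keeps the sketch self-contained.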
Hooray!
However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.
How annoying!
Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.
Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.
“K-What Now?”
Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.
Each HMAC output corresponds 1:1 with a unique plaintext.
Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.
For example, an HMAC-SHA256 output might look like this:
Art: CMYKat
04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae
If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields a8b1ae), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.
Which means if you’re only revealing a truncated HMAC tag to the database server (both when storing records and when retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.
These false positives give your data a discrete set of anonymity (called k-anonymity), which means an attacker with access to your database cannot:
- Distinguish between two encrypted records with the same short HMAC tag.
- Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
As with SSE above, this short HMAC technique exposes a trade-off to users.
- Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
- Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.
Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
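Here is a small Python sketch of truncating to a number of bits rather than bytes. It keeps the trailing bits to match the "last 3 bytes" example above; the function names, the fixed demo key, and the sample plaintexts are assumptions for illustration, and the 4-bit width is deliberately tiny so collisions are easy to see.

```python
import hashlib
import hmac


def truncate_tag(tag, bits):
    """Keep only the trailing `bits` bits of an HMAC tag, as an integer."""
    if not 1 <= bits <= len(tag) * 8:
        raise ValueError("bits out of range for this tag")
    return int.from_bytes(tag, "big") & ((1 << bits) - 1)


def short_index(plaintext, index_key, bits):
    """Truncated HMAC-SHA256 tag: this is all the database server sees."""
    tag = hmac.new(index_key, plaintext, hashlib.sha256).digest()
    return truncate_tag(tag, bits)


# The example tag from the text, truncated to its last 24 bits (3 bytes).
example_tag = bytes.fromhex(
    "04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae"
)
example_short = truncate_tag(example_tag, 24)

# With only 4 bits there are at most 16 buckets, so 1000 distinct
# plaintexts are forced to share tags.
demo_key = b"\x01" * 32  # fixed key, for the demo only
buckets = {}
for i in range(1000):
    value = f"user{i}@example.com".encode("utf-8")
    buckets.setdefault(short_index(value, demo_key, 4), []).append(value)
```

After fetching every row in the matching bucket, the client decrypts each candidate and discards the ones that do not match the query, which is where the performance side of the trade-off comes from.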
If you’d like to play with this idea, here’s a quick and dirty demo script.
Intermission
If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, by making it to this point, they’ve probably been long since put to rest.
Art: Harubaki
Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting; their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.
So, if you’ve read this far, I’d like to apply what I’ve covered to a real-world case study of a database cryptography product.
Case Study: MongoDB Client-Side Encryption
MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.
Taken from the press release, but adapted for dark themes.
A statement at the bottom of their press release indicates that this isn’t clown-shoes:
Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.
If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.
So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.
MongoCrypt: The Good
MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).
MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.
How Is Queryable Encryption Implemented?
From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).
According to their press release:
The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.
Which means that most of the juicy details probably aren’t public yet.
These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.
There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.
Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.
However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.
MongoCrypt: The Bad
Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.
However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.
First, observe this code which assembles the multi-part inputs into HMAC.
/* Construct the input to the HMAC */
uint32_t num_intermediates = 0;
_mongocrypt_buffer_t intermediates[3];
// -- snip --
if (!_mongocrypt_buffer_concat (
       &to_hmac, intermediates, num_intermediates)) {
   CLIENT_ERR ("failed to allocate buffer");
   goto done;
}
if (hmac == HMAC_SHA_512_256) {
   uint8_t storage[64];
   _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};
   if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
      goto done;
   }
   // Truncate sha512 to first 256 bits.
   memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
} else {
   BSON_ASSERT (hmac == HMAC_SHA_256);
   if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
      goto done;
   }
}
The implementation of _mongocrypt_buffer_concat() can be found here.
If either the implementation of that function, or the code I snipped from my excerpt, had contained code that prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.
Using TupleHash would also have prevented this issue.
Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.
The first cryptographic flaw sort of cancels the second out.
If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.
MongoCrypt: The Ugly
MongoCrypt supports deterministic encryption.
If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.
We already discussed why this is bad above.
Wrapping Up
This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.
However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.
Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.
Art: CMYKat
https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
#appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption
-
Earlier this year, Cendyne wrote a blog post covering the use of HKDF, building partially upon my own blog post about HKDF and the KDF security definition, but moreso inspired by a cryptographic issue they identified in another company’s product (dubbed AnonCo).
At the bottom they teased:
Database cryptography is hard. The above sketch is not complete and does not address several threats! This article is quite long, so I will not be sharing the fixes.
Cendyne
If you read Cendyne’s post, you may have nodded along with that remark and not appreciated the degree to which our naga friend was putting it mildly. So I thought I’d share some of my knowledge about real-world database cryptography in an accessible and fun format in the hopes that it might serve as an introduction to the specialization.
Note: I’m also not going to fix Cendyne’s sketch of AnonCo’s software here–partly because I don’t want to get in the habit of assigning homework or required reading, but mostly because it’s kind of obvious once you’ve learned the basics.
I’m including art of my fursona in this post… as is tradition for furry blogs.
If you don’t like furries, please feel free to leave this blog and read about this topic elsewhere.
Thanks to CMYKat for the awesome stickers.
Contents
- Database Cryptography?
- Cryptography for Relational Databases
- The Perils of Built-in Encryption Functions
- Application-Layer Relational Database Cryptography
- Confused Deputies
- Canonicalization Attacks
- Multi-Tenancy
- Cryptography for NoSQL Databases
- NoSQL is Built Different
- Record Authentication
- Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
- Searchable Encryption
- Order-{Preserving, Revealing} Encryption
- Deterministic Encryption
- Homomorphic Encryption
- Searchable Symmetric Encryption (SSE)
- You Can Have a Little HMAC, As a Treat
- Intermission
- Case Study: MongoDB Client-Side Encryption
- MongoCrypt: The Good
- How is Queryable Encryption Implemented?
- MongoCrypt: The Bad
- MongoCrypt: The Ugly
- Wrapping Up
Database Cryptography?
The premise of database cryptography is deceptively simple: You have a database, of some sort, and you want to store sensitive data in said database.
The consequences of this simple premise are anything but simple. Let me explain.
Art: ScruffKerfluff
The sensitive data you want to store may need to remain confidential, or you may need to provide some sort of integrity guarantees throughout your entire system, or sometimes both. Sometimes all of your data is sensitive, sometimes only some of it is. Sometimes the confidentiality requirements of your data extend to where within a dataset the record you want actually lives. Sometimes that’s true of some data, but not others, so your cryptography has to be flexible to support multiple types of workloads.
Other times, you just want your disks encrypted at rest so if they grow legs and walk out of the data center, the data cannot be comprehended by an attacker. And you can’t be bothered to work on this problem any deeper. This is usually what compliance requirements cover. Boxes get checked, executives feel safer about their operation, and the whole time nobody has really analyzed the risks they’re facing.
But we’re not settling for mere compliance on this blog. Furries have standards, after all.
So the first thing you need to do before diving into database cryptography is threat modelling. The first step in any good threat model is taking inventory; especially of assumptions, requirements, and desired outcomes. A few good starter questions:
- What database software is being used? Is it up to date?
- What data is being stored in which database software?
- How are databases oriented in the network of the overall system?
- Is your database properly firewalled from the public Internet?
- How does data flow throughout the network, and when do these data flows intersect with the database?
- Which applications talk to the database? What languages are they written in? Which APIs do they use?
- How will cryptography secrets be managed?
- Is there one key for everyone, one key per tenant, etc.?
- How are keys rotated?
- Do you use envelope encryption with an HSM, or vend the raw materials to your end devices?
The first two questions are paramount for deciding how to write software for database cryptography, before you even get to thinking about the cryptography itself.
(This is not a comprehensive set of questions to ask, either. A formal threat model is much deeper in the weeds.)
The kind of cryptography protocol you need for, say, storing encrypted CSV files in an S3 bucket is vastly different from relational (SQL) databases, which in turn will be significantly different from schema-free (NoSQL) databases.
Furthermore, when you get to the point that you can start to think about the cryptography, you’ll often need to tackle confidentiality and integrity separately.
If that’s unclear, think of a scenario like, “I need to encrypt PII, but I also need to digitally sign the lab results so I know it wasn’t tampered with at rest.”
My point is, right off the bat, we’ve got a three-dimensional matrix of complexity to contend with:
- On one axis, we have the type of database.
- Flat-file
- Relational
- Schema-free
- On another, we have the basic confidentiality requirements of the data.
- Field encryption
- Row encryption
- Column encryption
- Unstructured record encryption
- Encrypting entire collections of records
- Finally, we have the integrity requirements of the data.
- Field authentication
- Row/column authentication
- Unstructured record authentication
- Collection authentication (based on e.g. Sparse Merkle Trees)
And then you have a fourth dimension that often falls out of operational requirements for databases: Searchability.
Why store data in a database if you have no way to index or search the data for fast retrieval?
Credit: Harubaki
If you’re starting to feel overwhelmed, you’re not alone. A lot of developers drastically underestimate the difficulty of the undertaking, until they run head-first into the complexity.
Some just phone it in with AES_Encrypt() calls in their MySQL queries. (Too bad ECB mode doesn’t provide semantic security!)
Which brings us to the meat of this blog post: The actual cryptography part.
Cryptography is the art of transforming information security problems into key management problems.
Former coworker
Note: In the interest of time, I’m skipping over flat files and focusing instead on actual database technologies.
Cryptography for Relational Databases
Encrypting data in an SQL database seems simple enough, even if you’ve managed to shake off the complexity I teased from the introduction.
You’ve got data, you’ve got a column on a table. Just encrypt the data and shove it in a cell on that column and call it a day, right?
But, alas, this is a trap. There are so many gotchas that I can’t weave a coherent, easy-to-follow narrative between them all.
So let’s start with a simple question: where and how are you performing your encryption?
The Perils of Built-in Encryption Functions
MySQL provides functions called AES_Encrypt and AES_Decrypt, which many developers have unfortunately decided to rely on in the past.
It’s unfortunate because these functions implement ECB mode. To illustrate why ECB mode is bad, I encrypted one of my art commissions with AES in ECB mode:
Art by Riley, encrypted with AES-ECB
The problems with ECB mode aren’t exactly “you can see the image through it,” because ECB-encrypting a compressed image won’t have redundancy (and thus can make you feel safer than you are).
ECB art is a good visual for the actual issue you should care about, however: A lack of semantic security.
A cryptosystem is considered semantically secure if observing the ciphertext doesn’t reveal information about the plaintext (except, perhaps, the length; which all cryptosystems leak to some extent). More information here.
ECB art isn’t to be confused with ECB poetry, which looks like this:
Oh little one, you’re growing up
You’ll soon be writing C
You’ll treat your ints as pointers
You’ll nest the ternary
You’ll cut and paste from github
And try cryptography
But even in your darkest hour
Do not use ECB

CBC’s BEASTly when padding’s abused
And CTR’s fine til a nonce is reused
Some say it’s a CRIME to compress then encrypt
Or store keys in the browser (or use javascript)
Diffie Hellman will collapse if hackers choose your g
And RSA is full of traps when e is set to 3
Whiten! Blind! In constant time! Don’t write an RNG!
But failing all, and listen well: Do not use ECB

They’ll say “It’s like a one-time-pad!
The data’s short, it’s not so bad
the keys are long–they’re iron clad
I have a PhD!”
And then you’re front page Hacker News
Your passwords cracked–Adobe Blues.
Don’t leave your penguins showing through,
Do not use ECB

Ben Nagy, PoC||GTFO 0x04:13
Most people reading this probably know better than to use ECB mode already, and don’t need any of these reminders, but there is still a lot of code that inadvertently uses ECB mode to encrypt data in the database.
Also, SHOW processlist; leaks your encryption keys. Oops.
Credit: CMYKatt
Application-layer Relational Database Cryptography
Whether burned by ECB or just cautious about not giving your secrets to the system that stores all the ciphertext protected by said secret, a common next step for developers is to simply encrypt in their server-side application code.
And, yes, that’s part of the answer. But how you encrypt is important.
Credit: Harubaki
“I’ll encrypt with CBC mode.”
If you don’t authenticate your ciphertext, you’ll be sorry. Maybe try again?
“Okay, fine, I’ll use an authenticated mode like GCM.”
Did you remember to make the table and column name part of your AAD? What about the primary key of the record?
“What on Earth are you talking about, Soatok?”
Welcome to the first footgun of database cryptography!
Confused Deputies
Encrypting your sensitive data is necessary, but not sufficient. You need to also bind your ciphertexts to the specific context in which they are stored.
To understand why, let’s take a step back: What specific threat does encrypting your database records protect against?
We’ve already established that “your disks walk out of the datacenter” is a “full disk encryption” problem, so if you’re using application-layer cryptography to encrypt data in a relational database, your threat model probably involves unauthorized access to the database server.
What, then, stops an attacker from copying ciphertexts around?
Credit: CMYKatt
Let’s say I have a legitimate user account with an ID 12345, and I want to read your street address, but it’s encrypted in the database. But because I’m a clever hacker, I have unfettered access to your relational database server.
All I would need to do is simply…
UPDATE table SET addr_encrypted = 'your-ciphertext' WHERE id = 12345
…and then access the application through my legitimate access. Bam, data leaked. As an attacker, I can probably even copy fields from other columns and it will just decrypt. Even if you’re using an authenticated mode.
We call this a confused deputy attack, because the deputy (the component of the system that has been delegated some authority or privilege) has become confused by the attacker, and thus undermined an intended security goal.
The fix is to use the AAD parameter from the authenticated mode to bind the data to a given context. (AAD = Additional Authenticated Data.)
- $addr = aes_gcm_encrypt($addr, $key);
+ $addr = aes_gcm_encrypt($addr, $key, canonicalize([
+     $tableName,
+     $columnName,
+     $primaryKey
+ ]));
Now if I start cutting and pasting ciphertexts around, I get a decryption failure instead of silently decrypting plaintext.
This may sound like a specific vulnerability, but it’s more of a failure to understand an important general lesson with database cryptography:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
Canonicalization Attacks
In the previous section, I introduced a pseudocode function called canonicalize(). This isn’t a pasto from some reference code; it’s an important design detail that I will elaborate on now.
First, consider you didn’t do anything to canonicalize your data, and you just joined strings together and called it a day…
function dumbCanonicalize(
    string $tableName,
    string $columnName,
    string|int $primaryKey
): string {
    return $tableName . '_' . $columnName . '#' . $primaryKey;
}
Consider these two inputs to this function:
dumbCanonicalize('customers', 'last_order_uuid', 123);
dumbCanonicalize('customers_last_order', 'uuid', 123);
In this case, your AAD would be the same, and therefore, your deputy can still be confused (albeit in a narrower use case).
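To see the collision and one way to avoid it, here is a Python sketch. The dumb_canonicalize() function mirrors the string-joining approach; the canonicalize() function is an assumed design (a 64-bit little-endian count of parts, then each part prefixed with its 64-bit length), offered as one reasonable encoding rather than the only correct one.

```python
import struct


def dumb_canonicalize(table, column, primary_key):
    """Plain string joining: different inputs can produce the same output."""
    return f"{table}_{column}#{primary_key}".encode("utf-8")


def canonicalize(parts):
    """Unambiguous encoding: number of parts, then length-prefixed parts.

    Lengths are unsigned 64-bit little-endian integers, so a part boundary
    can never be mistaken for part contents.
    """
    out = struct.pack("<Q", len(parts))
    for part in parts:
        # Note: str() means the integer 123 and the string "123" encode
        # identically here; add a type tag if that distinction matters.
        data = str(part).encode("utf-8")
        out += struct.pack("<Q", len(data)) + data
    return out


dumb_a = dumb_canonicalize("customers", "last_order_uuid", 123)
dumb_b = dumb_canonicalize("customers_last_order", "uuid", 123)

safe_a = canonicalize(["customers", "last_order_uuid", 123])
safe_b = canonicalize(["customers_last_order", "uuid", 123])
```

With the string-joining version both calls yield customers_last_order_uuid#123, while the length-prefixed outputs differ as soon as the first part’s length is encoded.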
In Cendyne’s article, AnonCo did something more subtle: The canonicalization bug created a collision on the inputs to HKDF, which resulted in an unintentional key reuse.
Up until this point, their mistake isn’t relevant to us, because we haven’t even explored key management at all. But the same design flaw can re-emerge in multiple locations, with drastically different consequences.
Multi-Tenancy
Once you’ve implemented a mitigation against Confused Deputies, you may think your job is done. And it very well could be.
Often times, however, software developers are tasked with building support for Bring Your Own Key (BYOK).
This is often spawned from a specific compliance requirement (such as cryptographic shredding; i.e. if you erase the key, you can no longer recover the plaintext, so it may as well be deleted).
Other times, this is driven by a need to cut costs: Storing different users’ data in the same database server, but encrypting it such that they can only decrypt their own records.
Two things can happen when you introduce multi-tenancy into your database cryptography designs:
- Invisible Salamanders becomes a risk, due to multiple keys being possible for any given encrypted record.
- Failure to address the risk of Invisible Salamanders can undermine your protection against Confused Deputies, thereby returning you to a state before you properly used the AAD.
So now you have to revisit your designs and ensure you’re using a key-committing authenticated mode, rather than just a regular authenticated mode.
Isn’t cryptography fun?
“What Are Invisible Salamanders?”
This refers to a fun property of AEAD modes based on polynomial MACs. Basically, if you:
- Encrypt one message under a specific key and nonce.
- Encrypt another message under a separate key and nonce.
…Then you can get the same exact ciphertext and authentication tag. Performing this attack requires you to control the keys for both encryption operations.
This was first demonstrated in an attack against encrypted messaging applications, where a picture of a salamander was hidden from the abuse reporting feature because another attached file had the same authentication tag and ciphertext, and you could trick the system if you disclosed the second key instead of the first. Thus, the salamander is invisible to attackers.
Art: CMYKat
We’re not quite done with relational databases yet, but we should talk about NoSQL databases for a bit. The final topic in scope applies equally to both, after all.
Cryptography for NoSQL Databases
Most of the topics from relational databases also apply to NoSQL databases, so I shall refrain from duplicating them here. This article is already sufficiently long to read, after all, and I dislike redundancy.
NoSQL is Built Different
The main thing that NoSQL databases offer in the service of making cryptographers lose sleep at night is the schema-free nature of NoSQL designs.
What this means is that, if you’re using a client-side encryption library for a NoSQL database, the previous concerns about confused deputy attacks are amplified by the malleability of the document structure.
Additionally, the previously discussed cryptographic attacks against the encryption mode may be less expensive for an attacker to pull off.
Consider the following record structure, which stores a bunch of data encrypted with AES in CBC mode:

    {
      "encrypted-data-key": "<blob>",
      "name": "<ciphertext>",
      "address": [
        "<ciphertext>",
        "<ciphertext>"
      ],
      "social-security": "<ciphertext>",
      "zip-code": "<ciphertext>"
    }

If this record is decrypted with code that looks something like this:

    $decrypted = [];
    // ... snip ...
    foreach ($record['address'] as $i => $addrLine) {
        try {
            $decrypted['address'][$i] = $this->decrypt($addrLine);
        } catch (Throwable $ex) {
            // You'd never deliberately do this, but it's for illustration
            $this->doSomethingAnOracleCanObserve($i);
            // This is more believable, of course:
            $this->logDecryptionError($ex, $addrLine);
            $decrypted['address'][$i] = '';
        }
    }

Then you can keep appending rows to the "address" field to reduce the number of writes needed to exploit a padding oracle attack against any of the <ciphertext> fields.

This isn’t to say that NoSQL is less secure than SQL, from the context of client-side encryption. However, the powerful feature sets that NoSQL users are accustomed to may also give attackers a more versatile toolkit to work with.
Record Authentication
A pedant may point out that record authentication applies to both SQL and NoSQL. However, I mostly only observe this feature in NoSQL databases and document storage systems in the wild, so I’m shoving it in here.
Encrypting fields is nice and all, but sometimes what you want to know is that your unencrypted data hasn’t been tampered with as it flows through your system.
The trivial way this is done is by using a digital signature algorithm over the whole record, and then appending the signature to the end. When you go to verify the record, all of the information you need is right there.
This works well enough for most use cases, and everyone can pack up and go home. Nothing more to see here.
Except…
When you’re working with NoSQL databases, you often want systems to be able to write to additional fields, and since you’re working with schema-free blobs of data rather than a normalized set of relatable tables, the most sensible thing to do is to append this data to the same record.
Except, oops! You can’t do that if you’re shoving a digital signature over the record. So now you need to specify which fields are to be included in the signature.
And you need to think about how to model that in a way that neither prohibits schema upgrades nor allows attackers to perform downgrade attacks. (See below.)
I don’t have any specific real-world examples here that I can point to of this problem being solved well.
Furthermore, as with preventing confused deputy and/or canonicalization attacks above, you must also include the fully qualified path of each field in the data that gets signed.
As I said with encryption before, but also true here:
Where your data lives is part of its identity, and MUST be authenticated.
Soatok’s Rule of Database Cryptography
This requirement holds true whether you’re using symmetric-key authentication (i.e. HMAC) or asymmetric-key digital signatures (e.g. EdDSA).
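To make the rule concrete, here is a minimal sketch of authenticating a chosen set of fields together with their fully qualified paths, using HMAC-SHA256 from Python's standard library. The function names, the dotted-path convention, and the length-prefix framing are my assumptions for illustration, not a reference design.

```python
import hashlib
import hmac
import struct

def _frame(piece: bytes) -> bytes:
    # Prefix with a 64-bit big-endian length so field boundaries are unambiguous.
    return struct.pack(">Q", len(piece)) + piece

def authenticate_fields(key: bytes, fields: dict[str, bytes]) -> bytes:
    """HMAC over (path, value) pairs, visited in sorted path order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(struct.pack(">Q", len(fields)))
    for path in sorted(fields):
        mac.update(_frame(path.encode("utf-8")))
        mac.update(_frame(fields[path]))
    return mac.digest()

key = b"\x01" * 32  # hypothetical demo key; use a real random key in practice
record = {
    "example.credit-card.number": b"<ciphertext-1>",
    "example.credit-card.expiration": b"<ciphertext-2>",
}
tag = authenticate_fields(key, record)

# Same two values, but swapped between paths: the tag no longer matches.
swapped = {
    "example.credit-card.number": b"<ciphertext-2>",
    "example.credit-card.expiration": b"<ciphertext-1>",
}
```

Because each value is bound to its path, moving a valid ciphertext into a different field is detected, while the insertion order of the dictionary does not affect the tag.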
Bonus: A Maximally Schema-Free, Upgradeable Authentication Design
Okay, how do you solve this problem so that you can perform updates and upgrades to your schema but without enabling attackers to downgrade the security? Here’s one possible design.
Let’s say you have two metadata fields on each record:
- A compressed binary string representing which fields should be authenticated. This field is, itself, not authenticated. Let’s call this meta-auth.
- A compressed binary string representing which of the authenticated fields should also be encrypted. This field is also authenticated. This is at most the same length as the first metadata field. Let’s call this meta-enc.
Furthermore, you will specify a canonical field ordering for both how data is fed into the signature algorithm as well as the field mappings in meta-auth and meta-enc.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-enc */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false  /* example.superfluous.rewards-member */
      ]),
      "signature": /* -- snip -- */
    }

When you go to append data to an existing record, you’ll need to update meta-auth to include the mapping of fields based on this canonical ordering to ensure only the intended fields get validated.

When you update your code to add an additional field that is intended to be signed, you can roll that out for new records and the record will continue to be self-describing:
- New records will have the additional field flagged as authenticated in meta-auth (and meta-enc will grow)
- Old records will not, but your code will still sign them successfully
- To prevent downgrade attacks, simply include a schema version ID as an additional plaintext field that gets authenticated. An attacker who tries to downgrade will need to be able to produce a valid signature too.
You might think meta-auth gives an attacker some advantage, but this only includes which fields are included in the security boundary of the signature or MAC, which allows unauthenticated data to be appended for whatever operational purpose without having to update signatures or expose signing keys to a wider part of the network.

    {
      "example": {
        "credit-card": {
          "number": /* encrypted */,
          "expiration": /* encrypted */,
          "ccv": /* encrypted */
        },
        "superfluous": {
          "rewards-member": null
        }
      },
      "meta-auth": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true,  /* meta-enc */
        true   /* meta-version */
      ]),
      "meta-enc": compress_bools([
        true,  /* example.credit-card.number */
        true,  /* example.credit-card.expiration */
        true,  /* example.credit-card.ccv */
        false, /* example.superfluous.rewards-member */
        true   /* meta-version */
      ]),
      "meta-version": 0x01000000,
      "signature": /* -- snip -- */
    }

If an attacker tries to use the meta-auth field to mess with a record, the best they can hope for is an Invalid Signature exception (assuming the signature algorithm is secure to begin with).

Even if they keep all of the fields the same, but play around with the structure of the record (e.g. changing the XPath or equivalent), so long as the path is authenticated with each field, breaking this is computationally infeasible.
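The record layout above leans on a compress_bools() helper that is never defined. One possible shape for it, sketched in Python, packs the flags into a bitmask with an explicit count in front. The two-byte count prefix, the bit order, and the decompress_bools() counterpart are all assumptions of mine.

```python
def compress_bools(flags: list[bool]) -> bytes:
    """Pack flags into bytes, most significant bit first, preceded by a count."""
    out = bytearray(len(flags).to_bytes(2, "big"))
    for start in range(0, len(flags), 8):
        byte = 0
        for offset, flag in enumerate(flags[start:start + 8]):
            if flag:
                byte |= 0x80 >> offset
        out.append(byte)
    return bytes(out)

def decompress_bools(packed: bytes) -> list[bool]:
    """Inverse of compress_bools(); rejects bodies of the wrong length."""
    count = int.from_bytes(packed[:2], "big")
    body = packed[2:]
    if len(body) != (count + 7) // 8:
        raise ValueError("length does not match flag count")
    return [bool(body[i // 8] & (0x80 >> (i % 8))) for i in range(count)]

# The first meta-auth example: three encrypted fields, one skipped, meta-enc.
meta_auth = compress_bools([True, True, True, False, True])
```

The explicit count matters because the canonical field ordering grows over time: without it, an old five-field bitmask and a newer eight-field bitmask could not be told apart once both fit in a single byte.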
Searchable Encryption
If you’ve managed to make it through the previous sections, congratulations, you now know enough to build a secure but completely useless database.
Okay, put away the pitchforks; I will explain.
Part of the reason why we store data in a database, rather than a flat file, is because we want to do more than just read and write. Sometimes computer scientists want to compute. Almost always, you want to be able to query your database for a subset of records based on your specific business logic needs.
And so, a database which doesn’t do anything more than store ciphertext and maybe signatures is pretty useless to most people. You’d have better luck selling Monkey JPEGs to furries than convincing most businesses to part with their precious database-driven report generators.
So whenever one of your users wants to actually use their data, rather than just store it, they’re forced to decide between two mutually exclusive options:
- Encrypting the data, to protect it from unauthorized disclosure, but render it useless
- Doing anything useful with the data, but leaving it unencrypted in the database
This is especially annoying for business types that are all in on the Zero Trust buzzword.
Fortunately, the cryptographers are at it again, and boy howdy do they have a lot of solutions for this problem.
Order-{Preserving, Revealing} Encryption
On the fun side of things, you have things like Order-Preserving and Order-Revealing Encryption, which Matthew Green wrote about at length.
[D]atabase encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it.
Attack of the week: searchable encryption and the ever-expanding leakage function
The problem with these designs is that they leak enough information that they no longer provide semantic security.
From Grubbs, et al. (GLMP, 2019.) Colors inverted to fit my blog’s theme better.

To put it in other words: These designs are only marginally better than ECB mode, and probably deserve their own poems too.

Order revealing
Reveals much more than order
Softcore ECB

Order preserving
Semantic security?
Only in your dreams

Haiku for your consideration
Deterministic Encryption
Here’s a simpler, but also terrible, idea for searchable encryption: Simply give up on semantic security entirely.
If you recall the AES_{De,En}crypt() functions built into MySQL I mentioned at the start of this article, those are the most common form of deterministic encryption I’ve seen in use.

    SELECT * FROM foo WHERE bar = AES_Encrypt('query', 'key');

However, there are slightly less bad variants. If you use AES-GCM-SIV with a static nonce, your ciphertexts are fully deterministic, and you can encrypt a small number of distinct records safely before you’re no longer secure.

From Page 14 of the linked paper.

That’s certainly better than nothing, but you also can’t mitigate confused deputy attacks. But we can do better than this.
Homomorphic Encryption
In a safer plane of academia, you’ll find homomorphic encryption, which researchers recently demonstrated by serving Wikipedia pages in a reasonable amount of time.
Homomorphic encryption allows computations over the ciphertext, which will be reflected in the plaintext, without ever revealing the key to the entity performing the computation.
If this sounds vaguely similar to the conditions that enable chosen-ciphertext attacks, you probably have a good intuition for how it works: RSA is homomorphic to multiplication, AES-CTR is homomorphic to XOR. Fully homomorphic encryption uses lattices, which enables multiple operations but carries a relatively enormous performance cost.
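Both of those partial homomorphisms are easy to see in a few lines of Python. This is a toy sketch under loud assumptions: the RSA parameters are tiny, unpadded textbook values, and the "CTR mode" is a SHA-256-based keystream standing in for AES-CTR so the example only needs the standard library. Neither is fit for real use.

```python
import hashlib

# Textbook RSA with tiny, insecure demo primes (no padding): multiplying two
# ciphertexts yields a ciphertext of the product of the plaintexts.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse; needs Python 3.8+

def rsa_encrypt(m: int) -> int:
    return pow(m, e, n)

def rsa_decrypt(c: int) -> int:
    return pow(c, d, n)

product_ct = (rsa_encrypt(6) * rsa_encrypt(7)) % n

# Stand-in for AES-CTR: a hash-based keystream XORed into the message.
def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def stream_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same operation for a stream cipher.
    return xor(data, keystream(key, nonce, len(data)))

key, nonce = b"k" * 32, b"n" * 12
ciphertext = stream_crypt(key, nonce, b"pay $100")

# XORing a difference into the ciphertext XORs it into the plaintext,
# without the attacker ever touching the key.
delta = xor(b"pay $100", b"pay $999")
tampered = xor(ciphertext, delta)
```

Decrypting product_ct gives the product of the two plaintexts, and decrypting tampered gives the edited message. That is the same malleability that makes unauthenticated ciphertexts dangerous, put to constructive use.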
Homomorphic encryption sometimes intersects with machine learning, because the notion of training an encrypted model by feeding it encrypted data, then decrypting it after-the-fact is desirable for certain business verticals. Your data scientists never see your data, and you have some plausible deniability about the final ML model this work produces. This is like a Siren song for Venture Capitalist-backed medical technology companies. Tech journalists love writing about it.
However, a less-explored use case is the ability to encrypt your programs but still get the correct behavior and outputs. Although this sounds like a DRM technology, it’s actually something that individuals could one day use to prevent their ISPs or cloud providers from knowing what software is being executed on the customer’s leased hardware. The potential for a privacy win here is certainly worth pondering, even if you’re a tried and true Pirate Party member.
Just say “NO” to the copyright cartels.
Searchable Symmetric Encryption (SSE)
Forget about working at the level of fields and rows or individual records. What if we, instead, worked over collections of documents, where each document is viewed as a set of keywords from a keyword space?
That’s the basic premise of SSE: Encrypting collections of documents rather than individual records.
The actual implementation details differ greatly between designs. They also differ greatly in their leakage profiles and susceptibility to side-channel attacks.
Some schemes use a so-called trapdoor permutation, such as RSA, as one of their building blocks.
Some schemes only allow for searching a static set of records, while others can accommodate new data over time (at the cost of either more leakage or worse performance).
If you’re curious, you can learn more about SSE here, and see some open source SSE implementations online here.
You’re probably wondering, “If SSE is this well-studied and there are open source implementations available, why isn’t it more widely used?”
Your guess is as good as mine, but I can think of a few reasons:
- The protocols can be a little complicated to implement, and aren’t shipped by default in cryptography libraries (e.g. OpenSSL’s libcrypto or libsodium).
- Every known security risk in SSE is the product of a trade-off, rather than there being a single winner for all use cases that developers can feel comfortable picking.
- Insufficient marketing and developer advocacy.
SSE schemes are mostly of interest to academics, although Seny Kamara (Brown University professor and one of the luminaries of searchable encryption) did try to develop an app called Pixek which used SSE to encrypt photos.
Maybe there’s room for a cryptography competition on searchable encryption schemes in the future.
You Can Have Little a HMAC, As a Treat
Finally, I can’t talk about searchable encryption without discussing a technique that’s older than dirt by Internet standards, that has been independently reinvented by countless software developers tasked with encrypting database records.
The oldest version I’ve been able to track down dates to 2006 by Raul Garcia at Microsoft, but I’m not confident that it didn’t exist before.
The idea I’m alluding to goes like this:
- Encrypt your data, securely, using symmetric cryptography. (Hopefully your encryption addresses the considerations outlined in the relevant sections above.)
- Separately, calculate an HMAC over the unencrypted data with a separate key used exclusively for indexing.
When you need to query your data, you can just recalculate the HMAC of your challenge and fetch the records that match it. Easy, right?
Even if you rotate your keys for encryption, you keep your indexing keys static across your entire data set. This lets you have durable indexes for encrypted data, which gives you the ability to do literal lookups for the performance hit of a hash function.
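The lookup side of this idea fits in a few lines. The sketch below covers only the HMAC index, with the encrypted column represented by an opaque placeholder. The function names, the in-memory "table", and the hard-coded key are assumptions for illustration.

```python
import hashlib
import hmac

def blind_index(index_key: bytes, plaintext: str) -> str:
    """Keyed, deterministic lookup tag for an exact-match query."""
    return hmac.new(index_key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()

index_key = bytes.fromhex("00" * 31 + "01")  # hypothetical; load from your KMS

# Rows as the database would see them: opaque ciphertext plus the index tag.
rows = [
    {"id": 1, "email_ct": b"<ciphertext>",
     "email_idx": blind_index(index_key, "alice@example.com")},
    {"id": 2, "email_ct": b"<ciphertext>",
     "email_idx": blind_index(index_key, "bob@example.com")},
]

def find_by_email(email: str) -> list[int]:
    # Recompute the tag client-side, then match on the stored index column.
    wanted = blind_index(index_key, email)
    return [row["id"] for row in rows if row["email_idx"] == wanted]
```

In a real database the list comprehension becomes an indexed equality lookup on the tag column, and the index key never leaves the client.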
Additionally, everyone has HMAC in their toolkit, so you don’t have to move around implementations of complex cryptographic building blocks. You can live off the land. What’s not to love?
Hooray!

However, if you stopped here, we regret to inform you that your data is no longer indistinguishable from random, which probably undermines the security proof for your encryption scheme.

How annoying!

Of course, you don’t have to stop with the addition of plain HMAC to your database encryption software.
Take a page from Troy Hunt: Truncate the output to provide k-anonymity rather than a direct literal look-up.
“K-What Now?”
Imagine you have a full HMAC-SHA256 of the plaintext next to every ciphertext record with a static key, for searchability.
Each HMAC output corresponds 1:1 with a unique plaintext.
Because you’re using HMAC with a secret key, an attacker can’t just build a rainbow table like they would when attempting password cracking, but it still leaks duplicate plaintexts.
For example, an HMAC-SHA256 output might look like this:
    04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae

If you were to slice off most of those bytes (e.g. leaving only the last 3, which in the previous example yields a8b1ae), then with sufficient records, multiple plaintexts will now map to the same truncated HMAC tag.

Which means if you’re only revealing a truncated HMAC tag to the database server (both when storing records or retrieving them), you can now expect false positives due to collisions in your truncated HMAC tag.
These false positives give your data a discrete set of anonymity (called k-anonymity), which means an attacker with access to your database cannot:
- Distinguish between two encrypted records with the same short HMAC tag.
- Reverse engineer the short HMAC tag into a single possible plaintext value, even if they can supply candidate queries and study the tags sent to the database.
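Here is a small sketch of the truncation step, measured in bits rather than bytes. The short_tag() name, the choice to keep the trailing bits, and the demo key are assumptions; the hex string is the example tag from above.

```python
import hashlib
import hmac

def short_tag(index_key: bytes, plaintext: bytes, bits: int) -> int:
    """Keep only the last `bits` bits of HMAC-SHA256, as an integer."""
    if not 1 <= bits <= 256:
        raise ValueError("bits must be between 1 and 256")
    digest = hmac.new(index_key, plaintext, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)

# The example tag from above, cut down to its last 3 bytes (24 bits).
full = bytes.fromhex(
    "04a74e4c0158e34a566785d1a5e1167c4e3455c42aea173104e48ca810a8b1ae"
)
last_24_bits = int.from_bytes(full, "big") & ((1 << 24) - 1)

demo_key = b"\x03" * 32  # hypothetical key for illustration only
# 100 plaintexts cannot fit into 16 possible 4-bit tags without sharing.
tags = [short_tag(demo_key, str(i).encode(), 4) for i in range(100)]
```

A query for one tag value now returns every record in that bucket, and the client decrypts the candidates and discards the mismatches.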
As with SSE above, this short HMAC technique exposes a trade-off to users.
- Too much k-anonymity (i.e. too many false positives), and you will have to decrypt-then-discard multiple mismatching records. This can make queries slow.
- Not enough k-anonymity (i.e. insufficient false positives), and you’re no better off than a full HMAC.
Even more troublesome, the right amount to truncate is expressed in bits (not bytes), and calculating this value depends on the number of unique plaintext values you anticipate in your dataset. (Fortunately, it grows logarithmically, so you’ll rarely if ever have to tune this.)
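For a rough feel of the numbers: if tags behave like uniformly random values, N unique plaintexts spread over 2^bits possible tags gives an average of N / 2^bits plaintexts per tag. The helper below turns that estimate around to suggest a tag length. It is a back-of-the-envelope sketch with hypothetical names, not a vetted sizing formula.

```python
def average_bucket_size(unique_plaintexts: int, bits: int) -> float:
    """Average number of distinct plaintexts sharing one truncated tag."""
    return unique_plaintexts / (2 ** bits)

def bits_for_bucket_size(unique_plaintexts: int, target_k: int) -> int:
    """Longest tag (in bits) whose average bucket still holds >= target_k values."""
    if target_k < 1 or unique_plaintexts < 2 * target_k:
        raise ValueError("need at least 2 * target_k unique plaintexts")
    # floor(log2(N / k)), computed with integers to avoid float rounding.
    return (unique_plaintexts // target_k).bit_length() - 1

suggested = bits_for_bucket_size(1_000_000, 16)
```

Doubling the number of unique plaintexts only adds one bit to the suggestion, which is the logarithmic growth mentioned above.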
If you’d like to play with this idea, here’s a quick and dirty demo script.
Intermission
If you started reading this post with any doubts about Cendyne’s statement that “Database cryptography is hard”, by making it to this point, they’ve probably been long since put to rest.
Conversely, anyone that specializes in this topic is probably waiting for me to say anything novel or interesting; their patience wearing thin as I continue to rehash a surface-level introduction of their field without really diving deep into anything.
Thus, if you’ve read this far, I’d like to apply what I’ve covered so far to a real-world case study of a database cryptography product.
Case Study: MongoDB Client-Side Encryption
MongoDB is an open source schema-free NoSQL database. Last year, MongoDB made waves when they announced Queryable Encryption in their upcoming client-side encryption release.
Taken from the press release, but adapted for dark themes.

A statement at the bottom of their press release indicates that this isn’t clown-shoes:
Queryable Encryption was designed by MongoDB’s Advanced Cryptography Research Group, headed by Seny Kamara and Tarik Moataz, who are pioneers in the field of encrypted search. The Group conducts cutting-edge peer-reviewed research in cryptography and works with MongoDB engineering teams to transfer and deploy the latest innovations in cryptography and privacy to the MongoDB data platform.
If you recall, I mentioned Seny Kamara in the SSE section of this post. They certainly aren’t wrong about Kamara and Moataz being pioneers in this field.
So with that in mind, let’s explore the implementation in libmongocrypt and see how it stands up to scrutiny.
MongoCrypt: The Good
MongoDB’s encryption library takes key management seriously: They provide a KMS integration for cloud users by default (supporting both AWS and Azure).
MongoDB uses Encrypt-then-MAC with AES-CBC and HMAC-SHA256, which is congruent to what Signal does for message encryption.
How Is Queryable Encryption Implemented?
From the current source code, we can see that MongoCrypt generates several different types of tokens, using HMAC (calculation defined here).
According to their press release:
The feature supports equality searches, with additional query types such as range, prefix, suffix, and substring planned for future releases.
Which means that most of the juicy details probably aren’t public yet.
These HMAC-derived tokens are stored wholesale in the data structure, but most are encrypted before storage using AES-CTR.
There are more layers of encryption (using AEAD), server-side token processing, and more AES-CTR-encrypted edge tokens. All of this is finally serialized (implementation) as one blob for storage.
Since only the equality operation is currently supported (which is the same feature you’d get from HMAC), it’s difficult to speculate what the full feature set looks like.
However, since Kamara and Moataz are leading its development, it’s likely that this feature set will be excellent.
MongoCrypt: The Bad
Every call to do_encrypt() includes at most the Key ID (but typically NULL) as the AAD. This means that the concerns over Confused Deputies (and NoSQL specifically) are relevant to MongoDB.

However, even if they did support authenticating the fully qualified path to a field in the AAD for their encryption, their AEAD construction is vulnerable to the kind of canonicalization attack I wrote about previously.

First, observe this code which assembles the multi-part inputs into HMAC.

    /* Construct the input to the HMAC */
    uint32_t num_intermediates = 0;
    _mongocrypt_buffer_t intermediates[3];
    // -- snip --
    if (!_mongocrypt_buffer_concat (
           &to_hmac, intermediates, num_intermediates)) {
       CLIENT_ERR ("failed to allocate buffer");
       goto done;
    }
    if (hmac == HMAC_SHA_512_256) {
       uint8_t storage[64];
       _mongocrypt_buffer_t tag = {.data = storage, .len = sizeof (storage)};
       if (!_crypto_hmac_sha_512 (crypto, Km, &to_hmac, &tag, status)) {
          goto done;
       }
       // Truncate sha512 to first 256 bits.
       memcpy (out->data, tag.data, MONGOCRYPT_HMAC_LEN);
    } else {
       BSON_ASSERT (hmac == HMAC_SHA_256);
       if (!_mongocrypt_hmac_sha_256 (crypto, Km, &to_hmac, out, status)) {
          goto done;
       }
    }

The implementation of _mongocrypt_buffer_concat() can be found here.

If either the implementation of that function, or the code I snipped from my excerpt, had contained code that prefixed every segment of the AAD with the length of the segment (represented as a uint64_t to make overflow infeasible), then their AEAD mode would not be vulnerable to canonicalization issues.

Using TupleHash would also have prevented this issue.
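The difference between plain concatenation and length-prefixed framing is easiest to see side by side. This Python sketch is a generic illustration of the technique, not libmongocrypt's code. The function names, the example segments, and the little-endian encoding are assumptions.

```python
import hashlib
import hmac
import struct

def naive_mac(key: bytes, parts: list[bytes]) -> bytes:
    # Plain concatenation: segment boundaries are lost.
    return hmac.new(key, b"".join(parts), hashlib.sha256).digest()

def framed_mac(key: bytes, parts: list[bytes]) -> bytes:
    # Prefix the segment count, then each segment with its length,
    # each encoded as a little-endian uint64.
    mac = hmac.new(key, struct.pack("<Q", len(parts)), hashlib.sha256)
    for part in parts:
        mac.update(struct.pack("<Q", len(part)))
        mac.update(part)
    return mac.digest()

key = b"\x02" * 32  # hypothetical demo key

# Two different segmentations that concatenate to the same byte string.
a = [b"users.alice", b".ssn"]
b = [b"users.alice.s", b"sn"]
```

With naive_mac() the two inputs produce the same tag, so an attacker can shift bytes between segments without detection. With framed_mac() the tags differ, because the lengths are part of what gets authenticated.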
Silver lining for MongoDB developers: Because the AAD is either a key ID or NULL, this isn’t exploitable in practice.
The first cryptographic flaw sort of cancels the second out.
If the libmongocrypt developers ever want to mitigate Confused Deputy attacks, they’ll need to address this canonicalization issue too.
MongoCrypt: The Ugly
MongoCrypt supports deterministic encryption.
If you specify deterministic encryption for a field, your application passes a deterministic initialization vector to AEAD.
We already discussed why this is bad above.
Wrapping Up
This was not a comprehensive treatment of the field of database cryptography. There are many areas of this field that I did not cover, nor do I feel qualified to discuss.
However, I hope anyone who takes the time to read this finds themselves more familiar with the subject.
Additionally, I hope any developers who think “encrypting data in a database is [easy, trivial] (select appropriate)” will find this broad introduction a humbling experience.
https://soatok.blog/2023/03/01/database-cryptography-fur-the-rest-of-us/
#appliedCryptography #blockCipherModes #cryptography #databaseCryptography #databases #encryptedSearch #HMAC #MongoCrypt #MongoDB #QueryableEncryption #realWorldCryptography #security #SecurityGuidance #SQL #SSE #symmetricCryptography #symmetricSearchableEncryption
-
A Guide to Ghost Hunting Guidebooks: NO MORE! Please!
This might come as a shock to the millions of ghost enthusiasts out there: The scientific consensus is that ghosts are NOT spirits, remnants of the dead, recordings of energy, or supernatural entities. Our existing knowledge about nature does not point to a conclusion that ghosts are a single definable thing, paranormal or normal, that you can find, observe, measure, or study. Yet, there are about 200 guides to “ghost hunting” in print or e-book form that lay out ways to obtain evidence of or make contact with ghosts. Therefore, we have a conundrum at step one of any attempt at ghost hunting – we can’t define what a ghost is, and we do not know its properties because we’ve never determined that they exist and measured them. No ghost handbook has ever led anyone to catch and identify ghosts, they can only lead you to interpret something as a ghost.
In that sense, all ghost hunting books are worthless. So why bother with them?
First, it’s an interesting cultural phenomenon. Actively investigating reports of ghosts and paranormal activity is mainstream and a popular hobby and tourism draw. In 2010, there were over 1000 paranormal investigation groups in the US, the majority of which researched hauntings. (Hill, 2010) It’s not worthless to examine why people spend their time and money on this hobby and how they go about doing it.
Second, the idea of paranormal investigation contains important aspects of society’s attitudes towards finding out about the world, deciding what is meaningful and true, using science to examine questions, cooperation and trust in a community, and taking part in a larger effort beyond one’s own small role in life.
I’m deeply interested in the second point. I’ve found that examining amateur paranormal group behaviors and output highlights concepts about science education and public discourse about belief and reality. This piece mentions 11 books on ghost hunting that I have examined. They have broad similarities and distinct differences. In the main portion, I review 4 books on the basis of the following:
- Readability (language, errors, quality of writing)
- Credibility (sources, supported arguments vs speculation, factual correctness)
- Overall value as a cultural product (Buy it or not?)
I picked these particular books for several reasons. They span a significant spectrum in time over which we can watch the evolution of ghost hunting technique. I think they are generally representative of this narrow niche. There are better and worse ones, I’m sure. In searching for a selection, I realized I could not POSSIBLY read them all, nor would I want to spend money on them. Many appear to be self-published since several ghost investigation group leaders feel the need to have their own personal volume to use.
Please note that when I mention today’s “modern” ghost hunters I am referring to those who have watched Ghost Hunters, Ghost Adventures, Paranormal State and other television shows of this genre. It’s well-established (Hill, 2010) that today’s popular hobby grew from fans of these shows who copied what they saw on TV as their preferred method.
Ghost Hunting: A Practical Guide (UK) – Andrew Green, 1973
Andrew Green was called “the Spectre Inspector” and was a well-educated pursuer of ghosts for sixty years. He felt that there was such an interest in the subject of ghosts that there was a need for a small, non-technical guide for the amateur. This is the “first-ever do-it-yourself guide for the psychic researcher”. Green eschews fanaticism and suggests that those interested in the ghost phenomenon study parapsychology, thus reflecting the thinking at that time that academic parapsychology would unlock the mystery of life after death. Therefore, a good portion of the book describes parapsychological concepts, such as telepathy, which he states can be an important consideration as to the cause of a phenomenon. He describes Zener cards experiments, which would later appear as what ghost researchers study in Ghostbusters (1984). This portion of the book will be rather strange to those weaned on 21st century ghost TV shows (if they manage to find and read this book AT ALL).
Green was certain that psychic powers would soon be recognized (and respected) by science, the church, and society. He remarked that the existence of ghosts can hardly be challenged in the face of all the cases that have been reported – a common justification for investigators to do their thing. As with many paranormal investigators, Green considered serious ghost hunting important and “groundbreaking” work, the researchers as mavericks.
Contrasting Green’s book with modern ghost guides, we can see some striking differences:
- Crisis apparitions were described as “thought pictures”. These types of events were more commonly reported then (as were poltergeists). Both were seen to be manifestations of psychical powers. Today’s ghost hunters are rarely fluent in these historical parapsychological terms.
- EVPs were called Raudive voices and are not emphasized as evidence. Green thought there were too many potential pitfalls to use them this way.
- The technology was primitive compared with what we have today. Equipment included very basic detective-type materials: level, compass, strain-gage, sand or sugar, powder for fingerprints, thread, maybe a camera. But the idea of measuring environmental variables was already being pursued by the Society for Psychical Research.
- Green mentions exorcism but it was clearly not as common as today and people were less bold about it. Today, the concept pervades pop culture and it is treated as a stunt or a ritual that you can train yourself to do. It’s taken less seriously.
- Green’s advice is that the investigator must be thorough and careful in research and provide a sophisticated investigation. He recommends studying the geology, geography, and past owners. I get the impression that Green’s investigations were not the weekend overnighters of today’s ghost hunters. They were long-term investments in time and effort. The resulting report was to be of print quality!
- The investigator should NEVER get involved in publicity for the case, Green advises. He recognized that some people are in it just for the attention and this was not a proper impetus to do this work. Well, maybe that hasn’t changed. But to restrict all publicity is not what today’s investigators would agree to.
Green judges the client in terms of credentials. Note this curious “test”:
“The production of a caseful of apparatus at the commencement of an investigation in itself constitutes a test, for the witness of a genuine phenomena will be, or should be, impressed with the serious nature of ghost hunting, while the fraudulent will be worried by the prospect of being exposed.”
That’s quaint. Times have changed.
Green states “I believe” this is the process and how it works but, as with all other ghost hunting guides reviewed here, no support is given to these suppositions. For example: Heat extracted from the environment will energize a haunting. Such ideas about ghost manifestations are very old but have yet to be supported or well-argued.
In summary, Green subscribes to ghosts as real, but this guide provides a number of pieces of sound advice and many examples of normal causes that you will not find in any recent book. He is NOT as careless and overtly credulous as modern ghost hunters. Even though he makes some howlers, he knew his history. This book is well-written and properly edited; the language is written at a higher reading level than most. Some sources are cited in the text but not enough.
How to be a Ghost Hunter – Richard Southall, 2003
This book appears to have been written in 2001 from the front information. That was at the start of the massive proliferation of ghost hunting groups in the US. Southall is located in Parkersburg, West Virginia so examples from around that area are included. He calls it a “unique handbook” and it possibly was at the time. It is not now.
The book is of the “Confessions of a Ghost Hunter” type: ghosts are defined, historical aspects are mentioned, prior cases related, procedures and equipment are suggested, collection of data and evidence are described, and advice on forming a team is offered. Southall states he has a degree in journalism and psychology; the book also has a genuine publisher (of New Age books), which brings the quality and readability of this guide above most others. However, it follows the typical outline of information and includes many unsupported claims, assumptions and statements of “fact”.
Here are some examples:
- He assumes that ghosts exist, that paranormal activity is ghost activity, and that certain descriptions are characteristics of ghosts. How he “knows” this is never explained. No sources are supplied.
- Various unsourced, un-detailed anecdotes are included. The reader is asked to accept these “just so” without proper justification.
- Undefined, sciencey-sounding terms are used throughout: “highest amount of paranormal energy”, “life force”, “psychic energy”.
- If you investigate enough, you will encounter a “demonic entity”. The Ouija board can invite it in so that device is dangerous to use. “The entity will concentrate on the one with the lowest psyche”.
- You can “recharge” a haunting with an object.
- “It is common knowledge in parapsychology and metaphysics” that everything has a life force or aura.
- Orbs are indications that an area contains a great deal of psychic energy. They concentrate around a person emanating psychic energy.
Why did Southall do a ghost hunting guide? To promote the topic. He was running a ghost tour at the time. He states his role shifted from investigation to teaching. This book fails to supply us with any sense of the author’s scientific credibility. He refers to fictional movies, such as The Sixth Sense, to suggest the real world is really like this. Southall states that the scientific method is the means to get “tangible, measurable evidence” as opposed to psychic impressions and divination, though the two methods can validate each other. He is not a scientist and it shows.
This book also shows its age. The equipment portion is written for someone who has never owned a camera. It is dull, overly simplistic and sorely out of date with regard to the use of digital equipment. He states this howler: “A photograph of a ghost cannot be denied.” This wasn’t even rational advice at the TIME, let alone in the age of phone apps.
He states a good investigator should be unbiased, but the language from start to finish is completely biased toward the belief that an area is likely haunted. Short shrift is given to examination of mundane causes. Yet he advises readers to talk up their own credibility: “Clients love credentials and memberships”. The bibliography contains no journals or scientific sources, just references to other ghost hunters’ books and mass marketed paranormal pablum.
Southall’s writing projects the attitude of a good person who is concerned with people who are having a paranormal problem and want answers that he believes he can provide. He understands that people need reassurance that what they experience is understandable and things will be OK. Unfortunately, it’s not that simple and misinformation like this makes it worse.
Ultimate Ghost Tech – Vince Wilson, 2012
This book was also published with more or less the same content as another one of Wilson’s books, “Ultimate Ghost Hunter”. Wilson informed me that he did not care for the term “Ghost Hunter” and has recently pulled that book from publication. Different title or not, the book follows the typical ghost hunter guidebook template. In one of the forewords (one is spelled “foreword”, the other “forword”), Vince is described as the “foremost expert in the technological aspects of paranormal investigation.”
In the other foreword, a rather well-respected parapsychologist reveals the blatant truth about ghost hunting technology: “Let’s face it: ghost hunters love their tech – even if they don’t know how to use it or to assess the data from it in light of the reported phenomena”. Indeed. I agree with that.
The rest of this book is an example of sounding sciencey but falling short of representing anything like scientific investigation. Wilson focuses on technology, of course. An earlier book, Ghost Science – which I saw as a must-read since I am deeply interested in ghosts + science – was atrocious. It was sloppy, formatted terribly, and at the very least, desperately needed an editor who could spell and eliminate awful turns of phrase. That book begins with the premise “One of the main purposes of this book is to show that, not only do ghosts exist but also that the laws that govern reality allow them”. Neither that book, nor this one will demonstrate that stated purpose to anyone who understands how science actually works. Wilson’s array of books (3) are essentially self-published. But according to Wilson, he has progressed past that first book, yet he still stands by the work he did in this one. I cringed at many aspects of UGT and how readers will be misinformed by much of its content.
Examples:
- He states “random energy particles may hold the essence of consciousness…” There is no basis for such speculation. Shall we talk homeopathy?
- “Ghosts will be proven to exist one day and so will psychics…” What is the basis of this claim? What will that effort entail? Why after 100 years of trying by actual professionals will things change now with amateur researchers?
- He uses several phrases that are painful to read, such as “just another theory” (where “theory” is used to mean “a guess” instead of the scientific meaning of an evidence-supported overarching model of explanation), “science is absolute” (What does that even mean?), “sorry about the math” (If you have to apologize for the language of science, you should NOT be reading or writing such a book) and “blah blah blah” (I can hardly think of ANY excuse to write that).
- He refers to “stuffy scientists” and takes a disparaging tone towards skeptics. In Ghost Science, he called skepticism a quasi-religion.
Several statements rankle me as revealing a disturbingly superficial and inflated attitude among ghost hunting hobbyists. He says Ghostbusters (the movie) changed paranormal research with its lingo and gadgets: “Paranormal research just became really cool overnight.” He suggests science as a way to pump up your credibility – not real science, but faking it – saying you should answer questions from people with sciencey words to sound “professional and cool” and a little “nerdy”. People are too embarrassed to ask what you mean.
Not me. I ask. And science-pretenders skirt the uncomfortable questions.
Wilson relates all the ubiquitous (and wrong) assumptions about ghosts starting with the belief that they exist (thus scuttling any unbiased investigation of what might really be happening to people). The paradigm of today’s ghost investigation is reflected: changes in the environment can be related to ghost behavior and hauntings; technology can provide objective evidence, more and different data than human experience alone. For example, he suggests that a cold spot could be created (through an explanation of energy transfer) from an entity moving through dimensions. This type of rhetoric (apparent in nearly all ghost hunting guides) gives hope but very flimsy justification to other ghost hunters that they will discover something scientifically incredible:
“You can be an amateur parapsychologist and usher in a new era of paranormal research. Wow! That’s pretty deep for me!” (p 160)
Cringe-worthy and specious.
Wilson, like many of these guide writers, seems well-meaning and willing to learn new things and expand his horizons. He is fairly literate in science ideas – just enough to sound knowledgeable to people who aren’t scientists, which is most of the population. He is not a scientist but a science enthusiast. It’s a widespread trend for ghost hunters to quote scientific buzzwords and namedrop famous scientists. They attempt to apply very complex physics concepts and theories, such as quantum mechanics and Einstein’s “spooky action at a distance”, to inappropriate situations. There are no scientific sources cited or referenced and explained. There are basically NO sources for the various claims or even the quotes. The recommended reading list contains references that repeat these unverified speculative claims and includes pop science sources like The Handy Science Answer Book. This is just not acceptable if you claim to be doing science.
Wilson understands that TV ghost hunters are playing a role and that many paranormal investigators are “fooled by an intense need to believe”. Hoaxes are rampant. So, there is a kernel of truth in much of what he writes. However, that is trumped by his own faith that equipment CAN detect anomalous energy of some sort. The processes he suggests leave out critical considerations about confounding factors and alternative explanations. Wilson has lectured as a ghost tech expert in the past. He suggests giving workshops to teach people about this topic is a good way to fundraise for your group. I find this playing pretend professor/scientist to be profoundly distasteful.
I accept, as an unfortunate consequence, that Vince will be unhappy with my take on his publications. But anyone who makes extraordinary claims that are so off the mark, so unjustified, and so liable to misinform society opens themselves to such harsh criticism. I will call you on bullshit and hope you will consider ceasing its propagation.
How to Hunt Ghosts – Joshua P. Warren, 2003
This volume was produced by an affiliate of Simon and Schuster publishing, so the basic elements of a book – grammar, punctuation, spelling and formatting – are superior to small or self-published efforts. But I can’t say we get better quality in the content. The same unsupported model, built on speculative paranormal assumptions, is applied.
The first words “Ghosts are real” show us this is not about investigation but about finding proof to support a preexisting conclusion. These opening words oddly contrast with the last words of the book, “Never pretend to know all the answers. All the answers are not known”. In between, we get a mish-mash of silly claims and scientific misrepresentation. Warren’s resumé does not include science. He writes fiction and worked in film making. Like many who appear on TV shows as talking heads, he touts these appearances to bolster his credibility. It works for those who get their facts from TV, I imagine.
Warren wins the prize for the most sciencey namedropping in a ghost hunting guide – Descartes, Newton, Einstein, Sagan – none of whom had anything positive to say about spirits. Non-scientist Warren says “Let me tell you what static electricity is…”. No, thanks. I’d rather get my science information from someplace OTHER THAN in a book about entities that have not been demonstrated to exist. If we are to take these ghost hunters seriously, they should explain why physicists aren’t writing books about the paranormal but non-scientists are.
Here are some illustrations of the ideas presented:
- Spiritual manifestations are hidden from us. Our technology is not good enough. There is scientific evidence that ghostly manifestations are real, he says. Warren provides no hint of why physicists can detect subatomic particles and the tiniest releases of energy but our technology is not adequate to identify ghosts. What scientific evidence is he talking about? No journal articles, the standard venue for scientific evidence, are cited or mentioned.
- Mainstream science is bad because they need to limit their work to activity of a certain category. “Most scientists are busy enough researching the activity they already know about.” This reveals a core ignorance of how knowledge can progress and is a self-evidently dumb claim. From the early days of the scientific endeavor, knowledge became specialized by necessity. To say science is flawed because of this is like saying medicine is bad because too many doctors specialize in distinct areas of health or surgery. Specialization is advantageous for advancing deep knowledge. Astronomers aren’t collecting and evaluating the same data as biologists or sociologists.
- If a person dies young, especially violently, “it is likley that a ghost will remain”.
- Ghosts wrap themselves in ions in order to interact physically. If this is correct, he adds, we can use this to predict and manipulate the phenomena. There is a kernel of science in there, but that ghosts exist, utilize ions, and interact physically are all grand assumptions.
- “Virtually any location can prove to be haunted.” You should experiment to decide if the Ouija board, automatic writing, pendulums, etc. work for you.
- Warps are areas where the laws of physics seem to be distorted. These may create natural portals. “Warps exemplify the most complicated issues facing science today”. They can be filled with “hundreds or thousands” of entities. The example of a warp is given as the Bermuda Triangle, a myth that was exploded decades ago as sensationalized fiction. Take note that Warren runs a “Bermuda Triangle Research” site in Puerto Rico.
- There is a “correlation between ghost manifestations and standing (acoustical) waves” – it may make the ghost appear. This is in contrast to the well-known research of Vic Tandy who demonstrated that an inadvertently created standing wave was responsible for behavior of materials (metal fencing foil) and possibly the fluid in our eyeballs that could lead to ghost-like reports. Unless I’m missing something (there are no citations to check), Warren has this concept COMPLETELY backwards.
We’re way out on the fringe here. Such incredible claims should have equally incredible documentation provided. Nope. Nothing. It’s practically lying.
Warren knows some science basics, that’s clear, but like many other ghost researchers, he applies them wildly incorrectly. There is an overuse of the term energy without a reasonable definition provided. Warren claims that there is energy of attraction, energy that comes out of our eyes when we look at someone. He says we have auras around us. Dowsing rods that you can make yourself can detect energy fields. His research group (of which he is founder and president) is called the League of Energy Materialization and Unexplained Phenomenon Research (LEMUR). I first heard of Warren through his investigation of the ghost light phenomenon. He also thinks this is energy produced by the earth. On the whole, this is one of his less outrageous ideas, since such lights are actually documented in several places around the world, but the methods of amateur research are unlikely to produce any results of value. The answer to what causes ghost lights is certainly complex and multivariate.
Warren refers to many fictional movies for examples – he is, after all, a novelist. I question at what level ghost hunters can distinguish scientific facts from PURE fictional license. Their lack of attention to very normal, reasonable explanations, and their habit of offering foundationless claims that might as well be fiction instead, dooms them to failure in any effort to advance worthwhile conclusions about ghost experiences. It also leaves them wide open targets for derision by scientists working in legitimate research endeavors. Warren exhibits paranormal pretentiousness. Since he’s moved into the realm of hawking “wishing machines” and lucky charms, he’s lost all credibility. Scientific? Credible? Not in any sense of the words.
Additional Samples
To try to be as thorough as possible, I accessed a sample of several of the dozens of e-books available in the Amazon lenders library. I tried to pick those that ranked high in the search. I did not preview them beforehand so this is nearly a “random” selection off the shelf.
Unsurprisingly, these also fit into the same template and had similar characteristics:
- “Just so” facts and stories
- No references
- Lack of proofing or editing including several typographical errors and incorrect punctuation
- Poor layout and design
- Unsophisticated, overly casual writing style
- Superficial content
I included screen shots of various selections that I highlighted in these books to show I’m not making this stuff up – this is what people really wrote and marketed for sale.
Ultimate Ghost Hunting Guide – Jeff Terrozas, 2011
Subtitled “Everything you need to know for paranormal research”, the content is overly rambling and amateurish. Typos abound, the layout is annoyingly sloppy. The premise is that ghost hunting is “fun”, so have fun. It’s not to be taken seriously unless you want to make money. In that case, you should act “professional”. This book should not be taken seriously.
Ghost Seekers Field Guide, Volume 1 – Frank Potterstone, 2011
Apparently no proofreading or editing was done on this manuscript. The language and grammar are poor, typos are abundant and the layout is simply ugly. There is an overuse of ellipses and of random unattributed quotes. Though the author means well, these factors, together with the lack of adherence to punctuation conventions and the unfocused content, make this book unreadable. Yes, there was a Volume 2 as well.
Ultimate Ghost Hunter Field Guide – Brandy Burgess, n.d.
Layout is very poor with line breaks in the middle of a sentence and random capitalization of words. Grammar is poor and the writing is amateurish and unfocused. The author lays out “facts” such as a description of “psychic burns” and “awakenings” without any support for such supernatural claims. She says you will know a spirit is demonic because of the sulfur or rotten flesh smell as well as the growling sounds. They also appear in half-human, half-animal form. These sound like verifiable claims; one wonders why we can’t prove such incredible new findings if they are so obvious.
Ghost Hunting 101: The Ultimate Resource for Beginner and Experienced Ghost Hunters – Ghostly World, 2015
Ghostly World is a website “dedicated to all things haunted”. The authors say on their site that they are not an investigation team or even “in the paranormal field”. Yet, here they are publishing and charging for an instruction book on ghost hunting. How’s that for zero credibility?
The layout of this book is good and the writing style is generally appropriate to a serious handbook. There are some typos. The content is shallow and lacks development and explanations. Terms and labels are assigned subjectively. For example, readers are told there are three kinds of ghost hunters: a hobbyist, a serious researcher and a home investigator. A random graph is included (because graphs look sciencey) without any source data to show 100% are hobbyists, 50% are serious researchers and only 10% are home investigators. Going into a client’s home is serious stuff where the ghost hunter needs to provide comfort and assistance to the residents while studying spirits. The unnamed author(s) suggest the ghost hunter may need to act in the capacity of a “therapist” – a highly unethical suggestion. Meanwhile, the reader is warned that Ouija boards and other occult dealings will bring about dangerous evil spirits. They seem to think Grant Wilson and Jason Hawes invented ghost hunting.
Some of these books are surprisingly candid, as I found with How to Legally Gain Access to Haunted Locations: A Guide for Paranormal Investigators (n.d.) by Casper Waylin. Waylin makes no apologies for playing pretend and weaseling your way into clients’ homes. He recommends following what you see on TV shows:
Professionalism starts as “pretending” but evolves into something that’s real. If you’re just getting started as a ghost hunting group, you’ll need to pretend that you’re a “professional” and put on a convincing act for the people you talk to in order to gain entry into a particular location. Put together a good costume (some nice clothes) and props (legal documents and contracts) and then tell clients and gatekeepers exactly what you plan to do from beginning to end. In terms of how you greet and speak to new clients, it can help to model other group leaders you’ve seen on TV or read about in books and for crying out loud, make sure that you have a firm handshake and you look them in the eye during your initial contact!
and
Acting professional is okay if you’re not really a professional. Find a character in a movie or watch some of the later episodes of TAPS [Ghost Hunters] or Ghost Adventures and emulate the paranormal investigators that you can relate to best.
So, copy the guys on TV when you enter other people’s houses. This is awful, awful stuff.
Finally, I would like to mention a specialty guide called The Other Side: A Teen’s Guide to Ghost Hunting and the Paranormal (2009) by Gibson, Burns, and Schrader. This might be considered one of the least bad books since it was done by a reputable publisher and contains a handful of good advice. There are two overarching and egregious problems with this book: 1. Misinformation directed at teens to take on this topic and “educate the masses” about “what our place is in the universe and what possibilities there are of an afterlife”; and 2. The ignorant and condescending attitude towards science as hard and cumbersome, and skepticism as cynical bullying (p. 67). The logical fallacies and unsupported claims rampant in this book would make it excellent to use as an example for a critical thinking exercise.
Most, perhaps all, of these authors wrote these books because they believed it would be helpful to an audience or to their investigation group as a way to codify what they deemed to be important knowledge and procedures that everyone was expected to follow. With the advent of easy self-publishing, we’ve seen a proliferation of low-quality, previously unpublishable books like never before. Anyone, even someone who never wrote an article or term paper, can publish a book, sell it, and claim to be an author. There is no excuse for publishing a book without having it edited for basic grammar, spelling, and punctuation. If I had a nickel for all the times I read the phrases “First of all”, “First and foremost”, “Suffice (it) to say”, and “Let me be clear” in these books, I would take my few bucks and go buy a drink. There is no justification for the amount of self-serving, misguided misinformation out there that promises the reader that “this book” is the (ultimate) thing you need to set yourself up as a genuine, credible, and successful ghost hunter.
My recommendation: Don’t bother with any of them.
Look up books done by professional science writers or work done by actual parapsychologists to learn the literature of the field before you write a book and say you know what you are talking about.
I’ll end with some suggestions for those who plan to write future guides to the paranormal, if there has to be any…
There are two books you must research. BUY Scientific Paranormal Investigation by Benjamin Radford (2010). If you do any paranormal investigation, this should be your only guide for now.
Secondly, refer to Parapsychology: A Handbook for the 21st Century by Cardeña et al., eds. (2015). You can borrow this from a university library or browse it online. While I have disagreements with content in this volume, it is an example of a credible way to construct a sophisticated and useful handbook that will be relevant for decades. It will also give the ghost hunter hobbyists an eye-opener on the insane amount of parapsychological research that has been done by far more qualified people of various disciplines. Written at a college reading level, it is not in the same class as the books cited above, and it makes all amateur guides look extremely unsophisticated. But if you are going to claim to be doing groundbreaking important research that will enhance our future knowledge about spirits and hauntings, you REALLY need to up your game. Considerably. I call for no more ghost guidebooks.
References:
Hill, Sharon (2010) Being Scientifical: Popularity, Purpose and Promotion of Amateur Research and Investigation Groups in the U.S. A thesis submitted to the Faculty of the Graduate School of the University at Buffalo, State University of New York in partial fulfillment of requirements for Degree of Master of Education EdM
Hill, Sharon (2013) Sounds Sciencey Presentation at NECSS https://www.youtube.com/watch?v=9CmgweT0eE0
#ghostHunters #ghostHuntingGuide #paranormalInvestigation #paranormalInvestigators
-
A Guide to Ghost Hunting Guidebooks: NO MORE! Please!
This might come as a shock to the millions of ghost enthusiasts out there: The scientific consensus is that ghosts are NOT spirits, remnants of the dead, recordings of energy, or supernatural entities. Our existing knowledge about nature does not point to a conclusion that ghosts are a single definable thing, paranormal or normal, that you can find, observe, measure, or study. Yet, there are about 200 guides to “ghost hunting” in print or e-book form that lay out ways to obtain evidence of or make contact with ghosts. Therefore, we have a conundrum at step one of any attempt at ghost hunting – we can’t define what a ghost is, and we do not know its properties because we’ve never determined that they exist and measured them. No ghost handbook has ever led anyone to catch and identify ghosts; they can only lead you to interpret something as a ghost.
In that sense, all ghost hunting books are worthless. So why bother with them?
First, it’s an interesting cultural phenomenon. Actively investigating reports of ghosts and paranormal activity is mainstream and a popular hobby and tourism draw. In 2010, there were over 1000 paranormal investigation groups in the US, the majority of which researched hauntings (Hill, 2010). It’s not worthless to examine why people spend their time and money on this hobby and how they go about doing it.
Second, the idea of paranormal investigation contains important aspects of society’s attitudes towards finding out about the world, deciding what is meaningful and true, using science to examine questions, cooperation and trust in a community, and taking part in a larger effort beyond one’s own small role in life.
I’m deeply interested in the second point. I’ve found that examining amateur paranormal group behaviors and output highlights concepts about science education and public discourse about belief and reality. This piece mentions 11 books on ghost hunting that I have examined. They have broad similarities and distinct differences. In the main portion, I review 4 books on the basis of the following:
- Readability (language, errors, quality of writing)
- Credibility (sources, supported arguments vs speculation, factual correctness)
- Overall value as a cultural product (Buy it or not?)
I picked these particular books for several reasons. They span a significant spectrum in time over which we can watch the evolution of ghost hunting technique. I think they are generally representative of this narrow niche. There are better and worse ones, I’m sure. In searching for a selection, I realized I could not POSSIBLY read them all, nor would I want to spend money on them. Many appear to be self-published since several ghost investigation group leaders feel the need to have their own personal volume to use.
Please note that when I mention today’s “modern” ghost hunters I am referring to those who have watched Ghost Hunters, Ghost Adventures, Paranormal State and other television shows of this genre. It’s well-established (Hill, 2010) that today’s popular hobby grew from fans of these shows who copied what they saw on TV as their preferred method.
Ghost Hunting: A Practical Guide (UK) – Andrew Green, 1973
Andrew Green was called “the Spectre Inspector” and was a well-educated pursuer of ghosts for sixty years. He felt that there was such an interest in the subject of ghosts that there was a need for a small, non-technical guide for the amateur. This is the “first-ever do-it-yourself guide for the psychic researcher”. Green eschews fanaticism and suggests that those interested in the ghost phenomenon study parapsychology, thus reflecting the thinking at that time that academic parapsychology would unlock the mystery of life after death. Therefore, a good portion of the book describes parapsychological concepts, such as telepathy, which he states can be an important consideration as to the cause of a phenomenon. He describes Zener card experiments, which would later appear as what ghost researchers study in Ghostbusters (1984). This portion of the book will be rather strange to those weaned on 21st century ghost TV shows (if they manage to find and read this book AT ALL).
Green was certain that psychic powers would soon be recognized (and respected) by science, the church, and society. He remarked that the existence of ghosts can hardly be challenged in the face of all the cases that have been reported – a common justification for investigators to do their thing. As with many paranormal investigators, Green considered serious ghost hunting important and “groundbreaking” work, and the researchers mavericks.
Contrasting Green’s book with modern ghost guides, we can see some striking differences:
- Crisis apparitions were described as “thought pictures”. These types of events were more commonly reported then (as were poltergeists). Both were seen to be manifestations of psychical powers. Today’s ghost hunters are rarely fluent in these historical parapsychological terms.
- EVPs were called Raudive voices and are not emphasized as evidence. Green thought there were too many potential pitfalls to use them this way.
- The technology was primitive compared with what we have today. Equipment included very basic detective-type materials: level, compass, strain-gage, sand or sugar, powder for fingerprints, thread, maybe a camera. But the idea of measuring environmental variables was already being pursued by the Society for Psychical Research.
- Green mentions exorcism but it was clearly not as common as today and people were less bold about it. Today, the concept pervades pop culture and it is treated as a stunt or a ritual that you can train yourself to do. It’s taken less seriously.
- Green’s advice is that the investigator must be thorough and careful in research and provide a sophisticated investigation. He recommends studying the geology, geography, and past owners. I get the impression that Green’s investigations were not the weekend overnighters of today’s ghost hunters. They were long-term investments in time and effort. The resulting report was to be of print quality!
- The investigator should NEVER get involved in publicity for the case, Green advises. He recognized that some people are in it just for the attention and this was not a proper impetus to do this work. Well, maybe that hasn’t changed. But to restrict all publicity is not what today’s investigators would agree to.
Green judges the client in terms of credentials. Note this curious “test”:
“The production of a caseful of apparatus at the commencement of an investigation in itself constitutes a test, for the witness of a genuine phenomena will be, or should be, impressed with the serious nature of ghost hunting, while the fraudulent will be worried by the prospect of being exposed.”
That’s quaint. Times have changed.
Green states “I believe” this is the process and how it works but, as with all other ghost hunting guides reviewed here, no support is given to these suppositions. For example: Heat extracted from the environment will energize a haunting. Such ideas about ghost manifestations are very old but have yet to be supported or well-argued.
In summary, Green subscribes to ghosts as real, but this guide provides a number of pieces of sound advice and many examples of normal causes that you will not find in any recent book. He is NOT as careless and overtly credulous as modern ghost hunters. Even though he makes some howlers, he knew his history. This book is well-written and properly edited; the language is written at a higher reading level than most. Some sources are cited in the text but not enough.
How to be a Ghost Hunter – Richard Southall, 2003
Judging from the front matter, this book appears to have been written in 2001. That was at the start of the massive proliferation of ghost hunting groups in the US. Southall is located in Parkersburg, West Virginia, so examples from around that area are included. He calls it a “unique handbook” and it possibly was at the time. It is not now.
The book is of the “Confessions of a Ghost Hunter” type: ghosts are defined, historical aspects are mentioned, prior cases related, procedures and equipment are suggested, collection of data and evidence is described, and advice on forming a team is offered. Southall states he has a degree in journalism and psychology; the book also has a genuine publisher (of New Age books), which brings the quality and readability of this guide above most others. However, it follows the typical outline of information and includes many unsupported claims, assumptions and statements of “fact”.
Here are some examples:
- He assumes that ghosts exist, paranormal activity is ghost activity, and these certain descriptions are characteristics of ghosts. How he “knows” this is never explained. No sources are supplied.
- Various unsourced, un-detailed anecdotes are included. The reader is asked to accept these “just so” without proper justification.
- Undefined, sciencey-sounding terms are used throughout: “highest amount of paranormal energy”, “life force”, “psychic energy”.
- If you investigate enough, you will encounter a “demonic entity”. The Ouija board can invite it in so that device is dangerous to use. “The entity will concentrate on the one with the lowest psyche”.
- You can “recharge” a haunting with an object.
- “It is common knowledge in parapsychology and metaphysics” that everything has a life force or aura.
- Orbs are indications that an area contains a great deal of psychic energy. They concentrate around a person emanating psychic energy.
Why did Southall do a ghost hunting guide? To promote the topic. He was running a ghost tour at the time. He states his role shifted from investigation to teaching. This book fails to supply us with any sense of the author’s scientific credibility. He refers to fictional movies, such as The Sixth Sense, to suggest the real world is really like this. Southall states that the scientific method is the means to get “tangible, measurable evidence” as opposed to psychic impressions and divination, though the two methods can validate each other. He is not a scientist and it shows.
This book also shows its age. The equipment portion is written for someone who has never owned a camera. It is dull, overly simplistic and sorely out of date with regard to use of digital equipment. He states this howler: “A photograph of a ghost cannot be denied.” This wasn’t even rational advice at the TIME, let alone in the age of phone apps.
He states a good investigator should be unbiased but the language from start to finish is completely biased in the belief that an area is likely haunted. Short shrift is given to examination of mundane causes. But he advises talking up your own credibility: “Clients love credentials and memberships”. The bibliography contains no journals or scientific sources, just references to other ghost hunters’ books and mass marketed paranormal pablum.
Southall’s writing projects the attitude of a good person who is concerned with people who are having a paranormal problem and want answers that he believes he can provide. He understands that people need reassurance that what they experience is understandable and things will be OK. Unfortunately, it’s not that simple and misinformation like this makes it worse.
Ultimate Ghost Tech – Vince Wilson, 2012
This book was also published with more or less the same content as another one of Wilson’s books, “Ultimate Ghost Hunter”. Wilson informed me that he did not care for the term “Ghost Hunter” and has recently pulled that book from publication. Different title or not, the book follows the typical ghost hunter guidebook format. In one of the forewords (one is spelled “foreword”, the other “forword”), Vince is described as the “foremost expert in the technological aspects of paranormal investigation.”
In the other foreword, a rather well-respected parapsychologist reveals the blatant truth about ghost hunting technology: “Let’s face it: ghost hunters love their tech – even if they don’t know how to use it or to assess the data from it in light of the reported phenomena”. Indeed. I agree with that.
The rest of this book is an example of sounding sciencey but falling short of representing anything like scientific investigation. Wilson focuses on technology, of course. An earlier book, Ghost Science – which I saw as a must-read since I am deeply interested in ghosts + science – was atrocious. It was sloppy, formatted terribly, and at the very least, desperately needed an editor who could spell and eliminate awful turns of phrase. That book begins with the premise “One of the main purposes of this book is to show that, not only do ghosts exist but also that the laws that govern reality allow them”. Neither that book, nor this one will demonstrate that stated purpose to anyone who understands how science actually works. Wilson’s array of books (3) are essentially self-published. But according to Wilson, he has progressed past that first book, yet he still stands by the work he did in this one. I cringed at many aspects of UGT and how readers will be misinformed by much of its content.
Examples:
- He states “random energy particles may hold the essence of consciousness…” There is no basis for such speculation. Shall we talk homeopathy?
- “Ghosts will be proven to exist one day and so will psychics…” What is the basis of this claim? What will that effort entail? Why after 100 years of trying by actual professionals will things change now with amateur researchers?
- He uses several phrases that are painful to read, such as “just another theory” (where “theory” is used to mean “a guess” instead of the scientific meaning of an evidence-supported overarching model of explanation), “science is absolute” (What does that even mean?), “sorry about the math” (If you have to apologize for the language of science, you should NOT be reading or writing such a book) and “blah blah blah” (I can hardly think of ANY excuse to write that).
- He refers to “stuffy scientists” and takes a disparaging tone towards skeptics. In Ghost Science, he called skepticism a quasi-religion.
Several statements rankle me as revealing a disturbingly superficial and inflated attitude of ghost hunting hobbyists. He says Ghostbusters (the movie) changed paranormal research with its lingo and gadgets, “Paranormal research just became really cool overnight.” He suggests science as a way to pump up your credibility – not real science, but faking it – saying you should answer questions from people with sciencey words to sound “professional and cool” and a little “nerdy”. People are too embarrassed to ask what you mean.
Not me. I ask. And science-pretenders skirt the uncomfortable questions.
Wilson relates all the ubiquitous (and wrong) assumptions about ghosts starting with the belief that they exist (thus scuttling any unbiased investigation of what might really be happening to people). The paradigm of today’s ghost investigation is reflected: changes in the environment can be related to ghost behavior and hauntings; technology can provide objective evidence, more and different data, than just human experience. For example, he suggests that a cold spot could be created (through an explanation of energy transfer) from an entity moving through dimensions. This type of rhetoric (apparent in nearly all ghost hunting guides) gives hope but very flimsy justification to other ghost hunters that they will discover something scientifically incredible:
“You can be an amateur parapsychologist and usher in a new era of paranormal research. Wow! That’s pretty deep for me!” (p 160)
Cringe-worthy and specious.
Wilson, like many of these guide writers, seems well-meaning, willing to learn new things and expand his horizons, and fairly literate in science ideas – just enough to sound knowledgeable to people who aren’t scientists, which is most of the population. He is not a scientist but a science enthusiast. It’s a widespread trend for ghost hunters to quote scientific buzzwords and namedrop famous scientists. They attempt to apply very complex physics concepts and theories, such as quantum mechanics and Einstein’s “spooky action at a distance”, to inappropriate situations. There are no scientific sources cited or referenced and explained. There are basically NO sources for the various claims or even the quotes. The recommended reading list contains references that repeat these unverified speculative claims and includes pop science sources like The Handy Science Answer Book. This is just not acceptable if you claim to be doing science.
Wilson understands that TV ghost hunters are playing a role and that many paranormal investigators are “fooled by an intense need to believe”. Hoaxes are rampant. So, there is a kernel of truth in much of what he writes. However, that is trumped by his own faith that equipment CAN detect anomalous energy of some sort. The processes he suggests leave out critical considerations about confounding factors and alternative explanations. Wilson has lectured as a ghost tech expert in the past. He suggests giving workshops to teach people about this topic is a good way to fundraise for your group. I find this playing pretend professor/scientist to be profoundly distasteful.
I accept, as an unfortunate consequence, that Vince will be unhappy with my take on his publications. But if you make extraordinary claims that are so off the mark, unjustified, and liable to misinform society, you open yourself to such harsh criticism. I will call you on bullshit and hope you will consider ceasing its propagation.
How to Hunt Ghosts – Joshua P. Warren, 2003
This volume was produced by an affiliate of Simon and Schuster publishing so the basic elements of a book – grammar, punctuation, spelling and formatting – are superior to small or self-published efforts. But I can’t say we get better quality in the content. The same unsupported model, built on speculative paranormal assumptions, is applied.
The first words “Ghosts are real” show us this is not about investigation but about finding proof to support a preexisting conclusion. These opening words oddly contrast with the last words of the book, “Never pretend to know all the answers. All the answers are not known”. In between, we get a mish-mash of silly claims and scientific misrepresentation. Warren’s resumé does not include science. He writes fiction and worked in film making. Like many who appear on TV shows as talking heads, he touts these appearances to bolster his credibility. It works for those who get their facts from TV, I imagine.
Warren wins the prize for the most sciencey namedropping in a ghost hunting guide – Descartes, Newton, Einstein, Sagan – none of whom had anything positive to say about spirits. Non-scientist Warren says “Let me tell you what static electricity is…”. No, thanks. I’d rather get my science information from someplace OTHER THAN in a book about entities that have not been demonstrated to exist. If we are to take these ghost hunters seriously, they should explain why physicists aren’t writing books about the paranormal but non-scientists are.
Here are some illustrations of the ideas presented:
- Spiritual manifestations are hidden from us. Our technology is not good enough. There is scientific evidence that ghostly manifestations are real, he says. Warren provides no hint of why physicists can detect subatomic particles and the tiniest releases of energy but our technology is not adequate to identify ghosts. What scientific evidence is he talking about? No journal articles are cited or mentioned, as is standard scientific protocol.
- Mainstream science is bad because they need to limit their work to activity of a certain category. “Most scientists are busy enough researching the activity they already know about.” This reveals a core ignorance of how knowledge can progress and is a self-evidently dumb claim. From the early days of the scientific endeavor, knowledge became specialized by necessity. To say science is flawed because of this is like saying medicine is bad because too many doctors specialize in distinct areas of health or surgery. Specialization is advantageous for advancing deep knowledge. Astronomers aren’t collecting and evaluating the same data as biologists or sociologists.
- If a person dies young, especially violently, “it is likely that a ghost will remain”.
- Ghosts wrap themselves in ions in order to interact physically. If this is correct, he adds, we can use this to predict and manipulate the phenomena. There is a kernel of science in there but the claims that ghosts exist, utilize ions, and interact physically are all grand assumptions.
- “Virtually any location can prove to be haunted.” You should experiment to decide if the Ouija board, automatic writing, pendulums, etc. work for you.
- Warps are areas where the laws of physics seem to be distorted. These may create natural portals. “Warps exemplify the most complicated issues facing science today”. They can be filled with “hundreds or thousands” of entities. The example of a warp is given as the Bermuda Triangle, a myth that was exploded decades ago as sensationalized fiction. Take note that Warren runs a “Bermuda Triangle Research” site in Puerto Rico.
- There is a “correlation between ghost manifestations and standing (acoustical) waves” – it may make the ghost appear. This is in contrast to the well-known research of Vic Tandy who demonstrated that an inadvertently created standing wave was responsible for behavior of materials (metal fencing foil) and possibly the fluid in our eyeballs that could lead to ghost-like reports. Unless I’m missing something (there are no citations to check), Warren has this concept COMPLETELY backwards.
We’re way out on the fringe here. Such incredible claims should have equally incredible documentation provided. Nope. Nothing. It’s practically lying.
Warren knows some science basics, that’s clear, but like many other ghost researchers, he applies them wildly incorrectly. There is an overuse of the term energy without a reasonable definition provided. Warren claims that there is energy of attraction, energy that comes out of our eyes when we look at someone. He says we have auras around us. Dowsing rods that you can make yourself can detect energy fields. His research group (of which he is founder and president) is called the League of Energy Materialization and Unexplained Phenomenon Research (LEMUR). I first heard of Warren through his investigation of the ghost light phenomenon. He also thinks this is energy produced by the earth. On the whole, this is one of his less outrageous ideas, since such lights are actually documented in several places around the world, but the methods of amateur research are unlikely to produce any results of value. The answer to what causes ghost lights is certainly complex and multivariate.
Warren refers to many fictional movies for examples – he is, after all, a novelist. I question at what level ghost hunters can distinguish scientific facts from PURE fictional license. And their lack of attention to very normal, reasonable explanations, providing foundationless claims instead that might as well be fiction, dooms them to failure in any effort to advance worthwhile conclusions about ghost experiences. It also leaves them wide-open targets for derision by scientists working in legitimate research endeavors. Warren exhibits paranormal pretentiousness. Since he’s moved into the realm of hawking “wishing machines” and lucky charms, he’s lost all credibility. Scientific? Credible? Not in any sense of the words.
Additional Samples
To try to be as thorough as possible, I accessed a sample of several of the dozens of e-books available in the Amazon lenders library. I tried to pick those that ranked high in the search. I did not preview them beforehand so this is nearly a “random” selection off the shelf.
Unsurprisingly, these also fit into the same template and had similar characteristics:
- “Just so” facts and stories
- No references
- Lack of proofing or editing including several typographical errors and incorrect punctuation
- Poor layout and design
- Unsophisticated, overly casual writing style
- Superficial content
I included screen shots of various selections that I highlighted in these books to show I’m not making this stuff up – this is what people really wrote and marketed for sale.
Ultimate Ghost Hunting Guide – Jeff Terrozas, 2011
Subtitled “Everything you need to know for paranormal research”, the content is overly rambling and amateurish. Typos abound, the layout is annoyingly sloppy. The premise is that ghost hunting is “fun”, so have fun. It’s not to be taken seriously unless you want to make money. In that case, you should act “professional”. This book should not be taken seriously.
Ghost Seekers Field Guide, Volume 1 – Frank Potterstone, 2011
No proofreading or editing was apparently done to this manuscript. The language and grammar are poor, typos are abundant and the layout is simply ugly. There is an overuse of ellipses, and random unattributed quotes. Though the author means well, with these factors, the lack of adherence to punctuation conventions, and the unfocused content, this book is unreadable. Yes, there was a Volume 2 as well.
Ultimate Ghost Hunter Field Guide – Brandy Burgess, n.d.
Layout is very poor with line breaks in the middle of a sentence and random capitalization of words. Grammar is poor and the writing is amateurish and unfocused. The author lays out “facts” such as a description of “psychic burns” and “awakenings” without any support for such supernatural claims. She says you will know a spirit is demonic because of the sulfur or rotten flesh smell as well as the growling sounds. They also appear in half-human, half-animal form. These sound like verifiable claims; one wonders why we can’t prove such incredible new findings if they are so obvious.
* * * *
Ghost Hunting 101: The Ultimate Resource for Beginner and Experienced Ghost Hunters – Ghostly World, 2015
Ghostly World is a website “dedicated to all things haunted”. The authors say on their site that they are not an investigation team or even “in the paranormal field”. Yet, here they are publishing and charging for an instruction book on ghost hunting. How’s that for zero credibility?
The layout of this book is good and the writing style is generally appropriate to a serious handbook. There are some typos. The content is shallow and lacks development and explanations. Terms and labels are assigned subjectively. For example, readers are told there are three kinds of ghost hunters: a hobbyist, a serious researcher and a home investigator. A random graph is included (because graphs look sciencey) without any source data to show 100% are hobbyists, 50% are serious researchers and only 10% are home investigators. Going into a client’s home is serious stuff where the ghost hunter needs to provide comfort and assistance to the residents while studying spirits. The unnamed author(s) suggest the ghost hunter may need to act in the capacity of a “therapist” – a highly unethical suggestion. Meanwhile, the reader is warned that Ouija boards and other occult dealings will bring about dangerous evil spirits. They seem to think Grant Wilson and Jason Hawes invented ghost hunting.
Some of these books are surprisingly candid, as I found with How to Legally Gain Access to Haunted Locations: A Guide for Paranormal Investigators (n.d.) by Casper Waylin. Waylin makes no apologies for playing pretend and weaseling your way into clients’ homes. He recommends following what you see on TV shows:
Professionalism starts as “pretending” but evolves into something that’s real. If you’re just getting started as a ghost hunting group, you’ll need to pretend that you’re a “professional” and put on a convincing act for the people you talk to in order to gain entry into a particular location. Put together a good costume (some nice clothes) and props (legal documents and contracts) and then tell clients and gatekeepers exactly what you plan to do from beginning to end. In terms of how you greet and speak to new clients, it can help to model other group leaders you’ve seen on TV or read about in books and for crying out loud, make sure that you have a firm handshake and you look them in the eye during your initial contact!
and
Acting professional is okay if you’re not really a professional. Find a character in a movie or watch some of the later episodes of TAPS [Ghost Hunters] or Ghost Adventures and emulate the paranormal investigators that you can relate to best.
So, copy the guys on TV when you enter other people’s houses. This is awful, awful stuff.
Finally, I would like to mention a specialty guide called The Other Side: A Teen’s guide to Ghost Hunting and the Paranormal (2009) by Gibson, Burns, and Schrader. This might be considered one of the least worst books since it was done by a reputable publisher and contains a handful of good advice. There are two overarching and egregious problems with this book. 1. Misinformation directed at teens to take on this topic and “educate the masses” about “what our place is in the universe and what possibilities there are of an afterlife”; and 2. The ignorant and condescending attitude towards science as hard and cumbersome, and skepticism as cynical bullying (p. 67). The logical fallacies and unsupported claims rampant in this book would make it excellent to use as an example for a critical thinking exercise.
Most, perhaps all, of these authors wrote these books because they believed it would be helpful to an audience or to their investigation group as a way to codify what they deemed to be important knowledge and procedures that everyone was expected to follow. With the advent of easy self-publishing, we’ve seen a proliferation of low-quality, previously unpublishable books like never before. Anyone, even someone who never wrote an article or term paper, can publish a book, sell it, and claim to be an author. There is no excuse for publishing a book without having it edited for basic grammar, spelling, and punctuation. If I had a nickel for all the times I read the phrases “First of all”, “First and foremost”, “Suffice (it) to say”, and “Let me be clear” in these books, I would take my few bucks and go buy a drink. There is no justification for the amount of self-serving, misguided misinformation out there that promises the reader that “this book” is the (ultimate) thing you need to set yourself up as a genuine, credible, and successful ghost hunter.
My recommendation: Don’t bother with any of them.
Look up books done by professional science writers or work done by actual parapsychologists to learn the literature of the field before you write a book and say you know what you are talking about.
I’ll end with some suggestions for those who plan to write future guides to the paranormal, if there has to be any…
There are two books you must research. BUY Scientific Paranormal Investigation by Benjamin Radford (2010). If you do any paranormal investigation, this should be your only guide for now.
Secondly, refer to Parapsychology, A Handbook for the 21st Century by Cardena et al., eds. (2015). You can borrow this from a university library or browse it online. While I have disagreements with content in this volume, it is an example of a credible way to construct a sophisticated and useful handbook that will be relevant for decades. It will also give the ghost hunter hobbyists an eye-opener on the insane amount of parapsychological research that has been done by far more qualified people of various disciplines. Written at a college reading level, it is not in the same class as the books cited above, and it makes all amateur guides look extremely unsophisticated. But if you are going to claim to be doing groundbreaking important research that will enhance our future knowledge about spirits and hauntings, you REALLY need to up your game. Considerably. I call for no more ghost guidebooks.
References:
Hill, Sharon (2010) Being Scientifical: Popularity, Purpose and Promotion of Amateur Research and Investigation Groups in the U.S. A thesis submitted to the Faculty of the Graduate School of the University at Buffalo, State University of New York in partial fulfillment of requirements for Degree of Master of Education EdM [PDF]
Hill, Sharon (2013) Sounds Sciencey Presentation at NECSS https://www.youtube.com/watch?v=9CmgweT0eE0
#ghostHunters #ghostHuntingGuide #paranormalInvestigation #paranormalInvestigators
-
A Guide to Ghost Hunting Guidebooks: NO MORE! Please!
This might come as a shock to the millions of ghost enthusiasts out there: The scientific consensus is that ghosts are NOT spirits, remnants of the dead, recordings of energy, or supernatural entities. Our existing knowledge about nature does not point to a conclusion that ghosts are a single definable thing, paranormal or normal, that you can find, observe, measure, or study. Yet, there are about 200 guides to “ghost hunting” in print or e-book form that lay out ways to obtain evidence of or make contact with ghosts. Therefore, we have a conundrum at step one of any attempt at ghost hunting – we can’t define what a ghost is, and we do not know its properties because we’ve never determined that they exist and measured them. No ghost handbook has ever led anyone to catch and identify ghosts; they can only lead you to interpret something as a ghost.
In that sense, all ghost hunting books are worthless. So why bother with them?
First, it’s an interesting cultural phenomenon. Actively investigating reports of ghosts and paranormal activity is mainstream and a popular hobby and tourism draw. In 2010, there were over 1000 paranormal investigation groups in the US, the majority of which researched hauntings. (Hill, 2010) It’s not worthless to examine why people spend their time and money on this hobby and how they go about doing it.
Second, the idea of paranormal investigation contains important aspects of society’s attitudes towards finding out about the world, deciding what is meaningful and true, using science to examine questions, cooperation and trust in a community, and taking part in a larger effort beyond one’s own small role in life.
I’m deeply interested in the second point. I’ve found that examining amateur paranormal group behaviors and output highlights concepts about science education and public discourse about belief and reality. This piece mentions 11 books on ghost hunting that I have examined. They have broad similarities and distinct differences. In the main portion, I review 4 books on the basis of the following:
- Readability (language, errors, quality of writing)
- Credibility (sources, supported arguments vs speculation, factual correctness)
- Overall value as a cultural product (Buy it or not?)
I picked these particular books for several reasons. They span a significant spectrum in time over which we can watch the evolution of ghost hunting technique. I think they are generally representative of this narrow niche. There are better and worse ones, I’m sure. In searching for a selection, I realized I could not POSSIBLY read them all, nor would I want to spend money on them. Many appear to be self-published since several ghost investigation group leaders feel the need to have their own personal volume to use.
Please note that when I mention today’s “modern” ghost hunters I am referring to those who have watched Ghost Hunters, Ghost Adventures, Paranormal State and other television shows of this genre. It’s well-established (Hill, 2010) that today’s popular hobby grew from fans of these shows who copied what they saw on TV as their preferred method.
Ghost Hunting: A Practical Guide (UK) – Andrew Green, 1973
Andrew Green was called “the Spectre Inspector” and was a well-educated pursuer of ghosts for sixty years. He felt that there was such an interest in the subject of ghosts that there was a need for a small, non-technical guide for the amateur. This is the “first-ever do-it-yourself guide for the psychic researcher”. Green eschews fanaticism and suggests that those interested in the ghost phenomenon study parapsychology, thus reflecting the thinking at that time that academic parapsychology would unlock the mystery of life after death. Therefore, a good portion of the book describes parapsychological concepts, such as telepathy, which he states can be an important consideration as to the cause of a phenomenon. He describes Zener card experiments, which would later appear as what ghost researchers study in Ghostbusters (1984). This portion of the book will be rather strange to those weaned on 21st century ghost TV shows (if they manage to find and read this book AT ALL).
Green was certain that psychic powers would soon be recognized (and respected) by science, the church, and society. He remarked that the existence of ghosts can hardly be challenged in the face of all the cases that have been reported – a common justification for investigators to do their thing. As with many paranormal investigators, Green considered serious ghost hunting important and “groundbreaking” work, the researchers as mavericks.
Contrasting Green’s book with modern ghost guides, we can see some striking differences:
- Crisis apparitions were described as “thought pictures”. These types of events were more commonly reported then (as were poltergeists). Both were seen to be manifestations of psychical powers. Today’s ghost hunters are rarely fluent in these historical parapsychological terms.
- EVPs were called Raudive voices and are not emphasized as evidence. Green thought there were too many potential pitfalls to use them this way.
- The technology was primitive compared with what we have today. Equipment included very basic detective-type materials: level, compass, strain gauge, sand or sugar, powder for fingerprints, thread, maybe a camera. But the idea of measuring environmental variables was already being pursued by the Society for Psychical Research.
- Green mentions exorcism but it was clearly not as common as today and people were less bold about it. Today, the concept pervades pop culture and it is treated as a stunt or a ritual that you can train yourself to do. It’s taken less seriously.
- Green’s advice is that the investigator must be thorough and careful in research and provide a sophisticated investigation. He recommends studying the geology, geography, and past owners. I get the impression that Green’s investigations were not the weekend overnighters of today’s ghost hunters. They were long-term investments in time and effort. The resulting report was to be of print quality!
- The investigator should NEVER get involved in publicity for the case, Green advises. He recognized that some people are in it just for the attention and this was not a proper impetus to do this work. Well, maybe that hasn’t changed. But to restrict all publicity is not what today’s investigators would agree to.
Green judges the client in terms of credentials. Note this curious “test”:
“The production of a caseful of apparatus at the commencement of an investigation in itself constitutes a test, for the witness of a genuine phenomena will be, or should be, impressed with the serious nature of ghost hunting, while the fraudulent will be worried by the prospect of being exposed.”
That’s quaint. Times have changed.
Green states “I believe” this is the process and how it works but, as with all other ghost hunting guides reviewed here, no support is given to these suppositions. For example: Heat extracted from the environment will energize a haunting. Such ideas about ghost manifestations are very old but have yet to be supported or well-argued.
In summary, Green subscribes to ghosts as real, but this guide provides a number of pieces of sound advice and many examples of normal causes that you will not find in any recent book. He is NOT as careless and overtly credulous as modern ghost hunters. Even though he makes some howlers, he knew his history. This book is well-written and properly edited; the language is written at a higher reading level than most. Some sources are cited in the text but not enough.
How to be a Ghost Hunter – Richard Southall, 2003
Judging from the front matter, this book appears to have been written in 2001. That was at the start of the massive proliferation of ghost hunting groups in the US. Southall is located in Parkersburg, West Virginia, so examples from around that area are included. He calls it a “unique handbook” and it possibly was at the time. It is not now.
The book is of the “Confessions of a Ghost Hunter” type: ghosts are defined, historical aspects are mentioned, prior cases are related, procedures and equipment are suggested, collection of data and evidence is described, and advice on forming a team is offered. Southall states he has a degree in journalism and psychology; the book also has a genuine publisher (of New Age books), which brings the quality and readability of this guide above most others. However, it follows the typical outline of information and includes many unsupported claims, assumptions and statements of “fact”.
Here are some examples:
- He assumes that ghosts exist, that paranormal activity is ghost activity, and that certain descriptions are characteristics of ghosts. How he “knows” this is never explained. No sources are supplied.
- Various unsourced, un-detailed anecdotes are included. The reader is asked to accept these “just so” without proper justification.
- Undefined, sciencey-sounding terms are used throughout: “highest amount of paranormal energy”, “life force”, “psychic energy”.
- If you investigate enough, you will encounter a “demonic entity”. The Ouija board can invite it in so that device is dangerous to use. “The entity will concentrate on the one with the lowest psyche”.
- You can “recharge” a haunting with an object.
- “It is common knowledge in parapsychology and metaphysics” that everything has a life force or aura.
- Orbs are indications that an area contains a great deal of psychic energy. They concentrate around a person emanating psychic energy.
Why did Southall do a ghost hunting guide? To promote the topic. He was running a ghost tour at the time. He states his role shifted from investigation to teaching. This book fails to supply us with any sense of the author’s scientific credibility. He refers to fictional movies, such as The Sixth Sense, to suggest the real world is really like this. Southall states that the scientific method is the means to get “tangible, measurable evidence” as opposed to psychic impressions and divination, though the two methods can validate each other. He is not a scientist and it shows.
This book also shows its age. The equipment portion is written for someone who has never owned a camera. It is dull, overly simplistic and sorely out of date with regard to the use of digital equipment. He states this howler: “A photograph of a ghost cannot be denied.” This wasn’t even rational advice at the TIME, let alone in the age of phone apps.
He states a good investigator should be unbiased but the language from start to finish is completely biased in the belief that an area is likely haunted. Short shrift is given to examination of mundane causes. But he advises to talk up your own credibility: “Clients love credentials and memberships”. The bibliography contains no journals or scientific sources, just references to other ghost hunters’ books and mass marketed paranormal pablum.
Southall’s writing projects the attitude of a good person who is concerned with people who are having a paranormal problem and want answers that he believes he can provide. He understands that people need reassurance that what they experience is understandable and things will be OK. Unfortunately, it’s not that simple and misinformation like this makes it worse.
Ultimate Ghost Tech – Vince Wilson, 2012
This book was also published with more or less the same content as another one of Wilson’s books, “Ultimate Ghost Hunter”. Wilson informed me that he did not care for the term “Ghost Hunter” and has recently pulled that book from publication. Different title or not, the book follows the typical ghost hunter guide book. In one of the forewords (one is spelled “foreword”, the other “forword”), Vince is described as the “foremost expert in the technological aspects of paranormal investigation.”
In the other foreword, a rather well-respected parapsychologist reveals the blatant truth about ghost hunting technology: “Let’s face it: ghost hunters love their tech – even if they don’t know how to use it or to assess the data from it in light of the reported phenomena”. Indeed. I agree with that.
The rest of this book is an example of sounding sciencey but falling short of representing anything like scientific investigation. Wilson focuses on technology, of course. An earlier book, Ghost Science – which I saw as a must-read since I am deeply interested in ghosts + science – was atrocious. It was sloppy, formatted terribly, and at the very least, desperately needed an editor who could spell and eliminate awful turns of phrase. That book begins with the premise “One of the main purposes of this book is to show that, not only do ghosts exist but also that the laws that govern reality allow them”. Neither that book, nor this one will demonstrate that stated purpose to anyone who understands how science actually works. Wilson’s array of books (3) are essentially self-published. But according to Wilson, he has progressed past that first book, yet he still stands by the work he did in this one. I cringed at many aspects of UGT and how readers will be misinformed by much of its content.
Examples:
- He states “random energy particles may hold the essence of consciousness…” There is no basis for such speculation. Shall we talk homeopathy?
- “Ghosts will be proven to exist one day and so will psychics…” What is the basis of this claim? What will that effort entail? Why after 100 years of trying by actual professionals will things change now with amateur researchers?
- He uses several phrases that are painful to read, such as “just another theory” (where “theory” is used to mean “a guess” instead of the scientific meaning of an evidence-supported overarching model of explanation), “science is absolute” (What does that even mean?), “sorry about the math” (If you have to apologize for the language of science, you should NOT be reading or writing such a book) and “blah blah blah” (I can hardly think of ANY excuse to write that).
- He refers to “stuffy scientists” and takes a disparaging tone towards skeptics. In Ghost Science, he called skepticism a quasi-religion.
Several statements rankle me as revealing a disturbingly superficial and inflated attitude of ghost hunting hobbyists. He says Ghostbusters (the movie) changed paranormal research with its lingo and gadgets: “Paranormal research just became really cool overnight.” He suggests science as a way to pump up your credibility – not real science, but faking it – saying you should answer questions from people with sciencey words to sound “professional and cool” and a little “nerdy”. People are too embarrassed to ask what you mean.
Not me. I ask. And science-pretenders skirt the uncomfortable questions.
Wilson relates all the ubiquitous (and wrong) assumptions about ghosts starting with the belief that they exist (thus scuttling any unbiased investigation of what might really be happening to people). The paradigm of today’s ghost investigation is reflected: changes in the environment can be related to ghost behavior and hauntings; technology can provide objective evidence, more and different data, than just human experience. For example, he suggests that a cold spot could be created (through an explanation of energy transfer) from an entity moving through dimensions. This type of rhetoric (apparent in nearly all ghost hunting guides) gives hope but very flimsy justification to other ghost hunters that they will discover something scientifically incredible:
“You can be an amateur parapsychologist and usher in a new era of paranormal research. Wow! That’s pretty deep for me!” (p 160)
Cringe-worthy and specious.
Wilson, like many of these guide writers, seems well-meaning and willing to learn new things and expand his horizons, and he is fairly literate in science ideas – just enough to sound knowledgeable to people who aren’t scientists, which is most of the population. He is not a scientist but a science enthusiast. It’s a widespread trend for ghost hunters to quote scientific buzzwords and namedrop famous scientists. They attempt to apply very complex physics concepts and theories, such as quantum mechanics and Einstein’s “spooky action at a distance”, to inappropriate situations. There are no scientific sources cited or referenced and explained. There are basically NO sources for the various claims or even the quotes. The recommended reading list contains references that repeat these unverified speculative claims and includes pop science sources like The Handy Science Answer Book. This is just not acceptable if you claim to be doing science.
Wilson understands that TV ghost hunters are playing a role and that many paranormal investigators are “fooled by an intense need to believe”. Hoaxes are rampant. So, there is a kernel of truth in much of what he writes. However, that is trumped by his own faith that equipment CAN detect anomalous energy of some sort. The processes he suggests leave out critical considerations about confounding factors and alternative explanations. Wilson has lectured as a ghost tech expert in the past. He suggests giving workshops to teach people about this topic is a good way to fundraise for your group. I find this playing pretend professor/scientist to be profoundly distasteful.
I accept that Vince will be unhappy with my take on his publications as an unfortunate consequence. But if anyone attempts to make such extraordinary claims that are so off the mark, unjustified, and can misinform society, you open yourself to such harsh criticism. I will call you on bullshit and hope you will consider ceasing its propagation.
How to Hunt Ghosts – Joshua P. Warren, 2003
This volume was produced by an affiliate of Simon and Schuster publishing so the basic elements of a book – grammar, punctuation, spelling and formatting – are superior to those of small or self-published efforts. But I can’t say we get better quality in the content. The same unsupported model, built on speculative paranormal assumptions, is applied.
The first words “Ghosts are real” show us this is not about investigation but about finding proof to support a preexisting conclusion. These opening words oddly contrast with the last words of the book, “Never pretend to know all the answers. All the answers are not known”. In between, we get a mish-mash of silly claims and scientific misrepresentation. Warren’s resumé does not include science. He writes fiction and worked in film making. Like many who appear on TV shows as talking heads, he touts these appearances to bolster his credibility. It works for those who get their facts from TV, I imagine.
Warren wins the prize for the most sciencey namedropping in a ghost hunting guide – Descartes, Newton, Einstein, Sagan – none of whom had anything positive to say about spirits. Non-scientist Warren says “Let me tell you what static electricity is…”. No, thanks. I’d rather get my science information from someplace OTHER THAN in a book about entities that have not been demonstrated to exist. If we are to take these ghost hunters seriously, they should explain why physicists aren’t writing books about the paranormal but non-scientists are.
Here are some illustrations of the ideas presented:
- Spiritual manifestations are hidden from us. Our technology is not good enough. There is scientific evidence that ghostly manifestations are real, he says. Warren provides no hint of why physicists can detect subatomic particles and the tiniest releases of energy but our technology is not adequate to identify ghosts. What scientific evidence is he talking about? No journal article is cited or mentioned, as is standard scientific protocol.
- Mainstream science is bad because they need to limit their work to activity of a certain category. “Most scientists are busy enough researching the activity they already know about.” This reveals a core ignorance of how knowledge can progress and is a self-evidently dumb claim. From the early days of the scientific endeavor, knowledge became specialized by necessity. To say science is flawed because of this is like saying medicine is bad because too many doctors specialize in distinct areas of health or surgery. Specialization is advantageous for advancing deep knowledge. Astronomers aren’t collecting and evaluating the same data as biologists or sociologists.
- If a person dies young, especially violently, “it is likely that a ghost will remain”.
- Ghosts wrap themselves in ions in order to interact physically. If this is correct, he adds, we can use this to predict and manipulate the phenomena. There is a kernel of science in there but the claims that ghosts exist, utilize ions, and interact physically are all grand assumptions.
- “Virtually any location can prove to be haunted.” You should experiment to decide if the Ouija board, automatic writing, pendulums, etc. work for you.
- Warps are areas where the laws of physics seem to be distorted. These may create natural portals. “Warps exemplify the most complicated issues facing science today”. They can be filled with “hundreds or thousands” of entities. The example of a warp is given as the Bermuda Triangle, a myth that was exploded decades ago as sensationalized fiction. Take note that Warren runs a “Bermuda Triangle Research” site in Puerto Rico.
- There is a “correlation between ghost manifestations and standing (acoustical) waves” – it may make the ghost appear. This is in contrast to the well-known research of Vic Tandy who demonstrated that an inadvertently created standing wave was responsible for behavior of materials (metal fencing foil) and possibly the fluid in our eyeballs that could lead to ghost-like reports. Unless I’m missing something (there are no citations to check), Warren has this concept COMPLETELY backwards.
We’re way out on the fringe here. Such incredible claims should have equally incredible documentation provided. Nope. Nothing. It’s practically lying.
Warren knows some science basics, that’s clear, but like many other ghost researchers, he applies them wildly incorrectly. There is an overuse of the term energy without a reasonable definition provided. Warren claims that there is energy of attraction, energy that comes out of our eyes when we look at someone. He says we have auras around us. Dowsing rods that you can make yourself can detect energy fields. His research group (of which he is founder and president) is called the League of Energy Materialization and Unexplained Phenomenon Research (LEMUR). I first heard of Warren through his investigation of the ghost light phenomenon. He also thinks this is energy produced by the earth. On the whole, this is one of his less outrageous ideas, since such lights are actually documented in several places around the world, but the methods of amateur research are unlikely to produce any results of value. The answer to what causes ghost lights is certainly complex and multivariate.
Warren refers to many fictional movies for examples – he is, after all, a fiction novelist. I question at what level ghost hunters can distinguish scientific facts from PURE fictional license. And their lack of attention to very normal, reasonable explanations, providing foundationless claims instead that might as well be fiction, dooms them to failure in any effort to advance worthwhile conclusions about ghost experiences. It also leaves them wide open targets for derision by scientists working in legitimate research endeavors. Warren exhibits paranormal pretentiousness. Since he’s moved into the realm of hawking “wishing machines” and lucky charms, he’s lost all credibility. Scientific? Credible? Not in any sense of the words.
Additional Samples
To try to be as thorough as possible, I accessed a sample of several of the dozens of e-books available in the Amazon lenders library. I tried to pick those that ranked high in the search. I did not preview them beforehand so this is nearly a “random” selection off the shelf.
Unsurprisingly, these also fit into the same template and had similar characteristics:
- “Just so” facts and stories
- No references
- Lack of proofing or editing including several typographical errors and incorrect punctuation
- Poor layout and design
- Unsophisticated, overly casual writing style
- Superficial content
I included screen shots of various selections that I highlighted in these books to show I’m not making this stuff up – this is what people really wrote and marketed for sale.
Ultimate Ghost Hunting Guide – Jeff Terrozas, 2011
Subtitled “Everything you need to know for paranormal research”, the content is overly rambling and amateurish. Typos abound, the layout is annoyingly sloppy. The premise is that ghost hunting is “fun”, so have fun. It’s not to be taken seriously unless you want to make money. In that case, you should act “professional”. This book should not be taken seriously.
Ghost Seekers Field Guide, Volume 1 – Frank Potterstone, 2011
No proofreading or editing was apparently done to this manuscript. The language and grammar are poor, typos are abundant and the layout is simply ugly. There is an overuse of ellipses and random unattributed quotes. Though the author means well, with these factors, the lack of adherence to punctuation conventions, and the unfocused content, this book is unreadable. Yes, there was a Volume 2 as well.
Ultimate Ghost Hunter Field Guide – Brandy Burgess, n.d.
Layout is very poor with line breaks in the middle of a sentence and random capitalization of words. Grammar is poor and the writing is amateurish and unfocused. The author lays out “facts” such as a description of “psychic burns” and “awakenings” without any support for such supernatural claims. She says you will know a spirit is demonic because of the sulfur or rotten flesh smell as well as the growling sounds. They also appear in half-human, half-animal form. These sound like verifiable claims; one wonders why we can’t prove such incredible new findings if they are so obvious.
* * * *
Ghost Hunting 101: The Ultimate Resource for Beginner and Experienced Ghost Hunters – Ghostly World, 2015
Ghostly World is a website “dedicated to all things haunted”. The authors say on their site that they are not an investigation team or even “in the paranormal field”. Yet, here they are publishing and charging for an instruction book on ghost hunting. How’s that for zero credibility?
The layout of this book is good and the writing style is generally appropriate to a serious handbook. There are some typos. The content is shallow and lacks development and explanations. Terms and labels are assigned subjectively. For example, readers are told there are three kinds of ghost hunters: a hobbyist, a serious researcher and a home investigator. A random graph is included (because graphs look sciencey) without any source data to show 100% are hobbyists, 50% are serious researchers and only 10% are home investigators. Going into a client’s home is serious stuff where the ghost hunter needs to provide comfort and assistance to the residents while studying spirits. The unnamed author(s) suggest the ghost hunter may need to act in the capacity of a “therapist” – a highly unethical suggestion. Meanwhile, the reader is warned that Ouija boards and other occult dealings will bring about dangerous evil spirits. They seem to think Grant Wilson and Jason Hawes invented ghost hunting.
Some of these books are surprisingly candid, as I found with How to Legally Gain Access to Haunted Locations: A Guide for Paranormal Investigators (n.d.) by Casper Waylin. Waylin makes no apologies for playing pretend and weaseling your way into clients’ homes. He recommends following what you see on TV shows:
Professionalism starts as “pretending” but evolves into something that’s real. If you’re just getting started as a ghost hunting group, you’ll need to pretend that you’re a “professional” and put on a convincing act for the people you talk to in order to gain entry into a particular location. Put together a good costume (some nice clothes) and props (legal documents and contracts) and then tell clients and gatekeepers exactly what you plan to do from beginning to end. In terms of how you greet and speak to new clients, it can help to model other group leaders you’ve seen on TV or read about in books and for crying out loud, make sure that you have a firm handshake and you look them in the eye during your initial contact!
and
Acting professional is okay if you’re not really a professional. Find a character in a movie or watch some of the later episodes of TAPS [Ghost Hunters] or Ghost Adventures and emulate the paranormal investigators that you can relate to best.
So, copy the guys on TV when you enter other people’s houses. This is awful, awful stuff.
Finally, I would like to mention a specialty guide called The Other Side: A Teen’s guide to Ghost Hunting and the Paranormal (2009) by Gibson, Burns, and Schrader. This might be considered one of the least worst books since it was done by a reputable publisher and contains a handful of good advice. There are two overarching and egregious problems with this book. 1. Misinformation directed at teens to take on this topic and “educate the masses” about “what our place is in the universe and what possibilities there are of an afterlife”; and 2. The ignorant and condescending attitude towards science as hard and cumbersome, and skepticism as cynical bullying (p. 67). The logical fallacies and unsupported claims rampant in this book would make it excellent to use as an example for a critical thinking exercise.
Most, perhaps all, of these authors wrote these books because they believed it would be helpful to an audience or to their investigation group as a way to codify what they deemed to be important knowledge and procedures that everyone was expected to follow. With the advent of easy self-publishing, we’ve seen a proliferation of low-quality, previously unpublishable books like never before. Anyone, even someone who never wrote an article or term paper, can publish a book, sell it, and claim to be an author. There is no excuse for publishing a book without having it edited for basic grammar, spelling, and punctuation. If I had a nickel for all the times I read the phrases “First of all”, “First and foremost”, “Suffice (it) to say”, and “Let me be clear” in these books, I would take my few bucks and go buy a drink. There is no justification for the amount of self-serving, misguided misinformation out there that promises the reader that “this book” is the (ultimate) thing you need to set yourself up as a genuine, credible, and successful ghost hunter.
My recommendation: Don’t bother with any of them.
Look up books done by professional science writers or work done by actual parapsychologists to learn the literature of the field before you write a book and say you know what you are talking about.
I’ll end with some suggestions for those who plan to write future guides to the paranormal, if there has to be any…
There are two books you must research. BUY Scientific Paranormal Investigation by Benjamin Radford (2010). If you do any paranormal investigation, this should be your only guide for now.
Secondly, refer to Parapsychology, A Handbook for the 21st Century by Cardena et al., eds. (2015). You can borrow this from a university library or browse it online. While I have disagreements with content in this volume, it is an example of a credible way to construct a sophisticated and useful handbook that will be relevant for decades. It will also give the ghost hunter hobbyists an eye-opener on the insane amount of parapsychological research that has been done by far more qualified people of various disciplines. Written at a college reading level, it is not in the same class as the books cited above, and it makes all amateur guides look extremely unsophisticated. But if you are going to claim to be doing groundbreaking, important research that will enhance our future knowledge about spirits and hauntings, you REALLY need to up your game. Considerably. I call for no more ghost guidebooks.
References:
Hill, Sharon (2010) Being Scientifical: Popularity, Purpose and Promotion of Amateur Research and Investigation Groups in the U.S. A thesis submitted to the Faculty of the Graduate School of the University at Buffalo, State University of New York in partial fulfillment of requirements for Degree of Master of Education EdM
Hill, Sharon (2013) Sounds Sciencey Presentation at NECSS https://www.youtube.com/watch?v=9CmgweT0eE0
#ghostHunters #ghostHuntingGuide #paranormalInvestigation #paranormalInvestigators
-
A Guide to Ghost Hunting Guidebooks: NO MORE! Please!
This might come as a shock to the millions of ghost enthusiasts out there: The scientific consensus is that ghosts are NOT spirits, remnants of the dead, recordings of energy, or supernatural entities. Our existing knowledge about nature does not point to a conclusion that ghosts are a single definable thing, paranormal or normal, that you can find, observe, measure, or study. Yet, there are about 200 guides to “ghost hunting” in print or e-book form that lay out ways to obtain evidence of or make contact with ghosts. Therefore, we have a conundrum at step one of any attempt at ghost hunting – we can’t define what a ghost is, and we do not know its properties because we’ve never determined that they exist and measured them. No ghost handbook has ever led anyone to catch and identify ghosts; they can only lead you to interpret something as a ghost.
In that sense, all ghost hunting books are worthless. So why bother with them?
First, it’s an interesting cultural phenomenon. Actively investigating reports of ghosts and paranormal activity is mainstream and a popular hobby and tourism draw. In 2010, there were over 1000 paranormal investigation groups in the US, the majority of which researched hauntings. (Hill, 2010) It’s not worthless to examine why people spend their time and money on this hobby and how they go about doing it.
Second, the idea of paranormal investigation contains important aspects of society’s attitudes towards finding out about the world, deciding what is meaningful and true, using science to examine questions, cooperation and trust in a community, and taking part in a larger effort beyond one’s own small role in life.
I’m deeply interested in the second point. I’ve found that examining amateur paranormal group behaviors and output highlights concepts about science education and public discourse about belief and reality. This piece mentions 11 books on ghost hunting that I have examined. They have broad similarities and distinct differences. In the main portion, I review 4 books on the basis of the following:
- Readability (language, errors, quality of writing)
- Credibility (sources, supported arguments vs speculation, factual correctness)
- Overall value as a cultural product (Buy it or not?)
I picked these particular books for several reasons. They span a significant spectrum in time over which we can watch the evolution of ghost hunting technique. I think they are generally representative of this narrow niche. There are better and worse ones, I’m sure. In searching for a selection, I realized I could not POSSIBLY read them all, nor would I want to spend money on them. Many appear to be self-published since several ghost investigation group leaders feel the need to have their own personal volume to use.
Please note that when I mention today’s “modern” ghost hunters I am referring to those who have watched Ghost Hunters, Ghost Adventures, Paranormal State and other television shows of this genre. It’s well-established (Hill, 2010) that today’s popular hobby grew from fans of these shows who copied what they saw on TV as their preferred method.
Ghost Hunting: A Practical Guide (UK) – Andrew Green, 1973
Andrew Green was called “the Spectre Inspector” and was a well-educated pursuer of ghosts for sixty years. He felt that there was such an interest in the subject of ghosts that there was a need for a small, non-technical guide for the amateur. This is the “first-ever do-it-yourself guide for the psychic researcher”. Green eschews fanaticism and suggests that those interested in the ghost phenomenon study parapsychology, thus reflecting the thinking at that time that academic parapsychology would unlock the mystery of life after death. Therefore, a good portion of the book describes parapsychological concepts, such as telepathy, which he states can be an important consideration as to the cause of a phenomenon. He describes Zener card experiments, which would later appear as what ghost researchers study in Ghostbusters (1984). This portion of the book will be rather strange to those weaned on 21st century ghost TV shows (if they manage to find and read this book AT ALL).
Green was certain that psychic powers would soon be recognized (and respected) by science, the church, and society. He remarked that the existence of ghosts can hardly be challenged in the face of all the cases that have been reported – a common justification for investigators to do their thing. As with many paranormal investigators, Green considered serious ghost hunting important and “groundbreaking” work, the researchers as mavericks.
Contrasting Green’s book with modern ghost guides, we can see some striking differences:
- Crisis apparitions were described as “thought pictures”. These types of events were more commonly reported then (as were poltergeists). Both were seen to be manifestations of psychical powers. Today’s ghost hunters are rarely fluent in these historical parapsychological terms.
- EVPs were called Raudive voices and are not emphasized as evidence. Green thought there were too many potential pitfalls to use them this way.
- The technology was primitive compared with what we have today. Equipment included very basic detective-type materials: level, compass, strain gauge, sand or sugar, powder for fingerprints, thread, maybe a camera. But the idea of measuring environmental variables was already being pursued by the Society for Psychical Research.
- Green mentions exorcism but it was clearly not as common as today and people were less bold about it. Today, the concept pervades pop culture and it is treated as a stunt or a ritual that you can train yourself to do. It’s taken less seriously.
- Green’s advice is that the investigator must be thorough and careful in research and provide a sophisticated investigation. He recommends studying the geology, geography, and past owners. I get the impression that Green’s investigations were not the weekend overnighters of today’s ghost hunters. They were long-term investments in time and effort. The resulting report was to be of print quality!
- The investigator should NEVER get involved in publicity for the case, Green advises. He recognized that some people are in it just for the attention and this was not a proper impetus to do this work. Well, maybe that hasn’t changed. But to restrict all publicity is not what today’s investigators would agree to.
Green judges the client in terms of credentials. Note this curious “test”:
“The production of a caseful of apparatus at the commencement of an investigation in itself constitutes a test, for the witness of a genuine phenomena will be, or should be, impressed with the serious nature of ghost hunting, while the fraudulent will be worried by the prospect of being exposed.”
That’s quaint. Times have changed.
Green states “I believe” this is the process and how it works but, as with all other ghost hunting guides reviewed here, no support is given to these suppositions. For example: Heat extracted from the environment will energize a haunting. Such ideas about ghost manifestations are very old but have yet to be supported or well-argued.
In summary, Green subscribes to ghosts as real, but this guide provides a number of pieces of sound advice and many examples of normal causes that you will not find in any recent book. He is NOT as careless and overtly credulous as modern ghost hunters. Even though he makes some howlers, he knew his history. This book is well-written and properly edited; the language is written at a higher reading level than most. Some sources are cited in the text but not enough.
How to be a Ghost Hunter – Richard Southall, 2003
This book appears to have been written in 2001 from the front information. That was at the start of the massive proliferation of ghost hunting groups in the US. Southall is located in Parkersburg, West Virginia so examples from around that area are included. He calls it a “unique handbook” and it possibly was at the time. It is not now.
The book is of the “Confessions of a Ghost Hunter” type: ghosts are defined, historical aspects are mentioned, prior cases related, procedures and equipment are suggested, collection of data and evidence are described, and advice on forming a team is offered. Southall states he has a degree in journalism and psychology; the book also has a genuine publisher (of New Age books), which brings the quality and readability of this guide above most others. However, it follows the typical outline of information and includes many unsupported claims, assumptions and statements of “fact”.
Here are some examples:
- He assumes that ghosts exist, paranormal activity is ghost activity, and these certain descriptions are characteristics of ghosts. How he “knows” this is never explained. No sources are supplied.
- Various unsourced, un-detailed anecdotes are included. The reader is asked to accept these “just so” without proper justification.
- Undefined, sciencey-sounding terms are used throughout: “highest amount of paranormal energy”, “life force”, “psychic energy”.
- If you investigate enough, you will encounter a “demonic entity”. The Ouija board can invite it in so that device is dangerous to use. “The entity will concentrate on the one with the lowest psyche”.
- You can “recharge” a haunting with an object.
- “It is common knowledge in parapsychology and metaphysics” that everything has a life force or aura.
- Orbs are indications that an area contains a great deal of psychic energy. They concentrate around a person emanating psychic energy.
Why did Southall do a ghost hunting guide? To promote the topic. He was running a ghost tour at the time. He states his role shifted from investigation to teaching. This book fails to supply us with any sense of the author’s scientific credibility. He refers to fictional movies, such as The Sixth Sense, to suggest the real world is really like this. Southall states that the scientific method is the means to get “tangible, measurable evidence” as opposed to psychic impressions and divination, though the two methods can validate each other. He is not a scientist and it shows.
This book also shows its age. The equipment portion is written for someone who has never owned a camera. It is dull, overly simplistic and sorely out of date with regard to the use of digital equipment. He states this howler: “A photograph of a ghost cannot be denied.” This wasn’t even rational advice at the TIME, let alone in the age of phone apps.
He states a good investigator should be unbiased but the language from start to finish is completely biased in the belief that an area is likely haunted. Short shrift is given to examination of mundane causes. But he advises talking up your own credibility: “Clients love credentials and memberships”. The bibliography contains no journals or scientific sources, just references to other ghost hunters’ books and mass marketed paranormal pablum.
Southall’s writing projects the attitude of a good person who is concerned with people who are having a paranormal problem and want answers that he believes he can provide. He understands that people need reassurance that what they experience is understandable and things will be OK. Unfortunately, it’s not that simple and misinformation like this makes it worse.
Ultimate Ghost Tech – Vince Wilson, 2012
This book was also published with more or less the same content as another one of Wilson’s books, “Ultimate Ghost Hunter”. Wilson informed me that he did not care for the term “Ghost Hunter” and has recently pulled that book from publication. Different title or not, the book follows the typical ghost hunter guidebook format. In one of the forewords (one is spelled “foreword”, the other “forword”), Vince is described as the “foremost expert in the technological aspects of paranormal investigation.”
In the other foreword, a rather well-respected parapsychologist reveals the blatant truth about ghost hunting technology: “Let’s face it: ghost hunters love their tech – even if they don’t know how to use it or to assess the data from it in light of the reported phenomena”. Indeed. I agree with that.
The rest of this book is an example of sounding sciencey but falling short of representing anything like scientific investigation. Wilson focuses on technology, of course. An earlier book, Ghost Science – which I saw as a must-read since I am deeply interested in ghosts + science – was atrocious. It was sloppy, formatted terribly, and at the very least, desperately needed an editor who could spell and eliminate awful turns of phrase. That book begins with the premise “One of the main purposes of this book is to show that, not only do ghosts exist but also that the laws that govern reality allow them”. Neither that book, nor this one will demonstrate that stated purpose to anyone who understands how science actually works. Wilson’s array of books (3) are essentially self-published. But according to Wilson, he has progressed past that first book, yet he still stands by the work he did in this one. I cringed at many aspects of UGT and how readers will be misinformed by much of its content.
Examples:
- He states “random energy particles may hold the essence of consciousness…” There is no basis for such speculation. Shall we talk homeopathy?
- “Ghosts will be proven to exist one day and so will psychics…” What is the basis of this claim? What will that effort entail? Why after 100 years of trying by actual professionals will things change now with amateur researchers?
- He uses several phrases that are painful to read, such as “just another theory” (where “theory” is used to mean “a guess” instead of the scientific meaning of an evidence-supported overarching model of explanation), “science is absolute” (What does that even mean?), “sorry about the math” (If you have to apologize for the language of science, you should NOT be reading or writing such a book) and “blah blah blah” (I can hardly think of ANY excuse to write that).
- He refers to “stuffy scientists” and takes a disparaging tone towards skeptics. In Ghost Science, he called skepticism a quasi-religion.
Several statements rankle me as revealing a disturbingly superficial and inflated attitude of ghost hunting hobbyists. He says Ghostbusters (the movie) changed paranormal research with its lingo and gadgets, “Paranormal research just became really cool overnight.” He suggests science as a way to pump up your credibility – not real science, but faking it – saying you should answer questions from people with sciencey words to sound “professional and cool” and a little “nerdy”. People are too embarrassed to ask what you mean.
Not me. I ask. And science-pretenders skirt the uncomfortable questions.
Wilson relates all the ubiquitous (and wrong) assumptions about ghosts starting with the belief that they exist (thus scuttling any unbiased investigation of what might really be happening to people). The paradigm of today’s ghost investigation is reflected: changes in the environment can be related to ghost behavior and hauntings; technology can provide objective evidence, more and different data, than just human experience. For example, he suggests that a cold spot could be created (through an explanation of energy transfer) from an entity moving through dimensions. This type of rhetoric (apparent in nearly all ghost hunting guides) gives hope but very flimsy justification to other ghost hunters that they will discover something scientifically incredible:
“You can be an amateur parapsychologist and usher in a new era of paranormal research. Wow! That’s pretty deep for me!” (p 160)
Cringe-worthy and specious.
Wilson, like many of these guide writers, seems well-meaning and willing to learn new things and expand his horizons, and he is fairly literate in science ideas – just enough to sound knowledgeable to people who aren’t scientists, which is most of the population. He is not a scientist but a science enthusiast. It’s a widespread trend for ghost hunters to quote scientific buzzwords and namedrop famous scientists. They attempt to apply very complex physics concepts and theories, such as quantum mechanics and Einstein’s “spooky action at a distance”, to inappropriate situations. There are no scientific sources cited or referenced and explained. There are basically NO sources for the various claims or even the quotes. The recommended reading list contains references that repeat these unverified speculative claims and includes pop science sources like The Handy Science Answer Book. This is just not acceptable if you claim to be doing science.
Wilson understands that TV ghost hunters are playing a role and that many paranormal investigators are “fooled by an intense need to believe”. Hoaxes are rampant. So, there is a kernel of truth in much of what he writes. However, that is trumped by his own faith that equipment CAN detect anomalous energy of some sort. The processes he suggests leave out critical considerations about confounding factors and alternative explanations. Wilson has lectured as a ghost tech expert in the past. He suggests giving workshops to teach people about this topic is a good way to fundraise for your group. I find this playing pretend professor/scientist to be profoundly distasteful.
I accept that Vince will be unhappy with my take on his publications as an unfortunate consequence. But if you make extraordinary claims that are so off the mark and unjustified, and that can misinform society, you open yourself to such harsh criticism. I will call you on bullshit and hope you will consider ceasing its propagation.
How to Hunt Ghosts – Joshua P. Warren, 2003
This volume was produced by an affiliate of Simon and Schuster publishing so the basic elements of a book – grammar, punctuation, spelling and formatting – are superior to small or self-published efforts. But I can’t say we get better quality in the content. The same unsupported model, built on speculative paranormal assumptions, is applied.
The first words “Ghosts are real” show us this is not about investigation but about finding proof to support a preexisting conclusion. These opening words oddly contrast with the last words of the book, “Never pretend to know all the answers. All the answers are not known”. In between, we get a mish-mash of silly claims and scientific misrepresentation. Warren’s resumé does not include science. He writes fiction and worked in film making. Like many who appear on TV shows as talking heads, he touts these appearances to bolster his credibility. It works for those who get their facts from TV, I imagine.
Warren wins the prize for the most sciencey namedropping in a ghost hunting guide – Descartes, Newton, Einstein, Sagan – none of whom had anything positive to say about spirits. Non-scientist Warren says “Let me tell you what static electricity is…”. No, thanks. I’d rather get my science information from someplace OTHER THAN in a book about entities that have not been demonstrated to exist. If we are to take these ghost hunters seriously, they should explain why physicists aren’t writing books about the paranormal but non-scientists are.
Here are some illustrations of the ideas presented:
- Spiritual manifestations are hidden from us. Our technology is not good enough. There is scientific evidence that ghostly manifestations are real, he says. Warren provides no hint of why physicists can detect subatomic particles and the tiniest releases of energy but our technology is not adequate to identify ghosts. What scientific evidence is he talking about? No journal articles, the standard in scientific protocol, are cited or mentioned.
- Mainstream science is bad because they need to limit their work to activity of a certain category. “Most scientists are busy enough researching the activity they already know about.” This reveals a core ignorance of how knowledge can progress and is a self-evidently dumb claim. From the early days of the scientific endeavor, knowledge became specialized by necessity. To say science is flawed because of this is like saying medicine is bad because too many doctors specialize in distinct areas of health or surgery. Specialization is advantageous for advancing deep knowledge. Astronomers aren’t collecting and evaluating the same data as biologists or sociologists.
- If a person dies young, especially violently, “it is likely that a ghost will remain”.
- Ghosts wrap themselves in ions in order to interact physically. If this is correct, he adds, we can use this to predict and manipulate the phenomena. There is a kernel of science in there but the claims that ghosts exist, utilize ions, and interact physically are all grand assumptions.
- “Virtually any location can prove to be haunted.” You should experiment to decide if the Ouija board, automatic writing, pendulums, etc. work for you.
- Warps are areas where the laws of physics seem to be distorted. These may create natural portals. “Warps exemplify the most complicated issues facing science today”. They can be filled with “hundreds or thousands” of entities. The example of a warp is given as the Bermuda Triangle, a myth that was exploded decades ago as sensationalized fiction. Take note that Warren runs a “Bermuda Triangle Research” site in Puerto Rico.
- There is a “correlation between ghost manifestations and standing (acoustical) waves” – it may make the ghost appear. This is in contrast to the well-known research of Vic Tandy who demonstrated that an inadvertently created standing wave was responsible for behavior of materials (metal fencing foil) and possibly the fluid in our eyeballs that could lead to ghost-like reports. Unless I’m missing something (there are no citations to check), Warren has this concept COMPLETELY backwards.
We’re way out on the fringe here. Such incredible claims should have equally incredible documentation provided. Nope. Nothing. It’s practically lying.
Warren knows some science basics, that’s clear, but like many other ghost researchers, he applies them wildly incorrectly. There is an overuse of the term energy without a reasonable definition provided. Warren claims that there is energy of attraction, energy that comes out of our eyes when we look at someone. He says we have auras around us. Dowsing rods that you can make yourself can detect energy fields. His research group (of which he is founder and president) is called the League of Energy Materialization and Unexplained Phenomenon Research (LEMUR). I first heard of Warren through his investigation of the ghost light phenomena. He also thinks this is energy produced by the earth. On the whole, this is one of his less outrageous ideas, since such lights are actually documented in several places around the world, but the methods of amateur research are unlikely to produce any results of value. The answer to what causes ghost lights is certainly complex and multivariate.
Warren refers to many fictional movies for examples – he is, after all, a novelist. I question at what level ghost hunters can distinguish scientific facts from PURE fictional license. And their lack of attention to very normal, reasonable explanations, in favor of foundationless claims that might as well be fiction, dooms them to failure in any effort to advance worthwhile conclusions about ghost experiences. It also leaves them wide open targets for derision by scientists working in legitimate research endeavors. Warren exhibits paranormal pretentiousness. Since he’s moved into the realm of hawking “wishing machines” and lucky charms, he’s lost all credibility. Scientific? Credible? Not in any sense of the words.
Additional Samples
To try to be as thorough as possible, I accessed a sample of several of the dozens of e-books available in the Amazon lenders library. I tried to pick those that ranked high in the search. I did not preview them beforehand so this is nearly a “random” selection off the shelf.
Unsurprisingly, these also fit into the same template and had similar characteristics:
- “Just so” facts and stories
- No references
- Lack of proofing or editing including several typographical errors and incorrect punctuation
- Poor layout and design
- Unsophisticated, overly casual writing style
- Superficial content
I included screen shots of various selections that I highlighted in these books to show I’m not making this stuff up – this is what people really wrote and marketed for sale.
Ultimate Ghost Hunting Guide – Jeff Terrozas, 2011
Subtitled “Everything you need to know for paranormal research”, the content is overly rambling and amateurish. Typos abound, the layout is annoyingly sloppy. The premise is that ghost hunting is “fun”, so have fun. It’s not to be taken seriously unless you want to make money. In that case, you should act “professional”. This book should not be taken seriously.
Ghost Seekers Field Guide, Volume 1 – Frank Potterstone, 2011
No proofreading or editing was apparently done to this manuscript. The language and grammar are poor, typos are abundant and the layout is simply ugly. There is an overuse of ellipses and random unattributed quotes. Though the author means well, with these factors, the lack of adherence to punctuation conventions, and the unfocused content, this book is unreadable. Yes, there was a Volume 2 as well.
Ultimate Ghost Hunter Field Guide – Brandy Burgess, n.d.
Layout is very poor with line breaks in the middle of a sentence and random capitalization of words. Grammar is poor and the writing is amateurish and unfocused. The author lays out “facts” such as a description of “psychic burns” and “awakenings” without any support for such supernatural claims. She says you will know a spirit is demonic because of the sulfur or rotten flesh smell as well as the growling sounds. They also appear in half-human, half-animal form. These sound like verifiable claims; one wonders why we can’t prove such incredible new findings if they are so obvious.
* * * *
Ghost Hunting 101: The Ultimate Resource for Beginner and Experienced Ghost Hunters – Ghostly World, 2015
Ghostly World is a website “dedicated to all things haunted”. The authors say on their site that they are not an investigation team or even “in the paranormal field”. Yet, here they are publishing and charging for an instruction book on ghost hunting. How’s that for zero credibility?
The layout of this book is good and the writing style is generally appropriate to a serious handbook. There are some typos. The content is shallow and lacks development and explanations. Terms and labels are assigned subjectively. For example, readers are told there are three kinds of ghost hunters: a hobbyist, a serious researcher and a home investigator. A random graph is included (because graphs look sciencey) without any source data to show 100% are hobbyists, 50% are serious researchers and only 10% are home investigators. Going into a client’s home is serious stuff where the ghost hunter needs to provide comfort and assistance to the residents while studying spirits. The unnamed author(s) suggest the ghost hunter may need to act in the capacity of a “therapist” – a highly unethical suggestion. Meanwhile, the reader is warned that Ouija boards and other occult dealings will bring about dangerous evil spirits. They seem to think Grant Wilson and Jason Hawes invented ghost hunting.
Some of these books are surprisingly candid, as I found with How to Legally Gain Access to Haunted Locations: A Guide for Paranormal Investigators (n.d.) by Casper Waylin. Waylin makes no apologies for playing pretend and weaseling your way into clients’ homes. He recommends following what you see on TV shows:
Professionalism starts as “pretending” but evolves into something that’s real. If you’re just getting started as a ghost hunting group, you’ll need to pretend that you’re a “professional” and put on a convincing act for the people you talk to in order to gain entry into a particular location. Put together a good costume (some nice clothes) and props (legal documents and contracts) and then tell clients and gatekeepers exactly what you plan to do from beginning to end. In terms of how you greet and speak to new clients, it can help to model other group leaders you’ve seen on TV or read about in books and for crying out loud, make sure that you have a firm handshake and you look them in the eye during your initial contact!
and
Acting professional is okay if you’re not really a professional. Find a character in a movie or watch some of the later episodes of TAPS [Ghost Hunters] or Ghost Adventures and emulate the paranormal investigators that you can relate to best.
So, copy the guys on TV when you enter other people’s houses. This is awful, awful stuff.
Finally, I would like to mention a specialty guide called The Other Side: A Teen’s guide to Ghost Hunting and the Paranormal (2009) by Gibson, Burns, and Schrader. This might be considered one of the least worst books since it was done by a reputable publisher and contains a handful of good advice. There are two overarching and egregious problems with this book. 1. Misinformation directed at teens to take on this topic and “educate the masses” about “what our place is in the universe and what possibilities there are of an afterlife”; and 2. The ignorant and condescending attitude towards science as hard and cumbersome, and skepticism as cynical bullying (p. 67). The logical fallacies and unsupported claims rampant in this book would make it excellent to use as an example for a critical thinking exercise.
Most, perhaps all, of these authors wrote these books because they believed it would be helpful to an audience or to their investigation group as a way to codify what they deemed to be important knowledge and procedures that everyone was expected to follow. With the advent of easy self-publishing, we’ve seen a proliferation of low-quality, previously unpublishable books like never before. Anyone, even someone who never wrote an article or term paper, can publish a book, sell it, and claim to be an author. There is no excuse for publishing a book without having it edited for basic grammar, spelling, and punctuation. If I had a nickel for all the times I read the phrases “First of all”, “First and foremost”, “Suffice (it) to say”, and “Let me be clear” in these books, I would take my few bucks and go buy a drink. There is no justification for the amount of self-serving, misguided misinformation out there that promises the reader that “this book” is the (ultimate) thing you need to set yourself up as a genuine, credible, and successful ghost hunter.
My recommendation: Don’t bother with any of them.
Look up books done by professional science writers or work done by actual parapsychologists to learn the literature of the field before you write a book and say you know what you are talking about.
I’ll end with some suggestions for those who plan to write future guides to the paranormal, if there has to be any…
There are two books you must research. BUY Scientific Paranormal Investigation by Benjamin Radford (2010). If you do any paranormal investigation, this should be your only guide for now.
Secondly, refer to Parapsychology, A Handbook for the 21st Century by Cardena et al., eds. (2015). You can borrow this from a university library or browse it online. While I have disagreements with content in this volume, it is an example of a credible way to construct a sophisticated and useful handbook that will be relevant for decades. It will also give the ghost hunter hobbyists an eye-opener on the insane amount of parapsychological research that has been done by far more qualified people of various disciplines. Written at a college reading level, it is not in the same class as the books cited above, and it makes all amateur guides look extremely unsophisticated. But if you are going to claim to be doing groundbreaking important research that will enhance our future knowledge about spirits and hauntings, you REALLY need to up your game. Considerably. I call for no more ghost guidebooks.
References:
Hill, Sharon (2010) Being Scientifical: Popularity, Purpose and Promotion of Amateur Research and Investigation Groups in the U.S. A thesis submitted to the Faculty of the Graduate School of the University at Buffalo, State University of New York in partial fulfillment of requirements for Degree of Master of Education EdM [PDF]
Hill, Sharon (2013) Sounds Sciencey Presentation at NECSS https://www.youtube.com/watch?v=9CmgweT0eE0
#ghostHunters #ghostHuntingGuide #paranormalInvestigation #paranormalInvestigators
-
The Irish and the Cymric names for the month of June are obviously related: Meitheamh and Mehefin.
But what's their original meaning? It's Midsummer, fittingly, developed from a common reconstructed Proto-Celtic word *medyo-samīno-
#100daysofGaeilge
#etymology
#irish
#Gaeilge
#Welsh
#Cymraeg
#Cymric
#protoceltic
-
Orchid #plants nurture their seedlings via shared underground fungal network https://phys.org/news/2024-05-orchid-nurture-seedlings-underground-fungal.html
Photosynthate transfer from an autotrophic #orchid to conspecific heterotrophic protocorms through a common #mycorrhizal #fungi network https://nph.onlinelibrary.wiley.com/doi/10.1111/nph.19810
"This finding is exciting because why these #orchids are often found in clumps, despite their #seeds being wind dispersed, has been a puzzle for hundreds of years."
-
mosaique Lüneburg in June 2026: theater, games, allyship workshop and more: Lüneburg's largest living room opens its doors for meeting people, counseling, language courses, a shared supper table, music, games and much more. Find yourself a cozy spot in the room and feel welcome! Everything on offer is free of charge. To keep it going [...]
The post mosaique Lüneburg in June 2026: theater, games, allyship workshop and more appeared… https://luene-blog.de/mosaique-veranstaltungen-juni-2026/?utm_source=dlvr.it&utm_medium=mastodon #Engagement #Kultur #Lüneburg #Bildung #Commons