#resilience-engineering — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #resilience-engineering, aggregated by home.social.
-
Tuning into whispered frequencies: Harnessing Large Language Models to detect Weak Signals in complex socio-technical systems
This study evaluated whether LLMs can support a scaled and systematic analysis of surveyed data about worker adaptive practices, to foster weak signal ID.
E.g. can LLMs help identify weak signals from large-scale data? In this case, the data was textual, describing frontline personnel's adaptive behaviours during everyday operations, and was obtained via survey.
PS. Check out my YouTube channel: https://www.youtube.com/@safe_as_pod
Extracts:
· “Systems performance varies in everyday operations due to various internal and external factors, with individuals forced to adapt their performance to cope with any given situation”
· “The factors behind these adaptations are not usually evident, as they may emerge from disconnected pieces of information. Making sense of them refers to identifying ‘weak signals’”
· “Data gathering on adaptive performance is rarely performed, if disconnected from adverse events,” even though it “may have several benefits to fully grasp the actual status of the system and understand the mechanisms that sustain its operation.”
· Manual analysis is useful, but limited as “the dominance of human contribution in textual data analysis significantly limits its applicability and scalability”
· The “weak signals identified through the proposed approach are intrinsically socio-technical, as they emerge from the ways in which people adapt, coordinate, prioritize, and make trade-offs in everyday operations”
· This approach isn’t just related to weak signals of emerging risks, but “can also unearth weak signals that contribute positively to system performance”, e.g. “‘positive weak signals’ represent the very mechanisms that ensure system resilience in everyday operations. They reveal how systems continue to function effectively despite uncertainty, constraints, and competing goals, by relying on adaptive capacity rather than strict procedural compliance”
· “This study demonstrates how the application of LLM-driven analysis can reveal subtle but potentially crucial weak signals within ultra-safe, complex socio-technical environments”
· One weak signal was “the combination of the absence of specific procedures and colleagues’ pressure during events characterized by communication issues”
· The authors claim that such patterns are “hard to grasp by traditional methods”
· Further, the authors point to “proactive safety improvements” and “strengthening the foundations of knowledge management in high-stakes domains”
Lombardi, M., & Patriarca, R. (2026). Tuning into whispered frequencies: Harnessing Large Language Models to detect Weak Signals in complex socio-technical systems. Engineering Applications of Artificial Intelligence, 176, 114738.
#ai #llm #safety #risk #safetyengineering
Study link: https://doi.org/10.1016/j.engappai.2026.114738
My YouTube: https://www.youtube.com/@safe_as_pod
#adaptiveBehaviour #ai #artificialIntelligence #llm #resilienceEngineering #safetyIi #weakSignals
My site with more reviews: SafetyInsights.org
Shout me a coffee: https://buymeacoffee.com/benhutchinson
Safe As LinkedIn group: https://www.linkedin.com/groups/14717868
-
Resilience in Software Foundation is hosting a FRAM workshop!
Dr. Niklas Grabbe is giving an introduction to the Functional Resonance Analysis Method (FRAM).
April 15, 2026 12:00 PM - 2:00 PM EDT. $10 to register (free to Foundation members).
The workshop is designed as a practical introduction for people interested in resilience engineering, safety science, and system modeling.
#ResilienceEngineering #Resilience #ResilienceInSoftware #RISF #FRAM #SRE #Complexity
-
CW: Self promotion
Trust is the glue of our society, but we’re living through a 30+ year crisis of trust in institutions. As software makers, we aren't just witnesses to this crisis; we are participants in it.
When a number is wrong, trust is lost. And I am sort of fed up of "works on my machine" being applied to number and data outputs by systems. "Number is correct, bug closed". We have to do better.
So I've tried to document that, and also share some more difficult areas like communicating uncertainty in a sensible way.
The lovely people at #Monkigras are letting me share this with a wonderful audience of curious engineers next week (19th-20th March - link https://monkigras.com/ )
Do not worry, there will also be trains. Hopefully yellow ones.
#PreppingCraft #SystemsThinking #ResilienceEngineering #TrustBeforeTruth #AI #RailwayEngineering #ux
-
What I’ve been reading (, watching, and listening to) this week ending 7 December 2025 https://jchyip.medium.com/what-ive-been-reading-watching-and-listening-to-this-week-ending-7-december-2025-a7e81ab13dea #AI #physics #RankedChoiceVoting #ResilienceEngineering #civics
-
"... it is impossible to ensure safe and efficient performance by insisting on compliance with design assumptions or work-as-imagined, since the actual conditions never completely match the intended conditions. This is demonstrated by the simple fact that working-to-rule is a recognized way of creating disruptions." https://www.researchgate.net/publication/263145269_Resilience_engineering_and_the_built_environment #ResilienceEngineering
-
"In #ResilienceEngineering it makes as much sense to try to understand why things go right as to understand why they go wrong. In fact, it makes more sense because there are very many more things that go right than things that go wrong." https://www.researchgate.net/publication/263145269_Resilience_engineering_and_the_built_environment
-
"...a built system must include or embody some form of sentience – intelligence or cognition – in order to be resilient." https://www.researchgate.net/publication/263145269_Resilience_engineering_and_the_built_environment #ResilienceEngineering
-
More entries to the Practice of Practice collection! Grow with your team! Find that empathy everyone is talking about! Build relationships! Learn the system!
systems seeing journalling by @RuthMalan
instrumenting complexity with LEGO from @mike_bowler
https://github.com/maroda/practiceofpractice
#SRE #PracticeOfPractice #ResilienceEngineering #GamePlay #Improvisation
-
Was sharing this with a friend today, my description of a game called "Wheel of Expertise" that helps teams dig into deep specifics of their systems:
https://www.popg.xyz/2024/05/23/wheelofexpertise/
#SRE #ContinuousLearning #ConnectiveLabor #IncidentResponse #ComplexSystems #WheelOfExpertise #ResilienceEngineering
-
The people all said
Sit down
Sit down you're rockin the boat
#MyLife #SRE #ResilienceEngineering #GeurillaResilience #LaidOff #LetGo #DeadParrot
-
It drives me fucking bananas that SREs at my company are not considered Engineers and are not included in DevEx. On purpose. For the love of all that's holy!!!!!!!1
Brought to you by: a longish motivational thing the VP of Engineering dropped in their channel today that really showed some care for helping people in their careers and learn.
But it's that this stuff was made explicit, by a leader, that really sets it apart and illustrates how dysfunctional the TechOps side of the org is growing without any leadership. They've flattened the entire thing out so that managers are reporting directly to the CTO and are taking on 10-15 reports, minimum. It's a ticket factory, not a collaboration.
-
New blog post... something I have had at the tip of my pen for years...
Am I Developer?
-
Here's a new blog post from me! It's a small "book" review, which is actually a workbook with some essays at the beginning.
The short book is Maj. John Schmitt's exercise book on Tactical Decision Games (TDGs) for the Marines (and likely other branches), and I noticed how much the philosophy behind them is shared with the sorts of Practice of Practice games we play to understand the system and prepare ourselves for incidents.
https://www.sounding.com/2025/05/02/schmitt-tdg-sre/
#SRE #TDG #PracticeOfPractice #TabletopExercises #OperationalReadiness #IncidentResponse #TacticalDecisionGames #Resilience #ResilienceEngineering #ReliabilityEngineering
-
Saturday morning read: I wrote a review of a fairly well known Tactical Decision Games exercise book written 30 years ago that is still applicable today. I wanted to show how the ideas behind such practice are exactly like how we practice for the unexpected in fields like SRE and Arctic rescue.
https://www.sounding.com/2025/05/02/schmitt-tdg-sre/
#SRE #TDG #MajJohnSchmitt #NDM #GaryKlein #NaturalisticDecisionMaking #ResilienceEngineering #PracticeOfPractice #TacticalDecisionGames #ChaosEngineering
-
Compendium of Nancy Leveson: STAMP, STPA, CAST and Systems Thinking
Although I don’t often mention or post about Leveson’s work, she’s probably been the most influential thinker on my approach after Barry Turner.
So here is a mini-compendium covering some of Leveson’s work.
Feel free to shout a coffee if you’d like to support the growth of my site:
https://buymeacoffee.com/benhutchinson
https://direct.mit.edu/books/oa-monograph/2908/Engineering-a-Safer-WorldSystems-Thinking-Applied
https://dspace.mit.edu/bitstream/handle/1721.1/102747/esd-wp-2003-01.19.pdf?sequence=1&isAllowed=y
https://escholarship.org/content/qt5dr206s3/qt5dr206s3_noSplash_4453efa62859a16d187fa5e66d414ac2.pdf
https://escholarship.org/content/qt8dg859ns/qt8dg859ns_noSplash_e67040b78c1ff72e51b682bb23d8628a.pdf
https://doi.org/10.1177/0170840608101478
https://doi.org/10.1145/7474.7528
http://therm.ward.bay.wiki.org/assets/pages/documents-archived/safety-3.pdf
https://dspace.mit.edu/bitstream/handle/1721.1/108601/Leveson_A%20systems%20approach.pdf
http://sunnyday.mit.edu/papers/Rasmussen-Legacy.pdf
https://www.tandfonline.com/doi/pdf/10.1080/00140139.2015.1015623
http://sunnyday.mit.edu/papers/issc03-stpa.doc
https://doi.org/10.1016/j.ssci.2018.07.028
http://sunnyday.mit.edu/shell-moerdijk-cast.pdf
http://sunnyday.mit.edu/CAST-Handbook.pdf
https://psas.scripts.mit.edu/home/get_file.php?name=STPA_Handbook.pdf
https://psas.scripts.mit.edu/home/wp-content/uploads/2020/07/JThomas-STPA-Introduction.pdf
https://cris.vtt.fi/ws/portalfiles/portal/98296189/Complete_with_DocuSign_2024-1-2_STPA_guide_F.pdf
http://sunnyday.mit.edu/UPS-CAST-Final.pdf
https://doi.org/10.1016/j.trip.2023.100912
https://dspace.mit.edu/bitstream/handle/1721.1/107502/974705860-MIT.pdf?sequence=1
https://proceedings.systemdynamics.org/2007/proceed/papers/DULAC552.pdf
http://sunnyday.mit.edu/nasa-class/jsr-final.pdf
https://dl.acm.org/doi/pdf/10.1145/2556938
https://dspace.mit.edu/bitstream/handle/1721.1/102833/esd-wp-2011-13.pdf?sequence=1&isAllowed=y
https://academic.oup.com/jamia/article-abstract/15/3/272/727503?redirectedFrom=PDF
https://www.academia.edu/29657886/The_systems_approach_to_medicine_controversy_and_misconceptions
https://dl.acm.org/doi/pdf/10.1145/3376127
https://www.sciencedirect.com/science/article/pii/S0022522316000702
http://sunnyday.mit.edu/caib/issc-bl-2.pdf
http://sunnyday.mit.edu/papers/ARP4761-Comparison-Report-final-1.pdf
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8102762
https://www.tandfonline.com/doi/pdf/10.1080/00140139.2015.1011241
https://onlinelibrary.wiley.com/doi/pdf/10.1260/2040-2295.3.3.391
http://sunnyday.mit.edu/papers/incose-04.pdf
https://core.ac.uk/download/pdf/78070242.pdf
https://dspace.mit.edu/bitstream/handle/1721.1/102767/esd-wp-2004-08.pdf?sequence=1&isAllowed=y
https://www.tandfonline.com/doi/pdf/10.1080/00140139.2014.1001445
https://ntrs.nasa.gov/api/citations/20230017753/downloads/Kopeikin_AIAA_UnsafeCollabControl_v5.pdf
http://sunnyday.mit.edu/accidents/space2001-version2.pdf
https://dspace.mit.edu/bitstream/handle/1721.1/90801/891583966-MIT.pdf?sequence=2&isAllowed=y
http://sunnyday.mit.edu/Bow-tie-final.pdf
https://cs.emis.de/LNI/Proceedings/Proceedings232/597.pdf
https://a3e.com/wp-content/uploads/2021/03/Risk-Matrix.pdf
https://journals.sagepub.com/doi/pdf/10.1177/21695067231192457
https://jsystemsafety.com/index.php/jss/article/download/44/41
http://sunnyday.mit.edu/compliance-with-882.pdf
https://meridian.allenpress.com/bit/article-pdf/47/2/115/1488089/0899-8205-47_2_115.pdf
#CAST #disaster #nancyLeveson #resilienceEngineering #risk #safetyScience #safetyIi #safety2 #safetyii #stamp #stpa #systemSafety #systemsEngineering #systemsSafety #systemsThinking
-
Wading the waters of changing how people work is treacherous.
It's like walking around a rocky riverbed, where you think you see the size and shape of the rocks as the water moves over them, but light and motion illusions fog any real understanding up until you've twisted your ankle on the result.
I twist my ankle a lot doing this work. But I keep doing it. There's so much more for us to learn when we think we've learned it all.
-
Having a little discussion about our passionate view of things in the DevEx team I'm on and I offered this:
"studying Resilience Engineering saved my life, there's no doubt about it. there is so much good to be had from RE that i want to share with the world. i hope that we can get closer to RE practices here. not because i want to show i've done it, i don't care about resume filling. but to show people there's a better way. it takes care of humans and is more curious about the system. fix errors, but also discover insight."
-
Venting and rubber-ducking* in your team channels is 100% a fantastic way to invigorate your remote work practice.
I cannot count the number of times I have witnessed and been a part of such conversations that had an effect of pulling the team closer together, whether people agree or not.
(* Rubber ducking is the process of using an external "listener" to receive your stream of consciousness ranting about whatever problem it is you're trying to solve. The phrase comes from the practice of having a real "rubber duck" on your desk to use as the receiver.)
-
There's this thing about resilience engineering being more about being ready for dragons around the next corner than trying to guess where all the holes are in the swiss cheese.
I enjoy high nerd humor.
#ResilienceEngineering #ThereBeDragons #WhenSwissCheeseModelsFail #DefenseInDepth #Complexity https://mastodon.zergy.net/@Enalys/113656847324163454
-
~~~ Be Not Measurable ~~~
If developer productivity should be measured at all, maybe it should be with sociograms that show how much actual collaboration happens and how it becomes reflected in the organization of the technology (Conway's Law enters the chat).
Which you bet is invasive shit that can feel like Big Brother is watching, regardless of the data being collected. But tech management seems to want to do it anyway, without thinking about it from our perspective at the sharp-end.
These complicated ways they want to measure Work As Imagined and Work As Prescribed become huuuuge compliance pressure on us. They care very little about Work As Done because it can't be predicted. It's very unmeasurable work, especially in SRE.
This is the same neoliberal thinking that we can quantify everything, make all Work measurable, so that only Good Training - not Expertise - is needed to do the Work. Hire lower paid or even robotic agents. Profit.
For instance, attributing Root Cause is fundamentally misaligned with how we need to understand the way things Succeed.
If we look at history and point to the downhill slope of failure, we're looking past what Resilience has already done to keep this "failure" from being more of a failure than it is. So is it a failure at all?
When we measure success, we tend to measure multiple things and consider all kinds of abductive reasoning to point at all the various paths for success to happen.
In tech, we do the opposite for incidents, especially when only failures count. So if you are in an org viewing incidents as causal pathways to automate away, then maybe AI can replace those humans and the org will continue to suffer exactly the same kinds of incidents.
Well, the silver lining I guess is that AI can't replace what can't be measured.
~~~ Be Not Measurable ~~~
#DeveloperProductivity #DevEx #SRE #BeNotMeasurable #ResilienceEngineering
-
Postscript:
Guess what happened? Another team member started a thread, diving into the HTML with me. He really didn't remember too much but I was new to SVG as well so we're working together, postulating, trying things out, making suggestions, and finally I come upon the right answer through our trial and error applications of what we *thought* the system wanted.
Two things happened:
1. I learned what the system wanted by building a pattern recognition engine of the needs of the system.
2. I gained reciprocity with a teammate.
I think we just practiced true Followership.
-
Hey y'all! Any SREs out there?
Any *aspirational* SREs out there?
Maybe there's a team at work who they call "SRE" and you're really not sure what they do if it's not infra/platform/deploy?
A coworker is interested in switching careers to SRE and asked me for reading recommendations. So I put together the most solid top-five for me, from my experience and perspective of doing Ops for 30 years and SRE for 12:
https://www.sounding.com/2024/10/03/five-reads-for-learning-sre/
#SRE #ContinuousLearning #ResilienceEngineering #ReliabilityEngineering #DevOps #TechOps #HumanError #ComplexSystems #SystemsThinking
-
New entry for Practice of Practice Gamelan, where I introduce the game Wheel of Expertise that I developed for sharing mental models and performing knowledge elicitation.
#SRE #SocioTechnical #ResilienceEngineering #Operations #Teamwork #CTA #NDM #HumanFactors
-
Completed an 8 page incident report at work today without using or suggesting the phrase "root cause" once.
#ResilienceEngineering #LearningFromIncidents
-
@mononcqc One of my favorites that I go back to a lot! #SRE #ResilienceEngineering
-
This week's notes are on a #paper by David Woods on the four definitions of "Resilience" used in #ResilienceEngineering
The four terms are related to resilience as:
- rebound
- robustness
- graceful extensibility
- sustained adaptability
Specifically, the latter two categories are those that yield more interesting insights.
Notes at https://ferd.ca/notes/paper-four-concepts-for-resilience-engineering.html & https://cohost.org/mononcqc/post/3019560-paper-four-concepts
-
This week's #paper is the Strategic Agility Gap by David Woods: https://link.springer.com/chapter/10.1007/978-3-030-25639-5_11
It puts together a lot of the concepts he has written about in the past, particularly around the need of organizations to balance growth in capabilities with the ability to adjust to the changes they enable, and how sustained adaptability becomes a requirement in these environments.
Notes at https://ferd.ca/notes/paper-the-strategic-agility-gap.html & https://cohost.org/mononcqc/post/2950533-paper-the-strategic
-
Quick posting of older #paper notes I had, as I take a short paper-reading break for a few weeks: Can We Trust Best Practices? Six Cognitive Challenges of Evidence-Based Approaches by David D. Woods and Gary Klein.
In which "best practices" are matched with 6 cognitive challenges and a suggestion to move from "best practices" to "better practices" to properly frame them as provisional.
Notes at https://ferd.ca/notes/paper-can-we-trust-best-practices.html & https://cohost.org/mononcqc/post/1942762-paper-can-we-trust