home.social

#resilienceengineering - Public Fediverse posts

Live and recent posts from across the Fediverse tagged #resilienceengineering, aggregated by home.social.

  1. AWS US-EAST-1 outage (Oct 20, 2025): Root cause & lessons

    A DNS race condition in DynamoDB led to empty endpoint records, triggering cascading failures across AWS services like EC2 and Lambda.

    Explore what went wrong and how to build resilient cloud systems:
    shorturl.at/sJO5K

    #AWS #CloudComputing #DevOps #DynamoDB #ResilienceEngineering

  3. A communications failure disrupted air traffic operations across Greece, grounding and diverting flights for several hours. Officials report no evidence of a cyberattack, with technical and judicial investigations underway.

    The incident reinforces the importance of resilience engineering, redundancy testing, and modernization in safety-critical systems - alongside cyber defense.

    From an infosec and resilience standpoint, where should investment be prioritized?

    Source: securityweek.com/cyberattack-u

    Share insights and follow @technadu for objective coverage.

    #CriticalInfrastructure #ResilienceEngineering #AviationSecurity #Infosec #OperationalRisk #SystemsReliability

  7. More entries to the Practice of Practice collection! Grow with your team! Find that empathy everyone is talking about! Build relationships! Learn the system!

    systems seeing journalling by @RuthMalan

    instrumenting complexity with LEGO from @mike_bowler

    github.com/maroda/practiceofpr

    #SRE #PracticeOfPractice #ResilienceEngineering #GamePlay #Improvisation

  8. Here's a new blog post from me! It's a small "book" review, which is actually a workbook with some essays at the beginning.

    The short book is Maj. John Schmitt's exercise book on Tactical Decision Games (TDGs) for the Marines (and likely other branches), and I noticed how much the philosophy behind them is shared with the sorts of Practice of Practice games we play to understand the system and prepare ourselves for incidents.

    sounding.com/2025/05/02/schmit

    #SRE #TDG #PracticeOfPractice #TabletopExercises #OperationalReadiness #IncidentResponse #TacticalDecisionGames #Resilience #ResilienceEngineering #ReliabilityEngineering

  13. Saturday morning read: I wrote a review of a fairly well known Tactical Decision Games exercise book written 30 years ago that is still applicable today. I wanted to show how the ideas behind such practice are exactly like how we practice for the unexpected in fields like SRE and Arctic rescue.

    sounding.com/2025/05/02/schmit

    #SRE #TDG #MajJohnSchmitt #NDM #GaryKlein #NaturalisticDecisionMaking #ResilienceEngineering #PracticeOfPractice #TacticalDecisionGames #ChaosEngineering

  14. Hey y'all! Any SREs out there?

    Any *aspirational* SREs out there?

    Maybe there's a team at work who they call "SRE" and you're really not sure what they do if it's not infra/platform/deploy?

    A coworker is interested in switching careers to SRE and asked me for reading recommendations. So I put together the most solid top-five for me, from my experience and perspective of doing Ops for 30 years and SRE for 12:

    sounding.com/2024/10/03/five-r

    #SRE #ContinuousLearning #ResilienceEngineering #ReliabilityEngineering #DevOps #TechOps #HumanError #ComplexSystems #SystemsThinking

  19. New entry for Practice of Practice Gamelan, where I introduce the game Wheel of Expertise that I developed for sharing mental models and performing knowledge elicitation.

    #SRE #SocioTechnical #ResilienceEngineering #Operations #Teamwork #CTA #NDM #HumanFactors

    popg.xyz/2024/05/23/wheelofexp