#slightreliability — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #slightreliability, aggregated by home.social.
-
This week on #SlightReliability I had the honour of chatting with @honeycombio Field CTO @lizthegrey about the role of developer advocacy in #SRE.
🗣️ What is developer relations (DevRel)?
🎵 Are DevRel and developer advocacy the same thing?
💰 What value does developer advocacy add to organisations and the community?
📖 Storytelling and the power of visuals
🥇 ...and some tips on getting SRE traction in your organisation!
-
@paigerduty I was listening to your appearance on #SlightReliability last week and you mentioned that in #OpenTelemetry all of your services should send data to the same collector. Is this always the case? I was thinking that some ingestion platforms would be able to correlate all the traces, but I could easily be wrong there.
I was thinking I could have one collector per service in a separate container, or even one per eng group. Maybe that isn't feasible. What are the best practices, though?
-
This week on Slight Reliability @paigerduty is back! This time we dive into sampling of distributed traces. We cover...
🕸️ What is distributed tracing? What are spans?
🧪 What is sampling? And why do we need it?
🤯 What constitutes an interesting trace?
🦘 No sampling VS head based VS tail based
👩🏾‍🔬 Non-traditional use cases of tracing such as CI/CD
🧻 The power of napkin math to make informed decisions
...and much more.
-
This week on #SlightReliability I chat with Dr. Vlad Ukis (author of the book "Establishing SRE Foundations" and head of R&D at Siemens Healthineers) about implementing #SRE.
One of my big takeaways from the conversation was the power of selling SRE practices internally, showcasing success, and the "SRE marketing funnel". The social side of SRE is overlooked but very important.
Also in this episode: SLOs and how to get started with them.
-
This week on #SlightReliability Amin Astaneh from Certo Modo is back! This time we discuss his #sre (production engineering) experiences at #meta. We cover:
🏢 What it's like interviewing for big tech
🦶 Voting with your feet (as an incentive to prioritise reliability)
💍 SRE engagement models
🏅 Socialising SRE wins to grow the practice (the sales part of SRE)
🇹 Wide VS deep skillsets in different sized orgs
🚒 The time Amin burned down a data centre... (and much more!)
-
This week on Slight Reliability I revisit the concept of the single pane of glass (#SPOG) with Jamie Allen from EPAM Systems and Adam Kinniburgh from SquaredUp.
👁️ What is a SPOG supposed to be?
🌏 Can it work at massive scale?
💼 Is it a tool for engineers or executives?
🤖 What is the future of dashboards in the #AI era? (and much more) #SRE #observability #dashboard #monitoring #SlightReliability
-
This week on #SlightReliability I drill into the myths and truths about #AI with Kyle Forster from RunWhen.
Can we bring single player mode to pair programming using AI? Are IT jobs at risk of being displaced? How (as consumers) do we make informed decisions about purchasing products with AI? (and of course, much more).
I hope you enjoy my drawing of Vision (from the MCU)... it took me quite some time :)
-
This week on Slight Reliability I had the honour of interviewing Courtney Nash about why mean time to recover (#MTTR) is an unhelpful metric, what she learned by analysing 10+ incident reports, and much more.
🕵🏽‍♀️ Instead of MTTR, let's focus on learning from incidents, observing patterns and themes, involving leadership, and adding an "accident investigator" lens after the fact to enhance the learning.
-
This week on #SlightReliability I chat with Martin Thwaites from Honeycomb.io about #observability during #development (#ODD). Some of my takeaways:
💻 How observability in development frees up developers to spend less time debugging and more time writing code.
🤖 That manual instrumentation is where the power is.
💰 Keeping the cost of observability data down through a combination of head and tail based sampling. "Keeping every span of trace data is irresponsible".
-
This week on #SlightReliability... how do we prevent #observability from only generating value for a small set of engineers? How do executives, product managers, and other stakeholders leverage its power?
https://www.youtube.com/watch?v=rH0U1sKr-TA
(You can also listen to Slight Reliability via most podcast platforms, or check out https://slightreliability.com/)
-
Unfortunately there is no #SlightReliability episode this week... So as is tradition, I have a haiku for you. #sre
-
Who else is going to be at AWS Summit in London on June 7th? Would be great to meet some of the community in person. #awssummit #aws #slightreliability https://aws.amazon.com/events/summits/london/
-
This week on Slight Reliability I chat to Ivan Merrill about his experiences implementing #observability in the real world. We discuss making observability part of onboarding, discussing risk to get leadership buy-in, inviting over inflicting practices, and much more.
-
Yesterday #SlightReliability reached 1k subscribers on YouTube! Just wanted to say thank you to everyone who has listened and joined in the discussion about #sre!
-
This week on #SlightReliability... what is "insight" in #observability? Are tool vendors lying to us about being able to provide it? Is it science? Art? Or magic? #sre https://www.youtube.com/watch?v=i2GFEobj2gM
-
This week on #SlightReliability I reminisce from my #performancetesting days when I used to analyse complete sets of raw data using scatterplots, and ponder how we could apply this in #observability #sre https://www.youtube.com/watch?v=f1GSGWGUEGo
-
Last week on #SlightReliability I chatted to Paige Cruz from Chronosphere about cognitive overload in #SRE. We chatted about how SREs are often used as the Swiss army knives of the IT department, how as humans our RAM is maxed out, why you shouldn’t give your team a name like “The Lobsters”, and a whole lot more.
This was one of my very favourite interviews I've ever done. https://www.youtube.com/watch?v=CDhGgnIGGQY
-
This week on #SlightReliability I talk about how I think #observability promises more than what we're getting. I argue that it needs to look at more than technology in order to help us negotiate the ocean of chaos in the Digital Era. #sre https://www.youtube.com/watch?v=da3o2QSxVeI
-
This week on #SlightReliability... what do we do with all our #telemetry data? Should we put it all in a data lake? Or is there another way we can pull insight together? #sre #observability https://www.youtube.com/watch?v=Mv55p1kXz6g
-
My second official #SlightReliability blog, focusing on my #SRE takeaways from #reinvent. I explore serverless, observability data lakes, topologies (technology maps), FinOps, and more. Oh, and lots of #mspaint art! https://squaredup.com/blog/slight-reliability/sre-trends-from-reinvent-2022/
-
What is the future of #SRE? This week on #SlightReliability I'm joined by the hosts of the @oncallmemaybe podcast @adrianamvillela and @anamedina to discuss just this.
We discuss the role of #observability in SRE, recruitment tactics, company culture and leadership buy-in, cognitive load, leveraging the scale of community, and more.
-
How do you improve yourself as an #SRE or any other role in technology? This week on #SlightReliability I share the books I read in 2022 and what I gained from each. Perhaps one of them could be useful to you? https://www.youtube.com/watch?v=g54j6lTBbfk
-
About to log off for the year. Thank you to SquaredUp for being an awesome employer, and to everyone who tuned into #SlightReliability (or read my articles) in 2022. Looking forward to hitting the ground running in 2023.
I hope you all have a well earned break, and if you're on call over the holiday period... may your incidents be few, and your MTTR extremely small. Oh wait, MTTR has been disproved or something hasn't it? How about, hope it goes smoothly? #sre #observability
-
This week on #SlightReliability
Henrik Rexed (from Dynatrace) and I share our #observability new year's resolutions. We chat about #otel, continuous #profiling, using distributed #tracing in #testing, and much more. #sre https://www.youtube.com/watch?v=e5PzmBYsYNY
-
This week on #SlightReliability I chat to Gwen Berry and Steve Gill about starting an #SRE team from scratch. We discuss failing at #SLO adoption, being on-call as a junior engineer, single pane of glass #observability, and much more. https://www.youtube.com/watch?v=o5zm_GgdbEs
-
This week on #SlightReliability I summarise what I observed at #reinvent last week that relates to #SRE work. This includes observability data warehouses, the use of topologies to understand complexity, FinOps, serverless, and much more. https://www.youtube.com/watch?v=Jn4feHsGmdY
-
This week on #SlightReliability we do something a little different. I walk around AWS #reinvent in Las Vegas asking a simple question: "What is observability?" (and then reflect on what I heard).
This topic will definitely need a more detailed follow-up episode in the future. Massive thank you to everyone who took the time to speak to me. #sre #observability
-
There is more than one type of #SRE. This week on #SlightReliability I explore the different types of SRE out there in the industry and how they meet the needs of the industry, and discuss some ethically dubious practices around hiring. https://www.youtube.com/watch?v=HMD5NHpEo4U
-
#introduction time... I'm Stephen, a DevRel at SquaredUp. I did #PerformanceEngineering for a long time, then moved to #sre more recently. I host the #SlightReliability podcast, blog, and speak at events. I'm also doing some engineering work internally.
I live in New Zealand. Married with two kids, two dogs, and a cat. I trained to be a professional actor but that didn't work out as a career. I enjoy video games (especially #DeepRockGalactic) and #dnd