#monitorama24 — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #monitorama24, aggregated by home.social.
-
Gathering data from lots of different sources, providing it to the model, then running the output through post-processing steps? That sounds like an asynchronous workflow. USE TRACING ~ @cartermp lauding the benefits of tracing for deploying and iterating on AI applications #monitorama #monitorama24 #observability
-
“Disintegration is a feature, not a bug, of asynchronous systems, until you introduce telemetry and monitoring” ~ Johannes Tax from @grafana #monitorama #monitorama24 #observability
-
Johannes Tax at @grafana describing the pain of distributed tracing in asynchronous systems “Disintegrated telemetry: The pains of monitoring asynchronous workflows” #monitorama #monitorama24 #observability
-
“Name your metrics, alerts and dashboards with the language you would use if you were having a conversation about the system at lunch, not the cryptic defaults or uuid hostnames” ~ @danslimmon #monitorama #monitorama24 #observability
-
We’ve all heard the definition of observability, but who is “one”? What is involved in “determining”? ~ @danslimmon presenting “No observability without theory”. You need a theory of the system to build valid inferences.
#monitorama #monitorama24 #observability
-
Julia Thoreson at Bloomberg sharing “Incident Management: Lessons from Emergency Services” breaking down how the lessons learned in emergency services can apply to incident management in technical systems #monitorama #monitorama24 #incidentmanagement
-
Pete Fritchman’s Takeaways on managing internal services effectively:
Internal Services impact customers
Leverage your observability tools
Talk to your internal customers
*APPLY SRE PRINCIPLES*
-
“New hires are super valuable in your internal customer interviews, they actually expect things to work and aren’t bitter yet” ~ Pete Fritchman #Monitorama #monitorama24 #observability #sre
-
“Treat your internal tooling outages like the most critical production outages, because they’ll always hit when you’re trying to recover from a critical production outage” ~ Pete Fritchman #monitorama #monitorama24 #observability
-
“The shoemaker’s children have no shoes - why SRE teams must help themselves” Pete Fritchman making the case for investing in watching the watchmen, and techniques for accomplishing it. #monitorama #monitorama24 #observability #sre
-
Hashmaps for counting work great for small sets, but what happens when you need to count sets larger than memory? You need HyperLogLog or Disjunctive Normal Form (CVM) ~ @phredmoyer #monitorama #monitorama24 #observability
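For readers unfamiliar with the idea, here is a minimal sketch of a CVM-style distinct-count estimator, which keeps a bounded random sample instead of one hashmap entry per item. It is not taken from the talk: the function name `cvm_estimate` and the `capacity` parameter are illustrative, and the published algorithm reports failure where this sketch simply halves the sample again.

```python
import random


def cvm_estimate(stream, capacity, rng=None):
    """Estimate the number of distinct items in `stream`.

    Keeps at most `capacity` items in memory, so the result is exact while
    the distinct count stays below `capacity` and approximate beyond it.
    """
    rng = rng or random.Random()
    p = 1.0          # current probability that a seen item is kept
    sample = set()   # bounded random sample of distinct items
    for item in stream:
        # Forget any earlier decision about this item, then re-decide,
        # so only its most recent occurrence matters.
        sample.discard(item)
        if rng.random() < p:
            sample.add(item)
        # When the sample is full, drop each member with probability 1/2
        # and halve the keep probability for everything that follows.
        # (Simplification: the published algorithm fails here instead of
        # looping if the sample is still full after one halving.)
        while len(sample) >= capacity:
            sample = {x for x in sample if rng.random() < 0.5}
            p /= 2.0
    # Each surviving item stands in for roughly 1/p distinct items.
    return len(sample) / p


if __name__ == "__main__":
    # 50,000 events drawn from 10,000 distinct ids, at most 1,000 held in memory.
    events = (i % 10_000 for i in range(50_000))
    print(cvm_estimate(events, capacity=1_000))
```

A larger `capacity` tightens the estimate at the cost of memory; the trade-off is the same one HyperLogLog makes with its register count.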
-
“Use counters to count things” @phredmoyer providing some examples where counting things by processing petabytes of log or trace data is prohibitively expensive, which justifies spending a bit more on dealing with higher-cardinality metrics.
#monitorama #monitorama24 #observability
-
Baggage is bad for your relationships, good for your service graphs. @kalyanaj makes the case for an arbitrary key value metadata store (baggage) to propagate through your services to enable controllability and observability use cases. #monitorama #monitorama24 #observability
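As a rough illustration of what propagating baggage can look like, the sketch below serializes key/value pairs into a single `baggage` HTTP header (the header name used by the W3C Baggage specification) and reads them back on the receiving service. The helper names `inject_baggage` and `extract_baggage` are hypothetical, and the encoding is simplified: it ignores the per-entry properties and size limits defined in the specification. A real service would more likely rely on its tracing library's baggage support.

```python
from urllib.parse import quote, unquote


def inject_baggage(baggage, headers):
    """Write key/value pairs into an outgoing request's headers (simplified)."""
    if baggage:
        headers["baggage"] = ",".join(
            f"{quote(key, safe='')}={quote(value, safe='')}"
            for key, value in baggage.items()
        )
    return headers


def extract_baggage(headers):
    """Read key/value pairs back out of an incoming request's headers."""
    result = {}
    for member in headers.get("baggage", "").split(","):
        key, sep, value = member.strip().partition("=")
        if sep and key:  # skip empty or malformed members
            result[unquote(key)] = unquote(value)
    return result


if __name__ == "__main__":
    # Service A tags the request; service B sees the same context downstream.
    outgoing = inject_baggage({"tenant": "acme", "test.run": "canary,1"}, {})
    print(outgoing["baggage"])        # tenant=acme,test.run=canary%2C1
    print(extract_baggage(outgoing))  # {'tenant': 'acme', 'test.run': 'canary,1'}
```

Each service that forwards the header unchanged lets downstream services attach the same keys to their own telemetry or use them to route test traffic.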
-
“Distributed Context Propagation: How you can use it to Improve Observability, Test in Production, and more...” @kalyanaj explaining the importance of context in interpreting observability data #monitorama #monitorama24
-
“Every team has a different answer for discovering what the dependencies of their services are, some say firewall rules, some look at network flows, tracing gives us a uniform answer to this” ~ Sudeep Kumar #Monitorama #monitorama24 #observability #tracing
-
“We have so many microservices, people are always looking for an excuse to create more, and no one knows which ones they’re already dependent on” ~ Sudeep Kumar from Salesforce with “Tracing Service Dependencies at Salesforce”
-
“Low cardinality in Prometheus and low cardinality in Clickhouse are vastly different things” - @colind in his talk “Experiments in Backing Prometheus with Clickhouse” #Monitorama #Monitorama24 #observability
-
“All of the observability infrastructure in the world is just noise unless someone spends time developing an understanding of your system” ~ David Caudill
There’s likely a 10:1 investment return when spending that time up front versus in the middle of the night when you’re troubleshooting an outage.