#fluent-bit — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #fluent-bit, aggregated by home.social.
-
Reduce developer friction – Configuring tools like Fluent Bit (and Fluentd)
Something that vendors like Microsoft have been really good at is reducing the friction of getting started – from simplifying installations with MSI files and sensible default options through to very informative error messages in Excel when you’ve got a function slightly wrong. Apple is another good example; while no two Android phones are the same, my experience is that setting up an iPhone is just so much easier than setting up an Android phone. Setup and configuration are also where most friction comes from.
Open-Source Software (OSS), as a generalisation, tends to be a bit weaker at minimising friction – this comes from several factors:
- When OSS is part of a business model, vendors can reduce that friction, making their enhanced version more attractive.
- OSS contributors are typically focused on the core problem space and are usually close enough to the fine details to not need those fancy features to keep the rest of us out of trouble.
- The expectation is that tools to make configuration easy are embedded in the application, making it heavier, when the aim is to keep things as light as possible.
- Occasionally, a little bit of intellectual snobbery can creep in.
The common challenge
The issue that I have observed is that we often go through cycles of working with a technology. For example, you’re building a microservice. Chances are, you’ll start writing and running it locally, without worrying about containerization. Once you’re pretty happy with things, you’ll Dockerize the service, start testing it locally, and then you’ll be ready to deploy it to a cluster. Now you’ll need your YAML. It may well be weeks since you last looked at Helm charts. You end up cutting and pasting your last configuration. But now you need another feature of Helm – can you remember the exact settings for that feature? So you’re trawling the net for documentation, and then it takes several tries to get it right.
AI may well step in to help developers in this area, where solutions and products are well-documented. But with the wrong model or insufficient detail in the prompt, it’s easy to make a mistake. Personally, I’d turn to AI when it becomes necessary to trawl code to better understand the configuration and its behaviour, and to set options.
Experimental Solution
Solution – well, that depends upon the configuration syntax. We have been experimenting with RJSF (React JSON Schema Form), which provides a React-based UI that can be dynamically driven by a JSON Schema and validates data with AJV (an alternative stack we considered was JSON Forms).
```json
{
  "type": "object",
  "title": "Dummy",
  "properties": {
    "name": {
      "type": "string",
      "const": "dummy",
      "title": "Plugin"
    },
    "copies": {
      "type": "integer",
      "description": "Number of messages to generate each time messages are generated.",
      "x-doc-reference": "https://docs.fluentbit.io/manual/data-pipeline/inputs/dummy#configuration-parameters",
      "x-doc-required": false,
      "x-config-data-type": "integer",
      "default": 1
    },
    "dummy": {
      "type": "string",
      "description": "Dummy JSON record.",
      "x-doc-reference": "https://docs.fluentbit.io/manual/data-pipeline/inputs/dummy#configuration-parameters",
      "x-doc-required": false,
      "x-config-data-type": "string",
      "default": "{\"message\":\"dummy\"}"
    },
    "fixed_timestamp": {
      "type": "boolean",
      "description": "If enabled, use a fixed timestamp.",
      "x-doc-reference": "https://docs.fluentbit.io/manual/data-pipeline/inputs/dummy#configuration-parameters",
      "x-doc-required": false,
      "x-config-data-type": "boolean",
      "default": false
    }
  }
}
```
The above fragment shows part of the schema definition for the Dummy plugin for Fluent Bit.
By creating a schema that defines the different plugins, attributes, etc., we can drive validation and menu items easily in the UI. Admittedly, the schema file is large given all the plugins and configuration options, but that is a fair price to pay for a UI that validates the data. To establish the schema in the first place, we scripted the retrieval and scraping of the Fluent Bit documentation pages, which are pretty consistent in structure.
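To make the idea concrete, here is a minimal, hand-rolled sketch of how a schema fragment like the Dummy one above can drive validation of a config object. This is illustrative only, not the project's actual code; in the real UI this work is delegated to RJSF and AJV.

```python
# Hand-rolled sketch of schema-driven validation; the schema is trimmed to
# the essentials of the Dummy plugin fragment shown earlier.
DUMMY_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "const": "dummy"},
        "copies": {"type": "integer", "default": 1},
        "dummy": {"type": "string", "default": "{\"message\":\"dummy\"}"},
        "fixed_timestamp": {"type": "boolean", "default": False},
    },
}

def check_type(value, schema_type):
    """Map JSON Schema scalar types to Python types (bool is not an integer)."""
    if schema_type == "boolean":
        return isinstance(value, bool)
    if schema_type == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if schema_type == "string":
        return isinstance(value, str)
    return True

def validate(config, schema):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    for key, value in config.items():
        prop = schema["properties"].get(key)
        if prop is None:
            errors.append(f"unknown attribute: {key}")
        elif not check_type(value, prop["type"]):
            errors.append(f"{key}: expected {prop['type']}")
        elif "const" in prop and value != prop["const"]:
            errors.append(f"{key}: must be {prop['const']!r}")
    return errors
```

With that in place, `validate({"name": "dummy", "copies": 5}, DUMMY_SCHEMA)` returns an empty list, while a wrongly typed `copies` value produces a targeted error message instead of a failed pipeline start.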
We have added some custom elements into the definition, for example, x-doc-reference, which allows us to extend the React components to provide features such as a link back to the original documentation as you select attributes or plugins.
As a result, we very quickly have a UI that can look like this:
A lot easier to view and tweak, with no need to hunt for valid options. Even if we want more information, we’re just a button click away from the open-source data. Perhaps we should provide a version that hyperlinks to the Manning Live Books on Fluent Bit, etc.
There are a few other factors to consider. For example, Fluent Bit configuration is YAML, not JSON, which is easily resolved given the close relationship between the two formats. Then there are processors that can embed Lua code or a SQL-like syntax. As we’ve chosen to provide a Python backend, we’ve addressed this by providing REST endpoints that extract the code or SQL from the JSON and validate it: Lua with a Python Lua parser, and the SQL-like syntax with a grammar written for the Lark library, which is simple enough to define and maintain.
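As a sketch of the extraction half of that approach, embedded snippets can be pulled out of the JSON-modelled pipeline before being handed to a Lua parser or Lark grammar. The key names (`code`, `query`) and the pipeline shape here are illustrative assumptions, not the project's actual field names.

```python
# Sketch: recursively collect embedded Lua/SQL strings from a JSON-modelled
# pipeline so a backend endpoint can pass them to a real validator.
import json

PIPELINE = json.loads("""
{
  "pipeline": {
    "filters": [
      {"name": "lua", "code": "function cb(tag, ts, record) return 1, ts, record end"},
      {"name": "sql", "query": "SELECT message FROM STREAM:tail.0;"}
    ]
  }
}
""")

def extract_snippets(node, keys=("code", "query"), found=None):
    """Walk nested dicts/lists, collecting string values under the given keys."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k in keys and isinstance(v, str):
                found.append((k, v))
            else:
                extract_snippets(v, keys, found)
    elif isinstance(node, list):
        for item in node:
            extract_snippets(item, keys, found)
    return found

snippets = extract_snippets(PIPELINE)  # [("code", "..."), ("query", "...")]
```

Each collected snippet can then be posted to the relevant validation endpoint, keeping the language-specific parsing out of the UI entirely.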
Outstanding Gaps for Fluent Bit
We still need to address several features that Fluent Bit has, specifically:
- Environment variables
- Includes
These issues should be straightforward to overcome, and dynamically pulling the included elements into the UI view can be done. The challenge is: if changes need to go into something that has been included, how do we push them back to the included file, particularly if there are multiple layers of inclusion?
What about Fluentd?
Fluentd configuration isn’t a JSON-based notation, but it is structured. So, to apply the same mechanism, we’ll need to define a schema and a mapping mechanism. The tricky part of the schema is that Fluentd supports nesting plugins, because the way pipelines are defined for routing differs. While JSON Schema can express this with constructs such as anyOf, oneOf, object nesting, and bounded object arrays, the structure will be more complex.
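As an illustrative sketch (not the actual schema), JSON Schema can model that nesting with a recursive reference. For example, a Fluentd `copy` output whose `<store>` children are themselves outputs can be described like this:

```python
# Sketch of a recursive JSON Schema for nested Fluentd outputs, expressed as
# a Python dict. Illustrative only: plugin names and shapes are assumptions.
NESTED_MATCH_SCHEMA = {
    "$defs": {
        "output": {
            "oneOf": [
                # A leaf output plugin, e.g. <match @type stdout>.
                {
                    "type": "object",
                    "properties": {"@type": {"const": "stdout"}},
                    "required": ["@type"],
                },
                # A copy output nesting <store> children, each of which is
                # itself an output -- hence the recursive $ref.
                {
                    "type": "object",
                    "properties": {
                        "@type": {"const": "copy"},
                        "store": {
                            "type": "array",
                            "items": {"$ref": "#/$defs/output"},
                            "minItems": 1,
                        },
                    },
                    "required": ["@type", "store"],
                },
            ]
        }
    },
    "$ref": "#/$defs/output",
}

# An instance this schema would accept: a copy fanning out to one stdout store.
example = {"@type": "copy", "store": [{"@type": "stdout"}]}
```

The recursion via `$ref` is what keeps the schema manageable: each level of nesting reuses the same `output` definition rather than duplicating it.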
The second challenge will be the transformer/renderer, so we don’t introduce issues from having to escape and unescape characters, since JSON Schema is stricter about character use.
Then What?
Well, if we get this going, we’ll probably incorporate the capability into our OpAMP project and maybe create a build that lets the configuration tool run independently. Lastly, perhaps we should look to see if we can make the different layers a little more abstract, so we can plug in editors for other configurations, such as OTel Collectors or the ELK Stack.
As a bonus, perhaps transform the Schema into a quick reference web document?
#AI #artificialIntelligence #configuration #development #ELK #FluentBit #Fluentd #LLM #observability #OpAMP #Technology -
OpAMP with Fluent Bit – Observability and ChatOps
With KubeCon Europe happening this week, it felt like a good moment to break cover on this pet project.
If you are working with Fluent Bit at any scale, one question keeps coming up: how do we consistently control and observe all those edge agents, especially outside a Kubernetes-only world?
This is exactly the problem the OpAMP specification is trying to solve. At its core, OpAMP defines a standard contract between a central server and distributed agents/supervisors, so status, health, commands, and config-related interactions follow one protocol instead of ad-hoc integration per tool.
That is where this project sits. We’re implementing the OpAMP specification to support Fluent Bit (and later Fluentd).
In this implementation, we have:
- a provider (the OpAMP server), and
- a consumer acting as a supervisor to manage Fluent Bit deployments.
Right now, we are focused on Fluent Bit first. That is deliberate: it keeps scope practical while we validate the framework. The same framework is being shaped so it can evolve to support Fluentd as well.
The repository for the implementation can be found at https://github.com/mp3monster/fluent-opamp
Quick summary
The provider/server is the control plane endpoint. It tracks clients, accepts status, queues commands, and returns instructions using OpAMP payloads over HTTP or WebSocket.
The consumer/supervisor handles the local execution and reporting. It launches Fluent Bit, polls local health/status endpoints, sends heartbeat and metadata to the provider, and handles inbound commands (including custom ones). The server and supervisor can be deployed independently, which is important for real-world rollout patterns.
Because they follow the OpAMP protocol model, clients and servers can be interchanged with other OpAMP-compliant implementations (although we’ve not yet tested this aspect of the development).
Together, they give us a manageable, spec-aligned path to coordinating distributed Fluent Bit nodes without hard-coding one-off control logic into every environment.
Deployment options and scripts
There are a few practical ways to get started quickly:
- Deploy just the server/provider using `scripts/run_opamp_server.sh` (or `scripts/run_opamp_server.cmd` on Windows).
- Deploy just the client/supervisor using `scripts/run_supervisor.sh` (or `scripts/run_supervisor.cmd` on Windows).
- Run both components either together in a single environment or independently across different hosts.
The scripts will set up a virtual environment and retrieve the necessary dependencies.
If you want an initial MCP client setup as part of your workflow, there are helper scripts for that too:
- `mcp/configure-codex-fastmcp.sh` and `mcp/configure-codex-fastmcp.ps1`
- `mcp/configure-claude-desktop-fastmcp.sh` and `mcp/configure-claude-desktop-fastmcp.ps1`
Server screenshots
Here is a first view of the server:
The Server Console with a single Agent
The UI is still evolving, but this gives a concrete picture of the provider-side control plane we are discussing.
What the OpAMP server (provider) does
The provider is responsible for the shared view of fleet state and intent.
Today it provides:
- OpAMP transport endpoints (`/v1/opamp`) over HTTP and WebSocket.
- API and UI endpoints to inspect clients and queue actions.
- In-memory command queueing per client.
- Emission of standard command payloads (for example, restart).
- Emission of custom message payloads for custom capabilities.
- Discovery and publication of custom capabilities supported by the server-side command framework.
Operationally, this means we can queue intent once at the server and let the next client poll/connection cycle deliver that action in protocol-native form.
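A minimal sketch of that queueing idea follows; the class and method names are illustrative, not the repository's actual API. The point is simply that intent is recorded once per client and drained on the next poll cycle.

```python
# Sketch: per-client in-memory command queue, drained when the client polls.
from collections import defaultdict, deque

class CommandQueue:
    def __init__(self):
        # One FIFO queue per client instance id.
        self._queues = defaultdict(deque)

    def enqueue(self, client_id, command):
        """Record intent at the server; delivery waits for the next poll."""
        self._queues[client_id].append(command)

    def drain(self, client_id):
        """Return and clear all pending commands for a client (one poll cycle)."""
        q = self._queues[client_id]
        pending = list(q)
        q.clear()
        return pending

queue = CommandQueue()
queue.enqueue("agent-1", {"type": "restart"})
queue.enqueue("agent-1", {"type": "custom", "capability": "chatops"})
delivered = queue.drain("agent-1")  # both commands go out on the next cycle
```

In the real implementation the drained commands would be wrapped in protocol-native OpAMP payloads rather than returned as raw dicts.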
What the supervisor (consumer) does for Fluent Bit
The supervisor is the practical glue between OpAMP and Fluent Bit:
- Starts Fluent Bit as a local child process.
- Parses Fluent Bit config details needed for status polling.
- Polls Fluent Bit local endpoints on a heartbeat loop.
- Builds and sends `AgentToServer` messages (identity, capabilities, health/status context).
- Receives `ServerToAgent` responses and dispatches commands.
- Handles custom capabilities and custom messages through a handler registry.
So for Fluent Bit specifically, the supervisor gives us a way to participate in OpAMP now, even before native in-agent OpAMP support is universal.
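As an illustrative sketch of that polling-plus-reporting loop: the function names and the simplified status shape are assumptions, though `/api/v1/uptime` is one of Fluent Bit's standard monitoring endpoints. The fetcher is injectable so the sketch runs without a live agent.

```python
# Sketch: poll the local Fluent Bit HTTP server and fold the result into a
# simplified AgentToServer-style status dict.
import json
from urllib.request import urlopen

def fetch_health(base_url="http://127.0.0.1:2020", fetch=None):
    """Return Fluent Bit's uptime payload, or None if the agent is unreachable."""
    fetch = fetch or (lambda url: urlopen(url, timeout=2).read())
    try:
        return json.loads(fetch(f"{base_url}/api/v1/uptime"))
    except OSError:
        return None

def build_status(agent_id, health):
    """Assemble a simplified status message for the provider."""
    return {
        "instance_uid": agent_id,
        "healthy": health is not None,
        "detail": health or {},
    }

# Stubbed run, so the sketch works without a live Fluent Bit process:
stub = lambda url: b'{"uptime_sec": 120}'
status = build_status("agent-1", fetch_health(fetch=stub))
```

In the real supervisor this runs on the heartbeat loop, and the status is serialised into a protocol-native `AgentToServer` message rather than a plain dict.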
And to be explicit: this is the current target. Fluentd support is a planned evolution of this same model, not a separate rewrite.
Where ChatOps fits
ChatOps is where this gets interesting for day-2 operations.
In this implementation, ChatOps commands are carried as OpAMP custom messages (custom capability `org.mp3monster.opamp_provider.chatopcommand`). The provider queues the custom command, and the supervisor’s ChatOps handler executes it by calling a local HTTP endpoint on the configured `chat_ops_port`.
That gives us a cleaner control path:
- Chat/user intent can go to the central server/API.
- The server routes to the right node through OpAMP.
- The supervisor performs the local action and can return failure context when local execution fails.
This is a stronger pattern than directly letting chat tooling call every node individually, and it opens the door to better auditability and policy controls around who can trigger what.
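A sketch of that handler-registry dispatch follows. The capability string is quoted from the description above; everything else (registry shape, handler behaviour) is illustrative rather than the project's actual code.

```python
# Sketch: route custom OpAMP messages by capability name through a registry.
HANDLERS = {}

def handler(capability):
    """Decorator that registers a function for a named custom capability."""
    def register(fn):
        HANDLERS[capability] = fn
        return fn
    return register

@handler("org.mp3monster.opamp_provider.chatopcommand")
def handle_chatops(payload):
    # A real handler would call the local HTTP endpoint on chat_ops_port and
    # return failure context if the local execution fails.
    return {"ok": True, "echo": payload.get("command")}

def dispatch(capability, payload):
    """Look up the handler for a capability and run it, or report the gap."""
    fn = HANDLERS.get(capability)
    if fn is None:
        return {"ok": False, "error": f"no handler for {capability}"}
    return fn(payload)

result = dispatch("org.mp3monster.opamp_provider.chatopcommand", {"command": "status"})
```

The registry is what makes the custom-capability path extensible: adding a new ChatOps verb means registering one more handler, not touching the transport.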
Reality check: we are still testing
This is important: we are still actively testing functionality.
Current status is intentionally mixed:
- Core identity, sequencing, capabilities, disconnect handling, and heartbeat/status pathways are in place.
- Some protocol fields are partial, todo, or long-term backlog.
- Custom capabilities/message pathways are implemented as a practical extension point and are still being hardened with test coverage and real-world runs.
So treat this as a working framework with proven pieces, not a finished all-capabilities implementation.
What is coming next (based on `docs/features.md`)
Near-term priorities include:
- stricter header/channel validation,
- heartbeat validation hardening,
- payload validation against declared capabilities,
- server-side duplicate websocket connection control behaviour.
Broader roadmap themes include:
- authentication/security model for APIs and UI,
- persistence in the provider,
- richer UI controls for node/global polling and multi-node config push,
- certificate and signing workflows,
- packaging improvements.
And yes, a key strategic direction is evolving the framework abstraction so it can support Fluentd in due course, not only Fluent Bit. Some feature areas (like package/status richness) make even more sense in that broader collector ecosystem.
Why this matters
OpAMP gives us a standard envelope for control-plane interactions; the server/supervisor split gives us pragmatic deployment flexibility; and ChatOps provides a human-friendly control surface.
Put together, this becomes a useful pattern for managing telemetry agents in real environments where fleets are mixed, rollout velocity matters, and “just redeploy everything” is not always an option.
If you are evaluating this right now, the right mindset is: useful today, promising for tomorrow, and still under active verification as we close feature gaps.
#AI #artificialIntelligence #Cloud #Fluentbit #Fluentd #LLM #observability #OpAMP #Technology -
Fluent Bit Vulnerabilities Expose Cloud Services to Takeover https://www.securityweek.com/fluent-bit-vulnerabilities-expose-cloud-services-to-takeover/ #Vulnerabilities #CloudSecurity #cloudsecurity #vulnerability #FluentBit #cloud
-
Things are moving forward at work too: for #pcidss (4.x!!!!) the awkward questions are behind us. All that remains is to bundle the packages and upload everything, plus the fixed incidents (Icinga checks for the ClamAV process), and then wait. The developers have a bit more to do.
And the new #logsystem is almost complete. #fluentbit will replace #nxlog. #victorialogs will run in parallel with #Graylog for a while, and vmlog's lack of auth capability will be compensated for with #Nginx and #oauthproxy. There is also a nice ticket: a feature request for fluentbit, a parameter for #yaml or classic config. Then fluent config could be rolled out via Graylog 😍
-
#Monitoring #Caddy with #FluentBit and #Prometheus
The most important tool for planning, understanding the impact of changes, and dealing with the consequences of complexity (← bugs) is the ability to understand and measure what is actually happening. Therefore, we need to install some monitoring, so that we can see issues and make plans based on actual data.
https://marctrius.net/monitoring-caddy-with-fluent-bit-and-prometheus/
Please let me know if you have any feedback.
-
During the #SharkBytes session at the #SharkFest conference I had an opportunity to present a lightning talk about my pet project called IDS Lab.
It is a lab infrastructure, deployable as Docker containers, which simulates a small company network. The IDS Lab consists of a webserver with #Wordpress, a #MySQL database, a #Linux desktop with RDP, and the #WireGuard VPN for "remote" workers and for connecting other virtual or physical machines into the lab network.
This part of the infrastructure can be used for attack simulations. There are additional components for playing with logs and detections, too: #Fluentbit, #Suricata, and #OpenObserve as a lightweight SIEM.
In the #SIEM we already have preconfigured dashboards for alerts, netflows, web logs, and logs from Windows machines, if present.
Using the provided setup script, the whole lab can be up and running in five minutes or less. For more info, please check my GitHub repository with the IDS Lab: