This plugin uses td-agent's monitor_agent to monitor the buffer of td-agent.
Set up monitor_agent before installing blackbird-td-agent.
Fluentd monitor_agent
monitor_agent is a fluentd plugin that exposes the buffer status over HTTP. It is installed by default, so there is no need to install an additional gem.
<source>
type monitor_agent
bind 0.0.0.0
port 24220
</source>
If you write the config as above and restart fluentd, monitor_agent will be enabled. The monitor_agent endpoint is http://localhost:24220/api/plugins.json. A GET request to this URL returns the byte size of the buffer and the length of the buffer queue; the blackbird-td-agent plugin parses the response and extracts these values.
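As a rough illustration of the GET-and-parse step, here is a minimal Python sketch. It is not the plugin's actual code: the function names are hypothetical, the URL assumes the config shown above, and the sample response in the usage note is made up to match the shape monitor_agent returns (a top-level "plugins" list).

```python
import json
from urllib.request import urlopen

# Assumed endpoint; matches the bind/port in the <source> config above.
MONITOR_AGENT_URL = "http://localhost:24220/api/plugins.json"


def fetch_plugins_json(url=MONITOR_AGENT_URL, timeout=3):
    """GET the monitor_agent endpoint and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def extract_buffer_stats(body):
    """Return {plugin_id: metrics} for every plugin that reports buffer metrics."""
    stats = {}
    for plugin in json.loads(body).get("plugins", []):
        # Plugins without a buffer (e.g. inputs) do not report this key.
        if "buffer_queue_length" not in plugin:
            continue
        stats[plugin["plugin_id"]] = {
            "buffer_queue_length": plugin["buffer_queue_length"],
            "buffer_total_queued_size": plugin.get("buffer_total_queued_size"),
        }
    return stats
```

Against a running fluentd, `extract_buffer_stats(fetch_plugins_json())` would return one entry per buffered plugin. The parsing function takes the body as a string so it can also be tried on a saved response without a network connection.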
Because of how monitor_agent works, if there are multiple output plugins, the JSON contains one entry per plugin. So I decided to use Zabbix's Low Level Discovery to add and remove items dynamically.
If a plugin is an output plugin and buffer_queue_length exists in its JSON, the target items are created on the host linked to the Zabbix template. The plugin_id is used in the item name. (Each plugin is assigned a unique plugin_id, which is output as JSON in the form {"plugin_id": "object:XXXXXXXXXX"}.)
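The discovery rule described above can be sketched as follows. This is only an illustration under assumptions: the function name and the macro name {#PLUGIN_ID} are hypothetical (the macro actually used by the template may differ), and the payload uses the classic Zabbix discovery format of a "data" list of macro-to-value objects.

```python
import json


def build_lld_payload(plugins, macro="{#PLUGIN_ID}"):
    """Build a Zabbix Low Level Discovery payload from monitor_agent plugin entries.

    Only output plugins that report buffer_queue_length are discovered.
    The macro name is a placeholder chosen for this sketch.
    """
    targets = [
        {macro: plugin["plugin_id"]}
        for plugin in plugins
        if plugin.get("output_plugin") and "buffer_queue_length" in plugin
    ]
    return json.dumps({"data": targets})


# Made-up sample entries in the shape monitor_agent returns.
sample_plugins = [
    {"plugin_id": "object:aaa", "type": "forward", "output_plugin": False},
    {"plugin_id": "object:bbb", "type": "forward", "output_plugin": True,
     "buffer_queue_length": 0, "buffer_total_queued_size": 0, "retry_count": 0},
    {"plugin_id": "object:ccc", "type": "stdout", "output_plugin": True},
]
lld_payload = build_lld_payload(sample_plugins)
```

Here only object:bbb is discovered: object:aaa is not an output plugin, and object:ccc is an output plugin without a buffer, so it has no buffer_queue_length.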
Items(Normal)
You can get the following items as normal items
Low Level Discovery Items(Dynamic)
Items that change dynamically with Low Level Discovery are as follows
current queue length ÷ buffer_queue_limit specified in config
buffer_total_queued_size: from http://MONITOR_AGENT_HOST:24220/api/plugins.json
buffer_queue_length: from http://MONITOR_AGENT_HOST:24220/api/plugins.json
retry_count: from http://MONITOR_AGENT_HOST:24220/api/plugins.json
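The first item above is a ratio rather than a raw value, so a small worked example may help. The sketch below assumes that buffer_queue_limit is set explicitly in the fluentd config and therefore shows up (as a string) under the plugin's "config" key in plugins.json; the function name is hypothetical, and whether the plugin reports a fraction or a percentage is not stated here, so a plain fraction is returned.

```python
def queue_usage(plugin):
    """Return buffer_queue_length / buffer_queue_limit, or None if either is unavailable."""
    length = plugin.get("buffer_queue_length")
    # Assumption: the limit is only present when it is written in the fluentd config.
    limit = plugin.get("config", {}).get("buffer_queue_limit")
    if length is None or limit is None:
        return None
    limit = int(limit)  # config values are reported as strings
    if limit <= 0:
        return None
    return length / limit


# Made-up sample entry: 16 queued chunks out of a limit of 64.
sample_plugin = {
    "plugin_id": "object:bbb",
    "output_plugin": True,
    "buffer_queue_length": 16,
    "retry_count": 0,
    "config": {"type": "forward", "buffer_queue_limit": "64"},
}
usage = queue_usage(sample_plugin)  # 16 / 64 = 0.25
```

Returning None when the limit is missing keeps an unconfigured plugin from being reported as 0% full.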
Triggers
The following is implemented as a trigger
retry_count is more than a certain number of times
Graphs
Each of the values is drawn as a single graph.
When fluentd's queue is piling up dangerously, or when it cannot send because the destination is wrong, detection usually depends on combined monitoring with other hosts or middleware, and log monitoring is the only way to detect retry_count in the first place. Monitoring fluentd is subtly difficult, and because it is difficult, it tends to get put off. With monitor_agent, you can monitor fluentd itself.