Analyzing and visualizing csv logs with Excel Elastic Stack (docker-compose) --Dividing PipelineFilter into 3 files [input / filter / output] to improve maintainability and reusability

Introduction

Thanks for reading! I'm an engineer in charge of the product inspection process in the production engineering department. This article is a continuation of Analyzing and visualizing csv logs with Excel Elastic Stack (docker-compose) -- What is Elastic Stack?.

Target audience

This article is intended for those who are new to Elastic Stack and who are thinking about trying it out.

Content of this article

I tried the approach described in the official blog post [How to create maintainable and reusable Logstash pipelines](https://www.elastic.co/jp/blog/how-to-create-maintainable-and-reusable-logstash-pipelines). The pipeline configuration is split across three folders [input / filter / output] so that it stays maintainable even as the number of pipelines grows.


I have put the full set of configuration files in GitLab, so please refer to it. Repository -> elastic-stack

Dividing method

Divide logstash.conf into three parts: input, filter, and output.

logstash/pipeline/logstash.conf


input {
  beats {
    port => "5044"
  }
}

filter {
  csv {
    columns => ["Step","TestName","Value1","Judge"]
    separator => ","
  }
}

output {
  elasticsearch {
    hosts    => [ 'elasticsearch' ]
    index    => "%{[@metadata][beat]}-csv-%{+YYYY}-%{+MM}"
  }
}
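For reference, the csv filter above splits each comma-separated line and maps the fields onto the configured column names. Roughly, it behaves like this Python sketch (the column names come from the config; the sample log line is invented for illustration):

```python
# Roughly what the Logstash csv filter does: split each line on the
# separator and zip the values with the configured column names.
columns = ["Step", "TestName", "Value1", "Judge"]

line = "1,Voltage,3.3,OK"  # hypothetical csv log line
event = dict(zip(columns, line.split(",")))
print(event)
```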

Split it into the following files with the suffixes _in, _filter, and _out, and give them the extension .cfg.

logstash/pipeline/input/logstash_in.cfg


input {
  beats {
    port => "5044"
  }
}

logstash/pipeline/filter/logstash_filter.cfg


filter {
  csv {
    columns => ["Step","TestName","Value1","Judge"]
    separator => ","
  }
}

logstash/pipeline/output/logstash_out.cfg


output {
  elasticsearch {
    hosts    => [ 'elasticsearch' ]
    index    => "%{[@metadata][beat]}-csv-%{+YYYY}-%{+MM}"
  }
}
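Note that Logstash simply concatenates every config file matched by path.config into a single pipeline, so the three fragments above behave exactly like the original single logstash.conf. A minimal Python sketch of that concatenation, with the file contents inlined as strings for illustration:

```python
# Logstash concatenates all config files matched by path.config into
# one pipeline definition. Simulate that with the three fragments
# above, inlined here as strings.
fragments = {
    "input/logstash_in.cfg":
        'input {\n  beats {\n    port => "5044"\n  }\n}\n',
    "filter/logstash_filter.cfg":
        'filter {\n  csv {\n    columns => '
        '["Step","TestName","Value1","Judge"]\n'
        '    separator => ","\n  }\n}\n',
    "output/logstash_out.cfg":
        "output {\n  elasticsearch {\n"
        "    hosts    => [ 'elasticsearch' ]\n"
        '    index    => "%{[@metadata][beat]}-csv-%{+YYYY}-%{+MM}"\n'
        "  }\n}\n",
}

# Join in the order the glob pattern lists them (input, filter, output).
order = ["input/logstash_in.cfg",
         "filter/logstash_filter.cfg",
         "output/logstash_out.cfg"]
combined = "\n".join(fragments[name] for name in order)
print(combined)
```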

docker-compose settings

Mount the entire pipeline folder.

docker-compose.yml


logstash01:
    build: ./logstash
    container_name: logstash01
    links:
      - es01:elasticsearch
    volumes:
      - ./logstash/config/logstash.yml:/usr/share/logstash/config/logstash.yml
      - ./logstash/config/jvm.options:/usr/share/logstash/config/jvm.options
      - ./logstash/config/log4j2.properties:/usr/share/logstash/config/log4j2.properties
      - ./logstash/config/pipelines.yml:/usr/share/logstash/config/pipelines.yml
      - ./logstash/pipeline/:/usr/share/logstash/pipeline/
      - ./logstash/extra_patterns/date_jp:/opt/logstash/extra_patterns
    networks:
      - esnet

pipeline settings

Set the three files in path.config using the glob brace expansion {}. Since the files are split across the [input / filter / output] directories, write the pattern as follows.

- pipeline.id: filebeat-processing
  path.config: "/usr/share/logstash/pipeline/{input/logstash_in,filter/logstash_filter,output/logstash_out}.cfg"
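As a sanity check, the brace pattern above expands to exactly the three .cfg paths created earlier. Python's glob module does not expand {} braces itself, but a small helper (a hedged sketch, handling just a single brace group) makes the expansion explicit:

```python
import re

def expand_braces(pattern):
    """Expand one {a,b,c} brace group in a path.config-style pattern."""
    m = re.search(r"\{([^}]*)\}", pattern)
    if not m:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    return [head + alt + tail for alt in m.group(1).split(",")]

paths = expand_braces(
    "/usr/share/logstash/pipeline/"
    "{input/logstash_in,filter/logstash_filter,output/logstash_out}.cfg"
)
for p in paths:
    print(p)
```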

Finally

In situations where you only want to change the filter, this split should pay off in reusability, although I have not yet benefited from it myself.
