Analyzing and visualizing csv logs with Elastic Stack (docker-compose): What is Elastic Stack?

Introduction

Hello! I am an engineer in charge of the product inspection process in a production engineering department. For readers from the software industry, who may be unfamiliar with this work: besides designing the hardware and the inspection software of the equipment that inspects products, the main task is to analyze inspection logs so that products can be produced stably, and to take measures against problems. A stopped inspection process directly reduces productivity, so problems must be solved early.

Until now, solving a problem meant opening the inspection log (csv) in Excel and filtering for the items judged NG, or using Excel VBA to aggregate weekly data and watch for warning signs. Needless to say, this takes a long time and is far from an early solution.

Therefore, we aim to improve work efficiency and productivity by using Elastic Stack.

Target audience

This article is intended for those who are new to Elastic Stack and who are thinking about trying it out.

Content of this article

This article covers only the basics: what Elastic Stack is.

In future articles I would like to introduce how to write the configuration files required to launch the stack with docker-compose, how to parse dates such as Japan time, how to handle csv files, and so on.

First, what I actually did

I extracted the date and the inspection results from the inspection log (sample.csv) and attached the date to the inspection result on each row, so that the results can be graphed as time series data.

`sample.csv`


Dummy,1
Date,2020/10/18,20:19:18　　<---date
ID,123456
Step,TestName,Value1,Judge　<---Below, the test results
10,Test1,130,OK
20,Test2,1321,OK
30,Test3,50,NG
40,Test4,13432,OK
55,Test5,15,NG
70,Test6,1,OK
100,Test7,1734,OK
120,Test8,54,OK
End　　　　　　　　　　　　　　　<---End of the test results for one unit
Dummy,2
Date,2020/10/19,12:30:50
ID,123457
Step,TestName,Value1,Judge
10,Test1,140,OK
20,Test2,1300,OK
30,Test3,50,NG
40,Test4,13431,OK
55,Test5,20,NG
70,Test6,1,OK
100,Test7,1733,OK
120,Test8,56,OK
End

The graphs themselves are only rough, but I confirmed that the necessary data was extracted correctly.
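The extraction described above can be sketched in plain Python. This is only an illustration of the idea, not the Logstash configuration used for this article: the function name `parse_inspection_log` and the output field names are hypothetical, the timestamps are assumed to be Japan time (UTC+9), and the `<---` annotations have been removed from the sample data.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: the log timestamps are Japan time (UTC+9).
JST = timezone(timedelta(hours=9))


def parse_inspection_log(text):
    """Turn the block-structured log into one record per test row."""
    records = []
    timestamp = None
    unit_id = None
    in_results = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        fields = line.split(",")
        if fields[0] == "Date":
            # "Date,2020/10/18,20:19:18" -> timezone-aware datetime
            timestamp = datetime.strptime(
                fields[1] + " " + fields[2], "%Y/%m/%d %H:%M:%S"
            ).replace(tzinfo=JST)
        elif fields[0] == "ID":
            unit_id = fields[1]
        elif fields[0] == "Step":
            in_results = True   # header row; test rows follow
        elif fields[0] == "End":
            in_results = False  # end of the results for one unit
        elif in_results:
            step, test_name, value, judge = fields
            records.append({
                "timestamp": timestamp.isoformat(),
                "id": unit_id,
                "step": int(step),
                "test_name": test_name,
                "value": int(value),
                "judge": judge,
            })
    return records


# Shortened copy of sample.csv, without the "<---" annotations.
SAMPLE = """\
Dummy,1
Date,2020/10/18,20:19:18
ID,123456
Step,TestName,Value1,Judge
10,Test1,130,OK
30,Test3,50,NG
End
Dummy,2
Date,2020/10/19,12:30:50
ID,123457
Step,TestName,Value1,Judge
10,Test1,140,OK
End
"""

records = parse_inspection_log(SAMPLE)
for record in records:
    print(json.dumps(record))
```

Each test row now carries the timestamp and ID of the block it came from, which is what makes it usable as time series data.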

What is Elastic Stack?

First, there is what is called the ELK Stack. As the official site explains, "ELK" is an acronym formed from the initials of Elasticsearch, Logstash, and Kibana. These are all open source products provided by Elastic, and they can be combined according to the application to enable log analysis and visualization. To allow for more flexibility, Beats was added to the ELK Stack, and the combination is called Elastic Stack.

Elasticsearch

Elasticsearch is a distributed search and analysis engine. It stores documents and adds searchable references to them in an index. It is used, for example, for full-text search that finds blog articles by the words they contain. The official site stresses its speed with the phrase "Elasticsearch is fast. It's fast anyway."

Logstash

Logstash provides a mechanism for processing input data. It comes with many processing filters, which makes it possible to parse a wide variety of logs. For example, the Grok filter structures unstructured data, and the csv filter splits csv data into individual fields. The full list of filters is in the official documentation.
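To illustrate what "splitting csv data into individual fields" means, here is a minimal Python sketch. It uses Python's standard `csv` module, not Logstash itself, and it is not how the csv filter is implemented; the helper `split_csv_line` is hypothetical, and the column names are simply taken from the header row of sample.csv.

```python
import csv
import io

# Column names taken from the header row of sample.csv.
COLUMNS = ["Step", "TestName", "Value1", "Judge"]


def split_csv_line(line, columns=COLUMNS):
    """Split one csv line and pair each value with its column name."""
    values = next(csv.reader(io.StringIO(line)))
    return dict(zip(columns, values))


event = split_csv_line("30,Test3,50,NG")
print(event)
```

After this step each value can be addressed by name (for example `event["Judge"]`), which is what later filtering and graphing rely on. Note that every value is still a string at this point.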

Logstash also supports many types of inputs and outputs. It can receive web logs, files such as csv, system logs, and more; the supported inputs are listed in the official documentation. Elasticsearch is of course available as an output, but you can also choose email, Slack, files, and others; the supported outputs are listed in the official documentation as well.

Logstash seems to be often compared to Fluentd, a log collection tool.

Kibana

Kibana is a tool for graphing the data stored in Elasticsearch. No complicated operations are required, and graphs can be created intuitively. You can also create dashboards: once the graphs you need are added to a dashboard, you can check them at any time without hassle.

Another important feature is alerting. You can set a threshold and send a notification by email or Slack when it is exceeded. Integration with PagerDuty is also supported, which enables incident management.

Kibana seems to be often compared to Grafana.

Beats

Filebeat, which I used this time, belongs to a family called Beats, described as lightweight data shippers. They forward data to Elasticsearch or Logstash. Besides Filebeat, there are Metricbeat, Packetbeat, Winlogbeat, Auditbeat, Heartbeat, and Functionbeat. Filebeat, as the name implies, forwards log files.

Specific flow using Elastic Stack

Filebeat transfers each csv file added to the folder to Logstash, Logstash extracts the date and the inspection results, the processed data is stored in Elasticsearch, and Kibana graphs it.
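To make the last step of this flow concrete: a time series graph of inspection results boils down to an aggregation such as "number of NG results per day". The Python sketch below shows that idea only. It is not how Kibana or Elasticsearch compute it, the function `count_ng_per_day` and the field names are hypothetical, and the records are hand-copied from the NG and OK rows of sample.csv.

```python
from collections import Counter

# Records shaped like the output of the extraction step (values from sample.csv).
records = [
    {"date": "2020-10-18", "test_name": "Test1", "judge": "OK"},
    {"date": "2020-10-18", "test_name": "Test3", "judge": "NG"},
    {"date": "2020-10-18", "test_name": "Test5", "judge": "NG"},
    {"date": "2020-10-19", "test_name": "Test1", "judge": "OK"},
    {"date": "2020-10-19", "test_name": "Test3", "judge": "NG"},
    {"date": "2020-10-19", "test_name": "Test5", "judge": "NG"},
]


def count_ng_per_day(rows):
    """Count the rows judged NG, grouped by date."""
    counts = Counter()
    for row in rows:
        if row["judge"] == "NG":
            counts[row["date"]] += 1
    return dict(counts)


ng_per_day = count_ng_per_day(records)
print(ng_per_day)
```

Plotting these counts against the date gives the kind of trend view that previously required weekly aggregation with Excel VBA.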

Finally

This was my first attempt at visualization with Elastic Stack. I introduced only the functions that I actually used, and there seem to be many others that I do not know yet, so I would like to keep experimenting. In future articles I would like to introduce the configuration files required to launch the stack with docker-compose, how to parse dates, how to handle the csv file, and so on.