Hello! I am an engineer in charge of the product inspection process in a production engineering department. This is a continuation of "De-Excel: Analyzing and visualizing csv logs with Elastic Stack (docker-compose) - What is Elastic Stack".
This article is intended for those who are new to Elastic Stack and who are thinking about trying it out.
I will explain how to take the date on the first line of each block in csv data like the following and use it as the timestamp of every line in that block.
Date,2020/10/28,20:19:18
10,Test1,130.1,OK
20,Test2,1321,OK
30,Test3,50.2,OK
End
Date,2020/10/29,10:30:50
10,Test1,140.4,OK
20,Test2,1300,OK
30,Test3,50.0,OK
End
Date,2020/10/29,11:40:50
10,Test1,141.1,OK
20,Test2,1310,OK
30,Test3,55.8,NG
End
I have put the full set of configuration files on GitLab, so please refer to it. Repository: elastic-stack
I examined two approaches: Logstash's aggregate filter, and filebeat's multiline combined with Logstash's split filter.
In conclusion, the first method, the aggregate filter, did not work for this data. The aggregate filter lets you share information between multiple events, so in some cases this approach is possible.
In the official example, the numbers on the lines between TASK_START and TASK_END are added up, and the total is set in a field of the TASK_END event.
INFO - 12345 - TASK_START - start
INFO - 12345 - SQL - sqlQuery1 - 12
INFO - 12345 - SQL - sqlQuery2 - 34
INFO - 12345 - TASK_END - end
In that example, data can be shared between events because the aggregate filter's task_id is set to the value "12345", which appears on every line. My data has nothing equivalent to a task_id, so the filter could not be used.
The second method, filebeat's multiline, is the one that worked. With multiline, several lines of text can be combined into a single event, joined by \n. The three-line rule below does this.
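To make the role of task_id concrete, here is a minimal Python sketch of the same idea. It is only an illustration of the concept, not the aggregate filter's implementation; the function name `aggregate_durations` is hypothetical. State is kept per task id, which is exactly the key that my csv data lacks.

```python
from collections import defaultdict

# The four lines from the official example quoted above
LOG_LINES = [
    "INFO - 12345 - TASK_START - start",
    "INFO - 12345 - SQL - sqlQuery1 - 12",
    "INFO - 12345 - SQL - sqlQuery2 - 34",
    "INFO - 12345 - TASK_END - end",
]


def aggregate_durations(lines):
    """Sum the SQL numbers per task id and report the total at TASK_END."""
    totals = defaultdict(int)  # shared state, keyed by task id
    results = {}
    for line in lines:
        parts = [p.strip() for p in line.split(" - ")]
        task_id, kind = parts[1], parts[2]
        if kind == "TASK_START":
            totals[task_id] = 0
        elif kind == "SQL":
            totals[task_id] += int(parts[4])
        elif kind == "TASK_END":
            # The total is attached to the closing event of the task
            results[task_id] = totals.pop(task_id, 0)
    return results


print(aggregate_durations(LOG_LINES))  # {'12345': 46}
```

Without a value such as "12345" on every line, there is nothing to use as the dictionary key, which is why this approach does not fit the Date ... End blocks.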
multiline.pattern: (End)
multiline.negate: true
multiline.match: before
@datake913's article "Multiline setting summary that handles multiple lines with Filebeat" has a table that was easy to understand, so I will quote it: "Consecutive lines that do not match the pattern (End) are added before the next matching line." Each block is therefore combined into one line, as shown below.
Date,2020/10/28,20:19:18\n10,Test1,130.1,OK\n20,Test2,1321,OK\n30,Test3,50.2,OK\nEnd
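The grouping rule can be checked with a small Python sketch. It only mimics the behaviour of negate: true with match: before for this data and is not filebeat's implementation; the function name `combine_multiline` is hypothetical, and flushing leftover lines at the end is a simplifying assumption.

```python
import re

SAMPLE = """Date,2020/10/28,20:19:18
10,Test1,130.1,OK
20,Test2,1321,OK
30,Test3,50.2,OK
End
Date,2020/10/29,10:30:50
10,Test1,140.4,OK
20,Test2,1300,OK
30,Test3,50.0,OK
End"""


def combine_multiline(lines, pattern=r"(End)"):
    """Buffer non-matching lines and prepend them to the next matching line."""
    regex = re.compile(pattern)
    events, buffer = [], []
    for line in lines:
        buffer.append(line)
        if regex.search(line):
            # A matching line closes the current event
            events.append("\n".join(buffer))
            buffer = []
    if buffer:
        # Assumption: leftover lines without a closing End become one event
        events.append("\n".join(buffer))
    return events


for event in combine_multiline(SAMPLE.splitlines()):
    print(repr(event))
```

Each printed event starts with its Date line and ends with End, so the date travels together with the measurement lines it belongs to.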
To parse the timestamp, see the earlier article in this series, "De-Excel: Analyzing and visualizing csv logs with Elastic Stack (docker-compose) - Parse "year/month/day,hour:minute:second" in multiline with the grok filter and treat it as Japan time".
If you apply the csv filter to the multiline event as it is, it only parses up to the first \n, that is, Date,2020/10/28,20:19:18. By using the split filter, the multiline event can be broken up again into multiple events.
As the final step, lines that do not start with a number are removed with the drop filter, and the mutate filter converts the fields from strings to numeric types.
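As a rough illustration of what the grok and date filters do together, the following Python sketch pulls the date out of a combined message and interprets it as Japan time. The regular expression is my assumption about what a pattern like TIMESTAMP_JP matches, since its definition is not shown here, and the function name `parse_block_timestamp` is hypothetical. A fixed +09:00 offset is used because Japan has no daylight saving time.

```python
import re
from datetime import datetime, timedelta, timezone

# Japan Standard Time as a fixed UTC+9 offset
JST = timezone(timedelta(hours=9), "JST")

# Assumed shape of the timestamp: yyyy/MM/dd,HH:mm:ss
TIMESTAMP_RE = re.compile(r"\d{4}/\d{2}/\d{2},\d{2}:\d{2}:\d{2}")


def parse_block_timestamp(message):
    """Return the first timestamp in the message as a UTC datetime, or None."""
    match = TIMESTAMP_RE.search(message)
    if match is None:
        return None
    local = datetime.strptime(match.group(0), "%Y/%m/%d,%H:%M:%S")
    # Attach Japan time, then convert to UTC
    return local.replace(tzinfo=JST).astimezone(timezone.utc)


message = "Date,2020/10/28,20:19:18\n10,Test1,130.1,OK\nEnd"
print(parse_block_timestamp(message).isoformat())  # 2020-10-28T11:19:18+00:00
```

The nine-hour shift in the output is the reason the timezone has to be stated explicitly: without it, 20:19:18 would be read as UTC.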
logstash.conf
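filter {
  # Extract the "yyyy/MM/dd,HH:mm:ss" text from the combined multiline message
  grok {
    patterns_dir => ["/opt/logstash/extra_patterns"]
    match => { "message" => "%{TIMESTAMP_JP:read_timestamp}" }
  }
  # Interpret it as Japan time and use it as the event timestamp
  date {
    match => ["read_timestamp", "yyyy/MM/dd,HH:mm:ss"]
    timezone => "Asia/Tokyo"
    target => "@timestamp"
  }
  # Break the message back into one event per line; each keeps @timestamp
  split {}
  csv {
    columns => ["Step","TestName","Value1","Judge"]
    separator => ","
  }
  # Drop the Date and End lines, whose first column is not a number
  if [Step] !~ /\d+/ {
    drop {}
  }
  # Convert the parsed strings to numeric types
  mutate {
    convert => {
      "Step" => "integer"
      "Value1" => "float"
    }
  }
}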
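To see what the split, csv, drop, and mutate steps are expected to produce, here is a minimal Python sketch that mimics them on one combined message. It is not Logstash code; the function name `split_and_parse` and the dictionary layout are hypothetical, and the timestamp is passed in as an already parsed string.

```python
import csv
import re

COLUMNS = ["Step", "TestName", "Value1", "Judge"]


def split_and_parse(message, timestamp):
    """Turn one multiline message into one record per measurement line."""
    records = []
    for line in message.split("\n"):          # split: one event per line
        fields = next(csv.reader([line]), [])
        record = dict(zip(COLUMNS, fields))   # csv: name the columns
        # drop: same condition as `[Step] !~ /\d+/`
        if not re.search(r"\d+", record.get("Step", "")):
            continue
        record["Step"] = int(record["Step"])        # mutate: integer
        record["Value1"] = float(record["Value1"])  # mutate: float
        record["@timestamp"] = timestamp      # every line shares the block's date
        records.append(record)
    return records


message = (
    "Date,2020/10/28,20:19:18\n10,Test1,130.1,OK\n"
    "20,Test2,1321,OK\n30,Test3,50.2,OK\nEnd"
)
for row in split_and_parse(message, "2020-10-28T11:19:18Z"):
    print(row)
```

The Date and End lines disappear, and the three remaining records carry the same timestamp, which is the result the configuration above is aiming for.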
You can now assign the same timestamp to multiple lines using multiline and split. In future articles, I would like to introduce countermeasures for the following heap errors.
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid1.hprof ...
Heap dump file created [3178685347 bytes in 34.188 secs]
warning: thread "[main]>worker11" terminated with exception (report_on_exception is true):
warning: thread "[main]>worker4" terminated with exception (report_on_exception is true):
java.lang.OutOfMemoryError: Java heap space