I'm not sure what the best practice is here, but this is how I set it up for now, so I'll introduce it.
Create the App
Create an App named seika that uses the CSV from the previous post.
The directory structure looks like this:
tree
├── bin
│   ├── README
│   ├── dl.py
│   └── seika_dl.sh
├── data
│   ├── seika_20210113.csv
│   └── seika_20210114.csv
├── default
│   ├── app.conf
│   └── data
│       └── ui
│           ├── nav
│           │   └── default.xml
│           └── views
│               └── README
├── local
│   ├── app.conf
│   ├── inputs.conf
│   └── props.conf
└── metadata
    ├── default.meta
    └── local.meta
inputs.conf
inputs.conf
[script://$SPLUNK_HOME/etc/apps/seika/bin/seika_dl.sh]
disabled = false
index = main
interval = 0 20 * * *
sourcetype = csv
[monitor:///Applications/Splunk/etc/apps/seika/data]
disabled = false
sourcetype = seika_csv
crcSalt = /Applications/Splunk/etc/apps/seika/data
The interval for seika_dl.sh is a cron expression, so it runs at 8 p.m. every day.
While verifying, I changed it to interval = 0 * * * * so that it ran at minute 0 of every hour.
The sourcetype in the script stanza is not actually used, so any value will do.
For the sourcetype of the monitor stanza, use the one created for this post (seika_csv).
Adding crcSalt forces the files to be read when they are created.
seika_dl.sh
The two scripts under bin/ were made executable with chmod 755.
seika_dl.sh
#!/bin/sh
# Move into the data folder, then run the download script with the Anaconda Python
cd $SPLUNK_HOME/etc/apps/seika/data
/opt/anaconda3/bin/python ../bin/dl.py
It just moves to the data storage folder and starts the download script.
If you assume /usr/bin/env python will do, Splunk's bundled Python gets picked up instead, so specify the interpreter with its full path.
dl.py
This is just the script from the previous post with a shebang added.
I wish pandas were bundled with Splunk's Python ~ :cry:
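The actual dl.py is in the previous post, so it isn't reproduced here, but roughly it does something like the sketch below. The download URL and the exact column handling are my assumptions; only the general idea (fetch the CSV with pandas, put date at the far left, write a dated file into data/) follows what the App expects.
#!/opt/anaconda3/bin/python
# Minimal sketch of dl.py (not the actual script from the previous post).
# CSV_URL is a placeholder; the real source URL is in the previous article.
import datetime

import pandas as pd

CSV_URL = "https://example.com/seika/latest.csv"  # hypothetical source URL


def main():
    # Fetch the published CSV straight into a DataFrame
    df = pd.read_csv(CSV_URL)

    # Put the date column at the far left so Splunk finds the timestamp
    # at the start of each line (see props.conf below)
    cols = ["date"] + [c for c in df.columns if c != "date"]
    df = df[cols]

    # seika_dl.sh has already cd'ed into the data folder,
    # so write the dated file into the current directory
    today = datetime.date.today().strftime("%Y%m%d")
    df.to_csv(f"seika_{today}.csv", index=False)


if __name__ == "__main__":
    main()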
props.conf
props.conf
[seika_csv]
INDEXED_EXTRACTIONS = csv
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = date
TIME_FORMAT = %Y-%m-%d
category = Structured
disabled = false
pulldown_type = true
The settings stay simple because the data is already cleaned up on the Python side to keep it easy to read.
If you put the date column at the far left on the Python side, you probably don't even need the timestamp settings.
SPL
miyagi.spl
index=main sourcetype=seika_csv area="Miyagi" category="Vegetables"
| stats sum(*_price) as *_price by date, product_name
- Trellis can be displayed neatly like this by setting the arguments of stats ... by so that the 1st becomes the X axis and the 2nd becomes the trellis category.
- Since ['high_price', 'middle_price', 'low_price'] are all graphed, they are abbreviated as sum(*_price) as *_price.
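For reference (my own expansion, not from the original post), the wildcard form should be equivalent to writing the three price fields out explicitly:
index=main sourcetype=seika_csv area="Miyagi" category="Vegetables"
| stats sum(high_price) as high_price sum(middle_price) as middle_price sum(low_price) as low_price by date, product_name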
It's a bit dull with only two days of data, but I got things to a point where the data can be used in all sorts of ways as it accumulates.
For full-scale operation, I think it would be better to consider an approach that does not store the data itself locally.