I'm not sure what the best practice is here, but this is how I set it up for now, so I'll introduce it.
Create the App
Create an App named seika that uses the CSV from the previous post.
The directory structure looks like this:
tree
├── bin
│   ├── README
│   ├── dl.py
│   └── seika_dl.sh
├── data
│   ├── seika_20210113.csv
│   └── seika_20210114.csv
├── default
│   ├── app.conf
│   └── data
│       └── ui
│           ├── nav
│           │   └── default.xml
│           └── views
│               └── README
├── local
│   ├── app.conf
│   ├── inputs.conf
│   └── props.conf
└── metadata
    ├── default.meta
    └── local.meta
inputs.conf
inputs.conf
[script://$SPLUNK_HOME/etc/apps/seika/bin/seika_dl.sh]
disabled = false
index = main
interval = 0 20 * * *
sourcetype = csv
[monitor:///Applications/Splunk/etc/apps/seika/data]
disabled = false
sourcetype = seika_csv
crcSalt = /Applications/Splunk/etc/apps/seika/data
The interval for seika_dl.sh is a cron expression, so it runs at 8 p.m. every day.
While verifying, I changed it to interval = 0 * * * * so that it ran at minute 0 of every hour.
The sourcetype in the script stanza is not actually used, so any value will do.
For the sourcetype of the monitor stanza, use the one created for this post (seika_csv).
Adding crcSalt forces the files to be read when they are created.
seika_dl.sh
The two scripts under bin/ were made executable with chmod 755.
seika_dl.sh
#!/bin/sh
# Move into the data folder, then run the download script with the Anaconda Python
cd $SPLUNK_HOME/etc/apps/seika/data
/opt/anaconda3/bin/python ../bin/dl.py
It just moves to the data storage folder and starts the download script.
If you assume /usr/bin/env python will do, Splunk's bundled Python gets picked up instead, so specify the interpreter with its full path.
dl.py
This is just the script from the previous post with a shebang added.
I wish pandas were bundled with Splunk's Python ~ :cry:
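The actual dl.py is in the previous post, so it isn't reproduced here, but roughly it does something like the sketch below. The download URL and the exact column handling are my assumptions; only the general idea (fetch the CSV with pandas, put date at the far left, write a dated file into data/) follows what the App expects.
#!/opt/anaconda3/bin/python
# Minimal sketch of dl.py (not the actual script from the previous post).
# CSV_URL is a placeholder; the real source URL is in the previous article.
import datetime

import pandas as pd

CSV_URL = "https://example.com/seika/latest.csv"  # hypothetical source URL


def main():
    # Fetch the published CSV straight into a DataFrame
    df = pd.read_csv(CSV_URL)

    # Put the date column at the far left so Splunk finds the timestamp
    # at the start of each line (see props.conf below)
    cols = ["date"] + [c for c in df.columns if c != "date"]
    df = df[cols]

    # seika_dl.sh has already cd'ed into the data folder,
    # so write the dated file into the current directory
    today = datetime.date.today().strftime("%Y%m%d")
    df.to_csv(f"seika_{today}.csv", index=False)


if __name__ == "__main__":
    main()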
props.conf
props.conf
[seika_csv]
INDEXED_EXTRACTIONS = csv
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = date
TIME_FORMAT = %Y-%m-%d
category = Structured
disabled = false
pulldown_type = true
The settings stay simple because the data is already cleaned up on the Python side to keep it easy to read.
If you put the date column at the far left on the Python side, you probably don't even need the timestamp settings.
SPL
miyagi.spl
index=main sourcetype=seika_csv area="Miyagi" category="Vegetables"
| stats sum(*_price) as *_price by date, product_name
- Trellis can be displayed neatly like this by setting the arguments of stats ... by so that the 1st becomes the X axis and the 2nd becomes the trellis category.
- Since ['high_price', 'middle_price', 'low_price'] are all graphed, they are abbreviated as sum(*_price) as *_price.
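For reference (my own expansion, not from the original post), the wildcard form should be equivalent to writing the three price fields out explicitly:
index=main sourcetype=seika_csv area="Miyagi" category="Vegetables"
| stats sum(high_price) as high_price sum(middle_price) as middle_price sum(low_price) as low_price by date, product_name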
It's a bit dull with only two days of data, but I got things to a point where the data can be used in all sorts of ways as it accumulates.
For full-scale operation, I think it would be better to consider an approach that does not store the data itself locally.