2020/10/11
Aim for:
- Build with Docker
- Elasticsearch: start with 3 nodes
- Logstash: Twitter data acquisition
- Kibana: Twitter data visualization
The final source is available below.
https://github.com/sugasaki/elasticsearch-multinode-docker
Follow the steps in the official guide:
Install Elasticsearch with Docker | Elasticsearch Reference [7.9] | Elastic
Make sure the Docker Engine is allotted at least 4GiB of memory. In Docker Desktop, you configure resource usage on the Advanced tab in Preferences (macOS) or Settings (Windows).
If the allocated memory is too low, the nodes may fail to start.
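You can check how much memory the Docker VM currently has with `docker info`. A small sketch, converting the reported byte count to GiB (the 4GiB figure below is a sample value, not taken from a real machine):

```shell
# Query the VM's total memory with:  docker info --format '{{.MemTotal}}'
mem_bytes=4294967296   # sample value; substitute the output of the docker info command above
awk -v b="$mem_bytes" 'BEGIN { printf "%.1f GiB\n", b / (1024 ^ 3) }'
```

If this prints less than 4.0 GiB, raise the limit in Docker Desktop before continuing.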
Elasticsearch
There are plenty of other introductory articles, so I'll skip the basics here.
- Launch Elasticsearch with Docker and try it out – Qiita
- Kibana Japanese localization with Docker – Qiita
Run a 3-node cluster using Docker Compose.
docker-compose.yml
Prepare a working folder and create docker-compose.yml directly under it.
Start a 3-node Elasticsearch cluster. Node es01 listens on localhost:9200, while es02 and es03 communicate with es01 over the Docker network.
docker-compose.yml
version: "2.2"
services:
  # 1st node port=9200
  elasticsearch01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - elasticsearch_multinode_data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
    healthcheck:
      interval: 20s
      retries: 10
      test: curl -s http://localhost:9200/_cluster/health | grep -vq '"status":"red"'
  # 2nd node
  elasticsearch02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - elasticsearch_multinode_data02:/usr/share/elasticsearch/data
    networks:
      - elastic
  # 3rd node
  elasticsearch03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - elasticsearch_multinode_data03:/usr/share/elasticsearch/data
    networks:
      - elastic
volumes:
  elasticsearch_multinode_data01:
    driver: local
  elasticsearch_multinode_data02:
    driver: local
  elasticsearch_multinode_data03:
    driver: local
networks:
  elastic:
    driver: bridge
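The healthcheck above treats a node as healthy as long as the cluster status is anything but red: `grep -vq` inverts the match and stays quiet, so its exit code drives the check. A quick sketch of that logic against sample responses (the JSON strings are made-up examples, not real cluster output):

```shell
# grep -vq exits 0 only when at least one line does NOT contain "status":"red"
yellow='{"cluster_name":"es-docker-cluster","status":"yellow"}'   # sample response
red='{"cluster_name":"es-docker-cluster","status":"red"}'         # sample response

echo "$yellow" | grep -vq '"status":"red"' && echo "healthy"
echo "$red"    | grep -vq '"status":"red"' || echo "unhealthy"
```

This is why a yellow cluster (replicas unassigned) still passes the healthcheck.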
Note that in this configuration, port 9200 is open and publicly accessible. If you don't want to expose port 9200 (for example, because you use a reverse proxy instead), replace 9200:9200 with 127.0.0.1:9200:9200 in the docker-compose.yml file. Elasticsearch will then only be accessible from the host machine itself.
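For the loopback-only variant, the ports entry of elasticsearch01 would look like this (a sketch; the rest of the service definition is unchanged):

```yaml
    ports:
      - "127.0.0.1:9200:9200"   # bind to loopback only; reachable just from the host
```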
The volumes elasticsearch_multinode_data01 through 03 are retained across restarts. If they don't already exist, docker-compose creates them when you start the cluster.
docker-compose up
$ curl -X GET "localhost:9200/_cat/health?v&pretty"
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1602390892 04:34:52 es-docker-cluster green 3 3 0 0 0 0 0 0 - 100.0%
$ curl -X GET "localhost:9200/_cat/nodes?v&pretty"
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.19.0.4 37 63 47 2.44 1.47 1.18 dilmrt - es03
172.19.0.3 18 63 46 2.44 1.47 1.18 dilmrt * es01
172.19.0.2 24 63 48 2.44 1.47 1.18 dilmrt - es02
$ docker volume ls
DRIVER VOLUME NAME
local docker_elasticsearch_multinode_data01
local docker_elasticsearch_multinode_data02
local docker_elasticsearch_multinode_data03
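Note the `docker_` prefix: Compose names volumes `<project>_<volume key>`, and the project name defaults to the working-folder name (here assumed to be `docker`). A tiny sketch of that naming rule:

```shell
# Compose volume name = <project>_<volume key>
project=docker                                  # assumed working-folder name
volume_key=elasticsearch_multinode_data01       # key from the volumes: section
echo "${project}_${volume_key}"
```

If your working folder has a different name, the prefix changes accordingly.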
Inspect data01 in detail:
$ docker volume inspect docker_elasticsearch_multinode_data01
[
{
"CreatedAt": "2020-10-11T04:20:26Z",
"Driver": "local",
"Labels": {
"com.docker.compose.project": "docker",
"com.docker.compose.version": "1.27.4",
"com.docker.compose.volume": "elasticsearch_multinode_data01"
},
"Mountpoint": "/var/lib/docker/volumes/docker_elasticsearch_multinode_data01/_data",
"Name": "docker_elasticsearch_multinode_data01",
"Options": null,
"Scope": "local"
}
]
To look inside the Docker Desktop VM, launch a terminal and enter it with the following command:
$ docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i sh
Check with the ls command
/ # ls -ll /var/lib/docker/volumes/
total 44
drwxr-xr-x 3 root root 4096 Oct 11 04:20 docker_elasticsearch_multinode_data01
drwxr-xr-x 3 root root 4096 Oct 11 04:20 docker_elasticsearch_multinode_data02
drwxr-xr-x 3 root root 4096 Oct 11 04:20 docker_elasticsearch_multinode_data03
-rw------- 1 root root 65536 Oct 11 04:20 metadata.db
When you're done, exit with the `exit` command.
- [Docker for Mac] I searched for the actual state of Docker Volume | w0o0ps | note
- I can't find the actual data volume on Docker Desktop | t11o
Create an index called customer, add a document, and check it. Then shut Elasticsearch down and start it again to confirm that the data persists.
curl -X PUT "localhost:9200/customer?pretty&pretty"
$ curl -X GET "localhost:9200/_cat/indices?v&pretty"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open customer FIT3eS3YSR2UEE0np3BnwA 1 1 0 0 416b 208b
Index a document with _id set to 1 by running the following:
curl --include -XPOST "http://localhost:9200/customer/_doc/1?pretty" \
-H 'Content-Type: application/json' \
-d '{
"name": "John Doe",
"message": "The night was young, and so was he. But the night was sweet, and he was sour."
}'
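Several documents can also be indexed in one request with the _bulk API. The sketch below just prints the newline-delimited payload; the second document is a made-up example, and the curl command to actually send it is shown as a comment:

```shell
# _bulk expects alternating action and source lines (NDJSON)
bulk_body='{"index":{"_index":"customer","_id":"1"}}
{"name":"John Doe","message":"The night was young, and so was he. But the night was sweet, and he was sour."}
{"index":{"_index":"customer","_id":"2"}}
{"name":"Jane Doe","message":"A second, made-up document."}'
echo "$bulk_body"
# Send it to the running cluster with:
# curl -H 'Content-Type: application/x-ndjson' -X POST "localhost:9200/_bulk?pretty" --data-binary "$bulk_body"$'\n'
```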
$ curl -X GET 'localhost:9200/customer/_doc/1?pretty'
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 2,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "John Doe",
"message" : "The night was young, and so was he. But the night was sweet, and he was sour."
}
}
Shut down Elasticsearch and start it again.
$ docker-compose down
$ docker-compose up
Data persistence confirmation
$ curl -X GET "localhost:9200/_cat/indices?v&pretty"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open customer oD3E_VXqSWy7I0F1NSIlyQ 1 1 1 0 9.6kb 4.8kb
$ curl -X GET 'localhost:9200/customer/_doc/1?pretty'
{
"_index" : "customer",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 2,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "John Doe",
"message" : "The night was young, and so was he. But the night was sweet, and he was sour."
}
}
Kibana
Reference: Install Kibana with Docker | Kibana Guide [7.9] | Elastic
Add the following above the volumes section of docker-compose.yml.
docker-compose.yml
  # kibana
  kibana:
    image: docker.elastic.co/kibana/kibana:7.9.2
    container_name: kibana
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://es01:9200 # refers to container_name
      - "I18N_LOCALE=ja-JP" # display in Japanese
    depends_on:
      - elasticsearch01
      - elasticsearch02
      - elasticsearch03
    networks:
      - elastic
    healthcheck:
      interval: 10s
      retries: 20
      test: curl --write-out 'HTTP %{http_code}' --fail --silent --output /dev/null http://localhost:5601/api/status
    restart: always
A Japanese locale setting has also been added to the environment.
Reference: Kibana Japanese localization with Docker – Qiita
$ docker-compose up
Access http://localhost:5601/ and the Kibana screen appears in the browser.
From the Kibana menu, go to Management > Stack Management, then select Index Management under Data. You should see the customer index you created earlier.
Once you've confirmed this, shut down Elasticsearch and move on to Logstash.
Logstash
Reference: Running Logstash on Docker | Logstash Reference [7.9] | Elastic
Add the following after the Kibana settings, above the volumes section.
docker-compose.yml
  # logstash
  logstash:
    image: docker.elastic.co/logstash/logstash:7.9.2
    container_name: logstash
    networks:
      - elastic
    depends_on:
      - elasticsearch01
      - elasticsearch02
      - elasticsearch03
    restart: always
$ docker-compose up
Check that it is running:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
072d509e37f0 docker.elastic.co/kibana/kibana:7.9.2 "/usr/local/bin/dumb…" 16 minutes ago Up 16 minutes (healthy) 0.0.0.0:5601->5601/tcp kibana
7bd68ec00995 docker.elastic.co/logstash/logstash:7.9.2 "/usr/local/bin/dock…" 16 minutes ago Up 16 minutes 5044/tcp, 9600/tcp logstash
7bff0bddb7e1 docker.elastic.co/elasticsearch/elasticsearch:7.9.2 "/tini -- /usr/local…" 16 minutes ago Up 16 minutes 9200/tcp, 9300/tcp es02
e32fcf67c1c3 docker.elastic.co/elasticsearch/elasticsearch:7.9.2 "/tini -- /usr/local…" 16 minutes ago Up 16 minutes 9200/tcp, 9300/tcp es03
60f489bf0dc8 docker.elastic.co/elasticsearch/elasticsearch:7.9.2 "/tini -- /usr/local…" 16 minutes ago Up 16 minutes (healthy) 0.0.0.0:9200->9200/tcp, 9300/tcp es01
If you haven't registered as a Twitter API developer yet, register at the following:
https://developer.twitter.com/en/apps/
I registered by referring to the following.
[Detailed explanation from the example sentence of the 2020 version Twitter API usage application to the acquisition of the API key | Shinjuku homepage production company ITTI](https://www.itti.jp/web-direction/how-to-apply-for-twitter-api/)
Create an app from the developer portal and take note of the consumer_key and other credentials.
Set up the folder structure as follows.
$ tree --charset=C
.
|-- docker-compose.yml
`-- logstash
    |-- config
    |   |-- logstash.yml
    |   `-- pipelines.yml
    `-- pipeline
        `-- twitter.conf
Add volumes to the logstash settings as shown below.
docker-compose.yml
  logstash:
    image: docker.elastic.co/logstash/logstash:7.9.2
    container_name: logstash
    volumes:
      - ./logstash/pipeline/:/usr/share/logstash/pipeline/
      - ./logstash/config/:/usr/share/logstash/config/
    networks:
      - elastic
    depends_on:
      - elasticsearch01
      - elasticsearch02
      - elasticsearch03
    restart: always
Create a ./logstash/config/ folder and place the following files under it.
./logstash/config/logstash.yml
pipeline.ordered: auto
http.host: "0.0.0.0"
xpack.monitoring.elasticsearch.hosts: ["http://es01:9200"]
./logstash/config/pipelines.yml
- pipeline.id: twitter_pipeline
  path.config: "/usr/share/logstash/pipeline/twitter.conf"
  queue.type: persisted
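queue.type: persisted buffers events on disk, so tweets already received aren't lost if Logstash restarts. The on-disk size can optionally be capped with queue.max_bytes; a sketch of that addition (not part of the original setup, default is 1024mb):

```yaml
- pipeline.id: twitter_pipeline
  path.config: "/usr/share/logstash/pipeline/twitter.conf"
  queue.type: persisted
  queue.max_bytes: 1024mb   # optional cap on the persisted queue
```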
./logstash/pipeline/twitter.conf
input {
  twitter {
    consumer_key => "<your_consumer_key>"               # set the keys obtained on the Twitter API management screen
    consumer_secret => "<your_consumer_secret>"
    oauth_token => "<your_oauth_token>"
    oauth_token_secret => "<your_oauth_token_secret>"
    keywords => ["Maezawa", "President Maezawa"]
    ignore_retweets => true
    full_tweet => true
  }
}
output {
  elasticsearch {
    hosts => ["http://es01:9200/"]
    index => "twitter_maezawa"
  }
}
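If you want to trim tweets before indexing, a filter block can sit between input and output. This is a hypothetical sketch, not part of the original setup; the field removed here is just an example:

```conf
filter {
  mutate {
    # keep the index lean by dropping fields you don't need
    remove_field => ["@version"]
  }
}
```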
Reference: [Sentiment analysis on twitter data with ELK | Clément's blog](https://clementbm.github.io/elasticsearch/kibana/logstash/elk/sentiment%20analysis/2020/03/02/elk-sentiment-analysis-twitter-coronavirus.html)
As of October 11, 2020, "Maezawa-san" was a [Twitter trend](https://twitter.com/search?q=%E5%89%8D%E6%BE%A4%E3%81%95%E3%82%93&src=trend_click&vertical=trends), so ["Maezawa", "President Maezawa"] (translated from the original Japanese keywords) are set as the keywords to collect from Twitter. The index name is "twitter_maezawa".
docker-compose up
Go to Kibana.
It is successful if the index has been added to Index Management:
http://localhost:5601/app/management/data/index_management/indices
Create an index pattern to view the collected data: http://localhost:5601/app/management/kibana/indexPatterns/create
Enter twitter_maezawa as the index pattern name and press Next step.
Select @timestamp as the time field and press Create index pattern.
You can check the details of the data by opening Discover (http://localhost:5601/app/discover).
Create visualizations from the Visualize menu and add them to a dashboard.
While we're at it, let's also add the Japanese morphological analysis plugins "kuromoji" and "icu".
Create an elasticsearch folder and create a Dockerfile under it. We will also load a custom elasticsearch.yml.
elasticsearch/Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:7.9.2
COPY ./config/elasticsearch.yml /usr/share/elasticsearch/config/elasticsearch.yml
# remove plugin: uncomment the lines below if you get an error saying the plugin is already installed
# RUN elasticsearch-plugin remove analysis-icu
# RUN elasticsearch-plugin remove analysis-kuromoji
# install plugin
RUN elasticsearch-plugin install analysis-icu
RUN elasticsearch-plugin install analysis-kuromoji
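Once the image is rebuilt, kuromoji tokenization can be verified with the _analyze API. A sketch of the request body (POST it to localhost:9200/_analyze with Content-Type: application/json; the sample text is a classic Japanese tokenization test sentence, not from this article):

```json
{
  "tokenizer": "kuromoji_tokenizer",
  "text": "すもももももももものうち"
}
```

The response should list the individual Japanese tokens rather than one unbroken string.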
docker-compose.yml
  elasticsearch01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
↓ change to
  elasticsearch01:
    build: ./elasticsearch/
Make the same change for elasticsearch02 and elasticsearch03.
elasticsearch.yml
Create `elasticsearch.yml` under the `elasticsearch/config` folder. Its contents can be left empty.
|-- docker-compose.yml
|-- elasticsearch
|   |-- Dockerfile
|   `-- config
|       `-- elasticsearch.yml
Build and restart the cluster:
docker-compose build
docker-compose up
Let's confirm, just in case.
$ curl -X GET "localhost:9200/_cat/nodes?v&pretty"
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.24.0.2 73 81 51 2.27 2.20 1.23 dilmrt - es03
172.24.0.3 58 81 52 2.27 2.20 1.23 dilmrt * es02
172.24.0.4 49 81 52 2.27 2.20 1.23 dilmrt - es01
Check that the plugins are installed with the following command.
$ curl -X GET "http://localhost:9200/_nodes/es01/plugins?pretty"
... (abridged)
},
"plugins" : [
{
"name" : "analysis-icu",
"version" : "7.9.2",
"elasticsearch_version" : "7.9.2",
"java_version" : "1.8",
"description" : "The ICU Analysis plugin integrates the Lucene ICU module into Elasticsearch, adding ICU-related analysis components.",
"classname" : "org.elasticsearch.plugin.analysis.icu.AnalysisICUPlugin",
"extended_plugins" : [ ],
"has_native_controller" : false
},
{
"name" : "analysis-kuromoji",
"version" : "7.9.2",
"elasticsearch_version" : "7.9.2",
"java_version" : "1.8",
"description" : "The Japanese (kuromoji) Analysis plugin integrates Lucene kuromoji analysis module into elasticsearch.",
"classname" : "org.elasticsearch.plugin.analysis.kuromoji.AnalysisKuromojiPlugin",
"extended_plugins" : [ ],
"has_native_controller" : false
}
],
Let's visualize the Elastic forum --Qiita
Check the [Dashboard](http://localhost:5601/app/dashboards).
Visualization was possible in the tag cloud.
The final docker-compose.yml and folder structure are as follows.
version: "2.2"
services:
  # 1st node port=9200
  elasticsearch01:
    # image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    build: ./elasticsearch/
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - elasticsearch_multinode_data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
    healthcheck:
      interval: 20s
      retries: 10
      test: curl -s http://localhost:9200/_cluster/health | grep -vq '"status":"red"'
  # 2nd node
  elasticsearch02:
    # image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    build: ./elasticsearch/
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - elasticsearch_multinode_data02:/usr/share/elasticsearch/data
    networks:
      - elastic
  # 3rd node
  elasticsearch03:
    # image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    build: ./elasticsearch/
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - elasticsearch_multinode_data03:/usr/share/elasticsearch/data
    networks:
      - elastic
  # kibana
  kibana:
    image: docker.elastic.co/kibana/kibana:7.9.2
    container_name: kibana
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://es01:9200 # refers to container_name
      - "I18N_LOCALE=ja-JP"
    depends_on:
      - elasticsearch01
      - elasticsearch02
      - elasticsearch03
    networks:
      - elastic
    healthcheck:
      interval: 10s
      retries: 20
      test: curl --write-out 'HTTP %{http_code}' --fail --silent --output /dev/null http://localhost:5601/api/status
    restart: always
  # logstash
  logstash:
    image: docker.elastic.co/logstash/logstash:7.9.2
    container_name: logstash
    volumes:
      - ./logstash/pipeline/:/usr/share/logstash/pipeline/
      - ./logstash/config/:/usr/share/logstash/config/
    networks:
      - elastic
    depends_on:
      - elasticsearch01
      - elasticsearch02
      - elasticsearch03
    restart: always
volumes:
  elasticsearch_multinode_data01:
    driver: local
  elasticsearch_multinode_data02:
    driver: local
  elasticsearch_multinode_data03:
    driver: local
networks:
  elastic:
    driver: bridge
$ tree --charset=C
.
|-- docker-compose.yml
|-- elasticsearch
|   |-- Dockerfile
|   `-- config
|       `-- elasticsearch.yml
`-- logstash
    |-- config
    |   |-- logstash.yml
    |   `-- pipelines.yml
    `-- pipeline
        `-- twitter.conf
Shut everything down with:
docker-compose down
To also remove the volumes:
docker-compose down --volumes
To remove everything (containers, images, volumes, and networks):
docker-compose down --rmi all --volumes
docker volume prune
[First Elasticsearch with Docker – Qiita](https://qiita.com/kiyokiyo_kzsby/items/344fb2e9aead158a5545#elasticsearch%E3%82%AF%E3%83%A9%E3%82%B9%E3%82%BF%E3%81%AE%E5%81%9C%E6%AD%A2)
[A story about building an ElasticStack environment on Docker and tag-clouding tweets about "Corona" – Qiita](https://qiita.com/kenchan1193/items/9320390d48f3d2ae883c#elasticstack%E7%92%B0%E5%A2%83%E6%A7%8B%E7%AF%89)