ElasticSearch

Requires Java 8 or later (Download)

#ES installation
$ brew install elasticsearch

#Start ES
$ /usr/local/Cellar/elasticsearch/7.8.1/bin/elasticsearch
or
$ elasticsearch

#Operation check
$ curl http://localhost:9200

Add plugin

- kuromoji (Japanese word segmentation)
- ICU (character normalization filter)

$ elasticsearch-plugin install https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-kuromoji/analysis-kuromoji-7.8.1.zip
$ elasticsearch-plugin install https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-icu/analysis-icu-7.8.1.zip

data structure

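Structure of ES when compared with RDB (Reference)

Cluster > Node > Index (= DB) > Type (= Table) > Document (= Row)

Note: mapping types are deprecated in Elasticsearch 7.x, so in practice an index holds a single type, `_doc`.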

Basic usage

#index list
curl -X GET http://localhost:9200/_cat/indices

#index creation
curl -XPUT http://localhost:9200/<index_name>

#Delete index
curl -XDELETE http://localhost:9200/<index_name>

#Add doc
curl -H "Content-Type: application/json" -XPOST http://localhost:9200/<index_name>/<type_name>/ -d '{
    "field_01": "hoge1"
}'


#Get doc
curl -XGET http://localhost:9200/new_index/_search

curl -H "Content-Type: application/json" -XGET localhost:9200/<index_name>/_search\?pretty  -d '{"size":100}'

    ※ ?pretty formats the result for readable output.
    ※ If you do not specify the number of hits with -d '{"size":100}', at most 10 hits are returned.
    ※ When sending JSON, -H "Content-Type: application/json" is required.


#doc search
curl -H "Content-Type: application/json" -XGET localhost:9200/<index_name>/_search\?pretty  -d'
{
  "query": {
    "match": {
      "<field_name>": "<search_value>"
    }
  }
}'
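
The same match query can also be built from Ruby. The following is a minimal sketch using only the standard library, assuming a local node at localhost:9200. `build_match_query` and `search_index` are hypothetical helper names, not part of any gem.

```ruby
require "json"
require "net/http"
require "uri"

# Build the body of a match query (hypothetical helper, not a gem API).
# size defaults to 10, mirroring the default number of hits Elasticsearch returns.
def build_match_query(field, value, size: 10)
  { query: { match: { field => value } }, size: size }
end

# Send the body to <index>/_search. The host is an assumption (local node).
# _search also accepts POST, which avoids sending a body with GET.
def search_index(index, body, host: "http://localhost:9200")
  uri = URI("#{host}/#{index}/_search?pretty")
  Net::HTTP.post(uri, JSON.generate(body), "Content-Type" => "application/json")
end

body = build_match_query("title", "Tokyo", size: 100)
puts JSON.pretty_generate(body)
```

With a node running, `search_index("articles-test", body).body` would return the raw JSON response.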


#Add doc
curl -H "Content-Type: application/json" -XPOST http://localhost:9200/<index_name>/new/ -d'{
  "<field_name>": "<value>"
}'

Kibana

Kibana has a request-builder-like feature (the Dev Tools console), so add it if you need it for development.

# install
$ brew install kibana
$ brew info kibana

#Start-up
$ kibana

#Access the dashboard
http://localhost:5601/app/kibana#/dev_tools/console

Reference link

- Search API
- Query DSL
- Query Example Compare with SQL
- [Summary of precautions and anti-patterns during actual operation of Elasticsearch](http://kakakazuma.hatenablog.com/entry/2016/07/19/100000_2#_all%E3%81%AE%E4%BD%BF%E7%94%A8%E3%81%AF%E5%BF%85%E8%A6%81%E3%81%AA%E6%99%82%E3%81%AE%E3%81%BF%E8%A1%8C%E3%81%86)
- Points when using Elasticsearch as a search engine
  - The default memory is 2G, and this should be 50% or less of the server memory, so the server should have at least 4G. Reference

Use ElasticSearch from Rails

- Use the elasticsearch-rails gem.
  - `include Elasticsearch::Model` in the model and operate on the index through the model.
  - Set the index name with `index_name "articles-#{Rails.env}"` so that it changes depending on the environment.

#index creation
<Model name>.create_index

#Re-import all DB data
<Model name>.__elasticsearch__.import

#Data search example
<Model name>.search query: { match_all: {} }
<Model name>.search query: { match: { title: "hello" } }

When a model changes, a Sidekiq job is enqueued (`ArticleIndexer.perform_async`) → the job is queued in Redis and the Elasticsearch index is updated asynchronously (`models/concerns/article_searchable`).

sidekiq

Sidekiq is required for asynchronous index updates. Jobs are queued in Redis and indexed into Elasticsearch asynchronously.
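
The queue-then-index flow can be sketched without Redis or Sidekiq. This is only an illustration under loud assumptions: an in-process `Queue` stands in for Redis, a single thread stands in for the Sidekiq worker, and `FakeIndex` and `ArticleIndexerSketch` are hypothetical names, not the real `ArticleIndexer`.

```ruby
# Hypothetical stand-in for an Elasticsearch index (in-memory hash).
class FakeIndex
  def initialize
    @docs = {}
    @lock = Mutex.new
  end

  def index(id, doc)
    @lock.synchronize { @docs[id] = doc }
  end

  def delete(id)
    @lock.synchronize { @docs.delete(id) }
  end

  def get(id)
    @lock.synchronize { @docs[id] }
  end
end

# Hypothetical stand-in for a Sidekiq worker: jobs go into a queue
# and a background thread applies them to the index later.
class ArticleIndexerSketch
  def initialize(index)
    @index = index
    @queue = Queue.new
    @worker = Thread.new { run }
  end

  # Returns immediately; the index is updated by the worker thread.
  def perform_async(operation, id, doc = nil)
    @queue << [operation, id, doc]
  end

  # Process the remaining jobs, then stop the worker.
  def shutdown
    @queue << :stop
    @worker.join
  end

  private

  def run
    loop do
      job = @queue.pop
      break if job == :stop
      operation, id, doc = job
      case operation
      when :index  then @index.index(id, doc)
      when :delete then @index.delete(id)
      end
    end
  end
end

index = FakeIndex.new
indexer = ArticleIndexerSketch.new(index)
indexer.perform_async(:index, 1, { title: "hello" })
indexer.shutdown
p index.get(1)
```

In the real setup the job would survive a process restart because it lives in Redis; the in-process queue here does not.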

$ brew install redis

# Gemfile
gem 'sidekiq'

$ bundle exec sidekiq -C config/sidekiq.yml

Reference

ElasticSearch Japanese plugin story

- Check the index settings (if an analyzer is specified, you can confirm it here.)

curl -H "Content-Type: application/json"  http://localhost:9200/articles-test/_mappings\?pretty
curl -H "Content-Type: application/json"  http://localhost:9200/articles-test/\?pretty

- Check how the query is split by ES (word splitting == tokenization)

curl -H "Content-Type: application/json" -XGET localhost:9200/articles-test/_search\?pretty  -d'
{
  "query": {
    "match": {
      "title": "Tokyo"
    }
  }
}'

About full-text search in the first place

Full-text search is basically not an exact match. For English text, the text is split into words at spaces, and the index is built from those words. Because matching is done on the split words, a search does not necessarily behave like an exact match on the whole string.
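
To make the word-based matching concrete, here is a toy inverted index in plain Ruby. It is a rough sketch, not how Elasticsearch is implemented. `tokenize`, `build_inverted_index` and `search` are hypothetical helpers, and requiring every query word to match (AND) is a simplification chosen for the example.

```ruby
# Split text into lowercase words (a rough stand-in for an English analyzer).
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Map each token to the ids of the documents that contain it.
def build_inverted_index(docs)
  index = Hash.new { |hash, token| hash[token] = [] }
  docs.each do |id, text|
    tokenize(text).uniq.each { |token| index[token] << id }
  end
  index
end

# Return the ids of documents that contain every token of the query.
def search(index, query)
  postings = tokenize(query).map { |token| index.fetch(token, []) }
  postings.reduce { |a, b| a & b } || []
end

docs = {
  1 => "Tokyo Tower is in Tokyo",
  2 => "Kyoto is an old capital",
  3 => "Tokyo Station"
}
index = build_inverted_index(docs)
p search(index, "tokyo")        # whole word: documents 1 and 3
p search(index, "Tokyo tower")  # both words appear only in document 1
p search(index, "tok")          # "tok" is not an indexed token: no match
```

The last call shows why a partial string does not match: only whole tokens are stored in the index.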

kuromoji and ngram are the settings that control this word splitting (tokenization). ngram gives a higher match rate because it indexes every n-character substring, at the cost of noisier results.
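
A small sketch of why ngram matches more: character bigrams of a string include substrings that cross word boundaries. `ngrams` is a hypothetical helper that only imitates the idea of an ngram tokenizer with a fixed gram size. It does not call Elasticsearch.

```ruby
# Character n-grams of a string (fixed gram size n).
def ngrams(text, n = 2)
  text.chars.each_cons(n).map(&:join)
end

indexed = ngrams("東京都")  # tokens stored for the document
query   = ngrams("京都")    # tokens produced from the search word

p indexed
# true when every query token exists among the indexed tokens
puts((query - indexed).empty?)
```

Here a search for 京都 hits a document containing 東京都. A morphological analyzer such as kuromoji splits on word boundaries instead, so it would be less likely to produce this kind of match.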

The ICU plugin is good at handling things like half-width kana, indexing them as ordinary (full-width) characters. It works like a filter that normalizes Japanese UTF-8 characters.
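
The effect of this kind of normalization can be approximated in plain Ruby with Unicode NFKC normalization plus lowercasing. This is a sketch of the idea only: `normalize_for_index` is a hypothetical helper, and the actual ICU filter may differ in details.

```ruby
# Approximate character normalization: NFKC folds half-width kana and
# full-width Latin letters into their ordinary forms, then lowercase.
def normalize_for_index(text)
  text.unicode_normalize(:nfkc).downcase
end

puts normalize_for_index("ﾃｽﾄ")    # half-width kana
puts normalize_for_index("ｶﾞｲﾄﾞ")  # half-width kana with voiced sound marks
puts normalize_for_index("ＡＢＣ")  # full-width Latin letters
```

After normalization, a document written with half-width kana and a query written with full-width kana produce the same characters, so they can match.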

About full-text search Reference link

https://dev.classmethod.jp/articles/es-02/
https://tech-blog.rakus.co.jp/entry/20191002/elasticsearch#Analyzer%E3%81%A8%E3%81%AF
https://medium.com/hello-elasticsearch/elasticsearch-833a0704e44b
https://qiita.com/shin_hayata/items/41c07923dbf58f13eec4

Term-based query (exact match)

https://blog.chocolapod.net/momokan/entry/114

Difference between filter and query

A filter does not consider the score when retrieving search results, while a query (full-text search) does. Use a query when the score, which shows how relevant a search result is, matters. Use a filter for plain yes/no conditions.
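
In the Query DSL, both kinds of condition can sit in one `bool` query: `must` clauses contribute to the score and `filter` clauses do not. A minimal sketch of building such a body in Ruby follows. `build_bool_query` is a hypothetical helper, and the `title` and `status` field names are assumptions for illustration.

```ruby
require "json"

# The full-text condition goes in "must" (scored);
# the exact condition goes in "filter" (not scored).
def build_bool_query(text_field, text, filter_field, filter_value)
  {
    query: {
      bool: {
        must:   [{ match: { text_field => text } }],
        filter: [{ term: { filter_field => filter_value } }]
      }
    }
  }
end

puts JSON.pretty_generate(build_bool_query("title", "Tokyo", "status", "published"))
```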

What is a score?

The score determines the ranking of search results. A match scores high when the search term appears often in the document, when the term is distinctive to that document and rarely found in other documents, and when the document is short.
https://qiita.com/r4-keisuke/items/d653d26b6fc8b7955c05
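
These three factors can be seen in a simplified, single-term BM25-style calculation (Elasticsearch 7.x uses BM25 as its default similarity). The sketch below is an approximation for illustration: `idf` and `bm25` are hypothetical helpers, documents are plain token arrays, and the result will not reproduce Elasticsearch's exact scores.

```ruby
K1 = 1.2   # term-frequency saturation (commonly cited default)
B  = 0.75  # strength of length normalization (commonly cited default)

# Rare terms get a higher weight than terms found in many documents.
def idf(term, docs)
  n = docs.count { |doc| doc.include?(term) }
  Math.log(1 + (docs.size - n + 0.5) / (n + 0.5))
end

# Score of one term for one document (doc is an array of tokens).
def bm25(term, doc, docs)
  tf = doc.count(term)
  avg_len = docs.sum(&:size) / docs.size.to_f
  norm = 1 - B + B * doc.size / avg_len
  idf(term, docs) * tf * (K1 + 1) / (tf + K1 * norm)
end

docs = [
  %w[tokyo tower tokyo],
  %w[tokyo station is a busy place in japan],
  %w[kyoto temple],
  %w[tokyo food]
]

docs.each { |doc| puts format("%.3f  %s", bm25("tokyo", doc, docs), doc.join(" ")) }
```

The document that repeats the term and the short document both score above the long one, and a term found in only one document (`kyoto`) gets a larger `idf` than a common one (`tokyo`).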

What the ICU filter does

https://medium.com/hello-elasticsearch/elasticsearch-c98fd9ce6a18

Set ICU as a filter to normalize characters.

Installation of Japanese plug-in

Kuromoji installation

/usr/local/Cellar/elasticsearch/7.8.1/bin/elasticsearch-plugin install https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-kuromoji/analysis-kuromoji-7.8.1.zip

ICU installation

/usr/local/Cellar/elasticsearch/7.8.1/bin/elasticsearch-plugin install https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-icu/analysis-icu-7.8.1.zip

Setting reference in rails

https://qiita.com/chase0213/items/381b1eeacb849d93ecfd
https://qiita.com/yamashun/items/e1f2157e1b3cf3a716e3

Rails + ElasticSearch Survey Memo