Understand by reading articles! A summary of what Elasticsearch beginners should get a handle on

Elasticsearch is a distributed, open source search and analytics engine. I think some people have recently become interested in it and plan to use it at work. In that case, if you ask "what should I read to be safe?", the answer is undoubtedly the official documentation. However, I think many people feel the following:

- The hurdle is high when the starting point is English
- For now, I just want to grasp the main points and start using it

So I put together this summary on the premise of picking out good articles for beginners and actually running Elasticsearch.

Prerequisites

The emphasis is on creating an environment and actually running it. My main environment is Windows and my main language is Java, so please bear with that. If you use a different environment, sections 3 and 4 of "Main subject" should still be useful for reference.

Main subject

1. Download and configure curl

curl is a tool for exchanging data with servers. You may wonder why curl comes up when the topic is Elasticsearch, but it is essential for command-based operations. Unfortunately, Windows has not always included curl by default (Windows 10 has shipped curl.exe since version 1803), so on older versions you need to download it.

Download it, unzip it, and add the path of the bin folder to the PATH environment variable (a user environment variable). Then run the following command at the command prompt so that Japanese text is not garbled.

Change the character code to UTF-8


> chcp 65001

If you are not familiar with curl in the first place and want to know the basic usage, the following articles are very helpful and worth reading:

- curl command usage memo | Qiita
- Frequently used curl command options | Qiita

2. Preparing the Elasticsearch (and Kibana) environment

Now let's build the Elasticsearch environment! Kibana is not strictly required, but it is a useful tool for the following reasons, so be sure to install it as well:

- It makes Elasticsearch easy to operate
- It lets you visualize the data

If you follow this article, the environment on Windows will be ready:

- [Introduction to Elasticsearch] Environment Construction for Windows | Qiita

That said, the download takes a really long time (about 3.5 hours in my case; Kibana was not as slow), so people who are reasonably comfortable with Docker may be able to build the environment faster by referring to these articles:

- First Elasticsearch with Docker | Qiita
- Run Elasticsearch + Kibana with Docker Compose | Qiita
- Persist Elasticsearch data on Docker container | Qiita

(In my case it took less than an hour, including reviewing the settings.)

The official website also describes the setup, so please refer to it as well.

3. Understand the basic concept

Now, let's run it right away! That is what I would like to say, but first it is important to understand the basic concepts. (Arguably this should come before setting up the environment.) The minimum you should know is as follows.

The terms that appear most often are easier to picture when compared with an RDB. There are various views on how the terms map to each other, but the following correspondence seemed the most fitting to me.

RDB                 Elasticsearch
Database            Cluster
Table               Index
Table definition    Mapping
Record (row)        Document
Column              Field
Primary key         Document ID

Of course the two differ in places, so for details please refer to:

- Elasticsearch to learn by making clients | iPRIDE

Next, analyzers. An analyzer performs text analysis: it converts text into the format best suited for searching. An analyzer is roughly divided into the following three parts.

Name                 Contents
Character filters    Process the character string (add, delete, or change characters) before the tokenizer splits it. Pre-processing; optional.
Tokenizer            Splits the character string into word-level tokens. Required.
Token filters        Process the tokens produced by the tokenizer (add, delete, or change them). Post-processing; optional.
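
To make the three stages easier to picture, here is a toy model of the pipeline in Java. It is only a sketch: the class name `AnalyzerSketch`, the stop-word list, and the regular expressions are assumptions made for this example, and it does not reproduce how Elasticsearch implements its built-in analyzers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of the three analyzer stages (illustration only, not Elasticsearch code).
final class AnalyzerSketch {
    // Hypothetical stop-word list, chosen only for this example.
    private static final Set<String> STOP_WORDS = Set.of("the", "a", "is");

    // Character filter: works on the raw string before tokenization (here: strips HTML-like tags).
    static String charFilter(String text) {
        return text.replaceAll("<[^>]*>", " ");
    }

    // Tokenizer: splits the string into word-level tokens (here: on anything that is not a letter or digit).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Token filter: post-processes the tokens (here: lowercases them, then drops stop words).
    static List<String> tokenFilter(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                result.add(lower);
            }
        }
        return result;
    }

    // Full pipeline: character filter, then tokenizer, then token filter.
    static List<String> analyze(String text) {
        return tokenFilter(tokenize(charFilter(text)));
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox, fast]
        System.out.println(analyze("<p>The Quick brown fox is fast</p>"));
    }
}
```

Only the tokenizer step is mandatory in this model as well: removing `charFilter` or `tokenFilter` from `analyze` still yields tokens, just less normalized ones.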

The official website describes analyzers as well, so check it once.

Finally, aggregations. An aggregation summarizes the data matched by a search query, for example as a count per category.

For details, refer to the description on the official website.
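
As a rough analogy, counting documents per field value can be sketched on the client side in Java. This is not how Elasticsearch computes aggregations (it does so on the server); the class `AggregationSketch`, the field names, and the sample documents are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Client-side analogy of counting documents per field value (illustration only).
final class AggregationSketch {
    // Counts how many documents share each distinct value of the given field.
    static Map<String, Long> countByField(List<Map<String, String>> documents, String field) {
        return documents.stream()
                .filter(doc -> doc.containsKey(field)) // skip documents that lack the field
                .collect(Collectors.groupingBy(doc -> doc.get(field), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical documents standing in for the hits of a search query.
        List<Map<String, String>> hits = List.of(
                Map.of("title", "Intro to Elasticsearch", "category", "search"),
                Map.of("title", "Kibana dashboards", "category", "visualization"),
                Map.of("title", "Analyzers in depth", "category", "search"));
        // prints {search=2, visualization=1}
        System.out.println(countByField(hits, "category"));
    }
}
```

In Elasticsearch itself, the comparable request would place a `terms` aggregation under `aggs` in the search body, and the per-value counts come back as buckets in the response.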

4. Let's run it on a command basis

Now let's finally operate Elasticsearch!

For the general flow, please refer to these articles:

- First Elasticsearch | Qiita
- [Elasticsearch] List of frequently used commands | Qiita

From the perspective of comparison with SQL, these articles are very easy to understand:

- Comparison of queries between SQL and Elasticsearch | Qiita
- Understanding while comparing Elasticsearch and SQL | Qiita

First, referring to the articles above, it is a good idea to run through the following with commands:

- Creating an index
- Inserting and updating data
- Executing queries
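
The three operations above come down to a few HTTP requests. The following Java sketch builds them with the JDK's `java.net.http` classes (Java 11 or later). It assumes an unsecured local node at `http://localhost:9200` and the endpoint layout of Elasticsearch 7.x or later; the class `EsRequestSketch`, the index name `books`, and the `title` field are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;
import java.net.http.HttpResponse.BodyHandlers;
import java.util.List;

// Builds the basic Elasticsearch REST requests (a sketch under the assumptions stated above).
final class EsRequestSketch {
    // Assumption: a local, unsecured node on the default port.
    static final String BASE_URL = "http://localhost:9200";

    private static HttpRequest jsonRequest(String method, String path, String body) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + path))
                .header("Content-Type", "application/json")
                .method(method, BodyPublishers.ofString(body))
                .build();
    }

    // Creating an index: PUT /<index> with a mapping for one text field.
    static HttpRequest createIndex(String index) {
        String body = "{\"mappings\":{\"properties\":{\"title\":{\"type\":\"text\"}}}}";
        return jsonRequest("PUT", "/" + index, body);
    }

    // Inserting data: PUT /<index>/_doc/<id> with the document as JSON.
    static HttpRequest indexDocument(String index, String id, String documentJson) {
        return jsonRequest("PUT", "/" + index + "/_doc/" + id, documentJson);
    }

    // Updating data: POST /<index>/_update/<id> with a partial document.
    static HttpRequest updateDocument(String index, String id, String partialJson) {
        return jsonRequest("POST", "/" + index + "/_update/" + id, "{\"doc\":" + partialJson + "}");
    }

    // Executing a query: POST /<index>/_search with a match query.
    // Note: field and text are concatenated without JSON escaping, which is fine only for a demo.
    static HttpRequest matchQuery(String index, String field, String text) {
        String body = "{\"query\":{\"match\":{\"" + field + "\":\"" + text + "\"}}}";
        return jsonRequest("POST", "/" + index + "/_search", body);
    }

    // Sends a request and returns the response body; requires a node listening on BASE_URL.
    static String send(HttpRequest request) throws IOException, InterruptedException {
        return HttpClient.newHttpClient().send(request, BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) {
        List<HttpRequest> requests = List.of(
                createIndex("books"),
                indexDocument("books", "1", "{\"title\":\"Elasticsearch for beginners\"}"),
                updateDocument("books", "1", "{\"title\":\"Elasticsearch basics\"}"),
                matchQuery("books", "title", "elasticsearch"));
        // Only prints the method and URL of each request; call send(...) to actually execute one.
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Each request corresponds to one curl command of the form `curl -X <METHOD> <URL> -H "Content-Type: application/json" -d '<body>'`, so the same steps can be tried from the command prompt first.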

If you have time to spare, it is also worth trying analyzers and aggregations.

For analyzers (using Japanese), please refer to these articles:

- In Elasticsearch, _search the query that has been divided (kuromoji: morphological analysis). | Qiita
- Summary of settings for using Elasticsearch in Japanese | Qiita
- [Elasticsearch] Try to improve search accuracy with Kuromoji and ngram | Qiita

For aggregations, these articles will be helpful:

- Realization of facets using Elasticsearch Aggregations | Qiita
- Replace SQL "Group by" with Elasticsearch "Aggregations" | Qiita

5. Let's run it in a programming language (Java)

You can also operate Elasticsearch from a programming language!

According to Elasticsearch, it supports eight programming languages: Java, JavaScript, Go, .NET, PHP, Perl, Python, and Ruby. As I mentioned at the beginning, my main language is Java, so this time I use Java.

There is a very well-organized Japanese article on this, and its original source is the official documentation. Within that documentation, there are pages that I personally wanted to keep in mind.

It is a good idea to actually write the code here as well and see how it works.

Impressions

At first Elasticsearch was hard to get into, and I was cursing it! However, the more I used it, the more I came to understand it and the more attached to it I became. The range of customization is wide, and it is also nice that it seems rewarding to grow with.

As of 2020/09/03, when I am writing this article, it has been about two weeks since I started using it. I still have a long way to go, but I would like to learn it by getting used to it rather than by studying it!

Change log

Date          Contents
2020/09/03    First edition posted
2020/09/04    Added content about analyzers and aggregations

References

These articles are not cited in the text, but have been very helpful.

1. curl

- Summary of curl command usage in Windows environment (Heisei final version) | Qiita

3. Understand the basic concept

- Elasticsearch Mapping | Hello! Elasticsearch.
- Summary of Amazon Elasticsearch Service | Qiita
