This is the Day 18 article of the Elastic stack (Elasticsearch) Advent Calendar 2018.
When I first became a web application engineer and was put in charge of implementing full-text search with Elasticsearch on a real project, I struggled to find useful articles about the Java client, so I hope this helps people in the same situation.
Assuming a Spring application that implements full-text search through the Java client, I have summarized what changes in client connection and query generation when migrating from TransportClient, which is deprecated as of the Elasticsearch 7 series and rumored to disappear completely in the 8 series, to RestHighLevelClient.
Connections that involve authentication, the Bulk API, and the Scroll API are not covered this time.
Environment: macOS, Elasticsearch 6.5.2, Java 8, Spring Boot 2.1.1
The application I created is listed on GitHub. https://github.com/ohanamisan/Elasticsearch_on_Java
If your Elasticsearch version is different, change the jar dependency section of the Gradle file accordingly. The sample also includes a rough implementation of inserting documents with Bulk, but this article does not go into it. See the README for other details.
TransportClient -> RestHighLevelClient
Let's start right away with the main topic: migrating the client.
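// On Elasticsearch 6.x the address class is TransportAddress
// (InetSocketTransportAddress was removed in 6.0).
TransportClient client = new PreBuiltTransportClient(Settings.EMPTY)
    .addTransportAddress(new TransportAddress(InetAddress.getByName("localhost"), 9300));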
This is what the Transport Client, the standard until now, looks like. It connects to port 9300, which Elasticsearch exposes for the transport protocol, rather than to the default HTTP port 9200.
By the way, as of the 7 series alpha version, TransportClient has been deprecated, and it can no longer be used from the 8 series.
Now, let's rewrite it to RestHighLevelClient, which will be the standard in the future.
RestHighLevelClient client = new RestHighLevelClient(
    RestClient.builder(new HttpHost("localhost", 9200, "http")));
As you can probably tell, it now accesses port 9200 directly over HTTP instead of using port 9300 for the transport protocol. The name and structure are simple enough that you can see at a glance that it is calling the REST API.
When you run multiple Elasticsearch nodes on the same server, each additional node automatically takes the next free port, so add one HttpHost per node, separated by commas.
RestHighLevelClient client = new RestHighLevelClient(
    RestClient.builder(
        new HttpHost("localhost", 9200, "http"),
        new HttpHost("localhost", 9201, "http")
    )
);
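If the node list comes from configuration rather than being hard-coded, one option is to parse it into host/port pairs first. The following is a minimal sketch using only the JDK. The class name HostListParser, the comma-separated host:port format, and the fallback to port 9200 are assumptions made for illustration and do not come from the sample repository.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: parses "localhost:9200,localhost:9201" into host/port pairs. */
class HostListParser {

    // Assumed fallback when an entry has no explicit port (the Elasticsearch HTTP default).
    private static final int DEFAULT_HTTP_PORT = 9200;

    static List<Map.Entry<String, Integer>> parse(String hosts) {
        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        for (String entry : hosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank entries, e.g. from a doubled comma
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                // No port given: fall back to the assumed default.
                result.add(new SimpleImmutableEntry<String, Integer>(trimmed, DEFAULT_HTTP_PORT));
            } else {
                String host = trimmed.substring(0, colon);
                int port = Integer.parseInt(trimmed.substring(colon + 1));
                result.add(new SimpleImmutableEntry<String, Integer>(host, port));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> node : parse("localhost:9200, localhost:9201")) {
            // With the Elasticsearch client on the classpath, each pair would become
            // new HttpHost(node.getKey(), node.getValue(), "http").
            System.out.println(node.getKey() + " -> " + node.getValue());
        }
    }
}
```

Each parsed pair maps onto one HttpHost argument of RestClient.builder, so the number of nodes no longer has to be fixed in the code.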
TransportClient logged its internal connection attempts through Netty, but RestHighLevelClient logs nothing by default, probably because, as the name suggests, it simply calls the REST API.
prepareSearch -> SearchSourceBuilder + SearchRequest
Next, we will modify the query generation implementation. The full-text search in this article assumes the following query.
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "title.full_text_search": "search word"
          }
        },
        {
          "match_phrase": {
            "body.full_text_search": "search word"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
This is a simple query that runs match_phrase against the title and the body. I wanted a full-text search that also covers the title, so I used should clauses: a document is a hit if either the title or the body matches.
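To make the should / minimum_should_match behaviour concrete, here is a rough plain-Java model of the decision the query makes for each document. It is only a sketch under a strong simplification: a case-insensitive substring check stands in for match_phrase, which in Elasticsearch works on analyzed tokens. The class and method names are made up for this illustration.

```java
import java.util.Locale;

/** Hypothetical model of bool/should with minimum_should_match; not Elasticsearch code. */
class ShouldMatchModel {

    // Simplified stand-in for match_phrase: case-insensitive substring containment.
    static boolean phraseMatches(String fieldValue, String phrase) {
        return fieldValue.toLowerCase(Locale.ROOT).contains(phrase.toLowerCase(Locale.ROOT));
    }

    // A document is a hit when at least minimumShouldMatch of the should clauses match.
    static boolean isHit(String title, String body, String phrase, int minimumShouldMatch) {
        int matchedClauses = 0;
        if (phraseMatches(title, phrase)) {
            matchedClauses++;
        }
        if (phraseMatches(body, phrase)) {
            matchedClauses++;
        }
        return matchedClauses >= minimumShouldMatch;
    }

    public static void main(String[] args) {
        // The title lacks the word but the body contains it: a hit with minimum_should_match = 1.
        System.out.println(isHit("Getting started with Spring", "This post uses Java 8.", "Java", 1)); // true
        // Neither field contains the word: not a hit.
        System.out.println(isHit("Getting started with Spring", "Gradle setup notes", "Java", 1)); // false
    }
}
```

With minimum_should_match set to 2 in this model, the first document would no longer be a hit, because only the body clause matches.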
We will generate this query in Java.
The query generation and request when using TransportClient is as follows.
// Build the bool query: either field may match, at least one must.
BoolQueryBuilder query = QueryBuilders.boolQuery();
query.should(QueryBuilders.matchPhraseQuery("title.full_text_search", word))
    .should(QueryBuilders.matchPhraseQuery("body.full_text_search", word))
    .minimumShouldMatch(1);
// Pass the query and the request settings through prepareSearch.
SearchResponse res = client.prepareSearch("qiita")
    .setQuery(query)
    .setSize(1000)
    .get();
res.getHits();
The query is generated with QueryBuilders, and the query and the request-time settings are passed through the client's prepareSearch method.
As an aside, by nesting QueryBuilders and chaining methods, you can build queries in Java that are just as complex as ones written in JSON.
RestHighLevelClient no longer has a prepareSearch method. Instead, you use the search method, which takes a SearchRequest as its argument. Below is the modified code.
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.from(0);
sourceBuilder.size(1000);
sourceBuilder.query(QueryBuilders.boolQuery()
    .should(QueryBuilders.matchPhraseQuery("title.full_text_search", word))
    .should(QueryBuilders.matchPhraseQuery("body.full_text_search", word))
    .minimumShouldMatch(1));
SearchRequest req = new SearchRequest().indices(INDEX).source(sourceBuilder);
SearchResponse res = client.search(req, RequestOptions.DEFAULT);
res.getHits();
The code has become a few lines longer, but my impression is that the responsibilities of the query settings, the request, and the response are now separated, which I personally find easier to work with.
Sorry, making the GIF used up my energy, but I hope you can see that some hits do not include the search word "Java" in the title. Those appear to be hits from the body.
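Since from and size are set on SearchSourceBuilder, simple page-based navigation comes down to computing the offset. The helper below is a hypothetical sketch: the class name PageOffset and the zero-based page numbering are assumptions. It also guards against Elasticsearch's default limit of 10,000 for from + size (the index.max_result_window setting), beyond which something like the Scroll API is needed.

```java
/** Hypothetical helper for computing the "from" offset of a page. */
class PageOffset {

    // Default value of the index.max_result_window setting in Elasticsearch.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Returns the offset for a zero-based page, or throws if the window would be exceeded. */
    static int fromFor(int page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size must be > 0");
        }
        long from = (long) page * size; // long arithmetic avoids int overflow
        if (from + size > DEFAULT_MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException(
                "from + size exceeds " + DEFAULT_MAX_RESULT_WINDOW + "; consider the Scroll API");
        }
        return (int) from;
    }

    public static void main(String[] args) {
        int size = 1000;
        // These values would be passed to sourceBuilder.from(...) and sourceBuilder.size(...).
        System.out.println(fromFor(0, size)); // 0
        System.out.println(fromFor(3, size)); // 3000
    }
}
```

With a page size of 1000, as in the code above, pages 0 through 9 fit inside the default window and page 10 is rejected.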
This time, as an example, I implemented a minimal full-text search in a local environment.
There is still room to extend it, such as a RestHighLevelClient with basic authentication for when X-Pack or similar is installed, and the Scroll API for paging, which is indispensable for search. Taking this (actually my first) post as an opportunity, I would like to keep writing articles while expanding the sample.
Also, it has been a while since I last wrote Java, so please point out anything that looks odd.
Tomorrow there will be the 27th Elasticsearch study session "LT & year-end party"! https://www.meetup.com/ja-JP/Tokyo-Elastic-Fantastics/events/256619262/