Hi, I am empty night, long time no see!

This article shows how to integrate Elasticsearch (ES) into Spring Boot and perform CRUD operations, including paged and scroll queries. At work I had only ever used ES through wrapper libraries written by my predecessors, so recently I set out to learn it through the native Spring Boot/ES APIs. I hope this helps you.

This article is an excerpt from the Spring-boot-examples series, which has been uploaded to github.com/laolunsi/sp…


Install ES and visualization tools

Download ES from the official site: www.elastic.co/cn/download… On Windows, download the installation package, run the elasticsearch.bat file, and visit http://localhost:9200

Once that address responds, ES is installed. To inspect your ES data more easily, install the elasticsearch-head visualization tool from github.com/mobz/elasti… Main steps:

  • git clone https://github.com/mobz/elasticsearch-head.git
  • cd elasticsearch-head
  • npm install
  • npm run start
  • open http://localhost:9100/

You may run into a problem at this point. It turns out to be a cross-origin (CORS) issue. To fix it, add the following two lines to the elasticsearch.yml file of ES, then restart ES:

http.cors.enabled: true
http.cors.allow-origin: "*"

Refresh the page. The article index shown here is the one that was created automatically by my Spring Boot project. Now let's get down to business.


Integrating ES into Spring Boot

Create a Spring Boot project and add the ES dependency:

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>

Configure application.yml:

server:
  port: 8060

spring:
  elasticsearch:
    rest:
      uris: http://localhost:9200


Create a test entity, Article:

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;

import java.util.Date;

@Document(indexName = "article")
public class Article {

    @Id
    private String id;
    private String title;
    private String content;
    private Integer userId;
    private Date createTime;

    // ... getters and setters omitted
}


There are three ways to work with ES data in Spring Boot:

  • Extend the ElasticsearchRepository interface
  • Inject ElasticsearchRestTemplate
  • Inject ElasticsearchOperations

Define the corresponding repository:

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

public interface ArticleRepository extends ElasticsearchRepository<Article, String> {}

You can now use this ArticleRepository to work with Article data in ES. The article index does not have to be created manually; by default, Spring Data Elasticsearch creates it automatically.

The endpoints below implement insert, update, paged query, scroll query, delete, and other operations on ES data from Spring Boot, and can serve as a reference. They use the Repository to fetch, save, and delete ES data, and ElasticsearchRestTemplate or ElasticsearchOperations for paged and scroll queries.

Get/delete data by ID

    @Autowired
    private ArticleRepository articleRepository;

    @GetMapping("{id}")
    public JsonResult findById(@PathVariable String id) {
        Optional<Article> article = articleRepository.findById(id);
        JsonResult jsonResult = new JsonResult(true);
        jsonResult.put("article", article.orElse(null));
        return jsonResult;
    }

    @DeleteMapping("{id}")
    public JsonResult delete(@PathVariable String id) {
        // Delete by id
        articleRepository.deleteById(id);
        return new JsonResult(true, "Deleted successfully");
    }

Save the data

    @PostMapping("")
    public JsonResult save(Article article) {
        // Add or update
        String verifyRes = verifySaveForm(article);
        if (!StringUtils.isEmpty(verifyRes)) {
            return new JsonResult(false, verifyRes);
        }

        // No ID means a new article, so set its creation time
        if (StringUtils.isEmpty(article.getId())) {
            article.setCreateTime(new Date());
        }

        Article a = articleRepository.save(article);
        boolean res = a.getId() != null;
        return new JsonResult(res, res ? "Saved successfully" : "");
    }

    private String verifySaveForm(Article article) {
        if (article == null || StringUtils.isEmpty(article.getTitle())) {
            return "The title cannot be empty.";
        } else if (StringUtils.isEmpty(article.getContent())) {
            return "Content cannot be empty.";
        }

        return null;
    }

Paging query data

    @Autowired
    private ElasticsearchRestTemplate elasticsearchRestTemplate;

    @Autowired
    ElasticsearchOperations elasticsearchOperations;

    @GetMapping("list")
    public JsonResult list(Integer currentPage, Integer limit) {
        if (currentPage == null || currentPage < 0 || limit == null || limit <= 0) {
            return new JsonResult(false, "Please enter valid paging parameters.");
        }
        // Paged list query
        // The search method of the Repository from older versions is deprecated,
        // so use ElasticsearchRestTemplate or ElasticsearchOperations for the paged query here.

        JsonResult jsonResult = new JsonResult(true);
        NativeSearchQuery query = new NativeSearchQuery(new BoolQueryBuilder());
        query.setPageable(PageRequest.of(currentPage, limit));

        // Method 1:
        SearchHits<Article> searchHits = elasticsearchRestTemplate.search(query, Article.class);

        // Method 2:
        // SearchHits<Article> searchHits = elasticsearchOperations.search(query, Article.class);

        List<Article> articles = searchHits.getSearchHits().stream().map(SearchHit::getContent).collect(Collectors.toList());
        jsonResult.put("count", searchHits.getTotalHits());
        jsonResult.put("articles", articles);
        return jsonResult;
    }
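
Note that PageRequest.of takes a zero-based page index, so currentPage = 0 returns the first page. As a rough illustration in plain Java (no Spring dependency), the sketch below shows how a caller might turn the page index into a document offset and derive the number of pages from the returned count. The class and method names (PageMath, offsetOf, totalPages) are hypothetical and not part of the original project.

```java
/** Hypothetical helper illustrating zero-based paging arithmetic. */
final class PageMath {
    private PageMath() {}

    /** Offset of the first document on a zero-based page. */
    static long offsetOf(int page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size must be > 0");
        }
        return (long) page * size;
    }

    /** Number of pages needed to display totalHits documents (rounded up). */
    static long totalPages(long totalHits, int size) {
        if (totalHits < 0 || size <= 0) {
            throw new IllegalArgumentException("totalHits must be >= 0 and size must be > 0");
        }
        return (totalHits + size - 1) / size;
    }
}
```

For example, with limit = 10, currentPage = 3 starts at offset 30, and a count of 101 needs 11 pages.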

Scroll through data

    @GetMapping("scroll")
    public JsonResult scroll(String scrollId, Integer size) {
        // Scroll query using the scroll API
        if (size == null || size <= 0) {
            return new JsonResult(false, "Please enter the number of results per page");
        }
        NativeSearchQuery query = new NativeSearchQuery(new BoolQueryBuilder());
        query.setPageable(PageRequest.of(0, size));
        SearchHits<Article> searchHits = null;
        if (StringUtils.isEmpty(scrollId)) {
            // Start a scroll query and keep the scroll context alive for 60 seconds.
            // Within the same scroll context, the query only needs to be set once.
            searchHits = elasticsearchRestTemplate.searchScrollStart(60000, query, Article.class, IndexCoordinates.of("article"));
            if (searchHits instanceof SearchHitsImpl) {
                scrollId = ((SearchHitsImpl) searchHits).getScrollId();
            }
        } else {
            // Continue scrolling
            searchHits = elasticsearchRestTemplate.searchScrollContinue(scrollId, 60000, Article.class, IndexCoordinates.of("article"));
        }

        List<Article> articles = searchHits.getSearchHits().stream().map(SearchHit::getContent).collect(Collectors.toList());
        if (articles.isEmpty()) {
            // No more hits: end scrolling and release the scroll context
            elasticsearchRestTemplate.searchScrollClear(Collections.singletonList(scrollId));
            scrollId = null;
        }

        if (scrollId == null) {
            return new JsonResult(false, "No more data");
        } else {
            JsonResult jsonResult = new JsonResult(true);
            jsonResult.put("count", searchHits.getTotalHits());
            jsonResult.put("size", articles.size());
            jsonResult.put("articles", articles);
            jsonResult.put("scrollId", scrollId);
            return jsonResult;
        }
    }

ES deep paging vs. scroll queries

A while ago a colleague told me that our log-retrieval API was too slow and asked whether I could optimize it. It originally used ordinary page-number paging (1, 2, 3, ..., 10, ...), which becomes deep paging on later pages. The query had many conditions (more than ten parameters) and covered a large amount of data (a single log index holds about 200 million documents).

Paged queries are slow because of how ES executes them. To return page 100 at 10 hits per page, every shard (an index has 5 primary shards by default before ES 7.0, and 1 from 7.0 onward) must first return its own top 100 * 10 hits. The coordinating node then merges and sorts them, and finally returns only the 10 hits of page 100. In other words, far more data is loaded into memory than the page actually needs.
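
To make that cost concrete, here is a small simulation in plain Java with no ES involved. Each inner list stands for one shard, and the "coordinator" asks every shard for its top (from + size) values before merging. The class DeepPagingSketch and its members are hypothetical, and the model is deliberately simplified: real ES ranks by score or sort keys, not by integer value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of the gather-and-merge step behind from/size paging. */
final class DeepPagingSketch {

    /** The requested page plus the number of hits the shards had to hand over. */
    static final class Result {
        final List<Integer> page;
        final int fetchedFromShards;

        Result(List<Integer> page, int fetchedFromShards) {
            this.page = page;
            this.fetchedFromShards = fetchedFromShards;
        }
    }

    static Result query(List<List<Integer>> shards, int from, int size) {
        List<Integer> candidates = new ArrayList<>();
        for (List<Integer> shard : shards) {
            List<Integer> sorted = new ArrayList<>(shard);
            Collections.sort(sorted);
            // Any of a shard's top (from + size) hits could belong to the
            // requested page, so the shard has to return all of them.
            int n = Math.min(from + size, sorted.size());
            candidates.addAll(sorted.subList(0, n));
        }
        int fetched = candidates.size();

        // The coordinator merges everything, then discards all but one page.
        Collections.sort(candidates);
        int start = Math.min(from, candidates.size());
        int end = Math.min(from + size, candidates.size());
        return new Result(new ArrayList<>(candidates.subList(start, end)), fetched);
    }
}
```

With 5 shards of 200 values each, asking for from = 90 and size = 10 makes the shards hand over 500 values in total, of which only 10 are returned to the caller.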

As a result, the more shards an index has and the deeper the requested page, the slower the query. ES also limits this through index.max_result_window, which defaults to 10000: by default, a paged request whose from + size exceeds 10000 is rejected, so results beyond the first 10000 hits cannot be fetched this way.
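
An endpoint could guard against this limit before sending the query. The following is a minimal sketch in plain Java; ResultWindowCheck and fitsWindow are hypothetical names, and the page index is assumed to be zero-based as in PageRequest.of.

```java
/** Hypothetical pre-check against the index.max_result_window limit. */
final class ResultWindowCheck {
    /** Default value of index.max_result_window in Elasticsearch. */
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    private ResultWindowCheck() {}

    /** Returns true if from + size stays within the window, where from = page * size. */
    static boolean fitsWindow(int page, int size, int maxResultWindow) {
        long from = (long) page * size; // long avoids int overflow on deep pages
        return from + size <= maxResultWindow;
    }
}
```

With 10 hits per page, zero-based page 999 (hits 9990 to 9999) still fits the default window, while page 1000 does not.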

If jumping to an arbitrary page (for example, straight to page 100) is not required, or the data volume is very large, consider a scroll query. The first request opens a scroll context from the query parameters and sets how long that context is kept alive. Each subsequent request only needs a scrollId returned by an earlier response; ES recommends always using the most recently returned one.
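
The scroll life cycle (open a context, read batches with the scroll id, clear the context) can be modelled in a few lines of plain Java. This is only an in-memory sketch: InMemoryScroll and its methods are hypothetical, there is no keep-alive timeout, and unlike ES the call that opens the context does not itself return the first batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of a forward-only scroll context. */
final class InMemoryScroll {

    private static final class Context {
        final List<String> snapshot; // data as it looked when the scroll started
        final int size;              // batch size
        int position;                // index of the next unread item

        Context(List<String> snapshot, int size) {
            this.snapshot = snapshot;
            this.size = size;
        }
    }

    private final Map<String, Context> contexts = new HashMap<>();
    private int nextId = 0;

    /** Opens a scroll context over a snapshot of the data and returns its id. */
    String start(List<String> data, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be > 0");
        }
        String id = "scroll-" + (nextId++);
        contexts.put(id, new Context(new ArrayList<>(data), size));
        return id;
    }

    /** Returns the next batch; an empty list means the scroll is exhausted. */
    List<String> next(String scrollId) {
        Context c = contexts.get(scrollId);
        if (c == null) {
            throw new IllegalStateException("Unknown or cleared scroll id: " + scrollId);
        }
        int end = Math.min(c.position + c.size, c.snapshot.size());
        List<String> batch = new ArrayList<>(c.snapshot.subList(c.position, end));
        c.position = end;
        return batch;
    }

    /** Releases the context, as a client should do once it has finished scrolling. */
    void clear(String scrollId) {
        contexts.remove(scrollId);
    }
}
```

Because the context only remembers a position, each batch costs the same no matter how far the client has scrolled, which is the property that makes scrolling attractive for very large result sets.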

Scroll only moves forward. If you need to scroll back, you can cache the results already fetched under the scrollId, which gives you both forward and backward navigation, much like scrolling up and down through product search results on Taobao.
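
One possible way to layer backward navigation on top of a forward-only source is to keep every fetched page in a cache. The sketch below is plain Java under stated assumptions: CachedScrollPager is a hypothetical class, and the Supplier stands in for whatever call fetches the next scroll batch (an empty list is taken to mean the end).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical pager that adds "previous page" on top of a forward-only source. */
final class CachedScrollPager<T> {
    private final Supplier<List<T>> fetchNext;            // forward-only source
    private final List<List<T>> cache = new ArrayList<>(); // pages seen so far
    private int current = -1;                              // index of the page being shown
    private boolean exhausted = false;

    CachedScrollPager(Supplier<List<T>> fetchNext) {
        this.fetchNext = fetchNext;
    }

    /** Moves one page forward; hits the source only if the page is not cached yet. */
    List<T> next() {
        if (current + 1 < cache.size()) {
            current++;
            return cache.get(current);
        }
        if (exhausted) {
            return Collections.emptyList();
        }
        List<T> page = fetchNext.get();
        if (page == null || page.isEmpty()) {
            exhausted = true;
            return Collections.emptyList();
        }
        cache.add(Collections.unmodifiableList(new ArrayList<>(page)));
        current++;
        return cache.get(current);
    }

    /** Moves one page back using the cache; empty if already on the first page. */
    List<T> previous() {
        if (current <= 0) {
            return Collections.emptyList();
        }
        current--;
        return cache.get(current);
    }
}
```

The trade-off is memory: every page the user has seen stays cached, so a real implementation would likely cap the cache size or expire entries together with the scroll context.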


I have recently been studying Redis, RabbitMQ, ES, and other technologies systematically, focusing on their principles, internals, and concurrency issues. You are welcome to follow our official account: JavaApes