Preface

First of all, neither Elasticsearch nor Solr is simply good or bad; each is either a good fit or not. This is the general principle. We need to choose the search engine that best suits our actual situation.

So let's compare Elasticsearch and Solr in terms of their pros and cons, their usage scenarios, and our future requirements to find the best solution.

Terminology

In this article, ES is short for Elasticsearch, the search engine at the core of the Elastic Stack.

Solr pros and cons

Solr advantages

  1. Solr has a much larger and more mature community of users, developers, and contributors.
  2. Solr can index content in many formats, such as HTML, PDF, Microsoft Office, JSON, XML, and CSV.
  3. Solr is relatively mature and stable.
  4. When no index is being built at the same time, Solr searches faster.

Solr shortcomings

  1. While an index is being built, search efficiency drops, so performance for real-time indexing and search is poor.

Elasticsearch pros and cons

Elasticsearch advantages

  1. Elasticsearch is distributed and requires no other components. Replication happens in real time and is called "Push Replication."
  2. Elasticsearch fully supports Apache Lucene's near-real-time search.
  3. Handling multitenancy requires no special configuration, whereas Solr requires more advanced settings.
  4. Elasticsearch uses the Gateway concept to make full backups much simpler.
  5. The nodes form a peer-to-peer network. When some nodes fail, other nodes are automatically assigned to take over their work.

Elasticsearch shortcomings

  1. Not automatic enough (not suitable for the current new Index Warmup API)

Solr vs. ElasticSearch

However, the lists of advantages and disadvantages above do not make it clear which search engine suits us better. Even the strengths and weaknesses themselves are a little hard to pin down precisely.

We can compare some features of Solr and ES and analyze the differences between them to find a more suitable search engine for us.

Popularity trend

There is no doubt that ES has become the de facto leader in full-text search, surpassing Solr in popularity in China. A comparison of Google search trends shows that interest in ES is higher than in Solr, and that interest in Solr is trending downward. This does not mean that Solr is in decline, however. On the contrary, Solr is still one of the most popular search engines, as search engine rankings show: the top three are Elasticsearch, Splunk, and Solr.

Retrieval speed

Solr is faster when simply searching existing data. When indexes are built in real time, however, Solr suffers I/O congestion and query performance drops, and here Elasticsearch has an obvious advantage. Solr also becomes less efficient as the amount of data grows, while Elasticsearch's performance does not change significantly. Large Internet companies that tested both in production environments reported an average query speed increase of 50 times after switching from Solr to Elasticsearch.

Elastic and Solr search speed

We have entered the era of big data and will inevitably face massive data sets. Compared with Solr, ES has a huge advantage in retrieval speed over large volumes of data.

Solr is a great solution for traditional applications, but Elasticsearch is better suited for emerging real-time search applications.

Near-real-time search

Near-real-time search has always been considered one of the strengths of ES. That is not really a meaningful distinction, though, because Solr also implements near-real-time search. ES simply offered it first, so people tend to think of ES first when the topic comes up.

In fact, near-real-time search is a capability of Lucene, which both engines build on, rather than something specific to ES or Solr.
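
The "near" in near-real-time can be illustrated with a toy model: a newly indexed document is not searchable immediately, only after the index has been refreshed. The sketch below is a simplified illustration, not Lucene's, Solr's, or Elasticsearch's actual API; the class `ToyIndex` and its methods are hypothetical names made up for this example.

```python
class ToyIndex:
    """Toy model of near-real-time search (hypothetical, not a real API).

    Writes land in a buffer and only become visible to search()
    after refresh() has been called.
    """

    def __init__(self):
        self._buffer = []      # indexed, but not yet searchable
        self._searchable = []  # visible to search()

    def index(self, doc):
        # Accept the document without making it searchable yet.
        self._buffer.append(doc)

    def refresh(self):
        # Publish all buffered documents to the searchable set.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, term):
        # Naive whole-word, case-insensitive match over visible documents.
        return [d for d in self._searchable if term in d.lower().split()]


idx = ToyIndex()
idx.index("Solr and Elasticsearch both build on Lucene")
before = idx.search("lucene")  # the document is still only in the buffer
idx.refresh()
after = idx.search("lucene")   # the document is now visible
```

In a real engine the refresh happens periodically in the background, which is why results lag slightly behind writes instead of appearing instantly.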

Community

  • Solr has a much larger and more mature community of users, developers, and contributors. Extensive documentation, as well as problem solving cases, can be found online.
  • Although ES is relatively new, it has grown rapidly and is used by many well-known companies. Its community is smaller but active, so there is nothing to worry about. See also the Elasticsearch Chinese community.

Capacity Expansion (distributed)

ES was designed for distribution from the start, and its design hides the complexity of distribution itself. ES is clustered by default (even a single server counts as a cluster), and more servers can always be added to increase capacity or fault tolerance. Similarly, if the load is low, servers can easily be removed from the cluster to reduce costs.

SolrCloud is the distributed search solution provided by Solr, built on Solr and ZooKeeper. Use SolrCloud when you need large-scale, fault-tolerant, distributed indexing and retrieval, or when the index is large and search requests are highly concurrent. It is not necessary when a system has only a small amount of index data. The main idea is to use ZooKeeper as the configuration center of the cluster, centrally managing SolrCloud configuration files such as solrconfig.xml and schema.xml.

For most databases, scaling out (adding new machines) means that your program will have to change a lot to take advantage of these new devices. By contrast, ES is distributed by nature and knows how to manage nodes to provide high scalability and availability, which means your application doesn’t have to care.

Data volume

Distributed Elasticsearch certainly supports more data, but only relative to standalone Solr. Distributed SolrCloud supports large data volumes as well.

In this respect, Solr and ES are no different.

Data aggregation

This feature gets so many ☆ purely because it is really outstanding.

No matter which search engine we choose, we all have the same goal: to organize data so that it serves our needs. ES provides an excellent feature called aggregation. Aggregations bring statistical analysis into ES, making it easier for users to extract statistical indicators from big data.
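
As a rough illustration of what an aggregation does, the sketch below shows the request body of a simple terms aggregation as a Python dict, followed by a pure-Python function that computes the same kind of result locally. The field name `category`, the aggregation name `docs_per_category`, the sample documents, and the helper `terms_aggregation` are all assumptions made for this example; the local function only mimics the bucket shape of a response and does not talk to a cluster.

```python
from collections import Counter

# Request body for a terms aggregation: count documents per distinct
# value of a field. "size": 0 asks for aggregation results only, no hits.
# The field name "category" is a hypothetical example.
agg_request = {
    "size": 0,
    "aggs": {
        "docs_per_category": {"terms": {"field": "category"}},
    },
}

# Hypothetical sample documents.
docs = [
    {"category": "books", "price": 12},
    {"category": "music", "price": 8},
    {"category": "books", "price": 30},
]


def terms_aggregation(documents, field):
    """Local stand-in for a terms aggregation: one bucket per distinct
    value, ordered by descending document count."""
    counts = Counter(d[field] for d in documents if field in d)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common()]


buckets = terms_aggregation(docs, "category")
```

The point of the feature is that this grouping and counting runs inside the search engine, across all shards, rather than in application code.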

Solr also provides a similar feature, the Analytics Component.

This reflects the fact that ES and Solr are not that different.

Data model

Data in Solr is structured, and Solr defines this structure through the schema file schema.xml.

Documents in ES are schemaless, which means that not all documents need to have the same fields, and they are not limited to the same schema.

ES automatically detects whether a newly indexed document contains a field that does not yet exist in the mapping. If it does, ES adds the new field to the mapping. To do so, ES has to determine the field's type, so it makes a guess. For example, if the value is 7, ES assumes that the field is a long integer.

The downside of this approach is that the guess may be wrong. For example, after indexing the value 7, you might index the value Hello World into the same field, which would fail because it is a string rather than a long.

For a production environment, it is safest to define the required mappings before indexing any data.
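
The guessing behavior and the resulting conflict can be sketched with a small simulation. This is a simplified model under stated assumptions, not the actual ES implementation: the functions `guess_type` and `index_document` and the exception `MappingConflict` are hypothetical, and real ES applies additional rules (such as coercion) that are ignored here.

```python
class MappingConflict(Exception):
    """Raised when a value does not match the type already in the mapping."""


def guess_type(value):
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    raise TypeError(f"unsupported value: {value!r}")


def index_document(mapping, doc):
    """Add unknown fields to the mapping with a guessed type and
    reject values that contradict an existing entry."""
    for field, value in doc.items():
        guessed = guess_type(value)
        known = mapping.setdefault(field, guessed)  # first value fixes the type
        if known != guessed:
            raise MappingConflict(
                f"field {field!r} is mapped as {known}, got {guessed}"
            )


mapping = {}
index_document(mapping, {"count": 7})  # "count" is guessed to be a long

try:
    index_document(mapping, {"count": "Hello World"})
    conflict = False
except MappingConflict:
    conflict = True  # the string does not fit the long field
```

Defining the mapping up front avoids depending on whichever document happens to be indexed first.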

Configuration Management ☆

One strength of ES is that its default configuration is very programmer-friendly, making it easy to get started. Elasticsearch does a lot of things right out of the box without requiring user configuration. Its design philosophy is to minimize the possibility of user error. ES also places many restrictions on the runtime environment to prevent puzzling errors, since many users are not experts in these areas and could not work out the cause. ES frees us from complex configuration so that we can spend more time and energy on other things.

Solr's configuration is very flexible, which gives users plenty of room for error. On the other hand, Solr is more customizable, and almost anything can be configured. Developers can implement a new feature by adding processors and components to Solr without touching the Solr core code, and then configure the server's behavior through XML.

Ecosystem

Elasticsearch, Logstash, Kibana,

Applicable scenarios

On the Internet, many different scenarios are recommended for each of the two search engines. My own knowledge is limited, though, and I cannot tell which of these claims are accurate. One thing is certain: Solr and ES, as the two most popular search engines at present, are learning from each other and improving together. There is no obvious difference between them, and no scenario in which one is clearly applicable and the other is not. We can only pick the one that fits best.

With that said, I'll summarize here.

Conclusion

ES and Solr are not fundamentally different; after all, both are based on Lucene. Both search engines are still iterating constantly, and many features exist in both, although the implementations differ. Solr implements a lot of functionality itself, whereas ES implements much of it through plug-ins.

But since ES is the more popular of the two right now, I vote for ES!
