Solr's configuration lives in just three places:

  • solr.xml defines management, logging, sharding, and SolrCloud-related attributes; it is the cluster-level configuration file
  • solrconfig.xml defines the main configuration of a Solr core; it is the core-level configuration file
  • schema.xml defines the index structure, including the fields and their data types

core.properties

Each Solr core's directory contains a core.properties file, which defines the basic properties Solr needs to discover the core automatically.

By editing core.properties, you can change the core's basic information.

At startup, Solr discovers cores by looking for core.properties files. Once a core is discovered, Solr locates solrconfig.xml in the conf directory under that core's instance directory and initializes the core from it.

4.1. solrconfig.xml

Once Solr is running, open http://localhost:8983/solr to reach the Solr admin console. Select the collection1 core and click Files to see all of the core's configuration files.

After changing a configuration file, use the Reload button in the admin console to make Solr reload the core's configuration.

4.1.2. General configuration in solrconfig.xml

2.1.1 Lucene version of Solr

 <luceneMatchVersion>4.10.3</luceneMatchVersion>

This parameter tells Solr which Lucene version's behavior to match. It caps the Lucene version that components emulate and therefore controls whether certain newer features are used.

2.1.2 Adding dependent JAR packages

  <lib dir="${solr.install.dir:../../..}/contrib/extraction/lib" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../..}/dist/" regex="solr-cell-\d.*\.jar" />

  <lib dir="${solr.install.dir:../../..}/contrib/clustering/lib/" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../..}/dist/" regex="solr-clustering-\d.*\.jar" />

4.2 Request Processing Process

4.2.1 Introduction to Request Processing

Solr requests are made through HTTP. The query request is HTTP GET, and the index request is HTTP POST.

Solr request URL example

How Solr processes a request:

  • 1. The client sends an HTTP request to the server at the request address. For a GET request, the query parameters are passed in the query string.
  • 2. The servlet container receives the request and, based on the path after solr/, passes it to Solr's unified request dispatcher, a Java servlet filter that inspects the URL.
  • 3. The request dispatcher determines the name of the core to query from collection1 in the request path. It then locates the /select request handler defined in solrconfig.xml.
  • 4. The request handler invokes a chain of search components to process the client's request.
  • 5. Once the request has been processed, the response writer formats the query results and sends them to the client.
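Step 2 can be illustrated with the servlet configuration. The sketch below shows how the dispatcher is registered as a servlet filter in the web.xml of a Solr 4.x web application. The filter name and URL pattern are recalled from the stock distribution and should be treated as assumptions to check against your own installation.

```xml
<!-- Sketch of the dispatcher registration in WEB-INF/web.xml (Solr 4.x); verify against your distribution -->
<web-app>
  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
  </filter>
  <!-- Every request path passes through the filter, which extracts the core name and handler path -->
  <filter-mapping>
    <filter-name>SolrRequestFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>
</web-app>
```

With a mapping like this, a path such as /solr/collection1/select reaches the filter, which resolves collection1 as the core and /select as the request handler.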

4.2.2. Search handler (request handler)

The /select request handler is defined in solrconfig.xml as follows:

<requestHandler name="/select" class="solr.SearchHandler">
     <!-- solr.SearchHandler: the type of request handler used to process queries -->
     <!-- default values for query parameters can be specified, these will be overridden by parameters in the request -->
     <lst name="defaults">
       <!-- the default parameter list -->
       <str name="echoParams">explicit</str>
       <int name="rows">10</int>
       <str name="df">fuzzy_searchkey</str>
       <str name="wt">json</str>
       <str name="defType">methodParser</str>
       <str name="facet.limit">1000</str>
      
       <bool name="preferLocalShards">false</bool>
     </lst>
    
      <arr name="components">
        <!-- defines the default components -->
        <str>rewrite</str>
        <str>topIds</str>
        <str>attribute</str>
        <str>query</str>
        <str>sightShuffle</str>
        <str>shuffle</str>
        <str>facet</str>
        <str>expand</str>
        <str>mlt</str>
        <str>highlight</str>
        <str>stats</str>
        <str>aggregate</str>
        <str>response</str>
        <str>join</str>
        <str>qscore</str>
        <str>debug</str>
      </arr>
    </requestHandler>

The search handler processes a query request in four stages.

The <requestHandler> tag defines how a Solr request is processed, and within it you can adjust each stage of the request to fit your needs.

  • 1. Request parameter decoration

<lst name="defaults">: supplies default values for parameters the client did not specify

<lst name="invariants">: fixes parameters to constant values, overriding whatever the client sent

<lst name="appends">: adds extra parameters on top of those in the client request

  • 2. first-components

A set of optional search components that perform preprocessing tasks

  • 3. components

A chained set of search components, including at least the query component

  • 4. last-components

A set of optional search components that perform post-processing tasks
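The stages above can be combined in a single handler definition. The following is a minimal sketch, not the handler used in this article: the handler name /products and the inStock field are hypothetical, and the spellcheck component must already be defined elsewhere in solrconfig.xml for last-components to resolve.

```xml
<requestHandler name="/products" class="solr.SearchHandler">
  <!-- defaults: used only when the client omits the parameter -->
  <lst name="defaults">
    <int name="rows">10</int>
    <str name="wt">json</str>
  </lst>
  <!-- appends: added to the request in addition to whatever the client sent -->
  <lst name="appends">
    <str name="fq">inStock:true</str>
  </lst>
  <!-- invariants: fixed values that the client cannot override -->
  <lst name="invariants">
    <str name="facet">false</str>
  </lst>
  <!-- last-components: run after the default component chain -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

A request such as /products?q=phone&rows=50&facet=true would then run with rows=50 (the client value replaces the default), with fq=inStock:true added, and with facet forced back to false. An <arr name="first-components"> list is declared the same way.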

4.2.4 Extended search component

The <arr name="components"> list includes the six search components built into Solr, which are described below:

<arr name="components">
        <str>rewrite</str>
        <str>topIds</str>
        <str>attribute</str>
        <str>query</str>
        <str>sightShuffle</str>
        <str>shuffle</str>
        <str>facet</str>
        <str>expand</str>
        <str>mlt</str>
        <str>highlight</str>
        <str>stats</str>
        <str>aggregate</str>
        <str>response</str>
        <str>join</str>
        <str>qscore</str>
        <str>debug</str>
      </arr>
  • Query component (query)

The query component finds all documents that match the query. It is enabled by default, whereas the other components must be enabled explicitly. It parses the query and executes it against the active searcher. The query parser is selected with the defType parameter:

<str name="defType">edismax</str>
  • Facet component (facet)

Counts the results per value of each facet field. It is enabled with the facet parameter:

<str name="facet">true</str>
  • MoreLikeThis component (mlt)

When the MoreLikeThis component is enabled, Solr also returns other documents that are similar to the documents in the result set.

  • Highlight component (highlight)

Highlights the parts of each result document that match the query.

  • Stats component (stats)

The stats component computes statistics over numeric fields in the result set: the minimum, maximum, mean, and sum of each field.

  • Debug component (debug)

The debug component returns the parsed form of the executed query, along with a detailed explanation of how the relevance score was calculated for each document in the result set.

  • Add spell checking as a post-processor
     <!--<arr name="last-components">-->
       <!--<str>spellcheck</str>-->
     <!--</arr>-->
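
Apart from the query component, the built-in components stay idle until request parameters switch them on. As a rough sketch, the defaults below enable each of them for every request. The handler name /demo and the field names category, title, and price are hypothetical and would need to exist in schema.xml.

```xml
<requestHandler name="/demo" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- query component: choose the query parser -->
    <str name="defType">edismax</str>
    <!-- facet component: count results per value of the category field -->
    <str name="facet">true</str>
    <str name="facet.field">category</str>
    <!-- MoreLikeThis component: find documents similar to each result, compared on title -->
    <str name="mlt">true</str>
    <str name="mlt.fl">title</str>
    <!-- highlight component: return matching snippets from title -->
    <str name="hl">true</str>
    <str name="hl.fl">title</str>
    <!-- stats component: min, max, mean, and sum of price -->
    <str name="stats">true</str>
    <str name="stats.field">price</str>
    <!-- debug component: include the parsed query and score explanations -->
    <str name="debugQuery">true</str>
  </lst>
</requestHandler>
```

In practice debugQuery is usually passed on individual requests rather than set as a default, since it adds overhead to every response.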

4.3 Managing the searcher

solrconfig.xml contains several performance-tuning settings, including caching, lazy field loading, and new-searcher warm-up.

4.3.1 Creating a Searcher

All Solr queries are executed by a component called the searcher. Only one searcher is active in a Solr core at any time, and every query component in the search request handler sends its requests to that searcher.

Each time documents are committed to Solr, data has been added or deleted, so the current searcher's view of the index becomes stale and a new searcher must be built. Building a searcher is time-consuming, so Solr supports cache warming: the new searcher is warmed up in the background while the current active searcher keeps serving queries until the new one is ready.

4.3.2 Warm-up of new searcher

While the new searcher is being built, Solr keeps answering queries from the old data without a significant drop in performance (favoring availability), and it does not close the old searcher until the new searcher has been warmed up and is ready to serve queries.

Before the new searcher replaces the old one, it is warmed up in two main ways:

  • Autowarming: entries from the old searcher's caches are used to populate the new caches
  • Warm-up queries: a set of queries preconfigured in solrconfig.xml is executed and the results are loaded into the new caches

Cache warm-up configuration:

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <!--
           each <lst> defines one warm-up query
           <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
           <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>
          -->
      </arr>
    </listener>


These queries are executed whenever a newSearcher event fires, for example after a commit.
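
For reference, an active version of this listener might look like the sketch below. It assumes the price and weight fields from the commented-out sample exist in the schema. The firstSearcher listener is an addition not shown in the configuration above: it covers startup, when there is no old cache to warm from, and its query is only a placeholder.

```xml
<query>
  <!-- Runs after each commit opens a new searcher -->
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
      <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>
    </arr>
  </listener>
  <!-- Runs once when the first searcher is opened at startup (placeholder query) -->
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">*:*</str><str name="rows">10</str></lst>
    </arr>
  </listener>
</query>
```

Warm-up queries are most useful when they resemble real traffic, for example the most common sorts and filters, because only then do they load cache entries that later requests will hit.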

<useColdSearcher>

<useColdSearcher>false</useColdSearcher>

This setting decides what happens when a request arrives and no searcher is registered yet. If useColdSearcher is false, the request blocks until the searcher that is warming up has finished. If useColdSearcher is true, Solr immediately registers the searcher that is still warming (a cold searcher) and uses it to serve the request.

<maxWarmingSearchers>

<maxWarmingSearchers>2</maxWarmingSearchers>

A new commit may arrive while an earlier searcher is still warming up, which starts warming yet another searcher. If this happens frequently, search performance suffers badly. maxWarmingSearchers sets the maximum number of searchers that may be warming at the same time; once that limit is reached, further commit requests fail.

4.4 Cache Management

Solr provides a series of caches to optimize query performance.

4.4.1 Principles of Solr Caching

Solr caching involves four main aspects:

  • Cache size and cache eviction algorithm
  • Cache hit ratio and cache evictions
  • Cache object invalidation
  • Autowarming of new caches

Cache size

Solr caches should not be sized too large, because that hurts the performance of the Solr JVM. When a cache reaches its size limit, entries are evicted using either an LRU (least recently used) or an LFU (least frequently used) policy.

By default, Solr caches use the LRU eviction algorithm.

The filter cache can use the LFU eviction algorithm instead, through the solr.LFUCache class.

Evicted cache entries are reclaimed by the JVM garbage collector, so an oversized cache lengthens GC pauses. Keep cache sizes moderate so that entries are evicted and collected regularly.
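
The eviction policy is chosen through the cache's class attribute. The fragment below is a minimal sketch, assuming a Solr 4.x release that ships solr.LFUCache; the sizes are illustrative rather than recommendations.

```xml
<query>
  <!-- LFU: evicts the entries that have been used least often -->
  <filterCache class="solr.LFUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- LRU: evicts the entries that were used longest ago -->
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query>
```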

Cache object invalidation

All Solr cache objects are tied to a specific searcher instance. When that searcher is closed, its cache objects are invalidated.

Autowarmed cache

Every Solr cache supports the autowarmCount attribute, which sets the maximum number of entries, or the percentage of the old cache, used to warm the new cache.

  <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>

Here autowarmCount is 0, so no entries from the old cache are carried over.
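
To actually carry entries over, autowarmCount is set above zero. The values below are illustrative assumptions: an absolute count on one cache and a percentage on the other.

```xml
<query>
  <!-- Warm the new filter cache with up to 128 entries from the old one -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <!-- Warm the new query result cache with up to 25% of the old cache's entries -->
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="25%"/>
</query>
```

Higher values make the new searcher faster once it goes live but lengthen its warm-up, which raises the chance of reaching the maxWarmingSearchers limit when commits are frequent.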

4.4.2 Filter Cache (FQ Cache)

<filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>


Its main purpose is to reuse the set of documents matched by a filter across different queries. For example, if one query uses the filter fq=name:Tom, the cached result is reused by any other query with the same filter.

4.4.3 Caching Query Results

The query result cache stores query results. If the same query is executed again, its results are served directly from the cache.

<queryResultCache class="solr.LRUCache"
                      size="0"
                      initialSize="0"
                      autowarmCount="0"/>

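It works by using the query as the key and the list of matching document IDs as the value.

Query result window size

When a result is cached, Solr collects and caches a window of this many documents rather than only the rows requested, so that requests for neighboring pages can be answered from the cache.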

 <queryResultWindowSize>20</queryResultWindowSize>

Maximum number of documents cached per query result

 <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

Enable lazy field loading

 <enableLazyFieldLoading>false</enableLazyFieldLoading>
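
Taken together, the query-result settings sit inside the <query> section of solrconfig.xml. In the sketch below, the cache size of 512 and lazy loading set to true are illustrative assumptions that differ from the values shown above; the comments work through what the window size means for paging.

```xml
<query>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- A request for start=0 and rows=10 collects and caches the first 20 document IDs,
       so the follow-up request for start=10 and rows=10 is answered from the cache -->
  <queryResultWindowSize>20</queryResultWindowSize>
  <!-- Result lists longer than 200 documents are not cached -->
  <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
  <!-- Load only the requested stored fields; the rest are loaded on demand -->
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
</query>
```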