1. Problems with searching directly in SQL

Most applications need search functionality, and search often consumes a huge amount of resources. Running it against the database puts the database under heavy load and drags down application performance. For that reason, we generally move search to an external search server. Apache Solr is a popular open source search server.

2. Apache Solr

Solr is an open source search platform for building search applications. It is a standalone enterprise search server built on Lucene (a full-text search engine) that exposes a web-service-like API. Solr is enterprise-class, fast, and highly extensible. Users submit XML files in a prescribed format to the search server through HTTP requests to build indexes. They can also issue a lookup request through an HTTP GET operation and get the result back in XML format.
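To make the HTTP GET lookup concrete, here is a minimal sketch that only builds the request URL. The host and port are placeholders, collection1 is the default core used later in this article, and the class and method names are made up for illustration. The /select handler and the q and wt parameters are the standard Solr query parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SolrSelectUrlSketch {

    // Builds the URL for an HTTP GET against a core's /select handler.
    // wt=xml asks Solr to return the result in XML format.
    static String buildSelectUrl(String baseUrl, String core, String query) {
        try {
            String encoded = URLEncoder.encode(query, "UTF-8");
            return baseUrl + "/" + core + "/select?q=" + encoded + "&wt=xml";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder address; replace it with your own Solr server.
        System.out.println(
            buildSelectUrl("http://192.168.0.108:8080/solr", "collection1", "*:*"));
    }
}
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching documents as XML once the server from section 5 is running.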

3. Why Solr?

The first reason is that SQL databases perform poorly for search: queries usually need JOIN operations. The second reason is that document data is by nature loose text, and querying it requires LIKE. However, JOIN and LIKE are performance killers and are inconvenient in current database engines. Solr uses an inverted index underneath. This data structure is akin to a glorified dictionary: it maps each term to the documents that contain it.
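The "glorified dictionary" idea can be sketched in a few lines. This is a toy model, not Lucene's actual implementation: the class name, the whitespace-and-punctuation tokenization, and the integer document ids are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class InvertedIndexSketch {

    // term -> ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Splits the text into lowercase terms and records the document id under each term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // A lookup is a single dictionary access; no document text is scanned.
    SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Solr is a search server");
        index.add(2, "Lucene powers Solr");
        System.out.println(index.search("solr"));   // prints [1, 2]
        System.out.println(index.search("lucene")); // prints [2]
    }
}
```

The contrast with LIKE is that a lookup here costs one map access regardless of how much text is stored, whereas a LIKE '%term%' query has to scan the text of every row.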

4. Key features of Solr

  1. Standards-based open interface: Solr search server supports querying and retrieving results through XML, JSON and HTTP.
  2. Easy management: Solr can be managed through HTML pages, and Solr configuration is completed through XML.
  3. Scalability: Ability to efficiently replicate to another Solr search server.
  4. Flexible plug-in system: New functions can be easily added to Solr server in the form of plug-ins.
  5. Powerful data import capabilities: Databases and other structured data sources can be imported, mapped, and transformed.

5. Solr installation

5.1. Upload the installation package

5.2. Decompress the above content

Unpack Tomcat

tar -zxvf apache-tomcat-8.5.32.tar.gz

mv apache-tomcat-8.5.32 tomcat

Unpack Solr

tar -zxvf solr-4.10.3.tar

Unpack IK Analyzer

unzip IKAnalyzer.zip

5.2. Copy solr.war to tomcat/webapps

cd /usr/local/solr/solr-4.10.3/example/webapps/

cp solr.war /usr/local/solr/tomcat/webapps/

5.3. Start Tomcat so that it automatically unpacks solr.war

/usr/local/solr/tomcat/bin/startup.sh

5.4. Close Tomcat

/usr/local/solr/tomcat/bin/shutdown.sh

5.5. Go to webapps and delete the solr.war package

cd /usr/local/solr/tomcat/webapps/

rm -rf solr.war

5.6. Copy all the JARs in solr-4.10.3/example/lib/ext/ to the /usr/local/solr/tomcat/webapps/solr/WEB-INF/lib directory

cd /usr/local/solr/solr-4.10.3/example/lib/ext

cp * /usr/local/solr/tomcat/webapps/solr/WEB-INF/lib

5.7. Copy the solr folder from solr-4.10.3/example/ to /usr/local/solr/ and rename it solrhome

cd /usr/local/solr/solr-4.10.3/example/

cp -r solr /usr/local/solr/

cd /usr/local/solr

mv solr solrhome

5.8. Configure the Solr home location in tomcat/webapps/solr/WEB-INF/web.xml

cd /usr/local/solr/tomcat/webapps/solr/WEB-INF/

vim web.xml

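Add the solrhome location: set the value of the solr/home entry (an env-entry element, commented out in the shipped web.xml) to /usr/local/solr/solrhome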

6. Classification of fields

Fields are configured in schema.xml:

cd /usr/local/solr/solrhome/collection1/conf

vim schema.xml

6.1. What is a field

A field (also translated as "domain") is equivalent to a column of a database table. Users define the fields they need according to their business requirements and store data in them.

6.2. Classification of fields:

  1. field: the ordinary field, usable in most cases. It mainly defines the field's name and type.
  2. copyField: the copy field. source is the source field and dest is the target field. When data is added or updated, the content of the source field is copied to the target field.
  3. dynamicField: the dynamic field. In Solr, a field name must be defined before it is used, and using an undefined name reports an error. To use names that are not defined one by one, define a dynamic field with a wildcard pattern so that any name matching the pattern can be used.
  4. uniqueKey: the primary key field. Every document you add must include the primary key field, otherwise an error is reported. The default schema.xml already defines it (as id), so you do not need to add or modify this configuration.

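6.3. Common field properties

  1. indexed: whether the field is indexed, so that it can be searched
  2. stored: whether the original value is stored, so that it can be returned in results
  3. required: whether the field must be present in every document
  4. multiValued: whether the field can hold more than one value

To store data in Solr, define the relevant fields as field elements in schema.xml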

6.4. Common fields

<field name="item_goodsid" type="long" indexed="true" stored="true"/>
<field name="item_title" type="text_ik" indexed="true" stored="true"/>
<field name="item_price" type="double" indexed="true" stored="true"/>
<field name="item_image" type="string" indexed="false" stored="true" />
<field name="item_category" type="string" indexed="true" stored="true" />
<field name="item_seller" type="text_ik" indexed="true" stored="true" />
<field name="item_brand" type="string" indexed="true" stored="true" />
<field name="item_updatetime" type="date" indexed="true" stored="true" />

6.5. Copy fields

<field name="item_keywords" type="text_ik" indexed="true" stored="false" multiValued="true"/>
<copyField source="item_title" dest="item_keywords"/>
<copyField source="item_category" dest="item_keywords"/>
<copyField source="item_seller" dest="item_keywords"/>
<copyField source="item_brand" dest="item_keywords"/>
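What these copyField rules do at index time can be modelled roughly as follows. Solr performs the copy internally, so this sketch (with an invented class and method name) only illustrates the effect: the multi-valued item_keywords field ends up holding the values of every source field that is present in the document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CopyFieldSketch {

    // Returns the values that a multi-valued destination field would receive
    // from the given source fields. Sources missing from the document are skipped.
    static List<String> copyInto(Map<String, String> doc, List<String> sources) {
        List<String> dest = new ArrayList<>();
        for (String source : sources) {
            String value = doc.get(source);
            if (value != null) {
                dest.add(value);
            }
        }
        return dest;
    }

    public static void main(String[] args) {
        // Sample values are invented for the example.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("item_title", "Phone X");
        doc.put("item_brand", "Acme");

        List<String> sources = Arrays.asList(
            "item_title", "item_category", "item_seller", "item_brand");
        System.out.println("item_keywords=" + copyInto(doc, sources));
    }
}
```

Because item_keywords is declared with stored="false", the copied values can be searched but are not returned in results; a single query against item_keywords then covers title, category, seller, and brand.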

6.6. Dynamic fields

<dynamicField name="item_spec_*" type="string" indexed="true" stored="true" />
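The wildcard matching behind a dynamic field can be sketched as below. This is a simplified model with invented names, assuming the wildcard appears only at the start or at the end of the pattern; it is not Solr's own matching code.

```java
class DynamicFieldSketch {

    // Returns true if fieldName is covered by the dynamic field pattern.
    static boolean matches(String pattern, String fieldName) {
        if (pattern.startsWith("*")) {
            // Leading wildcard: compare the suffix.
            return fieldName.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("*")) {
            // Trailing wildcard: compare the prefix.
            return fieldName.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        // No wildcard: the names must be identical.
        return pattern.equals(fieldName);
    }

    public static void main(String[] args) {
        System.out.println(matches("item_spec_*", "item_spec_color")); // prints true
        System.out.println(matches("item_spec_*", "item_title"));      // prints false
    }
}
```

With the definition above, a document may carry fields such as item_spec_color or item_spec_size without either name being declared individually in schema.xml.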

7. SolrJ

Overview

SolrJ is the official Java client toolkit for Solr. After adding the SolrJ JAR package to our project, we call the SolrJ API to send commands to a remote Solr server, and the Solr server carries out the operations on the index library (add, modify, delete and query).

Steps

  1. Create a normal Java project
  2. Add the SolrJ-related JAR packages

Operations

Add or modify: when modifying, the previous content is deleted and the new content is then added

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.junit.Test;

public class SolrJTest {

    @Test
    public void testIndexCreateAndUpdate() throws Exception {
        // Create a connection to the Solr server.
        // http://192.168.0.108:8080/solr connects to the default instance, collection1.
        // http://192.168.0.108:8080/solr/collection2 connects to the collection2 instance.
        SolrServer solrServer = new HttpSolrServer("http://192.168.0.108:8080/solr");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "002");
        doc.addField("title", "descendant");
        doc.addField("price", "250");

        // Add or modify
        solrServer.add(doc);
        // Commit
        solrServer.commit();
    }
}
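The add-or-modify rule described above (a document added under an existing id replaces the previous one as a whole) can be modelled without a Solr server. The following is only an in-memory analogy with invented names, not how Solr stores documents.

```java
import java.util.HashMap;
import java.util.Map;

class AddOrUpdateSketch {

    // id -> document fields
    private final Map<String, Map<String, String>> docsById = new HashMap<>();

    // Adding a document whose id already exists discards the old document
    // entirely and stores the new one in its place.
    void add(Map<String, String> doc) {
        docsById.put(doc.get("id"), new HashMap<>(doc));
    }

    Map<String, String> get(String id) {
        return docsById.get(id);
    }

    int size() {
        return docsById.size();
    }

    public static void main(String[] args) {
        AddOrUpdateSketch index = new AddOrUpdateSketch();

        Map<String, String> first = new HashMap<>();
        first.put("id", "002");
        first.put("title", "descendant");
        first.put("price", "250");
        index.add(first);

        Map<String, String> second = new HashMap<>();
        second.put("id", "002");
        second.put("title", "updated title");
        index.add(second);

        System.out.println(index.size());     // prints 1
        System.out.println(index.get("002")); // the old price field is gone
    }
}
```

The practical consequence is that an update must resend every field of the document, because fields left out of the new version are not carried over from the old one.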

Query all

// Additional imports for this method:
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

@Test
public void testIndexSearch() throws Exception {
    SolrServer solrServer = new HttpSolrServer("http://192.168.0.108:8080/solr");

    // Create the query object
    SolrQuery query = new SolrQuery();
    // Set the query condition: match all documents
    query.setQuery("*:*");

    QueryResponse queryResponse = solrServer.query(query);
    SolrDocumentList results = queryResponse.getResults();

    System.out.println("count=" + results.getNumFound());
    for (SolrDocument result : results) {
        System.out.println("id=" + result.get("id"));
        System.out.println("title=" + result.get("title"));
    }
}

Delete

@Test
public void testIndexDelete() throws Exception {
    SolrServer solrServer = new HttpSolrServer("http://192.168.0.108:8080/solr");

    // Delete by id
    solrServer.deleteById("001");
    // Delete all
    solrServer.deleteByQuery("*:*");
    // Commit
    solrServer.commit();
}

8. Chinese word segmentation

About IK Analyzer

IK Analyzer is an open source, lightweight Chinese word segmentation toolkit written in Java. It is built on the open source project Lucene and combines dictionary-based word segmentation with grammar analysis. It also implements a simple algorithm for resolving segmentation ambiguity, which marks IK's move from plain dictionary segmentation toward simulated semantic segmentation. Purpose: it provides a degree of Chinese semantic analysis and segments Chinese text well.

Installation

8.1. Add IKAnalyzer2012FF_u1.jar to the lib directory of the Solr project

cd /usr/local/solr/IKAnalyzer/

cp IKAnalyzer2012FF_u1.jar /usr/local/solr/tomcat/webapps/solr/WEB-INF/lib/

8.2. Create the WEB-INF/classes folder

cd /usr/local/solr/tomcat/webapps/solr/WEB-INF/

mkdir classes

8.3. Put the extension dictionary, stop word dictionary, and configuration file in WEB-INF/classes of the Solr project

cd /usr/local/solr/tomcat/webapps/solr/WEB-INF/classes

cp /usr/local/solr/IKAnalyzer/IKAnalyzer.cfg.xml ./

cp /usr/local/solr/IKAnalyzer/ext_stopword.dic ./

mv ext_stopword.dic stopword.dic

8.4. Modify the IKAnalyzer.cfg.xml configuration file


stopword.dic already exists; ext.dic does not

Create ext.dic: touch ext.dic

stopword.dic: stop word dictionary

During word segmentation, any word that appears in the stop word dictionary is filtered out.

ext.dic: extension dictionary

Proper nouns go here. Even if an entry is not a word in natural language, once it is listed here it is kept as a single token during segmentation.
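The roles of the two dictionaries can be illustrated with a toy segmenter. This sketch uses greedy longest-match over a word set, which is a deliberate simplification and not IK Analyzer's actual algorithm. The class name is invented, and unspaced ASCII text stands in for Chinese text, which likewise has no spaces between words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DictionarySegmenterSketch {

    // At each position, take the longest dictionary word that starts there;
    // fall back to a single character. Tokens found in stopWords are dropped.
    static List<String> segment(String text, Set<String> dictionary, Set<String> stopWords) {
        int maxLen = 1;
        for (String word : dictionary) {
            maxLen = Math.max(maxLen, word.length());
        }
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + maxLen);
            // Shrink the candidate until it is a dictionary word or one character long.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            String token = text.substring(pos, end);
            if (!stopWords.contains(token)) {
                tokens.add(token);
            }
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("for"));

        // "ikanalyzer" plays the role of a proper noun listed in ext.dic.
        Set<String> withExt = new HashSet<>(Arrays.asList("ikanalyzer", "for", "solr"));
        System.out.println(segment("ikanalyzerforsolr", withExt, stopWords)); // prints [ikanalyzer, solr]

        // Without the extension entry, the proper noun falls apart into single characters.
        Set<String> withoutExt = new HashSet<>(Arrays.asList("for", "solr"));
        System.out.println(segment("ikanalyzerforsolr", withoutExt, stopWords));
    }
}
```

The first call shows both effects at once: the extension entry keeps the proper noun whole, and the stop word is removed from the output.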

8.5. Configure the tokenizer

Modify the schema.xml file in solrhome

cd /usr/local/solr/solrhome/collection1/conf

vim schema.xml

Add the following at the end of the file, before the closing </schema> tag

<fieldType name="text_ik" class="solr.TextField">
  <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>

Custom fields can then use the newly defined text_ik field type

<field name="content_ik" type="text_ik" indexed="true" stored="true"/>

9. Test:

Restart Tomcat after the configuration

Test successful!