I. Install the IK analyzer

IK analyzer project address: https://github.com/medcl/elas…

I am running Elasticsearch 5.6.9, so I install the matching plugin version, elasticsearch-analysis-ik-5.6.9:

$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.6.9/elasticsearch-analysis-ik-5.6.9.zip
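
Because the plugin version has to match the Elasticsearch version, the download URL can be derived from a single version string. The following is a minimal sketch, assuming the release URL pattern shown above also holds for other versions; the function name ik_plugin_url is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: build the IK release URL for a given Elasticsearch version.
# Assumes the URL pattern of the 5.6.9 release applies to the version passed in.
ik_plugin_url() {
    version="$1"
    printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' "$version" "$version"
}

# Print the URL for the version used in this post.
ik_plugin_url 5.6.9
```

The printed URL can then be passed to ./bin/elasticsearch-plugin install.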

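Alternatively, copy an already unpacked copy of the plugin into the Elasticsearch installation by hand. Elasticsearch only loads plugins found under its plugins directory, so adjust the destination path if needed: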

$ cp -r /mnt/hgfs/elasticsearch-analysis-ik-5.6.9/ /opt/elasticsearch-5.6.9/

If the startup log contains the line loaded plugin [analysis-ik], the plugin was loaded successfully:

[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [aggs-matrix-stats]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [ingest-common]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [lang-expression]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [lang-groovy]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [lang-mustache]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [lang-painless]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [parent-join]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [percolator]
[2018-06-15T09:30:34,671][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded module [reindex]
...
[2018-06-15T09:30:34,672][INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded plugin [analysis-ik]
[2018-06-15T09:30:37,398][INFO ][o.e.d.DiscoveryModule] [_JHOtaZ] using discovery type [zen]
[2018-06-15T09:30:38,365][INFO ][o.e.n.Node] [_JHOtaZ] initialized
[2018-06-15T09:30:38,365][INFO ][o.e.n.Node] [_JHOtaZ] starting ...
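
Rather than scanning the startup log by eye, the check can be scripted. This is a small sketch only: has_ik_plugin is a name invented here, and the log text is read from standard input so that no particular log file location is assumed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the log on stdin reports the IK plugin as loaded.
has_ik_plugin() {
    # -q: no output, -i: ignore case, -F: fixed string (the brackets are literal)
    grep -qiF 'loaded plugin [analysis-ik]'
}

# Example with a single log line in the format shown above.
if echo '[INFO ][o.e.p.PluginsService] [_JHOtaZ] loaded plugin [analysis-ik]' | has_ik_plugin; then
    echo "analysis-ik is loaded"
else
    echo "analysis-ik is NOT loaded"
fi
```

In practice the function would be fed the real Elasticsearch log instead of the sample line.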

Restart Elasticsearch so that the plugin is picked up:

$ jps
$ kill pid
$ ./bin/elasticsearch -d
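
The process ID for the kill step can be picked out of the jps output, which prints one "pid name" pair per line. The sketch below assumes the Elasticsearch process is listed under the name Elasticsearch; es_pid is a hypothetical helper name and the pids in the sample input are made up.

```shell
#!/bin/sh
# Hypothetical helper: read jps-style output on stdin and print the pid
# of every line whose second field is exactly "Elasticsearch".
es_pid() {
    awk '$2 == "Elasticsearch" { print $1 }'
}

# Demonstration on sample input (made-up pids, no running JVM required).
printf '2301 Elasticsearch\n2467 Jps\n' | es_pid
```

On a real machine this could be combined as kill "$(jps | es_pid)", followed by ./bin/elasticsearch -d.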

II. Usage

Tutorial: http://keenwon.com/1404.html

Elasticsearch built-in analyzers:

  1. standard: naively splits Chinese text one character at a time, so it is widely applicable but has low accuracy.
  2. english: smarter for English text; it handles singular and plural forms and letter case, and filters stopwords (e.g. “the”).
  3. chinese: gives very poor results for Chinese text.

1. Verify the segmentation result with _analyze

IK provides two analyzers:

  1. ik_max_word splits the text at the finest granularity and exhausts every possible combination. For example, it breaks “中华人民共和国国歌” (“National Anthem of the People’s Republic of China”) into “中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和, 国国, 国歌”.
  2. ik_smart splits the text at the coarsest granularity. For example, it breaks “中华人民共和国国歌” into “中华人民共和国, 国歌”.

# Create the test index
$ curl -XPUT 'http://127.0.0.1:9200/test'

# Verify the ik_max_word segmentation
$ curl 'http://127.0.0.1:9200/test/_analyze?analyzer=ik_max_word&pretty=true' -d '{"text":"中华人民共和国"}'

# Verify the ik_smart segmentation
$ curl 'http://127.0.0.1:9200/test/_analyze?analyzer=ik_smart&pretty=true' -d '{"text":"中华人民共和国"}'
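
To compare both analyzers on the same text, the two requests can be generated in a loop. This is only a sketch: analyze_url is a helper invented for the example, the sample text is an assumption, and the host, port, and index name are the ones used above. It prints the curl commands instead of running them, so it works without a running node.

```shell
#!/bin/sh
# Hypothetical helper: build the _analyze URL for a given analyzer name.
analyze_url() {
    printf 'http://127.0.0.1:9200/test/_analyze?analyzer=%s&pretty=true\n' "$1"
}

# Print one curl command per IK analyzer, using the same sample text for both.
for analyzer in ik_max_word ik_smart; do
    printf "curl '%s' -d '%s'\n" "$(analyze_url "$analyzer")" '{"text":"中华人民共和国"}'
done
```

Piping the printed lines to sh would send the requests to a running node.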

2. To be updated…