Abstract

MongoDB's indexes are powerful and work much like the indexes of a relational database. This article covers the basics of MongoDB indexes and the mistakes I made when using them.

Index types

  1. Single field index

  2. Compound index. The fields of a compound index should be ordered as follows: exact-match fields (= XXX) first, then sort fields (so the sort uses the index instead of memory), then range-query fields.

    Take the query db.book.find({company: "XXX", age: {$lt: 30}}).sort({name: 1}) and append .explain("executionStats") to it. Four values in the output matter most:

    executionTimeMillis: time taken to execute the query
    nReturned: number of documents returned
    totalKeysExamined: number of index keys scanned
    totalDocsExamined: number of documents scanned

    Ideally nReturned = totalKeysExamined and no documents are scanned at all, which means the query is answered from the index alone without fetching the data behind it.

    The next best case is nReturned = totalKeysExamined = totalDocsExamined. If the query has a sort, totalKeysExamined may exceed nReturned, as long as nReturned = totalDocsExamined still holds, in order to keep the sort out of memory. Sorting a large amount of data in memory is very expensive.
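The three cases above can be sketched as a small classifier. This is only an illustration: `classifyExecutionStats` and its labels are hypothetical names, and the input is assumed to be an object holding the three counters copied from the explain output.

```javascript
// Hypothetical helper: labels the relationship between the three counters
// reported by explain("executionStats"). Not part of MongoDB.
function classifyExecutionStats({ nReturned, totalKeysExamined, totalDocsExamined }) {
  // No index keys touched but documents read: the whole collection was scanned.
  if (totalKeysExamined === 0 && totalDocsExamined > 0) return "collection scan";
  // More documents fetched than returned: some were read only to be discarded.
  if (totalDocsExamined > nReturned) return "documents over-scanned";
  // Answered from the index alone.
  if (totalDocsExamined === 0 && totalKeysExamined === nReturned) return "covered by index";
  // nReturned = totalKeysExamined = totalDocsExamined.
  if (totalKeysExamined === nReturned) return "index exact";
  // nReturned = totalDocsExamined < totalKeysExamined: the trade made to avoid a sort.
  if (totalKeysExamined > nReturned) return "extra keys scanned";
  return "unclassified";
}

console.log(classifyExecutionStats({ nReturned: 10, totalKeysExamined: 25, totalDocsExamined: 10 }));
```

The order of the checks matters: over-scanned documents are reported first because they are the most expensive symptom.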

    If we create the compound index db.book.createIndex({company:1,age:1,name:1}), then nReturned = totalKeysExamined = totalDocsExamined. Because the query uses the index, no additional documents are scanned. But the plan contains a SORT stage, which is an in-memory sort, and in-memory sorting is slow on large data volumes.

    Now create a second index, db.book.createIndex({company:1,name:1,age:1}). The planner selects the new index:

     "indexBounds" : {
             "company" : [
                     "[\"a\", \"a\"]"
             ],
             "name" : [
                     "[MinKey, MaxKey]"
             ],
             "age" : [
                     "[-1.#INF, 30.0)"
             ]
     },

    The plan with the SORT stage now appears among the rejected plans:

     "rejectedPlans" : [
             {
                     "stage" : "SORT",
                     "sortPattern" : {
                             "name" : 1
                     },

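    So nReturned = totalDocsExamined < totalKeysExamined: more index keys are scanned, but it is worth it because the in-memory sort is avoided. This is why the fields of a compound index should be ordered as stated at the beginning: exact-match fields (= XXX), then sort fields (to sort through the index instead of in memory), then range-query fields.

    A compound index also serves queries on its leading fields. For example, the index {name:1,address:1} supports both of these queries: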

    db.book.find({name:"xxx"})
    db.book.find({name:"xxx",address:"xxx"})

    However, if the query matches age exactly instead of by range, the planner goes back to the old index {company:1,age:1,name:1}. No SORT stage is needed, because with company and age both fixed, the index entries are already ordered by name.

     db.book.find({company:'a',age:30}).sort({name:1}).explain("executionStats")
      "indexBounds" : {
              "company" : [
                      "[\"a\", \"a\"]"
              ],
              "age" : [
                      "[30.0, 30.0]"
              ],
              "name" : [
                      "[MinKey, MaxKey]"
              ]
      },
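The effect of field order can be reproduced with a toy model in plain JavaScript. This is a sketch under loud assumptions: an index is modeled as the collection sorted by its key fields, the sample documents are made up, and `buildIndex`, `scan`, and `isSorted` are hypothetical helpers, not MongoDB APIs.

```javascript
// Made-up sample data for illustration only.
const books = [
  { company: "a", age: 25, name: "delta" },
  { company: "a", age: 28, name: "alpha" },
  { company: "a", age: 30, name: "charlie" },
  { company: "a", age: 30, name: "bravo" },
  { company: "a", age: 35, name: "echo" },
  { company: "b", age: 20, name: "foxtrot" },
];

// Toy index: the documents sorted by the given fields, in order.
function buildIndex(docs, fields) {
  return [...docs].sort((x, y) => {
    for (const f of fields) {
      if (x[f] < y[f]) return -1;
      if (x[f] > y[f]) return 1;
    }
    return 0;
  });
}

// Walk the index in key order and keep matching entries.
// filter() preserves index order, which is all this sketch needs.
function scan(index, predicate) {
  return index.filter(predicate).map((d) => d.name);
}

function isSorted(names) {
  return names.every((n, i) => i === 0 || names[i - 1] <= n);
}

const companyAgeName = buildIndex(books, ["company", "age", "name"]);
const companyNameAge = buildIndex(books, ["company", "name", "age"]);

const rangeQuery = (d) => d.company === "a" && d.age < 30;
const exactQuery = (d) => d.company === "a" && d.age === 30;

const rangeViaAgeFirst = scan(companyAgeName, rangeQuery);  // ["delta", "alpha"]
const rangeViaNameFirst = scan(companyNameAge, rangeQuery); // ["alpha", "delta"]
const exactViaAgeFirst = scan(companyAgeName, exactQuery);  // ["bravo", "charlie"]

console.log(isSorted(rangeViaAgeFirst), isSorted(rangeViaNameFirst), isSorted(exactViaAgeFirst));
```

With the range condition on age, only the index that puts name before age yields results already ordered by name. With an exact match on age, the {company, age, name} order is enough.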
  3. Multikey index, such as an index on an array field: docs.mongodb.com/manual/core…

  • A multikey index cannot turn every condition on an array field into index bounds. One condition is used for the index scan, and the remaining conditions are applied afterwards as a filter.

  • $elemMatch

    {field: {$elemMatch: {$gt: 9, $lt: 11}}} requires a single array element to satisfy both conditions. Without it, {field: {$gt: 9, $lt: 11}} also matches when, for example, the first element satisfies $gt: 9 and the second satisfies $lt: 11, so the index can use only one of the two boundary conditions.

  • A compound index allows only one array field. Because MongoDB is schema-free, that array field may be a different field in different documents, as long as each single document has an array in only one of the indexed fields.
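The difference between the two query forms can be sketched in plain JavaScript. The two functions are hypothetical stand-ins for the matching rules on an array of numbers; they do not call MongoDB.

```javascript
// The two conditions from the text.
const gt9 = (v) => v > 9;
const lt11 = (v) => v < 11;

// Models {field: {$elemMatch: {$gt: 9, $lt: 11}}}:
// a single element has to satisfy both conditions.
function matchesWithElemMatch(values) {
  return values.some((v) => gt9(v) && lt11(v));
}

// Models {field: {$gt: 9, $lt: 11}} on an array field:
// each condition may be satisfied by a different element.
function matchesWithoutElemMatch(values) {
  return values.some(gt9) && values.some(lt11);
}

console.log(matchesWithElemMatch([5, 15]));    // false
console.log(matchesWithoutElemMatch([5, 15])); // true
```

The array [5, 15] is the interesting case: 15 satisfies $gt: 9 and 5 satisfies $lt: 11, so the plain form matches even though no element lies between 9 and 11. That is why the two bounds cannot be intersected on the index without $elemMatch.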

  4. Unique index: db.book.createIndex({"name":1},{"unique":true}). A unique index validates data on write and rejects duplicate values.

  5. Sharded cluster index. Indexes are created on the individual shards, not globally. In a sharded cluster, only _id and the shard key can have a unique index, because enforcing uniqueness on any other field would require communication between shards, which goes against the design of sharding. Avoid unique indexes on other fields.
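A toy model shows why a uniqueness check that runs on each shard separately cannot guarantee uniqueness across the cluster. Everything here is an assumption made for illustration: the `Shard` class, the routing rule on `name`, and the `email` field are hypothetical and do not reflect how MongoDB is implemented.

```javascript
// Toy shard: enforces uniqueness of `email` only over its own documents.
class Shard {
  constructor() {
    this.docs = [];
    this.seenEmails = new Set(); // stands in for a local unique index on email
  }
  insert(doc) {
    if (this.seenEmails.has(doc.email)) {
      throw new Error("duplicate key on this shard: " + doc.email);
    }
    this.seenEmails.add(doc.email);
    this.docs.push(doc);
  }
}

const shards = [new Shard(), new Shard()];

// Hypothetical range routing on the shard key `name`.
const route = (doc) => (doc.name < "n" ? shards[0] : shards[1]);

function tryInsert(doc) {
  try {
    route(doc).insert(doc);
    return true;
  } catch (e) {
    return false;
  }
}

const results = [
  tryInsert({ name: "alice", email: "x@example.com" }), // shard 0 accepts
  tryInsert({ name: "zoe", email: "x@example.com" }),   // shard 1 accepts the same email
  tryInsert({ name: "adam", email: "x@example.com" }),  // shard 0 rejects
];

console.log(results); // [ true, true, false ]
```

The second insert succeeds because shard 1 never sees the document stored on shard 0. Catching it would require the shards to talk to each other on every write.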

Notes

  1. When a collection has several indexes, more than one of them may be usable for a given query.

    MongoDB first runs the query against each candidate index in parallel. The plan that returns 100 results soonest is taken as the best, and MongoDB remembers the index chosen for this kind of query and uses it for later queries of the same kind. The remembered choice is reset when the indexes change.
  2. Suppose the compound index {name:1,address:1,email:1} exists.

    A new query {name: XXX, address: XXX, phone: XXX} can use the existing compound index through its leading name and address fields, so you do not have to create a separate index for it. A dedicated index would make the query faster, but the extra index slows down inserts.

    To decide which index to create, measure how much data the first two fields filter out and how much of the remainder the phone field still has to eliminate.
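Which part of an existing compound index a new equality query can use may be sketched as follows. `usablePrefix` and `residualFields` are hypothetical helpers, and the model is deliberately simplified: it considers equality conditions only and ignores everything else a real query planner takes into account.

```javascript
// Returns the leading index fields that an equality query can use.
// The scan stops at the first index field the query does not constrain.
function usablePrefix(indexFields, queryFields) {
  const wanted = new Set(queryFields);
  const prefix = [];
  for (const field of indexFields) {
    if (!wanted.has(field)) break;
    prefix.push(field);
  }
  return prefix;
}

// Query fields the index cannot narrow; they are checked on the fetched documents.
function residualFields(indexFields, queryFields) {
  const used = new Set(usablePrefix(indexFields, queryFields));
  return queryFields.filter((f) => !used.has(f));
}

const index = ["name", "address", "email"];
const query = ["name", "address", "phone"];

console.log(usablePrefix(index, query));   // [ 'name', 'address' ]
console.log(residualFields(index, query)); // [ 'phone' ]
```

Here the existing index narrows the data by name and address, and phone is left as a residual filter. How many documents that residual filter still has to discard is the number worth measuring before adding another index.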

  3. MongoDB uses the term scalar field for a field that is neither an array nor an embedded document. An index on such fields behaves the same as in a relational database and needs no special treatment.

    In my opinion, this talk puts too much emphasis on reading the MongoDB source code to solve problems, because the root cause can be found through the analysis above: yq.aliyun.com/articles/74…

#array index#

MongoDB creates an index entry for each element in an array, so be very careful when indexing an array field: the index can easily become very large. MongoDB also supports querying a specific position in an array.

test.book
{
	_id:1,
	name:"english",
	address:["addr1","addr2"]
}

Consider db.book.find({"address.0":"addr1"}). Even with an index on address, this query does not use the index. The index only helps queries that match against the array as a whole, that is, any of its elements. MongoDB has no magic that builds the index while also retaining the position of each element.
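A toy model of the index entries makes both points visible: one entry per array element, and no record of where the element sat in the array. `buildMultikeyEntries` is a hypothetical helper, and the second document is made up for illustration.

```javascript
// Toy multikey index: one { key, id } entry per array element.
// The element's position in the array is not stored.
function buildMultikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, id: doc._id });
  }
  // Keep entries ordered by key, then by document id.
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : a.id - b.id));
}

const docs = [
  { _id: 1, name: "english", address: ["addr1", "addr2"] },
  { _id: 2, name: "math", address: ["addr2", "addr3", "addr4"] }, // made-up document
];

const entries = buildMultikeyEntries(docs, "address");

// Lookup by value works for any element, whatever its position.
const idsFor = (key) => entries.filter((e) => e.key === key).map((e) => e.id);

console.log(entries.length);  // 5 entries for 2 documents
console.log(idsFor("addr2")); // [ 1, 2 ]
```

Two documents produce five entries, which is how array indexes grow quickly. Since an entry holds only the value and the document id, a condition on "address.0" has nothing in these entries to match against.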

#shard key index#

  • If the collection already contains data, you must create an index on the shard key fields before sharding the collection.
  • If the collection is empty, MongoDB automatically creates an index on the shard key fields when the collection is sharded.
sh.shardCollection("test.book",{name:1,address:1})

The following index is created automatically:

{name:1,address:1}

mongo index VS cassandra secondary index

1. Query process. A Cassandra query uses the partition key to find the corresponding partition, and the data inside the partition is sorted by the clustering key. Note that this ordering is how the table itself is stored; it is not an index.

A MongoDB (sharded cluster) query first determines the target node from the given shard key, then sends the request to that node, which performs the lookup. Suppose your query is

db.book.find({name:"xxx",address:"xxx"})

and the shard key is name, with a separate single-field index on address. The query then actually hits the single-field index on address, instead of first narrowing the data by name as you might expect. This is very different from Cassandra.

2. Scope. A Cassandra secondary index is local to each node. A MongoDB index covers the whole collection, but in a sharded cluster the indexes are also created independently on each shard.

reference

www.mongoing.com/eshu_explai…