About the author

Liu Cheng is a technical expert in Ctrip's Hotel R&D Department. He joined Ctrip in 2014, has been responsible for the development of several order-processing projects, and specializes in solving production performance problems.

The business scenario

As order volume grew, the read and write capacity of a single database became a bottleneck, so sharding the database was the logical next step. Writes after sharding are straightforward: route each row to a shard by taking the modulus of the sharding key. But what about queries on other dimensions?
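To make the write path concrete, here is a minimal sketch of modulus-based routing; the shard count of 8 and the use of a numeric user ID as the sharding key are illustrative assumptions, not the actual production setup.

```java
// Minimal sketch of modulus-based shard routing; SHARD_COUNT and the choice of
// sharding key are illustrative assumptions.
public class ShardRouter {

    private static final int SHARD_COUNT = 8; // assumed number of physical databases

    /** Returns the index of the physical database that owns this sharding key. */
    public static int shardFor(long shardingKey) {
        return (int) Math.floorMod(shardingKey, (long) SHARD_COUNT);
    }

    public static void main(String[] args) {
        long userId = 123456789L;
        System.out.println("user " + userId + " -> order_db_" + shardFor(userId));
    }
}
```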

One way is to query each shard in turn and aggregate the results in memory. The drawbacks are obvious: for shards that return no data, the query not only adds needless load on the application server but also places an unnecessary burden on valuable database resources.

As for query performance, concurrent queries across shards on separate threads can help, but multithreaded programming and aggregating the result sets returned by each database add complexity and are error-prone. To get a feel for this, consider how a paginated query would have to be implemented after sharding.

We therefore chose to build a real-time index over the sharded databases and to encapsulate queries in an independent web service, making queries convenient for business applications while preserving performance. The question then becomes: how do you build an efficient index over shards?

Selection of indexing technology

The real-time index contains the columns used in common queries, such as user ID, user phone number, and user address, and it is replicated to a separate storage medium in real time. A query checks the index first; if the index already contains all the required columns, the data is returned directly. If additional columns are needed, a secondary query can be made against the original database, and it is also efficient because the index has already identified the exact shard to hit.

Why not use database indexes

A database index is a copy of selected columns of a table.

Queries against an index are efficient because its rows contain the low-level disk block addresses of, or direct links to, the original rows. The advantage is that the database's index mechanism is stable, reliable, and efficient. The drawback is that as the number of query scenarios grows, so does the number of indexes.

As the business grew, an order came to have thousands of attributes, dozens of high-frequency query dimensions, and hundreds of combined variations of those dimensions. Indexes are not free: every insert, update, and delete requires extra writes and physical storage, and the more indexes there are, the higher the maintenance cost. So what is the alternative?

Choosing an open-source search engine

The first candidates that came to mind were the open-source search engines Apache Solr and Elasticsearch.

Solr is an open-source search platform built on top of the Java library Lucene. It exposes Lucene's search capabilities in a more user-friendly way. Having been around for about a decade, it is a very mature product, providing distributed indexing, replication, load-balanced querying, and automatic failover and recovery.

Elasticsearch is also a distributed RESTful search engine built on top of Lucene. It exposes a distributed full-text search engine through RESTful interfaces and schema-free JSON documents. Each index can be divided into multiple shards, and each shard can have multiple replicas.

Each has its strengths and weaknesses. In installation and configuration, Elasticsearch, the newer product, is lighter and easier to install and use. In search, both provide full-text search, but Elasticsearch performs better on analytical queries. In distribution, Elasticsearch supports multiple shards per server and automatically rebalances shards across machines as servers are added. In community and documentation, Solr has accumulated more, thanks to its longer history.

According to Google Trends, Elasticsearch also attracts more interest than Solr.

We chose Elasticsearch for its lightness, ease of use, and better support for distribution; the entire installation package is only tens of megabytes.

Implementing data replication

To avoid reinventing the wheel, we first looked for existing components. Because the database is SQL Server, we found no suitable open-source option. SQL Server itself can track inserts, updates, and deletes in real time and write the changed data into a separate table, but it neither pushes that data to Elasticsearch automatically nor provides an API for applications to consume it, so we turned to implementing replication at the application level.

Why not replicate in the data access layer

The first candidate on our radar was the data access layer, which looked like a promising entry point: whenever an application inserts, updates, or deletes a row, it also writes the change to Elasticsearch in real time. After weighing the following points, we decided to take a different approach:

  • Dozens of applications access the databases, and dozens of developers change code in the data access layer. Implementing replication there would mean manually reviewing and modifying more than a decade of existing code, which is expensive and error-prone;

  • Writing to Elasticsearch on every insert, update, and delete couples business logic tightly to replication: instability in Elasticsearch or any related component would make business processing unstable. Write to Elasticsearch asynchronously on a separate thread? Then how do you handle application restarts? Add piles of exception handling and retry logic, package it as a JAR referenced by dozens of applications, and accept that one small bug could destabilize all of them?

Scanning the database in real time

At first glance this looks like an inefficient scheme, but combined with the following practical considerations it turns out to be simple, stable, and efficient:

  • Zero coupling. No related application needs to be modified, so business processing efficiency and stability are unaffected.

  • Batched writes to Elasticsearch. Because the scan returns data in batches, we can also write to Elasticsearch in batches, avoiding the frequent cache refreshes caused by many small individual requests (see the sketch after this list).

  • Millisecond-level concurrent writes. A scan that returns no new data still costs the database something, but because the concurrency and volume of writes in our scenario are very high, this extra cost is acceptable.

  • Data is never deleted. A scan cannot detect deleted records, but order-related records must be retained anyway, so deletion simply does not occur.
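To make the scanning scheme concrete, here is a minimal sketch of one scan-and-replicate cycle, assuming the Elasticsearch 7.x Java high-level REST client; the index name orders, the field names, and the hard-coded rows standing in for a batch returned by the database scan are all illustrative.

```java
import java.util.List;
import java.util.Map;

import org.apache.http.HttpHost;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class OrderIndexReplicator {

    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            // In production this batch comes from the periodic scan of recently
            // changed rows; it is hard-coded here to keep the sketch self-contained.
            List<Map<String, Object>> changedRows = List.of(
                    Map.of("orderId", "1001", "userId", "42", "userPhone", "13800000000", "status", "CONFIRMED"),
                    Map.of("orderId", "1002", "userId", "43", "userPhone", "13900000000", "status", "PAID"));

            // One bulk request per scanned batch: each row becomes an upsert, i.e.
            // update the document if it exists, otherwise insert it.
            BulkRequest bulk = new BulkRequest();
            for (Map<String, Object> row : changedRows) {
                bulk.add(new UpdateRequest("orders", (String) row.get("orderId"))
                        .doc(row)
                        .docAsUpsert(true));
            }
            BulkResponse response = client.bulk(bulk, RequestOptions.DEFAULT);
            System.out.println("bulk upsert had failures: " + response.hasFailures());
        }
    }
}
```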

Improving Elasticsearch write throughput

Because this is real-time replication from the database, the efficiency and concurrency requirements are high. Here are some of the optimizations we applied when writing to Elasticsearch:

  • Use upsert instead of select + insert/update. Similar to MySQL's REPLACE INTO, an upsert issues a single request instead of two, avoiding the extra round trip and its doubled performance cost.

  • Use BulkRequest to combine multiple requests into one. Elasticsearch handles batched requests very efficiently; for example, the translog is persisted per request by default, so combining many operations into one request greatly reduces the number of disk writes and improves write performance. The right batch size depends on server configuration, index structure, and data volume, and can be tuned in production through dynamic configuration.

  • For indexes with low real-time requirements, set index.refresh_interval to 30 seconds (the default is 1 second). Elasticsearch then creates a new segment every 30 seconds instead of every second, reducing subsequent flush and merge pressure (the index-creation sketch after this list shows this and the following settings together).

  • Define the index schema in advance and strip out features you do not need. For example, the default string mapping creates both a keyword and a text index: the former suits exact matches on short values such as mailing addresses, server names, and labels, while the latter suits searching within longer content such as email bodies or product descriptions. Choose one based on the actual query scenario.

For fields whose query results do not need relevance scoring, set norms: false.

For fields that are never used in phrase queries, set index_options: freqs.

  • For indexes that can tolerate some data loss, or where a disaster recovery cluster is available, set index.translog.durability to async (the default is request). Persisting Lucene segments to disk is a relatively expensive operation, so Elasticsearch first records each operation in the translog and flushes to Lucene in batches. With asynchronous translog durability, the translog itself is no longer fsynced on every request, which improves write performance. The effect is most noticeable during data initialization; for ongoing real-time writes, BulkRequest already covers most scenarios.
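Putting these write-side settings together, here is a minimal sketch of creating such an index, again assuming the 7.x Java high-level REST client; the index name, field names, shard count, and concrete values are illustrative (the replica count of 2 matches the disaster recovery setup described below).

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.XContentType;

public class OrderIndexCreator {

    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            CreateIndexRequest request = new CreateIndexRequest("orders");

            request.settings(Settings.builder()
                    .put("index.number_of_shards", 12)           // illustrative; sized for future data volume
                    .put("index.number_of_replicas", 2)          // two replicas, as in the disaster recovery setup
                    .put("index.refresh_interval", "30s")        // new segment every 30s instead of every 1s
                    .put("index.translog.durability", "async")); // do not fsync the translog on every request

            // Choose keyword OR text per field instead of the default dual mapping,
            // disable norms where scoring is irrelevant, and keep only term
            // frequencies where phrase queries are never used.
            request.mapping(
                    "{\n"
                    + "  \"properties\": {\n"
                    + "    \"userId\":      { \"type\": \"keyword\" },\n"
                    + "    \"userAddress\": { \"type\": \"keyword\", \"norms\": false },\n"
                    + "    \"remark\":      { \"type\": \"text\", \"norms\": false, \"index_options\": \"freqs\" }\n"
                    + "  }\n"
                    + "}",
                    XContentType.JSON);

            client.indices().create(request, RequestOptions.DEFAULT);
        }
    }
}
```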

Improving Elasticsearch read performance

To improve query performance, we made the following optimizations:

  • Set _routing to the field most frequently used in queries. By default, Elasticsearch routes a document to a shard by hashing its ID and taking the modulus, so a query must fan out to all shards; setting _routing to the most-queried field guarantees that a query on that field needs to hit only a single shard.

Write:
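(A minimal sketch, assuming the 7.x Java high-level REST client; the index name, routing value, and field names are illustrative.)

```java
import java.io.IOException;

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public class RoutedWriter {

    /** Index an order with explicit routing so all documents sharing the same userId land on the same shard. */
    public static void writeOrder(RestHighLevelClient client, String orderId, String userId) throws IOException {
        IndexRequest request = new IndexRequest("orders")
                .id(orderId)
                .routing(userId) // _routing = the most frequently queried field
                .source("userId", userId,
                        "userPhone", "13800000000",
                        "status", "CONFIRMED");
        client.index(request, RequestOptions.DEFAULT);
    }
}
```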

Query:
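(The corresponding query, under the same assumptions.)

```java
import java.io.IOException;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class RoutedReader {

    /** Query with the same routing value, so only the shard that owns this userId is searched. */
    public static SearchResponse findOrdersByUser(RestHighLevelClient client, String userId) throws IOException {
        SearchRequest request = new SearchRequest("orders")
                .routing(userId) // same routing value that was used for the write
                .source(new SearchSourceBuilder()
                        .query(QueryBuilders.termQuery("userId", userId)));
        return client.search(request, RequestOptions.DEFAULT);
    }
}
```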

  • For date fields, use only as much precision as the business requires. If year, month, and day are enough, do not store hours, minutes, and seconds. With large data volumes the effect is especially noticeable: lower precision means more cache hits and therefore faster queries, and better memory reuse improves the performance of the Elasticsearch servers, reducing CPU usage and GC frequency.
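One way to keep such a field at day precision, sketched under the same assumptions (the field name checkInDate is illustrative), is to map it with a day-level date format, so that values carrying hours, minutes, or seconds are rejected at index time instead of being stored.

```java
import java.io.IOException;

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.PutMappingRequest;
import org.elasticsearch.common.xcontent.XContentType;

public class DatePrecisionMapping {

    /** Map the check-in date with day-level precision only; no hours, minutes, or seconds. */
    public static void addCheckInDate(RestHighLevelClient client) throws IOException {
        PutMappingRequest request = new PutMappingRequest("orders");
        request.source(
                "{ \"properties\": { \"checkInDate\": { \"type\": \"date\", \"format\": \"yyyy-MM-dd\" } } }",
                XContentType.JSON);
        client.indices().putMapping(request, RequestOptions.DEFAULT);
    }
}
```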

System monitoring

The technology center has developed a monitoring system specifically for business units. It periodically calls the Elasticsearch _cat APIs on every server, stores the performance data in a separate Elasticsearch cluster, and provides a web page where application owners can review the metrics.
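As an illustration of the kind of call such a monitor makes, here is a minimal sketch using the Elasticsearch low-level REST client; the endpoint, the selected columns, and the target host are just one example, not the actual monitoring implementation.

```java
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class CatMonitorProbe {

    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            // Pull node-level metrics from the _cat API in JSON form.
            Request request = new Request("GET", "/_cat/nodes");
            request.addParameter("format", "json");
            request.addParameter("h", "name,heap.percent,cpu,load_1m");
            Response response = client.performRequest(request);
            // The real system stores such metrics in a separate Elasticsearch
            // cluster and exposes them to application owners through a web page.
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}
```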

Disaster recovery

Elasticsearch itself is distributed. When creating the index, we sized the shards for the total data volume expected over the next few years, keeping each shard within a healthy range. To balance write speed against disaster recovery, we set the number of replicas to 2, so the data is spread across different servers; if one server in the cluster goes down, a replica on another server takes over directly.

We also deployed an identical Elasticsearch cluster in a machine room in another region, in case a network or power failure takes an entire machine room out of service. During routine replication, a copy of the data is written to the disaster recovery machine room as well, ready for emergencies.

Conclusion

The whole project evolved step by step, and many problems were encountered along the way. After it went live, CPU and memory usage on the application servers dropped significantly, and query speed remained essentially the same as before sharding. We share the problems we ran into and how we approached them here for your reference.

References

  • Elasticsearch official documentation

  • https://en.wikipedia.org/wiki/Database_index

  • https://baike.baidu.com/item/%E6%95%B0%E6%8D%AE%E5%BA%93%E7%B4%A2%E5%BC%95

  • https://logz.io/blog/solr-vs-elasticsearch/
