Overview of migration solutions

Sharding a table means moving its data from a single database to a sharded database, so a migration is required. How should it be done? I looked at several migration approaches:

  • Downtime migration
  • Migration using database master-slave replication
  • Migration with a self-developed double-write program

Downtime migration

Put simply, downtime migration means picking a low-traffic window in advance, announcing the service outage, moving the data from the single database to the sharded database, and then starting the new read and write services on the sharded database. It has several drawbacks:

  • Operations, development, and testing staff all need to be present, so the coordination cost is relatively high.
  • Users are affected during the outage.
  • Without additional handling, all services must be shut down before the data is moved in order to guarantee data integrity. With a large volume of data, the move itself takes a considerable amount of time.
  • If the program that reads and writes the sharded database has a problem, the service must be stopped and rolled back. Data written in the meantime has to be recovered, which requires a program to synchronize it from the sharded database back to the single database.

Most importantly, the data is hard to restore. If something goes wrong with the new data, it would be far better to be able to cut back to the original data immediately.

Migration using master-slave replication

One approach described online is to expand capacity by a multiple, use MySQL master-slave replication to move the data, and then stop the service to switch over.

Double-write program migration

Double write means that every write goes to both the single database and the sharded database; later, at a low-traffic time, traffic is switched to the sharded database. The following steps are involved:

Double write (incremental synchronization)

Synchronization here includes both incremental and full synchronization

  • Incremental synchronization is handled by the double-write program, which ensures that newly inserted data is consistent between the single database and the sharded database.

Historical data migration and comparison (full synchronization)

  • The migration program performs the full migration, copying the old single-database data into the sharded database.
  • The comparison program finds the single-database data that changed while the full migration was running: it scans for differences between the single database and the sharded database, and any discrepancy must be synchronized to the sharded database. The scan can be full or incremental.
  • For the discrepancies found by the scan, the migration program performs an incremental migration, and this repeats until the data is consistent.
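
The compare-then-migrate loop described above can be sketched roughly as follows. This is only an in-memory illustration, not the actual programs: plain dictionaries stand in for the single database and the sharded database, and `find_differences` and `reconcile` are hypothetical names chosen for this example.

```python
def find_differences(single, shard):
    """Return keys whose rows are missing, stale, or extra in the shard copy."""
    keys = set(single) | set(shard)
    return sorted(k for k in keys if single.get(k) != shard.get(k))


def reconcile(single, shard, max_rounds=5):
    """Repeat compare -> incremental migrate until both copies agree.

    The single database is treated as the source of truth.
    Returns the number of rounds in which discrepancies were found.
    """
    rounds = 0
    for _ in range(max_rounds):
        diff = find_differences(single, shard)
        if not diff:
            break
        rounds += 1
        for key in diff:
            if key in single:
                shard[key] = single[key]   # overwrite from the single database
            else:
                del shard[key]             # row was deleted in the single database
    return rounds


single_db = {1: "paid", 2: "shipped", 3: "new"}
shard_db = {1: "paid", 2: "new", 4: "stale"}
rounds = reconcile(single_db, shard_db)
```

In a live system the single database keeps changing between rounds, which is why the loop is bounded by `max_rounds` (an assumed safeguard) rather than running until the difference happens to be empty.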

Switch read traffic

  • After the migration is complete, keep running the comparison program over the data for a period of time to catch inconsistencies caused by anomalies. Once the comparison has shown no problems for a while, pick a low-traffic period and switch reads and writes to the sharded database, while still double-writing to the single database (so that a rollback remains possible).

Take double write offline

  • After a period of observation, the single database can be taken offline. Remember to remove the code that becomes useless once it is offline, and make sure the program reads and writes only the sharded database.
  • Remember to migrate the programs that consume the original single database's binlog.
  • Take the double-write code offline.

Double-write program migration details

Here are some details about the programs involved in the steps above:

  • Double-write program
  • Migration program
  • Comparison program
  • Toggle switch

Double-write program

Double write can be implemented with AOP (aspect-oriented programming) or with binlog synchronization.
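
As a rough illustration of the AOP-style option, the sketch below wraps a write function with a Python decorator so that each successful write to the single database is mirrored to the sharded database. Everything here is an assumption made for the example: `InMemoryStore`, `double_write`, and `save_order` are hypothetical names, and the choice to log a shard-side failure rather than fail the primary write is one possible policy, not something the original design specifies.

```python
import functools
import logging

logger = logging.getLogger("double_write")


class InMemoryStore:
    """Stand-in for a database; a real program would wrap a DB client."""

    def __init__(self):
        self.rows = {}

    def save(self, order_no, row):
        self.rows[order_no] = dict(row)


def double_write(shard_store, failed):
    """Decorator: after the primary write succeeds, mirror it to the shard store.

    A shard-side failure is logged and recorded in `failed` instead of
    breaking the primary write, so a comparison program can repair it later.
    """
    def decorator(write_fn):
        @functools.wraps(write_fn)
        def wrapper(order_no, row):
            result = write_fn(order_no, row)      # primary (single database) write
            try:
                shard_store.save(order_no, row)   # secondary (shard) write
            except Exception:
                logger.exception("shard write failed for %s", order_no)
                failed.append(order_no)
            return result
        return wrapper
    return decorator


single_store = InMemoryStore()
shard_store = InMemoryStore()
failed_orders = []


@double_write(shard_store, failed_orders)
def save_order(order_no, row):
    single_store.save(order_no, row)
    return order_no


save_order("A001", {"status": "new"})
```

The business code only calls `save_order`; the second write is added around it, which is the property that makes this approach easy to remove once double write is taken offline.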

Migration program

  • Multithreaded migration.
  • Supports overwrite migration (the single database is the source of truth).
  • Supports migrating a specified time range and migrating specified order numbers.
  • Records the order numbers that fail during migration (logging them is enough) and migrates the failed order numbers again.
  • The middleware used for the migration does not support batch inserts, only single-row inserts.
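
A minimal sketch of a migration program with these properties might look like the following. It is an assumption-laden illustration rather than the real tool: `ShardStore` and `migrate` are hypothetical names, a dictionary stands in for the single database, and failures are collected in a list where a real program would write them to a log.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class ShardStore:
    """Hypothetical thread-safe stand-in for the sharded database."""

    def __init__(self):
        self._lock = threading.Lock()
        self.rows = {}

    def upsert(self, order_no, row):
        # Overwrite migration: the single database's row always wins.
        with self._lock:
            self.rows[order_no] = dict(row)


def migrate(single_rows, shard, order_nos, workers=4):
    """Migrate the given order numbers one row at a time; return the failures."""
    failed = []
    failed_lock = threading.Lock()

    def migrate_one(order_no):
        try:
            shard.upsert(order_no, single_rows[order_no])
        except Exception:
            with failed_lock:
                failed.append(order_no)   # a real program would also log this

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(migrate_one, order_nos))
    return sorted(failed)


single_rows = {f"A{i:03d}": {"amount": i} for i in range(1, 21)}
shard = ShardStore()
# "A999" does not exist in the single database, so it ends up in the failed list.
failed = migrate(single_rows, shard, list(single_rows) + ["A999"])
```

Because `migrate` takes an explicit list of order numbers, the failed list it returns can be passed straight back in for a retry, and a time-range migration reduces to selecting the order numbers in that range first.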

Comparison program

Because data in the original database may change while double write is running, a comparison program is needed to find the differences. Points to note:

  • With the single database as the baseline, check not only for updates but also for deletes and for data that was never migrated. For example, if an order is associated with four items in the single database, then after a synchronization problem the sharded database may have no such order, an order with no items, or an order with only three items. All of these are problems.
  • Dates generated by database defaults cannot be compared for exact equality; allow a difference of up to 3 seconds.
  • When there are many fields, comparing all of them is slow, so compare only the fields that the business actually changes. This requires analyzing the business fields and being familiar with the business.
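
The points above can be sketched as a small comparison function. The field names (`status`, `amount`, `created_at`, `items`) and the function names are hypothetical, chosen only to illustrate the three rules: compare selected business fields, tolerate a 3-second difference on database-generated dates, and detect missing or extra associated items.

```python
from datetime import datetime, timedelta

TOLERANCE = timedelta(seconds=3)
BUSINESS_FIELDS = ("status", "amount")   # hypothetical business fields


def rows_match(single_row, shard_row):
    """Compare business fields exactly and created_at within the tolerance."""
    if any(single_row[f] != shard_row[f] for f in BUSINESS_FIELDS):
        return False
    return abs(single_row["created_at"] - shard_row["created_at"]) <= TOLERANCE


def compare_order(single_order, shard_order):
    """Return a list of problems for one order; an empty list means consistent."""
    if shard_order is None:
        return ["order missing in shard"]
    problems = []
    if not rows_match(single_order, shard_order):
        problems.append("order fields differ")
    single_items = single_order["items"]
    shard_items = shard_order["items"]
    for item_id in single_items.keys() - shard_items.keys():
        problems.append(f"item {item_id} missing in shard")
    for item_id in shard_items.keys() - single_items.keys():
        problems.append(f"item {item_id} should be deleted from shard")
    return sorted(problems)


t = datetime(2024, 1, 1, 12, 0, 0)
single_order = {"status": "paid", "amount": 100, "created_at": t,
                "items": {1: "a", 2: "b", 3: "c", 4: "d"}}
shard_order = {"status": "paid", "amount": 100,
               "created_at": t + timedelta(seconds=2),
               "items": {1: "a", 2: "b", 3: "c"}}
problems = compare_order(single_order, shard_order)
```

Here the 2-second difference in `created_at` is tolerated, while the fourth item that exists only in the single database is reported.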

Toggle switch

The following hot switch can serve as a reference.

Read/write switch for the single database and the sharded database:

  • First identifier (read): 0 = read from the single database, 1 = read from the sharded database
  • Second identifier (write): 0 = write to the single database, 1 = write to both the single database and the sharded database, 2 = write to the sharded database only

For example:

  • Default (single database read, single database write): 0,0
  • Double write on (single database read, single and sharded database write): 0,1
  • Read traffic switched (sharded database read, single and sharded database write): 1,1
  • Single database write dropped (sharded database read, sharded database write): 1,2
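
One way to read such a switch in code is sketched below. It assumes the flag is stored as a string like "1,1" and that the write values mean 0 = single database only, 1 = both, 2 = sharded database only, which is how the examples read; the `Switch` class and its method names are hypothetical.

```python
from dataclasses import dataclass

READ_SINGLE, READ_SHARD = 0, 1
WRITE_SINGLE, WRITE_BOTH, WRITE_SHARD = 0, 1, 2


@dataclass(frozen=True)
class Switch:
    read: int
    write: int

    @classmethod
    def parse(cls, value):
        """Parse a config string such as "1,1" into a Switch."""
        read, write = (int(part) for part in value.split(","))
        if read not in (READ_SINGLE, READ_SHARD):
            raise ValueError(f"invalid read flag: {read}")
        if write not in (WRITE_SINGLE, WRITE_BOTH, WRITE_SHARD):
            raise ValueError(f"invalid write flag: {write}")
        return cls(read, write)

    def read_target(self):
        """Name of the database that serves reads."""
        return "shard" if self.read == READ_SHARD else "single"

    def write_targets(self):
        """Names of the databases that receive each write."""
        return {
            WRITE_SINGLE: ["single"],
            WRITE_BOTH: ["single", "shard"],
            WRITE_SHARD: ["shard"],
        }[self.write]


stage = Switch.parse("1,1")   # reads from shard, writes to both
```

If the string is held in a configuration center and re-parsed on change, each migration stage (and each rollback) becomes a configuration update rather than a redeploy.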


Advantages and disadvantages of dual-write program migration

Advantages: the migration process is controlled by the development team itself, so it is highly controllable. Disadvantages: writing the code yourself incurs development and testing costs.

Finally

Feel free to leave criticism, corrections, suggestions, or questions in the comments section.