Summary: This article introduces the consistency issues that arise in dual-write scenarios, details three solutions, and presents Lindorm data subscription as a best practice for the DB->Binlog->Kafka scheme.

Introduction to the dual-write problem

The dual-write problem refers to scenarios where two independent systems, such as a database and Kafka, or a database and a cache, need to be modified together. How do we ensure data consistency between the two systems?

Taking the common combination of a database and Kafka as an example, there are several possible approaches:

  1. Write to the database and Kafka concurrently
  2. Write to Kafka first, then the database
  3. Write to the database first, then Kafka

Write to the database and Kafka concurrently

In this case, a distributed transaction is required to guarantee strong consistency; otherwise, handling the resulting inconsistencies becomes complicated, and neither the database nor Kafka is guaranteed to hold the complete data.
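To make the failure window concrete, here is a minimal sketch in Java (the broker address, JDBC URL, table, and topic names are all hypothetical) that issues the two writes concurrently without any transaction spanning them. If either write fails, or the process crashes between them, one system holds the data and the other does not, and the application has to detect and repair the divergence on its own.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Properties;
import java.util.concurrent.CompletableFuture;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NaiveDualWrite {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props);
             Connection db = DriverManager.getConnection("jdbc:mysql://localhost:3306/demo")) {

            // Fire both writes concurrently; no transaction spans the two systems.
            CompletableFuture<Void> dbWrite = CompletableFuture.runAsync(() -> {
                try (PreparedStatement ps = db.prepareStatement(
                        "INSERT INTO orders(id, amount) VALUES (?, ?)")) { // hypothetical table
                    ps.setString(1, "order-1");
                    ps.setLong(2, 100L);
                    ps.executeUpdate();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            CompletableFuture<Void> kafkaWrite = CompletableFuture.runAsync(() ->
                producer.send(new ProducerRecord<>("orders", "order-1", "{\"amount\":100}")));

            // If one future fails (or the process dies here), only one system holds the data;
            // detecting and repairing the inconsistency is left entirely to the application.
            CompletableFuture.allOf(dbWrite, kafkaWrite).join();
        }
    }
}
```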

Write to Kafka first, then the database

Write to Kafka first and return success to the client, then subscribe to the Kafka messages and apply them to the database, achieving eventual consistency. However, this asynchronous path delays database updates, which hurts scenarios that require strongly consistent reads. For example, a bill is written successfully but the customer cannot see it immediately. Another example is real-time attribution: Flink consumes Kafka in real time and, on a transaction event, looks up the database to perform attribution, but the key data may not yet be in the database.
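A minimal sketch of this pattern, again with hypothetical topic and table names: the write path only produces to Kafka and acknowledges the client once the broker confirms the send, while a separate consumer applies the messages to the database afterwards, so database reads can lag behind what the client was told.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.time.Duration;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaFirstPipeline {

    // Write path: produce to Kafka, then acknowledge the client.
    static void handleWrite(KafkaProducer<String, String> producer,
                            String orderId, String payload) throws Exception {
        producer.send(new ProducerRecord<>("orders", orderId, payload)).get(); // wait for broker ack
        // The client is told "success" here, but the row is not yet in the database.
    }

    // Apply path: a separate process consumes the topic and writes to the database asynchronously.
    static void applyLoop(KafkaConsumer<String, String> consumer, Connection db) throws Exception {
        consumer.subscribe(List.of("orders"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                try (PreparedStatement ps = db.prepareStatement(
                        "INSERT INTO orders(id, payload) VALUES (?, ?)")) { // hypothetical table
                    ps.setString(1, record.key());
                    ps.setString(2, record.value());
                    ps.executeUpdate();
                }
            }
            consumer.commitSync(); // the database catches up only after this lag
        }
    }
}
```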

Write to the database first, then Kafka

Write to the database and then to Kafka in sequence before returning to the client. If the database write succeeds but the Kafka write fails, the two systems become inconsistent, and the sequential writes also increase the latency of the initial write.

DB->Binlog->Kafka: write to the database and return success to the client, then subscribe to the binlog and write the changes to Kafka, which downstream services consume. This achieves eventual consistency while preserving strongly consistent reads on the database.
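The sketch below compresses this path into one file; the binlog reader is a deliberately hypothetical interface (in practice this role is played by a CDC tool or a managed subscription service such as the one described in the next section). The application itself only writes to the database and returns; the relay forwards each committed change to Kafka, keyed by primary key so that changes to the same row stay in order within a partition.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BinlogRelay {

    // Hypothetical abstraction over a binlog/CDC source; not a real library API.
    interface BinlogReader {
        ChangeEvent next() throws InterruptedException; // blocks until the next change event
    }

    // Hypothetical change-event shape: primary key plus the serialized row change.
    record ChangeEvent(String primaryKey, String changeJson) {}

    // Relay loop: every committed database change is forwarded to Kafka.
    // Keying by primary key keeps changes to the same row in one partition, preserving order.
    static void run(BinlogReader reader, KafkaProducer<String, String> producer)
            throws InterruptedException {
        while (true) {
            ChangeEvent event = reader.next();
            producer.send(new ProducerRecord<>("table-changes", event.primaryKey(), event.changeJson()));
        }
    }
}
```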

Make decisions based on business scenarios

Above we introduced three solutions to the dual-write problem, each suited to a different scenario.

  1. If the business requires a strongly consistent experience across the board, choose distributed transactions.
  2. If the business is satisfied with eventual consistency overall, write to the message queue first and rely on it for eventual consistency.
  3. If different services have different consistency requirements, read and write the database with strong consistency, and use the database binlog to bring downstream services to eventual consistency.

Introduction to Lindorm Data Subscription

Lindorm data subscription is an upgrade of the “DB->Binlog->Kafka” scheme.

The data subscription feature of Lindorm, the cloud-native multi-model database, captures every data change on a subscribed table, and clients can view the change records in real time and in order. Once a table is subscribed, its data-change operations are stored. To ensure that data is consumed in the same order in which it was written, the feature provides primary-key-level ordering: update operations on the same primary key are stored and consumed in their original update order. Each insert, delete, or update on a Lindorm table generates a Stream Record key-value pair, whose key is the primary key of the affected row and whose value is the details of the operation (the pre-operation value, the post-operation value, the timestamp, and the operation type).

To summarize the features of Lindorm data subscription:

  1. Real-time subscription
  2. 100% compatible with Kafka clients (see the consumer sketch after this list)
  3. Key-level ordering
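
Because the subscription is exposed through a Kafka-compatible endpoint, a downstream service can consume the change stream with a standard Kafka consumer. The following is a minimal sketch; the bootstrap address, topic name, and consumer group are placeholders, and the printed value simply reflects the Stream Record description above, so the actual connection parameters should be taken from the Lindorm documentation.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LindormChangeConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "lindorm-subscription-endpoint:9092"); // placeholder endpoint
        props.put("group.id", "downstream-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my_table_changes")); // hypothetical subscription topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Key: the primary key of the changed row.
                    // Value: the operation details (pre/post values, timestamp, operation type);
                    // records sharing a primary key arrive in their original update order.
                    System.out.printf("pk=%s change=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```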

This article is the original content of Aliyun and shall not be reproduced without permission.