As a new achievement in the field of distributed system algorithm research in recent years, CRDT base library brings wonderful possibilities for front-end applications: just need an API and backbone almost as simple model layer, your application can naturally obtain concurrent update support for multi-person collaborative scenarios. What kind of dark magic lies behind this? This article will take Yjs, which represents the peak performance of front-end CRDT library, as an example to show you how CRDT Works.

(The picture shows the performance comparison between Yjs and other front-end mainstream CRDT libraries, Yjs corresponds to the blue line at the bottom)

This article will start with the engineering implementation of Yjs and introduce how a typical industrial-grade CRDT library implements the following capabilities:

  • Modeling data structures
  • Resolving concurrency Conflicts
  • Backtracking history
  • Synchronizing network Status

As a scientific introduction, this article does not throw out large chunks of obscure source code, nor does it involve much abstract mathematics. All you need to read is a basic computer understanding of data structures.

Before actually introducing Yjs internals, how can we intuitively understand how the CRDT library is used? Yjs provides users with common data Types such as YText, YArray, and YMap (Shared Types, collectively referred to as YModel), which can be directly used as the model layer of the application:

import * as Y from 'yjs'

// All collaboration states in an application can be hosted in a single YDoc container
// This instance can support network synchronization by passing it to the provider of the WebSocket protocol
const doc = new Y.Doc()

Different types of top-level YModel instances can be created on YDoc
// Create a YMap with a top layer named root
const yRoot = doc.getMap('root')

// The class constructor can also be used to instantiate independent yModels such as YMap
// You can use get set delete API to add, delete, modify and check YModel
const yPoint = new Y.Map()
yPoint.set('x'.0)
yPoint.set('y'.0)

// The value of YMap can also be YMap to construct nested data types
yRoot.set('point', yPoint)

// YMap can also be stored in other ymodels, such as YText, to form composite data types
const yName = new Y.Text()
yName.insert(0.'Wilson Edwards')
yRoot.set('name', yName)
Copy the code

On the surface, the API looks mundane, but its real power lies in conflict-free, where potential state conflicts during concurrent updates are automatically resolved by Yjs for the upper level:

// Two separate YDoc instances can be used to simulate two clients
const doc1 = new Y.Doc()
const doc2 = new Y.Doc()
const yText1 = doc1.getText()
const yText2 = doc2.getText()

// When one YDoc is updated, apply binary update data to another YDoc
doc1.on('update'.(update) = > Y.applyUpdate(doc2, update))
doc2.on('update'.(update) = > Y.applyUpdate(doc1, update))

// Create two potentially conflicting updates
yText1.insert(0.'Edwards')
yText2.insert(0.'Wilson')

// The CRDT algorithm ensures that the state of the two clients is always the same
yText1.toJSON() // WilsonEdwards
yText2.toJSON() // WilsonEdwards
Copy the code

From these examples of the Yjs surface API, we can already see the power of CRDT. Now comes the really interesting question: how does Yjs implement this capability internally?

Modeling data structures

Mention “underlying principles,” and many of you might immediately start imagining some kind of elegant conflict resolution algorithm. But before introducing this algorithm, it’s best to familiarize ourselves with the basic data structure Yjs uses to model CRDT in engineering: bidirectional linked lists.

In Yjs, whether YText, YArray, or YMap, the data in these YModel instances is stored in a two-way linked list. Roughly speaking, each item in the list is a unique record of the data that was changed by a user action, in a way similar to blockchain. You can assume that the operations on YModel in the above example will eventually be transformed into the structure transformations of the list, such as append, INSERT, split, merge, etc. Each item in the linked list will be serialized, encoded and distributed. Based on the guarantee of CRDT algorithm, as long as each client finally receives all items, they can reconstruct exactly the same document state regardless of the order in which the client receives these items.

The CRDT structure based on the above means is one of the CRDT schools, which can be called operation-based CRDT, List CRDT or Sequence CRDT.

Obviously, in a scenario where multiple people collaborate in real time and the timing of item reception cannot be guaranteed, each item needs to carry some kind of identifier for the system to uniquely determine its position on the logical timeline. Yjs assigns each item a unique ID, whose structure is ID(clientID, clock), as shown below:

// Yjs ID source code, such a simple implementation is conducive to engine hidden class optimization
class ID {
  constructor (client: number, clock: number) {
    // Unique ID of the client
    this.client = client
    // Logical timestamp
    this.clock = clock
  }
}
Copy the code

Both clientID and clock are integers, the former used to uniquely identify a YDoc client, while the latter belongs to a logical timestamp called Lamport Timestamp, which can be thought of as a counter that increments from zero. The update rules are very simple:

  • When a local event occurs,localClock += 1.
  • When receiving a remote event,localClock = max(remoteClock, localClock) + 1.

This mechanism seems simple, but in fact it gives us a mathematically good fully ordered structure. This means that as long as the size of clientID is compared, the logical sequence relationship between any two items can be compared, which is very important to ensure the correctness of CRDT algorithm. However, relevant mathematical theories are not the focus of this paper and will not be expanded here.

We can use the text editing example to understand how this list CRDT works. In the naive case without any optimization, this structure requires us to model each character insertion operation as an item, that is, each character carries a logical timestamp ID:

In the example above, Y A T A! Each of these characters corresponds to an item (or operation of a character insert). They are connected by the left and right fields. When inserting a new character T, Yjs will find the appropriate insertion position in the linked list according to the ID of item and add the item corresponding to the new character into the linked list. In addition, the text continuously added by the same user will also be merged into a single item with a long length, avoiding the performance problem of a large number of fragmented objects.

Note that not all items persist in the linked list during frequent additions and deletions. However, since CRDT conflict resolution relies on the metadata of the historical item, we cannot directly delete the historical item. At most, we can only remove the corresponding YModel content of the item (for this reason, Yjs specially designs GC objects to replace the discarded YModel with empty structures. Kind of a tombstone mechanic). This brings us to the next question: What data structure should we use to store all the items in a document?

Since any item can be sorted by ID, one option is to maintain all items in a balanced binary tree using data structures with logarithmic insertion and deletion time complexity such as B-trees. In practice, however, Yjs chooses a simpler and more straightforward solution, which assigns each client a flat array of items, called a StructStore:

// In practice you can use doc.store.clients. Get (doc.clientid)
doc.store.clients: {
  114514: [Item, Item, Item], // ...
  1919810: [Item, Item, Item] // ...
}
Copy the code

In the above structure, each item array is sorted by the value of the clock field. In this way, you only need to do a binary search, and you can quickly find the item corresponding to an ID. When receiving a remote item, Yjs inserts the item into the linked list as well as the corresponding location in the StructStore.

Therefore, we can assume that there are two ways to index data in Yjs:

  • According to the document structure order (i.edocument order) modeling bidirectional linked lists.
  • According to logical timing (i.einsertion order) modeled StructStore.

Understanding this mental model will help us debug the internal state of Yjs. For example, the following code produces three items:

const yPoint = new Y.Map()
yRoot.set('a', yPoint) // item 1
yPoint.set('x'.0) // item 2
yPoint.set('y'.0) // item 3
Copy the code

Note that item is a container with CRDT fields like Left/right/ID, where specific YModel (or AbstractType) data is stored in the item.content field. In addition, the parent and parentSub fields are used to assist in expressing the parent-child relationship of nested structures such as YMap. Its overall structure is shown as follows:

Since item.content can also carry arbitrary AbstractType data such as YMap, nesting of data is naturally supported. Based on this chain structure, a YMap corresponds to a series of entries, in which each key uses the latest data in the logical timeline as the current state, and items corresponding to other earlier states under this key are marked as deleted.

To optimize the speed of local inserts, Yjs also designs additional caching mechanisms. If a new character is inserted randomly anywhere in the document, then the item corresponding to the new character theoretically needs O(N) time complexity to be inserted into the linked list. But in practice, most inserts follow the cursor position of the user’s last edit. Therefore, Yjs caches the 10 most recent insertion locations by default for quick matching, which is known in internal implementation as skip List (the authors believe that this caching mechanism is inspired by hop tables) or fast Search Marker.

Once we are familiar with the basic data structure design, we can further understand the conflict resolution mechanism designed by Yjs.

Resolving concurrency Conflicts

What strategy does the CRDT library use to resolve conflicts? In fact, in the case of conflict mentioned above, no matter how the two texts of Wilson and Edwards are arranged and combined, it can be regarded as an error. It doesn’t really matter what is “right” at this point, just make sure that all the nodes receiving the message automatically spawn the same state — a concept known as convergence.

The items in Yjs form a bidirectional linked list. But to properly resolve conflicts, its implementation does not only contain the left and right fields, but also these two fields:

  • originThe field stores the left node of the item when it is inserted.
  • rightOriginThe field stores the node to the right of item when it is inserted.

Each item needs to implement its integrate method when inserting the document. The core conflict resolution algorithm uses these fields here, and there are only about 20 lines in total:

while(o ! = =null&& o ! = =this.right) {
  itemsBeforeOrigin.add(o)
  conflictingItems.add(o)
  if (compareIDs(this.origin, o.origin)) {
    if (o.id.client < this.id.client) {
      left = o
      conflictingItems.clear()
    } else if (compareIDs(this.rightOrigin, o.rightOrigin)) {
      break}}else if(o.origin ! = =null && itemsBeforeOrigin.has(getItem(transaction.doc.store, o.origin))) {
    if(! conflictingItems.has(getItem(transaction.doc.store, o.origin))) { left = o conflictingItems.clear() } }else {
    break
  }
  o = o.right
}
Copy the code

According to the paper of YATA algorithm used by Yjs, the core idea of the above code is to ensure that the left and right lines (the red line in the figure below) of the new potential item are not crossed:

But why is the algorithm correct if you avoid crossing? This needs to be proved mathematically, and is covered in more detail in the YATA algorithm paper, which is beyond the scope of this article.

Interestingly, conflict is actually rare in practice. So while the code above is part of the algorithm core of the CRDT library, it is rarely executed and can be understood as a safeguard mechanism to ensure that data is not corrupted by concurrent updates.

Backtracking history

As we can see from the introduction above, virtually all item structures in a document can be continuously stored, which is obviously of great value for historical records. Yjs also comes with UndoManager to provide out-of-the-box history management. In my previous answer to Zhihu, I also introduced the design criteria of the history module in collaborative scenarios. However, the thinking model mentioned in this answer is mainly useful for deducing the effect of some product operation at a high level, rather than the implementation of engineering details. In order to pursue the ultimate performance, the implementation of Yjs has actually made more optimization, here is a simple summary.

Before we discuss the implementation of UndoManager, we need to understand the special handling of delete operations by Yjs. In Yjs, each item can be marked as deleted (which can be checked in the getter for item.deleted), but no additional information about the deletion operation is recorded in the item:

  • Item does not record when or by which user it was deleted.
  • Deletion operations are also not recorded in StructStore.
  • The local clock is not incremented after a deletion occurs.

So where is the information for the delete operation modeled? Yjs introduces the concept of DeleteSet, which records all deleted items in a logical timeline over a certain period of time (or transaction) and distributes this data independently of items. In other words, the delete operation in Yjs is designed to be independent of the structure of the bidirectional list. In the previous introduction to the item bidirectional list, we discussed an operation-based CRDT. However, Yjs is designed to be a simpler state-based CRDT when dealing with delete operations.

The so-called state-based CRDT sends the full local state to each client and performs the combined calculation on each client independently. This type of CRDT tends to be more practical for lightweight states such as counters. It can be considered that the modeling of delete operation by Yjs conforms to this concept.

In the introduction above we mentioned the concept of transaction, which is an abstraction used in Yjs to build event systems. Each transaction contains two pieces of data:

  • The item inserted for this update.
  • This update removes the DeleteSet of the item.

The binary incremental update data actually distributed over the network is the serialized encoded transaction. You can assume that Transaction will be used for both coded updates and the granularity of undo redo operations. Based on the above data structure, you can see that a transaction can be undone only by doing two things:

  • Mark the item inserted into this transaction as deleted.
  • Mark items removed from this update as restored.

Since no new items need to be added to this process, a continuous undo redo operation is theoretically equivalent to just distributing the lightweight DeleteSet continuously. Because DeleteSet’s structure is very light (for example, in the B4 Benchmark dataset that records the editing of real users’ LaTeX papers, only 4.5KB of DeleteSet was generated after 182,000 inserts and 77,000 deletions), This design is a step closer to “zero-overhead abstraction,” because undo redo does not create any unknown new data.

However, in my opinion, although this design can play a more extreme optimization effect, it also makes maintenance more difficult. For example, UndoManager previously had an unfixable problem: “Three consecutive attempts to undo can be redone to return to the original state, and more than four attempts may lose fields.” The root cause of this problem lies in the logic of item restoration at that time, which may not be able to find the correct item position that should be restored by continuous right shift. Although the individual has submitted a PR repair for this problem, the investigation process is also quite bumpy. For those of you who wish to implement your experimental CRDT library, deleteset-based optimization should not be a capability that needs to be implemented in the first prototype.

Synchronizing network Status

Although the CRDT library itself has nothing to do with specific network protocols, Yjs also designs supporting network protocols as a set of evolutionary solutions. The highlights of this section related to the internal implementation of Yjs include these:

  • The entire Yjs document can be encoded as update data in the form of Uint8Array, and Yjs can deserialize the YDoc object structure from binary data. Each document update transaction also computes incremental updates for synchronization, including item and DeleteSet, as described earlier.
  • Yjs locates the logical timeline through the concept of a state vector, a data structure that is essentially a set of clients that record all the clients at a given time(client, clock)A tuple.

When synchronizing document state, Yjs has two phases:

  • Phase 1: A client can send its own local state vector to fetch missing document update data from a remote client.
  • Phase 2: Remote clients can use their respective local clocks to calculate the item objects that the client needs and further code to generate minimal Update data that contains all the unsynchronized states.

In theory, we can support more network protocols for Yjs than WebSocket. Currently, Yjs also supports new distributed Web oriented protocols like Dat. However, network protocol-related content is not in the Yjs main repository, as can be seen from the Y-Protocols project.

conclusion

So far, we have covered the main internal features of Yjs:

  • Basic data structure based on bidirectional linked list and StructStore
  • Concurrent conflict resolution mechanism based on YATA algorithm
  • Undo redo based on DeleteSet and Transaction mechanisms
  • Synchronization mechanism based on two-phase partition

Due to the limitations of personal ability, there are many technical details that cannot be covered in this article. Finally, some valuable references are listed:

  • YATA paper introduces the algorithm design and correctness proof of Yjs.
  • Yjs Internals documents the key internal structure of Yjs, much of which is useful for this article.
  • Kevin Jahns, author of Yjs, describes some of the key optimizations he has implemented.
  • 5000X Fater CRDTs: An Adventure in Optimization is a blog post written by Seph Gentle, author of OT Library ShareDB, which deeply analyzes the process of CRDT engineering performance improvement.
  • How Yjs Works Form the Inside Out is a long but enlightening video interview that Seph invited Kevin to do in order to understand Yjs.
  • Personal D2 share introduces Yjs project access practices.

Because of the extremely decentralized nature of CRDT behind Yjs, it may have the opportunity to become some form of front-end infrastructure in the Web 3.0 era. From its case, we can feel that the practical application of academic research results is not achieved overnight, but depends on a large number of specific engineering details processing and targeted optimization, behind which there is still a lot of basic computer knowledge such as data structure and algorithm. And compared with the classic OT, the popularity of CRDT in recent years may also be a potential paradigm shift, which means new opportunities for front-end developers. I hope this article will be helpful to those who are interested.