This is the 9th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021

One, foreword

The key to collaborative editing is to solve the problem of data consistency

Different collaborators to modify their own copy of the document, the editor of a local application will immediately to the local copy, then transfer to other collaborators, which leads to different copies of the application may be in a different order of various modifications, there should be a algorithm to ensure that after a collaborative editing, all copies of the final content is the same.

To solve real-time multi-person collaboration, there are three steps:

  1. Select the appropriate editor

  2. Text Model is designed using OT algorithm to deal with HTML transformation and collaboration conflicts

  3. The control state of Accepted Flag was used to ensure the correctness of the Text Model with frequent changes

In fact, the solutions for conflict management are relatively mature, including:

  1. Edit lock: When someone is editing a document, the system locks the document to prevent others from editing at the same time.

  2. Diff-patch: Based on the similar idea of Git and other version management, it compares and merges contents, including GNU Diff-patch, Myer’s diff-Patch and other schemes.

  3. Final conformance implementation: Operational Transformation (OT), conflict-free Replicated Data Type (CRDT).

Corresponding usage:

  1. The implementation of edit lock is simple and crude, but it directly affects user experience. Multiple processes read and write files and use read and write locks.

  2. Diff-patch can merge conflicts by itself or hand over to users when conflicts occur.

  3. The OT algorithm is the solution used in Google Docs, and the Atom editor uses CRDT.

(1)OTCRDT

The OT and CRDT approaches are similar in that they provide ultimate consistency.

The difference is how they operate:

  1. OT does this by changing operations:

    • OTWill edit operation split, conversion, to achieve the effect of conflict processing
    • OTIt does not include specific implementation, so it needs to be implemented by the project itself, but it can carry out high-precision conflict handling according to the needs of the project
  2. CRDT does this by changing the state:

  • Basically, CRDT is a data structure that, when updated with the same set of operations, will always converge to the same representation, even if the operations are applied in a different order

  • CRDT comes in two ways: operation-based and state-based

OT is primarily used for text, which is often complex and unextensible.

The CRDT implementation is simple, but Google, Microsoft, CKSource, and many others rely on OT for a reason. The current state of CRDT research supports collaboration on two main types of data: plain text and arbitrary JSON structures.

For more advanced structures, such as rich text editing, OT trades complexity for user expectations, while CRDT focuses more on data structures. As data structure complexity increases, the time and space complexity of the algorithm increases exponentially, resulting in performance challenges.

Therefore, most real-time collaborative editing is now implemented based on OT algorithm.


OTThe basic data structure of the algorithm

operationType:

  • Insert

  • Delete

  • Retain (retain)

Insert bold “ABC” : [insert(‘ ABC ‘, {bold: true})]

Operation length: Text Model Applied to the editor mechanism: from coordinate 0, the Operation sequence

Here’s an example:

  • The original:abcde
  • Changes:[retain(2), delete(1)]
  • Results:abde

Here’s another chestnut:

  • The original:Hey Edy!
  • Changes:[retain (2), insert (' llo), delete (1), retain (1), insert (" world "), delete (3)]
  • Results:Hello world!


(2) Means of communication

There are many ways for the front and back end to communicate, including HTTP short polling (Polling), Websocket, HTTP long polling (long-polling), and SSE (Server-sent Events).

Different online document teams use different communication methods.

For example, Google Docs uses Ajax for upstream data and HTTP for downstream data. Ajax is used for upstream data of graphite documents, SSE is used for downstream data push; Jinshan documents, feishu documents, Tencent documents are using Websocket transmission.

Each communication mode has its own advantages and disadvantages, including compatibility, resource consumption, real-time performance, etc., which may also be related to the background architecture of the business team.

Therefore, when designing the connection layer, we should consider the interface extensibility and reserve support for various modes.




Second, editor scheme

Solution:

  1. Building a wheel from scratch: Reference objectGoogle DocQUIP

Principle: Listen for keyboard events to display content in canvas or other ways

  1. Open source: Reference objectsEtherpad

Principle: Realized text model and HTML interworking editor

Text Model

HTML and Text Model conversion as shown below:

Text Model is another form of presentation of editor content, which is different from the HTML level, as follows:

For example, bold text "ABC" HTML: <b> ABC </b> Model: {data: 'ABC', bold: true}Copy the code

Supported collaboration conditions:

  1. The HTML content of the editor and the Text Model can be converted to each other

  2. The Text Model can handle multiple changes


Text ModelHandling multiplayer changes:

A Text Model is like an array of operations.

  1. Operation is used to indicate document contents and changes

  2. Transform to resolve conflicts of multiplayer changes





Three, problem,

(1) How to handle the version correctly?

The client changes frequently and the server processes them frequently. How do I ensure that the version of each change is correct?

The client’s Text Model stores the policy, and three versions of the Text Model exist simultaneously:

  • Submited Model

  • Committing Model

  • User Model

Client Text Model simple process, as shown in the figure: