Overview

Route is a high-performance publish-subscribe middleware with the following features:

  • Decentralized (every node is a peer)
  • Elastic scaling, high availability and partition tolerance (AP), and eventual consistency
  • Large numbers of subscriptions and high performance (see the performance test below)
  • Support for batch operations

Where is Route appropriate?

Route suits scenarios that need high performance and a large number of subscriptions but can tolerate message loss. (Route does not guarantee message delivery.)

Installation

Run with Docker

$ docker run -d --name route -p 4000:4000 --restart=unless-stopped routeio/route

Deploy a cluster on Kubernetes

By default the cluster exposes port 30400 for external access. If you do not need this port, edit the k8s.yaml file.

$ git clone https://github.com/wiqun/route.git
$ cd route
$ kubectl apply -f k8s.yaml

For details about installation and configuration, see the Route project page (github.com/wiqun/route).

Performance test

  • CPU: Intel(R) Xeon(R) X5650
  • CPU limit: 1 core
  • The load-generating machine and the machine under test are on different network segments (traffic has to pass through a gateway)
  • Each connection periodically publishes a message to its own topic, and the load is sustained for 10 minutes

| Connections (= subscriptions) | TPS     | CPU utilization | Average latency (affected by gateway performance) |
|-------------------------------|---------|-----------------|---------------------------------------------------|
| 3,000                         | 11,000+ | 60-70%          | 100 ms+                                           |
| 10,000                        | 10,000+ | 70%+            | 300 ms+                                           |

Switching to batch operation mode improves performance significantly.

CPU usage probably does not reach 100% because the gateway cannot process traffic fast enough.

For an example, see Route-Go

Why so fast?

  1. It focuses on high-performance publish-subscribe scenarios and gives up some message reliability in exchange (a deliberate trade-off).
  2. A 1+N goroutine model (1 main goroutine plus N I/O goroutines). The main goroutine performs all subscription-tree operations, so no locks are needed and the performance cost of locking is avoided.
  3. Batch operations. Batching reduces the number of I/O operations and improves bandwidth utilization.
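
To make the batching point concrete, here is a minimal sketch in Go. It is not Route's code: countingWriter, sendEach, and sendBatch are hypothetical names, and the writer only stands in for a connection on which every Write call costs one I/O operation.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// countingWriter counts Write calls. It stands in for a network connection
// where each call is one I/O operation (an assumption of this sketch).
type countingWriter struct {
	writes int
	size   int
}

func (w *countingWriter) Write(p []byte) (int, error) {
	w.writes++
	w.size += len(p)
	return len(p), nil
}

// sendEach issues one Write call per message.
func sendEach(w io.Writer, msgs []string) error {
	for _, m := range msgs {
		if _, err := io.WriteString(w, m+"\n"); err != nil {
			return err
		}
	}
	return nil
}

// sendBatch joins the messages in memory and issues a single Write call.
func sendBatch(w io.Writer, msgs []string) error {
	var buf bytes.Buffer
	for _, m := range msgs {
		buf.WriteString(m)
		buf.WriteByte('\n')
	}
	_, err := w.Write(buf.Bytes())
	return err
}

func main() {
	msgs := []string{"a", "b", "c", "d"}
	var each, batch countingWriter
	_ = sendEach(&each, msgs)
	_ = sendBatch(&batch, msgs)
	// Same payload, fewer I/O operations in the batched case.
	fmt.Println(each.writes, batch.writes, each.size == batch.size) // 4 1 true
}
```

The payload is identical in both cases; only the number of I/O operations changes.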

Design

Route can be broadly divided into three modules: Broker, Service, and Swarm.

1. Broker

  1. The Broker maintains client connections, reads requests from each connection, and writes the result of processing each request back to the connection.

  2. The Broker currently supports only the WebSocket protocol.

2. Service

  1. The Service's main job is to handle requests forwarded by the Broker or by Swarm.

  2. There are currently four kinds of request: Sub and Unsub (both carried in LocalBatchTopicOp), LocalQuery, and BatchPubMessage.

  3. Why put Sub and Unsub requests in the same structure? It relies on the FIFO property of a Go channel to preserve the order of subscription operations (Sub followed by Unsub gives a different result from Unsub followed by Sub).

  4. A BatchPubMessage from the Broker is handled slightly differently from one from Swarm:

    • From the Broker: the message is pushed to all subscribers of the Topic

    • From Swarm: the message is pushed only to subscribers of the Topic whose LocalSubscriberType == LocalSubscriberDirect

  5. The Service maintains a map from each topic to its list of subscribers. Internally it uses the 1+N goroutine structure: one main goroutine, which is the only one allowed to operate on the map, and N I/O goroutines, which write results to the corresponding subscribers (writes are I/O operations).

  6. The Subscriber interface has two key methods: ID() uniquely identifies the Subscriber, and ConcurrentId() is used to prevent concurrent I/O writes.
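
The Service structure described above can be sketched roughly as follows. This is an illustration under assumptions, not Route's implementation: only the method names ID() and ConcurrentId() come from the text, while their return types, the Write method, the request type, and Serve are hypothetical. One goroutine owns the topic map and receives requests in FIFO order through a channel, and each write is routed to an I/O goroutine chosen by ConcurrentId.

```go
package main

import (
	"fmt"
	"sync"
)

// Subscriber names the two methods mentioned above; return types and Write
// are assumptions made for this sketch.
type Subscriber interface {
	ID() string
	ConcurrentId() uint64
	Write(msg string)
}

// conn is a toy subscriber that records what was written to it.
type conn struct {
	id   string
	cid  uint64
	recv []string
}

func (c *conn) ID() string           { return c.id }
func (c *conn) ConcurrentId() uint64 { return c.cid }
func (c *conn) Write(msg string)     { c.recv = append(c.recv, msg) }

// request is a hypothetical stand-in for LocalBatchTopicOp / BatchPubMessage.
type request struct {
	kind  string     // "sub", "unsub" or "pub"
	topic string
	sub   Subscriber // set for sub/unsub
	msg   string     // set for pub
}

type job struct {
	sub Subscriber
	msg string
}

// Serve handles reqs in FIFO order on one main goroutine and fans writes out
// to n I/O goroutines. It returns once every write has completed.
func Serve(reqs []request, n int) {
	var wg sync.WaitGroup
	workers := make([]chan job, n)
	for i := range workers {
		workers[i] = make(chan job, 64)
		wg.Add(1)
		go func(jobs <-chan job) { // I/O goroutine
			defer wg.Done()
			for j := range jobs {
				j.sub.Write(j.msg)
			}
		}(workers[i])
	}

	in := make(chan request)
	wg.Add(1)
	go func() { // main goroutine: the only one that touches topics, so no lock
		defer wg.Done()
		topics := make(map[string]map[string]Subscriber)
		for r := range in {
			switch r.kind {
			case "sub":
				if topics[r.topic] == nil {
					topics[r.topic] = make(map[string]Subscriber)
				}
				topics[r.topic][r.sub.ID()] = r.sub
			case "unsub":
				delete(topics[r.topic], r.sub.ID())
			case "pub":
				for _, s := range topics[r.topic] {
					// Same ConcurrentId -> same I/O goroutine, so writes to
					// one subscriber are never concurrent.
					workers[s.ConcurrentId()%uint64(n)] <- job{s, r.msg}
				}
			}
		}
		for _, ch := range workers {
			close(ch)
		}
	}()

	for _, r := range reqs {
		in <- r
	}
	close(in)
	wg.Wait()
}

func main() {
	a := &conn{id: "a", cid: 1}
	b := &conn{id: "b", cid: 2}
	Serve([]request{
		{kind: "sub", topic: "t", sub: a},
		{kind: "sub", topic: "t", sub: b},
		{kind: "pub", topic: "t", msg: "m1"},
		{kind: "unsub", topic: "t", sub: a},
		{kind: "pub", topic: "t", msg: "m2"},
	}, 2)
	fmt.Println(a.recv, b.recv) // [m1] [m1 m2]
}
```

Because Sub, Unsub, and Pub travel through the same channel, subscriber a receives m1 but not m2, which is the ordering guarantee item 3 relies on.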

3. Swarm

Swarm is responsible for synchronizing cluster state and discovering new nodes.

Swarm uses the Gossip protocol to synchronize cluster state.

Swarm maintains a CRDT-like state, whose proto file is:

package crdt;


message State {
    map<uint64, TopicMap> set = 1;
}

message TopicMap {
    map<string, int64> map = 1;
}

The nesting has the following meaning:

  • Layer 2 (the uint64 key of set) is the PeerName. It is unique within the cluster, and every node has one
  • Layer 3 (the string key of map) is a subscribed topic
  • Layer 4 is an int64 that combines TimeStamp and IsAdd (the ADD state means the Topic is subscribed). The lowest bit is IsAdd, and the remaining 63 bits hold the millisecond timestamp of the last IsAdd update
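
A minimal sketch of that int64 layout in Go. Pack and Unpack are hypothetical helper names, not Route's API; the bit positions follow the description above (lowest bit for IsAdd, the rest for the millisecond timestamp).

```go
package main

import "fmt"

// Pack combines a millisecond timestamp and the IsAdd flag into one int64:
// bit 0 holds IsAdd, the remaining bits hold the timestamp.
func Pack(tsMillis int64, isAdd bool) int64 {
	v := tsMillis << 1
	if isAdd {
		v |= 1
	}
	return v
}

// Unpack splits a packed value back into its timestamp and IsAdd flag.
func Unpack(v int64) (tsMillis int64, isAdd bool) {
	return v >> 1, v&1 == 1
}

func main() {
	v := Pack(1700000000000, true)
	ts, add := Unpack(v)
	fmt.Println(v, ts, add) // 3400000000001 1700000000000 true
}
```

One consequence of this layout is that comparing two packed values as plain integers compares their timestamps first, and ADD sorts after DEL when the timestamps are equal.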

For any Topic, an entry is in one of three states: none, ADD, or DEL. The state transitions differ slightly between the node's own entries (myself) and the entries of other nodes (other):

3.1 myself
| local \ remote | ADD | DEL |
|----------------|-----|-----|
| none | Force a merge to DEL | Merge into the DEL state if DELTime has not expired; otherwise ignore |
| ADD | Ignore: the remote TimeStamp cannot be greater than the local one (if it is, that is an unknown error) | Force a merge to ADD |
| DEL | Ignore: the remote TimeStamp cannot be greater than the local one (if it is, that is an unknown error) | Take the newer of the two (needed because of node disconnection and reconnection) |

Special case: the State is scanned periodically to delete topics whose DELTime has expired, which moves them from DEL to none (GC).
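
The periodic scan might look like the sketch below. GC and delTTLMillis are hypothetical: the text does not give the expiry window, so 60 seconds is only a placeholder, and the map stands in for one TopicMap whose values use the timestamp<<1 | IsAdd layout.

```go
package main

import "fmt"

// delTTLMillis is a hypothetical retention window for DEL entries.
const delTTLMillis = 60_000

// GC removes DEL entries whose timestamp is older than delTTLMillis and
// returns how many were removed. Values are timestamp<<1 | IsAdd.
func GC(topics map[string]int64, nowMillis int64) int {
	removed := 0
	for topic, v := range topics {
		isAdd := v&1 == 1
		ts := v >> 1
		if !isAdd && nowMillis-ts >= delTTLMillis {
			delete(topics, topic) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}

func main() {
	now := int64(1_000_000)
	topics := map[string]int64{
		"live":    (now-500_000)<<1 | 1, // ADD entries are never collected
		"old-del": (now - 120_000) << 1, // DEL, past the window
		"new-del": (now - 1_000) << 1,   // DEL, still inside the window
	}
	fmt.Println(GC(topics, now), len(topics)) // 1 2
}
```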

Note that most TopicMap state changes in the myself region can only be made by the node itself (other nodes simply merge the updates). The exception is disconnection: when a node disconnects, the other nodes force all of its TopicMap entries into the DEL state and set their DelTime.

3.2 other
| local \ remote | ADD | DEL |
|----------------|-----|-----|
| none | Merge into ADD | Merge into the DEL state if DELTime has not expired; otherwise ignore |
| ADD | Take the newer of the two | Take the newer of the two |
| DEL | Take the newer of the two | Take the newer of the two |

Special case: the State is scanned periodically to delete topics whose DELTime has expired, which moves them from DEL to none (GC).
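
The two merge tables (3.1 myself and 3.2 other) can be sketched as a pair of Go functions. This is an interpretation, not Route's code: Entry, MergeMyself, MergeOther, and the 60-second delTTLMillis are hypothetical, and stamping a forced ADD or DEL with the current time is an assumption, since the text does not say which timestamp a forced merge uses.

```go
package main

import "fmt"

// delTTLMillis is a hypothetical expiry window for DEL entries.
const delTTLMillis = 60_000

// Entry is one topic's state. Present == false is the "none" state;
// otherwise Value holds timestamp<<1 | IsAdd.
type Entry struct {
	Present bool
	Value   int64
}

func pack(tsMillis int64, add bool) int64 {
	v := tsMillis << 1
	if add {
		v |= 1
	}
	return v
}

func ts(v int64) int64   { return v >> 1 }
func isAdd(v int64) bool { return v&1 == 1 }

// latest returns whichever value carries the newer timestamp (local on a tie).
func latest(local, remote int64) int64 {
	if ts(remote) > ts(local) {
		return remote
	}
	return local
}

// delIntoNone covers the "none + remote DEL" cell shared by both tables.
func delIntoNone(remote, now int64) Entry {
	if now-ts(remote) < delTTLMillis {
		return Entry{true, remote}
	}
	return Entry{} // expired DEL: ignored
}

// MergeOther applies the 3.2 table to an entry owned by another node.
func MergeOther(local Entry, remote, now int64) Entry {
	switch {
	case local.Present:
		return Entry{true, latest(local.Value, remote)}
	case isAdd(remote):
		return Entry{true, remote} // none + ADD
	default:
		return delIntoNone(remote, now)
	}
}

// MergeMyself applies the 3.1 table to an entry owned by the local node.
// Stamping forced entries with "now" is an assumption of this sketch.
func MergeMyself(local Entry, remote, now int64) Entry {
	switch {
	case !local.Present && isAdd(remote):
		return Entry{true, pack(now, false)} // none + ADD: force DEL
	case !local.Present:
		return delIntoNone(remote, now)
	case isAdd(remote):
		return local // remote ADD cannot be newer: ignored
	case isAdd(local.Value):
		return Entry{true, pack(now, true)} // ADD + DEL: force ADD
	default:
		return Entry{true, latest(local.Value, remote)} // DEL + DEL
	}
}

func main() {
	now := int64(100_000)
	// The same remote ADD lands differently in the two regions.
	fmt.Println(MergeOther(Entry{}, pack(90_000, true), now))
	fmt.Println(MergeMyself(Entry{}, pack(90_000, true), now))
}
```

The asymmetry is the point: a node is the authority on its own subscriptions, so it overrides what others report about it, while for other nodes' entries it simply keeps the newest information.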

Finally

Project address: github.com/wiqun/route

My other open source projects:

  • Github.com/MMMzq/bot_t… (a Toast library for Flutter)
  • Github.com/auto-flutte… (a complete automated testing solution for Flutter)

Contact email: [email protected]