Follow my WeChat official account [old week chat architecture] for the principles behind the mainstream Java back-end technology stack, source code analysis, architecture, and solutions for high-concurrency, high-performance, high-availability Internet systems.

1. Foreword

Last month a reader asked me about using the IoT protocol MQTT in practice. I said I would write about it later, and before I knew it a month had slipped by. However busy things get, a promise to a reader has to be kept, hence this article.

2. MQTT protocol overview

2.1 What is the MQTT Protocol

MQTT (Message Queuing Telemetry Transport) is a lightweight messaging protocol based on the publish/subscribe pattern. It runs on top of TCP/IP and was invented by IBM in 1999. Its main characteristics are openness, simplicity, a small footprint, and ease of implementation, which make it suitable for constrained environments such as:

  • Constrained networks: low bandwidth and unreliable transmission
  • Constrained devices: the protocol runs on embedded devices whose processor, memory, and other resources are limited

Dozens of MQTT server implementations have grown up around the protocol, and clients can send messages to them from PHP, Java, Python, C, C#, and other languages. Because it is open and has low power overhead, MQTT is well suited to the Internet of Things, for example communication between sensors and servers or the collection of sensor data.

2.2 Publish/subscribe Mode

The publish/subscribe pattern is not specific to MQTT; much of the messaging middleware we use follows it as well. It is easy to confuse with the observer pattern, but the two are different. The observer pattern has only two roles, the observer and the observed, whereas publish/subscribe adds a third, the broker. At a deeper level, the observer and the observed are loosely coupled, while the publisher and the subscriber are not coupled at all.

In the client/server model, a client communicates directly with a server endpoint. Pub/sub, by contrast, separates the publisher (which sends messages) from the subscribers (which receive them). Publishers and subscribers never communicate directly and do not even know that the other exists. All communication between them is mediated by a third component, the broker.

The most important aspect of pub/sub is that it decouples the publisher of a message from its receivers (the subscribers). This decoupling has several dimensions:

  • Space decoupling: Publishers and subscribers do not need to know each other (for example, they do not exchange IP addresses and ports).
  • Time decoupling: Publishers and subscribers do not need to be running at the same time.
  • Synchronization decoupling: Operations on either component do not need to be interrupted while a message is published or received.

In short, the publish/subscribe model removes the direct communication of the traditional client/server model and hands it to the broker, decoupling the two sides in space, time, and synchronization.
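The space decoupling described above can be illustrated with a minimal in-process sketch. `TinyBroker` and the topic names are hypothetical, chosen only for this example. It matches exact topic names and delivers synchronously, so it shows neither time nor synchronization decoupling, and it is not how a real MQTT broker is built.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

# A handler receives the topic and the raw payload.
Handler = Callable[[str, bytes], None]


class TinyBroker:
    """Hypothetical in-process broker that routes by exact topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # The subscriber registers with the broker, never with a publisher.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> None:
        # The publisher hands the message to the broker and learns nothing
        # about who (if anyone) receives it.
        for handler in self._subscribers.get(topic, []):
            handler(topic, payload)


broker = TinyBroker()
received: List[Tuple[str, bytes]] = []
broker.subscribe("sensors/temperature", lambda t, p: received.append((t, p)))

broker.publish("sensors/temperature", b"21.5")  # delivered to one subscriber
broker.publish("sensors/humidity", b"40")       # nobody subscribed: dropped
```

The publisher and the subscriber only ever hold a reference to the broker, which is the point of the pattern.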

2.3 Scalability

Pub/sub scales much better than the traditional client/server pattern because broker operations can be highly parallelized and are event-driven. Message caching and intelligent message routing also contribute to scalability. Scaling to millions of connections is possible by clustering brokers and using load balancers to distribute the load across more individual servers, which is where MQTT gets more involved.

2.4 Message Filtering

Clearly, the broker plays a central role in pub/sub. But how does the broker filter all messages so that each subscriber receives only the ones it is interested in? The broker has several filtering options:

  • Subject-based filtering

    This filtering is based on the subject, or topic, that is part of each message. A receiving client subscribes with the broker to the topics it is interested in, and from then on the broker ensures that the client receives the messages published to those topics.
  • Content-based filtering

    In content-based filtering, the broker filters messages based on their content, and the receiving client specifies which content it is interested in. One significant disadvantage of this approach is that the content of the message must be known in advance and cannot be encrypted or easily modified.
  • Type-based filtering

    Filtering based on the type or class of a message (event) is common practice in object-oriented languages. For example, a subscriber can listen to all messages of type Exception or any of its subtypes.

2.5 Differences between MQTT and message queue

At this point you might say that since MQTT and mainstream message queues both use a publish/subscribe model, they must be the same thing. MQTT does have a lot in common with message queues, but there are differences. Here are some:

  • Message queues store messages until they are consumed: With a message queue, each incoming message is stored in the queue until it is received by a client, usually called a consumer. If no client picks up the message, it stays in the queue and waits to be consumed. In a message queue a message cannot go unprocessed, whereas in MQTT a topic may have no subscribers at all.
  • A message is consumed by only one client: Another big difference is that in a traditional message queue a message is processed by only one consumer, and the load is distributed across all consumers of the queue. In MQTT the behavior is the opposite: every subscriber to a topic receives the message, so each subscriber carries the same load.
  • Queues are named and must be created explicitly: Queues are much stricter than topics. Before a queue can be used, it must be created explicitly with a separate command. Messages can be published or consumed only after the queue has been named and created. In contrast, MQTT topics are very flexible and can be created on the fly.
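The second difference, one consumer per message versus every subscriber per message, can be sketched in a few lines. Both functions are hypothetical illustrations, and round-robin is only one possible way a queue might distribute load.

```python
from itertools import cycle
from typing import Dict, List


def deliver_queue(messages: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Queue semantics: each message goes to exactly one consumer.

    Round-robin distribution is an assumption made for this sketch.
    """
    inbox: Dict[str, List[str]] = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        inbox[name].append(message)
    return inbox


def deliver_topic(messages: List[str], subscribers: List[str]) -> Dict[str, List[str]]:
    """MQTT topic semantics: every subscriber receives every message."""
    return {name: list(messages) for name in subscribers}


messages = ["m1", "m2", "m3", "m4"]
queue_result = deliver_queue(messages, ["a", "b"])
topic_result = deliver_topic(messages, ["a", "b"])
```

With two consumers, the queue splits the four messages between them, while the topic hands all four to each subscriber.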

3. Important MQTT concepts

3.1 MQTT Client

Publishers and subscribers are both MQTT clients. The two roles are relative: they describe whether the client is currently publishing or receiving messages. The same MQTT client can both publish and subscribe.

An MQTT client is any device (from a microcontroller to a full-fledged server) that runs an MQTT library and connects to an MQTT broker over a network. For example, an MQTT client can be a very small, resource-constrained device that connects over a wireless network and carries only a minimal library. Basically, any device that speaks MQTT over TCP/IP can be called an MQTT client. The client side of the protocol is straightforward to implement, which is one of the reasons MQTT is so well suited to small devices. MQTT client libraries are available for many languages and platforms, for example Android, Arduino, C, C++, C#, Go, iOS, Java, JavaScript, and .NET.

3.2 MQTT Broker

The counterpart of an MQTT Client is an MQTT Broker, which is at the heart of any publish/subscribe protocol and can handle up to millions of connected MQTT clients, depending on the implementation.

The Broker receives all messages, filters them, determines which Client subscribed to each message, and sends the message to the corresponding Client. The Broker also holds session data, including subscribed and missed messages. The Broker is also responsible for client authentication and authorization.

3.3 MQTT Connection

MQTT is based on TCP/IP, so both the client and the broker need a TCP/IP stack.

An MQTT connection is always between one client and one broker. Clients never connect to each other directly. To initiate a connection, the client sends a CONNECT message to the broker. The broker responds with a CONNACK message and a status code. Once the connection is established, the broker keeps it open until the client sends a disconnect command or the connection is interrupted.

4. Message list

4.1 CONNECT

To initiate a connection, the client sends a CONNECT message to the broker. If this CONNECT message is malformed (according to the MQTT specification), or too much time passes between opening the network socket and sending the CONNECT message, the broker closes the connection.

A CONNECT message sent by an MQTT client can carry several pieces of information.

We will focus on the following options:

  • ClientId: The ClientId identifies the client to the broker and must be unique per broker. It can be 1 to 23 characters long. If it is longer than 23 characters, the server may return a CONNACK message with the return code Identifier Rejected. In MQTT 3.1.1, you can send an empty ClientId if you do not need the broker to hold any state. An empty ClientId results in a connection without state. In this case, the Clean Session flag must be set to true, otherwise the broker rejects the connection.
  • Clean Session: The Clean Session flag tells the broker whether the client wants to establish a persistent session. In a persistent session (CleanSession = false), the broker stores all of the client's subscriptions and all missed messages for subscriptions made at Quality of Service (QoS) level 1 or 2. If the session is not persistent (CleanSession = true), the broker stores nothing for the client and clears all information from any previous persistent session.
  • Username/Password: MQTT can send a user name and password for client authentication and authorization. However, unless the password is encrypted or hashed, it is sent as plain text. We strongly recommend using user names and passwords together with a secure transport. Brokers such as HiveMQ can authenticate clients using SSL certificates, in which case no user name and password are needed.
  • Will Message: The LastWillxxx fields define a will that the client sets when it connects to the broker. The will is stored on the broker. If the client disconnects from the broker ungracefully, the broker sends the will to the clients subscribed to the will's topic.
  • KeepAlive: KeepAlive is a time interval in seconds that the client specifies when the connection is established. It is the maximum period that the client and broker can go without exchanging messages.
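The constraints on these options can be collected into a small validation sketch. `ConnectOptions` and its field names are hypothetical and do not belong to any MQTT library. The default KeepAlive of 60 seconds is an assumption for the example. The 16-bit KeepAlive range and the rule that a password requires a user name come from MQTT 3.1.1 rather than from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConnectOptions:
    """Hypothetical container for the CONNECT fields discussed above."""

    client_id: str = ""
    clean_session: bool = True
    username: Optional[str] = None
    password: Optional[str] = None
    keep_alive: int = 60  # seconds; the default is an assumption

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty means looks fine."""
        issues: List[str] = []
        if not self.client_id and not self.clean_session:
            issues.append("empty ClientId requires CleanSession = true")
        if len(self.client_id) > 23:
            issues.append("ClientId longer than 23 characters may be rejected")
        if self.password is not None and self.username is None:
            issues.append("password set without user name")
        if not 0 <= self.keep_alive <= 65535:
            issues.append("KeepAlive must fit in 16 bits")
        return issues


ok = ConnectOptions(client_id="sensor-1", clean_session=False)
stateless = ConnectOptions(client_id="", clean_session=True)
rejected = ConnectOptions(client_id="", clean_session=False)
```

Checking these rules on the client before connecting gives a clearer error than waiting for the broker to refuse or drop the connection.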

4.2 CONNACK

When the broker receives a CONNECT message, it is obliged to respond with a CONNACK message. A CONNACK message consists of two parts:

  • Session Present flag: indicates whether the broker already holds a persistent session for the client
  • Connect return code: indicates whether the connection attempt succeeded

  • Session Present flag

    This flag tells the client whether the broker already has a persistent session from previous interactions with the client. The SessionPresent flag is related to the CleanSession flag. When a client connects with CleanSession set to true, SessionPresent is always false because no persistent session is available. If CleanSession is set to false, SessionPresent is true when the broker already stores session information for the ClientId, and false when it does not.

  • Connect return code

    The second field in the CONNACK message is the connect return code. It tells the client whether the connection attempt was successful: a value of 0 means the connection was accepted, and any other value identifies the reason it was refused.
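As a sketch of how a client might interpret a CONNACK, the snippet below parses the two-byte variable header. The return code values are the ones defined by MQTT 3.1.1. The function names are hypothetical and not taken from any client library.

```python
from enum import IntEnum
from typing import Tuple


class ConnectReturnCode(IntEnum):
    """Connect return codes as defined by MQTT 3.1.1."""

    ACCEPTED = 0
    UNACCEPTABLE_PROTOCOL_VERSION = 1
    IDENTIFIER_REJECTED = 2
    SERVER_UNAVAILABLE = 3
    BAD_USERNAME_OR_PASSWORD = 4
    NOT_AUTHORIZED = 5


def parse_connack(variable_header: bytes) -> Tuple[bool, ConnectReturnCode]:
    """Split the CONNACK variable header into (session_present, return_code)."""
    if len(variable_header) != 2:
        raise ValueError("CONNACK variable header must be exactly 2 bytes")
    flags, code = variable_header  # iterating bytes yields integers
    session_present = bool(flags & 0x01)  # bit 0 of the first byte
    return session_present, ConnectReturnCode(code)


def expected_session_present(clean_session: bool, broker_has_session: bool) -> bool:
    """SessionPresent rule from the text: only true for a resumed persistent session."""
    return (not clean_session) and broker_has_session


resumed = parse_connack(b"\x01\x00")
refused = parse_connack(bytes([0x00, 0x02]))
```

Here `resumed` represents an accepted connection that picked up a stored session, and `refused` a connection rejected because of its ClientId.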

4.3 PUBLISH

MQTT clients can publish messages as soon as they are connected to the broker. MQTT uses topic-based filtering, so each message must contain a topic that the broker can use to forward the message to interested clients. Typically, each message has a payload, which contains the data to transmit as bytes. MQTT is data-agnostic: it is up to the publisher whether to send XML, JSON, binary, or text data.

PUBLISH messages in MQTT have several properties worth discussing in detail:

  • Topic Name: The topic name is a simple string that is structured hierarchically, with forward slashes as delimiters. For example, “myhome/livingroom/temperature” or “Germany/Munich/Octoberfest/people”.
  • QoS: This number represents the Quality of Service (QoS) level of the message. There are three levels: 0, 1, and 2. The level determines what kind of guarantee the message has of reaching the intended recipient (client or broker).
  • Retain Flag: When this flag is true, the broker saves the message on the server (in memory or in a file) as the last retained message for its topic.
  • Payload: This is the actual content of the message. MQTT is data-agnostic. You can send text, images, encrypted data, or any other binary data.
  • Packet Identifier: The packet identifier uniquely identifies a message as it flows between the client and the broker. It is only relevant for QoS levels greater than zero.
  • DUP flag: This flag indicates that the message is a duplicate and was resent because the intended recipient (client or broker) did not acknowledge the original message. It is only relevant for QoS levels greater than zero.
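To show how these properties end up on the wire, here is a sketch of a PUBLISH encoder that follows the MQTT 3.1.1 packet layout: a fixed header byte carrying the DUP, QoS, and RETAIN bits, a variable-length "remaining length", the length-prefixed topic name, the packet identifier when QoS is above 0, and finally the payload. The function names are hypothetical. A real client library handles this internally.

```python
import struct
from typing import Optional


def encode_remaining_length(length: int) -> bytes:
    """Variable-length encoding: 7 data bits per byte, high bit means 'more follows'."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        if length > 0:
            digit |= 0x80
        out.append(digit)
        if length == 0:
            return bytes(out)


def encode_publish(topic: str, payload: bytes, qos: int = 0, retain: bool = False,
                   dup: bool = False, packet_id: Optional[int] = None) -> bytes:
    """Build a PUBLISH packet (control packet type 3)."""
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1, or 2")
    if qos == 0 and dup:
        raise ValueError("DUP is only meaningful for QoS > 0")
    if qos > 0 and (packet_id is None or not 1 <= packet_id <= 65535):
        raise ValueError("QoS > 0 needs a packet identifier in 1..65535")
    first_byte = (3 << 4) | (int(dup) << 3) | (qos << 1) | int(retain)
    topic_bytes = topic.encode("utf-8")
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        body += struct.pack("!H", packet_id)  # absent at QoS 0
    body += payload
    return bytes([first_byte]) + encode_remaining_length(len(body)) + body


qos0_packet = encode_publish("a/b", b"hi")
qos1_packet = encode_publish("a/b", b"hi", qos=1, retain=True, packet_id=10)
```

Comparing the two packets shows that the packet identifier only appears once the QoS level is above zero, as described above.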

When a client sends a message to the MQTT Broker for publication, the broker reads the message, acknowledges the message (depending on the QoS level), and processes the message. Processing by the broker involves determining which clients subscribe to the topic and sending messages to them.

The client that originally published the message is only concerned with delivering the PUBLISH message to the broker. Once the broker receives a PUBLISH message, it is the responsibility of the broker to deliver the message to all subscribers. The publishing client does not receive any feedback on whether anyone is interested in the published message or how many clients receive the message from the broker.

4.4 SUBSCRIBE

The client sends a SUBSCRIBE message to the broker to receive messages on the topics it is interested in. The SUBSCRIBE message is simple: it contains a unique packet identifier and a list of subscriptions.

  • Packet Identifier: As in the PUBLISH message, the packet identifier uniquely identifies the message as it flows between the client and the broker.
  • List of Subscriptions: A single SUBSCRIBE message can contain multiple subscriptions for the same client. Each subscription consists of a topic and a QoS level. The topic in a subscription can contain wildcards, which makes it possible to subscribe to a topic pattern rather than a specific topic. If a client has overlapping subscriptions, the broker delivers messages with the highest QoS level that applies to that topic.

4.5 SUBACK

To acknowledge each subscription, the broker sends a SUBACK message to the client. The message contains the packet identifier of the original SUBSCRIBE message (to clearly identify the message) and a list of return codes.

  • Packet Identifier: The packet identifier is a unique identifier used to identify the message. It is the same as in the SUBSCRIBE message.

  • Return Code: The broker sends one return code for each topic/QoS pair it receives in the SUBSCRIBE message. For example, if the SUBSCRIBE message has five subscriptions, the SUBACK message contains five return codes. Each return code acknowledges one topic and shows the QoS level granted by the broker. If the broker refuses a subscription, the SUBACK message contains a failure return code for that particular topic, for example because the client does not have permission to subscribe to the topic or because the topic is malformed.

    After the client has sent the SUBSCRIBE message and received the SUBACK message, it receives every published message that matches a topic in the subscriptions contained in the SUBSCRIBE message.
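The one-return-code-per-subscription rule can be sketched as follows. The return code values (0, 1, and 2 for the granted QoS, 0x80 for failure) are those of MQTT 3.1.1. `describe_suback` is a hypothetical helper, and the topics are made up for the example.

```python
from typing import Dict, Sequence


def describe_suback(topic_filters: Sequence[str], return_codes: bytes) -> Dict[str, str]:
    """Pair each requested topic filter with the broker's answer, in order."""
    if len(topic_filters) != len(return_codes):
        raise ValueError("SUBACK must carry one return code per subscription")
    result: Dict[str, str] = {}
    for topic_filter, code in zip(topic_filters, return_codes):
        if code in (0x00, 0x01, 0x02):
            result[topic_filter] = f"granted QoS {code}"
        elif code == 0x80:
            result[topic_filter] = "failure"
        else:
            raise ValueError(f"reserved SUBACK return code: {code:#04x}")
    return result


answers = describe_suback(
    ["myhome/groundfloor/+/temperature", "admin/#"],
    bytes([0x01, 0x80]),
)
```

Because the codes carry no topic names, the client has to keep the order of its SUBSCRIBE list to interpret them.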

4.6 UNSUBSCRIBE

The counterpart of the SUBSCRIBE message is the UNSUBSCRIBE message. It deletes a client's existing subscriptions on the broker. An UNSUBSCRIBE message is similar to a SUBSCRIBE message in that it has a packet identifier and a list of topics.

4.7 UNSUBACK

To acknowledge unsubscription, the broker sends an UNSUBACK acknowledgement message to the client. This message contains only the packet identifier of the original UNSUBSCRIBE message (to clearly identify the message).

When the client receives an UNSUBACK from the broker, it can assume that the subscriptions named in the UNSUBSCRIBE message have been deleted.

5. Topics

We have talked a lot about MQTT protocol formats and message types, so in this section we look at topics. Topics are important in MQTT because defining them is often the first thing we need to do when writing code.

In MQTT, the term topic refers to the UTF-8 string that the broker uses to filter messages for each connected client. A topic consists of one or more topic levels. Each topic level is separated by a forward slash (topic level separator).

MQTT topics are very lightweight compared to message queues. The client does not need to create the required topic before publishing or subscribing to it. The broker accepts each valid topic without any prior initialization.

5.1 Wildcards

When a client subscribes to a topic, it can subscribe to the exact topic of a published message, or it can use wildcards to subscribe to multiple topics at once. Wildcards can only be used when subscribing to topics, not when publishing messages. There are two types of wildcards: single-level and multi-level.

  • Single-level: +

    As the name suggests, a single-level wildcard replaces exactly one topic level. The plus sign represents the single-level wildcard in a topic.

    A topic matches a subscription with a single-level wildcard if it contains an arbitrary string in place of the wildcard. For example, a subscription to myhome/groundfloor/+/temperature matches myhome/groundfloor/livingroom/temperature, but not myhome/groundfloor/kitchen/fridge/temperature, where two levels occupy the wildcard's position.

  • Multi-level: #

    The multi-level wildcard covers multiple topic levels. The hash symbol represents the multi-level wildcard in a topic. For the broker to determine which topics match, the multi-level wildcard must be the last character in the topic and must be preceded by a forward slash (unless it is the only character).

    When a client subscribes to a topic with a multi-level wildcard, it receives all messages for topics that begin with the pattern before the wildcard, no matter how long or deep the topic is. If you specify only the multi-level wildcard as the topic (#), you receive all messages sent to the MQTT broker. Subscribing with the multi-level wildcard alone is an anti-pattern if you expect high throughput.
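The matching rules for both wildcards can be sketched as a small function. `topic_matches` is a hypothetical helper written for this article, not a library call. It assumes the subscription is well formed, and it also applies the MQTT 3.1.1 rule that a subscription starting with a wildcard does not match topics beginning with $.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a published topic matches a subscription's topic filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # A filter that starts with a wildcard never matches a $-prefixed topic.
    if topic.startswith("$") and filter_levels[0] in ("#", "+"):
        return False

    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level, and it
            # matches everything from here on (including the parent level).
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False  # the topic is shorter than the filter
        if level != "+" and level != topic_levels[index]:
            return False  # a literal level that differs
    # No multi-level wildcard: both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


single = topic_matches("myhome/groundfloor/+/temperature",
                       "myhome/groundfloor/livingroom/temperature")
too_deep = topic_matches("myhome/groundfloor/+/temperature",
                         "myhome/groundfloor/kitchen/fridge/temperature")
multi = topic_matches("myhome/groundfloor/#",
                      "myhome/groundfloor/kitchen/temperature")
```

Splitting on the forward slash and comparing level by level mirrors the definition of a topic as a sequence of topic levels.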

5.2 Topics beginning with $

In general, you can name MQTT topics as you wish, with one exception: topics that begin with a $ symbol serve a different purpose. They are not part of the subscription when you subscribe to the multi-level wildcard as a topic (#). Topics that begin with $ are reserved for the MQTT broker's internal statistics, and clients cannot publish messages to them. There is currently no official standard for these topics. $SYS/ is commonly used as the prefix for all such information, but broker implementations vary. A recommendation for $SYS topics is provided in the MQTT GitHub wiki.


That covers the basics. For an introduction to MQTT Version 3.1.1 and the details of the protocol formats, see the specification:

Docs.oasis-open.org/mqtt/mqtt/v…

This series is split into a fundamentals part and a hands-on part. In the next article we will build an MQTT server that devices from other manufacturers can connect to. Stay tuned!