RocketMQ best practices

Some useful suggestions for RocketMQ users.

The agent role

The agent roles are: asynchronous master node, synchronous master node and slave node. If you cannot tolerate message loss, we recommend that you deploy a synchronization host and attach a slave server to it. If you are tolerant of loss but want the agent to always be available, you can deploy ASYNC_MASTER and SLAVE together, or if you want to keep things simple, you might just need an asynchronous host with no SLAVE nodes.

Refreshing the Disk Type

It is recommended to use asynchronous refreshes because synchronous refreshes are expensive and cause too much loss of new energy. If you want reliability, we recommend that you use synchronous and cluster machines.

Best practices for producers

Delivery status

When sending a message, you will get a send status and a send result. First we build the message’s isWaitStoreMsgok =true(the default is true). If there is no exception, we will always receive “OK”.

FLUSH_DISK_TIMEOUT: flushtimeout time

If the agent sets MessageStoreConfig’s FlushDiskType=SYNC_FLUSH(ASYNC_FLUSH by default), This status is obtained if the agent does not complete disk flushing within messagestoreConfig’s SyncFlushTimeout, which defaults to 5 seconds).

FLUSH_SLAVE_TIMEOUT: flushs the secondary node timeout

This status is obtained if the agent’s role is synchronous host (default asynchronous host) and if the slave agent is not asynchronously refreshed within the syncFlushTimeout(default 5 seconds) of MessageStoreConfig.

SLAVE_NOT_AVAILABLE: The slave machine is unavailable

This state is obtained if the role of the agent is synchronous master agent (asynchronous master agent by default), but no slave agent is configured.

SEND_OK: the message is sent successfully

Sending “OK” does not mean that it is reliable, and to ensure that no messages are lost, synchronous host or synchronous refresh should be enabled.

Duplication or Missing: Duplication or Missing

If you get flush-disk-timeout, flush-slave-timeout, and the agent is just off at this point, you may find that your message is lost. At this point, you have two options, one is to let it go, which may cause the message to be lost, or to resend the message, which may cause the message to be repeated. In general, we recommend resending and finding a way to handle deduplication when it’s used. Unless you think it’s okay when some information gets lost. Keep in mind, however, that resend is useless when there are no slave hosts, and if this happens, you should preserve the scenario and warn the cluster manager.

timeout

The client sends a request to the agent and waits for a response, but if the maximum wait event has passed and no response is returned, the client throws a RemotingTimeoutException. The default waiting time is 3 seconds. Instead of sending (MSG), you can set the timeout by using the send(MSG, timeout) method. Note that we do not recommend setting the wait time too short. This is because the agent takes some time to refresh disks or synchronize with slave servers.

Also, if the value exceeds syncflushTimeout too much, the effect may be minimal because the agent may return FLUSH_SLAVE_TIME or FLUSH_SLAVE_TIMEOU before timeout.

Message body size

We recommend that the size of the message should not exceed 512K.

Async Sending

The default send(MSG) method will block until a return response message is received, so if you prefer to care about performance, we recommend that you use Send (MSG, callback), which will work asynchronously.

Producer Group

In general, the producer group has no effect. But if you are adding a thing, you should note that by default only one production with the same production group can be created within the same JVM, which is usually sufficient.

Thread safety

Producers are thread-safe. You can use it in business solutions.

performance

If you want to have multiple producers doing big data processing in one JVM. We recommend:

Use several producers to send asynchronously (3-5 is sufficient)
Set the instance name for each producer

Best practices for consumption

Consumer groups and subscriptions

First, you need to note that different consumer groups can consume the same topic independently, and each consumer group has its own compensation. Make sure that every consumer in the same group subscribes to the same topic.

Message listener

Orderly

The consumer locks each message queue to ensure that it is consumed one by one in order. This incurs a performance penalty, but it is useful when you are dealing with the order of messages. An exception is not recommended, you can return to ConsumeOrderlyStatus. SUSPEND_CURRENT_QUEUE_A_MOMENT.

Concurrently

As the name implies, the user will consume concurrently. It is recommended to use it to obtain good performance, throw an exception, is not recommended. You can use ConsumeConcurrentlyStatus RECONSUME_LATER for replacement.

Consume Status

You can return for MessageListenerConcurrently RECONSUME_LATER tell consumers that you can’t spending it right now, want to wait for a moment, then spending it, then you can continue to consume the other messages, for MessageListenerOrderly, Because you care about the order, you can’t skip a particular message, but you can tell the consumer to wait by returning SUSPEND_CURRENT_QUEUE_A_MOMENT(pause the current queue).

Blocking

Blocking listeners are not recommended. Because it blocks the thread pool, it can eventually cause the consuming process to stop.

Number of threads

Consumers use ThreadPoolExecutor for internal consumption, so you can change the maximum number of consuming threads and the maximum number of consuming threads by setConsumeThreadMin or setConsumeThreadMax.

Where to start spending

When a new consumer is created, it will need to decide whether it needs to consume historical messages that already exist with the broker.

CONSUME_FROM_LAST_OFFSET: This configuration will ignore historical messages and consume any subsequent content.

CONSUME_FROM_FIRST_OFFSET: This configuration will consume all messages that exist in the broker.

CONSUME_FROM_TIMESTAMP: This configuration regenerates a message after consuming the specified timestamp.

repeat

Many situations can lead to duplication, such as:

The producer sends repeatedly (for example, in the case of FLUSH_SLAVE_TIMEOUT).
Consumers shut down and some compensation mechanisms were not updated to the agent in time.

Therefore, if the application does not tolerate duplication, you may need to do some external work to deal with the problem, for example, by checking the primary key of the database.

Best practices for name services

In Apache LocketMQ, name servers are designed to coordinate each component of a distributed system, primarily by managing topic routing information.

Management mainly consists of two parts:

The agent periodically updates the metadata stored in each name server.
Name servers provide up-to-date routing information for client services, including producers, consumers, and command-line clients.

Therefore, before starting the proxy and client, we need to tell them how to access the name server by providing a list of name server addresses. There are four ways to do this in Apache RocketMQ.

Programming methods

The Java configuration items

The name server address list can be provided to the application by specifying the successor Java option Rocketmq.namesrv.addr prior to startup.

The environment variable

The HTTP client

priority

The first approach is superior to the latter:

Programmatic Way > Java Options > Environment Variable > HTTP Endpoint
Copy the code