Preface

Recently, while investigating Netty, I ran into a problem when writing an encoding and decoding module: Chinese strings were encoded and decoded incorrectly. It turned out that I had made a basic mistake. Here is a short review.

Reproducing the error

While designing a custom protocol on top of Netty, I found that whenever a String attribute contained Chinese characters, decoding went wrong. The failure was not always an exception; sometimes the decoded string was truncated or contained unreadable characters. The encoder and decoder were implemented as follows:

// Entity
@Data
public class ChineseMessage implements Serializable {

    private long id;
    private String message;
}

// Encoder - <incorrect, do not copy>
public class ChineseMessageEncoder extends MessageToByteEncoder<ChineseMessage> {

    @Override
    protected void encode(ChannelHandlerContext ctx, ChineseMessage target, ByteBuf out) throws Exception {
        // Write the ID
        out.writeLong(target.getId());
        String message = target.getMessage();
        int length = message.length();
        // Write the length of Message
        out.writeInt(length);
        // Write the Message character sequence
        out.writeCharSequence(message, StandardCharsets.UTF_8);
    }
}

// Decoder
public class ChineseMessageDecoder extends ByteToMessageDecoder {

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
        // Read the ID
        long id = in.readLong();
        // Read the length of Message
        int length = in.readInt();
        // Read the Message character sequence
        CharSequence charSequence = in.readCharSequence(length, StandardCharsets.UTF_8);
        ChineseMessage message = new ChineseMessage();
        message.setId(id);
        message.setMessage(charSequence.toString());
        out.add(message);
    }
}

I then wrote simple client and server code and had the client and server exchange messages containing Chinese text:

// Server logs
Received client request: ChineseMessage(id=1, message=)
io.netty.handler.codec.DecoderException: java.lang.IndexOutOfBoundsException: readerIndex(15) + length(8) exceeds writerIndex(21)

// Client logs
Response received from the server: ChineseMessage(id=2, message=)
io.netty.handler.codec.DecoderException: java.lang.IndexOutOfBoundsException: readerIndex(15) + length(8) exceeds writerIndex(21)

In fact, the problem is hidden in the encoding and decoding module. I had been working a 996 schedule for the previous two months, frantically writing CRUD code and studying Netty only in my spare time, so some basic knowledge simply did not come to mind. I searched the major search engines for this problem but found no answer, perhaps because of the wrong approach or the wrong keywords. Many blog posts also just copied other people’s demos, which happened to send English-only messages. So the problem stayed stuck for a few days before the 2019 National Day holiday, while I was busy with business work and had no time to think about it.

A flash of insight

On the eve of the 2019 National Day holiday, the team was rushing to build a CRUD back-office management system without front-end/back-end separation, and several colleagues were discussing a garbled-text problem on one of the pages. Two words from their discussion suddenly woke me up: “garbled” and “UTF-8”. The first thing that came to mind was an article I wrote when I first started on Cnblogs: “Boy, I’m Out of Order again – Summary of Java Character Coding Principles” (the title seems quite odd now). Back then I had done some research on how character encoding works, and I felt a little ashamed that I had almost forgotten what I had studied little more than a year earlier.

The reason: in UTF-8, a Chinese character usually occupies 3 bytes (8 x 3 = 24 bits) and occasionally 4 bytes (8 x 4 = 32 bits). In Java, String#length() returns the length of the String instance’s char array, that is, the number of chars, not the number of bytes. The first parameter of ByteBuf#readCharSequence(int length, Charset charset), however, is the number of bytes to read, which is why the length has to be written in advance during encoding. This is where the hidden problem lies: ByteBuf#writeCharSequence(CharSequence sequence, Charset charset) writes the encoded bytes, so each Chinese char typically becomes 3 bytes, while the length field holds only the char count. ChineseMessageEncoder therefore writes all the bytes of the character sequence but records a length that is about 2 bytes too short for every Chinese character, and ChineseMessageDecoder reads fewer bytes than were written. You end up with an incomplete or incorrect string.
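
The mismatch can be seen without Netty. The following is a minimal JDK-only sketch, not the code from this article: the class name LengthMismatchDemo, the sample string, and the use of java.nio.ByteBuffer in place of ByteBuf are all assumptions made for illustration. It lays out a frame the way the faulty encoder does (8-byte id, 4-byte length holding the char count, then the UTF-8 bytes) and reads it back the way the decoder does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class LengthMismatchDemo {

    // Hypothetical sample: three Chinese characters written as Unicode escapes.
    static final String SAMPLE = "\u4f60\u597d\u5417";

    // Mimics the faulty encoder: the length field holds String#length(), not the byte count.
    static ByteBuffer encodeWithCharCount(long id, String message) {
        byte[] utf8 = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + Integer.BYTES + utf8.length);
        buf.putLong(id);
        buf.putInt(message.length());
        buf.put(utf8);
        buf.flip();
        return buf;
    }

    // Mimics the decoder: reads exactly `length` bytes and decodes them as UTF-8.
    static String decode(ByteBuffer buf) {
        buf.getLong();
        int length = buf.getInt();
        byte[] bytes = new byte[length];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer frame = encodeWithCharCount(1L, SAMPLE);
        System.out.println("chars = " + SAMPLE.length());
        System.out.println("UTF-8 bytes = " + SAMPLE.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("frame size = " + frame.limit());
        String decoded = decode(frame);
        System.out.println("decoded chars = " + decoded.length());
        System.out.println("position after decode = " + frame.position());
        System.out.println("unread bytes = " + frame.remaining());
    }
}
```

With this three-character sample the frame is 21 bytes and the decoder stops at position 15 with 6 bytes unread. That lines up with the readerIndex(15) and writerIndex(21) figures in the log shown earlier: presumably the decoder was invoked again on the leftover bytes and could not satisfy the next 8-byte readLong().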

The solution

UTF-8 encoded Chinese characters take up three bytes in most cases and four bytes in some rare cases. A crude fix is to triple the length written to the byte buffer, which only requires a small change to the encoder:

public class ChineseMessageEncoder extends MessageToByteEncoder<ChineseMessage> {

    @Override
    protected void encode(ChannelHandlerContext ctx, ChineseMessage target, ByteBuf out) throws Exception {
        // Write the ID
        out.writeLong(target.getId());
        String message = target.getMessage();
        int length = message.length() * 3;      // <1> Crudely triple the length written for the byte sequence
        // Write the length of Message
        out.writeInt(length);
        // Write the Message character sequence
        out.writeCharSequence(message, StandardCharsets.UTF_8);
    }
}

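Of course, this is too crude: the hard-coded factor is neither standard nor friendly, and the length is only right when every char encodes to exactly three bytes (a single ASCII character is enough to break it). Netty already provides a built-in utility class, io.netty.buffer.ByteBufUtil: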

// Get the maximum number of bytes needed to encode the character sequence as UTF-8
public static int utf8MaxBytes(CharSequence seq){}

// Write the character sequence as UTF-8 and return the number of bytes written - this method is recommended
public static int writeUtf8(ByteBuf buf, CharSequence seq){}
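
A fixed multiplier gives an upper bound rather than an exact size, which matters as soon as a message mixes ASCII and Chinese. The sketch below uses only the JDK; the class name Utf8SizeDemo, the helper maxBytes, and the sample string are hypothetical, and maxBytes merely imitates the "three bytes per char" estimate discussed above rather than calling Netty.

```java
import java.nio.charset.StandardCharsets;

final class Utf8SizeDemo {

    // Hypothetical mixed sample: two ASCII letters followed by two Chinese characters.
    static final String MIXED = "ok\u4e2d\u6587";

    // Worst-case estimate: every char is assumed to need three bytes.
    static int maxBytes(CharSequence seq) {
        return seq.length() * 3;
    }

    // Exact size: encode the string and count the resulting bytes.
    static int exactBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("chars = " + MIXED.length());
        System.out.println("estimate = " + maxBytes(MIXED));
        System.out.println("exact = " + exactBytes(MIXED));
    }
}
```

For this sample the estimate is 12 while only 8 bytes are actually produced, so a decoder that trusted the estimate as a length prefix would try to read past the end of the frame. An upper bound is suitable for sizing a buffer, not for a length field.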

Use ByteBufUtil#writeUtf8() to write the character sequence, then use the byte count it returns to overwrite the placeholder length that was written earlier at the recorded writerIndex:

public class ChineseMessageEncoder extends MessageToByteEncoder<ChineseMessage> {

    @Override
    protected void encode(ChannelHandlerContext ctx, ChineseMessage target, ByteBuf out) throws Exception {
        out.writeLong(target.getId());
        String message = target.getMessage();
        // Record the current writer index
        int writerIndex = out.writerIndex();
        // Write a placeholder length first
        out.writeInt(0);
        // Write the UTF-8 character sequence
        int length = ByteBufUtil.writeUtf8(out, message);
        // Overwrite the placeholder with the real byte length
        out.setInt(writerIndex, length);
    }
}

So, problem solved. If you run into other Netty encoding and decoding problems, the same line of thinking applies.
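
The reserve-then-backfill idea does not depend on Netty. As a minimal sketch under stated assumptions, the code below reproduces it with java.nio.ByteBuffer; the class name BackfillLengthDemo, its methods, and the sample string are hypothetical and stand in for the ByteBuf-based encoder and decoder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BackfillLengthDemo {

    // Writes the id, a 4-byte placeholder, then the UTF-8 bytes, and finally patches the placeholder.
    static ByteBuffer encode(long id, String message) {
        // Worst case for UTF-8 is three bytes per Java char, so this capacity is always enough.
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + Integer.BYTES + message.length() * 3);
        buf.putLong(id);
        int lengthIndex = buf.position();    // remember where the length field lives
        buf.putInt(0);                       // placeholder length
        int start = buf.position();
        buf.put(message.getBytes(StandardCharsets.UTF_8));
        int byteLength = buf.position() - start;
        buf.putInt(lengthIndex, byteLength); // absolute put: does not move the position
        buf.flip();
        return buf;
    }

    // Reads the id, the byte length, and then exactly that many UTF-8 bytes.
    static String decode(ByteBuffer buf) {
        buf.getLong();
        byte[] bytes = new byte[buf.getInt()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "ok\u4e2d\u6587";
        ByteBuffer frame = encode(2L, original);
        String decoded = decode(frame);
        System.out.println("round trip ok = " + original.equals(decoded));
        System.out.println("unread bytes = " + frame.remaining());
    }
}
```

In this JDK version the byte count is known before the bytes are written, so the backfill is not strictly necessary; it is kept to mirror the Netty pattern, where the count only becomes available as the return value of the write.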

Summary

Half of learning Netty is encoding and decoding; the other half is network protocol knowledge and tuning.

The Netty source code is very good: elegant and comfortable to read.

Netty is fun.

Appendix

Introducing dependencies:

<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-all</artifactId>
    <version>4.1.41.Final</version>
</dependency>
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.10</version>
    <scope>provided</scope>
</dependency>

Code:

// Entity
@Data
public class ChineseMessage implements Serializable {

    private long id;
    private String message;
}

// Encoder
public class ChineseMessageEncoder extends MessageToByteEncoder<ChineseMessage> {


    @Override
    protected void encode(ChannelHandlerContext ctx, ChineseMessage target, ByteBuf out) throws Exception {
        out.writeLong(target.getId());
        String message = target.getMessage();
        int writerIndex = out.writerIndex();
        out.writeInt(0);
        int length = ByteBufUtil.writeUtf8(out, message);
        out.setInt(writerIndex, length);
    }
}

// Decoder
public class ChineseMessageDecoder extends ByteToMessageDecoder {

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
        long id = in.readLong();
        int length = in.readInt();
        CharSequence charSequence = in.readCharSequence(length, StandardCharsets.UTF_8);
        ChineseMessage message = new ChineseMessage();
        message.setId(id);
        message.setMessage(charSequence.toString());
        out.add(message);
    }
}

// Client
@Slf4j
public class ChineseNettyClient {

    public static void main(String[] args) throws Exception {
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        Bootstrap bootstrap = new Bootstrap();
        try {
            bootstrap.group(workerGroup);
            bootstrap.channel(NioSocketChannel.class);
            bootstrap.option(ChannelOption.SO_KEEPALIVE, true);
            bootstrap.option(ChannelOption.TCP_NODELAY, Boolean.TRUE);
            bootstrap.handler(new ChannelInitializer<SocketChannel>() {

                @Override
                protected void initChannel(SocketChannel ch) throws Exception {
                    ch.pipeline().addLast(new LengthFieldBasedFrameDecoder(1024, 0, 4, 0, 4));
                    ch.pipeline().addLast(new LengthFieldPrepender(4));
                    ch.pipeline().addLast(new ChineseMessageEncoder());
                    ch.pipeline().addLast(new ChineseMessageDecoder());
                    ch.pipeline().addLast(new SimpleChannelInboundHandler<ChineseMessage>() {

                        @Override
                        protected void channelRead0(ChannelHandlerContext ctx, ChineseMessage message) throws Exception {
                            log.info("Response received from server :{}", message);
                        }
                    });
                }
            });
            ChannelFuture future = bootstrap.connect("localhost", 9092).sync();
            System.out.println("Client started successfully...");
            Channel channel = future.channel();
            ChineseMessage message = new ChineseMessage();
            message.setId(1L);
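            // The demo is meant to send a Chinese string here; an ASCII-only message would not expose the length bug
            message.setMessage("Zhang Big Dog");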
            channel.writeAndFlush(message);
            future.channel().closeFuture().sync();
        } finally {
            workerGroup.shutdownGracefully();
        }
    }
}

// Server
@Slf4j
public class ChineseNettyServer {

    public static void main(String[] args) throws Exception {
        int port = 9092;
        ServerBootstrap bootstrap = new ServerBootstrap();
        EventLoopGroup bossGroup = new NioEventLoopGroup();
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        try {
            bootstrap.group(bossGroup, workerGroup)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {

                        @Override
                        protected void initChannel(SocketChannel ch) throws Exception {
                            ch.pipeline().addLast(new LengthFieldBasedFrameDecoder(1024, 0, 4, 0, 4));
                            ch.pipeline().addLast(new LengthFieldPrepender(4));
                            ch.pipeline().addLast(new ChineseMessageEncoder());
                            ch.pipeline().addLast(new ChineseMessageDecoder());
                            ch.pipeline().addLast(new SimpleChannelInboundHandler<ChineseMessage>() {

                                @Override
                                protected void channelRead0(ChannelHandlerContext ctx, ChineseMessage message) throws Exception {
                                    log.info("Request received from client :{}", message);
                                    ChineseMessage chineseMessage = new ChineseMessage();
                                    chineseMessage.setId(message.getId() + 1L);
                                    chineseMessage.setMessage("Little Dog Zhang");
                                    ctx.writeAndFlush(chineseMessage);
                                }
                            });
                        }
                    });
            ChannelFuture future = bootstrap.bind(port).sync();
            log.info("Server started successfully...");
            future.channel().closeFuture().sync();
        } finally {
            workerGroup.shutdownGracefully();
            bossGroup.shutdownGracefully();
        }
    }
}

Links

  • GitHub Page: www.throwable.club/2019/10/03/…
  • Coding Page: throwable.coding.me/2019/10/03/…

C-2-d E-A-20191003 Happy National Day (*^▽^*)

Technical official account: Throwable Digest, which publishes the author’s original technical articles from time to time (never plagiarized or reprinted).