Before learning about socket programming, we need to look at another computing fundamental: endianness, which literally means byte order. It is the order of the bytes that make up a multi-byte word in computer memory or on a digital communication link. The original text reads as follows:

In computing, endianness is the order or sequence of bytes of a word of digital data in computer memory.

The CPU reads data from the data bus in words, not bytes. On a 32-bit processor system, a word is 4 bytes long.

So here’s the question. If these four bytes represent a large int, then which of the four bytes comes first and which comes last? How do we parse these four bytes to get the right number?

So byte order only matters for multi-byte data. If the data is handled one byte at a time, for example digits and ASCII characters in UTF-8, which each take 1 byte, byte order has no meaning.

Big-endian, little-endian

Since multi-byte data has a byte order problem, there are two kinds of order: big-endian and little-endian. In big-endian mode, the lowest address stores the most significant byte, followed by the less significant bytes. Little-endian mode is the opposite: the lowest address stores the least significant byte, followed by the more significant bytes.

As shown in the figure above, 0x0A0B0C0D is written in hexadecimal, where each pair of digits represents one byte. In big-endian mode, 0A is placed first (at the lowest address), then 0B, and so on. In little-endian mode, 0D is placed first.

Big-endian mode matches human reading habits, since the value is read once from left to right. The sign bit is also in the first byte, so checking whether a number is positive or negative is cheap. On the other hand, arithmetic that starts from the low-order byte, such as addition with carry, can be less convenient than in little-endian mode.

The disadvantages of big-endian mode are the advantages of little-endian mode. In little-endian mode, a narrowing type conversion does not need to move any bytes: the low-order bytes already sit at the lowest address, so the conversion is cheap. For example, converting an int to a short.

Host byte order, network byte order

The concepts of host byte order and network byte order come up in network communication.

Host byte order is simply the byte order the local machine uses for its own data, while network byte order is the byte order used when exchanging data with other devices on the network. Network byte order is specified as big-endian.

So whatever order you use locally, if you want to pass through my territory, I'm going to receive the data in big-endian order, and you have to organize it my way. It is simply a convention.

Therefore, when using the socket interface, you need to use the related conversion functions to convert data to big-endian mode, for example for the port number:

// htons means host-to-network short; it takes an integer, not a string
serv_addr.sin_port = htons(80);

How do I determine the local byte order

In Unix Advanced Programming, a union is used. In essence, it views an int as an array of characters (char[]) and then checks whether the first character in the array holds the high-order or the low-order byte. Here I simply read an integer's bytes through a character pointer:

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>   // htons
#include <netinet/in.h>

int main(void) {
    // Use a 16-bit value so that p[0] is one of its two bytes on any host
    uint16_t i = 0x0a0b;
    unsigned char *p = (unsigned char *) &i;
    if (p[0] == 0x0a) { // Check the value of the first (lowest-address) byte
        printf("before: Big Endian val: %02x\n", p[0]);
    } else {
        printf("before: Little Endian val: %02x\n", p[0]);
    }

    // Convert to network byte order (big-endian)
    uint16_t rel = htons(i);
    p = (unsigned char *) &rel;
    if (p[0] == 0x0a) {
        printf("after: Big Endian val: %02x\n", p[0]);
    } else {
        printf("after: Little Endian val: %02x\n", p[0]);
    }
    return 0;
}

Output on a little-endian host (for example x86):

before: Little Endian val: 0b
after: Big Endian val: 0a