Article source: blog.csdn.net/analogous_l…

Over the years I have worked on many projects and written a lot of network code. From Windows to Linux, I have fallen into many pits. Because network programming involves so many details and techniques, I have long wanted to write an article that sums up my experience in this area.

This article covers both Windows and Linux. Let's begin.


1. How to write a non-blocking connect() function

By default, connect() blocks until the three-way handshake completes or the attempt times out, and the program's execution flow is stuck for that whole time. So how do you write non-blocking connection code with connect()?

On both Windows and Linux, the following approach works:

1. When creating the socket, set it to non-blocking mode. For details, see my article “Server Programming Tips (4) – How to set the socket to non-blocking mode”.

2. Then call connect(). If the connection succeeds immediately, connect() returns 0. Otherwise it returns -1 (SOCKET_ERROR, which is also -1, on Windows), and the error code is WSAEWOULDBLOCK (Windows) or EINPROGRESS (Linux), meaning the connection is still in progress.

3. Then call select() to check whether the socket becomes writable within the specified time. If it does, the connection attempt has finished. On Linux a failed attempt also makes the socket writable, so confirm the result with getsockopt() and SO_ERROR; on Windows a failed attempt is reported through the exception set instead.

Note that on Linux, when connect() returns -1, the error code may be EINPROGRESS, or EINTR if the call was interrupted by a signal. Take both into account. On Windows, besides select(), you can also use the platform's own WSAAsyncSelect or WSAEventSelect functions to check whether the socket is writable.

Here’s the code:

    /**
     * @param timeout Connection timeout, in seconds
     * @return true on success, false on failure
     */
    bool CSocket::Connect(int timeout)
    {
        // Windows: set the socket to non-blocking mode
        unsigned long on = 1;
        if (::ioctlsocket(m_hSocket, FIONBIO, &on) != 0)
            return false;

        // Linux: set the socket to non-blocking mode
        /*
        int oldflag = ::fcntl(m_hSocket, F_GETFL, 0);
        int newflag = oldflag | O_NONBLOCK;
        if (::fcntl(m_hSocket, F_SETFL, newflag) == -1)
            return false;
        */

        struct sockaddr_in addrSrv = { 0 };
        addrSrv.sin_family = AF_INET;
        // addr is the IPv4 address in host byte order
        addrSrv.sin_addr.s_addr = htonl(addr);
        addrSrv.sin_port = htons((u_short)m_nPort);
        int ret = ::connect(m_hSocket, (struct sockaddr*)&addrSrv, sizeof(addrSrv));
        if (ret == 0)
            return true;

        // Windows: anything other than WSAEWOULDBLOCK is a real failure
        if (ret < 0 && WSAGetLastError() != WSAEWOULDBLOCK)
            return false;

        // Linux: anything other than EINPROGRESS or EINTR is a real failure
        /*
        if (ret < 0 && errno != EINPROGRESS && errno != EINTR)
            return false;
        */

        fd_set writeset;
        FD_ZERO(&writeset);
        FD_SET(m_hSocket, &writeset);
        struct timeval tv;
        tv.tv_sec = timeout;
        // Use tv_usec for a timeout with finer precision
        tv.tv_usec = 0;
        if (::select(m_hSocket + 1, NULL, &writeset, NULL, &tv) != 1)
            return false;

        // Linux: a failed attempt also reports the socket as writable, so check SO_ERROR
        /*
        int err = 0;
        socklen_t len = sizeof(err);
        if (::getsockopt(m_hSocket, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
            return false;
        */

        return true;
    }

2. How to send and receive data correctly on a non-blocking socket

We will not discuss blocking mode here, in which send and recv block at the call site when the TCP window is too small or no data is available. For receiving data, the usual process is to check the socket for readable data with select (Windows and Linux), WSAAsyncSelect() or WSAEventSelect() (Windows), or poll or epoll_wait (Linux), and then receive it. For sending data, call send in a loop until everything has been sent or the call fails with EWOULDBLOCK, in which case the rest is sent later.

The epoll model on Linux has two modes: level-triggered (LT) and edge-triggered (ET). In edge-triggered mode you must drain the socket in one go, that is, keep calling recv until it fails with the error code EWOULDBLOCK. In level-triggered mode on Linux, or on Windows, you can receive a fixed number of bytes at a time, depending on the business logic, or keep receiving until nothing is left. Another difference, as mentioned above, is that the Windows code does not need to check for EINTR; it only needs to check for WSAEWOULDBLOCK. The code is as follows:

For Windows or Linux level-triggered mode, where each call may receive fewer bytes than requested, or as much as is available at that moment:

    bool TcpSession::Recv()
    {
        // Receive at most 256 bytes per call
        char buff[256];
        int nRecv = ::recv(clientfd_, buff, sizeof(buff), 0);
        // The peer closed the connection
        if (nRecv == 0)
            return false;
        if (nRecv < 0)
        {
            // Linux: no data right now, or interrupted by a signal; try again later
            // (on Windows, check WSAGetLastError() == WSAEWOULDBLOCK instead)
            return errno == EWOULDBLOCK || errno == EINTR;
        }

        inputBuffer_.add(buff, (size_t)nRecv);

        return true;
    }

In Linux epoll edge-triggered (ET) mode, be sure to receive everything that is available in one go:

    bool TcpSession::RecvEtMode()
    {
        // Receive at most 256 bytes per call
        char buff[256];
        while (true)
        {
            int nRecv = ::recv(clientfd_, buff, sizeof(buff), 0);
            if (nRecv == -1)
            {
                // Interrupted by a signal: retry, data may still be pending
                if (errno == EINTR)
                    continue;
                // The socket has been drained
                if (errno == EWOULDBLOCK)
                    return true;

                return false;
            }
            // The peer closed the socket
            else if (nRecv == 0)
                return false;

            inputBuffer_.add(buff, (size_t)nRecv);
        }
    }


For sending data on Linux:

    bool TcpSession::Send()
    {
        while (true)
        {
            int n = ::send(clientfd_, buffer_, buffer_.length(), 0);
            if (n == -1)
            {
                // The TCP window is full and nothing can be sent right now; try again next time
                if (errno == EWOULDBLOCK)
                    break;
                // Interrupted by a signal; retry
                else if (errno == EINTR)
                    continue;

                return false;
            }
            // The peer closed the connection
            else if (n == 0)
                return false;

            // Drop the bytes that were sent
            buffer_.erase(n);
            // Everything has been sent
            if (buffer_.length() == 0)
                break;
        }

        return true;
    }


You can use setsockopt to set timeouts for sending and receiving data. You can also implement the timeout yourself: record the time before receiving starts, check the elapsed time while receiving, and treat the operation as timed out once the limit is exceeded. The code for the two approaches is:

    // Windows: the timeouts are given in milliseconds
    long tmSend = 3*1000L;
    long tmRecv = 3*1000L;
    setsockopt(m_hSocket, SOL_SOCKET, SO_SNDTIMEO, (LPSTR)&tmSend, sizeof(long));
    setsockopt(m_hSocket, SOL_SOCKET, SO_RCVTIMEO, (LPSTR)&tmRecv, sizeof(long));
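On Linux the same two options take a struct timeval rather than a millisecond count. The following is a minimal sketch of that variant; the helper name SetSocketTimeouts and its millisecond parameter are my own choices for illustration, not part of any library.

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: sets both the receive and the send timeout of `fd`
// to timeoutMs milliseconds. Returns true if both options were applied.
bool SetSocketTimeouts(int fd, long timeoutMs)
{
    struct timeval tv;
    tv.tv_sec = timeoutMs / 1000;            // whole seconds
    tv.tv_usec = (timeoutMs % 1000) * 1000;  // remainder in microseconds

    if (::setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) != 0)
        return false;
    return ::setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == 0;
}
```

These timeouts only matter for blocking calls: once the receive timeout expires, recv returns -1 with errno set to EAGAIN or EWOULDBLOCK. On a non-blocking socket recv returns immediately anyway, which is why the hand-written timeout is needed there.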


    int httpclientsocket::RecvData(string& outbuf, int& pkglen)
    {
        if (m_fd == -1)
            return -1;
        pkglen = 0;
        char buf[4096];
        time_t tstart = time(NULL);
        while (true)
        {
            int ret = ::recv(m_fd, buf, 4096, 0);
            if (ret == 0)
            {
                Close();
                return 0; // The peer closed the socket
            }
            else if (ret < 0)
            {
                if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
                {
                    // No data yet: give up once the timeout has elapsed
                    if (time(NULL) - tstart > m_timeout)
                    {
                        Close();
                        return 0;
                    }
                    else
                        continue;
                }
                else
                {
                    Close();
                    return ret; // Receive error
                }
            }
            outbuf.append(buf, buf + ret);
            pkglen = GetBufLen(outbuf.data(), outbuf.length());
            if (pkglen <= 0)
            {
                // There is a problem with the received data
                Close();
                return pkglen;
            }
            else if (pkglen <= (int)outbuf.length())
                break; // A complete package has been received
        }
        return pkglen; // Returns the length of the complete package
    }

3. How do upper-layer services parse and use the received data packets?

This topic follows on from the previous one. It also answers a common interview question: how do you deal with packet loss, sticky packets, and incomplete packets? First of all, because TCP is reliable, there is no packet loss and no reordering problem (UDP does have these problems, and you then need your own sequence-number mechanism to handle them; here we only discuss TCP). The usual practice is to first receive a fixed-size packet header, and then receive a packet body of the size specified in the header (“receive” here can mean reading from the socket, or taking bytes out of a buffer of data that has already been received). Here’s an example:

    #pragma pack(push, 1)
    struct msg
    {
        int32_t cmd;          // Protocol number
        int32_t seq;          // Sequence number; the reply packet carries the same value as the request
        int32_t packagesize;  // Package size
        int32_t reserved1;    // Reserved field; echoed unchanged in the reply packet
        int32_t reserved2;    // Reserved field; echoed unchanged in the reply packet
    };

    /**
     * Heartbeat packet protocol
     */
    struct msg_heartbeat_req
    {
        msg header;
    };

    struct msg_heartbeat_resp
    {
        msg header;
    };

    /**
     * Login protocol
     */
    struct msg_login_req
    {
        msg         header;
        char        user[32];
        char        password[32];
        int32_t     clienttype;  // Client type
    };

    struct msg_login_resp
    {
        msg         header;
        int32_t     status;
        char        user[32];
        int32_t     userid;
    };

    #pragma pack(pop)

In this example the header has the fixed size sizeof(msg), and for a login request the body is sizeof(msg_login_req) - sizeof(msg) bytes. Receiving the header first, and then exactly the number of body bytes it specifies, ensures that a packet is complete. If there are not yet enough bytes for the header or the body, the data is incomplete and you wait for more to arrive. Because TCP is a stream protocol, when the peer sends you 10 bytes you might receive 5 bytes and then 5 more; or 2 and then 8; or 1 and then 9; or 1, then 7, then 2. In short, those 10 bytes can arrive in any combination.

Therefore, the usual practice in real projects is as follows. First detect whether there is data on the socket, and if so, receive it (whether to drain the socket completely was covered above). After receiving, check whether the bytes accumulated so far are enough for a packet header; if not, wait for the next round of data. If so, check whether they are also enough for the body size specified in the header; if not, again wait for the next round. If so, take out one whole packet, unpack it, and hand it to the upper-level business logic. Then keep checking whether the remaining bytes hold another complete header and body, and repeat until there is not enough data left for a header or a body. This situation is common, especially when the peer sends several packets in a row.
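The procedure above can be sketched roughly like this. It is only an illustration under several assumptions of mine: the receive buffer is a std::string, packagesize holds the body length in bytes, both ends use the same byte order, and the function name ExtractPackets and the maxBodySize limit are invented for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

#pragma pack(push, 1)
struct msg
{
    int32_t cmd;
    int32_t seq;
    int32_t packagesize;  // assumed here to be the body size in bytes
    int32_t reserved1;
    int32_t reserved2;
};
#pragma pack(pop)

// Hypothetical helper: moves every complete packet (header + body) from
// `buffer` into `packets` and leaves a trailing partial packet in `buffer`.
// Returns false if a header announces an invalid body size.
bool ExtractPackets(std::string& buffer, std::vector<std::string>& packets,
                    int32_t maxBodySize = 10 * 1024 * 1024)
{
    size_t offset = 0;
    // Keep going while at least one full header is available
    while (buffer.size() - offset >= sizeof(msg))
    {
        msg header;
        std::memcpy(&header, buffer.data() + offset, sizeof(header));
        if (header.packagesize < 0 || header.packagesize > maxBodySize)
            return false;  // corrupt or hostile data

        size_t total = sizeof(msg) + (size_t)header.packagesize;
        if (buffer.size() - offset < total)
            break;  // body not complete yet, wait for more data

        packets.push_back(buffer.substr(offset, total));
        offset += total;
    }
    buffer.erase(0, offset);  // drop what has been consumed
    return true;
}
```

The caller appends whatever recv returned to the buffer and then calls ExtractPackets once; the loop inside takes care of several packets arriving back to back, and a false return is a signal to close the connection.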


4. Nagle algorithm

The Nagle algorithm is a packet-sending mechanism in the operating system's network stack. When it is enabled, a small amount of data passed to send or write may not go out immediately; the system waits until enough data has accumulated in the send buffer after several send or write calls, and then sends it together. The operating system uses this algorithm to reduce the number of network transmissions and improve network utilization. For applications that need low latency, the Nagle algorithm can be disabled so that even small send or write payloads are sent immediately. The algorithm is enabled by default. To disable it:

    long noDelay = 1;
    setsockopt(m_hSocket, IPPROTO_TCP, TCP_NODELAY, (LPSTR)&noDelay, sizeof(long));

If noDelay is 1, the Nagle algorithm is disabled; if noDelay is 0, the Nagle algorithm is enabled.


5. The first parameter of the select function

The prototype of the select function is:

    int select(
      _In_    int                  nfds,
      _Inout_ fd_set               *readfds,
      _Inout_ fd_set               *writefds,
      _Inout_ fd_set               *exceptfds,
      _In_    const struct timeval *timeout
    );

Example:

    fd_set writeset;
    FD_ZERO(&writeset);
    FD_SET(m_hSocket, &writeset);
    struct timeval tv;
    tv.tv_sec = 3;
    tv.tv_usec = 100;
    select(m_hSocket + 1, NULL, &writeset, NULL, &tv);

On both Linux and Windows, this function comes from Berkeley sockets. readfds, writefds and exceptfds are all sets of socket descriptors: on Windows an fd_set holds an array of socket handles, while on Linux it is a bitmask indexed by descriptor number. On Linux, the first parameter must be set to the highest-numbered descriptor in any of the sets, plus 1. Windows ignores this parameter and keeps it only for compatibility with Berkeley sockets, so any value can be passed on Windows.
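With more than one descriptor, that "highest descriptor plus 1" rule means tracking the maximum while filling the set. Here is a small Linux-side sketch; WaitReadable is a name I made up for the example, and it watches for readability only.

```cpp
#include <algorithm>
#include <sys/select.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: returns the descriptors from `fds` that become
// readable within timeoutMs milliseconds (empty on timeout or error).
std::vector<int> WaitReadable(const std::vector<int>& fds, int timeoutMs)
{
    fd_set readset;
    FD_ZERO(&readset);
    int maxfd = -1;
    for (int fd : fds)
    {
        if (fd < 0 || fd >= FD_SETSIZE)
            continue;  // cannot be represented in an fd_set
        FD_SET(fd, &readset);
        maxfd = std::max(maxfd, fd);
    }

    struct timeval tv;
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;

    std::vector<int> ready;
    // First parameter: highest descriptor in the set, plus 1
    if (maxfd < 0 || ::select(maxfd + 1, &readset, NULL, NULL, &tv) <= 0)
        return ready;

    for (int fd : fds)
        if (fd >= 0 && fd < FD_SETSIZE && FD_ISSET(fd, &readset))
            ready.push_back(fd);
    return ready;
}
```

Passing a first parameter that is too small would make select silently ignore the higher descriptors, which is an easy mistake to make when the same code was first written and tested on Windows.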


6. The address passed to the bind function

When using bind, we need to bind an address. The following is an example:

    struct sockaddr_in servaddr;
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = inet_addr(ip_.c_str());
    servaddr.sin_port = htons(port_);
    bind(listenfd_, (sockaddr *)&servaddr, sizeof(servaddr));

For the IP address, we typically write 0.0.0.0 (the macro INADDR_ANY) or 127.0.0.1. What’s the difference? With the former, bind binds to every network interface address on the machine (which matters when there are several); with the latter, it binds only to the local loopback address 127.0.0.1. So with the former, clients can connect through any local NIC address, while with the latter they can only connect through 127.0.0.1.

Here’s an example, for a machine whose network interfaces have the addresses 192.168.27.19, 192.168.56.1 and 192.168.247.1. If the server binds to 0.0.0.0, clients can connect through 192.168.27.19, 192.168.56.1, 192.168.247.1, or 127.0.0.1. If it binds to 127.0.0.1, none of those three NIC addresses work; only 127.0.0.1 can be used to connect.

7. SO_REUSEADDR and SO_REUSEPORT

The usage method is as follows:

    int on = 1;
    setsockopt(listenfd_, SOL_SOCKET, SO_REUSEADDR, (char *)&on, sizeof(on));
    setsockopt(listenfd_, SOL_SOCKET, SO_REUSEPORT, (char *)&on, sizeof(on));

These two socket options are commonly used by server programs. They mainly solve the problem that the address and port bound to a socket cannot be reused for a while after the socket is closed. Closing a TCP connection takes a four-way handshake (four "waves"), and so that a lost final ACK can still be retransmitted, the operating system keeps the side that closed first in the TIME_WAIT state for a period that is nominally twice the maximum segment lifetime (MSL) and in practice lasts from tens of seconds to a few minutes. For a server program, however, this means that after a restart the address and port cannot be used immediately, and the bind call fails. Developers would then have to either change the address and port or wait a few minutes, and neither is acceptable. Setting these options avoids the problem.

However, the Windows implementation differs slightly from the Linux one. On Windows, during the waiting period after a socket is closed, its address and port combination cannot be used by other processes, but the same process can continue to reuse it. On Linux, no process can use it during that period, including the original one.

8. Heartbeat packet mechanism

For a TCP connection that is meant to stay open, a firewall may close the connection when there has been no traffic for a long time. If you then try to send data over this connection, it fails, so you need a heartbeat mechanism to keep it alive. Although the TCP stack has its own keepalive mechanism, application-layer heartbeat packets should still be used to keep the connection alive. So how often should a heartbeat packet be sent? In my past projects I heard many different opinions and got burned more than once. Eventually I settled on a more sensible approach to the interval:

Assume that a heartbeat packet is sent to the peer every 30 seconds. For this you start a timer that fires every 30 seconds.

Besides heartbeat packets, normal data (non-heartbeat packets) also flows between the two ends. Record the send and recv times of this data. If a non-heartbeat packet has been sent or received within the last 30 seconds, skip the heartbeat when the timer fires. In other words, a heartbeat packet is sent only after 30 seconds with no data exchanged between the two ends. This reduces the load on the server and also reduces network traffic, which matters especially on mobile devices where traffic is expensive.
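The bookkeeping for this scheme is small. The sketch below is one possible shape for it; the class HeartbeatTracker and its method names are made up for illustration, and the actual timer and the sending of the heartbeat packet are left to the caller.

```cpp
#include <chrono>

// Hypothetical helper (not from any library): decides on each timer tick
// whether a heartbeat is due, based on the time of the last traffic.
class HeartbeatTracker
{
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatTracker(std::chrono::seconds interval,
                              Clock::time_point now = Clock::now())
        : interval_(interval), lastActivity_(now) {}

    // Call whenever any packet (heartbeat or not) is sent or received.
    void OnTraffic(Clock::time_point now = Clock::now()) { lastActivity_ = now; }

    // Call from the periodic timer. Returns true if the connection has been
    // idle for at least one interval, in which case the caller sends a heartbeat.
    bool ShouldSendHeartbeat(Clock::time_point now = Clock::now())
    {
        if (now - lastActivity_ < interval_)
            return false;
        lastActivity_ = now;  // the heartbeat about to be sent counts as traffic
        return true;
    }

private:
    std::chrono::seconds interval_;
    Clock::time_point lastActivity_;
};
```

The time point is passed in as a parameter so the logic can be exercised without waiting in real time; in production code the default argument simply reads the steady clock.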

Of course, heartbeat packets can do more than keep the connection alive; they can also carry data, for example to periodically fetch the latest value of something. In that case the scheme above may not fit, and the heartbeat still has to be sent every 30 seconds. Choose whichever suits your project's requirements.

In addition, heartbeat packets are usually sent by the client to the server. That is, the client checks whether it is still connected to the server, rather than the server actively sending heartbeat packets to the client. In program terms, the side that calls connect sends the heartbeat packet, and the side that calls listen receives it.

Going further, the same idea can be used to keep a database connection alive. For example, if no database operation has been performed within 30 seconds, run a simple SQL statement periodically to keep the connection open, such as: select 1 from user;


9. Reconnection mechanism

Early in my software development career, I would call connect to reach the peer, and if that failed, retry, and if that failed, retry again, and so on, even though the reconnection ran in a dedicated thread (in client software, never do this in the UI thread, or your interface will freeze). But if the peer can never be reached, for example because the network is down, retrying like this is pointless, and it is better not to do it. The most reasonable way to reconnect is a combination of the following two schemes:

1. If the connection fails, retry after n seconds. If it fails again, retry after 2n seconds, then 4n, 8n, 16n, and so on.

However, this scheme has a problem: if the network comes back when the retry interval has already grown very long, it takes a long time to reconnect to the server. In that case, scheme 2 should be used.

2. Try to reconnect when the network status changes. For example, messaging software that went offline because of a network failure should try to reconnect as soon as the network is restored. On Windows, the API for checking whether the network is available is IsNetworkAlive. Example code:

    // Requires <Sensapi.h> and linking with Sensapi.lib
    BOOL IUIsNetworkAlive()
    {
        DWORD dwFlags;        // Type of network connection
        BOOL bAlive = TRUE;   // Whether the network is available
        bAlive = ::IsNetworkAlive(&dwFlags);
        return bAlive;
    }
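For scheme 1, the doubling retry interval can be sketched as a small helper. The class name ReconnectBackoff is invented for the example, and the upper bound on the delay is my own addition (the scheme above does not specify one) so that the interval cannot grow without limit.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper: yields n, 2n, 4n, 8n, ... seconds between retries,
// never exceeding `cap`.
class ReconnectBackoff
{
public:
    ReconnectBackoff(std::chrono::seconds base, std::chrono::seconds cap)
        : base_(base), cap_(cap), next_(base) {}

    // Delay to wait before the next connection attempt.
    std::chrono::seconds NextDelay()
    {
        std::chrono::seconds delay = next_;
        next_ = std::min(next_ * 2, cap_);  // double, but stay under the cap
        return delay;
    }

    // Call after a successful connection, or when the network status
    // changes (scheme 2), so the next attempt happens quickly again.
    void Reset() { next_ = base_; }

private:
    std::chrono::seconds base_;
    std::chrono::seconds cap_;
    std::chrono::seconds next_;
};
```

Combining the two schemes then comes down to calling Reset() from the network-status check and attempting a connection right away.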

10. About the error code EINTR

This error code is specific to Linux. For many Linux network functions (connect, send, recv, epoll_wait, and so on), it is important to check whether the error is EINTR when they fail. EINTR means a signal interrupted the call; it does not mean the operation itself failed. In that case, either call the function again (send, recv, epoll_wait), or check for completion another way (for example, use select to check whether connect succeeded). Do not treat these calls as failures and draw the wrong logical conclusions from them.
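For calls that can simply be repeated, the check usually turns into a small retry loop. A minimal sketch for recv follows; the wrapper name RecvRetryingOnEintr is mine.

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical wrapper: behaves like recv(), but transparently repeats the
// call when it was interrupted by a signal before any data was transferred.
ssize_t RecvRetryingOnEintr(int fd, void* buf, size_t len, int flags)
{
    ssize_t n;
    do
    {
        n = ::recv(fd, buf, len, flags);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

Every other error, including EWOULDBLOCK on a non-blocking socket, is still returned to the caller unchanged. connect is the exception to this pattern: after EINTR the attempt continues in the background, so it should be followed up with select rather than called again.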


11. Minimize system calls

Minimizing system calls is a worthwhile optimization for high-performance server programs, because each system call means a switch from user space to kernel space. For example, in its main loop the libevent network library caches the current time right after fetching it, and uses the cached value whenever the time is needed again later. Some people point out that gettimeofday is not a real system call on x86 machines, so libevent does not need to do this. Whether or not it is necessary there, the idea of reducing system calls is worth borrowing.


12. Ignore the Linux signal SIGPIPE

SIGPIPE is a signal on Linux platforms. When is it generated? After the peer has closed the connection, the first send or write from the local end is answered with an RST, and if the local end then calls send or write again, the local process receives a SIGPIPE signal, whose default action is to terminate the process. Programs in general, and server programs in particular, certainly do not want this default behavior, because we cannot let the process crash and exit just because of something a client did. So this signal should be ignored, as follows:

    signal(SIGPIPE, SIG_IGN);
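To see what changes once the signal is ignored, here is a small Linux-only sketch. It uses an AF_UNIX socketpair as a stand-in for a TCP connection so that it needs no network, and the function name SendToClosedPeer is made up for the example.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical demo: sends on a connection whose peer has already closed
// and returns the errno that send() reports (0 if the send succeeded).
int SendToClosedPeer()
{
    ::signal(SIGPIPE, SIG_IGN);  // without this, the send below would kill the process

    int sv[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    ::close(sv[1]);  // the "peer" goes away

    ssize_t n = ::send(sv[0], "x", 1, 0);
    int err = (n == -1) ? errno : 0;

    ::close(sv[0]);
    return err;
}
```

With SIGPIPE ignored, the failure shows up as an ordinary error return (EPIPE) that the send code can handle like any other error, typically by closing the connection.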

For more details on SIGPIPE, see this article: blog.csdn.net/lmh12506/ar…

That's all for now. Discussion is welcome, and please point out any mistakes in the article.

Update record:

Zhangyl added item 12 on 2017.04.05.