Feng Yawei

Qunar NETOPS

I joined Qunar in July 2014 and have extensive experience in network operations and maintenance. I am currently responsible for operating and maintaining the company's IDC and backbone transmission networks.

1. Causes of unknown unicast flooding

Data is encapsulated layer by layer before it is transmitted on the network. The IP header is added first; its source and destination IP addresses provide host-to-host addressing. The frame header is added next; its source and destination MAC addresses provide addressing within a single network segment (link, broadcast domain). Each time a packet is forwarded by a routing device, the source MAC address changes to that of the sending router, and the destination MAC address changes to that of the receiving router (or host).

Communication between hosts on the same network segment:

When the source host determines that the destination host's IP address is in the same network segment as its own, the two hosts can communicate directly, without a gateway forwarding the packets. In this case, the source host must obtain the MAC address of the destination host.
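The same-segment decision can be sketched in a few lines of Python. This is only an illustration: the function names, the addresses, and the /24 prefix length are assumptions chosen for the example, not values from the production network.

```python
import ipaddress

def same_segment(src_ip: str, dst_ip: str, prefix_len: int) -> bool:
    """Return True if dst_ip falls inside the source host's own subnet."""
    local_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    return ipaddress.ip_address(dst_ip) in local_net

def next_hop(src_ip: str, dst_ip: str, prefix_len: int, gateway_ip: str) -> str:
    """Return the IP address whose MAC must be resolved before sending the frame."""
    if same_segment(src_ip, dst_ip, prefix_len):
        return dst_ip       # same segment: deliver directly at layer 2
    return gateway_ip       # different segment: hand the packet to the gateway

print(next_hop("10.0.1.10", "10.0.1.20", 24, "10.0.1.1"))  # 10.0.1.20
print(next_hop("10.0.1.10", "10.0.2.20", 24, "10.0.1.1"))  # 10.0.1.1
```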

How does the source host obtain the MAC address that corresponds to the destination host's IP address? It checks the ARP table, which records the MAC address for each known IP address. If the ARP table contains a matching entry, the MAC address in that entry is used as the destination MAC address of the data frame.

If the ARP table contains no matching entry, the Address Resolution Protocol (ARP) is used to resolve the address. The host broadcasts an ARP request: “Who is using IP-1, please tell me your MAC address?” The host using IP-1 receives the request and responds with an ARP reply: “I am IP-1, and my MAC address is MAC-1.” After receiving the ARP reply, the source host records the mapping in its ARP table for later use. The entry has an aging time and does not remain valid forever.
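The lookup, resolve, and age-out cycle can be modelled with a toy ARP cache. This is a minimal sketch, not a real ARP implementation: the class and function names are hypothetical, the 1200-second aging time is an assumed value, and the `hosts` dictionary merely stands in for the hosts that would answer a broadcast ARP request.

```python
class ArpCache:
    """Toy ARP table: IP -> (MAC, time learned), with entries that age out."""

    def __init__(self, aging_seconds: float):
        self.aging_seconds = aging_seconds
        self.entries = {}

    def learn(self, ip, mac, now):
        self.entries[ip] = (mac, now)

    def lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.aging_seconds:
            del self.entries[ip]    # entry has aged out
            return None
        return mac

def resolve(cache, ip, now, hosts):
    """Return (mac, arp_request_sent) for the given IP address."""
    mac = cache.lookup(ip, now)
    if mac is not None:
        return mac, False           # cache hit: no ARP traffic needed
    mac = hosts[ip]                 # stands in for ARP request + ARP reply
    cache.learn(ip, mac, now)
    return mac, True

hosts = {"10.0.1.20": "00:00:5e:00:53:01"}   # hypothetical IP-1 / MAC-1
cache = ArpCache(aging_seconds=1200)         # assumed aging time
print(resolve(cache, "10.0.1.20", 0, hosts))     # ('00:00:5e:00:53:01', True)
print(resolve(cache, "10.0.1.20", 60, hosts))    # ('00:00:5e:00:53:01', False)
print(resolve(cache, "10.0.1.20", 1300, hosts))  # ('00:00:5e:00:53:01', True)
```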

The source host then places its own MAC address in the frame header as the source MAC address and the destination host's MAC address as the destination MAC address, and sends the data frame.

Communication between hosts on the same network segment is forwarded by layer 2 switches. A switch learns the source MAC address in each frame header and binds it to the physical interface on which the frame arrived, building a MAC address table. Entries in the MAC address table also have an aging time. When a frame needs to be forwarded to a given MAC address, the switch looks it up in the MAC address table to decide which interface to send the frame out of. If the destination MAC address is not found in the MAC address table, the switch sends the frame out of every interface that carries that network segment (as determined by VLAN ID or trunk configuration), other than the one it arrived on. This is unknown unicast flooding. The mechanism ensures that the destination host still receives the frame. However, a large volume of unknown unicast flooding can exhaust interface bandwidth, causing packet loss and disrupting normal network communication.
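The learn-then-forward-or-flood behaviour can be illustrated with a small simulation. It is a simplified sketch under stated assumptions: a single VLAN, a hypothetical `L2Switch` class, made-up port numbers and MAC labels, and a 300-second aging time.

```python
class L2Switch:
    """Toy layer 2 switch with one VLAN and an aging MAC address table."""

    def __init__(self, ports, mac_aging=300):
        self.ports = set(ports)
        self.mac_aging = mac_aging
        self.mac_table = {}     # mac -> (port, last_seen)

    def receive(self, in_port, src_mac, dst_mac, now):
        """Learn src_mac, then return the sorted list of egress ports."""
        self.mac_table[src_mac] = (in_port, now)
        entry = self.mac_table.get(dst_mac)
        if entry is not None and now - entry[1] < self.mac_aging:
            return [entry[0]]                   # known unicast: one port
        self.mac_table.pop(dst_mac, None)       # drop the aged entry, if any
        return sorted(self.ports - {in_port})   # unknown unicast: flood

sw = L2Switch(ports=[1, 2, 3, 4])
print(sw.receive(1, "MAC-A", "MAC-B", now=0))    # [2, 3, 4]  B unknown, flood
print(sw.receive(2, "MAC-B", "MAC-A", now=1))    # [1]        A was learned
print(sw.receive(1, "MAC-A", "MAC-B", now=2))    # [2]        B was learned
print(sw.receive(1, "MAC-A", "MAC-B", now=400))  # [2, 3, 4]  B aged out, flood
```

The last call shows the case that matters for the rest of this article: a host that keeps receiving traffic but never sends any through the switch is eventually forgotten, and traffic destined for it is flooded.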

Communication between hosts on different network segments:

If the source host determines that the destination IP address is in a different network segment from its own, it relies on the default gateway of its segment for layer 3 forwarding: the source host sends the packet to the gateway, and the gateway forwards it. The source host first builds the IP header with its own IP address as the source and the destination host's IP address as the destination. It then builds the frame header with its own MAC address as the source and the gateway's MAC address as the destination, and sends the frame to the gateway of its own segment. The packet is routed to the gateway of the segment where the destination host resides. If that gateway has no ARP entry for the destination IP address, it sends an ARP request to resolve the corresponding MAC address. During the ARP exchange, the gateway also binds the destination host's MAC address to the corresponding switch interface, creating a MAC address table entry. Once it has the destination host's MAC address, the gateway therefore sends the frame out of that interface only, and no unknown unicast flooding occurs.

However, if the destination gateway has an ARP entry for the destination IP address but no MAC address table entry for the corresponding MAC address, it sends the frame out of all interfaces in that network segment, which results in unknown unicast flooding.

The ARP table and the MAC address table age independently, and the ARP aging time is longer than the MAC address table aging time. An ARP entry can therefore outlive its MAC address table entry, and this mismatch can cause unknown unicast flooding.
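The window in which flooding occurs can be made concrete with a small timeline. This is a sketch under stated assumptions: the function name is hypothetical, the aging times (1200 seconds for ARP, 300 seconds for the MAC address table) are example values, and the destination host is assumed to send no frames that would refresh either entry.

```python
def gateway_action(now, arp_learned_at, mac_learned_at, arp_aging, mac_aging):
    """What the gateway does with a packet for the destination host at time `now`."""
    arp_valid = now - arp_learned_at < arp_aging
    mac_valid = now - mac_learned_at < mac_aging
    if not arp_valid:
        return "send ARP request"       # the ARP exchange repopulates both tables
    if mac_valid:
        return "forward out one port"
    return "flood (unknown unicast)"    # ARP entry alive, MAC entry aged out

# Both entries were learned at t=0 during the same ARP exchange.
for t in (100, 600, 1300):
    print(t, gateway_action(t, 0, 0, arp_aging=1200, mac_aging=300))
# 100 forward out one port
# 600 flood (unknown unicast)
# 1300 send ARP request
```

With these example values, every packet sent between t=300 and t=1200 is flooded.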

2. Root cause analysis of unknown unicast flooding in the production data center

As shown in the figure, the server provides external services through LVS in tunnel mode. Tunnel mode has the following characteristic: when a server responds to a client's request, it sends the response directly to the client rather than back through the LVS server. This reduces the load on the LVS server.

As shown in the figure, the client requests the service on the server. The path of the request packets is: client → SW1 → core switch → SW3 → LVS server → SW3 → core switch → SW2 → server

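On this request path, SW2 cannot learn the client's MAC address or create a MAC address table entry for it, because the frames that reach SW2 have already been routed and no longer carry the client's MAC address as their source.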

The path of the return packets is: server → SW2 → core switch → SW1 → client

When the server sends a packet back to the client for the first time, SW2 learns the client's MAC address and creates a MAC address table entry during the ARP exchange. However, this entry ages out after five minutes.

The server keeps sending packets back to the client. Five minutes later, once the client's MAC address table entry has aged out on SW2, SW2 begins flooding the server-to-client traffic as unknown unicast, and the flooded traffic reaches the core switch.

Normally, the core switch should hold MAC address table entries for the clients, because packets sent to the LVS VIP always pass through the core switch. The core switch would then send the return packets out of the corresponding interface without unknown unicast flooding.

However, to reduce the number of MAC address table entries and save hardware resources, the core switch vendor changed the MAC address learning mechanism: the switch learns the source MAC address only of packets that are forwarded at layer 2 (layer 2 port in, layer 2 port out), and does not learn the source MAC address of packets that are forwarded at layer 3 (layer 2 port in, layer 3 port out). A request from the client to the LVS VIP requires layer 3 forwarding by the core switch, because the client and the LVS VIP are in different network segments. The core switch therefore never learns the client's MAC address.

As a result, unknown unicast flooding also occurs when packets returned from the server to the client reach the core switch.
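The effect of the vendor's learning rule can be sketched as follows. This is a simplified model of the behaviour described above, not the vendor's actual logic: the function names, the `only_l2_forwarded` flag, and the port label `to-SW1` are hypothetical.

```python
def learns_source_mac(egress_kind: str, only_l2_forwarded: bool) -> bool:
    """For a frame that entered on a layer 2 port: is its source MAC learned?"""
    if egress_kind not in ("L2", "L3"):
        raise ValueError("egress_kind must be 'L2' or 'L3'")
    # Modified rule: learn only when the frame also leaves at layer 2.
    return egress_kind == "L2" or not only_l2_forwarded

def core_forwarding(only_l2_forwarded: bool) -> str:
    """Return where the core switch sends the server's reply to the client."""
    mac_table = {}
    # Client request to the LVS VIP: layer 2 port in, routed out (layer 3).
    if learns_source_mac("L3", only_l2_forwarded):
        mac_table["client-mac"] = "to-SW1"
    # The reply to the client is looked up by destination MAC address.
    return mac_table.get("client-mac", "flood")

print(core_forwarding(only_l2_forwarded=True))   # flood
print(core_forwarding(only_l2_forwarded=False))  # to-SW1
```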

Because the core switch and the access switches are connected through layer 2 trunks that carry all network segments, the traffic floods to every access switch. Each access switch in turn floods the traffic to all hosts and servers (and trunk ports) in the same network segment as the client.

When the server-to-client traffic reaches 1 Gbit/s, the flooded volume also reaches 1 Gbit/s. This saturates the NICs of servers with Gigabit Ethernet (GE) interfaces and disrupts their network communication.

3. Improvement plan

1. Modify the MAC address learning mechanism on the core switch: learn the source MAC address of every packet that enters through a layer 2 interface, without checking the outgoing interface. This eliminates the unknown unicast flooding on the core switch.

2. Modify how the server routes its replies to client requests: switch from routing based on the destination IP address to routing based on the source IP address, so that packets whose source IP address is the LVS VIP are sent directly to the core switch. This eliminates the unknown unicast flooding on SW2.
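The second change can be illustrated by comparing the next hop the server picks for a reply before and after source-based routing is enabled. This is a sketch only: the function name and all addresses are hypothetical, and it assumes, as the ARP exchange described in section 2 suggests, that the client and the server share a network segment.

```python
import ipaddress

def reply_next_hop(src_ip, dst_ip, local_net, gateway_ip, vips, source_based):
    """Return the IP address the server resolves to a MAC for this reply."""
    if source_based and src_ip in vips:
        return gateway_ip           # source is an LVS VIP: always use the gateway
    if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(local_net):
        return dst_ip               # destination-based: deliver directly at layer 2
    return gateway_ip

vips = {"192.0.2.10"}               # hypothetical LVS VIP
args = ("192.0.2.10", "10.0.1.50", "10.0.1.0/24", "10.0.1.1", vips)
print(reply_next_hop(*args, source_based=False))  # 10.0.1.50
print(reply_next_hop(*args, source_based=True))   # 10.0.1.1
```

With source-based routing, every reply frame is addressed to the gateway's MAC address rather than to the client's MAC address, so SW2 no longer depends on a client entry that can age out.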