In network communication, we need to know the other side's IP address before we can initiate a connection. IP addresses belong to the network layer. Below the network layer sits the data link layer, where IP packets are encapsulated into Ethernet frames (other link-layer formats exist as well). The data link layer needs its own addressing mechanism, usually a 48-bit hardware address, also known as a MAC address.

Address resolution protocol

Basic Workflow

Whenever we initiate a network connection, the process is roughly as follows:

  1. If you know the target hostname, call the gethostbyname function to convert the hostname into an IP address. In DNS, this function is called the resolver.

  2. The application establishes a connection using the resulting IP over TCP or UDP.

  3. If the target host is on the local network, the datagram can be sent to it directly once its IP address is known. If the target host is on a remote network, the system uses IP routing to find the next-hop router on the local network and asks that router to forward the IP datagram. These operations are the core of the IP protocol and are skipped here.

  4. Knowing the IP address is not enough: before it can send the packet, the host must also know the target's physical address at the link layer. This physical address, also known as the MAC address, is globally unique by design. This is the step where ARP comes in.

  5. ARP sends an ARP request. The request is a broadcast frame, so every machine on the LAN receives it. It contains the IP address of the destination host, and the host that owns that address answers with an ARP reply containing its hardware address.

  6. Once the hardware address of the destination host is obtained, the Ethernet data frame containing the IP datagram can be sent normally.
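Steps 5 and 6 can be pictured with a toy model. The sketch below sends no real packets; the host table and the function name arp_resolve are made up for illustration, and the "broadcast" is simply a loop over every host on a pretend LAN.

```python
# Toy model of ARP resolution on one LAN; no real packets are sent.
BROADCAST = "ff:ff:ff:ff:ff:ff"

# Hypothetical hosts on the local network: IP address -> MAC address.
LAN_HOSTS = {
    "192.168.0.10": "00:11:22:33:44:0a",
    "192.168.0.20": "00:11:22:33:44:14",
}

def arp_resolve(target_ip, hosts=LAN_HOSTS):
    """Broadcast a request and return the MAC of the host that owns target_ip."""
    request = {"dst_mac": BROADCAST, "target_ip": target_ip}
    for ip, mac in hosts.items():
        # Every host sees the broadcast frame, but only the owner replies.
        if ip == request["target_ip"]:
            return mac
    return None  # nobody on the LAN answered

print(arp_resolve("192.168.0.20"))  # 00:11:22:33:44:14
```

If no host owns the requested address, the function returns None, which corresponds to an ARP request that goes unanswered.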

ARP packet format

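On Ethernet, ARP requests and replies use the following packet format.

Ethernet destination address and source address: the first two fields of the Ethernet header. For an ARP request, the destination address is FF:FF:FF:FF:FF:FF, the Ethernet broadcast address.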

Frame type: a 2-byte field that tells us what kind of data the Ethernet frame is carrying. The value is 0x0806 for ARP, 0x0800 for IP datagrams, and 0x8035 for RARP (Reverse Address Resolution Protocol).
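To make the frame type field concrete, here is a minimal sketch that decodes the 14-byte Ethernet header with Python's struct module. The function name parse_ethernet_header and the sample source MAC are illustrative assumptions; only the three type values come from the text above.

```python
import struct

# Frame type values mentioned in the text.
ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}

def parse_ethernet_header(frame: bytes):
    """Return (destination MAC, source MAC, payload kind) for a raw frame."""
    # 6-byte destination, 6-byte source, 2-byte frame type, network byte order.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    kind = ETHERTYPES.get(ethertype, hex(ethertype))
    return dst.hex(":"), src.hex(":"), kind

# A made-up ARP request header: broadcast destination, example source MAC.
header = bytes.fromhex("ffffffffffff" "001122334455" "0806")
print(parse_ethernet_header(header))
```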

Hardware type and protocol type: these two fields describe the ARP packet. The hardware type indicates the kind of physical address being mapped, and the protocol type indicates the kind of protocol address being mapped. Here the protocol is IP and the hardware address we need is an Ethernet address, so the hardware type is 1 (Ethernet) and the protocol type is 0x0800. Note that this is the same value as the frame type for IP datagrams mentioned above, which is by design.

Hardware address length and protocol address length: the values here are 6 bytes and 4 bytes respectively, indicating a 48-bit Ethernet address and a 32-bit IP address.

Operation code: 1 = ARP request, 2 = ARP reply, 3 = RARP request, 4 = RARP reply. Because the frame type field is the same for an ARP request and an ARP reply, the opcode is required to tell the request from the reply.
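The five fields just described form the fixed 8-byte start of every ARP packet. As a rough sketch (ArpOp and parse_arp_fixed_header are names invented here, not part of any library), they can be decoded like this:

```python
import struct
from enum import IntEnum

class ArpOp(IntEnum):
    """Operation codes listed in the text."""
    ARP_REQUEST = 1
    ARP_REPLY = 2
    RARP_REQUEST = 3
    RARP_REPLY = 4

def parse_arp_fixed_header(packet: bytes):
    """Decode hardware type, protocol type, both address lengths and the opcode."""
    htype, ptype, hlen, plen, op = struct.unpack("!HHBBH", packet[:8])
    return htype, ptype, hlen, plen, ArpOp(op)

# Ethernet (1), IP (0x0800), 6-byte MAC, 4-byte IP, ARP request (1).
sample = bytes.fromhex("0001" "0800" "06" "04" "0001")
htype, ptype, hlen, plen, op = parse_arp_fixed_header(sample)
print(htype, hex(ptype), hlen, plen, op.name)
```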

The last four fields:

  • Source hardware address
  • Source protocol address
  • Destination hardware address
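  • Destination protocol address

Notice that some information is duplicated: the hardware addresses in the ARP packet repeat what is already in the Ethernet header (for example, the sender's hardware address is also the Ethernet source address). When we send an ARP request, every field is filled in except the target hardware address, because that is the value we are asking for. When the target machine receives the ARP request, it fills in its own hardware address, swaps the two sender addresses with the two target addresses, sets the opcode to 2, and sends the reply. Both sides then know each other's hardware addresses and real communication can begin.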
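The request and reply handling described above can be sketched by packing the 28-byte ARP packet for Ethernet and IPv4. This only builds the bytes and does not send anything on a network; the function names and the example addresses are assumptions made for illustration.

```python
import socket
import struct

# htype, ptype, hlen, plen, opcode, sender MAC/IP, target MAC/IP = 28 bytes.
ARP_FORMAT = "!HHBBH6s4s6s4s"

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    return struct.pack(
        ARP_FORMAT,
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IP
        6, 4,         # hardware and protocol address lengths
        1,            # opcode: ARP request
        sender_mac, socket.inet_aton(sender_ip),
        b"\x00" * 6,  # target hardware address: unknown, left empty
        socket.inet_aton(target_ip),
    )

def build_arp_reply(request: bytes, my_mac: bytes) -> bytes:
    htype, ptype, hlen, plen, _op, s_mac, s_ip, _t_mac, t_ip = struct.unpack(
        ARP_FORMAT, request
    )
    # Swap sender and target, fill in our own MAC, and set the opcode to 2.
    return struct.pack(ARP_FORMAT, htype, ptype, hlen, plen, 2,
                       my_mac, t_ip, s_mac, s_ip)

asker_mac = bytes.fromhex("001122334455")
owner_mac = bytes.fromhex("0011223344aa")
request = build_arp_request(asker_mac, "192.168.0.10", "192.168.0.20")
reply = build_arp_reply(request, owner_mac)
print(len(request), reply.hex())
```

In the reply, the sender fields carry the answer (the owner's MAC and IP) and the target fields carry the original asker's addresses.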

ARP cache

Having seen how ARP works, we may wonder whether it is too slow to send an ARP request for the hardware address every time before sending data. In practice ARP is very efficient, because each host keeps an ARP cache. We can list the contents of the local ARP cache by typing arp -a on the command line.
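The effect of the cache can be shown with a small sketch. ArpCache and the fake resolver below are hypothetical, and real implementations also expire entries after a timeout, which this version leaves out for brevity.

```python
class ArpCache:
    """Minimal IP -> MAC cache that only broadcasts on a miss."""

    def __init__(self, resolver):
        self._resolver = resolver  # called only when the cache has no entry
        self._entries = {}         # IP address -> MAC address
        self.requests_sent = 0     # how many ARP requests were needed

    def lookup(self, ip):
        if ip not in self._entries:
            self.requests_sent += 1
            mac = self._resolver(ip)
            if mac is None:
                return None        # no reply, nothing to cache
            self._entries[ip] = mac
        return self._entries[ip]

def fake_resolver(ip):
    """Stand-in for a real ARP broadcast; the table is made up."""
    return {"192.168.0.20": "00:11:22:33:44:14"}.get(ip)

cache = ArpCache(fake_resolver)
for _ in range(3):
    cache.lookup("192.168.0.20")
print(cache.requests_sent)  # 1
```

Three lookups for the same address cost only one request, which is why repeated sends to the same host do not trigger repeated broadcasts.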

All of this happens on a single local network. What if an ARP request is sent from a host on one network to a host on another? The next section introduces the ARP proxy.

ARP proxy

If an ARP request is sent from a host on one network to a host on another network, the router connecting the two networks can answer the request on behalf of the target. This is called proxy ARP (or ARP proxy). Recall how IP routing works: if the destination is not directly connected, the host sends the datagram to its default router, and the router forwards it. With proxy ARP, the request for the remote host's physical address is answered by the router, and the address in the reply is the router's own physical address. The sender then sends its data to that physical address, so the frames reach the router, which forwards them. Everything after that is done by routers and belongs to IP. Along the way, ARP is still used to obtain the physical address for each hop.
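A toy model of the router's behaviour, with made-up addresses and names, might look like the following. It only illustrates who answers which request and is not how any particular router implements proxy ARP.

```python
# Hypothetical addresses for illustration only.
ROUTER_MAC = "00:aa:bb:cc:dd:01"
LOCAL_HOSTS = {"192.168.0.10": "00:11:22:33:44:0a"}  # hosts on this LAN
PROXIED_IPS = {"192.168.1.50"}  # hosts on the other network behind the router

def answer_arp(target_ip):
    """Return the MAC address that appears in the ARP reply, if any."""
    if target_ip in LOCAL_HOSTS:
        return LOCAL_HOSTS[target_ip]  # the real owner answers
    if target_ip in PROXIED_IPS:
        return ROUTER_MAC              # the router answers on the host's behalf
    return None                        # nobody answers

print(answer_arp("192.168.1.50"))  # 00:aa:bb:cc:dd:01
```

The sender cannot tell the difference: it caches the router's MAC for 192.168.1.50 and sends its frames there.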

Conclusion

The destination IP address and the host's own IP address are on the same network segment

  • The ARP cache has the MAC address of the destination IP address: the datagram is sent directly to that physical address.
  • The ARP cache does not have the MAC address of the destination IP address: the host broadcasts an ARP request for the MAC address of the destination IP address, caches the result, and then sends the datagram to that MAC address.

The destination IP address and the host's own IP address are on different network segments

In this case, the packet must be sent to the default gateway, so the host needs the MAC address of the gateway.

  • The ARP cache has the MAC address of the default gateway: the IP datagram is sent directly to the default gateway, which then forwards it toward the Internet.
  • The ARP cache does not have the MAC address of the default gateway: the host broadcasts an ARP request for the MAC address of the default gateway, caches the result, and then sends the datagram to the gateway.
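The two cases above come down to one question: which IP address do we need a MAC address for? A minimal sketch using Python's ipaddress module follows; the function name next_hop_ip and the example addresses are assumptions, and a real host consults its full routing table rather than a single gateway.

```python
import ipaddress

def next_hop_ip(own_interface: str, dst_ip: str, gateway_ip: str) -> str:
    """Return the IP address whose MAC address must be looked up via ARP."""
    network = ipaddress.ip_interface(own_interface).network
    if ipaddress.ip_address(dst_ip) in network:
        return dst_ip      # same segment: resolve the destination itself
    return gateway_ip      # different segment: resolve the default gateway

print(next_hop_ip("192.168.0.10/24", "192.168.0.20", "192.168.0.2"))  # 192.168.0.20
print(next_hop_ip("192.168.0.10/24", "8.8.8.8", "192.168.0.2"))       # 192.168.0.2
```

Whichever address comes back is then looked up in the ARP cache, with a broadcast request only on a miss.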

Another topic: ARP spoofing

ARP spoofing, also known as ARP poisoning, is an attack that targets ARP. This section gives a brief introduction.

Operating mechanism

In ARP spoofing, an attacker sends a large number of forged ARP packets onto the network, especially to the gateway. Suppose your gateway's IP address is 192.168.0.2 and its MAC address is 00-11-22-33-44-55, so all the data you send off the network goes to this MAC address. Now I send many ARP packets that I have constructed myself: the IP address in them is the gateway's IP, but the MAC address is replaced with my own. When you update your ARP cache, you record my machine's MAC address as the MAC address of 192.168.0.2, and your traffic comes to me. I can then modify your data and pass it on to the gateway, or do nothing at all, in which case you lose access to the Internet.

Static ARP entries are the most reliable defense against ARP spoofing, but they are impractical on large networks because ARP mappings change often. Another method is DHCP snooping: network devices record the MAC addresses of the computers on the network as seen through DHCP, so that forged ARP packets can be detected when they are sent. Some brands of network equipment already support this approach.
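The detection idea can be sketched as a lookup against a table of trusted IP-to-MAC bindings. The table contents, the attacker's MAC address and the function name is_forged are hypothetical; how a real device builds and enforces such a table depends on the vendor.

```python
# Trusted bindings, e.g. learned through DHCP; the values are examples from the text.
TRUSTED_BINDINGS = {"192.168.0.2": "00-11-22-33-44-55"}

def is_forged(sender_ip, sender_mac, bindings=TRUSTED_BINDINGS):
    """Flag an ARP packet whose sender MAC contradicts the trusted binding."""
    expected = bindings.get(sender_ip)
    # Unknown IP addresses are not flagged in this simplified sketch.
    return expected is not None and expected.lower() != sender_mac.lower()

print(is_forged("192.168.0.2", "00-11-22-33-44-55"))  # False: genuine gateway
print(is_forged("192.168.0.2", "66-77-88-99-AA-BB"))  # True: MAC was replaced
```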
