In the first article in this series, I looked at how k8s uses virtual network devices and routing rules to allow a pod running on one cluster node to communicate with a pod running on another node, as long as the sender knows the receiver's pod network IP address.

If you're not familiar with how pods communicate, it's worth checking out the previous article before moving on.

The pod network in the cluster is well designed and clever, but it is not enough by itself to implement the overall system, because pods in k8s are ephemeral. You can use a pod's IP address as an endpoint, but there is no guarantee that the address won't change the next time the pod is recreated, which can happen for a variety of reasons.

You’re probably already aware that this is an old problem, and it has a standard solution: reverse proxy/load balancing.

The client connects to the proxy, which maintains a list of healthy servers to forward requests to. This implies a few requirements for the proxy itself: it must itself be durable and resistant to failure; it must have a list of servers it can forward to; and it must have some way of knowing whether a particular server is healthy and able to respond to requests.

The k8s designers solved this problem in an elegant way, building on the basic capabilities of the platform to meet all three of these needs. It starts with a resource type called a Service.

Services

In the first article, I showed a hypothetical cluster with two pods and described how they could communicate across nodes. Here, I want to build on that example and describe how a k8s Service implements load balancing across a set of server pods, so that client pods can operate independently and durably. To create the server pods, we can use the following deployment:

kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: service-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: service_test_pod
  template:
    metadata:
      labels:
        app: service_test_pod
    spec:
      containers:
      - name: simple-http
        image: python:2.7
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        args: ["-c"."echo \"<p>Hello from $(hostname)</p>\" > index.html; python -m SimpleHTTPServer 8080"]
        ports:
        - name: http
          containerPort: 8080

This deployment creates two very simple HTTP server pods that respond on port 8080 with the hostname of the pod they are running on. After creating the deployment with kubectl apply, we can see the pods running in the cluster, and we can also query their pod network addresses:

$ kubectl apply -f test-deployment.yaml
deployment "service-test" created

$ kubectl get pods
NAME                            READY     STATUS    RESTARTS   AGE
service-test-6ffd9ddbbf-kf4j2   1/1       Running   0          15s
service-test-6ffd9ddbbf-qs2j6   1/1       Running   0          15s

$ kubectl get pods --selector=app=service_test_pod -o jsonpath='{.items[*].status.podIP}'
10.0.1.2 10.0.2.2
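The IPs above don't tell you which node each pod landed on; adding -o wide to kubectl get pods shows that as well. The node names in this sketch are placeholders for whatever your provider assigns:

$ kubectl get pods --selector=app=service_test_pod -o wide
NAME                            READY     STATUS    RESTARTS   AGE       IP         NODE
service-test-6ffd9ddbbf-kf4j2   1/1       Running   0          15s       10.0.1.2   node-1
service-test-6ffd9ddbbf-qs2j6   1/1       Running   0          15s       10.0.2.2   node-2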

We can prove that the pod network is working by creating a simple client pod to make a request, and then viewing the output.

apiVersion: v1
kind: Pod
metadata:
  name: service-test-client1
spec:
  restartPolicy: Never
  containers:
  - name: test-client1
    image: alpine
    command: ["/bin/sh"]
    args: ["-c"."Echo" GET/HTTP / 1.1 \ r \ n \ r \ n '| nc 10.0.2.2 8080"]

Once created, the pod will run to completion, enter the "Completed" state, and its output can then be retrieved with kubectl logs.

$ kubectl logs service-test-client1
HTTP/1.0 200 OK
<!-- blah -->
<p>Hello from service-test-6ffd9ddbbf-kf4j2</p>

In this example, it is not shown which node the client pod was created on, but no matter where it runs in the cluster, it is able to reach the server pod and get a response. However, if the server pod dies and is restarted, or is rescheduled onto a different node, its IP will almost certainly change, and the client's connection will break.
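It's easy to demonstrate the problem: delete one of the server pods and let the deployment replace it. The replacement will usually come up with a different pod IP (the addresses below are illustrative; yours will differ):

$ kubectl delete pod service-test-6ffd9ddbbf-kf4j2
pod "service-test-6ffd9ddbbf-kf4j2" deleted

$ kubectl get pods --selector=app=service_test_pod -o jsonpath='{.items[*].status.podIP}'
10.0.2.2 10.0.1.3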

We circumvent this by creating a service.

kind: Service
apiVersion: v1
metadata:
  name: service-test
spec:
  selector:
    app: service_test_pod
  ports:
  - port: 80
    targetPort: http

A Service is a k8s resource that causes a proxy to be configured to forward requests to a set of pods. The set of pods that receives the traffic is determined by the selector, which matches labels assigned to the pods when they were created. Once the service is created, we can see that it has been assigned an IP address and will accept requests on port 80.

$ kubectl get service service-test
NAME           CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service-test   10.3.241.152   <none>        80/TCP    11s
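The service keeps track of the pods matched by its selector through an Endpoints object of the same name; listing it is a quick way to confirm which pod IPs will actually receive traffic (the output is a sketch based on the addresses above):

$ kubectl get endpoints service-test
NAME           ENDPOINTS                     AGE
service-test   10.0.1.2:8080,10.0.2.2:8080   11s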

Requests can be sent directly to the service's IP, but it is better to use a hostname that resolves to that IP address. Fortunately, k8s provides an internal cluster DNS that resolves service names, and we can take advantage of it with a small change to the client pod:

apiVersion: v1
kind: Pod
metadata:
  name: service-test-client2
spec:
  restartPolicy: Never
  containers:
  - name: test-client2
    image: alpine
    command: ["/bin/sh"]
    args: ["-c"."Echo" GET/HTTP / 1.1 \ r \ n \ r \ n '| nc service - test 80"]

After this pod runs to completion, the output shows that the service forwarded the request to one of the server pods.

$ kubectl logs service-test-client2
HTTP/1.0 200 OK
<!-- blah -->
<p>Hello from service-test-6ffd9ddbbf-kf4j2</p>

You can keep running the client pod, and you'll see responses from both server pods, each getting roughly 50% of the requests. If you want to understand how this actually works, the IP address our service was assigned is a good place to start.
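A quick, if crude, way to watch the distribution is to fire the same request in a loop from a throwaway pod. This sketch reuses the alpine/nc pattern from the client pods above; the pod name is arbitrary and the exact interleaving of responses will vary:

$ kubectl run service-test-client3 --rm -it --image=alpine --restart=Never -- \
    /bin/sh -c "for i in 1 2 3 4 5 6; do echo 'GET / HTTP/1.1\r\n\r\n' | nc service-test 80 | grep Hello; done"
<p>Hello from service-test-6ffd9ddbbf-kf4j2</p>
<p>Hello from service-test-6ffd9ddbbf-qs2j6</p>
<p>Hello from service-test-6ffd9ddbbf-qs2j6</p>
<p>Hello from service-test-6ffd9ddbbf-kf4j2</p>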

The service network

The IP assigned to the test service represents an address on a network. But as you may have noticed, this network is not the same one the pods are on.

thing        IP              network
-----        --              -------
pod1         10.0.1.2        10.0.0.0/14
pod2         10.0.2.2        10.0.0.0/14
service      10.3.241.152    10.3.240.0/20

It is also different from the private network the nodes are on, as will become clearer below. In the first article, I noted that the pod network address range is not exposed through kubectl, so you have to use a provider-specific command to query this cluster property. The same is true of the service network. If you're running on Google Container Engine, you can do this:

$ gcloud container clusters describe test | grep servicesIpv4Cidr
servicesIpv4Cidr: 10.3.240.0/20

The network defined by this address space is called the "service network." Every service of type "ClusterIP" is assigned an IP address on this network. There are other types of services, and I'll discuss a few of them in the next article on ingress, but ClusterIP is the default, and it means "the service will be assigned an IP address reachable from any pod in the cluster." You can see a service's type by running kubectl describe services with the service name.

$ kubectl describe services service-test
Name:                   service-test
Namespace:              default
Labels:                 <none>
Selector:               app=service_test_pod
Type:                   ClusterIP
IP:                     10.3.241.152
Port:                   http    80/TCP
Endpoints:              10.0.1.2:8080,10.0.2.2:8080
Session Affinity:       None
Events:                 <none>
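The type is just a field on the Service spec, and since ClusterIP is the default, the manifest we applied earlier is equivalent to spelling it out explicitly, as in this sketch:

kind: Service
apiVersion: v1
metadata:
  name: service-test
spec:
  type: ClusterIP
  selector:
    app: service_test_pod
  ports:
  - port: 80
    targetPort: http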

Like the pod network, the service network is virtual, but it differs from the pod network in some interesting ways. Consider the pod network address range 10.0.0.0/14. If you go looking on the hosts that make up the nodes of your cluster, listing bridges and interfaces, you'll see actual devices configured with addresses on this network: the virtual ethernet interfaces for each pod and the bridges that connect them to each other and to the outside world.

Now look at the service network 10.3.240.0/20. You can try ifconfig all you like: you won't find any devices configured with addresses on this network. You can check the routing rules on the gateway that connects all the nodes, and you won't find any routes for this network either. The service network does not exist, at least not as connected interfaces.
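If you have shell access to one of the nodes you can check this for yourself. Both commands below should come back empty; they are a sketch assuming the example address ranges used here:

# no interface carries a service network address
$ ip addr | grep "10.3.24"

# and no route points at the service network
$ ip route | grep "10.3.240"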

And yet, as we saw above, when we made a request to an IP on this network, the request somehow ended up at one of our server pods running on the pod network. How did that happen? Let's follow a packet.

Imagine that the commands we ran above created the following pods in a test cluster:

There are two nodes connected to a gateway (which also has the routing rules for the pod network), and three pods: a client pod and a server pod on node 1, and a second server pod on node 2.

The client makes an HTTP request to the service using the DNS name service-test. The cluster DNS system resolves that name to the service's cluster IP 10.3.241.152, and the client pod ends up creating an HTTP request with that IP address as the destination.
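You can verify the name-resolution half of this from any pod, assuming the standard cluster DNS add-on is running. The output below is a sketch; the exact format and the DNS server address depend on your cluster:

$ kubectl run dns-test --rm -it --image=alpine --restart=Never -- nslookup service-test
Name:      service-test
Address 1: 10.3.241.152 service-test.default.svc.cluster.local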

IP networks are generally configured with routes such that, when an interface cannot deliver a packet to its destination because no local device has that address, it forwards the packet on to its upstream gateway.

Let’s walk through the process:

  1. The first interface to see the packet in this example is the virtual ethernet interface inside the client pod. That interface is on the pod network 10.0.0.0/14, and it doesn't know of any device with the address 10.3.241.152;
  2. so it forwards the packet to its gateway, which is the bridge cbr0;
  3. bridges are pretty simple devices that just pass traffic back and forth, so the bridge sends the packet on to the host/node's ethernet interface.

The host/node interface in this example is on the network 10.100.0.0/24, and it doesn't know of any device at 10.3.241.152 either, so normally the packet would be forwarded out to this interface's gateway, the top-level router for the cluster. Instead, what actually happens is that the packet is intercepted in flight and redirected to one of the healthy server pods.
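For reference, the route table on node 1 in a cluster laid out like this would look roughly as follows; the device names, node address, and gateway address are assumptions, and the point is simply that nothing matches 10.3.241.152 more specifically than the default route:

$ ip route
default via 10.100.0.1 dev eth0
10.0.1.0/24 dev cbr0  proto kernel  scope link  src 10.0.1.1
10.100.0.0/24 dev eth0  proto kernel  scope link  src 10.100.0.2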

When I was first learning k8s, what happens in that last step seemed pretty amazing. Somehow my client was able to connect to an address with no interface behind it, and the packets popped out at the right place in the cluster. I later learned that the answer lies in a component called kube-proxy.


This article is already getting long, so I'll leave kube-proxy for the next one; it's too much to cover in a single post.
