Author: Rong Bin, backend architect at CooHua Online, focusing on microservice governance, containerization, Service Mesh, and related technical fields

Open Source Project Recommendation

Pepper Metrics is an open source tool developed by me and my colleagues (github.com/zrbcool/pep…). It collects performance statistics from components such as jedis, mybatis, httpservlet, dubbo, and motan, exposes the data in a format compatible with Prometheus and other mainstream time-series databases, and visualizes the trends through Grafana. Its plug-in architecture also makes it easy for users to extend it and integrate other open source components. Please give us a star, and you are welcome to join as a developer, submit PRs, and improve the project together.

Background

The status quo

The previous article covered the research and performance testing of the basic network components before we adopted containers. If you are interested, see: Network performance test of Ali Cloud's open source K8S CNI plug-in Terway.

At present, the company's backend is basically a microservice architecture. As shown in the diagram below, all inbound traffic enters the backend services through the API gateway. The gateway acts as a "patron saint": it monitors every incoming request and provides important functions such as anti-abuse protection, per-interface performance monitoring, and alerting. Once traffic has passed through the gateway, calls between services are made via RPC.

The transition period

Although the move to microservices has brought us many benefits, it also inevitably introduced some problems:

  • Splitting services leads to more machines
  • To reduce resource waste, multiple services are deployed on the same host, which creates port management overhead
  • Inconsistent delivery environments, which increases troubleshooting costs

To solve the above problems, after some investigation we decided to adopt the container service Kubernetes. This article focuses on the nginx-ingress-controller (IC for short), so the following architecture diagrams mainly highlight the API gateway and the IC. The figure below shows our transition solution: by introducing an intranet SLB, we solved the service discovery problem when the IC acts as an upstream of our API gateway. In addition, interfaces can be switched over to the SLB gradually, achieving a progressive migration whose granularity can reach the level of interface + percentage. The transitional architecture is as follows:

Final state

After all migration is completed, all machines are recycled, as shown below:

Group IC by line of service

Practice

How to enable multiple ICs

If you look at the IC startup command parameters, you can find:

containers:
  - args:
    - /nginx-ingress-controller
    - --ingress-class=xwz # look here
    - --configmap=$(POD_NAMESPACE)/xwz-nginx-configuration
    - --tcp-services-configmap=$(POD_NAMESPACE)/xwz-tcp-services
    - --udp-services-configmap=$(POD_NAMESPACE)/xwz-udp-services
    - --annotations-prefix=nginx.ingress.kubernetes.io
    - --publish-service=$(POD_NAMESPACE)/xwz-nginx-ingress-lb
    - --v=2

as well as the corresponding Ingress resource:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "xwz" # look here
  labels:
    app: tap-revision-ing
    wayne-app: tap-revision
    wayne-ns: coohua
  name: tap-revision-ing
  namespace: coohua
spec:
  rules:
  - host: staging.coohua.com
    http:
      paths:
      - backend:
          serviceName: tap-revision-stable
          servicePort: 80
        path: /

Next, let's look at the resource structure of an IC deployed in Kubernetes, as shown in the figure:

  • ServiceAccount, ClusterRole, ClusterRoleBinding: the RBAC permission definitions
  • Deployment: controls the deployment of the controller; it depends on the ServiceAccount, ConfigMaps, and Service
  • ConfigMap: three ConfigMaps that store the controller's custom configuration
  • Service: a Service of type LoadBalancer, mainly used so that the Ali Cloud base service automatically binds it to an SLB instance

Our scenario does not require special permissions, so we simply copied the resources in the red box, modified part of the configuration (for example --ingress-class=xwz), reused the ServiceAccount of the default IC directly, and thus completed the deployment of a new set of ICs that is isolated from the cluster's built-in IC. I have put my configuration on my GitHub for reference: Ingress Resources.
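For reference, here is a rough sketch of that copy-and-rename workflow. The default resource names below (nginx-ingress-controller, nginx-configuration, nginx-ingress-lb) are the usual names in an Ali Cloud cluster but may differ in yours; adjust accordingly.

# 1. export the default IC resources as a starting point
kubectl -n kube-system get deployment nginx-ingress-controller -o yaml > xwz-ic-deployment.yaml
kubectl -n kube-system get configmap nginx-configuration -o yaml > xwz-nginx-configuration.yaml
kubectl -n kube-system get service nginx-ingress-lb -o yaml > xwz-nginx-ingress-lb.yaml

# 2. edit the copies: strip status/uid/resourceVersion fields, rename every object
#    with the xwz- prefix, change --ingress-class to "xwz", point --configmap /
#    --tcp-services-configmap / --udp-services-configmap / --publish-service at the
#    renamed objects, and keep serviceAccountName pointing at the existing IC ServiceAccount

# 3. apply the new set
kubectl -n kube-system apply -f xwz-nginx-configuration.yaml
kubectl -n kube-system apply -f xwz-nginx-ingress-lb.yaml
kubectl -n kube-system apply -f xwz-ic-deployment.yaml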

If the following error appears after the new IC is created:

E0416 11:xx:xx.831279       6 leaderelection.go:304] Failed to update lock: configmaps "ingress-controller-leader-xwz" is forbidden: User "system:serviceaccount:kube-system:nginx-ingress-controller" cannot update resource "configmaps" in API group "" in the namespace "kube-system"

As described in the related issue, the ClusterRole nginx-ingress-controller needs to be modified to add the following content:

...
- apiGroups:
  - ""
  resourceNames:
  - ingress-controller-leader-nginx
  - ingress-controller-leader-xwz   # add the newly created configmap, otherwise the error above will be reported
  resources:
  - configmaps
...
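One simple way to apply this change, assuming the ClusterRole is named nginx-ingress-controller as in the error message above:

kubectl edit clusterrole nginx-ingress-controller
# in the rule covering configmaps, append ingress-controller-leader-xwz to resourceNames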

How can a scaled-out IC instance be automatically added to the SLB backend server list

Mode 1: externalTrafficPolicy = Cluster

When a Service's spec.externalTrafficPolicy is set to Cluster, every host in the cluster acts as a layer-3 router that load-balances and forwards requests, but it performs SNAT on the request packets, as shown in the figure:

Mode 2: externalTrafficPolicy = Local

When a Service's spec.externalTrafficPolicy is set to Local, a node only forwards requests to IC pods running on that node. Because no SNAT is performed, the IC can see the real client IP, but if a node has no IC pod the request fails. This means we would have to maintain the relationship between IC pods, nodes, and SLB backend servers by hand. So is there a way to manage this relationship automatically? Ali Cloud Container Service has already done it for us: as long as the following annotations are added to the Service of type LoadBalancer, the port and IP of every worker node that runs an IC pod are automatically registered as SLB backend servers, and the list is updated automatically on scale-out or scale-in. Note that we use an intranet SLB, hence the intranet address type; adjust it to your actual situation:

metadata:
  annotations:
    service.beta.kubernetes.io/alicloud-loadbalancer-address-type: intranet
    service.beta.kubernetes.io/alicloud-loadbalancer-force-override-listeners: "true"
    service.beta.kubernetes.io/alicloud-loadbalancer-id: lb-2zec8xxxxxxxxxx965n
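Putting it together, a minimal sketch of what the LoadBalancer Service for the new IC might look like follows; the selector label and the masked SLB id are illustrative, and the name matches the --publish-service argument shown earlier:

apiVersion: v1
kind: Service
metadata:
  name: xwz-nginx-ingress-lb
  namespace: kube-system
  annotations:
    service.beta.kubernetes.io/alicloud-loadbalancer-address-type: intranet
    service.beta.kubernetes.io/alicloud-loadbalancer-force-override-listeners: "true"
    service.beta.kubernetes.io/alicloud-loadbalancer-id: lb-2zec8xxxxxxxxxx965n
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # keep the real client IP, no SNAT
  selector:
    app: ingress-nginx-xwz       # illustrative label; must match the new IC pods
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443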
Mode 1 (Cluster) vs. Mode 2 (Local)

  • Cluster. Advantages: simple, the Kubernetes default. Disadvantages: SNAT masquerades the source address, an extra network hop is added, performance degrades, and the real client IP is lost.
  • Local. Advantages: no extra forwarding hop, better performance, and the real client IP is preserved. Disadvantages: you must make sure the node port is reachable, and service discovery has to be maintained yourself (Ali Cloud has already integrated this with SLB).

Pitfalls

Nginx worker process count problem

The default worker_processes setting for nginx is auto, which computes the number of worker processes from the CPU information of the current host. However, nginx is not a cgroups-aware application, so it may assume there are "many" CPUs available. We therefore need to set the value explicitly through the ConfigMap:

apiVersion: v1
data:
  worker-processes: "8"
kind: ConfigMap
metadata:
  annotations:
  labels:
    app: ingress-nginx
  name: xwz-nginx-configuration
  namespace: kube-system

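To confirm the setting took effect, you can inspect a running controller pod; a quick check looks like this (the pod name is illustrative):

# count the nginx worker processes inside the controller pod
kubectl -n kube-system exec <xwz-ic-pod-name> -- ps -ef | grep 'nginx: worker'
# or check the rendered configuration directly
kubectl -n kube-system exec <xwz-ic-pod-name> -- grep worker_processes /etc/nginx/nginx.conf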

Kernel Parameter Settings

For now we use the parameters given in the default deployment; we will adjust them in later tuning as needed:

initContainers:
- command:
  - /bin/sh
  - -c
  - |
    sysctl -w net.core.somaxconn=65535
    sysctl -w net.ipv4.ip_local_port_range="1024 65535"
    sysctl -w fs.file-max=1048576
    sysctl -w fs.inotify.max_user_instances=16384
    sysctl -w fs.inotify.max_user_watches=524288
    sysctl -w fs.inotify.max_queued_events=16384

The real IP address of the client is incorrect

During the small-traffic gray release, business colleagues reported that a third party's anti-cheat system flagged the requests we sent to their interface as abnormal. After a round of analysis, we found that the client IP carried in those requests was actually the host IP of our API gateway. Remember the earlier architecture diagram? To fix this, we mount a customized nginx template (nginx.tmpl) into the controller through a ConfigMap, as shown below:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  .
spec:
  template:
    spec:
      containers:
        name: nginx-ingress-controller
        .
        volumeMounts:
        - mountPath: /etc/nginx/template
          name: nginx-template-volume
          readOnly: true
      .
      volumes:
      - name: nginx-template-volume
        configMap:
          name: xwz-nginx-template
          items:
          - key: nginx.tmpl
            path: nginx.tmpl
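For the volumeMounts above to take effect, the customized template has to exist as a ConfigMap with an nginx.tmpl key. The exact template modification (how the real client address is propagated) is not shown here, but a minimal sketch of publishing an edited template looks like this (pod name is illustrative):

# copy the stock template out of a running controller pod and edit it locally
kubectl -n kube-system cp <ic-pod-name>:/etc/nginx/template/nginx.tmpl ./nginx.tmpl
# ... edit nginx.tmpl ...
# publish it as the ConfigMap referenced by the Deployment above
kubectl -n kube-system create configmap xwz-nginx-template --from-file=nginx.tmpl=./nginx.tmpl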

Problem solved

Monitoring dashboards

Metrics are already monitored in detail on the existing CooHua API gateway, and since the ingress-nginx controller also exposes Prometheus metrics, it was easy to deploy the community dashboard directly and make a few simple changes to it, as shown below:
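If your Prometheus discovers targets through pod annotations, a minimal sketch of what to add to the IC pod template is shown below; the controller serves Prometheus metrics on port 10254 at /metrics by default, and this convention only applies if your Prometheus is actually configured to use these annotations:

# pod template annotations on the IC Deployment (illustrative)
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "10254"
    prometheus.io/path: "/metrics"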


Troubleshooting

Dump the nginx.conf file

Please refer to this article docs.nginx.com/nginx/admin…
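Besides the approach described there, when the controller pod itself is reachable you can simply read the rendered configuration out of the container (pod name is illustrative):

kubectl -n kube-system exec <ic-pod-name> -- cat /etc/nginx/nginx.conf > nginx.conf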

Refs

  • yq.aliyun.com/articles/69…
  • yq.aliyun.com/articles/64…
  • bogdan-albei.blogspot.com/2017/09/ker…
  • danielfm.me/posts/painl…
  • www.asykim.com/blog/deep-d…