
17 Scheduling Policies

The Master mainly runs the cluster's control-plane components, such as the apiserver, the scheduler, and the controller manager. The Master also depends on the etcd storage cluster.

A cluster deployed with kubeadm runs the Master's control components as static Pods. Essentially, these components are processes on the Master node that serve the cluster; the Master itself is not responsible for running workloads.

Worker nodes are responsible for running workload Pods. Users only need to submit their workloads to the Master; they do not need to care which node a workload Pod ends up running on.

17.1 Pod Creation Process

The node on which a user-submitted workload finally runs is decided by the scheduler on the Master node. Users can define their own scheduling preferences for a workload; if nothing is defined, which is the default, the default scheduler is used.

When we run kubectl describe pods myapp to inspect a Pod, its Events field contains information about the scheduling result.

From the many candidate nodes, the scheduler selects one that satisfies the Pod's operational requirements and records the chosen node in etcd. The kubelet on that node continuously watches the apiserver for changes concerning its node; when the binding appears, the kubelet fetches the changed manifest from the apiserver and creates the Pod according to the manifest's definition.
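To make the flow concrete, here is a minimal sketch of the kind of manifest a user submits; the name myapp and the image are illustrative only. Note that spec.nodeName is left empty, which is what hands the placement decision to the default scheduler.

apiVersion: v1
kind: Pod
metadata:
  name: myapp               # illustrative name
spec:
  # no nodeName here: the default scheduler picks a node and records the
  # binding through the apiserver; the kubelet on that node then creates
  # the containers from this definition
  containers:
  - name: myapp
    image: nginx:alpine     # illustrative image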

17.2 Service Creation Process

When a user creates a Service, the request is submitted to the apiserver, which writes the manifest to etcd. The kube-proxy on every node watches the apiserver for changes to Service resources; when a change occurs, each kube-proxy creates the iptables/IPVS rules for that Service.
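As a minimal sketch of that flow, the Service below is all the user submits (the name myapp, the selector, and the ports are illustrative); everything else, from the etcd write to the per-node iptables/IPVS rules, happens through the watch mechanism described above.

apiVersion: v1
kind: Service
metadata:
  name: myapp                # illustrative name
spec:
  selector:
    app: myapp               # illustrative label selector for backend Pods
  ports:
  - port: 80                 # Service port
    targetPort: 80           # container port the traffic is forwarded to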

In terms of communication, kubectl, the kubelet, and kube-proxy are all clients of the apiserver. When these components interact with the apiserver, the data format on the wire is JSON, while internal serialization uses Protocol Buffers.

17.3 Dimensions of Resource Restrictions

  1. Resource requests: the minimum amount of resources a Pod needs in order to run
  2. Resource limits: the maximum amount of resources a Pod is allowed to occupy (see the sketch below)
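A minimal sketch of both dimensions on a single container; the name, the image, and the concrete CPU/memory values are illustrative. The requests are what the pre-selection phase checks against a node's free capacity, while the limits cap what the container may consume at runtime.

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo        # illustrative name
spec:
  containers:
  - name: app
    image: nginx:alpine      # illustrative image
    resources:
      requests:              # minimum the scheduler must find on a node
        cpu: 250m
        memory: 128Mi
      limits:                # maximum the container may occupy
        cpu: 500m
        memory: 256Mi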

17.4 Scheduler Scheduling Process

  1. Pre-selection (predicate) phase: exclude every node that cannot run this Pod, for example because its minimum resource requests cannot be met, a resource quota would be exceeded, or a required port is already occupied
  2. Priority (scoring) phase: compute a priority for each remaining node with a series of scoring functions and rank the nodes by score; the highest-scoring node wins
  3. Select phase: if the scoring phase produces more than one best node, one of them is chosen at random
  • Fields in the Pod spec that directly affect scheduling (see kubectl explain pods.spec); both are illustrated in the sketch right after this snippet
nodeName          # directly specifies the node the Pod runs on
nodeSelector      # selects a node according to the labels on the node
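A minimal sketch of both fields; the node name node01, the label disktype=ssd, and the images are illustrative. Setting nodeName pins the Pod to one node and bypasses the scheduler's node selection, while nodeSelector lets the scheduler pick any node carrying the matching label.

apiVersion: v1
kind: Pod
metadata:
  name: pin-by-name
spec:
  nodeName: node01           # illustrative node name; bypasses node selection
  containers:
  - name: app
    image: nginx:alpine
---
apiVersion: v1
kind: Pod
metadata:
  name: pin-by-label
spec:
  nodeSelector:
    disktype: ssd            # illustrative label the target node must carry
  containers:
  - name: app
    image: nginx:alpine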
  • Other factors that affect scheduling

Node affinity scheduling: expressed through the nodeSelector field.

Affinity between Pods: Pods that tend to run together with certain other Pods, for example in the same equipment room or on the same machine.

Anti-affinity between Pods: Pods that tend not to run together with certain other Pods; this is called anti-affinity. Typical reasons are two Pods listening on the same node port, or a Pod holding confidential data.

Taints: marks applied to some nodes to repel Pods that do not tolerate them.

Tolerations: a Pod's ability to tolerate taints on a node, including new taints that appear while the Pod is running.

Pod eviction: the node gives the Pod a limited amount of time to leave the node.
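A minimal sketch combining two of these factors; the taint key, the label app=myapp, and the image are illustrative. The Pod tolerates a hypothetical NoSchedule taint and refuses to be co-located on the same node as other Pods labelled app=myapp.

apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo
  labels:
    app: myapp
spec:
  tolerations:
  - key: example-taint             # illustrative taint key the Pod tolerates
    operator: Exists
    effect: NoSchedule
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: myapp             # do not share a node with Pods carrying this label
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: nginx:alpine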

17.5 Pre-selection Factors (Predicates)

A node must satisfy all of the enabled predicates below to pass pre-selection.

  1. CheckNodeConditionPred
Checks whether the node itself is in a normal condition.
  2. GeneralPredicates
A group of sub-policies (see the sketch after this list):
HostName: checks whether the node's hostname matches the NodeName specified by pod.spec.hostname.
PodFitsHostPorts: checks whether any hostPort requested in pods.spec.containers.ports.hostPort is already occupied on the node; if a required hostPort cannot be satisfied, the Pod cannot be scheduled onto that host.
MatchNodeSelector: checks whether the node's labels match the pods.spec.nodeSelector defined on the Pod.
PodFitsResources: checks whether the node has enough free resources to satisfy the Pod's basic resource requests.
  3. NoDiskConflict (not enabled by default)
Checks whether the storage volumes defined by the Pod are already in use on the node.
  4. PodToleratesNodeTaints
Checks whether the node's taints (nodes.spec.taints) are a subset of the Pod's toleration list (pods.spec.tolerations).
  5. PodToleratesNodeNoExecuteTaints
Checks whether the Pod tolerates NoExecute taints on the node. What does NoExecute mean? If a Pod is already running on an untainted node and a NoExecute taint is later added, the node evicts the Pods that do not tolerate it; without NoExecute, Pods already running on the node are not evicted, i.e. the existing placement is accepted, which is the default behaviour.
  6. CheckNodeLabelPresence (not enabled by default)
Checks for the presence of specified labels on the node; a node is selected only if it carries the labels the Pod requires.
  7. CheckServiceAffinity (not enabled by default)
A Service can have multiple Pods. If, for example, those Pods all run on nodes 1, 2 and 3 and none on nodes 4, 5 and 6, CheckServiceAffinity schedules newly added Pods of that Service onto nodes 1, 2 and 3 as well. The benefit of this concentration is that communication between the Pods of one Service becomes more efficient.
  8. MaxEBSVolumeCount
Ensures that the number of mounted Amazon EBS volumes does not exceed the configured maximum; the default is 39.
  9. MaxGCEPDVolumeCount
Ensures that the number of mounted GCE persistent-disk volumes does not exceed the configured maximum; the default is 16.
  10. MaxAzureDiskVolumeCount
Ensures that the number of attached Azure disk volumes does not exceed the configured maximum; the default is 16.
  11. CheckVolumeBinding
Checks whether the PVCs the Pod uses are already bound to another Pod on the node.
  12. NoVolumeZoneConflict
Given the zone (room) constraints, checks whether deploying the Pod on the host would cause a volume conflict.
  13. CheckNodeMemoryPressure
Checks whether the node's memory is under pressure.
  14. CheckNodeDiskPressure
Checks whether the node's disk I/O pressure is too high.
  15. CheckNodePIDPressure
Checks whether the node's PID resources are under pressure.
  16. MatchInterPodAffinity
Checks whether the Pod's affinity and anti-affinity requirements are satisfied.
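A minimal sketch of a Pod that exercises several of the predicates above; the label zone=zone-a, the port numbers, the image, and the resource values are illustrative. PodFitsHostPorts must find hostPort 8080 free on the candidate node, MatchNodeSelector must find the zone label, and PodFitsResources must find the requested CPU and memory.

apiVersion: v1
kind: Pod
metadata:
  name: predicate-demo
spec:
  nodeSelector:
    zone: zone-a             # illustrative node label (MatchNodeSelector)
  containers:
  - name: app
    image: nginx:alpine      # illustrative image
    ports:
    - containerPort: 80
      hostPort: 8080         # must be free on the node (PodFitsHostPorts)
    resources:
      requests:              # must fit into the node's free capacity (PodFitsResources)
        cpu: 100m
        memory: 64Mi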

17.6 Priority (Scoring) Functions

Each scoring function is run against every candidate node; the results of all scoring functions for a node are added up, and the node with the highest total score wins.

  1. least_requested.go
Prefers the node with the least requested resources. For CPU, for example, the score is based on the idle ratio: (capacity - sum(requested)) * 10 / capacity; memory is scored the same way and the two results are averaged.
  2. balanced_resource_allocation.go
Prefers balanced resource usage, meaning the CPU and memory utilisation rates are close to each other. The closer they are, the higher the score; the highest score wins.
  3. node_prefer_avoid_pods.go
Checks whether the node carries the annotation scheduler.alpha.kubernetes.io/preferAvoidPods. A node without this annotation is considered more suitable for running the Pod.
  4. taint_toleration.go
Matches the Pod's spec.tolerations against the node's nodes.spec.taints; the more entries that match, the lower the node's score.
  5. selector_spreading.go
Looks up the Service, StatefulSet, ReplicaSet, etc. that the current Pod belongs to. The fewer Pods of that selector already running on a node, the higher that node's score. In other words, Pods selected by the same label selector are spread across multiple nodes.
  6. interpod_affinity_test.go
Iterates over the Pod's affinity terms and adds the weight of each term that a node can match to that node's score; the larger the sum, the higher the score, and the highest score wins.
  7. most_requested.go
The opposite of least_requested: prefers to use up the resources of one node first, packing Pods onto as few nodes as possible.
  8. node_label.go
Scores nodes based on whether they carry a given label, regardless of the label's value.
  9. image_locality.go
Prefers nodes based on the total size of the images required by the Pod that are already present on the node.
  10. node_affinity.go
Checks how well nodes match the nodeSelector (node affinity) terms in the Pod object; the more terms a node matches, the higher its score (see the sketch below).
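A minimal sketch of soft (preferred) node affinity feeding into the scoring phase; the label disktype=ssd, the weight, and the image are illustrative. Nodes carrying the label receive the extra weight during scoring, but the Pod can still be placed elsewhere if no such node passes pre-selection.

apiVersion: v1
kind: Pod
metadata:
  name: prefer-ssd
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80                 # illustrative weight added when the preference matches
        preference:
          matchExpressions:
          - key: disktype          # illustrative node label key
            operator: In
            values:
            - ssd
  containers:
  - name: app
    image: nginx:alpine            # illustrative image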

17.7 Select Function

When the scoring phase leaves more than one best node, one of them is selected at random.

Other

My notes are pushed to github.com/redhatxl/aw… — likes, stars, and shares are welcome.