What are the new features in Kubernetes 1.15 that stand out?

Kubernetes released version 1.15 on June 20, 2019, and I have since had time to take a closer look at what it brings. This update is aimed at continuing to improve stability and scalability, and after going through the 25 new or changed features, it is clear that many previously minor annoyances have been addressed in this release. Here is the format for each feature:

  • #492: the GitHub issue number of the feature
  • Progress: the current stage of the feature, such as Alpha, Beta, or Stable
  • Feature category: the category of the feature, such as API, CLI, Network, etc.

The details below cover what was improved and why, with simple usage examples for some of the features.

1. Core functions

# 1024 NodeLocal DNSCache

Progress: Towards Beta

Feature category: Network

NodeLocal DNSCache improves cluster DNS performance by running a DNS caching agent on each cluster node as a DaemonSet, avoiding iptables DNAT rules and connection tracking. If the local DNS cache agent cannot find the corresponding DNS record in memory, it issues a query to the kube-dns service (with the suffix cluster.local by default).
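The cache agent itself ships as a DaemonSet manifest in the Kubernetes repository; the sketch below shows only the kubelet side, assuming the conventional link-local listen address 169.254.20.10 used by node-local-dns:

```yaml
# A minimal sketch: point the kubelet's cluster DNS at the node-local
# cache instead of the kube-dns service IP. 169.254.20.10 is the
# conventional link-local address used by the node-local-dns DaemonSet.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
- 169.254.20.10
```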

For more details on this feature read the design notes in the Kubernetes Enhancement Proposal (KEP) documentation.

# 383 Redesign event API

Progress: Alpha

Feature category: Scalability

This work has two main objectives:

  1. Reduce the performance impact of Events on the rest of the cluster;
  2. Add more structure to the Event object, a necessary first step toward automated Event analysis.

Currently, the main problem with the Event API is that it contains too much noise, making it difficult to ingest and analyze the useful information. There are also several performance issues; for example, Events can overload the API server when a cluster fails (such as during a common crash loop).
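For illustration, here is roughly what an object in the new events.k8s.io API group looks like; the field names follow the v1beta1 API and may change as the redesign evolves:

```yaml
# A rough sketch of the richer Event shape in the events.k8s.io group.
apiVersion: events.k8s.io/v1beta1
kind: Event
metadata:
  name: example-pod.15a1b2c3d4e5f6a7
  namespace: default
eventTime: "2019-06-20T10:00:00.000000Z"
type: Normal
action: Binding                 # machine-readable action taken on the object
reason: Scheduled
regarding:                      # the object this event is about
  kind: Pod
  name: example-pod
  namespace: default
note: "Successfully assigned default/example-pod to node-1"
reportingController: default-scheduler
reportingInstance: default-scheduler-node-1
series:                         # deduplication of repeating events
  count: 3
  lastObservedTime: "2019-06-20T10:05:00.000000Z"
```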

A discussion of the issue and suggested solutions and improvements can be found in the design proposal here.

# 492 Admission webhook

Progress: Beta

Feature category: API

Mutating and validating webhooks have become the mainstream choice for extending the API. Before 1.15, all webhooks were called exactly once, in alphabetical order, which caused a problem: an earlier webhook cannot react to changes made by later webhooks, which can lead to surprising results, such as an earlier webhook setting startup parameters on a Pod that a later webhook then changes or removes.

In Kubernetes 1.15, webhooks may be called repeatedly, even for modifications to the same object. To enable this feature, you must ensure that any admission webhook you introduce is idempotent; that is, performing the operation on the same object any number of times has the same effect as performing it once.
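Reinvocation is opted into per webhook via the reinvocationPolicy field on the webhook configuration. A minimal sketch; the webhook name and backing service here are hypothetical:

```yaml
# reinvocationPolicy: IfNeeded asks the API server to call this webhook
# again if a later webhook mutates the same object.
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  name: example-mutating-webhook
webhooks:
- name: sidecar-injector.example.com
  reinvocationPolicy: IfNeeded   # the default is "Never"
  clientConfig:
    service:
      namespace: default
      name: sidecar-injector     # hypothetical webhook service
      path: /mutate
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
```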

# 624 Scheduling framework

Progress: Alpha

Feature category: Scheduling

This feature provides a new pluggable architecture for the Kubernetes 1.15 scheduler, primarily to address the growing need for customized scheduling. The scheduling framework adds more than ten extension points, such as Reserve and PreBind, on top of the original Predicates/Priorities interfaces.

The design document includes a diagram of the Pod scheduling process in the new scheduling framework, showing where each extension point is invoked during a scheduling cycle.

More information about this feature can be found in the official design documentation.

# 606 Support 3rd party device monitoring plugins

Progress: Towards Beta

Feature category: Node

This feature allows Kubelet to expose container binding information to third-party monitoring plug-ins so that system administrators can use third-party device monitoring agents to monitor the usage of custom resources allocated to pods (for example, GPU usage per Pod).

Before this decoupling, the kubelet had to check for the presence of every supported device, even devices not installed on the node.

A new gRPC service is exposed through /var/lib/kubelet/pod-resources/kubelet.sock; it reports the resources and devices allocated to each container.

# 757 Pid limiting

Progress: Towards Beta

Feature category: Node

PIDs are a fundamental resource on Linux systems, and it is easy for a system to hit its process limit before CPU or memory usage reaches its maximum. Administrators must therefore ensure that pods cannot exhaust PIDs and cause other important services (e.g., the container runtime or the kubelet) to fail.

The new feature allows you to modify the kubelet configuration to limit the number of PIDs per Pod. The ability to limit PIDs at the node level can now be used directly, eliminating the need to explicitly set the feature gate SupportNodePidsLimit=true.
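A minimal sketch of the kubelet side, using the podPidsLimit field of the kubelet configuration:

```yaml
# Cap every Pod on this node at 1024 process IDs.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
podPidsLimit: 1024
```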

The official Kubernetes blog has a detailed description of this feature.

# 902 Add non-preempting option to PriorityClasses

Progress: Alpha

Feature category: Scheduling

Kubernetes 1.15 added the PreemptionPolicy field as an Alpha feature in PriorityClass.

The default value of the PreemptionPolicy field is PreemptLowerPriority, which allows pods of that priority class to preempt lower-priority pods (the existing default behavior). If PreemptionPolicy is set to Never, the Pod is placed ahead of lower-priority pods in the scheduling queue, but it cannot preempt other pods.

Take data science as an example: a user submits a job that they want prioritized over other jobs, but without putting currently running work on hold by preempting its pods.
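A non-preempting priority class for such a job might look like this; the name and value are illustrative:

```yaml
# High priority for queue ordering, but never preempts running pods.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting
value: 1000000
preemptionPolicy: Never   # alpha field; requires the NonPreemptingPriority feature gate
globalDefault: false
description: "Scheduled ahead of lower-priority pods, without evicting them."
```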

# 917 Add go module support to k8s.io/kubernetes

Progress: Stable

Feature category: Architecture

Since Kubernetes was open-sourced, godep has been used to vendor all dependent libraries. As the Go ecosystem matured, vendoring became mainstream but godep was no longer maintained, so Kubernetes switched to a custom version of godep, then to other vendoring tools (such as glide and dep). Now Go's dependency management can finally be handled directly by Go modules.

Go modules are enabled by default as of Go 1.13, phasing out the old $GOPATH mode. To support this change, Kubernetes 1.15 adapted the code of several components to use Go modules.

# 956 Add Watch bookmarks support

Progress: Alpha

Feature category: API

A Kubernetes cluster only keeps its change history for a period of time; for example, a cluster using etcd3 keeps 5 minutes of history by default. A bookmark in a Kubernetes watch can be thought of as an extra checkpoint: it tells the client that all changes up to the resourceVersion the bookmark carries have already been delivered.

For example, suppose a client starts a watch waiting for events from resourceVersion X. The API server knows the watch is not interested in events at other resource versions, so it uses bookmarks to skip past them and only send the relevant events to the client. This avoids unnecessary load on the API server.
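Clients opt in by adding allowWatchBookmarks=true to the watch request (the feature is gated by WatchBookmark); a bookmark then arrives on the stream as an event carrying only a resourceVersion, sketched here:

```yaml
# A bookmark event as it might appear on the watch stream (rendered as
# YAML for readability; the wire format is JSON). Only resourceVersion
# is populated on the embedded object.
type: BOOKMARK
object:
  kind: Pod
  apiVersion: v1
  metadata:
    resourceVersion: "12345"
```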

# 962 Execution hooks

Progress: Alpha

Feature category: Storage

ExecutionHook provides a general mechanism for the user to trigger the desired hook commands in a container, for example:

  • Application backup
  • Upgrade
  • Database migration
  • Reloading a configuration file
  • Restarting a container

The definition of a hook contains two important pieces of information:

  1. What commands to execute
  2. Where to execute them (specified through a Pod selector)

Here is a simple example:
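(The API is still alpha; the sketch below follows the KEP's design, and the API group, kind, and field names are illustrative and may differ from the final implementation.)

```yaml
# A sketch of an ExecutionHook: run a command in every container of the
# pods matched by the selector. Names and fields are illustrative.
apiVersion: apps.k8s.io/v1alpha1
kind: ExecutionHook
metadata:
  name: quiesce-before-backup
spec:
  command: ["/bin/sh", "-c", "mysql -e 'FLUSH TABLES WITH READ LOCK;'"]  # what to run
  podSelector:                                                           # where to run it
    matchLabels:
      app: mysql
```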

For more details on this feature read the design notes in the Kubernetes Enhancement Proposal (KEP) documentation.

# 981 PDB support for custom resources with scale subresource

Progress: Towards Beta

Feature category: Apps

Pod Disruption Budget (PDB) is a Kubernetes API object that limits the number of pods of a replicated application (such as a Deployment or ReplicaSet) that may be down simultaneously due to voluntary disruptions. A PDB expresses the disruption budget by specifying either a minimum number of available pods or a maximum number of unavailable pods.

For example, for a stateless front-end application:

  • Requirements: Service capacity cannot be reduced by more than 10%
  • Solution: use a PDB with minAvailable: 90%

Using PDB allows administrators to operate Kubernetes workloads without compromising service availability and performance.
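For the front-end example above, the PDB would look roughly like this; the labels are illustrative:

```yaml
# Keep at least 90% of the front-end pods available during voluntary disruptions.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
spec:
  minAvailable: "90%"
  selector:
    matchLabels:
      app: frontend
```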

2. Customize resources

# 95 CustomResourceDefinitions

Progress: Beta

Feature category: API

This entry provides no functionality of its own; it simply groups the CRD-related fixes and improvements in Kubernetes 1.15:

  • Structural schema using OpenAPI
  • CRD pruning
  • CRD defaulting
  • Webhook conversion moved to beta
  • Publishing the CRD OpenAPI schema

# 692 Publish CRD OpenAPI schema

Progress: Towards Beta

Feature category: API

This feature allows developers to define CustomResourceDefinitions (CRDs) with an OpenAPI v3 schema, enabling server-side validation of custom resources (CRs).

Publishing the CRD's OpenAPI schema enables client-side validation (e.g., in kubectl create and kubectl apply) and schema-aware documentation (e.g., kubectl explain). Clients can also be generated automatically for CRs, so developers can interact with the API from any supported programming language.

Using the OpenAPI specification also brings CRDs closer in behavior to the built-in Kubernetes API and makes the published documentation more consistent.
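For example, a CRD can embed an OpenAPI v3 schema under spec.validation; the schema is then both enforced server-side and published for clients. The resource names here are illustrative:

```yaml
# A CRD with an embedded OpenAPI v3 schema; kubectl explain and
# client-side validation are driven by the published schema.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  group: stable.example.com
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
  versions:
  - name: v1
    served: true
    storage: true
  validation:
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          properties:
            cronSpec:
              type: string
            replicas:
              type: integer
              minimum: 1
```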

# 575 Defaulting and pruning for custom resources

Progress: Alpha

Feature category: API

The following two features are primarily intended to make CRD-related JSON processing easier.

Pruning: custom resources are traditionally stored as JSON in etcd. Now, if the CRD is defined with an OpenAPI v3 schema and preserveUnknownFields is false, fields not declared in the schema are dropped on create and update.

Defaulting: this feature is in the Alpha phase in Kubernetes 1.15. It is disabled by default and can be enabled with the CustomResourceDefaulting feature gate. Defaulting also starts from the schema: default values declared in the OpenAPI v3 schema (via the default keyword) are applied to fields that are left unset.
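A schema fragment illustrating both behaviors, assuming the CustomResourceDefaulting feature gate is enabled:

```yaml
# Fragment of a CRD spec: unknown fields are pruned, and `replicas`
# defaults to 1 when omitted (requires the CustomResourceDefaulting
# feature gate for the `default` keyword to take effect).
spec:
  preserveUnknownFields: false   # enables pruning of undeclared fields
  validation:
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          properties:
            replicas:
              type: integer
              default: 1         # applied to unset fields on read and write
```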

# 598 Webhook conversion for custom resources

Progress: Towards Beta

Feature category: API

Different versions of a CRD can have different schemas. You can now handle conversion between versions on the fly by implementing a conversion webhook. This webhook is called in the following situations:

  • A custom resource is requested at a version different from the stored version
  • A watch is established in one version, but the object being watched is later modified and stored in a different version
  • A custom resource is written with PUT at a version different from the stored version

An example webhook server that converts custom resources between versions is available for reference.
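On the CRD side, conversion is switched from the default None strategy to a webhook; a sketch of the relevant fragment, with a hypothetical service name and path:

```yaml
# Fragment of a CRD spec enabling webhook conversion; the API server
# calls the referenced service to convert objects between versions.
spec:
  conversion:
    strategy: Webhook
    webhookClientConfig:
      service:
        namespace: default
        name: example-conversion-webhook   # hypothetical service
        path: /convert
      caBundle: "<base64-encoded CA certificate>"
```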

3. Configure management

# 515 Kubectl get and describe should work well with extensions

Progress: Towards Stable

Feature category: CLI

Third-party API extensions and CRDs can now provide custom, formatted output for kubectl get and kubectl describe. This feature moves output formatting to the server side, which scales better and decouples kubectl from the implementation details of the extension.
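For CRDs, the columns shown by kubectl get are declared on the server via additionalPrinterColumns, for example:

```yaml
# Fragment of a CRD spec: the server returns these columns as a Table,
# so kubectl get needs no knowledge of the resource's internals.
spec:
  additionalPrinterColumns:
  - name: Replicas
    type: integer
    JSONPath: .spec.replicas
  - name: Age
    type: date
    JSONPath: .metadata.creationTimestamp
```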

For more details on this feature, consult the relevant design documentation.

# 970 Kubeadm: New v1beta2 config format

Progress: Towards Beta

Feature category: Cluster Lifecycle

Over time, the number of options for configuring cluster creation in the kubeadm configuration file has grown substantially, while the set of CLI parameters has stayed the same, so using a configuration file is currently the only way to cover every user's needs.

The goal of this feature is to redesign how the configuration is stored, to ameliorate the problems in the current version: moving away from a single configuration structure containing all options in favor of separate substructures, providing better support for high-availability clusters.
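A minimal sketch of the new format, with the configuration split into several documents in one file (the values are illustrative):

```yaml
# The v1beta2 format splits the configuration into separate substructures.
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  name: master-1
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.15.0
controlPlaneEndpoint: "lb.example.com:6443"   # useful for HA setups
```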

# 357 Ability to create dynamic HA clusters with kubeadm

Progress: Towards Beta

Feature category: Cluster Lifecycle

Kubernetes can provide high availability through multiple control planes. The kubeadm tool can now be used to create a high availability cluster in two ways:

  • Etcd co-exists with the Control Plane node (Master)
  • Etcd is separate from the Control Plane node (Master)

This version of kubeadm automatically copies the required certificates, reducing the need for human intervention; currently a temporary encryption key is used to protect the certificates in transit. More details can be found in the KEP documentation.

4. Cloud providers

# 423 Support AWS network load balancer

Progress: Towards Beta

Feature category: AWS

Kubernetes 1.15 can create an AWS Network Load Balancer (NLB) for a Service of type LoadBalancer by means of an annotation.

Unlike classic Elastic Load Balancers, Network Load Balancers (NLBs) pass the client's IP through to the node. AWS NLB support has been in Alpha since Kubernetes 1.9; now that the code and API are relatively stable, it is ready to move to Beta.
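The annotation that selects an NLB instead of a classic ELB; the Service details are illustrative:

```yaml
# A LoadBalancer Service backed by an AWS NLB rather than a classic ELB.
apiVersion: v1
kind: Service
metadata:
  name: frontend
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: frontend
  ports:
  - port: 80
    targetPort: 8080
```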

# 980 Finalizer protection for service LoadBalancers

Progress: Alpha

Feature category: Network

By default, the load balancer resource provisioned by the cloud provider should be deleted when the corresponding Kubernetes Service is deleted. In various corner cases, however, the load balancer resource is orphaned after the Service is gone. Finalizers were introduced to prevent this from happening.

If your cluster is integrated with a cloud provider, a finalizer is added to any Kubernetes Service of type LoadBalancer. The Service can only be deleted after the load balancer resource has been cleaned up and the finalizer removed.
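With the feature enabled, a LoadBalancer Service carries the cleanup finalizer, roughly as below:

```yaml
# The finalizer blocks deletion of the Service object until the cloud
# load balancer has been cleaned up (requires the
# ServiceLoadBalancerFinalizer feature gate in 1.15).
apiVersion: v1
kind: Service
metadata:
  name: frontend
  finalizers:
  - service.kubernetes.io/load-balancer-cleanup
spec:
  type: LoadBalancer
  selector:
    app: frontend
  ports:
  - port: 80
```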

5. Storage

# 625 In-tree storage plugin to CSI Driver Migration

Progress: Alpha

Feature category: Storage

Storage plug-ins were originally part of the core Kubernetes code base, which added maintenance complexity and hindered extensibility. The goal of this feature is therefore to move all storage-related code into external plug-ins that interact with Kubernetes through the Container Storage Interface (CSI). This reduces development costs and makes the system more modular and extensible, allowing better compatibility between different versions of storage plug-ins and Kubernetes. For the latest updates on this feature, see here.

# 989 Extend allowed PVC DataSources

Progress: Alpha

Feature category: Storage

This feature lets users clone existing PVCs. Cloning is not the same as backup: cloning creates a new volume whose contents duplicate those of the source. Cloning an existing PVC consumes volume quota and follows the same creation and validation flow as other storage volumes, and the resulting PVC has the same lifecycle and workflow as an ordinary one. When using this feature, note the following (a minimal example follows the list):

  • The PVC dataSource cloning feature (the VolumePVCDataSource feature gate) applies only to CSI drivers.
  • Cloning applies only to dynamically provisioned volumes.
  • Cloning also depends on whether the CSI driver actually implements volume cloning.
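A minimal sketch of a clone request; the storage class name and sizes are illustrative:

```yaml
# Clone an existing PVC by naming it as the dataSource of a new claim.
# Both claims must use the same (CSI-backed) storage class and namespace.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clone-of-pvc-1
spec:
  storageClassName: csi-storageclass   # hypothetical CSI storage class
  dataSource:
    name: pvc-1                        # the source PVC to clone
    kind: PersistentVolumeClaim
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```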

# 1029 Quotas for ephemeral storage

Progress: Alpha

Feature category: Node

The current mechanism for limiting ephemeral storage periodically walks each ephemeral volume to measure its size, which is slow and high-latency. The mechanism proposed in this feature uses the file system's project quotas to monitor resource consumption and then decide whether to apply limits. It aims to achieve the following goals:

  • Improve monitoring performance by optionally using project quotas to collect information about ephemeral volume usage.
  • Detect storage consumed by files that have been deleted from a Pod's volume but remain hidden because some process still holds them open.

In time, this would also allow project quotas to enforce per-volume usage limits.

# 531 Add support for online resizing of PVs

Progress: Towards Beta

Feature category: Storage

This feature allows users to expand a storage volume's file system online by modifying the PVC, without having to restart the pods using the volume. Online expansion of in-use PVCs is still in Beta and is enabled by default; it can also be controlled through the ExpandInUsePersistentVolumes feature gate. A sketch of how expansion is requested follows the list below.

File system expansion can be triggered when:

  • When the Pod starts
  • When the Pod is running and the underlying file system supports online expansion (for example, XFS, ext3, or ext4)
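Expansion is requested by simply increasing the claim's storage request. A minimal sketch, assuming the volume's StorageClass sets allowVolumeExpansion: true:

```yaml
# Growing an in-use volume: bump the storage request on the existing PVC.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi   # increased from the original 10Gi
```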

Please refer to the official Kubernetes documentation for more information on this feature.

# 559 Provide environment variables expansion in sub path mount

Progress: Towards Beta

Feature category: Storage

Kubernetes currently has a limitation when mounting volumes on nodes: if two or more pods running on the same node write log files with the same name to the same volume, those pods conflict with one another.

Using subPath is a reasonable option, but subPath can currently only be a hard-coded value, which offers no flexibility. The previous workaround was to create a sidecar container that maintains a symlink to the mount path.

To solve this problem more easily, it is proposed to support environment-variable expansion in the subpath, via the new subPathExpr field, as shown in the following example:
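(A minimal sketch based on the beta API: subPathExpr expands $(VAR) references from the container's environment, so each pod writes under its own directory. The image, paths, and volume are illustrative.)

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod1
spec:
  containers:
  - name: container1
    image: busybox
    command: ["sh", "-c", "echo hello > /logs/hello.txt && sleep 3600"]
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    volumeMounts:
    - name: workdir
      mountPath: /logs
      subPathExpr: $(POD_NAME)   # expands to the pod's name at mount time
  volumes:
  - name: workdir
    hostPath:
      path: /var/log/pods
```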

6. Summary

This article not only walks readers through the new features of Kubernetes 1.15, but also offers a view of how a large system such as Kubernetes solves performance bottlenecks when integrating with third parties and components, which can serve as a reference for future architecture designs.

7. Reference materials

  • Sysdig.com/blog/whats-…