Kubernetes officially released v1.26, the stability has been significantly improved¶
On December 8, 2022 Pacific Time, Kubernetes officially released v1.26 with the topic Electrifying .
As the last version in 2022, many new features have been added, and the stability has also been significantly improved. We will introduce the update of version 1.26 from the following perspectives.
Update overview:
- Kube APIServer: As the entry point for Kubernetes requests, this update adds 4 new KEP features, and makes some optimizations in response compression, and another 2 features have also been upgraded from Alpha to Beta.
- Node: We put the updates most closely related to kubelet here, mainly including 4 new KEP features, and 4 features are GA in this version.
- Storage: In terms of storage, the feature of allocating volumes from snapshots of other namespaces (cross-namespaces) has been added, and 2 storage-related features have entered the Beta stage, and 3 features are officially GA.
- Network: Mainly updates to Kube Proxy, including a performance-optimized KEP, and 2 features have entered Beta, and 4 features are officially GA.
- Resource Control and Coordination: Mainly for the update of related resource controllers in kube-controller-manager, there are also 2 new KEP features, 2 features are upgraded from Alpha to Beta, and 1 This feature is officially GA.
- Scheduler: Mainly added an important KEP feature - PodSchedulingReadiness is used to control when a Pod can be scheduled by the scheduler, and one feature has been upgraded from Alpha to Beta.
- Observability: In terms of observability, it mainly adds a new mechanism for exposing component status - Component Health SLIs. In addition, many metrics are added to each component.
- kubectl command, kubeadm and client-go also have some optimizations and Bug Fixes.
- For features that have been GA, according to the version iteration strategy of Kubernetes, 11 Feature Gates are also removed in 1.26. If these Feature Gates are continued to be set in the component command, the component will not start normally.
Next, let's take a look at some of the more important API deprecations and changes that affect upgrades.
API deprecations and changes¶
PR#110618 Kubelet no longer supports the v1alpha2 version of CRI, and the connected container runtime must implement the v1 version of the container runtime interface.
This means that Kubernetes v1.26 will not support containerd 1.5.x and earlier versions; you need to upgrade to containerd 1.6.x or later before you can upgrade the node's kubelet to 1.26.
PR#112306 flowcontrol.apiserver.k8s.io increases v1beta3 version, and sets v1beta2 as the optimal version, v1beta3 will be set in 1.27 is the optimal version.
PR#113336 The CSIMigrationvSphere feature is GA and cannot be turned off.
Tip
Official tip: Do not upgrade to Kubernetes v1.26 if you need to use Windows, XFS or raw blocks. You can upgrade after vSphere CSI Driver adds related support in versions after v2.7.x.
PR#113710 The --pod-eviction-timeoutflag in the kube-controller-manager command is deprecated and is compatible with --enable- taint-managerflag was removed in 1.27.
PR#112643 DynamicKubeletConfig is deprecated in 1.23, the logic in kubelet has been removed in 1.24, this update will FeatureGateDynamicKubeletConfig and APIServer The logic within is removed.
PR#112120 Remove some invalid klog related flags in Kube component.
Kubernetes APIServer¶
KEP-2799 Reduction of Secret-based Service Account Tokens¶
Added Alpha Feature Gate —— LegacyServiceAccountTokenTracking to control whether to enable this feature.
When LegacyServiceAccountTokenTracking is enabled, the secret-based sa token will use the label kubernetes.io/legacy-token-last-used to record the last usage time.
KEP-3488 CEL for Admission Control¶
Related PRs: PR#113314, PR#113349, PR#112994, PR#112792, PR#112926, PR#112858
Based on the KEP-2876 CRD validation expression language provided by Kubernetes v1.25, This feature adds a new resource under admissionregistration.k8s.io/v1alpha1 - ValidatingAdmissionPolicy , which allows field validation when not using a Validation Webhook.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
name: "demo-policy.example.com"
Spec:
failurePolicy: Fail
matchConstraints:
resourceRules:
- apiGroups: ["apps"]
apiVersions: ["v1"]
operations: ["CREATE", "UPDATE"]
resources: ["deployments"]
validations:
- expression: "object.spec.replicas <= 5"
This will require the resource's spec.replicas field to be less than or equal to 5.
Added Alpha Feature Gate —— ValidatingAdmissionPolicy to control whether to enable this feature
KEP-3352 Aggregated Discovery PR#113171¶
Currently, users can only traverse the Group and Version APIs to obtain the Discovery API, but this feature reduces these calls to only two interfaces: /api and /apis.
Added Alpha Feature Gate —— AggregatedDiscoveryEndpoint to control whether to enable this feature.
KEP-3325 Auth API to get self user attributes PR#111333¶
Add resource SelfSubjectReview in authentication/v1alpha1 group to provide users with querying their own user information mapped in Kubernetes.
And the kubectl alpha auth whoami command has been added to facilitate query.
apiVersion: authentication.k8s.io/v1alpha1
kind: SelfSubjectReview
status:
userInfo:
username: jane.doe
uid: b79dbf30-0c6a-11ed-861d-0242ac120002
groups:
-students
-teachers
-system:authenticated
extra:
skills:
-reading
- learning
subjects:
- math
-sports
Added Alpha Feature Gate —— APISelfSubjectAttributesReview to control whether to enable this feature.
PR#112193 APIServer adds --aggregator-reject-forwarding-redirect flag, Users can set it to false to continue forwarding the redirection response of AA (Aggregated API) Server, the default is true.
PR#113015 You can specify custom resources through the --encryption-provider-config file, and these custom resources can be stored encrypted in etcd .
Response Compression¶
PR#112299 Based on load testing and production data collected from thousands of production Kubernetes clusters, the community observed that gzip compression in Kubernetes APIServer is currently suboptimal.
ISSUE: kubernetes/kubernetes#112296
PR#112309 Add DisableCompression field in kubeconfig, when set to true, it is required to no longer compress the response.
PR#112580 Add --disable-compression flag in kubectl, when set to true, request no longer compress the response.
Function stability upgrade¶
Alpha -> Beta
Feature Gate | KEP |
---|---|
LegacyServiceAccountTokenNoAutoGeneration | KEP-2799 Reduction of Secret-based Service Account Tokens |
APIServerIdentity | KEP-1965 kube-apiserver identity |
Node (Kubelet)¶
KEP-3063 dynamic resource allocation PR#111023¶
Add resource.k8s.io/v1alpha1 group and add resources related to dynamic resource allocation under this group: 'ResourceClaim', 'ResourceClass', 'ResourceClaimTemplate', 'PodScheduling'.
Added Alpha Feature Gate —— DynamicResourceAllocation to control whether to enable this feature.
The new API is more flexible than Kubernetes' existing Device Plugins functionality, because it allows Pods to request special types of resources, which can be provided at the node level, cluster level, or according to other modes set by the user.
Similarly, the Pod structure also adds corresponding support for dynamic resource allocation.
apiVersion: v1
kind: Pod
spec:
containers:
- name: with-resource
image: busybox
command: ["sh", "-c", "set && mount && ls -la /dev/"]
resources:
claims:
-name: resource
resourceClaims:
-name: resource
source:
resourceClaimName: shared-claim
# resourceClaimTemplateName: test-inline-claim-template
KEP-3545 Improved multi-numa alignment in Topology Manager PR#112914¶
This feature better handles NUMA (Non-Uniform Memory Access) nodes by optimizing the TopologyManager.
Add a new configurable item topologyManagerPolicyOptions field and --topology-manager-policy-options flag to set additional configuration for Topology Manager Policy.
And add three Alpha Feature Gates to control the configuration of Topology Manager Policy:
- TopologyManagerPolicyOptions
- TopologyManagerPolicyAlphaOptions
- TopologyManagerPolicyBetaOptions
TopologyManagerPolicyOptions controls whether the TopologyManagerPolicyOptions feature is supported, and the other two are used to control whether the Alpha and Beta level Options of TopologyManagerPolicy can be set respectively.
Of course, to use the TopologyManager function, you need to enable the TopologyManager Feature Gate, but the Feature Gate is already in the Beta stage and does not need to be actively set.
KEP-3386 Kubelet Evented PLEG for Better Performance PR#111384¶
This feature allows kubelet to reduce periodic polling by relying as much as possible on notifications from the Container Runtime Interface (CRI) when tracking Pod status in a node, which reduces kubelet's CPU usage
Added Alpha Feature Gate —— EventedPLEG to control whether to enable this feature.
PR#86139 In past versions, when using httpGet in container preStop and postStart lifecycle callbacks, Even if the schemes are set to HTTPS, http is still used to access, and the headers set by the user will not be applied in the setting request.
This PR fixes these problems, and when the https access is abnormal, it will fall back to the http request. When the fallback occurs, A LifecycleHTTPFallback event is created for the Pod and the kubelet_lifecycle_handler_http_fallbacks_total metric is updated.
In addition, Added an Alpha Feature Gate - ConsistentHTTPGetHandlers , users can set --feature-gates=ConsistentHTTPGetHandlers=false in the kubelet to turn off the fallback behavior.
KEP-3503 Windows allows specifying whether a Pod is added to a node's network namespace PR#112961¶
Added Alpha Feature Gate —— WindowsHostNetworking to control whether to enable this feature.
PR#112414 allows to combine multiple line options in /etc/resolv.conf into a single line setting in Pod.
->
BUG FIX¶
PR#113041 Fixed kubectl exec causing kubelet to pick wrong container due to duplicate container name due to lifecycle.preStop .
PR#108832 When the container sets limit.cpu, but requests.cpu is "0", cgroups cpuShares takes the minimum value of 2 instead of using limit.cpu.
PR#112184 When kubelet only sets --cloud-provider or --node-ip , it will ensure that invalid annotations in the node are cleared - alpha.kubernetes.io/provided-node-ip .
PR#113481 Fix kubectl logs --timestamps when viewing logs, the problem of confusing random timestamps appears.
PR#112518 Fixed an issue where pods would continue to run on nodes tainted with NoExecute taints when the PodDisruptionConditions feature gate was enabled.
PR#112123 Set the minimum value of cpuCFSQuotaPeriod from 1us to 1ms, setting the value below 1ms will fail the verification.
Related PRs: PR#112077, PR#111520, PR#63437
Function stability upgrade¶
Beta -> GA
Feature Gate | KEP |
---|---|
CPU Manager | KEP-3057 Graduate to CPUManager to GA |
DevicePlugins | KEP-3573 Graduate DeviceManager to GA |
Kubelet CredentialProviders | KEP-2133 Kubelet Credential Provider |
WindowsHostProcessContainers | KEP-1981 Support for Windows privileged containers |
storage¶
KEP-3294 Provision volumes from cross-namespace snapshots¶
Related PRs: PR#113186, PR#kubernetes-csi/external-rpovisioner#805
Prior to Kubernetes 1.26, with VolumeSnapshot , users could allocate volumes from snapshots. But it fails to bind to VolumeSnapshots in other namespaces in a PersistentVolumeClaim .
apiVersion: v1
kind: PersistentVolumeClaim
spec:
storageClassName: csi-hostpath-sc
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
dataSourceRef:
apiGroup: snapshot.storage.k8s.io
kind: VolumeSnapshot
name: new-snapshot-demo
namespace: prod
volumeMode: Filesystem
This feature supports users to allocate volumes from snapshots across namespaces by setting the newly added field spec.dataSourceRef.namespace in PVC
Added Alpha Feature Gate —— CrossNamespaceVolumeDataSource to control whether to enable this feature
Function stability upgrade¶
Alpha -> Beta
Feature Gate | KEP |
---|---|
RetroactiveDefaultStorageClass | KEP-3333 Retroactive default StorageClass assignment |
NodeOutOfServiceVolumeDetach | KEP-2268 Non-graceful node shutdown |
Beta -> GA
Feature Gate | KEP |
---|---|
CSIMigrationAzureFile | KEP-625 In-tree storage plugin to CSI Driver Migration |
CSIMigrationvSphere | KEP-1491 vSphere in-tree to CSI driver migration |
DelegateFSGroupToCSIDriver | KEP-2317 Allow Kubernetes to supply pod's fsgroup to CSI driver on mount |
network¶
PR#110268 Optimize kube-proxy performance, it only sends rules changed in call to iptables-restore instead of whole ruleset
Added Alpha Feature Gate —— MinimizeIPTablesRestore to control whether to enable this feature.
PR#108250 kube-proxy adds flag--iptables-localhost-nodeports to allow users to prohibit access to NodePort services through localhost.
PR#111806 If the user requests to use ipvs, but the system is not configured correctly, it will no longer fall back to iptables mode, but return an error.
PR#113363 If kube-proxy detects that the pod.Spec.PodCIDRs assigned to Nodes have changed, it will restart.
PR#112133 Removed the deprecated "userspace" proxy mode.
Function stability upgrade¶
Alpha -> Beta
Feature Gate | KEP |
---|---|
Proxy Terminating Endpoints | KEP-1669 Proxy Terminating Endpoints |
ExpandedDNSConfig | KEP-2595 Expanded DNS configuration |
Beta -> GA
Feature Gate | KEP |
---|---|
MixedProtocolLBService | KEP-1435 Support of mixed protocols in Services with type=LoadBalancer |
ServiceIPStaticSubrange | KEP-3070 Reserve Service IP Ranges For Dynamic and Static IP Allocation |
Service Internal Traffic Policy | KEP-2086 Service Internal Traffic Policy |
EndpointSliceTerminatingCondition | KEP-1672 Tracking Terminating Endpoints |
Resource Control and Coordination¶
KEP-3017 PodHealthyPolicy for PodDisruptionBudget PR#113375¶
Added spec.unhealthyPodEvictionPolicy field to PodDisruptionBudget resource to control when unhealthy Pods should be considered for eviction.
The spec.unhealthyPodEvictionPolicy field currently has two values that can be set - IfHealthyBudget and AlwaysAllow
spec:
minAvailable: 2
selector:
matchLabels:
app: zookeeper
unhealthyPodEvictionPolicy: IfHealthyBudget
Add Alpha Feature Gate —— PDBUnhealthyPodEvictionPolicy to control whether to enable this feature
KEP-3335 StatefulSet Start Ordinal PR#112744¶
StatefulSets currently number pods starting at 0.
This feature adds a spec.ordinals.start field to StatefulSet to control the starting number of Pods.
Add Alpha Feature Gate —— StatefulSetStartOrdinal to control whether to enable this feature
PR#112011 Stops working if multiple HPAs involve the same pod, And set HPA's ScalingAction Condition to False, Reason is AmbiguousSelector, this PR also includes multiple HPAs pointing to the same Deployment.
Function stability upgrade¶
Alpha -> Beta
Feature Gate | KEP |
---|---|
JobPodFailurePolicy | KEP-3329 Retriable and non-retriable Pod failures for Jobs |
PodDisruptionConditions | KEP-3329 Retriable and non-retriable Pod failures for Jobs |
Beta -> GA
Feature Gate | KEP |
---|---|
JobTrackingWithFinalizers | KEP-2307 Job tracking without lingering Pods |
Pod Scheduling¶
KEP-3521 Pod Scheduling Readiness¶
Related PR: PR#113275, PR#113274, PR#113442
Not all Pods in the Pending state are ready to be scheduled, and some Pods cannot be successfully scheduled due to the lack of necessary resources, which will also bring additional work to the scheduler.
This feature adds the spec.schedulingGates field in the Pod to control whether the Pod is ready for actual scheduling.
Scheduling will only start if spec.schedulingGates is cleared:
Added Alpha Feature Gate —— PodSchedulingReadiness to control whether to enable this feature
PR#111726 Output Pending status Pod information in the Scheduler's debug Dummer.
Optimization and BUG FIX¶
PR#111809 When using Patch to update Pod status, In addition to net.ConnectionRefused , retries are also performed on ServiceUnavailable and InternalError errors.
The ServiceUnavailable error occurs when the APIServer is temporarily unable to process a request.
E0805 20:54:21.624945 123623 scheduler.go:356] Error updating pod foo: the server is currently unable to handle the request (patch pods foo)
InternalError usually occurs due to a temporary failure of the webhook.
E0811 23:32:30.886582 213747 scheduler.go:357] Error updating pod foo: Internal error occurred: failed calling webhook "xyz": Post "xyz": context deadline exceeded
Function stability upgrade¶
Alpha -> Beta
Feature Gate | KEP |
---|---|
NodeInclusionPolicy | KEP-3094 Take taints/tolerations into consideration when calculating PodTopologySpread skew |
Observability¶
KEP-3466 Kubernetes Component Health SLIs¶
There was no standard format to expose the health information of Kubernetes components before. This feature adds a new path /metrics/slis to each component to expose the service level metric (ServiceLevelIndicator) in the Prometheus format.
Every component needs to expose two metrics:
-
gauge: the current health check status of the component
-
counter: Cumulative count of detected states for each health check
# HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthchecks. # TYPE kubernetes_healthchecks_total counter kubernetes_healthchecks_total{name="etcd", status="success", type="healthz"} 15 kubernetes_healthchecks_total{name="etcd", status="success", type="readyz"} 15
-
Kube APIServer: PR#112741
- Kube Controller Manager: PR#112978
- Kube Scheduler: PR#113026
- Kubelet: PR#113030
- Kube Proxy: PR#113057
- Cloud Controller Manager: PR#113340
Added Alpha Feature Gate - ComponentSLIs to control whether to enable this feature.
Some metrics have been added to each component, and some metrics calculation problems have been fixed.
Kubectl commands¶
Subcommand enhancements and fixes¶
PR#109525 kubectl wait command supports setting non-existent fields in -o jsonpath=, which can be useful when some fields are set asynchronously
PR#111096 kubectl api-resources adds categories column in -o wide output, and adds --categories parameter to support filtering based on categories.
PR#113819 kubectl alpha event moved to top-level command kubectl events.
PR#111093 Fixed kubectlrollouthistory --revision=
PR#111571 Optimize the prompt information of the kubectl label --dry-run command to prevent users from misunderstanding that the label has been set.
-before
```shell
$ kubectl label pod foo bar=baz --dry-run=server
pod/foo labeled
```
-after
```shell
$ kubectl label pod foo bar=baz --dry-run=server
pod/foo labeled (server dry run)
```
PR#112556 Optimize the error message when kubectl patch uses StrategicMerigePatch to update custom resources.
PR#112700 Fix kubectl covert choose wrong api version.
PR#109505 kubectl annotate no longer throws an error when setting an annotation with the same value as the original value.
PR#110907 When executing kubectl apply, if --namespace is specified, but --prune-allowlist is not specified, non-namespace resources will be deleted. The pr is just to print a warning. In 1.28, when kubectl apply specifies a namespace, it no longer deletes the resource pv & namespace without a namespace.
PR#113116 kubectl apply adds --prune-allowlist flag, used with --prune flag to replace the deprecated --prune-whitelist flag.
Other¶
PR#113146 The kubectl explain command can use OpenAPIv3 via the environment variable KUBECTL_EXPLAIN_OPENAPIV3.
PR#112553 Kubectl escapes terminal special characters in output. Fixes CVE-2021-25743.
PR#112150 Optimize kubectl's display of invalid requests returned by APIServer.
PR#112243, PR#112261 deprecated kubectl run command Multiple flags, even if set they will be ignored.
Shell Completion¶
PR#113636 kubectl shell completion supports showing command description in bash.
bash-5.1$ kubectl a[tab][tab]
alpha (Commands for features in alpha)
annotate (Update the annotations on a resource)
api-resources (Print the supported API resources on the server)
api-versions (Print the supported API versions on the server, in the form of "group/version")
apply (Apply a configuration to a resource by file name or stdin)
argo (The command argo is a plugin installed by the user)
attach (Attach to a running container)
auth (Inspect authorization)
autoscale (Auto-scale a deployment, replica set, stateful set, or replication controller)
bash-5.1$ kubectl --c[tab][tab]
--cache-dir (Default cache directory)
--certificate-authority (Path to a cert file for the certificate authority)
--client-certificate (Path to a client certificate file for TLS)
--client-key (Path to a client key file for TLS)
--cluster (The name of the kubeconfig cluster to use)
--context (The name of the kubeconfig context to use)
PR#105867 provides shell completion for the kubectl plugin, and the plugin can pass kube_complete-
Kubeadm¶
Command fixes and enhancements¶
PR#113005 kubeadm join phase control-plane-preapare certs supports running with --dry-run .
PR#112945 supports dry-run mode for sub-phases, e.g. kubeadm reset phase cleanup-node --dry-run .
PR#111512 A new p hase —— show-join-command has been added to the kubeadm init command. Users can pass kubeadm init --skip -phase=show-join-command to skip printing the join information, this phase cannot be executed alone.
PR#112172 The kubeadm reset command adds the --cleanup-tmp-dir flag, which will clean up the contents of /etc/kubernetes/tmp , The default is false.
PR#112732 kubeadm adds validation for container registry format in configuration.
PR#111783 When the CertificateAuthorityData of the kubeconfig read by kubeadm is empty, it will try to load the CA certificate from the external CertificateAuthority file.
PR#112508 Allow keys in RSA and ECDSA formats during preflight check.
PR#110972 kubeadm reset will try to clean up stale data when executing, and the old data will be cleared when each reset phase is executed , the default etcd data directory will be deleted when the remove-etcd-member phase is executed.
PR#112751 Fixed a bug when validating ClusterConfiguration network-related fields (dnsDomain, serviceSubnet, podSubnet).
Other¶
PR#111277 Optimize the error message when kubeadm runs subcommands.
PR#112008 Since the node-role.kubernetes.io/master taint is no longer set in the control plane node in 1.25, kubeadm is no longer for CoreDNSDeployment Set node-role.kubernetes.io/master toleration.
PR#112000 The kubeadm init|join|upgrade command removes the --container-runtime flag since there is only one flag since dockershim was removed A value that can be set with --container-runtime=remote.
Client-Go¶
PR#112200 The SharedInformerFactory of client-go adds a Shutdown method to wait for all running informers in the Factory to end.
REMOVE FUNCTIONALITY¶
The new version removes the GA feature Feature Gate:
-ServiceLoadBalancerClass - ServiceLBNodePortControl -CSRDuration - DefaultPodTopologySpread -NonPreemptingPriority - PodAffinityNamespaceSelector -PreferNominatedNode - PodOverhead - UnversionedKubeletConfigMap - Indexed Job -SuspendJob
Function downgrade¶
Beta -> Alpha
LocalStorageCapacityIsolationFSQuotaMonitoring was upgraded to Beta in v1.25, but was rolled back to Alpha due to the problem that the ConfigMap would not be properly synced to the Pod file system after the update.
Release History¶
-
Kubernetes 1.25 officially released, major breakthrough in many aspects
-
Kubernetes 1.23 is officially released, what are the enhancements?
-
Kubernetes 1.22 subverts your imagination: you can enable Swap, launch a PSP replacement, and...
-
Kubernetes 1.21 shock release | PSP will be abolished, BareMetal will be enhanced