Practice of Fine-grained Cgroups Resources Scheduling in Kubernetes

Qingcan Wang, Xianlu Chen at KubeCon + CloudNativeCon North America 2020

Alibaba supports resource scheduling for hundreds of thousands of nodes, millions of containers, and tens of thousands of applications. Many online services need to raise their resource limits dynamically while running and cannot tolerate the impact of a restart. Other applications require NUMA awareness and CPU core binding to reduce data copying between CPU caches and speed up data processing tasks. We have developed a combined scheduling system based on the Kubernetes Scheduler framework and a cgroups controller. The scheduler is aware of cgroups-level resources such as NUMA nodes, CPU cores, and memory limits. It uses them to dynamically schedule Pods onto suitable nodes, and it allows certain Pods to be bound to specified CPU cores. The cgroups controller can also dynamically adjust a Pod's resource limits without causing the Pod to restart.
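
As a rough illustration of what adjusting limits without a restart can look like at the cgroups level, the sketch below writes a new memory limit and a CPU core binding into a container's cgroup directory. It is not the system described in the talk. It assumes the cgroup v1 control files memory.limit_in_bytes and cpuset.cpus, and the helper names (format_cpuset, set_memory_limit, bind_cpus) and the directory arguments are hypothetical, chosen only for this example.

```python
import os
import tempfile


def format_cpuset(cores):
    """Render core ids in the cpuset list format, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    ranges = []
    start = prev = None
    for core in sorted(set(cores)):
        if start is None:
            start = prev = core
        elif core == prev + 1:
            prev = core  # extend the current contiguous run
        else:
            ranges.append((start, prev))
            start = prev = core
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def set_memory_limit(memory_cgroup_dir, limit_bytes):
    """Hypothetical helper: write a new limit into the (assumed) cgroup v1 file.

    The kernel applies the new value to the running processes in the cgroup,
    so the container does not need to be restarted.
    """
    path = os.path.join(memory_cgroup_dir, "memory.limit_in_bytes")
    with open(path, "w") as f:
        f.write(str(limit_bytes))


def bind_cpus(cpuset_cgroup_dir, cores):
    """Hypothetical helper: restrict the cgroup to the given CPU cores."""
    path = os.path.join(cpuset_cgroup_dir, "cpuset.cpus")
    with open(path, "w") as f:
        f.write(format_cpuset(cores))


if __name__ == "__main__":
    # Demo against a temporary directory standing in for a real cgroup path,
    # so the sketch runs without root privileges.
    demo_dir = tempfile.mkdtemp()
    set_memory_limit(demo_dir, 2 * 1024 ** 3)
    bind_cpus(demo_dir, [0, 1, 2, 5])
    for name in ("memory.limit_in_bytes", "cpuset.cpus"):
        with open(os.path.join(demo_dir, name)) as f:
            print(name, "=", f.read())
```

On a real node the directories would be the Pod's cgroup paths under the memory and cpuset hierarchies, and a controller would also need to keep the Pod's recorded limits consistent with what it writes. Both concerns are outside this sketch.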