Resource Requests vs Limits

Kubernetes uses requests and limits to control resource allocation. A request is the amount of a resource a container is guaranteed to receive; a limit is the maximum amount it is allowed to consume, and a container that reaches its limit is restricted. When a container specifies a request, Kubernetes will only schedule it on a node that has those resources available. Requests and limits are defined in the standard YAML configuration of your containers. The two main resource types in Kubernetes are CPU and memory: CPU is measured in core units, and memory is measured in bytes.

CPU

CPU resources are measured in millicores. If a node has 2 cores, its CPU capacity is represented as 2000m. The unit suffix m stands for “thousandth of a core.”

1000m, or 1000 millicores, equals 1 core, and 4000m represents 4 cores. A request of 250m per pod means four such pods can share a single core; on a 4-core node, 16 pods requesting 250m each can run.
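To see how these units look in a manifest, here is a minimal sketch of a container’s CPU request; the container name and image are placeholders, and the fractional form 0.25 is simply an equivalent way of writing 250m:

containers:
  - name: app           # placeholder name
    image: nginx        # placeholder image
    resources:
      requests:
        cpu: "250m"     # a quarter of a core
        # cpu: "0.25"   # equivalent fractional notation for the same request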

Unless an application genuinely needs multi-core processing, such as a multi-threaded database, the best practice is to define CPU at 1000m or below and run more replicas to scale the application out. It is also important to note that a pod will never be scheduled if it is defined with more CPU than the node’s capacity; a pod cannot be defined with 3000m on a 2-core node.
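As a rough sketch of that pattern (the Deployment name, labels, and image below are placeholders, not part of the later example), you keep each container at or under one core and raise the replica count to scale out:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp              # placeholder name
spec:
  replicas: 4               # scale out with more replicas rather than more cores per pod
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: web
          image: nginx      # placeholder image
          resources:
            requests:
              cpu: "250m"
            limits:
              cpu: "1000m"  # stay at or below one core per container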

Keep in mind that CPU is a compressible resource. In simple terms, an application is throttled once it hits its CPU limit. Throttling can adversely affect performance by making the application run slower, but Kubernetes will not terminate it. Take this into consideration as you architect your applications.

Memory

Memory is measured in bytes. However, you can express memory with various suffixes, either decimal (E, P, T, G, M, k) or binary (Ei, Pi, Ti, Gi, Mi, Ki), covering anything from kilobytes up to exbibytes. Most people simply use Mi (mebibytes).
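As an illustrative sketch of the notation (the values themselves are arbitrary), the following lines describe roughly the same amount of memory; the binary Mi form is the one you will usually see:

resources:
  requests:
    memory: "64Mi"          # 64 mebibytes (64 * 1024 * 1024 = 67,108,864 bytes)
    # memory: "67108864"    # the same amount written as plain bytes
    # memory: "67M"         # roughly the same amount with the decimal suffix (67,000,000 bytes)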

As with CPU, a pod will never be scheduled if it requires more memory than a node can provide. Unlike CPU, memory is not compressible: you can’t make it run slower the way CPU or network traffic can be throttled. If a container reaches its memory limit, it is terminated.
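If you want to confirm that a container was killed for exceeding its memory limit, the pod’s status records the reason (for example, via kubectl describe pod or kubectl get pod -o yaml). A trimmed, illustrative fragment of that status might look like this:

status:
  containerStatuses:
    - name: db                 # the container that exceeded its limit
      restartCount: 1
      lastState:
        terminated:
          reason: OOMKilled    # terminated by the out-of-memory killer
          exitCode: 137        # 128 + SIGKILL (9)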

Example Configuration for Defining Resources

Since Kubernetes applies no resource defaults on its own, we need to define the resources in YAML format. A configuration may look like this:

apiVersion: v1
kind: Pod
metadata:
  name: wordpressapp
spec:
  containers:
    - name: db
      image: mysql
      resources:
        requests:
          memory: "64Mi"
          cpu: "250m"
        limits:
          memory: "512Mi"
          cpu: "1000m"
    - name: wp
      image: wordpress
      resources:
        requests:
          memory: "64Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"

Here, we have an example LAMP-stack WordPress application with a MySQL DB container and the WordPress image. Both the db and wp containers are provisioned with 64Mi (mebibytes) of RAM and 250 millicores of CPU (a quarter of a core). These settings are defined in the requests block.

requests:
  memory: "64Mi"
  cpu: "250m"

In terms of limits, the WordPress container is capped at 256Mi of memory with a 500m CPU limit. The database has double the memory limit at 512Mi and a full core with a 1000m setting. Since database applications tend to be more resource intensive, we can start testing with a higher estimate.

limits:
  memory: "512Mi"
  cpu: "1000m"

If you feel the database requires more resources, you can increase the limits to 1Gi (1024Mi) of memory and 2 CPU cores.

limits:
  memory: "1024Mi"
  cpu: "2000m"

Here are some important things to remember. A limit can never be lower than the request; Kubernetes will error out if you attempt to configure one that way. And if a container’s request is higher than a node’s capacity, Kubernetes will never schedule that container. For example, if your application is specified to use 3.5 cores on a 2-core node, it will never be deployed.
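As a quick sketch of the first rule (the pod name and image are placeholders), the following spec would be rejected by the API server because its CPU limit is lower than its request:

apiVersion: v1
kind: Pod
metadata:
  name: bad-limits          # placeholder name
spec:
  containers:
    - name: app
      image: nginx          # placeholder image
      resources:
        requests:
          cpu: "500m"
        limits:
          cpu: "250m"       # invalid: lower than the request, so Kubernetes rejects this spec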