I just noticed this on my Kubernetes dashboard:
CPU requests (cores) 0.66 (16.50%)
CPU limits (cores) 4.7 (117.50%)
I am quite confused as to why the limits add up to 117.50%. Is one of my services using too much? But wouldn't that show up in the requests? Looking at the output of kubectl describe node, I don't see any service using more than 2% (there are 43 of them, so at most 86% in total).
Thank you.
My approximate understanding is that Kubernetes lets you overcommit (that is, have resource limits on a particular node that add up to more than the node's capacity) so that you can be a little more efficient with your resource use.
For instance, suppose you're running deployments A and B, each of which needs only 100 MB of memory when idle (200 MB total) but 1 GB of memory while actively processing a request. You could give each of them its own node with 1 GB of memory available. You could also put both on a single node with 1.5 GB of memory, assuming that A and B won't have to process traffic simultaneously, which saves you from allocating 2 GB for the pair.
This might be especially reasonable if you're using lots of microservices: you might even know that B can't process data until A has completed a request anyway, which gives you a stronger guarantee that their peaks won't overlap and cause problems.
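The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the names (`fits`, `NODE_MB`, and so on) are made up for this answer, and 1 GB is treated as 1000 MB to keep the numbers round.

```python
# Memory figures from the A/B example, in MB (1 GB treated as 1000 MB).
IDLE_MB = 100
BUSY_MB = 1000
NODE_MB = 1500  # the single shared 1.5 GB node


def fits(node_mb, usages_mb):
    """Return True if the combined usage fits in the node's memory."""
    return sum(usages_mb) <= node_mb


both_idle = fits(NODE_MB, [IDLE_MB, IDLE_MB])  # 200 MB needed
one_busy = fits(NODE_MB, [BUSY_MB, IDLE_MB])   # 1100 MB needed
both_busy = fits(NODE_MB, [BUSY_MB, BUSY_MB])  # 2000 MB needed

print(both_idle, one_busy, both_busy)  # prints: True True False
```

The shared node only runs into trouble in the last case, which is exactly the situation the "A and B won't be busy at the same time" assumption is meant to rule out.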
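Whether a pod can contribute to overcommitment depends on its quality of service (QoS) class, which Kubernetes derives from the requests and limits you set on its containers. For instance, you won't get overcommitment from pods in the Guaranteed QoS class, because their requests equal their limits, but you may see it with Burstable pods (limits higher than requests) or with BestEffort pods, the class a pod gets when you set no requests or limits at all.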
You can read more about QoS classes in the Kubernetes documentation.
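As a rough sketch of how a pod ends up in one class or another, here is a simplified Python version of the rules as I understand them from the documentation. The function name `qos_class` and the dict shape for containers are hypothetical, chosen for this example; it compares quantities as written (so "500m" and "0.5" would count as different) and only looks at CPU and memory.

```python
def qos_class(containers):
    """Simplified sketch of deriving a pod's QoS class from its containers.

    Each container is a dict with optional "requests" and "limits" dicts
    keyed by "cpu" and "memory" (a hypothetical shape for this example).
    """
    # No requests or limits anywhere in the pod: BestEffort.
    if all(not c.get("requests") and not c.get("limits") for c in containers):
        return "BestEffort"
    for c in containers:
        limits = c.get("limits", {})
        # A missing request defaults to the limit for that resource.
        requests = {**limits, **c.get("requests", {})}
        for resource in ("cpu", "memory"):
            # Guaranteed needs a limit for both resources, equal to the request.
            if resource not in limits or requests[resource] != limits[resource]:
                return "Burstable"
    return "Guaranteed"


no_settings = qos_class([{}])
burstable = qos_class([{"requests": {"cpu": "100m"}, "limits": {"cpu": "500m"}}])
guaranteed = qos_class([{
    "requests": {"cpu": "500m", "memory": "1Gi"},
    "limits": {"cpu": "500m", "memory": "1Gi"},
}])
print(no_settings, burstable, guaranteed)  # prints: BestEffort Burstable Guaranteed
```

Only the Burstable pod has a limit above its request, so it is the one that adds to the limits total without adding the same amount to the requests total.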
Limits (of all things) are allowed to overcommit the resources of the node. Requests cannot, so their total should never exceed 100% of what the node has available. Basically the idea is that a "request" is a minimum requirement, while a "limit" is a maximum burst range, and it's not very likely that everyone will burst at once. If that is likely for you, you should set your requests and limits to the same value.
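To see where the dashboard percentages come from, here is a small Python check. It assumes the node has 4 allocatable cores, which is not stated in the question but is what both figures imply (0.66 cores being 16.50% and 4.7 cores being 117.50%); `percent_of_node` is just a name made up for this sketch.

```python
# Assumption: 4 allocatable cores, inferred from the dashboard figures.
ALLOCATABLE_CORES = 4.0


def percent_of_node(total_cores, allocatable=ALLOCATABLE_CORES):
    """Express a summed CPU quantity as a percentage of the node's cores."""
    return round(100 * total_cores / allocatable, 2)


requests_pct = percent_of_node(0.66)  # sum of all CPU requests on the node
limits_pct = percent_of_node(4.7)     # sum of all CPU limits on the node

print(requests_pct, limits_pct)  # prints: 16.5 117.5
```

So 117.50% does not mean anything is using more than the node has. It only means that the limits, added together, come to more than 4 cores, which is allowed.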