Managing Kubernetes without losing your cool
DevOps Notts 29th March 2022
A presentation at DevOps Notts in March 2022 by Marcus Noble
Hi,
I’m Marcus Noble, a platform engineer at Giant Swarm.
I’m found around the web as AverageMarcus in most places and @Marcus_Noble_ on Twitter.
I have about 5 years' experience running Kubernetes in production environments.
I also like home automation, IoT and 3D printing.
My 10 tips for working with Kubernetes
#1 → #5 Anyone can start using these today
#6 → #7 Good to know a little old-skool ops first
#8 → #10 Good to have some programming knowledge
OK, this one is kinda tongue in cheek but worth mentioning. If you have dozens or hundreds of clusters on top of other development work, you’re going to be stretched thin. Getting someone else to manage things while you focus on what makes your business money can often be the right choice.
Create your own workflow for tasks you perform often. Avoid typos and “fat fingering” by replacing long, complex commands with short aliases (bonus points for adding help text to remind you later).
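As a rough sketch of the "short alias with help text" idea, a shell function can wrap a long command and explain itself when asked. The name kpods is made up for this example, not something from the talk.

```shell
# Hypothetical helper (the name "kpods" is an assumption for illustration).
# It wraps a longer kubectl command and carries its own help text.
kpods() {
  if [ "${1:-}" = "-h" ] || [ "${1:-}" = "--help" ]; then
    echo "kpods: list pods in all namespaces (wraps 'kubectl get pods -A')"
    return 0
  fi
  # Any extra arguments (e.g. -o wide) are passed straight through.
  kubectl get pods -A "$@"
}
```

Dropping something like this into your shell profile means kpods --help reminds you what it does months later.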
k get pods -A
k explain pods.spec.containers
KIND: Pod
VERSION: v1
RESOURCE: containers <[]Object>
DESCRIPTION:
Save time by only typing k.
Use kubectl explain to dig into resources and their properties (useful when you can’t access the official docs, or when you know exactly what you’re looking for).
kubeswitch is my fave as it supports a directory of kubeconfigs to make organising easier.
k9s is an interactive terminal UI. It supports all resource types and actions, with lots of keybindings and similar to quickly work with a cluster. Find, view, edit, port forward, view logs, delete, etc.
Plugins can be in any language.
You can easily add your own by creating Bash scripts with a kubectl- prefixed name.
Note: autocomplete is a bit trickier here. Some plugins support it, but generally expect your tab completion to only recommend core kubectl features.
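A minimal sketch of a home-made plugin, assuming a throwaway directory on your PATH. The plugin name hello and its message are invented for the example.

```shell
# Create a scratch directory to hold the plugin (a real setup would more
# likely use something like ~/bin that is already on your PATH).
plugin_dir="$(mktemp -d)"

# kubectl treats any executable named kubectl-<name> on the PATH as a plugin.
cat > "$plugin_dir/kubectl-hello" <<'EOF'
#!/usr/bin/env bash
# Hypothetical plugin body: print a greeting.
echo "Hello from a kubectl plugin"
EOF
chmod +x "$plugin_dir/kubectl-hello"

# Make the plugin discoverable.
export PATH="$plugin_dir:$PATH"

# With kubectl installed you could now run:
#   kubectl hello
#   kubectl plugin list
```

The script can also be run directly as kubectl-hello, which is handy while developing it.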
My 10 tips for working with Kubernetes
#1 → #5 Anyone can start using these today - Done
#6 → #7 Good to know a little old-skool ops first
#8 → #10 Good to have some programming knowledge
Not so scary so far, right? Now on to a little more hands-on techniques.
Launch a temporary pod running a bash shell for cluster debugging
alias kshell='kubectl run -it --image bash --restart Never --rm shell'
Need more tools? Replace bash with ubuntu
Great for more general debugging of a cluster, especially with networking issues or similar.
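If you find yourself swapping images often, one option is a function instead of an alias, so the image becomes an argument. This is a sketch rather than what the talk used: the function reuses the kshell name and the same kubectl run flags, and the optional image argument is an addition.

```shell
# Function variant of the kshell alias (the image argument is an assumption
# added for illustration; it defaults to bash as in the alias).
kshell() {
  local image="${1:-bash}"
  kubectl run -it --image "$image" --restart Never --rm shell
}

# Usage:
#   kshell          # temporary pod from the bash image
#   kshell ubuntu   # temporary pod from the ubuntu image
```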
# kshell
If you don't see a command prompt, try pressing enter.
bash-5.1# nslookup google.com
Server: 1.1.1.1
Address: 1.1.1.1:53
Non-authoritative answer:
Name: google.com
Address: 142.250.187.206
Debugging a running pod - kubectl exec
# kubectl exec my-broken-pod -it -- sh
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec……
Debugging a running pod - kubectl debug (Requires Kubernetes 1.23)
# kubectl debug -it --image bash my-broken-pod
Defaulting debug container name to debugger-gprmk.
If you don't see a command prompt, try pressing enter.
bash-5.1#
kubectl exec is great for debugging misconfigured pods that aren’t crashing and have enough OS to exec into.
But…
If the pod is CrashLooping you’ll get kicked out of the session when it crashes.
If the pod doesn’t have a shell you can exec into (e.g. a container that only has a Golang binary) you won’t be able to exec into it.
kubectl debug is great for pods that either don’t have any OS or are CrashLooping.
Example - investigate a CrashLooping pod
# kubectl run debug-demo --image=bash -- exit 1
# kubectl get pods debug-demo
NAME         READY   STATUS             RESTARTS      AGE
debug-demo   0/1     CrashLoopBackOff   2 (20s ago)   44s
(This will prevent us from using kubectl exec to get into the pod)
# kubectl debug -it --image bash debug-demo
Defaulting debug container name to debugger-5mkjj.
If you don't see a command prompt, try pressing enter.
bash-5.1#
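kubectl debug has a few different modes:
kubectl debug (adds an ephemeral debug container to a running pod)
kubectl debug --copy-to (creates a copy of the pod to debug)
kubectl debug node/my-node (launches a debug pod on the node itself)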
This has some limitations
When to use what:
Multiple workloads experiencing network issues - kshell
Workload not running as expected but not CrashLooping and isn’t a stripped down image (e.g. not Scratch / Distroless) - kubectl exec
Workload not running as expected but not CrashLooping and has an image based on Scratch / Distroless or similar - kubectl debug
Workload is CrashLooping - kubectl debug
sh -c "$(curl -sSL https://raw.githubusercontent.com/AverageMarcus/kube-ssh/master/ssh.sh)"
[0] - ip-10-18-21-146.eu-west-1.compute.internal
[1] - ip-10-18-21-234.eu-west-1.compute.internal
[2] - ip-10-18-21-96.eu-west-1.compute.internal
Which node would you like to connect to? 1
If you don't see a command prompt, try pressing enter.
[root@ip-10-18-21-234 ~]#
Why? - I prefer to use ephemeral instances with the minimum needed to run Kubernetes: no sshd, no port 22 open, etc. But there are times when you just need to check what’s actually going on with the underlying host machine.
Always verify a shell script before you run it! Ideally, download it first and run that instead.
Why? Smaller potential attack surface. Less chance of “hotfixes” or “tweaks” being forgotten about.
#1 → #5 Anyone can start using these today - Done
#6 → #7 Good to know a little old-skool ops first - Done
#8 → #10 Good to have some programming knowledge
Two types of webhooks: validating and mutating.
Implement more advanced access control than is possible with RBAC.
Add default labels to resources as they’re created.
Enforce policies such as not using latest as an image tag or ensuring all workloads have resource requests/limits specified.
“Hotfix” for security issues (e.g. mutating all pods to include a LOG4J_FORMAT_MSG_NO_LOOKUPS env var to prevent Log4Shell exploit).
Allows for subtractive access control (take away a user’s ability to perform a certain action against a certain resource) - something not possible with RBAC.
See blog post about how we avoided a nasty bug in our CLI tool with a ValidatingWebhook. https://www.giantswarm.io/blog/restricting-cluster-admin-permissions
Notes:
Webhooks can break a cluster. Make sure your service is resilient and that your webhooks don’t block critical workloads.
Webhooks can be backed either by services within the cluster or by a URL outside of the cluster.
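To make the "don't block critical workloads" note a little more concrete, here is a hedged sketch that writes out a ValidatingWebhookConfiguration manifest. Every name in it (example-policy, policy.example.com, the example-webhook service, the /validate path) is a placeholder, and the right failure policy and namespace exclusions depend on what your webhook enforces.

```shell
# Write an illustrative manifest to a temporary file.
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-policy            # hypothetical name
webhooks:
  - name: policy.example.com      # hypothetical name
    admissionReviewVersions: ["v1"]
    sideEffects: None
    # Ignore = requests are still allowed if the webhook service is down.
    failurePolicy: Ignore
    timeoutSeconds: 5
    # Skip kube-system so the webhook cannot block critical workloads.
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system"]
    clientConfig:
      service:                    # an in-cluster service (placeholder)
        name: example-webhook
        namespace: example
        path: /validate
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
EOF

# To check it against a cluster you could then run:
#   kubectl apply --dry-run=server -f "$manifest"
```

For a webhook that enforces security policy you may prefer failurePolicy: Fail, in which case resilience of the backing service matters even more.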
All Kubernetes operations are done via the API - kubectl uses it, in-cluster controllers use it, the scheduler uses it and you can use it too!
Currently using OpenAPI V2 (OpenAPI V3 available as an alpha feature in v1.23)
The API can be extended either by Custom Resource Definitions (CRDs) or by implementing an Aggregation Layer (such as what metrics-server implements).
You can easily try out the API using kubectl with the --raw argument.
# kubectl get --raw /api/v1/namespaces/default/pods
{"kind":"PodList","apiVersion":"v1","metadata":{"selfLink":...
If no host is provided kubectl will use the API of the current context.
HTTP Method to Kubectl command mappings:
GET - kubectl get --raw
POST - kubectl create --raw
DELETE - kubectl delete --raw
PUT - kubectl replace --raw
To target another cluster not set as your current kubeconfig context you can specify the full URL of the endpoint.
Be aware that not all kubectl commands map to a single API call. Lots do several API calls under the hood.
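The method mapping can be sketched as a small wrapper. The function name kraw is invented for this example; it only dispatches to the kubectl subcommands listed in the mapping, and it assumes the request body for POST and PUT comes from a file passed with -f.

```shell
# Hypothetical wrapper: kraw METHOD PATH [BODY_FILE]
# Maps an HTTP method onto the matching "kubectl ... --raw" command.
kraw() {
  local method="$1" path="$2" file="${3:-}"
  case "$method" in
    GET)    kubectl get --raw "$path" ;;
    DELETE) kubectl delete --raw "$path" ;;
    POST)   kubectl create --raw "$path" -f "$file" ;;
    PUT)    kubectl replace --raw "$path" -f "$file" ;;
    *)
      echo "unsupported method: $method" >&2
      return 1
      ;;
  esac
}

# Usage:
#   kraw GET /api/v1/namespaces/default/pods
#   kraw POST /api/v1/namespaces/default/configmaps configmap.json
```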
Not sure what APIs are available?
# kubectl api-resources
NAME
bindings
componentstatuses
configmaps
endpoints
deployments
API Endpoint format:
/{API_VERSION}/namespaces/{NAMESPACE_NAME}/{RESOURCE_KIND}/{NAME}
If API_VERSION is just v1 the endpoint starts with /api/v1/
E.g. /api/v1/componentstatuses
The “core” API is accessible on /api/v1
Otherwise, the endpoint starts with /apis/{API_VERSION}/ (Note the extra ‘s’)
E.g. /apis/apps/v1/
APIs added to Kubernetes in later versions are available on the /apis endpoint. This includes the built-in ones like Deployments as well as Custom Resources.
The NAMESPACED column indicates if the resource is bound to a namespace.
If false: /api/v1/componentstatuses
If true: /apis/apps/v1/namespaces/default/deployments
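The endpoint rules can be sketched as a small path builder. The function name api_path and its argument order are assumptions for illustration. The path segment is the plural namespaces, and the resource is the plural name shown by kubectl api-resources.

```shell
# Hypothetical helper: api_path API_VERSION RESOURCE [NAMESPACE] [NAME]
# Builds the REST path for a resource following the rules above.
api_path() {
  local api_version="$1" resource="$2" namespace="${3:-}" name="${4:-}"
  local path

  # The core API lives under /api/v1, everything else under /apis/<group>/<version>.
  if [ "$api_version" = "v1" ]; then
    path="/api/v1"
  else
    path="/apis/$api_version"
  fi

  # Namespaced resources include the namespace segment.
  if [ -n "$namespace" ]; then
    path="$path/namespaces/$namespace"
  fi

  path="$path/$resource"

  # An optional name targets a single object.
  if [ -n "$name" ]; then
    path="$path/$name"
  fi

  echo "$path"
}

# Usage:
#   kubectl get --raw "$(api_path apps/v1 deployments default)"
```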
Resources:
Where is this useful?
Make use of one of the many client libraries available rather than interacting with the REST endpoint directly. Plenty more official clients available at https://github.com/kubernetes-client
Extend Kubernetes’ built-in API and functionality with your own Custom Resource Definitions (CRDs) and business logic (operators).
Frameworks
References
This topic is too large to cover within this talk, and there are already plenty of better resources available.
Kubebuilder tends to be the most popular framework and is used by all of the cluster-api projects.
#1 → #5 Anyone can start using these today - Done
#6 → #7 Good to know a little old-skool ops first - Done
#8 → #10 Good to have some programming knowledge - Done
#1 - Love your terminal - Shell aliases and helpers
#2 - Learn to love kubectl - Alias k, kubectl explain
#3 - Multiple kubeconfigs - Kubeswitch
#4 - k9s - Interactively work with clusters
#5 - Kubectl plugins - Krew. Build your own with bash. kubectl- prefixed name
#6 - kshell / kubectl debug - Pod debugging
#7 - kube-ssh - Node debugging
#8 - Webhooks - Validating and mutating requests to the Kubernetes API
#9 - Kubernetes API - Working directly with the API to build our own logic
#10 - CRDs & Controllers - Extending Kubernetes with our own resources and logic
🧡