Linux user namespace management with CRI-O in Kubernetes
In this blog post I will introduce user namespaces, then show you how to use them in Kubernetes.
Parts of the K8S Security Lab series
Container Runtime Security
- Part1: How to deploy CRI-O with Firecracker?
- Part2: How to deploy CRI-O with gVisor?
- Part3: How to deploy containerd with Firecracker?
- Part4: How to deploy containerd with gVisor?
- Part5: How to deploy containerd with kata containers?
Advanced Kernel Security
- Part1: Hardening Kubernetes with seccomp
- Part2: Linux user namespace management with CRI-O in Kubernetes
- Part3: Hardening Kubernetes with seccomp
Network Security
- Part1: RKE2 Install With Calico
- Part2: RKE2 Install With Cilium
- Part3: CNI-Genie: network separation with multiple CNI
- Part3: Configure network with nmstate operator
- Part3: Kubernetes Network Policy
- Part4: Kubernetes with external Ingress Controller with vxlan
- Part4: Kubernetes with external Ingress Controller with bgp
- Part4: Central authentication with oauth2-proxy
- Part5: Secure your applications with Pomerium Ingress Controller
- Part6: CrowdSec Intrusion Detection System (IDS) for Kubernetes
- Part7: Kubernetes audit logs and Falco
Secure Kubernetes Install
- Part1: Best Practices to keeping Kubernetes Clusters Secure
- Part2: Kubernetes Secure Install
- Part3: Kubernetes Hardening Guide with CIS 1.6 Benchmark
- Part4: Kubernetes Certificate Rotation
User Security
- Part1: How to create kubeconfig?
- Part2: How to create Users in Kubernetes the right way?
- Part3: Kubernetes Single Sign-on with Pinniped OpenID Connect
- Part4: Kubectl authentication with Kuberos (Deprecated!)
- Part5: Kubernetes authentication with Keycloak and gangway (Deprecated!)
- Part6: kube-openid-connect 1.0 (Deprecated!)
Image Security
Pod Security
- Part1: Using Admission Controllers
- Part2: RKE2 Pod Security Policy
- Part3: Kubernetes Pod Security Admission
- Part4: Kubernetes: How to migrate Pod Security Policy to Pod Security Admission?
- Part5: Pod Security Standards using Kyverno
- Part6: Kubernetes Cluster Policy with Kyverno
Secret Security
- Part1: Kubernetes and Vault integration
- Part2: Kubernetes External Vault integration
- Part3: ArgoCD and kubeseal to encrypt secrets
- Part4: Flux2 and kubeseal to encrypt secrets
- Part5: Flux2 and Mozilla SOPS to encrypt secrets
Monitoring and Observability
- Part6: K8S Logging And Monitoring
- Part7: Install Grafana Loki with Helm3
Backup
What are user namespaces?
As we discussed in a previous post, container engines use the Linux kernel’s namespaces to isolate containers. For example, two containers in different network namespaces will not see each other’s network interfaces. Two containers in different PID namespaces will not see each other’s processes.
On Linux, all files and all processes are owned by a specific user ID and group ID, usually defined in /etc/passwd and /etc/group. User namespaces isolate user IDs and group IDs from each other. With user namespaces, the container engine can let a container see only a subset of the host’s user IDs and group IDs.
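To make the idea of an ID mapping concrete: the kernel exposes the mapping of every process in /proc/<pid>/uid_map. The following is a minimal sketch, assuming a Linux host with /proc mounted; show_uid_map is a helper name I made up for illustration, not an existing tool.

```shell
#!/bin/sh
# show_uid_map is a hypothetical helper name; /proc/<pid>/uid_map is the
# kernel interface. Each line of that file has three fields:
#   <first UID inside the namespace> <first UID outside> <number of IDs>
show_uid_map() {
    pid=${1:-self}
    if [ -r "/proc/${pid}/uid_map" ]; then
        while read -r inside outside count; do
            echo "UID ${inside} inside maps to UID ${outside} outside (${count} IDs)"
        done < "/proc/${pid}/uid_map"
    else
        echo "cannot read /proc/${pid}/uid_map"
    fi
}

show_uid_map    # mapping of the current shell
```

On a host without any user namespace in play, I would expect a single line that maps UID 0 to UID 0 over the whole ID space. Pass a PID as the first argument to inspect another process.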
Why this is important
By default, a container shares the user namespace of the host. So if I use the root user (user ID 0) in a container, it is the same ID that the root user has on the host. This means that an unprivileged user who is allowed to run containers can exploit this loophole to make changes on the host without having sudo privileges there:
$ docker run -v /etc/:/etc/ -ti ubuntu
root@6803a66e58d0:/# passwd
New password:
Retype new password:
passwd: password updated successfully
root@6803a66e58d0:/# exit
$ su -
Password:
Hello, DevOpsTales! You are a sysadmin now
#
The solution: rootless mode
The first container engine that could be used in rootless mode was podman. It uses /etc/subuid and /etc/subgid to run containers in rootless mode. Normally a user or group has only one ID, but with subuid and subgid you can allocate a range of IDs to a user or group.
====================================================================
User Specification
====================================================================
# The following commands allocate the UIDs and GIDs from 100000 to 165535 to the podman user and group, respectively.
$ sudo touch /etc/{subgid,subuid}
$ sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 ${USER}
$ grep ${USER} /etc/subuid /etc/subgid
/etc/subuid:${USER}:100000:65536
/etc/subgid:${USER}:100000:65536
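Each entry in these files has the form name:start:count, so the last ID of the range is start + count - 1. Here is a small sketch of that arithmetic; subid_range is a hypothetical helper of mine, not part of shadow-utils, and the user name podman is only an example.

```shell
#!/bin/sh
# subid_range is a hypothetical helper, not a shadow-utils command.
# It turns one /etc/subuid or /etc/subgid entry (name:start:count)
# into the inclusive ID range that the entry grants.
subid_range() {
    entry=$1
    name=${entry%%:*}     # text before the first colon
    rest=${entry#*:}      # "start:count"
    start=${rest%%:*}
    count=${rest#*:}
    echo "${name}: ${start}-$(( start + count - 1 ))"
}

subid_range "podman:100000:65536"    # prints "podman: 100000-165535"
```

This is why an entry of 100000:65536 covers exactly the IDs 100000 to 165535 requested with usermod above.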
Now both podman and docker can be installed in rootless mode. The only problem with rootless mode is that Kubernetes cannot use it.
Kubernetes user namespace management with CRI-O
To solve this problem CRI-O added support for user namespace configuration through pod annotations.
VERSION=1.25
sudo curl -L -o /etc/yum.repos.d/devel_kubic_libcontainers_stable.repo https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable/CentOS_8/devel:kubic:libcontainers:stable.repo
sudo curl -L -o /etc/yum.repos.d/devel_kubic_libcontainers_stable_cri-o_${VERSION}.repo https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/${VERSION}/CentOS_8/devel:kubic:libcontainers:stable:cri-o:${VERSION}.repo
yum install cri-o nano wget
cat <<'EOF' | sudo tee /etc/modules-load.d/crio.conf > /dev/null
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
free -h
swapoff -a
sed -i.bak -r 's/(.+ swap .+)/#\1/' /etc/fstab
free -h
# Configure user namespacing in CRI-O
mkdir -p /etc/crio/crio.conf.d/
cat <<EOF > /etc/crio/crio.conf.d/01-userns-workload.conf
[crio.runtime.workloads.userns]
activation_annotation = "io.kubernetes.cri-o.userns-mode"
allowed_annotations = ["io.kubernetes.cri-o.userns-mode"]
EOF
nano /etc/containers/registries.conf
unqualified-search-registries = ["registry.access.redhat.com", "registry.redhat.io", "quay.io", "docker.io"]
sed -i.bak -r 's/network_backend = "cni"/#network_backend = ""/' /usr/share/containers/containers.conf
Change the pod subnet to 10.244.0.0/16 in the CRI-O bridge config /etc/cni/net.d/100-crio-bridge.conf:
nano /etc/cni/net.d/100-crio-bridge.conf
{
    "cniVersion": "0.3.1",
    "name": "crio",
    "type": "bridge",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "hairpinMode": true,
    "ipam": {
        "type": "host-local",
        "routes": [
            { "dst": "0.0.0.0/0" },
            { "dst": "1100:200::1/24" }
        ],
        "ranges": [
            [{ "subnet": "10.244.0.0/16" }]
        ]
    }
}
CRI-O takes the ID mappings for namespaced containers from the containers user, so I need to add this user to /etc/subuid and /etc/subgid on the nodes.
SubUID/GIDs are a range of user/group IDs that a user is allowed to use.
echo "containers:200000:268435456" >> /etc/subuid
echo "containers:200000:268435456" >> /etc/subgid
At first I created the ID ranges for the root user, because CRI-O runs as root, but then I found the following error in the CRI-O log:
Cannot find mappings for user \"containers\": No subuid ranges found for user \"containers\" in /etc/subuid"
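To catch this before CRI-O complains, you can check that the containers user really has a range in both files. This is only a sketch: has_subid_range is a hypothetical helper I wrote for this post, and it does nothing more than grep for an entry of the form name:start:count.

```shell
#!/bin/sh
# has_subid_range is a hypothetical helper: it succeeds when FILE
# contains at least one "USER:start:count" entry for USER.
has_subid_range() {
    user=$1
    file=$2
    grep -q "^${user}:[0-9][0-9]*:[0-9][0-9]*\$" "$file" 2>/dev/null
}

# Check both files for the user that CRI-O looks up.
for f in /etc/subuid /etc/subgid; do
    if has_subid_range containers "$f"; then
        echo "OK: containers has a range in $f"
    else
        echo "MISSING: no range for containers in $f"
    fi
done
```

If either line reports MISSING, add the entry with the echo commands shown above and restart CRI-O.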
Install Kubernetes
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
CRIP_VERSION=$(crio --version | egrep ^Version | awk '{print $2}')
yum install kubelet-$CRIP_VERSION kubeadm-$CRIP_VERSION kubectl-$CRIP_VERSION cri-tools iproute-tc -y
IP=192.168.200.10
mkdir -p /var/lib/kubelet/
# --node-ip for multi interface configuration
cat <<EOF > /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--node-ip='$IP'"
EOF
systemctl enable kubelet.service
systemctl enable --now crio
kubeadm config images pull --cri-socket=unix:///var/run/crio/crio.sock --kubernetes-version=$CRIP_VERSION
kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=$IP --kubernetes-version=$CRIP_VERSION --cri-socket=unix:///var/run/crio/crio.sock
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
crictl info
kubectl get node -o wide
kubectl get po --all-namespaces
kubectl apply -f https://github.com/coreos/flannel/raw/master/Documentation/kube-flannel.yml
kubectl taint nodes --all node-role.kubernetes.io/master-
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
Demo time
nano pod.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: not-userns-pod
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto" # this will not work
spec:
  containers:
  - command:
    - sleep
    - 2d
    image: registry.fedoraproject.org/fedora-minimal
    name: not-userns-ctr
    imagePullPolicy: IfNotPresent
status: {}
---
apiVersion: v1
kind: Pod
metadata:
  name: standard-pod
spec:
  containers:
  - command:
    - sleep
    - 3d
    image: registry.fedoraproject.org/fedora-minimal
    name: not-userns-ctr
    imagePullPolicy: IfNotPresent
status: {}
kubectl apply -f pod.yaml
ps -eo args,pid | grep sleep
sleep 2d 57277
sleep 3d 58878
grep --color=auto sleep 58918
# standard container
cat /proc/58878/uid_map
0 0 4294967295
# namespaced container
cat /proc/57277/uid_map
0 200000 65536
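As you can see, the UID range of the namespaced container is shifted: UID 0 inside the container maps to UID 200000 on the host, while the standard container maps UID 0 straight to UID 0 on the host.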
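The same arithmetic works for any UID inside the container: host UID = outside start + (container UID - inside start), as long as the container UID falls within the mapped range. A minimal sketch of this translation, using the uid_map line from the demo; host_uid is a hypothetical helper, not an existing command.

```shell
#!/bin/sh
# host_uid is a hypothetical helper: given a container UID and one
# uid_map line ("inside outside count"), print the matching host UID.
host_uid() {
    cuid=$1
    set -- $2    # split the map line into its three fields
    inside=$1
    outside=$2
    count=$3
    if [ "$cuid" -ge "$inside" ] && [ "$cuid" -lt $(( inside + count )) ]; then
        echo $(( outside + cuid - inside ))
    else
        echo "unmapped"
    fi
}

host_uid 0 "0 200000 65536"        # root in the namespaced pod -> 200000
host_uid 1000 "0 200000 65536"     # an ordinary user in the pod -> 201000
host_uid 70000 "0 200000 65536"    # beyond the 65536 IDs -> unmapped
```

So even if a process breaks out of the namespaced container as root, it acts on the host as the unprivileged UID 200000.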