Single Node Kubeflow 1.7 cluster with Nvidia GPU support
Kubeflow
In Kubeflow 1.4 on a Minikube Kubernetes Node I described the setup of Kubeflow on Minikube. However, making GPUs available to Kubeflow is difficult when Minikube is used, since Minikube only supports GPUs for selected drivers: the none driver is not recommended (https://minikube.sigs.k8s.io/docs/drivers/none/), and the kvm2 driver adds another layer of virtualization that I would like to avoid.
Still, Minikube has the advantage that it provides network plugins and storage provisioners out of the box. Nevertheless, I went the long and difficult road and set up a pure Kubernetes cluster for Kubeflow, with Calico as network plugin and Rook/Ceph as storage provisioner. Such a setup scales easily to a multi-node cluster, since Kubernetes as well as Rook/Ceph are designed for such use cases. Only due to our limited hardware equipment do we restrict our setup to a single node. A multi-node cluster would offer redundancy, high availability and scalability, characteristics desired in production environments.
Our Hardware Setup
Our setup is based on an ASUS ESC4000A-E10 with
- 2x AMD EPYC 7413 (24 cores each)
- 512 GB RAM DDR4-3200
- 7x Nvidia A30 GPU
- 2x 1.92 TB SSD/NVMe 2.5" as system partition (software RAID 1)
- 2x 15.3 TB NVMe M.2 as data partition (Ceph redundant partitions)
Our Software Setup
The most difficult part is finding a set of components that work together. So far, I installed Ubuntu 22.04 with:
- Containerd
- In previous installations it was difficult to find a working container engine; for now, containerd seems to work well.
- Kubernetes 1.25
- Kubeflow 1.7 is tested with Kubernetes 1.24/1.25.
- Rook and Ceph as storage provider
- Kubeflow uses persistent volume claims, therefore a storage provider is required that can serve them. We use Ceph as a single-node installation.
- Nvidia GPU Operator to make GPUs available to notebooks
- Kustomize v5.0.3
- Since Kubeflow 1.3, kustomize manifests are used to deploy Kubeflow.
- Kubeflow 1.7
Software Installation
This section gives a brief summary of the commands used for installation and references to the related documentation.
Preparation
Create /etc/sysctl.d/kubeflow.conf and insert
fs.inotify.max_user_instances = 1280
which seems to solve the problem related to https://github.com/kubeflow/manifests/issues/2087.
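For example, the file can be created and the setting applied without a reboot like this:
cat <<EOF | sudo tee /etc/sysctl.d/kubeflow.conf
fs.inotify.max_user_instances = 1280
EOF
sudo sysctl --system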
According to https://kubernetes.io/docs/setup/production-environment/container-runtimes/#install-and-configure-prerequisites, load required kernel modules and enable bridging and forwarding.
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
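To verify that the kernel modules are loaded and the sysctl values are active:
lsmod | grep br_netfilter
lsmod | grep overlay
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward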
Install containerd
See https://docs.docker.com/engine/install/ubuntu/
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install containerd.io
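A quick check that containerd is up and running as a systemd service:
sudo systemctl status containerd --no-pager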
Install Kubernetes
Kubeflow 1.7 is tested with Kubernetes 1.24/1.25; we use the newer release, 1.25.
Since we use Ubuntu with systemd, configure containerd to use systemd cgroup driver, see https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd:
sudo su
containerd config default > /etc/containerd/config.toml
exit
Set SystemdCgroup = true in the section [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] of /etc/containerd/config.toml and restart containerd:
sudo systemctl restart containerd
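For reference, the relevant part of /etc/containerd/config.toml then looks like this (abridged):
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true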
Afterwards, install Kubernetes.
Prepare the software repositories
sudo apt-get install -y apt-transport-https ca-certificates curl
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-archive-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
Install the Kubernetes command line tools
Install the latest patch release of 1.25, which is 1.25.11-00.
KVER=1.25.11-00
sudo apt-get install -y kubelet=$KVER kubeadm=$KVER kubectl=$KVER
sudo apt-mark hold kubelet kubeadm kubectl
Holding the packages prevents unintended upgrades.
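A quick check that the expected versions were installed:
kubeadm version
kubectl version --client
kubelet --version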
Init the Cluster and install the Network Plugin Calico
Note that the control-plane taint is removed for our single-node cluster, so that pods are also scheduled on the master node.
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --kubernetes-version="1.25"
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/custom-resources.yaml
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
We also increase the maximum number of pods above the default of 110: in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, add --max-pods=243 to the ExecStart line, since Kubeflow schedules pods for every user who has logged in at least once.
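A minimal sketch of the kubelet change, assuming the stock kubeadm drop-in file; after editing, reload systemd and restart the kubelet:
# append --max-pods=243 to the ExecStart= line in
# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, then:
sudo systemctl daemon-reload
sudo systemctl restart kubelet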
Install the storage provider
Kubeflow requires a storage provider. For our setup, we use Ceph deployed by Rook and provide two spare disks that are initialized by Ceph. Ensure that the disks do not contain any filesystem, otherwise Ceph will not use them.
To wipe a filesystem see https://rook.github.io/docs/rook/latest/ceph-teardown.html.
Typically, Rook/Ceph is used in a multi-node cluster for high availability. Here, we make a single-node deployment. Example yaml files are already included at https://github.com/rook/rook.git; the single-node specific configurations are in cluster-test.yaml and storageclass-test.yaml, see details below.
Install a single node Rook/Ceph cluster:
git clone --single-branch --branch master https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster-test.yaml
and create the storage class used for the persistent volume claims:
cd csi/rbd
kubectl create -f storageclass-test.yaml
Make this class the default class:
kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
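Verify that the class is now marked as default:
kubectl get storageclass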
Check that the OSD pods rook-ceph-osd-... are running:
kubectl -n rook-ceph get pods
Check the rook/ceph storage health status, see https://rook.io/docs/rook/v1.11/Upgrade/health-verification/, with the toolbox:
#Change back to deploy/examples in the rook folder
cd ../..
kubectl create -f toolbox.yaml
ROOK_CLUSTER_NAMESPACE=rook-ceph
TOOLS_POD=$(kubectl -n $ROOK_CLUSTER_NAMESPACE get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}')
kubectl -n $ROOK_CLUSTER_NAMESPACE exec -it $TOOLS_POD -- ceph status
Add GPU support
Different ways to provide GPU support exist, as described at https://docs.nvidia.com/datacenter/cloud-native/kubernetes/install-k8s.html.
We select the Nvidia GPU operator, which handles the installation of drivers and additional required libraries.
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 && chmod 700 get_helm.sh && ./get_helm.sh
helm repo add nvidia https://nvidia.github.io/gpu-operator && helm repo update
helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator
We had to restart the server so that the GPU Operator initializes successfully.
Test with
cat << EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vectoradd
    image: "nvidia/samples:vectoradd-cuda11.2.1"
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
and see the logs
kubectl logs cuda-vectoradd
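You can also check that the GPUs are advertised as allocatable node resources:
kubectl describe nodes | grep nvidia.com/gpu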
Install Kustomize v5.0.3
wget https://github.com/kubernetes-sigs/kustomize/releases/download/kustomize%2Fv5.0.3/kustomize_v5.0.3_linux_amd64.tar.gz
tar xvzf kustomize_v5.0.3_linux_amd64.tar.gz
sudo mv kustomize /usr/bin
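Check the installed version:
kustomize version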
Install Kubeflow 1.7
See https://github.com/kubeflow/manifests#installation
Preparation
Since Kubeflow 1.3, kustomize manifests are used to deploy Kubeflow.
Download the manifests, change into the directory, and check out the latest release:
git clone https://github.com/kubeflow/manifests.git
cd manifests
git checkout v1.7.0
For the next commands stay in this directory.
If desired, set a non-default password for the default user. First, create a password hash with Python:
sudo apt install python3-passlib python3-bcrypt
python3 -c 'from passlib.hash import bcrypt; import getpass; print(bcrypt.using(rounds=12, ident="2y").hash(getpass.getpass()))'
and set the hash option in the file ./common/dex/base/config-map.yaml to the generated password hash:
vi ./common/dex/base/config-map.yaml
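The relevant part of the config map looks roughly like this (the hash value below is only a placeholder; the surrounding field names are assumed from the default manifests and may differ slightly):
staticPasswords:
- email: user@example.com
  hash: $2y$12$<generated-hash>
  username: user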
Bugfix: the login gets stuck in an infinite loop (see https://github.com/kubeflow/manifests/issues/2423) due to an outdated authservice image. Apply the fix from https://github.com/kubeflow/manifests/pull/2474 or change the referenced image in common/oidc-authservice/base/kustomization.yaml from
newName: gcr.io/arrikto/kubeflow/oidc-authservice
newTag: e236439
to
newName: gcr.io/arrikto/oidc-authservice
newTag: 0c4ea9a
Install
Install all components with a single command, see https://github.com/kubeflow/manifests#install-with-a-single-command:
while ! kustomize build example | awk '!/well-defined/' | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
Now we should have a running single-node Kubeflow cluster. Verify with kubectl get pods -A that all pods are in the state Running or Completed.
Login via SSH port forwarding
So far, we do not make the web interface publicly available and use SSH port forwarding.
On the Kubeflow machine expose the port 8080:
kubectl port-forward -n istio-system service/istio-ingressgateway 8080:80 --address=0.0.0.0
On the connecting client forward the local port 8080 to the remote port 8080:
ssh -L 8080:localhost:8080 <remote-user>@<kubeflow-machine>
Open a web browser on the client and go to localhost:8080.
For better usability, use a load balancer.
Setup the Loadbalancer with TLS
Adapted from https://v1-5-branch.kubeflow.org/docs/distributions/nutanix/install-kubeflow/#setup-a-loadbalancer-optional
Requirement: create a valid TLS certificate beforehand.
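For testing only, a self-signed certificate can be generated with openssl (replace x.x.x.x with your address); for production, use a properly issued certificate:
openssl req -x509 -newkey rsa:4096 -nodes -days 365 -keyout key.pem -out cert.pem -subj "/CN=x.x.x.x"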
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.10/config/manifests/metallb-native.yaml
Specify an IPAddressPool in pool.yaml with a single address:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.2/32
Specify an advertisement in advert.yaml:
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: metallb-system
kubectl apply -f pool.yaml
kubectl apply -f advert.yaml
Have your certificate cert.pem and private key file key.pem ready and add them as a secret:
kubectl create -n istio-system secret tls kubeflowcrt --key=key.pem --cert=cert.pem
Now adapt the Istio Kubeflow gateway with
kubectl -n kubeflow edit gateways.networking.istio.io kubeflow-gateway
set the spec section to
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP
    tls:
      httpsRedirect: true
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      credentialName: kubeflowcrt
      mode: SIMPLE
Change the type of the istio-ingressgateway service to LoadBalancer and get the IP
kubectl -n istio-system patch service istio-ingressgateway -p '{"spec": {"type": "LoadBalancer"}}'
kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0]}'
This should be the IP you have configured for metallb.
Add REDIRECT_URL to the data section of the oidc-authservice-parameters configmap, where x.x.x.x is your IP address (or your DNS name, if you have one):
kubectl -n istio-system edit configmap oidc-authservice-parameters
apiVersion: v1
data:
  AUTHSERVICE_URL_PREFIX: /authservice/
  OIDC_AUTH_URL: /dex/auth
  OIDC_PROVIDER: http://dex.auth.svc.cluster.local:5556/dex
  OIDC_SCOPES: profile email groups
  PORT: '"8080"'
  REDIRECT_URL: https://x.x.x.x/login/oidc
...
Also append https://x.x.x.x/login/oidc to the redirectURIs in the dex configmap:
kubectl -n auth edit configmap dex
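For orientation, the static client entry in the dex config then looks roughly like this (field names assumed from the default Kubeflow 1.7 manifests):
staticClients:
- idEnv: OIDC_CLIENT_ID
  redirectURIs: ["/login/oidc", "https://x.x.x.x/login/oidc"]
  name: 'Dex Login Application'
  secretEnv: OIDC_CLIENT_SECRET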
Trigger a rollout restart of the affected services:
kubectl -n istio-system rollout restart statefulset authservice
kubectl -n auth rollout restart deployment dex
Now Kubeflow should be accessible via https://x.x.x.x.
LDAP integration
Adapt the fields and the filter below to your LDAP environment.
Get the current config:
kubectl get configmap dex -n auth -o jsonpath='{.data.config\.yaml}' > dex-config.yaml
Add the LDAP connector:
cat << EOF >> dex-config.yaml
connectors:
- type: ldap
  id: ldap
  name: LDAP
  config:
    host: <LDAP host>
    usernamePrompt: username
    userSearch:
      baseDN: dc=<domain>,dc=<>
      filter: (&(objectClass=posixAccount)(|(uid=<username>)))
      username: uid
      idAttr: uid
      emailAttr: mail
      nameAttr: givenName
EOF
Apply the config:
kubectl create configmap dex --from-file=config.yaml=dex-config.yaml -n auth --dry-run=client -oyaml | kubectl apply -f -
Restart Dex
kubectl rollout restart deployment dex -n auth
For details see https://cloudadvisors.net/2020/09/23/ldap-active-directory-with-kubeflow-within-tkg/
Disable unused ports in istio
Istio opens a few ports by default. If they are not required, you can disable all ports other than 80 and 443 by modifying the service:
kubectl edit services istio-ingressgateway -n istio-system
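A possible sketch, assuming the default Istio port layout (80 → targetPort 8080, 443 → targetPort 8443), is to replace the port list of the service with only the entries you need:
kubectl -n istio-system patch service istio-ingressgateway --type='json' \
  -p='[{"op":"replace","path":"/spec/ports","value":[{"name":"http2","port":80,"targetPort":8080},{"name":"https","port":443,"targetPort":8443}]}]'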