VMware Bitfusion and Tanzu – Part 3: Utilize GPU from Kubernetes Pods and TKGS

This is a multi-part series on the VMware Bitfusion product. It introduces the technology, shows how to set up a Bitfusion server, and explains how to use its services from Kubernetes pods.

We saw in parts 1 and 2 what Bitfusion is and how to set up a Bitfusion Server cluster. The challenging part is to make this Bitfusion cluster usable from Kubernetes pods.

In order for containers to access Bitfusion GPU resources, a few general conditions must be met.

I assume in this tutorial that a configured vSphere with Tanzu cluster is available, along with a namespace, a user, a storage class and the Kubernetes CLI tools. The network can be built on either NSX-T or distributed vSwitches combined with a load balancer such as the AVI load balancer.

For simplicity, the PoC described here used Tanzu on vSphere without NSX-T. Load balancing was provided by the AVI load balancer, now officially called NSX Advanced Load Balancer.

We also need a Linux system with access to GitHub or a mirror to prepare the cluster.

The procedure in a nutshell:

  • Create TKGS cluster
  • Download Bitfusion bare-metal token and create K8s secret
  • Clone Git project and modify Makefile
  • Deploy device plugin to TKGS cluster
  • Pod deployment
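
To make the second step more concrete, here is a minimal sketch of how token files could be wrapped into a Kubernetes Secret manifest. It uses only the Python standard library. The secret name `bitfusion-secret`, the namespace `kube-system`, the file names `client.yaml` and `ca.crt`, and the file contents are placeholders I assume for illustration; use whatever your token download and the device plugin actually require.

```python
import base64
import json


def build_secret_manifest(name, namespace, files):
    """Return a Kubernetes Secret manifest (as a dict).

    `files` maps a file name to its raw bytes content.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # The "data" field of a Secret requires base64-encoded values.
        "data": {
            fname: base64.b64encode(content).decode("ascii")
            for fname, content in files.items()
        },
    }


# Placeholder contents (assumption): in practice these bytes would be read
# from the token files downloaded from the Bitfusion server.
example_files = {
    "client.yaml": b"placeholder client config\n",
    "ca.crt": b"placeholder certificate\n",
}

# Secret name and namespace are assumptions, not confirmed values.
manifest = build_secret_manifest("bitfusion-secret", "kube-system", example_files)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts manifests in JSON as well as YAML.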