GPU-enabled K8s Clusters in vSphere with Tanzu

GPU access in container workloads is an important requirement for developers who work with machine learning and artificial intelligence.

A VI admin can create a custom VM class and define a vGPU specification for it. Developers can then use this class to assign GPU resources to their workloads. The VM class defines node placement and the vGPU profile.
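
As a rough sketch, such a class could look like the following when inspected with kubectl (the VI admin normally creates it in the vSphere Client). The field layout is assumed from the VM Operator v1alpha1 API, and the CPU, memory and vGPU profile values are purely illustrative, so check them against your own environment.

```yaml
# Sketch only: field names assumed from the VM Operator v1alpha1 API,
# all values are illustrative.
apiVersion: vmoperator.vmware.com/v1alpha1
kind: VirtualMachineClass
metadata:
  name: gpu-vmclass
spec:
  hardware:
    cpus: 8
    memory: 64Gi
    devices:
      vgpuDevices:
      - profileName: grid_v100-8q   # hypothetical vGPU profile name
```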

This is available not only to GPU-enabled TKG clusters, but also to standalone VMs. Custom classes simplify the consumption of GPU resources in ML/AI applications.

See a sample cluster that uses such a class below:

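kind: TanzuKubernetesCluster
apiVersion: run.tanzu.vmware.com/v1alpha1
metadata:
  name: gpu-cluster            # object names must be lowercase
spec:
  topology:
    # controlPlane settings and storageClass entries omitted for brevity
    workers:
      count: 3
      class: gpu-vmclass       # custom VM class with the vGPU profile
  distribution:
    version: v1.20.2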

The same class can also be consumed by a standalone VM, for example:

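kind: VirtualMachine
apiVersion: vmoperator.vmware.com/v1alpha1
metadata:
  name: gpu-vm
  namespace: tkg-dev
spec:
  networkInterfaces:
  - networkName: "dev-network"
    networkType: vsphere-distributed
  className: gpu-vmclass        # custom VM class with the vGPU profile
  imageName: ubuntu-custom-gpu
  storageClass: gpu-vm-policy   # storage class names are lowercase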

This blog post used to be part of my recent "What’s new in vSphere 7 Update 3" article, but was withdrawn at VMware’s request because of an extended embargo until October 5, 2021.