GPU support for container workloads is in high demand among developers who work with machine learning and artificial intelligence.
A VI admin can create a custom VM class that defines a vGPU specification. Developers can then use this class to assign GPU resources to a workload. The VM class defines node placement and the vGPU profile.
This is available not only to GPU-enabled TKG clusters, but also to standalone VMs. Custom classes simplify the consumption of GPU resources in ML/AI applications.
See a sample class below
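A minimal sketch of what such a class could look like, assuming the VM Operator `v1alpha1` API. The class name, the CPU and memory sizing and the vGPU profile name are hypothetical placeholders; the profile has to match one that your GPU and host driver actually offer.

```yaml
# Sketch only: name, sizing and profileName are assumptions, not values from this post.
apiVersion: vmoperator.vmware.com/v1alpha1
kind: VirtualMachineClass
metadata:
  name: gpu-class              # hypothetical class name
spec:
  hardware:
    cpus: 4
    memory: 16Gi
    devices:
      vgpuDevices:
      - profileName: grid_v100-4q   # hypothetical vGPU profile
```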
This class can be consumed, for example, in a VM:
- networkName: "dev-network"
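The `networkName` line above appears to be a fragment of a VirtualMachine manifest. A fuller sketch could look like this, assuming the VM Operator `v1alpha1` API on a vDS-based setup. Apart from `dev-network`, every name here (VM, namespace, class, image, storage class) is a hypothetical placeholder.

```yaml
# Sketch only: all names except "dev-network" are assumptions.
apiVersion: vmoperator.vmware.com/v1alpha1
kind: VirtualMachine
metadata:
  name: gpu-vm                 # hypothetical VM name
  namespace: dev               # hypothetical vSphere namespace
spec:
  className: gpu-class         # hypothetical custom VM class with a vGPU profile
  imageName: ubuntu-20.04-gpu  # hypothetical image from the content library
  storageClass: vsan-default-storage-policy   # hypothetical storage class
  powerState: poweredOn
  networkInterfaces:
  - networkName: "dev-network"
    networkType: vsphere-distributed
```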
This blog post used to be part of my recent vSphere 7 Update 3 What’s New article, but it was withdrawn at VMware’s request because of an extended embargo until October 5, 2021.
This blogpost was under embargo until 28th of September 2021 8:00am (PT) / 17:00 (CEST). The fact that you can read this now means that vSphere 7 Update 3 has (probably) already been released.
[Update 29th Sept 2021]: Download is not yet available. Maybe we need to wait until VMworld 2021 next week.
VMware vSphere 7 Update 3 comes with a wide range of innovations. They fall into the following categories:
- Tanzu with Kubernetes
- Lifecycle, Upgrade and Patching
- Artificial Intelligence & Machine Learning
- Resource Management
- Availability & Resiliency
- Security & Compliance
- Guest OS and Workloads
- vSphere Management & APIs
vSAN also gets a number of new features, but these will be covered in a separate post.
Continue reading “vSphere 7 Update 3 – What’s New”
VMworld 2021 is now taking place virtually for the second time. Even if this is not a real substitute for an on-site event, it is a good source of information in times of pandemic.
VMworld will take place virtually in several time zones between October 5 and 7, 2021. Participation is free of charge. All you need to do is register on the VMworld portal page.
This list contains my selected favorites. The order is purely alphabetical and implies no ranking. A shift towards Kubernetes, Modern Apps and GPUaaS has been noticeable in recent years. The latter was extended this year by a Bitfusion with Kubernetes project. However, some classics like Frank’s 60 Minutes of NUMA and the talks by Cormac Hogan and Duncan Epping are still part of the must-attend program.
- 10 Things You Need to Know About Project Monterey [MCL1833] – Sudhanshu Jain, Niels Hagoort
- 60 Minutes of Non-Uniform Memory Access (NUMA) 3rd Edition [MCL1853] – Frank Denneman
- Antrea and NSX-T update for Container Networking [CODE2743] – Tuan Loc Nguyen, Rahul Dondeti
- Architect the Enterprise Data Center for AI with VMware and NVIDIA [VI1501] – James Brogan, Joe Cullen
- Attach GPU Anywhere with vSphere Bitfusion Extension [VMTN2801] – Tiejun Chen
- Build and Publish a PowerShell Module to the PowerShell Gallery [CODE2756] – David Stamen
- Deep Dive on Logical Routing in NSX-T [NET1443] – Francois Tallet, Nicolas Michel
- Deep Dive on vSphere with Tanzu Updates [APP2063] – Karthik Balachandran
- Extreme Performance Series: Performance Best Practices [MCL1635] – Mark Achtemichuk, Valentin Bondzio
- Get the Most Out of VMware NSX Data Center with Advanced Load Balancing [NET1791] – Dan Watson
- How to measure and improve the performance of your HCX migrations? [VMTN3225] – Agnieszka Koziorowska
- Live Coding: Terraforming Your vSphere Environment [CODE2755] – Kyle Ruddy
- Loop, Swoop and Pull – PowerCLI Will be as Easy as Tying Your Shoes! [CODE2744] – Justin Sider
- Maximize GPU Utilization with VMware Tanzu/Kubernetes and vSphere Bitfusion [VI1624] – Earl Ruby
- Modernize Windows Apps: Introduction to Windows Containers on Kubernetes [APP1999] – Stuart Preston
- NSX Advanced Threat Prevention: Deep Dive [SEC1376] – Stijn Vanveerdeghem
- NVMe/TCP – The Future of Storage Connectivity [MCL2766S] – Paul Turner, Ihab Tarazi
- Project Monterey: Present, Future and Beyond [MCL1401] – Sudhanshu Jain, Simer Singh
- The Future of VM Provisioning – Enabling VM Lifecycle Through Kubernetes [APP1564] – Myles Gray, Nikitha Suryadevara
- VEBA Revolutions – Unleashing the Power of Event-Driven Automation [CODE2773] – William Lam, Michael Gasch
- VMware Cloud Foundation Tips and Tricks from the Trenches [MCL1025] – Dharmesh Bhatt, Paudie O’Riordan
- VMware vSAN – Dynamic Volumes for Traditional and Modern Applications [MCL1084] – Duncan Epping, Cormac Hogan
- vSAN Technical Deep Dive [MCL1654] – Biswapati Bhattacharjee, Junchi Zhang
- Want to deploy your SDDCLab in about an hour? [VMTN3192] – Luis Chanu, Rutger Blom
- What’s New in NSX-T [NET2354] – Varun Santosh, Soumee Phatak
- What’s New in vSphere [APP1205] – Himanshu Singh, Ken Werneburg
This will be a multi-part post focused on the VMware Bitfusion product. I will give an introduction to the technology, how to set up a Bitfusion server and how to use its services from Kubernetes pods.
We saw in parts 1 and 2 what Bitfusion is and how to set up a Bitfusion Server cluster. The challenging part is to make this Bitfusion cluster usable from Kubernetes pods.
In order for containers to access Bitfusion GPU resources, a few general conditions must be met.
In this tutorial I assume that we have a configured vSphere with Tanzu cluster available, as well as a namespace, a user, a storage class and the Kubernetes CLI tools. The network can be organized with either NSX-T or distributed vSwitches and a load balancer such as the AVI load balancer.
In the PoC described here, Tanzu on vSphere was used without NSX-T for simplicity. The AVI load balancer, now officially called NSX Advanced Load Balancer, was used.
We also need a Linux system with access to GitHub or a mirror to prepare the cluster.
The procedure in a nutshell:
Continue reading “VMware Bitfusion and Tanzu – Part 3: Utilize GPU from Kubernetes Pods and TKGS”
- Create TKGS cluster
- Download the Bitfusion bare-metal token and create a K8s secret
- Clone the Git project and modify the Makefile
- Deploy device-plugin to TKGS-cluster
- Pod deployment
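
As an illustration of the first step, here is a minimal sketch of a TKGS cluster manifest, assuming the `run.tanzu.vmware.com/v1alpha1` API that vSphere 7 offers. The cluster name, namespace, Kubernetes version, VM classes, storage class and node counts are hypothetical placeholders and must be replaced with values that exist in your Supervisor namespace.

```yaml
# Sketch only: every name, class, version and count is an assumption.
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkgs-bitfusion         # hypothetical cluster name
  namespace: dev               # hypothetical vSphere namespace
spec:
  distribution:
    version: v1.20             # hypothetical; must match an available Tanzu Kubernetes release
  topology:
    controlPlane:
      count: 1
      class: best-effort-small
      storageClass: vsan-default-storage-policy   # hypothetical storage class
    workers:
      count: 3
      class: best-effort-medium
      storageClass: vsan-default-storage-policy   # hypothetical storage class
```

The manifest is applied from the Supervisor namespace context. The remaining steps (token secret, device plugin, pod deployment) are not covered by this sketch.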