VMware Cloud Foundation is a unified SDDC platform for the hybrid cloud. It is based on VMware’s compute, storage, and network virtualization.
VCF can be expanded with more workload domains by adding further hosts, or it can be stretched over two availability zones (AZ). The expansion is initiated by and under control of the SDDC-Manager. The procedure is fairly straightforward and SDDC-Manger does all the configuration tasks in the background, i.e. forming vSAN clusters, networks, kernel ports, vCenters and NSX control planes.
setup hosts with ESXi base image
confige a management IP address
set root credentials
configure DNS and NTP
import new hosts into SDDC-Manager
deploy new WLD
There is a pitfall that can be easily overseen: The order of the new host’s NICs. Before we can import new hosts, we’ll get to see a checklist about the host requirements. The hosts need to have two NICs with at least 10 GBit.
While reading the list there’s a little detail that is often overlooked. Traditional numbering means that both NICs must have numbers vmnic0 and vmnic1. Unfortunately this seems to be hard coded and cannot be changed (as of current version 4.2). To make matters worse, many server systems have onboard 1 GBit network adapters. There’s a KB article that explains how VMware ESXi determines the order in which names are assigned to network devices. It’ll start with onboard NICs and then continues with PCIe cards. As a result you’ll might end up with two 1 GBit onboard NICs as vmnic0 and vmnic1. In this case the bringup of the VCF expansion will fail.
While you can choose NICs during initial VCF bringup, this is not possible during expansion and this time there’s no such thing as a bringup sheet. You can’t select more than two NICs either when using SDDC-Manager. In that case you’ll need to use API-calls.
Currently there’s no other way than to disable onboard NICs in the system BIOS. If your desired NICs still show a higher number you’ll need to put the PCIe card into a lane with lower number.
As part of a VMware Cloud Foundation (VCF) greenfield deployment, the Cloud Builder appliance is used for one-time use. It automatically deploys the management infrastructure of a VCF cluster and can be discarded afterwards.
The ideal situation is that the previously created workbook or JSON is processed and the cluster is successfully created.
In the UI of the Cloud Builder, however, there is no option to reset the wizard and restart it from zero. For example, when requirements have changed and a new or adapted workbook is to be used. Or you want to use the appliance for another rollout. In this case, the appliance would have to be completely redeployed. Any errors in the JSON file cannot be corrected this way either.
However, there is a trick to reset the Cloud Builder to zero and feed it with a modified JSON file. This is thanks to an API call that may have been ‘forgotten’ during development. In order to do so, we have to log in to the console of the Cloud Builder as user root.
[Optional] It may be easier to grant the root user temporary SSH access. Log in to the VM console as root and edit the sshd configuration.
sudo vi /etc/ssh/sshd_config
Browse the sshd_config and look for the line PermitRootLoginno. Disable the line by putting a # in front of it.
# PermitRootLogin no
Save the configuration and open a SSH session as user root. We now can execute an API call as user root.
curl -X GET http://localhost:9080/bringup-app/bringup/sddcs/test/deleteAll
Login to the web UI of the Cloud Builder appliance. You now can start from the beginning.
ESXi on Intel x86 architecture has been a commodity for many years now. In recent years and during VMworld for example we’ve seen early alpha versions of ESXi running on Arm architecture like smart NICs or even Raspberry Pi. Meanwhile VMware developers published a Fling named ESXi Arm Edition to deploy ESXi on Arm architecture. Of course this is a lab project and not supported by VMware for production workloads. But anyway, it’s a great opportunity to play around with ESXi on a cheap and tiny computer like Raspberry Pi. I will not explain how to deploy ESXi on Arm. Check the detailed documentation on the Fling project page (PDF). I will focus on day-2 operation.
I would like to thank William Lam for providing a lot of background information, hacks and tricks around PhotonOS and ESXionArm.
Now I’ve got an ESXi host on my Raspi. What can I do with it?
Just a few remarks before we start:
You can’t run any workload on the ESXi on Arm platform. As the project name says, it’s an Arm architecture, So you can’t run operating systems based on Intel architecture. All guest VMs need to be made for Arm architecture. That will rule out Windows guest systems and also most Linux distributions. But luckily there are a couple of Linux distributions made specific for Arm architecture like Ubuntu Server for Arm, or Photon OS. For my demonstration I chose the latest Photon OS (version 4 beta). As host hardware I’m using the “big” Raspberry Pi 4 with 8 GB RAM. You can imagine that 8 GB of RAM isn’t very much for host OS and guest VMs. We have to use resources sparingly.
Heat sink for Raspberry Pi4 (your Raspi will become hot without)
SD-card (only for UEFI BIOS)
USB stick for ESXi installation
USB 3 hub with external power supply (Raspi doesn’t provide reliable power on USB port for an NVMe SSD)
USB 3 NVMe M.2 case
Samsung NVMe EvoPlus 250 GB M.2
Using ESXi on Arm in standalone mode
Although I have joined my ESXi on Raspi to my vCenter 7, I will not use any vCenter features. All works are done like on a standalone ESXi (with all the shortcomings and limitations).
First we need 3 VMs for the 3 K3s nodes. It’s a good idea to build a VM with everything we need except K3s and then clone it. Well, if you think cloning a VM on a standalone ESXi on Arm host is just a mouse click in the UI, then welcome to the real world. 😉 I will come to that point later. Let’s build our first Photon OS VM.