Cluster Retreat Mode gone wrong – vSphere Client lockout

With the release of vSphere 7.0 Update 1, vSphere Cluster Services VMs (vCLS) appeared in vSphere clusters for the first time. They decouple cluster functions such as the Distributed Resource Scheduler (DRS) from the availability of the vCenter Server Appliance (VCSA), which had previously been a single point of failure for these functions. By moving DRS to the redundant vCLS machines, a higher degree of resilience is achieved.

Retreat Mode

The vSphere administrator has little influence on the provisioning of these VMs. Occasionally, however, it is necessary to remove them from a datastore, for example when it needs to be put into maintenance mode. For this there is a procedure that puts the cluster into retreat mode: a temporary advanced setting causes the cluster to delete the vCLS VMs.

According to the VMware procedure, the domain ID of the cluster must be determined to activate retreat mode. The domain ID is the numerical value between ‘domain-c’ and the following colon in the browser URL of the vSphere Client when the cluster is selected. In the example from my lab it has the value 8, but the number can also have four digits or more.
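If you prefer PowerCLI over the browser URL, the ID can also be read from the cluster’s managed object reference. A minimal sketch, assuming an existing PowerCLI connection; the cluster name ‘myCluster’ is just a placeholder:

(Get-Cluster -Name 'myCluster').ExtensionData.MoRef.Value
# returns e.g. "domain-c8" – the number after "domain-c" is the domain ID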

The domain ID then has to be entered into the Advanced Settings of the vCenter Server:

config.vcls.clusters.domain-c8.enabled = false
Correct Retreat Mode settings.
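Instead of using the vSphere Client UI, the setting can also be created and reverted with PowerCLI. This is only a sketch, assuming that New-AdvancedSetting accepts the connected vCenter ($global:DefaultVIServer) as its entity; the domain ID is the value from my lab example:

# Enter retreat mode: create the setting with value 'false' (the vCLS VMs get deleted)
New-AdvancedSetting -Entity $global:DefaultVIServer -Name 'config.vcls.clusters.domain-c8.enabled' -Value 'false' -Confirm:$false

# Leave retreat mode: set the value back to 'true' (the vCLS VMs get re-deployed)
Get-AdvancedSetting -Entity $global:DefaultVIServer -Name 'config.vcls.clusters.domain-c8.enabled' | Set-AdvancedSetting -Value 'true' -Confirm:$false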

An admin error occurred during the activation of retreat mode.

After activating retreat mode on a vSAN cluster, the administrators lost all privileges on all objects in the vSphere Client.

A review of the services showed that the vCenter Server Daemon (vpxd) was not running.


ESXi Boot Media – New Requirements for v8

The requirements for ESXi boot media changed fundamentally with ESXi v7 Update 3. The partition layout was changed and the endurance requirements for the medium increased as well. I covered this in my blog article “ESXi Bootmedia – New features in v7 und legacy issues from the past v6.x“.

USB boot media turned out not to be robust enough and are therefore no longer supported from v7 U3 onwards. It is still possible to install ESXi on USB media, but the ESX-OSData partition then needs to be redirected to permanent storage.
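Since ESX-OSData takes over the role of the legacy /scratch partition, the scratch location is the part that can be inspected and redirected on a running host. A minimal sketch, assuming an existing PowerCLI connection; host name and datastore path are placeholders, and the change only becomes effective after a reboot:

# Where does scratch currently point?
Get-VMHost -Name 'esx01.lab.local' | Get-AdvancedSetting -Name 'ScratchConfig.ConfiguredScratchLocation'

# Redirect scratch to a directory on permanent storage (takes effect after reboot)
Get-VMHost -Name 'esx01.lab.local' | Get-AdvancedSetting -Name 'ScratchConfig.ConfiguredScratchLocation' | Set-AdvancedSetting -Value '/vmfs/volumes/myDatastore/.locker-esx01' -Confirm:$false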

Warning! USB media and SD cards should not be used for production ESXi installations!

Valid setup targets for ESXi deployments

SD cards and USB media are unsuitable as installation targets due to their poor write endurance. Magnetic disks, SSDs and SATA DOMs (disk-on-modules) are still permitted and recommended.

SATA-DOM on a Supermicro E300-9D

New requirements from version ESXi v8 onwards

My homelab previously used vSAN 7 and thus the classic OSA architecture. To run the cluster under the new vSAN ESA architecture, vSphere 8 and new storage devices were required.

I tested the installation and hardware compatibility on a 64 GB USB medium (not recommended and not supported!). During the installation there were, as expected, warnings about the USB medium. Nevertheless, I was able to successfully test the detection of the NVMe devices and the vCenter deployment.

Setup warning when trying to use a USB flash medium.

Having successfully completed the test phase, I installed ESXi 8U2 on the SATA DOM of my Supermicro E300 server. To my surprise, the setup failed at a very early stage with the message: “disk device does not support OSDATA“.

RTFM

The explanation is simple: “Read the fine manual!”

My 16 GB SATA DOM from Supermicro was simply too small.

The setup guide for ESXi 8 clearly states the requirements under “Storage Requirements for ESXi 8.0 Installation or Upgrade“:

For best performance of an ESXi 8.0 installation, use a persistent storage device that is a minimum of 32 GB for boot devices. Upgrading to ESXi 8.0 requires a boot device that is a minimum of 8 GB. When booting from a local disk, SAN or iSCSI LUN, at least a 32 GB disk is required to allow for the creation of system storage volumes, which include a boot partition, boot banks, and a VMFS-L based ESX-OSData volume. The ESX-OSData volume takes on the role of the legacy /scratch partition, locker partition for VMware Tools, and core dump destination.

VMware vSphere product documentation

In other words: new installations require a boot medium of at least 32 GB (128 GB recommended), and an upgrade from an ESXi v7 version requires at least 8 GB, but the OSData partition of that installation must already have been redirected to an alternate storage device.
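Whether an existing boot device meets the 32 GB minimum can be checked up front through esxcli from PowerCLI. A sketch, assuming the Size field of the device list is reported in MB; the host name is a placeholder:

$esxcli = Get-EsxCli -VMHost (Get-VMHost -Name 'esx01.lab.local') -V2
# List only the boot device and convert its size to GB
$esxcli.storage.core.device.list.Invoke() | Where-Object { $_.IsBootDevice -eq 'true' } | Select-Object Device, DisplayName, @{N='SizeGB'; E={ [math]::Round($_.Size / 1024, 1) }}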

Dirty Trick?

Needless to say, I tried a dirty trick. I first successfully installed an ESXi 7U3 on the 16 GB SATA DOM and then performed an upgrade installation to v8U2. This attempt also failed, because the OSData area had not been redirected in the fresh v7 installation.

I don’t want to install on USB media as I have seen too many cases where these devices have failed. The only option is to invest in a larger SATA DOM.

I opted for the 64 GB model because it is a good compromise between minimum requirements and cost-effectiveness.

ESXi Config-Backup with PowerCLI requires HTTP

There is a really useful and convenient PowerCLI one-liner for backing up the host configuration. I have been using it for years and explained it in detail in an old blog post.

Get-Cluster -Name myCluster | Get-VMHost | Get-VMHostFirmware -BackupConfiguration -DestinationPath 'C:\myPath'

This is a command I always teach my students as part of my VMware courses. Backing up the host configuration is downright mandatory before making changes to the host, installing patches and drivers, or host updates. Just a few seconds of additional effort, but these configuration backups have saved me more than once from major trouble and many hours of extra work.
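For completeness: the counterpart for bringing such a backup back onto a host is Set-VMHostFirmware with the -Restore switch. A sketch with placeholder credentials; the host must be in maintenance mode, the build number has to match the backup, and the host reboots after the restore:

Set-VMHostFirmware -VMHost 'esx01.lab.local' -Restore -SourcePath 'C:\myPath\configBundle-esx01.lab.local.tgz' -HostUser root -HostPassword 'myPassword'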

Recently, I was backing up host configurations in a major datacenter. Surprisingly, the command did not work on some of the vCenter instances and aborted with an error message.

Get-VMHostFirmware : 18.08.2023 12:05:49 Get-VMHostFirmware An error occurred while sending the request.
At line:1 char:28
+… et-VMHost | Get-VMHostFirmware -BackupConfiguration -DestinationPath …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Get-VMHostFirmware], ViError
+ FullyQualifiedErrorId : Client20_SystemManagementServiceImpl_BackupVmHostFirmware_DownloadError,VMware.VimAutomation.ViCore.Cmdlets.Commands.Host.GetVMHostFirmware

To understand the error, we must first understand how the PowerCLI command works. First, a backup of the host configuration is triggered on the host via vCenter. The host stores it locally as a gzip-compressed TAR archive (.tgz) named configBundle-HostFQDN.tgz (example: configBundle-esx01.lab.local.tgz). In a second step, the archive is downloaded from the host. The URL for this is:

http://[HostFQDN]/downloads/[Host-UUID]/configBundle-HostFQDN.tgz
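Reproduced with Get-View, the two steps look roughly like this. This is just a sketch, assuming an existing PowerCLI connection; the HostFirmwareSystem API returns the download URL with ‘*’ as a placeholder for the host name:

$esx = Get-VMHost -Name 'esx01.lab.local'
$fw = Get-View -Id $esx.ExtensionData.ConfigManager.FirmwareSystem
$url = $fw.BackupFirmwareConfiguration() # step 1: the host packs its config into the .tgz and returns a URL
$url = $url -replace '\*', $esx.Name # replace the '*' placeholder with the host FQDN
Invoke-WebRequest -Uri $url -OutFile "C:\myPath\configBundle-$($esx.Name).tgz" # step 2: plain HTTP download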

The error message above shows that there was a problem downloading the TGZ file. With the help of the network admins, it quickly became clear what had happened: my workstation, from which I issued the PowerCLI command, had tried unsuccessfully to establish an HTTP connection to the ESXi host. The connection was blocked by a firewall rule.

I wondered why the transfer is handled over unencrypted HTTP. The firewall log shows connection attempts from the workstation to the ESXi host over both HTTP and HTTPS.

Is there a way to force the download using HTTPS?

My first thought was that there might be a parameter to the command that enforces the HTTPS protocol. A query in the VMTN forum unfortunately brought some disillusionment.

It is a bit surprising that VMware uses an unencrypted protocol for such sensitive data, all the more so since the PowerCLI session to vCenter already runs over HTTPS anyway. The most plausible explanation is that securing the transfer via SSL was simply ‘forgotten’ in this quite old command.

So currently there is no choice but to create a firewall rule that allows the download via HTTP.

vExpert 2023 – Subprogram Nominations

VMware annually grants the vExpert award to individuals who have made a special contribution to the VMware community, be it through publications, presentations, blogs, or work in the VMware User Group (VMUG). I am pleased to be part of the vExpert community for the seventh year in a row in 2023.

In addition to the general vExpert award, there are subprograms for specialized areas.

I applied for the three subprograms vExpert PRO, Application Modernization and Multi-Cloud and was accepted in all three categories.

vExpertPro

The mission of the vExpert PRO program is to create a global network of vExperts willing to find new vExperts in their local communities, support them, and mentor them on their way to becoming vExperts.

For this purpose, vExpert PROs exist in many regions of the world. I have been a member of this group since 2021 and have been confirmed for another year.

vExpert Multi-Cloud

The multi-cloud area covers large parts of the VMware compute portfolio. The term cloud includes not only the public cloud, but also local data centers (private cloud) and combinations of both approaches (hybrid cloud). This includes numerous products such as vSphere, vSAN, VMware Cloud Foundation (VCF), Aria, VMware Cloud on AWS, Site Recovery Manager (SRM) and vCloud Director (VCD).

I submitted my first application for this relatively new vExpert path in 2023 and was accepted. Many thanks to the business unit for the decision.

vExpert Application Modernization

Application Modernization is all about Tanzu and Kubernetes, as well as the ecosystem around these technologies. The background was described in great detail by Keith Lee in his article “Announcing the VMware Application Modernization vExpert Program 2023“.