Teach-The-Expert: vSAN Diskgroup Management on CLI

As part of my work as a trainer, I often come across questions on topics that are only covered in passing or not at all in the course. This series of articles provides trainee IT experts with tools for everyday use.

Intro – What are Diskgroups?

VMware vSAN OSA (Original Storage Architecture) structures the vSAN datastore into disk groups (DG). Each vSAN node can contain up to 5 disk groups, and each disk group consists of exactly one cache device (SSD) and between one and seven capacity devices. The capacity devices may be either magnetic disks or SSDs, but not a combination of the two. We differentiate between the cache tier and the capacity tier.

Disk groups can be managed using the graphical user interface (GUI). However, there are situations where disk group management on the command line interface (CLI) is necessary or more appropriate.

UUID

Each disk device of a vSAN cluster (OSA) has a universally unique identifier (UUID).

We can list all devices of a vSAN node on the CLI with this command:

esxcli vsan storage list

The sheer amount of information may be a bit much, so we display only the lines containing a UUID.

esxcli vsan storage list | grep UUID

We receive a list of all disk devices in the vSAN node, along with the UUID of the disk group to which each device is assigned.

If you take a closer look at the output, you will notice that some devices have a UUID identical to the UUID of their disk group. Does this contradict the statement that the UUID is unique? No. These are cache devices. Each disk group in vSAN OSA contains exactly one cache device, and the disk group adopts the UUID of its cache device. In this way, we can quickly distinguish a cache device from a capacity device.
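For a quick overview, the relevant fields can be filtered in one go. The field names below match the esxcli output of recent 7.x and 8.x builds, but may differ slightly between versions, so treat this as a sketch rather than a guaranteed recipe:

# show device name, role and both UUIDs; a cache device is one where both UUIDs match
esxcli vsan storage list | grep -E "Device:|Is SSD:|VSAN UUID:|VSAN Disk Group UUID:"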


Cluster Retreat Mode gone wrong – vSphere Client lockout

With the release of vSphere 7.0 Update 1, vSphere Cluster Services VMs (vCLS) appeared in vSphere clusters for the first time. They make cluster functions such as the Distributed Resource Scheduler (DRS) independent of the availability of the vCenter Server Appliance (VCSA), which otherwise still represents a single point of failure in the cluster. By outsourcing the DRS function to the redundant vCLS machines, a higher degree of resilience has been achieved.

Retreat Mode

The vSphere administrator has little influence on the provisioning of these VMs. Occasionally, however, it is necessary to remove them from a datastore, for example when the datastore is to be put into maintenance mode. For this purpose there is a procedure that puts the cluster into Retreat Mode: a temporary advanced setting causes the cluster to delete its vCLS VMs.

According to the VMware procedure, the domain ID of the cluster must be determined to activate Retreat Mode. The domain ID is the numerical value between ‘domain-c’ and the following colon in the vSphere Client URL while the cluster is selected. In the example from my lab it has the value 8, but the number can also have four or more digits.
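If you prefer not to fish the ID out of a URL, the vSphere Automation REST API returns it as well. The following sketch assumes vSphere 7.0 U2 or later (where the /api endpoints are available) and uses a placeholder hostname and credentials:

# create an API session; the call returns a token string
curl -sk -X POST -u 'administrator@vsphere.local:password' https://vcsa.lab.local/api/session

# list all clusters; the "cluster" field contains the domain ID, e.g. "domain-c8"
curl -sk -H "vmware-api-session-id: <token>" https://vcsa.lab.local/api/vcenter/cluster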

The domain ID has to be transferred to the Advanced Settings of the vCenter.

config.vcls.clusters.domain-c8.enabled = false
Correct Retreat Mode settings.

An admin error occurred during the activation of Retreat Mode.

After activating Retreat Mode on a vSAN cluster, administrators lost all privileges on all objects in the vSphere Client.

A review of the services showed that the vCenter Server Daemon (vpxd) was not running.
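On the appliance shell, the service state can be checked and the daemon restarted with service-control. This is only a sketch of the first diagnostic steps; depending on the root cause, starting vpxd may fail again until the configuration error is corrected:

# check whether the vCenter Server daemon is running
service-control --status vpxd

# try to start it again
service-control --start vpxd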


ESXi Boot Media – New Requirements for v8

The requirements for ESXi boot media changed fundamentally with ESXi v7 Update 3. The partitioning was changed and the endurance requirements for the medium increased as well. I covered this in my blog article “ESXi Bootmedia – New features in v7 und legacy issues from the past v6.x”.

USB boot media did not turn out to be robust enough and were therefore no longer supported from v7U3 onwards. It is still possible to install ESXi on USB media, but the ESX-OSData partition needs to be redirected to permanent storage.

Warning! USB media and SD cards should not be used for production ESXi installations!

Valid setup targets for ESXi deployments

SD cards and USB media are unsuitable as installation targets due to their poor write endurance. Magnetic disks, SSDs and SATA DOMs (disk-on-modules) are still permitted and recommended.

SATA-DOM on a Supermicro E300-9D

New requirements from version ESXi v8 onwards

My Homelab previously used vSAN 7 and thus the classic OSA architecture. To run the cluster under the new vSAN ESA architecture, it was necessary to use vSphere 8 and new storage devices.

I tested the installation and hardware compatibility on a 64 GB USB medium (not recommended and not supported!). During the installation, there were warnings regarding the USB medium as expected. Nevertheless, I was able to successfully test the detection of the NVMe devices and the vCenter deployment.

Setup warning when trying to use a USB flash medium.

Having successfully completed the test phase, I installed ESXi 8U2 on the SATA DOM of my Supermicro E300 server. To my surprise, the setup failed at a very early stage with the message: “disk device does not support OSDATA“.

RTFM

The explanation is simple: “Read the fine manual!”

My 16 GB SATA DOM from Supermicro was simply too small.

The setup guide for ESXi 8 clearly states the requirements under “Storage Requirements for ESXi 8.0 Installation or Upgrade“:

For best performance of an ESXi 8.0 installation, use a persistent storage device that is a minimum of 32 GB for boot devices. Upgrading to ESXi 8.0 requires a boot device that is a minimum of 8 GB. When booting from a local disk, SAN or iSCSI LUN, at least a 32 GB disk is required to allow for the creation of system storage volumes, which include a boot partition, boot banks, and a VMFS-L based ESX-OSData volume. The ESX-OSData volume takes on the role of the legacy /scratch partition, locker partition for VMware Tools, and core dump destination.

VMware vSphere product documentation

In other words: new installations require a boot medium of at least 32 GB (128 GB recommended), while an upgrade from an ESXi v7 installation requires at least 8 GB, but the OSData partition of that installation must already have been redirected to an alternate storage device.
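Before attempting an upgrade, it is worth checking where the boot banks and the OSData volume of the existing installation actually live. A quick sketch from the ESXi shell (the generated volume names will look different on every host):

# list the boot banks and the OSDATA volume with their sizes and backing volumes
esxcli storage filesystem list | grep -iE "BOOTBANK|OSDATA"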

Dirty Trick?

Needless to say, I tried a dirty trick. I first successfully installed an ESXi 7U3 on the 16 GB SATA DOM and then performed an upgrade installation to v8U2. This attempt also failed, as the OSData area was not redirected in the fresh v7 installation.

I don’t want to install on USB media as I have seen too many cases where these devices have failed. The only option is to invest in a larger SATA DOM.

I opted for the 64 GB model because it is a good compromise between minimum requirements and cost-effectiveness.

vSAN 8 Update 2 – what’s new on technical level

Earlier this month I had the opportunity to participate in the VMware Exclusive Blogger Early Access Program. This is where VMware experts share news and upcoming features for products before the actual release date.
I would like to thank Pete Kohler and John Nicholson for the feature briefing on the release of vSAN 8 Update 2. I’d also like to thank Heather Haley from VMware’s global communications team and Corey Romero for helping to make it happen.

[This blogpost was under embargo until 22nd of August 5:00 AM PDT (14:00 CEST)]

VMware vSAN 8 Update 2 delivers a range of new features and improvements in flexibility, performance and ease of use.

New ESA Features at a glance

  • vSAN ESA storage-only clusters with vSAN Max
  • VMware Cloud Foundation Support for vSAN ESA in VCF 5.1
  • vSAN file services now available on vSAN ESA
  • New ESA Ready Node Profiles for small environments
  • Support for a new class of read intensive storage devices
  • Auto policy remediation
  • ESA Prescriptive Device Claim
  • Stretched cluster and 2-node cluster improvements
  • Up to 500 VMs per host (vs 200 before)