Automatic VLAN assignment and use of DHCP relays
Software-defined datacenters (SDDC) enable us to keep many components within the hypervisor's software layer. But sooner or later we need to exit that layer in order to get in touch with the user. Usually thin clients or zero clients are used as VDI endpoints. Those hardware boxes are connected by LAN and need an IP address.
I will demonstrate how to separate endpoints into subnet segments and VLANs and still assign their IP addresses from a centralized DHCP server.
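As a rough illustration of the idea: a DHCP relay forwards the client's request to the central server and puts its own address on the client's segment into the relay agent address field (giaddr), which the server can use to choose the matching scope. The sketch below models only that lookup. The VLAN IDs, subnets and the function name `scope_for_relay` are made up for illustration and are not taken from the article.

```python
import ipaddress
from typing import Optional

# Hypothetical scopes (VLAN ID -> subnet); the values are illustrative only.
SCOPES = {
    10: ipaddress.ip_network("10.0.10.0/24"),
    20: ipaddress.ip_network("10.0.20.0/24"),
}


def scope_for_relay(giaddr: str) -> Optional[int]:
    """Return the VLAN ID whose subnet contains the relay address, or None."""
    addr = ipaddress.ip_address(giaddr)
    for vlan_id, subnet in SCOPES.items():
        if addr in subnet:
            return vlan_id
    return None


if __name__ == "__main__":
    # A request relayed from the gateway of the second segment.
    print(scope_for_relay("10.0.20.1"))
```

A request arriving with a relay address outside every configured subnet yields `None` here; a real DHCP server would simply not answer it.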
Continue reading “Automatic Segmentation of VDI Endpoints”
Malfunction is worse than failure
Redundancy is key in virtual environments. If one component fails, another will jump in and take over. But what happens if a component does not really fail, but isn’t working properly any more? In this case the failure isn’t easy to detect.
I recently got a call from a friend who had suddenly lost all file shares on his (virtual) file server. I opened a connection to a service machine and started some troubleshooting. These were the first diagnostic results:
- The file server did not respond to ping.
- Ping to gateway was successful.
- Name resolution against virtual DC was successful.
- A browser session to vCenter failed and vCenter did not respond to ping.
It is a small two-node cluster running on vSphere 6.5 U2. Maybe one ESX host had failed? But then HA should have restarted all affected VMs. That was not the case. So I pinged both hosts and got an instant reply. No, it did not look like a host crash.
Next I opened the host client to have a look at the VMs. All VMs were running.
I opened a console session to the file server and could not log in with domain credentials, but could with a local account. The file server looked healthy from the inside.
Now it became obvious that there was a problem with networking. But all vmnics were active and link status was “up”. The virtual standard switch on which the VM-Network portgroup resided had 3 redundant uplinks with status “up”. So where’s the problem?
I found another VM on the same host as vCenter and the file server that responded to ping and had internet connectivity.
I opened an RDP session to it and from there I was able to ping every VM on the same host. Even vCenter could be reached by browser. Now the picture became clearer. One of the uplinks must have a problem, although it didn’t fail. But which one?
Continue reading “Troubleshoot vmnic malfunction”
What is beacon probing?
In my recent blog article “ESX physical uplink resiliency” I discussed countermeasures to harden vSphere traffic against downstream physical failures. Today I will discuss another failover detection method, one that can handle uplinks that are not yet dead but not functional either.
Reasons for failure can be driver / firmware related errors on the NIC itself, or a broken downstream path (cable / switch).
Beacon probing is a mechanism in which an ESX host sends beacon packets out of every uplink port once per second and listens for them on the other uplinks of the team, to verify that the uplinks can reach each other.
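To make the idea concrete, here is a simplified model of the decision logic. It is a sketch and not VMware's actual implementation. The function name `suspect_uplinks` and the way received beacons are recorded as (sender, receiver) pairs are assumptions for illustration. An uplink is flagged when none of its beacons arrive anywhere and it hears none from its teammates.

```python
from typing import List, Set, Tuple


def suspect_uplinks(uplinks: List[str], heard: Set[Tuple[str, str]]) -> List[str]:
    """Return uplinks that neither reach nor hear any other team member.

    `heard` holds (sender, receiver) pairs for every beacon that arrived.
    """
    suspects = []
    for nic in uplinks:
        others = [o for o in uplinks if o != nic]
        reaches_any = any((nic, o) in heard for o in others)
        hears_any = any((o, nic) in heard for o in others)
        if not reaches_any and not hears_any:
            suspects.append(nic)
    return suspects


if __name__ == "__main__":
    # Three uplinks: vmnic0 and vmnic2 see each other, vmnic1 is isolated.
    team = ["vmnic0", "vmnic1", "vmnic2"]
    print(suspect_uplinks(team, {("vmnic0", "vmnic2"), ("vmnic2", "vmnic0")}))

    # Two uplinks, no beacons arrive: both are flagged, so the result is ambiguous.
    print(suspect_uplinks(["vmnic0", "vmnic1"], set()))
```

The second call shows why a team of only two uplinks is a poor fit for this method: when the beacons stop arriving, there is no third uplink to tell which of the two is the faulty one.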
Continue reading “ESX physical uplink resiliency (part 2)”
Host upgrades with custom images offer extended driver support for vendor-specific hardware or agents. You’ll get drivers that are not included in a standard VMware (vanilla) image. However, upgrading with customized images can cause trouble when existing driver packages are updated. There used to be a nasty bug with the lsiprovider package on Fujitsu ESXi 5.1 images. Another example was the “death by upgrade” bug (blog post in German) when upgrading a customized Fujitsu installation to ESXi 6.0. There are other examples from different vendors in the hall of shame.
Continue reading “Upgrade ESXi 6.5 with Fujitsu Custom Image”