
Fixing these dratted Unknown vSAN Objects

Like a few folks out there, I’ve got my own personal home lab. I never thought I would need one until my current job, because my previous jobs always gave me access to boatloads of equipment.

Let me cut to the chase: I screwed my vSAN installation up M.A.N.Y. times and ended up with both Unknown and Unhealthy vSAN Objects. I knew what they belonged to and quite frankly didn’t care, since I didn’t need those VMs anymore. However, the error they threw in my VCSA was an eyesore and it annoyed my OCD self.

In the end, this is how you delete them and make the error go away. I am unable to provide you the actual screenshots because I did it late one night when the eyesore errors got the better of me and I just banged it out.

Also, this was my way of avoiding the Ruby vSphere Console (RVC), but if you need to go down that path, see my credits down below.

Let me first say that I have no idea why my objects ended up this way. I see some folks online post that old policies that no longer existed and/or policy mismatches caused their errors.

So you see a red or yellow flag right on your cluster and the cluster summary complains about vSAN. Click on Cluster -> Monitor -> vSAN -> Health.

Oh yeah, this is all vSphere 6.7.0 by the way.

In this list, you have something in red that says a vSAN Object Health Test has failed.

Mine had Reduced Availability with no rebuild on 2 disks.

I had already removed the offending VM from Inventory, so there was nothing to reference. I just happened to know that the disks belonged to the bad VM. All I had to go on was the vSAN UUID.

If you now navigate to Cluster -> Monitor -> vSAN -> Virtual Objects, you will see all the actual objects of every VM in this vSAN Cluster. If this is new to you, it’s because vSAN is actually object-based storage. Not sure what that is? Google it. Suffice to say, it’s not the standard VMFS or NFS datastore where it’s just an open spot to hold random things. It’s more like S3, where everything is tagged and has its place.

Your unhealthy and healthy objects will all be listed here. Some of you may already know this view; I point it out because it is the easiest place to find the UUID. The command line interface for locating the UUID makes my eyes burn a little. The GUI is easier on my eyes.

Ok, now I assume you know which objects are the ones you want to get rid of.

Next, you really need to know which host those objects exist on, because you’re going to SSH into that host to delete the object. Again, you can do this from the command line in RVC, but it makes my eyes burn. (vsan.cmmds_find -u UUID cluster-name)

In the GUI, check the box of the object you want to locate and click Placement Details above.

Now you will see the object’s components listed along with the host each one sits on.

Under Host, note the name or IP. In my case, it’s the IP of the host.

Now you are ready to do some damage, so make sure you have a backup handy and you know what you’re going to do next weekend if you fat-finger it.

SSH into the host where the object you want to delete is listed.

Either navigate to this directory and run ./objtool from there: /usr/lib/vmware/osfs/bin

Or alternatively, run it with the full path, replacing UUID with the value you copied from the GUI:

/usr/lib/vmware/osfs/bin/objtool delete -u UUID -f -v 10
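Since a bad paste is the easiest way to fat-finger this, here is a minimal sketch of a sanity check you could run before the real thing. It assumes the UUID has the usual 8-4-4-4-12 hex layout you see in the GUI; is_uuid and show_delete_cmd are hypothetical helper names of my own, not part of objtool. It only prints the command so you can eyeball it first.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if $1 looks like an 8-4-4-4-12 hex UUID.
# Assumption: vSAN object UUIDs copied from the GUI use this layout.
is_uuid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Hypothetical helper: prints the delete command for review instead of running it.
show_delete_cmd() {
    if is_uuid "$1"; then
        echo "/usr/lib/vmware/osfs/bin/objtool delete -u $1 -f -v 10"
    else
        echo "refusing: '$1' does not look like a UUID" >&2
        return 1
    fi
}
```

Paste those two functions into the SSH session, call show_delete_cmd with the UUID you copied, and only run the printed line once it looks right. A stray space or a half-copied UUID gets refused instead of handed to objtool.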

I find it much easier to copy the UUID from the H5 GUI, alt-tab to the SSH session, and paste it in, because I had about 20 of them to delete.
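If you have a pile of them like I did, something along these lines could save some alt-tabbing. This is only a sketch, not what I actually ran: uuids.txt is a hypothetical file with one UUID per line, delete_objects is a made-up function name, and it assumes every UUID in the file belongs on the host you are logged into. By default it just prints the commands; it only executes them when you pass run as the second argument.

```shell
#!/bin/sh
# Path to objtool as used in the command above.
OBJTOOL=/usr/lib/vmware/osfs/bin/objtool

# delete_objects FILE [run]
# Hypothetical helper: reads one UUID per line from FILE.
# Prints each objtool command; executes it only when the second argument is "run".
delete_objects() {
    file=$1
    mode=${2:-dry}
    while IFS= read -r uuid || [ -n "$uuid" ]; do
        # Skip blank lines and lines starting with '#'.
        case $uuid in
            ''|'#'*) continue ;;
        esac
        if [ "$mode" = run ]; then
            "$OBJTOOL" delete -u "$uuid" -f -v 10
        else
            echo "$OBJTOOL delete -u $uuid -f -v 10"
        fi
    done < "$file"
}
```

Run delete_objects uuids.txt first and read the output. When every line is an object you really want gone, run delete_objects uuids.txt run.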

Once done, refresh the GUI to ensure the unwanted objects are no longer listed. Go to Cluster -> Monitor -> vSAN -> Health and click Retest. Everything should clear up now.

Credit should totally go to the following:

https://www.thinkcharles.net/blog/2018/2/16/removing-inaccessible-objects-in-vsan