Unable To Delete Hyper-V Root Snapshot in Hyper-V Manager

During a build-out for a customer it became necessary to move some virtual machines between a Hyper-V 2012 cluster and a Hyper-V 2012 R2 cluster, but when I tried to do so, all sorts of nasty errors cropped up:

Live Migration Error Due To Differencing Disk

Error (12700)
VMM cannot complete the host operation on the host1.contoso.com server because of the error: Virtual machine migration operation for ‘MachineToMove.contoso.com’ failed at migration destination ‘host2.contoso.com’. (Virtual machine ID 1D5042AA-1A93-4635-9F0A-F7C7B0D10BDD)

Failed to access disk ‘C:\ClusterStorage\Volume2\MachineToMove.contoso.com\Windows Server 2012 DC with SP1_disk_1_3F40B5A6-E8DC-4752-873C-D9742C9419F4.avhdx’: ‘The system cannot find the file specified.’ (‘0x80070002’).
Unknown error (0x800b)

Error (23753)
The virtual machine or tier load balancer configuration requires an IP pool and there are no appropriate IP pools accessible from the host.

Recommended Action
Select a host with access to an appropriate IP pool and try the operation again.

Live Migration Error Due To Differencing Disk 2

Error (12700)
VMM cannot complete the host operation on the MachineToMove.contoso.com server because of the error: Virtual machine migration operation for ‘MachineToMove.contoso.com’ failed at migration source ‘Host1’. (Virtual machine ID 1D5042AA-1A93-4635-9F0A-F7C7B0D10BDD)

Virtual machine migration for ‘MachineToMove.contoso.com’ failed because configuration data root cannot be changed for a clustered virtual machine. (Virtual machine ID 1D5042AA-1A93-4635-9F0A-F7C7B0D10BDD)
Unknown error (0x8005)

Recommended Action
Resolve the host issue and then try the operation again.

You may notice in the top error that the disk path points to an odd file name ending in .avhdx. Looking at the settings for the machine in Hyper-V Manager and inspecting the disk, we find:

Live Machine Properties

Lo and behold, it’s a differencing disk. Let’s try removing the snapshot that created it:

Hyper-V Snapshot Missing Delete

And there’s the problem – no delete option!

Let’s look at the snapshot in PowerShell. To do so, open an elevated PowerShell session on a machine with the Hyper-V PowerShell tools installed and run:

Get-VMSnapshot -VMName MachineToMove.contoso.com -ComputerName host1.contoso.com | fl

Here’s the output for the above VM:

SnapshotType : Recovery
VMId : 1d5042aa-1a93-4635-9f0a-f7c7b0d10bdd
VMName : MachineToMove.contoso.com
State : Off
Key : Microsoft.HyperV.PowerShell.SnapshotObjectKey
IsDeleted : False
ComputerName : host1.contoso.com
Id : 4382dc53-2fdd-476f-91b8-81963c292d24
Name : MachineToMove.contoso.com - Backup - (1/16/2014 - 6:00:19 PM)
Version :
Notes : #CLUSTER-INVARIANT#:{434c76e7-5581-463a-b1b4-71027d39770f}
Generation :
Path : C:\ClusterStorage\Volume2\MachineToMove.contoso.com
CreationTime : 16/01/2014 20:22:24
IsClustered : True
SizeOfSystemFiles : 49254
ParentSnapshotId :
ParentSnapshotName :
MemoryStartup : 8589934592
DynamicMemoryEnabled : False
MemoryMinimum : 536870912
MemoryMaximum : 1099511627776
ProcessorCount : 4
RemoteFxAdapter :
NetworkAdapters : {MachineToMove.contoso.com}
FibreChannelHostBusAdapters : {}
ComPort1 : Microsoft.HyperV.PowerShell.VMComPort
ComPort2 : Microsoft.HyperV.PowerShell.VMComPort
FloppyDrive : Microsoft.HyperV.PowerShell.VMFloppyDiskDrive
DVDDrives : {DVD Drive on IDE controller number 1 at location 0}
HardDrives : {Hard Drive on IDE controller number 0 at location 0, Hard Drive on SCSI controller
number 0 at location 0}
VMIntegrationService : {Time Synchronization, Heartbeat, Key-Value Pair Exchange, Shutdown...}

Time to remove it:

Get-VMSnapshot -VMName MachineToMove.contoso.com -ComputerName host1.contoso.com | Remove-VMSnapshot

You can run this command while the machine is running. If you watch the VM in Hyper-V Manager afterwards, you’ll see the differencing disk quickly merge into its parent and then the recovery-point snapshot disappear. Once the merge has finished, migrating the VM should go without a hitch.
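Before a migration it can be worth checking a whole host for the same condition rather than waiting for the error. The following is a minimal sketch, not a definitive check: the host name is the one used in this post, and it assumes that a disk path ending in .avhd or .avhdx means a differencing disk is attached.

```powershell
# Host to inspect; taken from the example above, replace with your own.
$hostName = 'host1.contoso.com'

# VMs whose attached disks are differencing disks (.avhd / .avhdx).
Get-VM -ComputerName $hostName |
    Get-VMHardDiskDrive |
    Where-Object { $_.Path -match '\.avhdx?$' } |
    Select-Object VMName, ControllerType, ControllerNumber, ControllerLocation, Path

# Recovery-type snapshots, which Hyper-V Manager will not let you delete.
Get-VM -ComputerName $hostName |
    Get-VMSnapshot |
    Where-Object { $_.SnapshotType -eq 'Recovery' } |
    Select-Object VMName, Name, SnapshotType, CreationTime
```

Anything returned by the second query is a candidate for Remove-VMSnapshot as shown above, provided no backup job is still using it.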

What caused this?

In this instance, the environment is running HP Data Protector 8.0, which is HP’s incredibly powerful (albeit rather old-looking) backup platform. The environment had been configured to back up the machines in the Hyper-V cluster using the HP StoreVirtual P4000 VSS/VDS Providers along with Application Aware Snapshot Manager. As I understand it, this uses differencing disks so that incremental backups can be achieved – they’re merged and renewed during each Full backup. This is why you see the word “Backup” in the snapshot name along with the date and time that Data Protector took the backup.

Hyper-V 2012 -> 2012 R2 Cluster Migration Issues

Quick post more to document an oddity than anything…

I’ve been migrating machines from a 2012 cluster to a 2012 R2 cluster using VMM 2012 R2, with mixed results. In particular, I’m seeing the machines duplicated in Failover Cluster Manager: one of the two seems to be the live machine and the second, prefixed with SCVMM (as all VMM-created machines are), seems to be broken with various errors such as ID 21502 “Missing or invalid virtual machine ID resource property”. Simply remove the duplicate starting with SCVMM and all seems to be OK.

Odd though.
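If there are a lot of machines to check, the duplicates can be listed from PowerShell instead of being hunted down in Failover Cluster Manager. This is only a sketch: the cluster name is hypothetical, and it assumes the duplicates show up as cluster groups whose names start with SCVMM, as described above.

```powershell
Import-Module FailoverClusters

# Hypothetical cluster name; replace with the destination 2012 R2 cluster.
$cluster = 'cluster2.contoso.com'

# List every cluster group with the SCVMM prefix along with its owner and state,
# so the broken duplicate can be told apart from the live machine.
Get-ClusterGroup -Cluster $cluster |
    Where-Object { $_.Name -like 'SCVMM*' } |
    Select-Object Name, OwnerNode, State
```

The sketch deliberately only lists the groups. Since healthy VMM-created machines carry the same prefix, check the state of each entry before removing anything.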

SCVMM 2012 R2 Bare Metal Deploy WinRM Error

I’ve been getting to know SCVMM much better, in particular the ability to provision new hosts using the iLO port on a fresh HP server and I found this problem that the search engines don’t seem to have an answer for.

Towards the end of the deploy process, after the OS is installed, joined to the domain and the agent is installed, it stops with this error:

VMM Bare Metal Deploy Error

Error (20552)
VMM does not have appropriate permissions to access the resource C:\Windows\system32\qmgr.dll on the server.domain.com server.

Recommended Action
Ensure that Virtual Machine Manager has the appropriate rights to perform this action.

Also, verify that CredSSP authentication is currently enabled on the service configuration of the target computer server.domain.com. To enable the CredSSP on the service configuration of the target computer, run the following command from an elevated command line: winrm set winrm/config/service/auth @{CredSSP="true"}

As a result the network connections and a few other bits don’t correctly apply but the host does appear in VMM.

Looking at the host properties, you can see it’s a WinRM issue:

VMM Bare Metal Deploy Error 2

Error (20506)
Virtual Machine Manager cannot complete the Windows Remote Management (WinRM) request on the computer server.domain.com.

Recommended Action
Ensure that the Windows Remote Management (WinRM) service and the Virtual Machine Manager Agent service are installed and running. If a firewall is enabled on the computer, ensure that the following firewall exceptions have been added: a) Port exceptions for HTTP/HTTPS; b) A program exception for scvmmagent.

Having checked all of the obvious, including that WinRM is enabled as it should be, GPOs aren’t getting in the way and firewall rules are set up to allow the traffic, I took a look at the security log on the new host:

VMM Bare Metal Deploy Error 3
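The same check can be made from PowerShell on the new host. This is a rough sketch under a couple of assumptions: the account name is hypothetical, and the relevant entries are taken to be the standard logon events (4624 for a successful logon, 4625 for a failed one).

```powershell
# Hypothetical account name; replace with the Run As account from the Host Profile.
$account = 'svc-domainjoin'

# Pull recent logon events from the local Security log (requires elevation)
# and keep only those that mention the account.
Get-WinEvent -FilterHashtable @{
        LogName   = 'Security'
        Id        = 4624, 4625          # assumed event IDs: logon success / failure
        StartTime = (Get-Date).AddHours(-4)
    } |
    Where-Object { $_.Message -match [regex]::Escape($account) } |
    Select-Object TimeCreated, Id, @{ Name = 'Summary'; Expression = { ($_.Message -split "`r?`n")[0] } }
```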

In the Microsoft Documentation, it says very specifically that when creating a Host Profile for the deployment, the Run As account that you use to join the host to the domain should have “very limited privileges” and “should be used only to join computers to the domain”. Hence the dedicated Domain Join account I used.

So why is this account logging into the server after deployment? A quick trip to the host properties reveals the answer:

VMM Bare Metal Deploy Host Properties

D’oh! Nicely done SCVMM.

Go back into the Host Profile:

VMM Bare Metal Deploy Host Profile

And there is our Domain Join account. Create a new Run As account with the appropriate permissions to administer newly created hosts (unfortunately this is possibly Domain Admins, depending on your environment), update the Host Profile, redeploy the host and you should be good. Please note that you cannot use the SCVMM service account for this task; it has to be a separate account.

HP MSM720 Wireless Controller Factory Reset & Firmware Bug

I couldn’t find any correct documentation on how to reset the configuration of an HP MSM720 Wireless Controller without using the web interface, so I had to figure it out for myself. The issue that led to me needing to do this is in the second half of this post. Here’s how you do it:

Connect via serial to console port

Here’s a screenshot from PuTTY with everything you need to know:

MSM 720 Serial Settings

Reset the configuration

Type in the following commands to clear the configuration and reboot the device:

enable
config
factory settings

What doesn’t work

The documentation talks about using the Reset and Clear buttons together to return the device to factory defaults. Here’s a picture of the device:

MSM 720 Front Panel Features

What you actually find when you try it is that there isn’t a clear button, only a hole. I’ve seen this on at least two different controllers so this definitely isn’t a manufacturing fault but I’ve no idea why. What this means, of course, is that you have to use the CLI method above to reset the device.

The original issue

I install a lot of HP MSM equipment and am used to the more than occasional idiosyncrasy (more on that another time) but by and large they do what you tell them to do. This one had me stumped. Consider the following:

Access Network VLAN IP: 10.100.1.10/24
Internet Network VLAN IP: 10.100.99.10/24
Default Gateway IP: 10.100.99.254
Static Route: Destination 10.0.0.0/8, next-hop 10.100.1.254

Here you have the basic information for the initial configuration of an MSM720 with an “inside” and “outside” network assuming that the internal LAN/WAN is based on 10.x.x.x addresses and the internet is available through the gateway on the Internet VLAN at 10.100.99.254. You need the static route as you can’t have two default gateways and need the controller to be able to talk to the APs across the internal networks. All completely textbook.

Unfortunately, when I configured these settings, in this order, on a controller that came out of the box running version 5.7.1.1, the controller stopped responding as soon as the static route was applied. Power cycling the box appeared to work, but I still couldn’t ping the device on the LAN or Internet VLANs, although the console was perfectly responsive once I’d figured out the very odd serial settings.

After resetting the box, I upgraded it to V6.0.0.1 and went through the above steps with no issues this time. It’s also my understanding that this issue is fixed in 5.7.3.0 but I’ve not fully tested this.

Exchange 2013 PowerShell Unavailable

The problem with the bleeding edge is that there’s not a huge amount of supporting knowledge that goes with it as not many people have experienced all of the corner cases that Google/Bing so helpfully index for us when we have a problem. Here’s one that hit me today.

After completing a migration to Exchange 2013, I wanted to install Remote Desktop Gateway and Remote Desktop Web roles onto the same server as Exchange 2013 was running on since this is only a small environment and it didn’t make sense to provision separate servers for what will only be used by a handful of people. After doing this, I couldn’t open the Exchange 2013 PowerShell console due to multiple occurrences of the following error:

VERBOSE: Connecting to CT-EXCH-01.corp.collective-tech.com.
New-PSSession : [ct-exch-01.corp.collective-tech.com] Connecting to remote server ct-exch-01.corp.collective-tech.com
failed with the following error message : The client cannot connect to the destination specified in the request.
Verify that the service on the destination is running and is accepting requests. Consult the logs and documentation
for the WS-Management service running on the destination, most commonly IIS or WinRM. If the destination is the WinRM
service, run the following command on the destination to analyze and configure the WinRM service: “winrm quickconfig”.
For more information, see the about_Remote_Troubleshooting Help topic.
At line:1 char:1
+ New-PSSession -ConnectionURI “$connectionUri” -ConfigurationName Microsoft.Excha …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : OpenError: (System.Manageme….RemoteRunspace:RemoteRunspace) [New-PSSession], PSRemotingTransportException
+ FullyQualifiedErrorId : CannotConnect,PSSessionOpenFailed

In addition, authentication on multiple virtual directories had been reset and the various services such as OWA were flaky at best.

Removing and recreating the PowerShell directory as per some suggestions didn’t help.

Resolution:

Exchange 2013 uses two web sites in IIS: one for the front end and one for the back end. The former uses the normal ports (80/443) and the latter normally uses incremented port numbers (81/444). For some reason, an additional binding for port 443 had been added to the back-end site. This was causing all HTTPS traffic for the front end to end up on the back-end site. To fix:

  • Open up IIS Manager
  • Navigate to the site “Exchange Back End”
  • Click “Bindings…” under “Actions” on the right-hand pane
  • Click the item with the values https/443 (not 444!)
  • Click “Remove” and then “Close”
  • Restart IIS
  • Make sure that both sites are started

From this point you should be able to connect to the Exchange Shell as normal and reset the authentication settings on the various virtual directories as required.
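The IIS Manager steps above can also be sketched in PowerShell using the WebAdministration module. Treat this as an illustration rather than a tested procedure: it assumes the back-end site is named “Exchange Back End” as above and that the stray binding is a plain https binding on port 443.

```powershell
Import-Module WebAdministration

# Show every binding on the back-end site; normally only http/81 and https/444.
Get-WebBinding -Name 'Exchange Back End' |
    Select-Object protocol, bindingInformation

# Remove the stray https binding on port 443 (leave 444 alone).
Remove-WebBinding -Name 'Exchange Back End' -Protocol https -Port 443

# Restart IIS, then confirm that both sites are started.
iisreset /noforce
Get-Website | Select-Object Name, State
```

Running the first command on its own before removing anything is a cheap way to confirm that the extra binding really is the problem.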

Error 1010 during Exchange 2010 SP3 Upgrade

During an upgrade of Exchange 2010 to Service Pack 3 on Windows 2008 R2 in preparation for an upcoming migration to Exchange 2013, the installation failed at the Language Files section and the following was logged in the setup log:

[04/03/2013 00:06:40.0161] [1] [ERROR] Unexpected Error
[04/03/2013 00:06:40.0161] [1] [ERROR] Performance counter names and help text failed to unload. Lodctr exited with error code ‘1010’.
[04/03/2013 00:06:40.0223] [1] Ending processing install-Languages

Resolution:

From an elevated command prompt, run the following:

lodctr /r

Re-run the Service Pack and it should complete this time.

2012 Core APC PowerChute Network Shutdown

Hit an issue with APC PowerChute Network Shutdown on Windows Server 2012 Core running Hyper-V:

PowerChute cannot communicate with the Network Management Card

PCNS is NOT receiving the data from the NMC.

The client installed successfully and its IP registered on the NMC, but PCNS wouldn’t connect.

Resolution:

Make sure that the firewall rule “PCNS NMC Communication Port (UDP 3052)” is enabled for all profiles, not just Public which is all that’s selected by default.

Here’s the full set of installation steps:

  1. Install the correct version of PCNS for Windows Server 2012 from the command line.
  2. Connect to the server via an MMC console using the Windows Firewall with Advanced Security snap-in.
  3. For each of the three PCNS rules, open properties, head to the Advanced tab and enable Private and Domain
  4. Connect to https://server:6547/
  5. Run through the configuration wizard as normal
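On a Core installation it may be quicker to change the firewall rules locally from PowerShell than to connect with an MMC snap-in. A minimal sketch follows; the only rule name taken from this post is “PCNS NMC Communication Port (UDP 3052)”, so the assumption that all three rules have display names starting with PCNS should be checked with the first command before running the second.

```powershell
# List the PCNS rules and the profiles they currently apply to.
# The 'PCNS*' wildcard is an assumption about how the installer names its rules.
Get-NetFirewallRule -DisplayName 'PCNS*' |
    Select-Object DisplayName, Enabled, Profile

# Enable the rules for every profile (Domain, Private and Public).
Get-NetFirewallRule -DisplayName 'PCNS*' |
    Set-NetFirewallRule -Profile Any -Enabled True
```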