
NetScaler HA on Microsoft Azure “Planned Maintenance”

by Saadallah Chebaro, CTA, UAE CUGC Leader

Citrix NetScaler High Availability on Microsoft Azure has never been an easy subject, especially since Microsoft added support for multiple IPs and NICs on Azure Virtual Machines a couple of months ago. The debate about how NetScaler HA should be configured still rages on today. Nevertheless, a recent announcement by Microsoft of a New Planned Maintenance Experience for Azure Virtual Machines could change all that. Let’s discuss the NetScaler HA options on Azure before diving into the New Planned Maintenance Experience, “Proactive-Redeploy.”

I have addressed NetScaler HA on Microsoft Azure in several blog posts before, and was actually scripting my way through a workaround very similar to what Microsoft has, fortunately, now delivered:

Citrix NetScaler HA on Microsoft Azure “The Multi NIC/IP Untold Truth”

Configuring Multiple VIPs for Citrix NetScaler VPX on Microsoft Azure ARM Cloud Guide

Configuring Active-Active Citrix NetScaler Load Balancing on Microsoft Azure Resource Manager

Configuring Multiple IP Addresses for Citrix NetScaler VPX on Microsoft Azure Resource Manager

Citrix NetScaler VM Bandwidth Sizing on Microsoft Azure

NetScaler in single IP mode can be configured for Active-Active or Active-Passive high availability on Azure using an Azure Load Balancer (ALB). The NetScaler has a single IP, the NSIP, which acts as an LB, AG, or any other NetScaler service supported on Azure, all on the same IP but on different ports. An Azure LB fronting both NetScalers in an availability set is used to PAT (Port Address Translation) ports to the NetScalers. This means the public IP:port published to users is assigned on the Azure LB, which in turn has rules to forward traffic to the NetScaler NSIP:port.
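
The port translation described above can be modeled in a few lines. This is only a sketch of the idea, not how Azure implements it: the addresses, ports, and the modulo used to pick a backend are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PatRule:
    frontend_port: int  # port published on the load balancer's public IP
    backend_port: int   # port the NetScaler service listens on, on its NSIP


# Hypothetical values, chosen only for illustration.
PUBLIC_IP = "203.0.113.10"
BACKEND_NSIPS: List[str] = ["10.0.0.4", "10.0.0.5"]
RULES: List[PatRule] = [PatRule(443, 8443), PatRule(80, 8080)]


def translate(frontend_port: int, flow_id: int) -> Optional[Tuple[str, int]]:
    """Return the (NSIP, port) that PUBLIC_IP:frontend_port is forwarded to."""
    for rule in RULES:
        if rule.frontend_port == frontend_port:
            # A real load balancer hashes the flow; a modulo stands in for that here.
            nsip = BACKEND_NSIPS[flow_id % len(BACKEND_NSIPS)]
            return nsip, rule.backend_port
    return None  # no matching rule, so nothing is forwarded


print(translate(443, 0))
print(translate(443, 1))
print(translate(22, 0))
```

Users only ever see the public IP and its published ports; each NetScaler only sees traffic arriving on its own NSIP at the backend port.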

NetScaler in Multi IP/NIC mode can only be configured for Active-Active high availability on Azure, again using an ALB. The reason is that Azure uses L3 network virtualization internally, so multiple IPs cannot float over the network and be taken over by the second NetScaler. The same restriction applies to single IP mode; however, in single IP mode the second NetScaler has one known, statically assigned IP, so it can take over the services, which differ only by port. That is why Active-Passive is supported in single IP mode but not in Multi IP/NIC mode.

NetScaler single IP mode has a lot of limitations: well-known ports (80, 443, and others) are already used by NetScaler itself (because only the NSIP is available), and GSLB is not supported, among other features. NetScaler Multi IP/NIC mode has neither the port nor the GSLB limitation, which makes it the preferred deployment model. Many industry experts chose a different route and use Azure Traffic Manager to handle HA for NetScaler services. That is also a viable option, but it incurs the same management overhead as an Active-Active HA deployment: each configuration change must be made manually on both NetScalers.
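
To make that overhead concrete, here is a minimal sketch of how one might spot a change that was applied to one NetScaler but not the other. It is not a Citrix tool: the two configuration excerpts are hypothetical, and a real comparison would first have to filter out lines that legitimately differ per node, such as node-specific IP addresses.

```python
from typing import List, Tuple


def config_drift(config_a: str, config_b: str) -> Tuple[List[str], List[str]]:
    """Return (lines only on A, lines only on B), ignoring blank lines and order."""
    lines_a = {line.strip() for line in config_a.splitlines() if line.strip()}
    lines_b = {line.strip() for line in config_b.splitlines() if line.strip()}
    return sorted(lines_a - lines_b), sorted(lines_b - lines_a)


# Hypothetical configuration excerpts, not taken from a real deployment.
NODE_A = """
add server web1 10.0.1.10
add serviceGroup web_sg HTTP
bind serviceGroup web_sg web1 80
"""
NODE_B = """
add server web1 10.0.1.10
add serviceGroup web_sg HTTP
"""

only_a, only_b = config_drift(NODE_A, NODE_B)
print("missing on B:", only_a)
print("missing on A:", only_b)
```

In this example the bind command was run on the first node only, which is exactly the kind of slip that manual, per-node changes invite.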

As of now, Microsoft is upgrading its Azure infrastructure to Server 2016, which most probably means those hosts will need to be restarted. They might be hosting your NetScaler VM, or any other VM for that matter. On top of that, the VMs themselves need to be restarted to benefit from the improved performance of the upgraded infrastructure. Most Microsoft updates use an in-place migration (VM Preserving Maintenance) to keep downtime minimal: the VM is paused for about 30 seconds until it moves to a different host that has already been updated. Some major updates or upgrades require a VM reboot, which was, until recently, a major hassle for single VMs, because Microsoft normally sends a notice 7 days in advance that the VM will incur downtime at a specific time that is out of your control. The downtime is normally about 2 minutes, but that is not guaranteed. It might also fall right in the peak of your working hours, so without an HA configuration in place there is no way to maintain your NetScaler services, as the VM will be shut down at that time.

Now comes the major new announcement by Microsoft: the New Planned Maintenance Experience, “Proactive-Redeploy.” Microsoft now gives you a window of one full month in which to choose, at your convenience, when to restart the VM. If you miss that 30-day window, Microsoft Azure will restart the VM automatically using normal scheduled maintenance. The cool thing is that everything is preserved in terms of networking, security, and other settings (except unsaved configuration and any DHCP-assigned IP, which should never be used on a NetScaler), so it’s basically like an offline migration of the VM to an already upgraded host.

“Planned maintenance that requires a reboot, is scheduled in waves. Each wave has different scope (regions).

The goal in having two windows is to give you enough time to start maintenance and reboot your virtual machine while knowing when Azure will automatically start maintenance.”
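
The two windows can be reasoned about with a small helper. This is a sketch under assumptions rather than anything Azure provides: the dates are invented, the 30-day length comes from the description above, and the 02:00 off-peak hour is a hypothetical preference.

```python
from datetime import datetime, timedelta
from typing import Optional


def maintenance_phase(now: datetime, window_start: datetime, window_end: datetime) -> str:
    """Classify where 'now' falls relative to the self-service window."""
    if now < window_start:
        return "not-open"
    if now <= window_end:
        return "self-service"      # you may still choose the restart time
    return "azure-scheduled"       # window missed, Azure picks the time


def next_off_peak_slot(now: datetime, window_end: datetime,
                       off_peak_hour: int = 2) -> Optional[datetime]:
    """Return the next off-peak hour that still falls inside the window, if any."""
    candidate = now.replace(hour=off_peak_hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate if candidate <= window_end else None


# Hypothetical dates for illustration.
start = datetime(2017, 9, 1)
end = start + timedelta(days=30)
now = datetime(2017, 9, 10, 14, 30)

print(maintenance_phase(now, start, end))
print(next_off_peak_slot(now, end))
```

A helper like this only decides when to act; the restart itself would still be triggered from the portal or from PowerShell.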

You can now view upcoming maintenance windows through Azure Service Health–Planned Maintenance.

Moreover, Monitor notifications can be configured so that all required team members receive the maintenance window timeline and it is not missed. This can be through email and SMS …

PowerShell can also be used to view and initiate proactive planned maintenance on Azure VMs:

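# Show the status of a single VM, including any pending maintenance
Get-AzureRmVM -ResourceGroupName "rgName" -Name "vmName" -Status
# Show the status of the VMs in a resource group
Get-AzureRmVM -ResourceGroupName "rgName" -Status
# Start the maintenance yourself, at a time of your choosing within the window
Restart-AzureRmVM -PerformMaintenance -Name "vmName" -ResourceGroupName "rgName"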

On a side note, most of the talk from Microsoft product managers seems to indicate that this is only available for West U.S. at this stage. However, neither the documentation nor the announcement seems to mention any limitation based on region.

My recommendation would be to use a single NetScaler in Multi IP/NIC mode on Microsoft Azure and use the 30-day maintenance window to restart the NetScaler VM when Azure requires it. This keeps every NetScaler feature supported on Azure available, with no additional management overhead or considerations.

Let me know your thoughts.

Thanks.
