My First Nutanix AHV/AOS Upgrade

by Ray Davis, CTA

We recently made a good investment in our VDI CVAD space: we purchased 28 Nutanix nodes. As this is our first upgrade, I wanted to break down everything we did. It took place over two days in a production environment, and I captured all the screenshots and documentation I could. Nutanix does a fantastic job of documenting everything, and I have to admit this was one of the smoothest upgrades in my entire career. This doc walks through what they already have written up. It’s purely AOS/AHV on Supermicro servers, with no VMware.

  1. Support numbers just in case you need help or get stuck
    • 1-855-NUTANIX
    • 1-855-688-2649
    • Either reference an old case or give a serial number.
    • Serial number for support = ######## (I use this to open cases)
  1. Use the Nutanix Check Sheet in OneNote to check off items you have upgraded
  2. https://portal.nutanix.com/page/documents/details?targetId=Acropolis-Upgrade-Guide-v5_19:upg-upgrade-recommended-order-r.html
  1. How the upgrades work
  1. Upgrade Matrix (check compatibility): compare the version you are on with the version you are going to. It will show you whether that upgrade is supported.
  1. The Upgrade Matrix will also give you an upgrade path for the products you are updating.
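
As a rough illustration of the question the Upgrade Matrix answers, the sketch below treats supported upgrades as a directed graph and searches for the shortest chain of hops between two versions. The version labels and the `SUPPORTED_HOPS` table are made up for illustration and are not real Nutanix releases; the authoritative data is only on the Nutanix portal.

```python
from collections import deque

# Hypothetical table: each version maps to the versions it can be upgraded
# to directly. These labels are placeholders, not real Nutanix releases.
SUPPORTED_HOPS = {
    "v1": ["v2"],
    "v2": ["v3", "v4"],
    "v3": ["v4"],
    "v4": [],
}


def find_upgrade_path(current, target, hops=SUPPORTED_HOPS):
    """Return the shortest list of versions from current to target, or None."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in hops.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no supported route in the table


print(find_upgrade_path("v1", "v4"))  # ['v1', 'v2', 'v4']
print(find_upgrade_path("v4", "v1"))  # None
```

A result with more than two entries means an intermediate version has to be installed first, which is the same thing the matrix shows when a direct upgrade is not supported.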


Example:

  1. We will be referring to the Nutanix Upgrade procedure here https://portal.nutanix.com/page/documents/details?targetId=Acropolis-Upgrade-Guide-v5_19:upg-upgrade-recommended-order-r.html
  2. We will log in to Prism Central with the admin account. The URL is https://Priscenter.Domain.com:9440
  1. Check the version of Prism Central you are on; the same page will show you the NCC and LCM versions.
  1. Prism Central (PC)
  2. Perform an LCM inventory, which also updates the LCM framework. Do not upgrade any other software component except LCM in this step.
  3. Upgrade and run Nutanix Cluster Check (NCC) on Prism Central.
  1. After these downloads, you can now upgrade.

 14. Click Yes to confirm.

  1. After about 15 minutes, it will be complete. Click the gear icon at the top right and select “About Nutanix” to check the version.
  1. Upgrade Prism Central
  1. PC: Upgrade Prism Central.
  2. Check compatibility https://portal.nutanix.com/page/documents/upgrade-paths
  1. Downloading
  1. Now click Pre-Upgrade to run the pre-upgrade checks on their own (the upgrade itself will run them again anyway).
  1. Now upgrade PC
  1. This will take about 30 minutes.
  1. This will occur as well.
  1. After it’s finished, let’s check it.
  2.  Looks good.
  1. Prism Element clusters: upgrade LCM and NCC. You can use LCM > Software to perform the rest of the updates.
  2. Perform an LCM inventory, which also updates the LCM framework. Do not upgrade any other software component except LCM in this step.
  3. PE: Run and upgrade Life Cycle Manager (LCM)
  1. Log in to PE and upgrade NCC and Foundation.

You will see it being downloaded.

Now click upgrade

PE: Upgrade Foundation.

PE: Run and upgrade Life Cycle Manager (LCM)

  1. File Server (Nutanix Files) Software

Installing (or Upgrading) Files

What happens when I click “Upgrade Now”?

  • First, the pre-upgrade checks will run to make sure that the cluster is able to be upgraded. If any of the pre-upgrade checks fail, you will see information about this in Prism and the actual File Server upgrade will not start. Users will have to click “Back to Versions” and start the upgrade again after the issue reported by the pre-checks is resolved. To see the full list of pre-checks and their related KB articles, check out KB-6524.
  • Once the File Server upgrade begins, each File Server VM is upgraded one at a time onto the new Nutanix Files version. While an FSVM is down for the upgrade, users connected to shares hosted by this node may experience a loss of connectivity for a duration of roughly 20-30 seconds. After this short period, another FSVM will pick up hosting those shares, and users will regain access to their files.
  • After each FSVM completes its reboot onto the new version of Nutanix Files, the upgrade will make sure that it can once again host shares before starting to upgrade the next FSVM.
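
The one-at-a-time behavior described above can be sketched as a loop with a health gate between nodes. This is only a simulation of the idea, not how Nutanix implements it: `rolling_upgrade`, the `upgrade` and `is_healthy` callbacks, and the FSVM names are all hypothetical stand-ins.

```python
def rolling_upgrade(nodes, upgrade, is_healthy):
    """Upgrade nodes one at a time; stop as soon as one fails its check.

    Returns (completed_nodes, failed_node_or_None).
    """
    completed = []
    for name in nodes:
        upgrade(name)
        if not is_healthy(name):
            # Leave the remaining nodes untouched while one is unhealthy.
            return completed, name
        completed.append(name)
    return completed, None


# Simulated run with stand-in callbacks (no real cluster involved).
touched = []
completed, failed = rolling_upgrade(
    ["fsvm-1", "fsvm-2", "fsvm-3"],
    upgrade=touched.append,                    # pretend to upgrade
    is_healthy=lambda name: name != "fsvm-2",  # pretend fsvm-2 fails
)
print(completed, failed)  # ['fsvm-1'] fsvm-2
print(touched)            # ['fsvm-1', 'fsvm-2']
```

The point of the gate is that `fsvm-3` is never touched in the failing run, so the shares always have at least one healthy FSVM to move to.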

How long does it take? About 20 minutes per File Server VM.

  • Upgrade FSM: This will take 20-30 minutes

The File Server Module (FSM) manages the Files lifecycle and appears in LCM. The FSM includes the Files UI component but relies on AOS for the control plane.


From <https://portal.nutanix.com/page/search/list?stq=FSM>

  1. File Analytics

View all tasks to see the progress.

  1. Cluster Maintenance Upgrade

Go to LCM > Software > Cluster Maintenance

Cluster Maintenance would have appeared here, but we had already upgraded it, so I can’t currently get a screenshot.

  1. AOS Software

Upgrade Prerequisites

What happens when I click “Upgrade Now”?

  • First, the pre-upgrade checks will run to make sure that the cluster is able to be upgraded. If any of the pre-upgrade checks fail, you will see information about this in Prism and the actual AOS upgrade will not start. Users will have to click “Back to Versions” and start the upgrade again after the issue reported by the pre-checks is resolved. To see the full list of pre-checks and their related KB articles, check out KB 6524.
  • Next, the AOS software is copied to each CVM (Controller VM) in the cluster.
  • In the last stage, the Controller VMs in the cluster reboot one-at-a-time onto the new AOS version. Storage traffic from User VMs will be redirected to a neighboring CVM while the local one is upgrading. During this short period (about 10 minutes) the local User VMs may experience a small amount of additional latency since they are receiving their storage I/O from a remote CVM.

How long does it take? 15-20 minutes per node. The upgrade process in a two-node cluster will take longer than the usual process because of the additional step of syncing data while transitioning between single and two node state. Nevertheless, the cluster remains operational during upgrade.
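
To turn the per-node figure into a rough maintenance window, the arithmetic can be sketched as below. It assumes every node is upgraded serially in a single cluster, which this post does not state for our 28 nodes, and it ignores the pre-checks and the software copy stage. The function name `upgrade_window_hours` is just an illustration.

```python
def upgrade_window_hours(nodes, minutes_per_node=(15, 20)):
    """Return (low, high) hours for a strictly one-node-at-a-time upgrade."""
    low, high = minutes_per_node
    return nodes * low / 60, nodes * high / 60


low, high = upgrade_window_hours(28)
print(f"{low:.1f} to {high:.1f} hours")  # 7.0 to 9.3 hours
```

The same helper works for the Files figure quoted earlier, for example `upgrade_window_hours(3, (20, 20))` for a made-up count of three FSVMs at 20 minutes each gives one hour. Nodes spread across several clusters that are upgraded at the same time would shorten the window accordingly.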

From <https://portal.nutanix.com/page/documents/kbs/details?targetId=kA00e000000LMgICAW>

  • Perform available firmware updates (BIOS/BMC/Host boot drive or other critical firmware as recommended by LCM).

After upgrading AOS and before upgrading your hypervisor on each cluster, perform a Life Cycle Manager (LCM) inventory, update LCM, and upgrade any recommended firmware. See the Life Cycle Manager documentation for more information.

PE: Run and upgrade Life Cycle Manager (LCM):

  • Perform an LCM inventory (also updates LCM framework).
  • Upgrade SATA DOM firmware (for hardware using SATA DOMs) as recommended by LCM.
  • Upgrade all other firmware as recommended by LCM (BMC / BIOS / other).

For release-specific information (all branches), see the Life Cycle Manager Release Notes.

LCM performs two functions: taking inventory of the cluster and performing updates on the cluster.

From <https://portal.nutanix.com/page/documents/details?targetId=Acropolis-Upgrade-Guide-v5_19:upg-firmware-upgrades-c.html>

  • AHV hypervisor upgrade
  1. Upgrade FSM and FA. I don’t have screenshots for this, but you can see both in LCM. Do FSM first and FA next. It’s straightforward from the GUI.

You will see this. Just ignore it because it’s a part of FA.

36. Run another LCM inventory to make sure all upgrades are good.
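
For anyone who prefers a script to a OneNote check sheet, here is a minimal sketch of the same idea. The step list follows the order in which the sections appear in this write-up and is my own summary; defer to the recommended-order page in the Nutanix Upgrade Guide for the authoritative sequence. `UPGRADE_ORDER` and `next_step` are names made up for this example.

```python
# Steps in the order they appear in this write-up (a summary, not an
# official Nutanix list).
UPGRADE_ORDER = [
    "PC: LCM inventory",
    "PC: NCC",
    "PC: Prism Central",
    "PE: LCM inventory and NCC",
    "PE: Foundation",
    "Nutanix Files",
    "FSM",
    "File Analytics",
    "Cluster Maintenance",
    "AOS",
    "Firmware recommended by LCM",
    "AHV",
]


def next_step(done, order=UPGRADE_ORDER):
    """Return the first step in order that is not checked off, or None."""
    for step in order:
        if step not in done:
            return step
    return None


done = {"PC: LCM inventory", "PC: NCC", "PC: Prism Central"}
print(next_step(done))  # PE: LCM inventory and NCC
```

Because `next_step` always returns the earliest unchecked item, a step ticked off out of order does not hide an earlier one that was missed.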

That concludes the upgrade process that I went through. I didn’t see any performance impact, and it took me about 27 hours straight. It would have been faster if I could have taken CVAD offline while doing this, but as many of you know, most of the time that is not ideal. Overall, I score Nutanix at 100% for the KISS method.
