The Art and Science of Upgrading Infrastructure Services

How can upgrading a service infrastructure be an art AND a science?  I just click a button, the stuff upgrades, and I’m good… right?  We’re talking infrastructure services here, people.  If the infrastructure is unavailable, the business loses money.

If I asked 100 admins, “Do you enjoy testing software upgrades in the lab?”, all 100 would say, “NO.”  But guess what? Testing is more important than actually performing the upgrade.

What do I test?

  • The hardware and software you’re upgrading – You can’t test if you don’t have an environment. It doesn’t have to be a mirror image of production, but you do need similar hardware and software, even if at reduced capacity.
  • Test matrix with success criteria – Having a matrix of what you’ve tested and whether it passed (simple acceptance criteria) is essential. It’s a big CYA (cover your behind) move: if management asks, “Did you test XYZ?”, you can say, “YES!” Using DNS, DHCP, and IP address management as an example, your test matrix should include things like:

o    Upgrading your DNS primary and DHCP server(s) from version A to version B
o    Testing a variety of services and devices, if available
o    Validating that any customizations still work, including API environments

  • Having an upgrade document – You’re testing the upgrade in your lab, performing steps that will simply need to be repeated in production, so why not document them? This ensures that nothing gets overlooked or forgotten during the production upgrade. You might also be able to use this document to support your change request.
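
A test matrix doesn’t need fancy tooling. Here’s a minimal sketch in Python of what tracking one could look like. The `MatrixRow` class, the helper functions, and the sample rows are all hypothetical and only loosely based on the DNS/DHCP examples above; adapt the rows and criteria to your own environment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MatrixRow:
    """One row of the upgrade test matrix (hypothetical structure)."""
    name: str
    criteria: str                  # simple acceptance criteria
    passed: Optional[bool] = None  # None means not tested yet


def summarize(matrix: List[MatrixRow]) -> dict:
    """Count passed, failed, and untested rows."""
    return {
        "passed": sum(1 for row in matrix if row.passed is True),
        "failed": sum(1 for row in matrix if row.passed is False),
        "untested": sum(1 for row in matrix if row.passed is None),
    }


def ready_for_production(matrix: List[MatrixRow]) -> bool:
    """Ready only when the matrix is non-empty and every row has passed."""
    return bool(matrix) and all(row.passed is True for row in matrix)


# Hypothetical rows for illustration only.
matrix = [
    MatrixRow("Upgrade DNS primary from version A to version B",
              "Server answers queries after the upgrade", passed=True),
    MatrixRow("Upgrade DHCP server from version A to version B",
              "Clients obtain leases after the upgrade", passed=True),
    MatrixRow("Custom API scripts",
              "Scripts run without errors"),  # not tested yet
]

print(summarize(matrix))
print(ready_for_production(matrix))
```

With one row still untested, the matrix reports that it isn’t ready for production. So when management asks, “Did you test XYZ?”, you have an answer on record either way.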

What’s my upgrade strategy?
Alright, so you’ve tested the upgrade in the lab in your “spare” time and everything is good. Now do you upgrade with a slow roll, or do a fork-lift upgrade?

  • Slow-roll and segmentation of upgrades, if possible – Again, we’re talking about business-critical core services here. Doing a fork-lift upgrade and then having to revert a large portion of network infrastructure can be painful.  That said, maintenance windows for infrastructure services are hard to come by, and it’s not always possible to slow-roll. If you have to do a fork-lift upgrade, it’s all the more important to test your upgrade meticulously beforehand.
  • Aligning resources – Most, if not all, enterprises have multiple data centers in multiple locations. It’s important that you’ve got hands and feet ready to hit the DC if problems arise. It’s also important to line up resources from your network teams, firewall/security teams, etc.
  • Go/no-go checkpoints – What happens if you’re a couple of hours into your six-hour maintenance window and you know you simply won’t come close to completing your task? There’s no sense in completing more of the work when you’ll simply need to revert. Decide ahead of time where your checkpoints fall and what would make you call a no-go.
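
To make the go/no-go call less of a gut feeling, you can write the rule down. The sketch below is one possible rule, not a standard: it assumes you call “go” only if the remaining work still fits in the window with enough time left over to roll back. The function name, the times, and the one-hour rollback estimate are hypothetical.

```python
from datetime import datetime, timedelta


def go_no_go(now, window_end, remaining_work, rollback_time):
    """Return "go" if the remaining work fits in the window while still
    leaving enough time to roll back; otherwise return "no-go".

    Assumption: rollback time is reserved at the end of the window.
    """
    latest_safe_finish = window_end - rollback_time
    return "go" if now + remaining_work <= latest_safe_finish else "no-go"


# Hypothetical six-hour window starting at 22:00.
start = datetime(2024, 1, 1, 22, 0)
window_end = start + timedelta(hours=6)

# Checkpoint two hours in: four hours of work left, one-hour rollback.
checkpoint = start + timedelta(hours=2)
print(go_no_go(checkpoint, window_end,
               remaining_work=timedelta(hours=4),
               rollback_time=timedelta(hours=1)))  # prints "no-go"
```

In this example the work would finish right as the window closes, leaving no time to roll back, so the call is no-go.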

Planning for the worst
We know that nothing will go wrong with your upgrade, especially since you’re running BlueCat gear.  But you still need to plan for the worst and make sure all your bases are covered – no one ever got in trouble for being prepared.

  • Backup resources – What if something unexpected happens during your six-hour maintenance window? Say the network team forgot to inform you of a core router change, the network is down, and you can’t validate your upgrade.  If it takes the network team hours to fix their problems, you’ll have been awake for 24+ hours.  People make mistakes when they’re tired.  Having a backup resource available and up to speed is a good plan.
  • Roll-back – What if you need to roll back? Do you need “boots on the ground” at remote sites?  How long will it take? Do you have the files at the ready if needed? Have you tested the roll-back? Does anything else need to happen after the roll-back has taken place?

Engage the upgrade ninjas
Upgrading infrastructure isn’t something that happens often – maybe once or twice a year. Engaging the BlueCat “upgrade ninjas” will help you navigate through a successful upgrade. Here at BlueCat, we’ve got a handful of teams – from Professional Services to our Technical Account Management teams – that have the field experience and solid, vetted methodologies to ensure a successful upgrade.

Alright, I’m upgraded.  Now what?
OK, your upgrade is done. No alerts have fired. Nothing seems to have blown up. What’s next? VALIDATION! Everyone loves validation – you know, checking log files, running some tests, working with other operational teams, and ensuring applications are up and running. When doing a slow-roll upgrade, having some burn-in time before your next major upgrade will be the ultimate validation that your upgrade has been successful.
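
Part of “running some tests” can be scripted. As a rough sketch, here’s a Python check that compares what a set of names resolves to against what you expect. By default it uses the standard library’s `socket.gethostbyname`; the demonstration swaps in a stand-in resolver so it runs without a network. The `validate_dns` function, the hostnames, and the addresses are all hypothetical, and this is only a spot check, not a full validation suite.

```python
import socket
from typing import Callable, Dict, List


def validate_dns(expected: Dict[str, str],
                 resolve: Callable[[str], str] = socket.gethostbyname
                 ) -> List[str]:
    """Return a list of failure messages; an empty list means all passed."""
    failures = []
    for hostname, expected_ip in expected.items():
        try:
            actual_ip = resolve(hostname)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            failures.append(f"{hostname}: lookup failed ({exc})")
            continue
        if actual_ip != expected_ip:
            failures.append(
                f"{hostname}: expected {expected_ip}, got {actual_ip}")
    return failures


# Offline demonstration with a stand-in resolver (hypothetical names and
# documentation-range addresses), so the sketch runs without a network.
fake_records = {
    "app.example.test": "192.0.2.10",
    "db.example.test": "192.0.2.20",
}


def fake_resolve(hostname: str) -> str:
    if hostname not in fake_records:
        raise OSError("name not found")
    return fake_records[hostname]


expected = {
    "app.example.test": "192.0.2.10",
    "db.example.test": "192.0.2.99",  # deliberately wrong to show a failure
}
for failure in validate_dns(expected, resolve=fake_resolve):
    print(failure)
```

Run the same list of expected records before and after the upgrade, and any difference is something to chase down with the other operational teams.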

I’ll leave you with a quote from an unnamed samurai: “Cry in the dojo, laugh on the battlefield.”  It’s a mantra that the upgrade ninjas at BlueCat try to live by.

