Last updated on September 13, 2021.
Critical Conversations on Critical Infrastructure Ep. 5: “Cloud provider DNS: what should architects know?”
Cloud service providers do a great job of abstracting the complexity that is inherent in DNS. While this breaks barriers to innovation, it also breaks architectures. As enterprises lean further into their cloud adoption, what should architects know about working with cloud service provider (CSP) DNS services in a hybrid cloud environment?
CSP DNS is a different beast than enterprise DNS. Whether the provider is Microsoft Azure, Amazon Web Services (AWS), or Google Cloud, each cloud DNS service comes with its own set of capabilities and limitations to understand and consider.
This roundtable of networking and cloud experts explored the challenges around getting CSP and enterprise DNS to work together effectively.
Panelist: Scott Penney, VP of Strategy, BlueCat
On-premises and CSP DNS are not either/or
Wertkin set the stage by asking why it’s inevitable that companies will need to use both traditional and CSP DNS.
Often a company gets into the public cloud through shadow IT, Swinford observed. Somebody gets some credits, starts to build something, and it accidentally becomes a production application. And that triggers the need for security and governance, and for hooking it up to the corporate network.
“We need all this other reachback to our on-prem things in our data centers,” he said. “And that’s when cloud-native starts to become a little less native.”
At first, many companies believe that they can just use their internal corporate DNS alongside the CSP. However, Sharp pointed out that that’s not a good idea. Just as enterprise DNS is tightly integrated into internal systems, CSP DNS is integrated into the provider’s systems.
“(Customers) go into their cloud systems, they change the DNS, and then some sort of cloud-native service breaks,” she said.
Even BlueCat came to that realization, Wertkin said. Although its original goal was to replace every other kind of DNS, reality intruded. There are cloud services that are tightly bound to and require the use of the CSP’s DNS.
But, he cautioned, that doesn’t mean you should deploy everything in CSP DNS. The challenge is to figure out how to integrate internal DNS with CSP DNS.
The importance of intentional DNS architecture
Panelists ticked off a list of the chaos that a wild west approach can cause: duplicated zones for no good reason, copied and pasted records, troublesome conditional forwarding rules, and an impossible-to-monitor network. Since DNS is an important tool in network protection and threat hunting, that disorder can also weaken security.
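One way to make the duplicated-zone problem visible is to audit a zone inventory for names that are hosted in more than one place. The following is a minimal sketch, not something the panelists described: the `find_duplicate_zones` function, the zone names, and the location labels are all hypothetical, and a real inventory would come from each provider's own tooling.

```python
from collections import defaultdict


def find_duplicate_zones(zones):
    """Return zone names that appear in more than one location.

    `zones` is an iterable of (zone_name, location) pairs. Both the
    names and the location labels are illustrative assumptions.
    """
    locations = defaultdict(set)
    for name, location in zones:
        # DNS names are case-insensitive and may carry a trailing dot,
        # so normalize before comparing.
        locations[name.lower().rstrip(".")].add(location)
    return {
        name: sorted(locs)
        for name, locs in locations.items()
        if len(locs) > 1
    }


# Hypothetical inventory gathered from on-premises and two cloud providers.
inventory = [
    ("corp.example.com", "on-prem"),
    ("corp.example.com.", "aws-private-zone"),
    ("apps.example.com", "azure-private-zone"),
    ("Apps.Example.com", "aws-private-zone"),
    ("db.example.com", "on-prem"),
]

duplicates = find_duplicate_zones(inventory)
for zone, places in duplicates.items():
    print(f"{zone} is defined in: {', '.join(places)}")
```

A duplicate is not always a mistake, but each one is worth a deliberate decision about which copy is authoritative.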
You have to think about a global DNS strategy, Sharp said. It’s not just names and IP addresses anymore. “It’s also, how do you incorporate that service into your overall security program?” she said. “And that’s just not something that you can overlook.”
It doesn’t help that each CSP has very different capabilities around DNS. There’s no single way to approach it. For example, O’Connor pointed out that Azure and AWS have different requirements around Active Directory. There is no way to hybridize that kind of approach, he added.
“If it’s routable on the network, and if it has a security toolkit on it, it requires authentication. You need to be able to resolve against an on-prem DNS server one way or the other,” he said.
Added Swinford, “We found that if we try to architect everything ahead of time, we’re going to get something wrong. So we just try to design in a way that allows us to just switch paths later. Because developers can build something quickly and then tear it down and rebuild.”
When she’s working with customers who are building foundational environments, Sharp said she has to learn not only about the technical environment on premises, but also about their organizational structure. “There’s a lot to think through and talk through,” she said. “This is a key part of that conversation from the networking side. What are you doing today? How does that integrate? What are you doing in other cloud providers?”
Architect for flexibility and change
Panelists agreed that, whatever you start with, it’s not engraved in stone. It will change. It must change. And it must be designed so it can change.
Swinford’s approach moved from simple forwarding to a more advanced architecture, and it’s still evolving. “It’s really driven by need,” he said, noting that the cloud platform lets developers rebuild quickly. But, he added, “With DNS, everyone’s relying on it as soon as you stand it up. So you don’t necessarily get that second chance. You need to build in that ability to change later.”
O’Connor agreed. When he started, the cloud team didn’t have the technical capability to architect DNS, so the design in place was his. The architecture document is a working document that can be adjusted to accommodate changes caused by things like new CSP capabilities.
“Unlike network architectures in the past, where you get it right and you’re good for a few years, we need to make sure [DNS] continues to allow for change, to facilitate the application, which is what the cloud is really about,” Wertkin noted. “That’s the other shift here, from ‘applications need to meet the needs of the network’ standpoint to ‘the application is king’. And the rest of the infrastructure needs to be able to flex to allow for it.”
Plan for connectivity failures
It doesn’t matter how well the DNS servers are running. In a hybrid environment, if you lose connectivity, you’re dead in the water.
“One of the key pieces for us has been, what if we lose connectivity to on-prem? Because it’s all well and good to set forwarders up and forward DNS traffic from a Route 53 instance or an Azure DNS instance across the Direct Connect or ExpressRoute to on premises. But the minute you lose that connection, you lose all the capability in the cloud,” O’Connor said.
A caching DNS server placed in the cloud can save the day by holding records until connectivity is restored, he said.
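The idea of a cloud-side cache that keeps answering while the link is down can be sketched in a few lines. This is an illustrative model only, not the panelists' implementation and not a real DNS server: the `StaleServingCache` class, the `on_prem_lookup` function, the hostname, and the address are hypothetical, and the upstream failure is simulated with a flag. It serves a fresh cached answer when it has one, asks upstream otherwise, and falls back to an expired answer only when upstream is unreachable.

```python
import time


class StaleServingCache:
    """Cache that serves expired answers when the upstream is unreachable."""

    def __init__(self, upstream, ttl_seconds=300, clock=time.monotonic):
        self._upstream = upstream      # callable: name -> answer
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the demo can move time
        self._entries = {}             # name -> (answer, expires_at)

    def resolve(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached and cached[1] > now:
            return cached[0], "fresh"
        try:
            answer = self._upstream(name)
        except ConnectionError:
            if cached:
                # Link is down: an expired answer beats no answer.
                return cached[0], "stale"
            raise
        self._entries[name] = (answer, now + self._ttl)
        return answer, "upstream"


# Simulated environment: a flag for the link and a manual clock.
state = {"link_up": True, "now": 0.0}


def on_prem_lookup(name):
    """Hypothetical stand-in for a query forwarded to on-premises DNS."""
    if not state["link_up"]:
        raise ConnectionError("link to on-premises is down")
    return "10.0.0.5"


cache = StaleServingCache(on_prem_lookup, ttl_seconds=60,
                          clock=lambda: state["now"])

first = cache.resolve("erp.corp.example.com")
state["link_up"] = False   # connectivity is lost
state["now"] = 120.0       # and the cached entry has expired
second = cache.resolve("erp.corp.example.com")
print(first, second)
```

Names that were never looked up before the outage still fail, which is one reason the dependency mapping discussed next matters.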
Dependency mapping is key, said Sharp. You have to know what is running where, and how each service interconnects with services everywhere else.
“DNS is a perfect example of that,” she said. “What happens if I lose access? My cloud resources may not be down, but if I can’t do name resolution, I’m still dead in the water.”
Wertkin agreed, saying it needs to go both ways. We like to think that cloud services don’t go down, but it happens. Understanding fault domains is critical, as is extending on-premises DNS into the cloud to ensure local resolution there and, in turn, ensuring that required names created in CSP DNS are exposed to the enterprise.
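Resolution in both directions usually comes down to conditional forwarding: each resolver decides where to send a query based on the domain being asked about. The sketch below shows one common way to make that decision, choosing the most specific matching suffix. It is an assumption-laden illustration rather than any vendor's behavior; the `pick_forwarder` function, the zone names, and the resolver labels are hypothetical.

```python
def pick_forwarder(query_name, rules, default):
    """Return the forwarder for the longest domain suffix that matches.

    `rules` maps a domain suffix to a forwarder label. Matching is done
    on whole labels, so "notcorp.example.com" does not match a rule for
    "corp.example.com".
    """
    labels = query_name.lower().rstrip(".").split(".")
    # Walk from the full name toward the top-level domain so the most
    # specific rule wins.
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in rules:
            return rules[suffix]
    return default


# Hypothetical rule set for a hybrid environment.
rules = {
    "corp.example.com": "on-prem-resolver",
    "cloud.example.com": "csp-resolver",
    "legacy.cloud.example.com": "on-prem-resolver",
}

for name in ("db.cloud.example.com",
             "app.legacy.cloud.example.com.",
             "www.example.org"):
    print(name, "->", pick_forwarder(name, rules, "internet-resolver"))
```

Keeping the rules in one reviewed table, rather than scattered across individual resolvers, makes overlaps such as the `legacy.cloud.example.com` exception easier to see.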
It’s all about governance
Governance no longer just applies to connectivity or name resolution, Swinford said. Now, it includes role-based access control, identity, and permissions.
Kroger has found ways to allow application teams to configure and populate CSP DNS zones with their application names while maintaining good governance. On the other hand, O’Connor said Zurich hasn’t yet done so because of the risk.
“Certain parts of our application stack rely heavily on the zurich.com DNS name. And that’s where a .com DNS name also happens to be an Active Directory domain name,” he said. “If we start to give people access in there, there could be an awful lot of problems around their authentication.”
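The trade-off described here, letting application teams publish names without exposing a zone that authentication depends on, can be modeled as a simple delegation check. This is a hypothetical sketch and does not represent how Kroger or Zurich handle it: the `is_within` and `may_edit` functions, the team names, and the zone names are all assumptions made for illustration.

```python
def is_within(name, zone):
    """True if `name` equals `zone` or is a subdomain of it.

    The comparison is aligned on label boundaries, so
    "evilpayments.cloud.example.com" is not inside
    "payments.cloud.example.com".
    """
    name = name.lower().rstrip(".")
    zone = zone.lower().rstrip(".")
    return name == zone or name.endswith("." + zone)


def may_edit(team, record_name, delegations):
    """Allow an edit only inside a zone delegated to the requesting team."""
    return any(is_within(record_name, zone)
               for zone in delegations.get(team, ()))


# Hypothetical delegations: application teams get their own subzones,
# while the apex that doubles as a directory domain stays with DNS admins.
delegations = {
    "payments-team": ["payments.cloud.example.com"],
    "dns-admins": ["example.com"],
}

print(may_edit("payments-team", "api.payments.cloud.example.com", delegations))
print(may_edit("payments-team", "_ldap._tcp.example.com", delegations))
```

Under this model the first request is allowed and the second is refused, because the record sits outside anything delegated to the payments team.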
Advice from the experts
The panelists offered advice for enterprises looking to integrate DNS as part of their cloud strategy.
Both Sharp and Swinford suggested asking these questions:
- Who are your clients?
- What services are you providing?
- How do you connect them up?
- Where do the clients live?
- How do they need to consume these services?
- How do your application teams need to publish those services?
Sharp added that application teams can’t assume that the services they will use in the cloud work the same way they do on premises or the way they did with another cloud provider.
“Something that we’re learning the hard way is, as the subject matter expert for your DDI environments, you need to go and you need to look at what it’s like to walk into a greenfield CSP, and all of the capability that you can have as an application owner in the greenfield CSP,” O’Connor said. “Because when they come to you looking to deploy DNS, they’re looking for that capability. They’re not looking for the corporate DNS to be put out in the cloud.”