The Future of DNS: Privacy, DoT & DoH
In the first episode of ‘The Future of DNS’ series, BlueCat CTO Andrew Wertkin explores DNS privacy and the role that DoT and DoH can play. He covers:
- A technical definition of DNS
- DNS resolution in various contexts
- The privacy problem
- DNS over TLS (DoT), DNS over DTLS (DoD), DNS over HTTPS (DoH)
Need a transcript? Read on.
Introduction – 00:00
The future of DNS is more of a series of presentations because there’s a lot that continues to move with DNS. And a lot of it obviously impacts BlueCat and how we go to business. So this is one of a series, and there’ll be other speakers and other topics. But today, we decided to focus on DNS over TLS and DNS over HTTPS, which are both emerging standards, and both of them are very focused on one thing: DNS privacy. (You know there’s a polling question.) And we’ll go through a bit of why that’s important. And it’s really more important in the consumer space than in the enterprise space, but certainly, it impacts us in the enterprise space as well. Okay?
A Brief Recap: What is DNS? – 00:54
So, we’re just going to take a step back and talk about what DNS actually is so that we have a little bit of context to talk about what DNS over TLS and DNS over HTTPS are. DNS is a hierarchical, decentralized, and distributed naming system. DNS is actually many things. It also specifies the technical characteristics of the database: how am I going to store this information and transfer this information? And it’s also about the protocol itself. There’s a DNS protocol. There are DNS servers that provide different roles. There’s the global domain name system that actually allows this decentralized database of names and, in many of the cases, their corresponding addresses.
Requests for Comments, or RFCs – 01:37
DNS is defined by Internet Engineering Task Force requests for comments (RFCs) and there are tons of these RFCs. These RFCs are important in, sort of, the mission of the IETF, [which] is to define standards so that there can be interoperability across the internet. If this stuff was proprietary, if there wasn’t agreement on these different standards, then the internet wouldn’t work. So all of the core protocols in how the internet actually works, whether it’s routing, whether it’s DNS, over and over and over again, will be defined by these RFCs that go through a lifecycle. There are many, many, many RFCs that touch on DNS. Some of these touch on best practices, some of these define specific protocols. Some of these define DNSSEC, and they build on each other. Some obsolete previous ones, some, again, suggest best practices. There’s a huge map of them.
If we just sort of look at the core basic DNS specification ones, starting in like 1987 or so, sorry, 1983, up until now, there’s just this huge expansion of them. And people talk about this idea of the DNS camel or this protocol has gotten so darn complicated, based on all of these emerging specifications that it’s difficult to, in many cases, keep up with the changes across the broad internet. Which is why there’s been things recently like DNS flag day, so that more changes can be made and sort of drum out the servers that are out there, that are not compliant with all of these changes going forward. Today, we’re going to focus on two of these, specifically, RFC 8484 and 7858, which define DNS, changes to the DNS protocol for privacy. And it’s part of a broader set of RFCs that talk about DNS privacy. You’re welcome to read these. I suggest doing it if you are looking to nod off.
A DNS Message is… – 03:41
What’s a DNS message? A DNS message is very well structured. It has a header, it has a question, answers, authorities for those answers, and additional information. Whether it’s a query or a response, this is what a DNS message looks like. The header details are also very broken down. And these things are almost impossible to change at this point if we’re going to keep compatibility across the broad internet. So the message basically stays the same and stays somewhat consistent. The founders, the original RFC writers, were smart enough to add some expansion capabilities so that additional bits can be added or utilized that weren’t necessarily reserved to begin with so that there can be some additions to the protocol over time.
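As a rough illustration of that structure, here is a minimal Python sketch that builds a query message by hand and reads the header back. The function names (`build_query`, `parse_header`), the query ID, and the use of `example.com` are assumptions made for this example, not anything shown in the talk. It covers only the fixed 12-byte header and a single question.

```python
import struct

def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query: a 12-byte header plus one question."""
    flags = 0x0100  # standard query with the RD (recursion desired) bit set
    # Header: ID, flags, then the four section counts (question, answer,
    # authority, additional), each a 16-bit big-endian integer.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: every label is prefixed with its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE (1 = A), QCLASS (1 = IN)
    return header + question

def parse_header(message: bytes) -> dict:
    """Decode the fixed header that every DNS query and response starts with."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),  # QR bit
        "rcode": flags & 0x000F,              # 0 = no error, 3 = NXDOMAIN
        "questions": qd,
        "answers": an,
        "authorities": ns,
        "additionals": ar,
    }

query = build_query("example.com")
print(parse_header(query))
```

The same bytes are what later get carried over TLS or HTTPS; only the transport around them changes.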
Whether we’re sending a DNS message over the DNS protocol or we’re sending it over TLS or HTTPS or SMS or email or whatever the case might be, this is what a DNS message is. The DNS protocol is plain text. This is a Wireshark packet capture of my laptop doing a DNS query, and the actual query itself is going to be byte encoded, but Wireshark is kind enough to decode and break down the actual DNS message, so I can see it like anybody can see it. As long as I’ve attached to that wire, I can see exactly what DNS is flowing through it. It’s, again, plain text. If I’m a DNS server, then I don’t need to necessarily go to the packet capture level, because this stuff is flowing through me, but it’s there. It’s plain text, anybody can see it.
DNS Query – 05:35
Here’s how a simple query works. Your local host, your phone, your laptop, whatever, has something called a stub resolver on it, which is a pretty dumbed-down DNS server that doesn’t know how to do a lot, because it’s running on a client operating system. We don’t want to configure a full, broader role for DNS there. That stub resolver knows how to issue a query to, let’s say, a recursive resolver, that is going to do all the work, the hard work to go get the answer to that query. That recursive resolver will go, for instance, to the internet and it will identify the authoritative server, where it is, and it’ll ask a question to that authoritative server, and now I’m going to get the answer to my question. Broadly, that’s how DNS works. Different roles for servers: I’ve got an authoritative name server, and I have a recursive resolver that’s going to cache answers, usually (it doesn’t have to), so that I don’t have to keep going and asking the internet, or the intranet, for the answer, and I’m going to return that to the stub resolver, which is my laptop.
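The division of labor described above can be sketched as a toy simulation in Python. Nothing here touches the network: the `AUTHORITATIVE` table, the `RecursiveResolver` class, and the `stub_resolve` helper are all hypothetical stand-ins, and a real recursive resolver walks the delegation hierarchy rather than reading a dictionary.

```python
import time

# Hypothetical authoritative data standing in for servers out on the internet.
# Maps a name to (address, TTL in seconds).
AUTHORITATIVE = {"www.example.com.": ("192.0.2.10", 300)}

class RecursiveResolver:
    """Does the hard work on behalf of stubs and caches what it learns."""

    def __init__(self, authoritative):
        self.authoritative = authoritative
        self.cache = {}            # name -> (address, expiry time)
        self.upstream_lookups = 0  # how often we had to ask the authority

    def resolve(self, name, now=None):
        now = time.monotonic() if now is None else now
        cached = self.cache.get(name)
        if cached and cached[1] > now:
            return cached[0]       # answered from cache, no upstream traffic
        self.upstream_lookups += 1
        record = self.authoritative.get(name)
        if record is None:
            return None            # no such name (NXDOMAIN in real DNS)
        address, ttl = record
        self.cache[name] = (address, now + ttl)
        return address

def stub_resolve(name, resolver):
    """The stub does no work of its own; it asks its configured resolver."""
    return resolver.resolve(name)

resolver = RecursiveResolver(AUTHORITATIVE)
print(stub_resolve("www.example.com.", resolver))  # first lookup goes upstream
print(stub_resolve("www.example.com.", resolver))  # second lookup hits the cache
print(resolver.upstream_lookups)
```

The cache is why a neighbor's earlier lookup can answer your query without another trip to the authoritative server.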
So now we have an answer to my query. Multiple servers are part of this process. Let’s go to a home user. When you’re at home, for well more than 99% of people, the stub resolvers on the things on your home network are going to hit your home router, which is basically a forwarder, and it’s going to forward that along to, by default, usually your internet service provider’s resolver. And your internet service provider’s recursive resolver is going to go get the answer and cache the answer. If your neighbor has already looked up this address, then it’s usually cached. They don’t need to go out to the internet. There’s an answer for it.
The majority of home users use their ISP’s DNS recursive resolver. Whether they know it or not, it’s set up for you when that thing boots. Historically, the ISPs have used this role to their advantage. One, obviously they need to build up a big, scalable, resilient system because they deal with potentially hundreds of thousands or millions of consumers, but they’ve done things to sort of muck with DNS. They monetized NXDOMAINs, and by that I mean: if you misspell an address as you’re trying to go to a specific website, there’s no answer to that domain, there’s no answer to that query. Instead of just sending you back the fact that there’s no answer, so your browser can say that site doesn’t exist, they send you somewhere else where they can imprint some ads and therefore they can make money, because companies are paying them to imprint their ads. They’ve been doing that for quite some time. Basically, they’re hijacking the response: you’re not getting the actual response, you’re getting the response they want you to get, because they can monetize it.
It’s also a ton of data that they can use. They can profile their customers, they know who’s going where. They know that from more than just DNS, but it’s so obvious and easy to access from DNS that they use it. They can change answers if they want to, and many of them do for their own advantage. It might be for their own network operations, but they can change the answer going to you as well. A lot of countries require specific blacklists, because users are not allowed to go to certain sites, whether it’s around pornography or gambling or whatever the topic might be, and DNS becomes a very simple place to do that. And it’s not just Indonesia, it’s the UK, which is now starting to put into place some laws around access to pornography, like an opt-in, and their plan was to use DNS as one of the major control planes for how to do that, because I can easily block answers based on DNS. China very famously has the Chinese firewall, which has a huge DNS component, where they can and do control access to social media or any other sites, different news sites. It’s, again, a very good control plane to do this and this is how it’s normally done.
Open Resolvers on the Internet – 09:42
Some users are a bit more advanced and they don’t necessarily want to go through that route, and so they configure their machines to go to an open resolver on the internet. And there are some very well-known ones: Quad8, which is Google’s open resolver, Quad1, which Cloudflare launched recently, Quad9 from a service called Quad9. And others like OpenDNS, for instance. So for savvy users that want some protection through DNS, not privacy, but some protection, these services, specifically Quad1 and Quad9, have active blacklists and they’ll block sites, and if you pay them some money then you can block porn and your kids can’t surf porn, for instance. They offer services based on DNS. These are free. There’s a variety of paid services out there and there’s a business around doing this. And to some extent, we do this with our DNS firewall and Edge, but we’ll get into that in a second. So I can bypass my ISP’s resolver that way. It doesn’t give me any privacy, and my ISP can still see all the DNS messages. So if they’re monetizing the data, they can still see those messages. Frankly, they can still change them if they want to, because the packets are going through their system. Something like DNSSEC would stop them from, well, it wouldn’t stop them from changing them, but it would invalidate the response.
But the majority of users are not behind DNSSEC-validating resolvers and don’t know if the answer got changed regardless. So it doesn’t stop them from changing it, but it does afford the users some more protection and if they’re a little bit more savvy, maybe they’ve set up their home router to go to one of these. So everything in their home network connects to one of these. For the corporate user, there’s a stub resolver again. That query is going to flow through one to many intermediate servers, depending on their DNS deployment architecture. Eventually, in their DMZ, it usually hits a recursive resolver and that recursive resolver will then go out and find the answer to that question. Now, they could also forward here to OpenDNS or another service on the internet, and our customers, some of them forward along to a service they pay for, some forward to Google’s free service. And many of them have BlueCat actually go resolve the answer for them and then return that all the way back to the client.
This is usually not allowed. Most corporations block any access to the public internet, other than through the allowed path. Very easy to do with firewalls and that’s how they set up their system. They set up their system that way because, one, from a security standpoint, they don’t want direct access out, but also because they’ve deployed this DNS architecture to distribute load along with their network deployment, and how they look at their enterprise architecture for networks. And so it’s part and parcel with how they’re trying to optimize what’s going on inside their company.
The Privacy Problem – 12:53
So what’s the problem? Some users want privacy. DNS messages are plain text, easy to eavesdrop on, we can attribute it back to the user device or close to it. DNS is a simple point of control. So ISPs, countries, governments can and do block certain content or change content. And this is a concern, especially in areas where censorship is an oppressive thing. Users want the ability to be able to access services without any attribution to them or their devices, for instance. From a corporate perspective, there’s no desire for privacy nor is there an expectation from the user on a corporate network that there is privacy. From a corporate perspective, great, the messages are plain text. This visibility is fantastic for net-ops and sec-ops. We know it’s the DNS protocol. We know what’s going on and we can utilize this data in a number of different ways.
We can attribute it to the user device. Fantastic. It helps us with network operations and security operations. Super. It’s a simple point of control. Fantastic. We can control the flow of DNS egress, we have an efficient layer in defense in depth. Query answers can be changed. Great. We have things that dynamically change query answers based on the health of backend applications. These are all good things from a corporate perspective. This is the basic rub, and this is the basic rub in the industry right now. You launch these protocols for DNS privacy, and as for whether or not DNS privacy gives you individual privacy, the answer to that is no, but we’ll get into that in a bit. These protocols get created, these RFCs get created and deployed, and from a corporate perspective, there’s nothing necessarily good about them, because I sort of depend on DNS being a control plane protocol that I can monitor, understand and change. So DNS over HTTPS and DNS over TLS are really focused here, between the stub resolver and the recursive resolver. This is where they tend to be focused, because they’re focused on the user, and this is what the user can control. They’re not focused here, because the user can’t control that. This is interesting though, and certainly there have been other attempts to do this, like DNSCrypt, which has been around for quite some time. And our friends at Cisco with Umbrella, with OpenDNS, use DNSCrypt as an example. And that will give an enterprise at least some level of privacy in terms of the queries they’re spitting out of their company. But the real focus of these RFCs is here.
DNS over TLS, or DoT – 15:56
So the first one is DNS over TLS. Encryption is provided by TLS, so now we have certificate-based encryption, and I can’t see these packets anymore. That removes any opportunity for eavesdropping or on-path tampering with DNS queries. Great. It sort of comes with two different profiles. One’s a strict profile, like if my queries aren’t encrypted, then fail them. Another one is an opportunistic one saying, okay, so if there’s a DNS recursive resolver in my path that supports TLS, use it, otherwise just fall back to normal DNS. These profiles have been defined with it. It’s stupid easy to block. It’s on a unique port. It’s on port 853, close the port, it’s blocked. From a corporate perspective, if they’re concerned about users using DNS over TLS, they just don’t open that port in the firewall, done.
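The difference between the two profiles comes down to what happens when no encrypted path is available. The Python sketch below captures only that decision; the `Profile` enum and `choose_transport` function are hypothetical names for this example, and the real profiles also cover how the server is authenticated, which is left out here.

```python
from enum import Enum

class Profile(Enum):
    STRICT = "strict"
    OPPORTUNISTIC = "opportunistic"

def choose_transport(profile: Profile, dot_available: bool) -> str:
    """Pick a transport for a query under the given privacy profile."""
    if dot_available:
        return "dns-over-tls"  # both profiles prefer the encrypted path
    if profile is Profile.STRICT:
        # Strict: failing the query is better than sending it unencrypted.
        raise ConnectionError("strict profile: no encrypted path, refusing to send")
    # Opportunistic: fall back to normal DNS on port 53.
    return "plain-dns"

print(choose_transport(Profile.OPPORTUNISTIC, dot_available=False))
```

With port 853 closed at a firewall, an opportunistic client quietly falls back to plain DNS, while a strict client stops resolving.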
If I execute a query, which here I did on my command line in debug mode, I can sort of see the certificates downloaded and their trust verified as part of the process. And then I get my normal answer to the query. I can see it, because I’m one endpoint, and so can the system I just connected to. In this case, I think I went to Quad1, so I went to Cloudflare’s DNS over TLS recursive resolver that’s available for free on the internet. And they can see it, because the encryption is terminated there and the encryption began with me. Nobody else along the line was able to see it. If I go back and look at Wireshark, I’m going to see a bunch of protocols that are around TCP and TLS versus DNS. We don’t know this is a DNS protocol anymore. It’s all going to be on port 853 and we’ll see a bunch of encrypted bytes that can’t be decrypted into a DNS message anymore. Nobody along the way can see that this is a DNS message, but we know it is because it’s on port 853.
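For readers who want to see roughly what such a client does on the wire, here is a minimal Python sketch using only the standard library. DNS over TLS reuses DNS-over-TCP framing, so each message is prefixed with a 2-byte length before it goes into the TLS session on port 853. The helper names are assumptions for this example, and the server address and hostname in the usage comment are likewise assumptions rather than details from the talk.

```python
import socket
import ssl
import struct

def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its length as a 2-byte big-endian integer."""
    return struct.pack("!H", len(message)) + message

def read_exact(sock, count: int) -> bytes:
    """Read exactly `count` bytes or fail if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full message arrived")
        data += chunk
    return data

def query_over_tls(message: bytes, server_ip: str, server_name: str,
                   timeout: float = 5.0) -> bytes:
    """Send one DNS message over TLS on port 853 and return the raw reply."""
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.create_connection((server_ip, 853), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=server_name) as tls:
            tls.sendall(frame_for_tcp(message))
            (length,) = struct.unpack("!H", read_exact(tls, 2))
            return read_exact(tls, length)

# Possible usage (needs network access; address and hostname are assumptions):
# reply = query_over_tls(dns_query_bytes, "1.1.1.1", "cloudflare-dns.com")
print(frame_for_tcp(b"abc"))
```

An observer on the path sees a TLS session to port 853 and nothing of the message inside it, which matches the Wireshark view described above.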
DNS over TLS is TCP. There’s also a new RFC that has a variant to do this over UDP, because UDP has an advantage over TCP, which over time is less and less important, based on how you tweak TCP. But regardless, why not do DNS over TLS via UDP? And there’s actually an RFC for that. DNS over DTLS, it’s called; for those that are interested, there’s the RFC number. And that actually makes it a bit closer to DNSCrypt. But basically it’s exactly the same as DNS over TLS, other than I’m using UDP instead of TCP. So therefore, it should be potentially faster and, more so from the operator’s perspective, maybe easier to scale, because I’m not worried about the burden of connections. UDP is sort of a fire-and-forget protocol. And with TCP, I’m now actually creating a connection between the two endpoints.
So if I’m a user and I want DNS privacy, how can I access this thing? The reality is, it’s not included in the stub resolvers. The operating systems don’t normally come with an option to just switch on DNS over TLS. Your ISP most likely doesn’t offer it. Some ISPs might. So therefore, I need to go muck with my operating system, install a proxy, install DNSCrypt, install something to give me this ability. So basically, I’ve now ruled out the vast majority of humans on this Earth from using this thing because they’re not going in and mucking around with their DNS, adding additional DNS capabilities at their operating system level.
There are a lot of public recursive resolvers that support DoT. Some promise not to log, but keep in mind, the encryption is terminated at this recursive resolver. So somebody knows what you’re doing. And depending on which recursive resolver you choose on the internet, you might choose one that is more abusive to your privacy than your ISP was. You don’t know. Certainly the better ones and the more commercial ones will have statements on privacy and what they do with their data, and it’s user beware. It’s like all the VPN services out there that people buy where they say they don’t log, and then it turns out they do log, because governments access their logs. It’s sort of buyer beware there. On Linux distros, systemd-resolved now actually supports DNS over TLS, but again, not for the masses, because the number of people that are running Linux at their home is ridiculously small. Consumer routers: some consumer routers, like home routers, might support it, or certainly you can go install one of the open source packages for Wi-Fi routers, for instance, and then go enable these services by installing additional components. Or, maybe they just allow for them. Great.
Again, this isn’t something that’s going to be widely adopted by people. It’s going to be adopted by very specific people who are looking for DNS privacy. There’s no easy way to just turn this on everywhere, which leads us to DNS over HTTPS.
DNS over HTTPS, or DoH – 21:31
Basic concept is the same: encrypt the query. With DNS over HTTPS, the query is again encrypted with TLS. It eliminates the opportunities for eavesdropping and on-path tampering. Fantastic. The difference is that the queries are sent through standard web protocols and ports. It’s port 443, it’s HTTPS. It looks like somebody is using HTTPS. It looks no different than any other HTTPS. Am I going to a website, am I using curl, am I executing a DNS query? I don’t know. As a network operator inside of our customers, that means if I want to actually see what’s there, I need to go figure out how to decrypt all of this data, and I’m already having a hell of a time decrypting standard web traffic, given how much web traffic is encrypted today. I think it’s up to more than 75% of web traffic that is encrypted, and our customers invest in a lot of stuff to try to decrypt, and they’re buying more cloud-based systems to try to do it. Hard to do, requires access to the endpoint and installing certs on the endpoint, but just the same, it’s doable, and certainly our customers might block well-known DNS over HTTPS services. However, it’s way harder to even detect that it’s DNS.
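To make the "it looks like any other HTTPS" point concrete, here is a hedged Python sketch of how a client might form an RFC 8484 style GET request: the binary DNS message is base64url-encoded without padding and placed in a `dns` query parameter. The function names are hypothetical, and the default endpoint URL is an assumption chosen for illustration. Nothing is sent over the network.

```python
import base64
import struct
import urllib.request

def build_query(name: str, qtype: int = 1) -> bytes:
    """Minimal DNS query; ID 0 is used so identical queries cache well over HTTP."""
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def doh_get_url(message: bytes,
                endpoint: str = "https://cloudflare-dns.com/dns-query") -> str:
    """Encode a DNS message as base64url (no padding) in the `dns` parameter."""
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"

def build_request(url: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTPS request for a binary DNS reply."""
    return urllib.request.Request(url, headers={"Accept": "application/dns-message"})

url = doh_get_url(build_query("example.com"))
request = build_request(url)
print(request.full_url)
```

From the network's point of view this is an ordinary HTTPS GET to port 443; the DNS question exists only inside the encrypted request.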
Here’s what it looks like. I’ve executed, behind the Zoom stuff, an HTTPS query. I’ve literally just dropped something in a browser. In this case, I’m specifically looking at the response, and the response is going to come back in JSON, although I have a couple of options, and here’s the response to the DNS query in JSON. I can see it because I’m the client, and so can the server I called. In this case, I used Cloudflare again. Cloudflare knows my query as well because they are a termination point for the encryption. And now the result of my query, same data, same header, same question, same authority, additional, it’s all there, but in this case delivered via JSON. It’s a different protocol to execute the same basic DNS flow. And for those that love UDP, UDP has advantages over TCP, so why don’t we do DNS over HTTPS via UDP with QUIC? And there’s a draft RFC for that. It hasn’t made it to, like, a proposed standard yet, but there’s a draft RFC, as those that are pushing QUIC are driving QUIC more and more into the mainstream.
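A JSON reply of the kind described can be picked apart with a few lines of Python. The sample body below is made up for illustration. Its field names (`Status`, `Question`, `Answer`, `TTL`, `data`) follow the JSON format that some public DoH services offer as we understand it, so treat them as an assumption and check the provider's documentation before relying on them.

```python
import json

# Illustrative reply body, not captured from a real resolver.
SAMPLE = """
{
  "Status": 0,
  "Question": [{"name": "example.com", "type": 1}],
  "Answer": [
    {"name": "example.com", "type": 1, "TTL": 300, "data": "192.0.2.10"}
  ]
}
"""

def addresses_from_json(body: str) -> list:
    """Return the IPv4 addresses (type 1 = A records) from a JSON DNS reply."""
    reply = json.loads(body)
    if reply.get("Status") != 0:   # non-zero mirrors a DNS error code
        return []
    return [rr["data"] for rr in reply.get("Answer", []) if rr.get("type") == 1]

print(addresses_from_json(SAMPLE))
```

The content is the same question and answer data a classic DNS response carries, just serialized differently.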
So same point though, how can I use DNS over HTTPS if I want DNS privacy? Is it vastly more accessible? If I wanted to go change the plumbing of the DNS subsystems in my operating system, it’s equally as inaccessible. It’s something that you don’t expect users to ever do. However, the browser vendors are adding direct support for DoH in their browsers. Some of the browser vendors already ignore the operating system’s stub resolver, but now they’re adding direct support for DNS over HTTPS such that a user may have no idea that they’re using it. Mozilla may start, or may already have recently started, optimistically searching for reachable DoH recursive resolvers and then enabling them by default. Their initial go-to-market partner for this is Cloudflare, with Quad1. They have an under 5% market share, but the second they do that, then the 5% of users that use Firefox would be using DoH, whether they knew it or not. That’s sort of scary. It’s not that big of a market share, but it’s something they can easily do. More scary is Google. Google provides a DNS over HTTPS service. Chrome has a 63% market share. Here’s a quote from APNIC, which I think is pretty good:
“If a browser chooses to use DoH as their default, then there’s little that the platform and the network can do to prevent it. If the browser has installed DoH support, then control over DNS name resolution function has passed from the user to the browser provider, and rather than being an esoteric function enabled by a handful of users,” the way we would do DoT or muck with the operating system, “It becomes a mainstream service potentially by billions of users.”
Tomorrow, Google can take their 63% market share of browsers out there and drive all of that DNS traffic to themselves. That’s a sort of scary thing when you start thinking what’s in it for them, but also what’s in it for the DNS in general, which tends to want to be decentralized. That’s pretty scary, especially since users don’t know, and the users that don’t know also don’t care. They want a good internet experience. Most users really aren’t even thinking about this from a privacy standpoint at all.