An Illustrated Introduction to DNS over HTTPS

Users face increasing privacy and security risks. At Mozilla, we pay close attention to these threats. We believe we have a responsibility to do everything we can to protect Firefox users and their data.

Some companies and organizations want to secretly collect user data and sell it. That’s why we added tracking protection and created the Facebook Container extension. We will be taking additional steps to protect user data in the coming months.

We are adding two more protections:

DNS over HTTPS (DoH), a new IETF standards effort that we have championed
Trusted Recursive Resolver (TRR), a new secure way to resolve DNS that we are providing in partnership with Cloudflare

With these two initiatives, we are closing data leaks that have been part of the domain name system since it was created 35 years ago. We’d like your help testing both of them. So let’s see how DNS over HTTPS and Trusted Recursive Resolver protect our users.

But first, let’s look at how web pages work on the Internet.

If you already know how DNS and HTTPS work, you can skip ahead to how DNS over HTTPS helps.

What is HTTP?

When we explain how browsers download web pages, we usually say something like this:

The browser makes a GET request to the server.
The server sends a response, which is a file containing HTML.

This system is called HTTP.
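As a rough sketch of what that exchange looks like as text, the Python below builds a minimal GET request and parses a sample response. The host name, path, and response are made-up illustration values, and the helper names build_get_request and parse_response are hypothetical. Nothing is sent over the network.

```python
# Build the text of a minimal HTTP/1.1 GET request and parse a sample response.
# Host, path, and the sample response are illustration values only.

def build_get_request(host: str, path: str = "/") -> str:
    """Return the raw text a client would send for a simple GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str) -> tuple[int, dict[str, str], str]:
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>hello</html>"
)

if __name__ == "__main__":
    print(build_get_request("example.org", "/index.html"))
    print(parse_response(SAMPLE_RESPONSE))
```

Note that everything here, including the Host line, is plain readable text. That matters for the next part of the story.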

But this picture is a bit of an oversimplification. The browser usually does not talk directly to the server, because the two are rarely close to each other.

Instead, the server may be thousands of miles away, and there is likely no direct link between your computer and the server.

Requests made from the browser need to pass through many hands before reaching the server. The same is true for the response returned from the server.

It’s like passing notes in class. The outside of the note says who it is for. The student who wrote the note passes it to a neighbor. That student then passes it on to the next student: perhaps not the ultimate recipient, but certainly someone in the direction of the recipient.

The problem is that anyone along the way can open the note. And there is no way to determine in advance the route of the note, so it is uncertain who will have access to it.

The note can end up in the hands of people who want to do harmful things...

Like sharing the contents of the note with everyone.

Or changing the response.

To address these issues, a new secure version of HTTP, HTTPS, was created. With HTTPS, it’s like having a lock on every message.

Both the browser and the server know the combination to the lock, but no one else does.

That way, even if messages are sent across multiple routers, only you and the site can read the content.

This solves a lot of security problems. But some messages between the browser and the server still travel unencrypted, which means people along the path can still pry into what you are doing.

One place where data is still exposed is in setting up the connection to the server. When the browser sends its initial message to the server, it also sends the server name (in a field called Server Name Indication). This lets a server that hosts multiple sites know which one the browser wants to talk to. Although this initial request is part of setting up encryption, the request itself is not encrypted.

Another place for data exposure is in DNS. But what is DNS?

What is DNS?

In the note-passing analogy above, the recipient’s name has to be written on the outside of the note. The same is true for HTTP requests: they need to say where they are going.

However, you cannot use names for this. No router would know what you are talking about. Instead, you must use an IP address. That is how the routers in between know where to send the request.

This causes a problem. You don’t want users to have to remember the IP address of your website. Instead, you want to give your site a catchy name, something users can remember.

That’s why we have the Domain Name System (DNS). The browser uses DNS to translate site names into IP addresses. This process of converting a domain name into an IP address is called domain name resolution.

How do browsers do this?

One option is to have one big list, like a phone book, in the browser. However, it would be difficult to keep that list up to date as new sites come online or existing sites move to new servers.

So instead of one list of all the domain names, there are many smaller lists that are linked to each other. This allows them to be managed independently.

To get the IP address that corresponds to a domain name, you must find the list that contains that domain name. It’s like a treasure hunt.

What does this treasure hunt look like for a site such as the English version of Wikipedia, en.wikipedia.org?

We can divide this domain into parts.
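A minimal Python sketch of that split, assuming a plain dot-separated name (the helper name split_domain is hypothetical):

```python
def split_domain(name: str) -> list[str]:
    """Return the parts of a domain name, starting from the top-level domain."""
    labels = name.rstrip(".").split(".")
    return list(reversed(labels))


if __name__ == "__main__":
    # Prints ['org', 'wikipedia', 'en']
    print(split_domain("en.wikipedia.org"))
```

The parts come out in the order the lookup uses them: first the top-level domain (org), then the second-level domain (wikipedia), then the subdomain (en).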

With these parts, we can hunt for the list that contains the IP address of the site. We need some help, though. The tool that does this hunting for us and finds the IP address is called a resolver.

First, the resolver talks to a server called the root DNS. It knows of several different root DNS servers, so it sends the request to one of them. The resolver asks the root DNS server where to find more information about addresses in the .org top-level domain.

The root DNS gives the resolver the address of a server that knows about .org addresses.

This next server is called a top-level domain (TLD) name server. The TLD server knows about all of the second-level domains that end in .org.

But it does not know about the subdomains under wikipedia.org, so it does not know the IP address of en.wikipedia.org.

The TLD name server tells the resolver to ask Wikipedia’s name server.

The resolver is almost done now. Wikipedia’s name server is what is called the authoritative server. It knows all of the domain names under wikipedia.org. So this server knows about en.wikipedia.org and other subdomains, such as the German version, de.wikipedia.org. The authoritative server tells the resolver which IP address has the HTML files for the site.

The resolver returns the IP address of en.wikipedia.org to the operating system.

This process is called recursive resolution, because the resolver has to go back and forth asking different servers what is basically the same question.
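The walk from the root server down to the server that holds the answer can be sketched as a toy simulation in Python. The three "servers" below are plain dictionaries, their names are invented, and the IP addresses come from a range reserved for documentation. No real DNS traffic is involved, and a real lookup tool handles many more cases.

```python
# Toy model: each server maps a name suffix to either a referral to another
# server or a final answer. All names and addresses are made up.
TOY_SERVERS = {
    "root": {"org": ("referral", "org-tld")},
    "org-tld": {"wikipedia.org": ("referral", "wikipedia-ns")},
    "wikipedia-ns": {
        "en.wikipedia.org": ("answer", "198.51.100.7"),
        "de.wikipedia.org": ("answer", "198.51.100.8"),
    },
}


def suffixes(name: str) -> list[str]:
    """Return every suffix of the name, longest first."""
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def resolve(name: str) -> tuple[str, list[str]]:
    """Follow referrals from the root until a server gives an answer."""
    server = "root"
    asked = []
    while True:
        asked.append(server)
        zone = TOY_SERVERS[server]
        match = next((s for s in suffixes(name) if s in zone), None)
        if match is None:
            raise LookupError(f"{server} knows nothing about {name}")
        kind, value = zone[match]
        if kind == "answer":
            return value, asked
        server = value  # a referral: ask the next server the same question


if __name__ == "__main__":
    print(resolve("en.wikipedia.org"))
```

In this sketch the same question is put to each server in turn, which is the back-and-forth described above.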

We need a resolver like this to complete network requests. But how does the browser find this resolver? In general, it asks the computer’s operating system to set it up with a resolver that can help.

How does the operating system know which resolver to use? There are two possible ways.

You can configure your computer to use a resolver you trust. But very few people do this.

Instead, most people just use the default. By default, the operating system uses whatever resolver the network tells it to. When a computer connects to a network and gets its IP address, the network also recommends a resolver.

This means that the resolver you use can change multiple times per day. If you head to a coffee shop for an afternoon work session, you are probably using a different resolver than you were in the morning. This is true even if you have configured your own resolver, because the DNS protocol has no security.

How can DNS be exploited?

So how does the system leave users vulnerable?

Usually the resolver tells each DNS server which domain name you are looking for. The request sometimes includes your full IP address. If it is not the full address, it usually still contains most of it, which can easily be combined with other information to work out who you are.

This means that every server the resolver asks for help sees which site you are looking for. More importantly, it also means that anyone on the path to those servers can see your requests.

The system puts users’ data at risk in several ways. The two main risks are tracking and spoofing attacks.

Tracking

As mentioned above, it is easy to take the full or partial IP address and figure out who is asking for a site. This means that the DNS server and anyone on the path to that DNS server (an on-path router) can build a profile of you. They can keep a record of the sites you look up.

And that data is valuable. A lot of people and companies pay a lot of money to see what you’re browsing.

Even if you did not have to worry about malicious DNS servers or on-path routers, there would still be a risk that your data is collected and sold. That is because the resolver itself, the one the network gives you, may not be trustworthy.

Even if you trust the resolver your network recommends, you probably only use it at home. As mentioned earlier, every time you go to a coffee shop or hotel, or use any other network, you are probably using a different resolver. Who knows what its data collection policy is?

Beyond having your data collected and sold without your knowledge or consent, there are even more dangerous ways the system can be exploited.

Spoofing attacks

With spoofing, someone on the path between the DNS server and you changes the response. Instead of telling you the real IP address, the spoofer gives you the wrong IP address for a site. This way, they can block you from visiting the real site or send you to a scam one.

Again, the resolver itself might be the one acting nefariously.

For example, imagine you are shopping at Megastore. You want to do a price check and see whether you can get a better price at big-box.com.

But if you are on Megastore’s WiFi, you are probably using their resolver. That resolver could hijack the request to big-box.com and lie to you, claiming that the site is unavailable.

How can we fix this with Trusted Recursive Resolver (TRR) and DNS over HTTPS (DoH)?

At Mozilla, we feel strongly that we have a responsibility to protect users and their data. We have been working hard to address these vulnerabilities.

We are introducing two new features to fix this: Trusted Recursive Resolver (TRR) and DNS over HTTPS (DoH). There are really three threats that need to be addressed:

You could end up using an untrustworthy resolver that tracks your requests or tampers with responses from DNS servers.
On-path routers can track or tamper in the same way.
DNS servers can track your DNS requests.

So how do we fix these?

Avoid untrustworthy resolvers by using Trusted Recursive Resolver.
Protect against on-path eavesdropping and tampering by using DNS over HTTPS.
Transmit as little data as possible to protect users from deanonymization.

Avoid untrustworthy resolvers by using Trusted Recursive Resolver

Networks can get away with providing untrustworthy resolvers that collect your data or spoof DNS because very few users know the risks or how to protect themselves.

Even for users who do understand the risks, it is hard for an individual to negotiate with their ISP or another entity to make sure their DNS data is handled responsibly.

However, we have spent time studying these risks, and we have negotiating power. We looked for a company to work with us on protecting users’ DNS data, and we found one: Cloudflare.

Cloudflare provides a recursive resolution service with a pro-user privacy policy. They have committed to discarding all personally identifiable data after 24 hours and to never passing it on to third parties. Regular audits will check that the data is being cleared as expected.

With this, we have a resolver we can trust to protect users’ privacy. This means Firefox can ignore the resolver that the network provides and go straight to Cloudflare. With this trusted resolver in place, we do not have to worry about rogue resolvers selling our users’ data or tricking our users with spoofed DNS.

Why did we pick this resolver? Cloudflare shares our commitment to building a privacy-first DNS service. They worked with us to build a DoH resolution service that would serve our users well in a transparent way. They have been very open to adding user protections to the service, so we are happy to be able to collaborate with them.

But that does not mean you have to use Cloudflare. Users can configure Firefox to use whichever DoH-supporting recursive resolver they want. As more offerings become available, we plan to make it easy to discover and switch to them.

Protect against on-path eavesdropping and tampering by using DNS over HTTPS

The resolver is not the only threat, though. On-path routers can track and spoof DNS because they can see the contents of DNS requests and responses. But the Internet already has technology for making sure that on-path routers cannot eavesdrop like this. It is the encryption I mentioned earlier.

By encrypting DNS packets using HTTPS, you can ensure that no one can monitor the DNS requests the user is making.
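To make this more concrete, here is a hedged Python sketch of how a DNS question can be packed into an HTTPS request, following the GET form described in the DoH standard (RFC 8484). The endpoint https://doh.example/dns-query is a placeholder, not a real service, and the helper names are hypothetical. The sketch only builds the URL. A real client would send it over an HTTPS connection, which is where the encryption happens.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Encode a single-question DNS query in wire format (qtype 1 = A record)."""
    # Header: ID 0, flags 0x0100 (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # The name is stored as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN (Internet)
    return header + question


def doh_get_url(endpoint: str, query: bytes) -> str:
    """Put the query into the 'dns' parameter as unpadded base64url."""
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"


if __name__ == "__main__":
    query = build_dns_query("www.example.com")
    # Placeholder endpoint; substitute the DoH server you actually use.
    print(doh_get_url("https://doh.example/dns-query", query))
```

Because the whole DNS question travels inside the HTTPS request, a router on the path sees only an encrypted connection to the DoH server, not the name being looked up.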

Transmit as little data as possible to protect users from deanonymization

In addition to providing a trusted resolver that communicates using the DoH protocol, Cloudflare is working with us to make this even more secure.

Normally, a resolver sends the whole domain name to each server: the root DNS server, the TLD name server, the second-level name server, and so on. But Cloudflare does something different. It only sends the part that is relevant to the DNS server it is currently talking to. This is called QNAME minimization.
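The difference can be illustrated with a small Python sketch that lists what each server would be asked, with and without QNAME minimization. It is a simplification: the server labels and the names_sent helper are made up for illustration, and real implementations differ in details such as which record type they ask for.

```python
# Illustrative labels for the servers asked in order during a lookup.
SERVER_CHAIN = ["root DNS server", "TLD name server", "authoritative name server"]


def names_sent(name: str, minimize: bool) -> list[tuple[str, str]]:
    """Return (server, name asked about) pairs for one lookup."""
    labels = name.rstrip(".").split(".")
    sent = []
    for depth, server in enumerate(SERVER_CHAIN, start=1):
        # With minimization, each server sees only as many trailing labels
        # as it needs in order to point to the next server.
        qname = ".".join(labels[-depth:]) if minimize else name
        sent.append((server, qname))
    return sent


if __name__ == "__main__":
    for minimize in (False, True):
        print("minimized" if minimize else "full name")
        for server, qname in names_sent("en.wikipedia.org", minimize):
            print(f"  {server}: {qname}")
```

With minimization, the root server in this sketch only learns that someone wants a .org address, and only the last server sees en.wikipedia.org in full.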

Resolvers also typically include the first 24 bits of your IP address in the request. This helps the DNS server know where you are and pick a CDN closer to you. But DNS servers can use this information to link different requests together.
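To show what "the first 24 bits" means for an IPv4 address, here is a short Python sketch using the standard ipaddress module. The address is from a range reserved for documentation, and the helper name client_subnet is hypothetical.

```python
import ipaddress


def client_subnet(address: str, prefix_bits: int = 24) -> ipaddress.IPv4Network:
    """Keep only the first prefix_bits of an IPv4 address, zeroing the rest."""
    # strict=False lets us pass a host address and get its enclosing network.
    return ipaddress.ip_network(f"{address}/{prefix_bits}", strict=False)


if __name__ == "__main__":
    subnet = client_subnet("203.0.113.77")
    print(subnet)                # 203.0.113.0/24
    print(subnet.num_addresses)  # 256
```

Only 256 addresses share that prefix, so it narrows down who is asking considerably even though the last 8 bits are dropped.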

Instead of doing this, Cloudflare makes the request from one of its own IP addresses near the user. This provides geolocation without tying it to a particular user. In addition, we are looking into how to enable even better, very fine-grained load balancing in a privacy-sensitive way.

Doing this, removing the irrelevant parts of the domain name and not including your IP address, means that DNS servers have much less data they can collect about you.

What isn’t fixed by TRR with DoH?

With these fixes, we have reduced the number of people who can see which sites you are visiting. But this does not eliminate data leaks entirely.

After you do the DNS lookup to find the IP address, you still need to connect to the web server at that address. To do this, you send an initial request. The request includes a Server Name Indication, which says which site on the server you want to connect to. This request is unencrypted.

This means that your ISP can still figure out which sites you are visiting, because the name is right there in the Server Name Indication. Routers that pass the initial request from your browser to the web server can see that information too.

But once you have made the connection to the web server, everything is encrypted. And this encrypted connection can be used for any site hosted on that server, not just the one you initially asked for.

This is sometimes called HTTP/2 connection coalescing, or simply connection reuse. When you open a connection to a server that supports it, the server tells you which other sites it hosts. You can then visit those other sites using the existing encrypted connection.

Why does this help? You do not need to start a new connection to reach these other sites. This means you do not need to send the unencrypted initial request with a Server Name Indication saying which site you are visiting. So you can visit any of the other sites on the same server without revealing which ones you are looking at to your ISP and on-path routers.
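A toy Python model can show the effect. It counts which site names an observer on the path would see in unencrypted initial requests, with and without reuse of an existing connection. The site names and server labels are invented, and the model ignores the conditions a real browser checks before reusing a connection (for example, whether the server's certificate covers the other site).

```python
def exposed_site_names(
    visits: list[str], hosted_on: dict[str, str], reuse_connections: bool
) -> list[str]:
    """Return the site names sent unencrypted while opening connections."""
    open_connections = set()
    exposed = []
    for site in visits:
        # With reuse, one connection per server is enough; without it,
        # every site needs its own connection and its own initial request.
        key = hosted_on[site] if reuse_connections else site
        if key in open_connections:
            continue  # existing encrypted connection, nothing new is exposed
        open_connections.add(key)
        exposed.append(site)
    return exposed


if __name__ == "__main__":
    hosted_on = {
        "shop.example": "cdn-1",
        "blog.example": "cdn-1",
        "news.example": "cdn-2",
    }
    visits = ["shop.example", "blog.example", "news.example"]
    print(exposed_site_names(visits, hosted_on, reuse_connections=False))
    print(exposed_site_names(visits, hosted_on, reuse_connections=True))
```

In this example, blog.example is never exposed when connections are reused, because it shares a server with shop.example.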

With the rise of CDNs, more and more separate sites are being served by a single server. And because you can have multiple coalesced connections open, you can be connected to several shared servers or CDNs at once, visiting all of the sites across those servers without leaking data. This means that connection reuse will become more and more effective as a privacy shield.

What is the status?

You can enable DNS over HTTPS in Firefox today, and we encourage you to.

We would like to turn this on as the default for all of our users. We believe that every user deserves this privacy and security, whether or not they understand DNS leaks.

But it is a big change, and we need to test it first. That is why we are conducting a study. We are asking half of our Firefox Nightly users to help us collect data on performance.

In the study, we will use the default resolver, as we do now, but we will also send the request to Cloudflare’s DoH resolver. Then we will compare the two to make sure that everything is working as we expect.

For participants in the study, the Cloudflare DNS response will not be used yet. We are simply checking that everything works, and then throwing away the Cloudflare response.

We are thankful for the support of our Nightly users, the people who help us test Firefox every day, and we hope that you will help us test this too.

About Lin Clark

Lin is an engineer on Mozilla’s developer relations team. She tinkers with JavaScript, WebAssembly, Rust, and Servo, and also draws code cartoons.