Distributed Lookup

Domain Name System (DNS)

Paul Krzyzanowski

March 24, 2021

Goal: Design a system to look up domain names that can scale to the planet-wide internet and handle billions of objects.

The Internet Domain Name System (DNS) is the naming system for nodes on the Internet. It associates human-friendly names with numeric IP addresses and other information about those nodes.

It is challenging to create and manage meaningful unique names on a very large scale (e.g., try picking a Twitter handle). Hierarchical naming systems are commonly used to facilitate uniqueness and management. A name that is made up of a list of components is called a compound name. We see this in names such as pathnames (“/home/paul/src/qsync/main.c”) and Internet domain names (“www.cs.rutgers.edu”).
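To make the idea of a compound name concrete, here is a minimal Python sketch that splits the two example names into their components. The function names are hypothetical and chosen only for illustration. Note that a pathname lists its components from the root down, while a domain name lists them from the leaf up, so the domain labels are reversed to put both in the same order.

```python
def pathname_components(path):
    """Split a Unix-style pathname into its components, root first."""
    return [part for part in path.split("/") if part]


def domain_components(name):
    """Split a domain name into its labels, ordered from the root downward."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return list(reversed(labels))


print(pathname_components("/home/paul/src/qsync/main.c"))
print(domain_components("www.cs.rutgers.edu"))
```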

IP domain names (e.g., cs.rutgers.edu) are hierarchical and assigned completely independently from their corresponding addresses (e.g., 128.6.4.2). Any name can be associated with any address.

IP addresses are managed globally by the IANA, the Internet Assigned Numbers Authority, which hands off management to Regional Internet Registries that, in turn, assign blocks of IP addresses to ISPs.

Internet domain names are also assigned hierarchically, with hundreds of top-level domains (.com, .org, .ca, .es, .yoga, .wedding, …) under a single root. The IANA delegates the management of individual top-level domain names to various companies. These are called domain name registry operators, each of which operates a NIC, or Network Information Center.

Domain name registrars are companies that allow end users to register domain names. They update the registry database at the NIC. For example, the .yoga top-level domain is operated by a company in Ireland called Top Level Domain Holdings Ltd. When you register a domain with a domain name registrar such as Namecheap or GoDaddy, the company contacts Top Level Domain Holdings to update the master database of information about domains registered under .yoga.

DNS is implemented as a distributed collection of name servers that resolve Internet domain names into IP addresses and provide other information about the domain name. Each server is responsible for answering questions about machines within its zone. A zone is a subtree of the Internet domain name space that is managed by that server. A registered domain name must include the address of one or more name servers that are responsible for answering queries about hosts within that domain.
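One way to picture zones is as a table keyed by the name at the top of each subtree. The sketch below finds the zone responsible for a name by looking for the longest matching suffix. The zone table, the server names, and the function name are all assumptions made up for this illustration; the empty string stands for the root.

```python
# Hypothetical zone table: zone name -> name server responsible for that zone.
# The empty string represents the root of the name space.
ZONES = {
    "": "root-server.example",
    "org": "org-server.example",
    "pk.org": "ns.pk.org",
}


def responsible_zone(name, zones=ZONES):
    """Return the most specific zone (longest suffix) that contains name."""
    labels = name.lower().rstrip(".").split(".")
    # Try the full name first, then drop one leading label at a time.
    for i in range(len(labels) + 1):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return None


for host in ("mail.pk.org", "www.example.org", "www.cs.rutgers.edu"):
    zone = responsible_zone(host)
    print(host, "->", repr(zone), "served by", ZONES[zone])
```

In this toy table, mail.pk.org falls in the pk.org zone, www.example.org falls back to the org zone, and a name under a top-level domain the table does not list falls back to the root.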

Each DNS server may do one of several things: (1) answer a request if it knows the answer to the query, (2) contact another name server(s) to search for the answer, (3) return an error if the domain name does not exist, or (4) return a referral: the address of another name server that may know more of the answer.
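The four behaviors can be sketched as a single decision procedure. This is a simplified model under stated assumptions, not how any real DNS server is implemented: the NameServer class, its fields, and the Response type are hypothetical, the addresses come from the range reserved for documentation, and "contacting another name server" is modeled as a direct method call rather than a network request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    kind: str                     # "answer", "referral", or "error"
    value: Optional[str] = None


class NameServer:
    """Toy model of a name server (hypothetical, for illustration only)."""

    def __init__(self, zone, records=None, delegations=None, upstream=None):
        self.zone = zone                       # subtree this server manages
        self.records = records or {}           # host name -> address
        self.delegations = delegations or {}   # child zone -> its name server
        self.upstream = upstream               # optional server to ask for help

    def _in_zone(self, name):
        return self.zone == "" or name == self.zone or name.endswith("." + self.zone)

    def handle(self, name):
        if name in self.records:
            return Response("answer", self.records[name])        # case 1
        for child, server in self.delegations.items():
            if name == child or name.endswith("." + child):
                return Response("referral", server)               # case 4
        if self._in_zone(name):
            return Response("error", "name does not exist")      # case 3
        if self.upstream is not None:
            return self.upstream.handle(name)                     # case 2
        return Response("error", "no information")


pk = NameServer("pk.org", records={"mail.pk.org": "192.0.2.25"})
org = NameServer("org", delegations={"pk.org": "ns.pk.org"})
helper = NameServer("example.net", upstream=pk)

print(pk.handle("mail.pk.org"))       # answered from the server's own records
print(org.handle("mail.pk.org"))      # referral to the pk.org name server
print(pk.handle("nohost.pk.org"))     # error: no such name in the zone
print(helper.handle("mail.pk.org"))   # obtained by contacting another server
```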

For example, a search for mail.pk.org (with no cached information) begins by querying one of several replicated root name servers. These keep track of the name servers responsible for top-level domains. This query will return a referral to a name server that is responsible for the .org domain. Querying that name server will provide the name server responsible for pk.org. Finally, the name server responsible for pk.org will provide the IP address or an authoritative “this host does not exist” response. Along the way, referrals may be cached so that a name server need not go through the entire process again.

An iterative name resolution process is one where the client handles referrals. It follows the process described above, where a DNS server returns a referral to a lower-level DNS server, an authoritative answer, or an error. Because the contacted server may return a referral, the client must iterate, issuing requests to these other name servers, until it reaches the leaf of the hierarchy where a name server can provide an authoritative answer about the host. The advantage of this approach is that name servers do not need to track state about the query: they immediately either know an answer, know the address of a lower-level name server, or know nothing. They never need to issue network requests to resolve a query.
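The client-side loop can be sketched with an in-memory stand-in for the servers. Everything here is an assumption for illustration: the SERVERS table, the server labels ("root", "org-ns", "pk-ns"), the documentation address, and the function names. No real network requests are made. The point is that query() is stateless and the client alone follows the referrals.

```python
# Hypothetical data for three servers; no real DNS traffic is involved.
SERVERS = {
    "root":   {"referrals": {"org": "org-ns"},   "hosts": {}},
    "org-ns": {"referrals": {"pk.org": "pk-ns"}, "hosts": {}},
    "pk-ns":  {"referrals": {},                  "hosts": {"mail.pk.org": "192.0.2.25"}},
}


def query(server, name):
    """One stateless request to one server; returns (kind, value)."""
    data = SERVERS[server]
    if name in data["hosts"]:
        return ("answer", data["hosts"][name])
    for suffix, next_server in data["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    return ("error", None)


def resolve_iteratively(name, start="root", max_steps=10):
    """Follow referrals from the client side until an answer or an error."""
    server = start
    path = [server]                  # servers contacted, in order
    for _ in range(max_steps):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "error":
            return None, path
        server = value               # referral: ask the next server down
        path.append(server)
    raise RuntimeError("too many referrals")


print(resolve_iteratively("mail.pk.org"))
print(resolve_iteratively("nohost.pk.org"))
```

The bound on the number of steps is a defensive choice for the sketch so that a misconfigured referral cycle cannot loop forever.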

A recursive name resolution process is one where a domain name server itself will query other DNS servers as needed to get the answer to the query. The DNS protocol supports both forms. The local DNS server, called a DNS resolver (for example, at your ISP), will generally support recursive requests and perform the iteration itself rather than passing referrals back to the client.
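From the client's point of view, a recursive resolver turns the whole process into a single call. The sketch below models such a resolver with a simple cache. It rests on several assumptions: the AUTHORITATIVE table, the server labels, the address, and the RecursiveResolver class are hypothetical, upstream requests are simulated in memory, and the cache never expires entries, whereas real DNS records carry a time-to-live.

```python
# Hypothetical upstream data, keyed by server label; no real network I/O.
AUTHORITATIVE = {
    "root":   {"org": ("referral", "org-ns")},
    "org-ns": {"pk.org": ("referral", "pk-ns")},
    "pk-ns":  {"mail.pk.org": ("answer", "192.0.2.25")},
}


def ask(server, name):
    """Simulate one request to an upstream server."""
    for key, (kind, value) in AUTHORITATIVE[server].items():
        exact = name == key
        under = name.endswith("." + key)
        # Answers need an exact match; referrals cover a whole subtree.
        if exact or (kind == "referral" and under):
            return (kind, value)
    return ("error", None)


class RecursiveResolver:
    """Toy resolver that follows referrals on the client's behalf."""

    def __init__(self, max_steps=10):
        self.cache = {}              # name -> address (or None if not found)
        self.upstream_queries = 0    # how many upstream requests were issued
        self.max_steps = max_steps

    def resolve(self, name):
        if name in self.cache:
            return self.cache[name]
        server = "root"
        for _ in range(self.max_steps):
            self.upstream_queries += 1
            kind, value = ask(server, name)
            if kind == "referral":
                server = value       # keep going; the client never sees this
                continue
            result = value if kind == "answer" else None
            self.cache[name] = result
            return result
        raise RuntimeError("too many referrals")


resolver = RecursiveResolver()
print(resolver.resolve("mail.pk.org"), resolver.upstream_queries)
print(resolver.resolve("mail.pk.org"), resolver.upstream_queries)
```

The second lookup is served from the cache, so the count of upstream queries does not change between the two calls.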

Last modified April 7, 2021.