Question about nameservers

So, in an effort to fill in some holes in my understanding of how things on the Internet work, I have a bit of a question about DNS and name servers in general.

I think I understand the basics of DNS. You type a URL, the URL (assuming it’s not cached) goes to a root name server, which sends back info to send it to a top-level domain name server, which then sends back info to send it to an authoritative (or whatever it’s called) name server, which then sends back the IP address.

OK, so now I have a URL connected with an IP address. So what happens after that, especially in the case of websites that are on shared server space? If I look up the IP address for my website, I can’t just type in http://123.45.6.7 and have my page displayed. What is happening at this level of communication? I assume the computer is directed to 123.45.6.7 and then perhaps the URL I typed in is passed along to the server at 123.45.6.7, which doles out the correct page? Is that what’s going on? Or is there an extra layer of name servers on that site (like what I assume ns1.mywebsite.com and ns2.mywebsite.com are) which directs my request to the correct directory on the server at that IP address?

I know this question isn’t completely focused, so I may be asking or assuming something completely stupid along the way, but I’m sure there are people here who can clearly explain what’s going on once the computer gets the info that equates www.mywebsite.com with 123.45.6.7.

The IP address is used to access the web site, but in addition a bunch of headers are passed to the server, one of which is the name by which the site was referred to on the calling side. The server then can decide, based on that information, to serve you different pages.

The Wikipedia article on shared web hosting services explains it.

OK, thanks for the info. I assumed it must be something like this, but I couldn’t quite figure out what to google for.

The specific HTTP header involved is called Host. This was actually an addition in HTTP/1.1. Version 1.0 did not define this header, which meant that the HTTP server had no way of knowing which domain name was being used. Therefore you needed multiple IP addresses to host multiple domains on the same server.

It’s actually a good demonstration of the abstracted layer design of modern network protocols. IP only cares about sending data from one IP address to another. HTTP only cares about sending messages composed of headers and documents between clients and servers. HTTP doesn’t require TCP/IP to work (or even DNS) but because of layering they work very well together.
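To make the Host header concrete, here is a minimal sketch of what a client puts on the wire after DNS has done its job. The function name `build_request` and the second hostname are made up for illustration; the point is only that two requests sent to the same IP address can differ in their Host line.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Return a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",          # the name the user typed, not the IP address
        "Connection: close",
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Both requests could be sent to the same address (say 123.45.6.7);
# only the Host header tells the server which site is wanted.
request_a = build_request("www.mywebsite.com")
request_b = build_request("www.someothersite.com")  # hypothetical neighbour
print(request_a.decode("ascii"))
```

An HTTP/1.0 request would simply lack the Host line, which is why the server could not tell the two sites apart.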

In the interests of clarity:

The URL in your browser comprises several parts (url broken for safety)


http :// boards.straightdope.com /sdmb/ newreply.php ?
 do=newreply&p=100000

protocol: http://
hostname: boards.straightdope.com
path: /sdmb/
file: newreply.php
query: do=newreply&p=100000

What DNS resolves is the hostname.
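If you want to see the same breakdown done by a program, Python's standard `urllib.parse` module splits a URL into these parts. This is just a sketch using the (rejoined) URL from above; note that the library reports path and file together as one path.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://boards.straightdope.com/sdmb/newreply.php?do=newreply&p=100000"
parts = urlsplit(url)

print(parts.scheme)     # protocol
print(parts.hostname)   # the only part DNS ever sees
print(parts.path)       # path plus file name
print(parts.query)      # raw query string

# The query string can be split further into key/value pairs.
query = parse_qs(parts.query)
print(query)
```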

Also, your browser (or your PC) does not do all the searching to resolve DNS names.
Your PC has a configured DNS server and DNS requests are passed to it.
That server is a recursive DNS system that can either go looking for the root servers or resolve from local cache.

[QUOTE=Okrahoma]
The IP address is used to access the web site, but in addition a bunch of headers are passed to the server, one of which is the name by which the site was referred to on the calling side. The server then can decide, based on that information, to serve you different pages.
[/QUOTE]
This works for HTTP, but the process is different for HTTPS.

HTTPS servers present a certificate that allows the client to validate that the website is who it says it is. Until this occurs, no encrypted communication is possible and no headers can be exchanged.

HTTPS servers either serve a single certificate from a single IP address, or use Server Name Indication. This is an optional extension in the initial client request to the HTTPS server (the ClientHello) that specifies the name of the server that is being requested. The SNI name can be matched against a list of SSL certificates, and the appropriate certificate supplied to the client. Once the HTTPS connection has been established, the host header in the HTTP request can be used to select web server content.
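As a rough illustration of the certificate selection step, here is a toy model of what a server might do with the SNI name. It is not a real TLS implementation: the certificate table, file names and the `pick_certificate` function are all invented for the example, and real servers have more detailed matching rules.

```python
from typing import Optional

# Hypothetical table: hostname (or wildcard pattern) -> certificate file.
CERTS = {
    "www.mywebsite.com": "mywebsite.pem",
    "*.example.org": "example-wildcard.pem",
}
DEFAULT_CERT = "default.pem"  # used when no SNI name is sent or nothing matches


def pick_certificate(sni_name: Optional[str]) -> str:
    """Choose a certificate from the name in the ClientHello, if any."""
    if sni_name is None:
        return DEFAULT_CERT
    name = sni_name.lower()
    if name in CERTS:
        return CERTS[name]
    # Try a wildcard that replaces only the leftmost label.
    _, _, parent = name.partition(".")
    return CERTS.get(f"*.{parent}", DEFAULT_CERT)


print(pick_certificate("www.mywebsite.com"))
print(pick_certificate("shop.example.org"))
print(pick_certificate(None))
```

On the client side, Python's own `ssl` module sends the SNI name when you pass `server_hostname` to `SSLContext.wrap_socket`.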

The DNS system is just a technique for getting an IP address from a fully qualified domain name, like the boards.straightdope.com in

http://boards.straightdope.com/

It is a hierarchical lookup table: if one DNS server does not know the address, it passes the query on to others in a chain until eventually one does know and sends you the address.

You don’t need this if you already know your destination’s IP address. For some websites it is possible to type in http://<ip address> and it will work.

However, this does not always work, and the reason is often that the server hosting the website at that IP address is handling more than one website. Some shared hosting sites have over a thousand websites sharing a server. The webserver works out which webpage to present by doing its own internal lookup of the domain names it handles. This technique was developed and became a standard to prevent all the IP addresses supported on the Internet being used up very quickly.
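The "internal lookup" can be pictured as a simple table keyed on the Host header. The sketch below is only a model of the idea: the directory paths, the second site and the `document_root` function are assumptions for illustration, and it ignores details such as IPv6 literals in the Host header.

```python
# Hypothetical mapping from site name to the directory holding its pages.
VHOSTS = {
    "www.mywebsite.com": "/var/www/mywebsite",
    "www.someothersite.com": "/var/www/someothersite",
}
DEFAULT_ROOT = "/var/www/default"  # served when the name is unknown or missing


def document_root(raw_request: bytes) -> str:
    """Pick a document root based on the request's Host header."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:          # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":", 1)[0]  # drop any port
            return VHOSTS.get(host, DEFAULT_ROOT)
    return DEFAULT_ROOT                          # e.g. HTTP/1.0, no Host header


by_name = b"GET / HTTP/1.1\r\nHost: www.mywebsite.com\r\n\r\n"
by_ip = b"GET / HTTP/1.1\r\nHost: 123.45.6.7\r\n\r\n"
print(document_root(by_name))
print(document_root(by_ip))
```

This is why typing the bare IP address often shows a default or error page on shared hosting: the server has no name to look up.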

There can be other reasons why there is not a one-to-one correspondence between a domain name and an IP address. For example, a common technique used by big websites is to spread copies of the most used pages of a website to many servers spread around the Internet, so all the traffic is not concentrated on going to one place. These are content distribution networks, and they will serve the same web pages from different IP addresses, typically from whichever one is geographically closest to you.

The Internet has evolved a whole set of techniques to deal with the problems of huge scale, traffic and security and bumping up against these makes it very confusing to understand the basic mechanism.

Setting up a little webserver on your own network is a good way to learn about DNS, DHCP and IP Addressing.

Yes, as I was reading up on this last night, I figured out the Host header was the main bit of info I was looking for. I figured that piece of data must have been passed along to the server at the IP address it was directed to somehow, but none of the info on DNS mentioned it (which I guess does make sense, since it’s not part of DNS.) And the pages I was finding about how a computer gets to a website were all simplified examples that assumed a unique IP for each website, not considering shared hosting.

At a time when my ISP’s name server was kaput, I verified this by actually getting to a couple of sites whose actual address I knew. When I called the ISP (actually the phone company) to complain, the help(less) person there had to ask me what a name server was and then denied there could be a problem.

Google has an easy-to-remember public DNS server: 8.8.8.8. Useful for debugging DNS problems.

Continuing from this point, but expanding the details a bit.

Which means that not only does the IP address of www.myhostedwebsite.com point to the same public IP address as a great many other websites hosted on the same server(s), the opposite is fully true too.

There is not one single unique answer to the question of “what’s the IP address of www.google.com?” Any given DNS server can hand out a bunch of different answers. And different DNS servers in different places in the DNS hierarchy and in different parts of the world are free to offer different answers as well.
Last of all, it’s worth mentioning that we’re not really talking about “*the* IP address” of anything. We’re talking about “*a public* IP address” of something.

That something is sitting at the public / private interface and is some type of firewall / router / concentrator etc. device that is probably umpteen-directions redundant, is physically listening for many dozens, if not hundreds, of IP addresses, and is rerouting traffic or parts of traffic to a metric buttload of redundant machines on the private side.
One *can* learn a lot by setting up a simple homebrew network with a few machines, virtual or otherwise. But just like 4th grade Science class, it’s important to recognize that almost everything you’ll learn is the simplified Tinkertoy version, not the real thing. The knowledge gained is useful, but needs to be recognized for what it is.

Yes, instead of my ISP’s DNS, I have my router’s DNS set up to use Google (8.8.8.8 and 8.8.4.4) or Open DNS (whose IP I can’t remember, but it’s saved in my router’s firmware–well, in the custom firmware I installed [Gargoyle]).

OK, so next quick question:

When I do a whois lookup on my domain, I get some nameservers at the bottom, say, ns1.mywebsite.com and ns2.mywebsite.com. So DNS lookup works like this? :

Let’s assume I’m using Google’s public DNS server 8.8.8.8. Now that server is the recursive name server, right? Are these the steps it takes?:

  1. Check to see if URL is already cached.
  2. If not, check with root server for address of TLD name server (.com in our case)
  3. Query TLD name server for address of authoritative name server (mywebsite.com)
  4. Query authoritative name server for final IP address
  5. Then the computer connects with the IP address, sends along the Host header, and gets routed to the correct directory on that server.

Is that about correct? Maybe with some extra cache-checking steps in there. If so, my question is about the nameserver I see in my WHOIS information: is its address what is being returned in step 3? If not, at what step does ns1.mywebsite.com come into play?

ETA: And I am generally aware that a URL can map to many different IP addresses.

So far you have been looking at this by considering how your computer uses DNS to find the ip address of a website with a fully qualified domain name.

But how does a new website become associated with a domain name, and how does the entry get into this chain of DNS servers? That is the job of a domain registry company. You see ads for them all the time, companies like GoDaddy. These companies usually sell you two things: web hosting and a domain name. They may also do email hosting (they rarely do all of these well). They allow you to point your new domain to one of their servers that will hold your website. This is done through the administration pages for your domain. They supply the authoritative DNS servers you see, which will propagate your entry out to the DNS world.

The domain name configuration can be changed at any time by you to some other server hosting your website. The domain configuration can also point to an email server and other servers like FTP servers that might also use the domain name in their addressing.

Usually, the domain configuration looks like this:

https://www.namecheap.com/support/knowledgebase/article.aspx/9559/32/how-to-set-up-private-nameservers-reseller-packages

See item 3, the DNS zone files. Note the A record, which points to the IP address of the website. You can also see the authoritative DNS servers for the domain name.

Learning about DNS configuration can be very useful because when it comes to email a lot of big, cheap hosting companies provide a really poor service and it is a good idea to find another company that does email hosting properly. To use such a service, you have to change the MX records to get them to point to the email host. You can also point your website to a different host if you want a better service.

I usually buy a domain name from a specialist domain registry company, then point it to some website hosting provided by another and an email service provided by another company. That way all my eggs are not in the same hosting basket and I can change things easily by altering the DNS zone records, should I, for instance, want to move my website.

Finally, back to your PC for one last point about DNS. You don’t have to use a DNS server at all. Your computer has a hosts file that has a list of domain names and IP addresses, and it looks there first before it uses any DNS server. You don’t have to put a domain name and a proper IP address in each line. You can send a domain name to the null address, which means nowhere. I use this for directing domain names associated with advertising statistics collection to null. It speeds up advertising-heavy webpages dramatically because they are usually waiting for lots of slow DNS servers to resolve.
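For anyone who has not looked inside a hosts file, the format is just an address followed by one or more names per line, with `#` starting a comment. Here is a small sketch that parses that format; the sample entries (including the ad-tracking names) are made up, and treating the first entry for a name as the winner is an assumption about typical behaviour rather than a guarantee for every operating system.

```python
from typing import Dict


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each hostname in hosts-file text to its address."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # strip comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table


SAMPLE = """\
127.0.0.1   localhost
# send hypothetical ad-tracking hosts nowhere
0.0.0.0     ads.example.com tracker.example.com
123.45.6.7  www.mywebsite.com   # pin a site to a known address
"""

table = parse_hosts(SAMPLE)
print(table["ads.example.com"])
print(table["www.mywebsite.com"])
```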

:stuck_out_tongue:

OK, that part I know. I have my website set up the same way. My registrar is not the same as my hosting company, and I’ve moved my pages before from one hosting company to another, so I have some idea of A records, MX records, CNAME, DNS propagation times, etc. I’m trying to fill in the knowledge from the other side. I’m also aware of hosts files, but only on the PC. Not sure where on my Mac hosts are, but I’ve never needed to use it for any reason.

Also, while 123.45.6.7 is listening on only certain ports, e.g. 80 and 443, each new TCP/IP connection from your computer to 123.45.6.7 will be from a new local port number.
Suppose you had two web browsers open to the same site, 123.45.6.7. Each web browser will connect to that site using a new local port number, so both 123.45.6.7 and your computer can tell the connections apart by the local port number.

One program can create multiple tcp/ip connections too, and each one will get a unique local port number…
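You can watch this happen on a single machine. The sketch below assumes the loopback interface is available; it opens a throwaway listening socket on 127.0.0.1 (standing in for the web server) and connects to it twice from the same program. The actual port numbers are chosen by the operating system and will vary from run to run.

```python
import socket

# A throwaway listener standing in for the web server; port 0 lets the OS pick.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(2)
server_addr = server.getsockname()

# Two connections from the same program to the same server address.
client_a = socket.create_connection(server_addr)
client_b = socket.create_connection(server_addr)

local_a, peer_a = client_a.getsockname(), client_a.getpeername()
local_b, peer_b = client_b.getsockname(), client_b.getpeername()

print("server      :", server_addr)
print("connection A:", local_a, "->", peer_a)
print("connection B:", local_b, "->", peer_b)

for sock in (client_a, client_b, server):
    sock.close()
```

Both connections show the same remote address and port, but different local ports.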

While HTTP isn’t strictly reliant on TCP/IP, any replacement has to guarantee delivery. Although that could be done at a different layer, perhaps. So you could use UDP as long as the network layer never lost packets.

> 1) Check to see if URL is already cached.

As I noted previously, DNS deals with hostnames (Fully Qualified Domain Names). I’m being picky here.
But yes, the DNS resolver checks cache first.

> 2) If not, check with root server for address of TLD name server (.com in our case)

Yes - there are “13” root name servers. By 13, I mean that there are 13 anycast root name server IP addresses, with a much larger number of actual servers.

> 3) Query TLD name server for address of authoritative name server (mywebsite.com)

Yes - the resolver queries the .COM nameserver for mywebsite.com, and gets your nameserver NS records back in return

> 4) Query authoritative name server for final IP address

Yes - the query goes to one of your nameservers, and should return an A record.
It is possible that the query will return a different record type instead. For example, www.myserver.com could return a CNAME (canonical name) record that points to pulykamell.myserver.com.
If this happens, the CNAME is returned and a new resolution starts for the canonical name (which should return an A record, and leverages any cached data).

> 5) Then the computer connects with the IP address, sends along the Host header, and gets routed to the correct directory on that server.

Yep.
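Putting steps 1 to 4 together, here is a toy simulation of what the recursive resolver does. Nothing here touches the network: the three dictionaries stand in for the root, .com and authoritative servers, all the names and the address are the made-up ones from this thread, and treating the last two labels as "the domain" is a simplifying assumption.

```python
from typing import Dict, Tuple

# Stand-ins for the real servers (all data is illustrative).
ROOT: Dict[str, str] = {"com": "com-tld-server"}
TLD: Dict[str, Dict[str, str]] = {
    "com-tld-server": {"mywebsite.com": "ns1.mywebsite.com"},   # NS record
}
AUTH: Dict[str, Dict[str, Tuple[str, str]]] = {
    "ns1.mywebsite.com": {
        "www.mywebsite.com": ("CNAME", "pulykamell.mywebsite.com"),
        "pulykamell.mywebsite.com": ("A", "123.45.6.7"),
    },
}
cache: Dict[str, str] = {}


def resolve(hostname: str, depth: int = 0) -> str:
    """Resolve a hostname to an address the way steps 1-4 describe."""
    if depth > 8:
        raise RuntimeError("CNAME chain too long")
    if hostname in cache:                          # step 1: check the cache
        return cache[hostname]
    labels = hostname.split(".")
    tld_server = ROOT[labels[-1]]                  # step 2: ask a root server
    domain = ".".join(labels[-2:])
    auth_server = TLD[tld_server][domain]          # step 3: ask the TLD server
    rtype, value = AUTH[auth_server][hostname]     # step 4: ask the authoritative server
    # A CNAME restarts resolution for the canonical name.
    address = resolve(value, depth + 1) if rtype == "CNAME" else value
    cache[hostname] = address
    return address


print(resolve("www.mywebsite.com"))
print(sorted(cache))
```

A second call for the same name is answered straight from `cache` without consulting any of the "servers".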

How a webserver can handle several different websites:

The word to search for is virtual hosting.

A hosting site that allows you to resell websites to other people (a reseller account) usually gives you access to configuration pages that allow you to set up this ‘virtual host’ feature.

Sweet. Thanks! I appreciate the pickiness, too. :slight_smile:

It’s also helpful if your ISP’s DNS pretends it can find any domain in the Universe, even ones which don’t exist, and gives you IP addresses for its own servers when you lookup domain names which don’t exist.