As I understand it - and greatly simplified - the internet works something like this. Computer A makes a request to contact a certain computer (Computer E). Comp A contacts Comp B and asks ‘are you comp E’, B says no, but I’ll pass it along, B contacts Computer D, D in turn contacts Computer E and the connection is made. This system would work if there are only a few computers, but with all the computers on the net the simple pass-the-buck method wouldn’t work, so how does info know which direction to go?
Thanx
It’s less a matter of “are you computer E” and more a matter of “hey, B, here’s a packet for E. Do what you need to do.”
In this case, B (and C and D) are special-purpose computers called routers. Basically, a router is a computer with more than one network interface (e.g. two ethernet ports), each on a different network. When a router receives a chunk of info (a packet) for some computer, it has to decide which of its multiple networks it should go on.
So A hands B the packet, and B has to decide what to do with it. Now, B has a table of information which says: If you have a packet for A or F or G or M or P, send it to network #1. If you have a packet for any other computer, send it to C, which is on network #2. So B sends the packet to C, and C makes a similar decision.
In the case of a router with three network interfaces, it’s just got a more complex table. It says something like: packets with these destinations go to #1, packets with these destinations go to #2, and the rest go to #3.
So the basic idea is that a router is a piece of equipment which knows which computers are on which side of it, and pushes the traffic in that direction.
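To make the table idea concrete, here's a minimal sketch in Python of what router B's decision could look like. The table contents and the names ROUTES_B, DEFAULT_B and next_hop are made up for illustration; a real router keys its table on network addresses rather than computer names.

```python
# Hypothetical forwarding table for router B, following the example above.
# Keys are destination computers, values are where B should push the packet.
ROUTES_B = {
    "A": "network #1",
    "F": "network #1",
    "G": "network #1",
    "M": "network #1",
    "P": "network #1",
}

# Anything B does not know about goes to C on network #2 ("the rest").
DEFAULT_B = "C via network #2"


def next_hop(destination, routes=ROUTES_B, default=DEFAULT_B):
    """Return where router B forwards a packet addressed to `destination`."""
    return routes.get(destination, default)


print(next_hop("M"))  # a computer B knows about
print(next_hop("E"))  # not in the table, so it falls through to the default
```

B never needs to know where E actually is. It only needs to know which direction to push the packet.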
Of course, now you’re thinking: yeah, but that means that every router has to know where every computer on the internet is in relation to itself. That’s a lot of information to store! This is true, and there are two ways routers make shortcuts.
The first is the concept of a “default route”. This is “the rest” in the example I give above. If you are a router, and you know about a handful of computers on one side of you, and you know that the rest of the world is on the other side, you can just keep track of the handful and then say “everything else goes thataway”. Makes things simple.
Second, the routers don’t usually have routing information for individual computers. Instead, they know about whole subnets. This is where the network mask comes into play. Instead of listing 1.2.3.0 and 1.2.3.1 and 1.2.3.2 up to 1.2.3.255, the router has patterns for things like 1.2.3.x or 1.2.x.x. The way this is done would require an explanation of binary numbers and would probably bore you to tears. But if you want, I’ll take a shot at explaining it.
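For anyone who does want to see it, here's a rough sketch using Python's standard ipaddress module. The three routes and the interface names are invented for the example. The first function shows the binary trick (AND the address with the mask and compare), and the second picks the most specific pattern that matches, which is how routers normally break ties between overlapping patterns.

```python
import ipaddress

# Hypothetical routing table: 1.2.3.x, 1.2.x.x, and "everything else".
ROUTES = [
    (ipaddress.ip_network("1.2.3.0/24"), "interface 1"),
    (ipaddress.ip_network("1.2.0.0/16"), "interface 2"),
    (ipaddress.ip_network("0.0.0.0/0"), "interface 3"),  # default route
]


def in_subnet(address, network, mask):
    """The network mask trick: AND both addresses with the mask and compare."""
    a = int(ipaddress.ip_address(address))
    n = int(ipaddress.ip_address(network))
    m = int(ipaddress.ip_address(mask))
    return (a & m) == (n & m)


def lookup(address):
    """Return the interface for the most specific route containing `address`."""
    addr = ipaddress.ip_address(address)
    matches = [(net, out) for net, out in ROUTES if addr in net]
    # The default route matches everything, so `matches` is never empty here.
    best_net, best_out = max(matches, key=lambda m: m[0].prefixlen)
    return best_out


print(in_subnet("1.2.3.77", "1.2.3.0", "255.255.255.0"))
print(lookup("1.2.3.77"), lookup("1.2.9.9"), lookup("8.8.8.8"))
```

Three table entries cover every address on the internet, which is the whole point of working with subnets instead of individual computers.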
So the basic concepts are that each computer or router has a little bit of knowledge about who the next hop should be for a given packet, and that’s it. This is illustrated by the example where some network administrator screws up some routing table information, and when trying to get from A to E, the packet will go A -> B-> C-> B-> C-> B-> C-> B-> timeout.
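Here's a toy simulation of that loop in Python. What actually stops the ping-pong is a time-to-live (TTL) counter carried in every IP packet, which each router decrements before forwarding. The NEXT_HOP table and the forward function are hypothetical, and the starting TTL of 8 is just small enough to keep the output short.

```python
# Each node only knows its own next hop for packets headed to E.
NEXT_HOP = {
    "A": "B",
    "B": "C",
    "C": "B",  # the mistake: C sends traffic for E back to B instead of on to D
}


def forward(source, destination, ttl=8):
    """Follow next hops until the packet arrives or its TTL runs out."""
    path = [source]
    node = source
    while node != destination:
        if ttl == 0:
            return path, "ttl expired"
        node = NEXT_HOP[node]
        path.append(node)
        ttl -= 1  # every hop uses up one unit of the packet's lifetime
    return path, "delivered"


path, status = forward("A", "E")
print(" -> ".join(path), "->", status)
```

Neither B nor C can tell anything is wrong, because each one is faithfully following its own table. Only the counter running out ends it.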
If you’re on Windows and want to get a little nerdier, do something like “tracert www.yahoo.com” at a command prompt (just the host name, no http://), and you’ll see the routers that the packet hits along the way. (tracert stands for “traceroute”, which is what the command is called on unix systems). It’s a pretty useful troubleshooting tool, too.
In the meantime, computer C is pissed that it was left out of the loop entirely, and launches nuclear missiles at China, ending the world as we know it. This is a known bug that will be addressed in the next release of the software.
galt laid it out really well. Here’s a bit more detail that you didn’t ask for and probably don’t care about:
As pointed out, each router has a set of tables defining what networks lie beyond each of its interfaces. As for how those tables come to be, the tables can either be statically configured, i.e. a human plugs rules in, or dynamic, where routers talk to each other and share what they know. This conversation between routers is known as a routing protocol. There are several routing protocols, but most of them fall into one of two buckets, distance-vector and link state. In the case of distance-vector, routers announce to their neighbors the routes they can reach and how far away each one is, and each router records which neighbor offers the shortest way to each route. In the case of link state, routers tell every other router about the state of their own links, and each router constructs its own map of the universe from which it figures out which interface is best to use for each route.
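For the curious, here's a minimal sketch of the distance-vector idea in Python. It isn't any particular protocol: the function name merge_advertisement, the table layout, and the cost of 1 per hop are all assumptions made for the example.

```python
def merge_advertisement(table, neighbor, advertised, link_cost=1):
    """Fold one neighbor's announcement into our table.

    `table` maps destination -> (distance, next_hop).
    `advertised` maps destination -> distance as seen by the neighbor.
    Returns True if any entry changed.
    """
    changed = False
    for dest, dist in advertised.items():
        candidate = dist + link_cost  # neighbor's distance plus the hop to reach it
        current = table.get(dest)
        if current is None or candidate < current[0]:
            table[dest] = (candidate, neighbor)
            changed = True
    return changed


# Router B starts out knowing only its directly attached network.
table_b = {"net1": (0, "direct")}

# C announces what it can reach, then D announces a shorter way to net3.
merge_advertisement(table_b, "C", {"net2": 0, "net3": 1})
merge_advertisement(table_b, "D", {"net3": 0})

print(table_b)
```

B ends up with a next hop for every network it has heard about, without ever seeing a map of the whole topology.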
[hijack]
Wouldn’t it be true, then, that the more computers there are on the internet, the more time it takes for a packet to get from point A to point B?
And, also, what happens if I’m sending a packet from my computer to a server, and somewhere along the line, one of the computers that the packet is being sent through shuts off before it sends the packet to another computer (because of a power outage, for example)?
What happens to the packet? Does the server receive fragmented information, or is that the source of those “no route to host” errors that seem to go away when I reload the webpage?
[/hijack]
Going back to galt’s excellent explanation, all the routers involved in passing your packets from A to B will need to examine the packet’s destination and make a routing decision. Luckily, this is done by custom-designed hardware that runs pretty damn fast - your packets go from a small local router to some badass backbone routers and then through some smaller routers again until they reach the desired server. Of course, some milliseconds of processing time are added every time you pass through a router.
The actual number of computers connected to the internet isn’t a problem, except of course for the fact that they tend to generate traffic that competes with your traffic for the available bandwidth.
Slow connections are mainly due to network bottlenecks - leading to queues and eventually buffer overflows and packet drops. Misconfigured routers can pose problems as well. And of course, when the topology changes, the routers have to use CPU resources to recalculate routing tables instead of moving data - this might lead to packet loss, as well. Some protection mechanisms are in place to prevent this from going critical (Router A was so busy calculating routes based on updates from router B that router C concluded A was down and decided to inform B of this, leading to a NEW set of updates from B - you see the problem?)
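To show how a bottleneck turns into drops, here's a toy model in Python of one outgoing link with a fixed-size buffer. The function name run_link and all the numbers are made up; real routers use fancier queueing, but plain "drop it if the buffer is full" captures the basic effect.

```python
from collections import deque


def run_link(arrivals_per_tick, sends_per_tick, buffer_size, ticks):
    """Simulate one congested link and return (delivered, dropped)."""
    queue = deque()
    delivered = 0
    dropped = 0
    for _ in range(ticks):
        # Packets arrive; any that do not fit in the buffer are thrown away.
        for _ in range(arrivals_per_tick):
            if len(queue) < buffer_size:
                queue.append("packet")
            else:
                dropped += 1
        # The link can only send so many packets per tick.
        for _ in range(min(sends_per_tick, len(queue))):
            queue.popleft()
            delivered += 1
    return delivered, dropped


print(run_link(arrivals_per_tick=2, sends_per_tick=2, buffer_size=4, ticks=10))
print(run_link(arrivals_per_tick=3, sends_per_tick=2, buffer_size=4, ticks=10))
```

As long as traffic arrives no faster than the link can send, nothing is lost. Push in just one extra packet per tick and the buffer fills up, after which every extra packet is dropped.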
As for packet drops - that’s another can of worms.
If a packet is dropped for whatever reason (power outage, buffer overflow, noisy lines) the protocol you’re using might have a complex set of sequence numbers etc. to handle this and politely request that the packet be retransmitted (TCP) or it might not (UDP). If it doesn’t, it’s because it’s assumed that the applications using the protocol either will take care of asking for the information again or won’t care.
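Here's a stripped-down sketch in Python of the difference. It is nowhere near real TCP (no windows, no timers, no actual acknowledgement packets); it only shows the idea of numbering the chunks and resending whatever goes missing. The function names and the lost_attempts set, which lists which transmissions the pretend network drops, are inventions for the example.

```python
def send_reliably(chunks, lost_attempts, max_tries=5):
    """TCP-like: number each chunk and resend it until it gets through.

    `lost_attempts` is a set of (sequence_number, attempt) pairs that the
    simulated network drops. Returns (received_chunks, total_transmissions).
    """
    received = []
    transmissions = 0
    for seq, chunk in enumerate(chunks):
        for attempt in range(max_tries):
            transmissions += 1
            if (seq, attempt) in lost_attempts:
                continue  # nothing comes back, so the sender tries again
            received.append(chunk)
            break
        else:
            raise TimeoutError(f"gave up on chunk {seq}")
    return received, transmissions


def send_unreliably(chunks, lost_attempts):
    """UDP-like: send each chunk once and never look back."""
    return [c for seq, c in enumerate(chunks) if (seq, 0) not in lost_attempts]


chunks = ["he", "ll", "o!"]
lost = {(1, 0)}  # the first transmission of chunk 1 is dropped

print(send_reliably(chunks, lost))
print(send_unreliably(chunks, lost))
```

The reliable sender pays for one extra transmission and delivers everything. The unreliable one leaves a hole, and it's up to the application to notice or not care.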
Packet drops happen all the time - sometimes they’re symptoms of something highly unpleasant (a backbone router was wiped out by a meteorite), sometimes they happen in the normal scheme of things (a Frame Relay node got temporarily congested and dropped a couple of hundred frames). The internet is designed to handle both scenarios and does so.
The “no route to host” sounds like a problem in your setup or that of your ISP - generally speaking, problems in the backbone would just lead to a variation on the “request timed out” error.
S. Norman
Thanks for filling in that missing link
I’ve mentioned this before, but to get a really good understanding of how the internet works, download the movie Warriors of the Net at:
It’s a bit large, but a fantastic, multimedia way of understanding the whole thing.