When a router gets a packet, how does it know which computer on the network to send it to?
Example: Say we have a router with the global IP address 1.2.3.4 and the local IP address 10.0.0.1. To this router are connected two machines on the local network, with local IPs 10.0.0.2 and 10.0.0.3. Both machines connect to some instant messaging network. Suppose the IM protocol is such that all messages get routed through a central server, 5.6.7.8.
Say Joe Schmoe at IP 9.10.11.12 wants to send an IM to John Doe’s machine, 10.0.0.2, behind the router. So 9.10.11.12 sends the message to the IM server 5.6.7.8, which verifies that John Doe is online and in turn forwards the message on to the router 1.2.3.4. (Presumably all the information 5.6.7.8 has about John Doe is that his IP address is 1.2.3.4; it doesn’t know about the local IPs behind the router.) So now the router has the message; how does it know whether it’s meant to be forwarded to 10.0.0.2 or 10.0.0.3? Remember that both 10.0.0.2 and 10.0.0.3 are running IM clients.
I know this works, because on my home network my housemates and I each have our own computer, we are all connected to the same IM networks, and yet we don’t receive each other’s messages. But I don’t understand how it works.
The technique at work here is called NAT, for Network Address Translation, and it was invented precisely for the purpose of allowing two-way communication between a machine on a private network behind a router and a machine on the public Internet (the IM server in this example).
NAT works by keeping track of port numbers. Whenever an incoming or outgoing connection is made via TCP or UDP, a two-byte port number is included along with the IP address. This port number allows the host machines to differentiate between different kinds of traffic. Port numbers below 1024 are reserved for standard services; port 80 is the default for HTTP, for example. When an outgoing connection is made, the originating computer chooses a random port number of 1024 or above to use on its end.
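If you want to see that random port selection for yourself, here is a small Python sketch. It asks the operating system for any free local port by binding to port 0, which is roughly what happens behind the scenes when a program opens an outgoing connection. The loopback address 127.0.0.1 is just my choice for the demo, and the number you get back depends on your OS.

```python
import socket

# Create a TCP socket and let the OS choose the local port.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(("127.0.0.1", 0))      # port 0 means "OS, pick a free port for me"
    ip, port = s.getsockname()    # find out which port was actually assigned
    print(f"OS assigned local port {port}")
```

Run it a few times and you will usually see a different high-numbered port each time.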
So, let’s say our guy behind the router at 10.0.0.2 makes an outgoing connection to the IM server at 5.6.7.8. His machine picks a random port number, say 23456, and opens a connection from 10.0.0.2:23456 to 5.6.7.8:1234 (here we assume that 1234 is the standard port number for this IM service).
That packet has to go through the router, since 5.6.7.8 is located on another network. The router makes a note that 10.0.0.2 has an outgoing connection on port 23456 to Internet machine 5.6.7.8. The router then forwards the packet, substituting its public IP address, 1.2.3.4, for the private one in the “from” field, but keeping the originating port number the same[sup]1[/sup]. When the IM server sends a message back, it sends it to 1.2.3.4:23456. The router remembers that local machine 10.0.0.2 had an outgoing connection to that server originating from port 23456, so it rewrites the “to” address from its own public IP to 10.0.0.2 and forwards the packet on to that machine. The other machine, 10.0.0.3, will have opened its own connection with its own entry in the router’s table, which is how the two IM clients are kept apart.
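Here is a toy Python model of that bookkeeping, in case it helps. It is only a sketch of the idea, not how any real router is coded: the names nat_table, outbound and inbound are made up for illustration, packets are reduced to (ip, port) pairs, and the source port is kept unchanged as in the simplified description above.

```python
PUBLIC_IP = "1.2.3.4"

# Translation table: (public port, remote ip, remote port) -> private (ip, port)
nat_table = {}

def outbound(src, dst):
    """Rewrite an outgoing packet's "from" address and remember who sent it."""
    src_ip, src_port = src
    nat_table[(src_port, *dst)] = src
    return (PUBLIC_IP, src_port), dst

def inbound(src, dst):
    """Rewrite an incoming packet's "to" address; None means no match (drop)."""
    dst_ip, dst_port = dst
    private = nat_table.get((dst_port, *src))
    if private is None:
        return None
    return src, private

im_server = ("5.6.7.8", 1234)
outbound(("10.0.0.2", 23456), im_server)   # John Doe's machine connects
outbound(("10.0.0.3", 40001), im_server)   # housemate's machine connects

# The server replies to the router's public address on port 23456.
print(inbound(im_server, (PUBLIC_IP, 23456)))
# (('5.6.7.8', 1234), ('10.0.0.2', 23456))
```

The reply addressed to port 23456 comes out readdressed to 10.0.0.2, while a reply to port 40001 would go to 10.0.0.3. A packet that matches no entry returns None, which also hints at why unsolicited incoming connections generally don't get through a NAT router without extra configuration.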
ETA: Wikipedia article on NAT.
[sub]1. Actually most NAT implementations will use different originating port numbers and have to keep track of those translations, too.[/sub]
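To illustrate the footnote: if two local machines happen to pick the same originating port, a router that kept the port unchanged could not tell their replies apart. A rough Python sketch of the port-remapping approach follows. The starting external port 50000 and the names translate_out and translate_in are assumptions for the demo only; real implementations choose ports and expire entries in their own ways.

```python
import itertools

PUBLIC_IP = "1.2.3.4"
_next_port = itertools.count(50000)   # hypothetical pool of external ports

out_map = {}   # (private ip, private port) -> external port
in_map = {}    # external port -> (private ip, private port)

def translate_out(private):
    """Assign (or reuse) an external port for a private ip:port pair."""
    if private not in out_map:
        ext = next(_next_port)
        out_map[private] = ext
        in_map[ext] = private
    return (PUBLIC_IP, out_map[private])

def translate_in(ext_port):
    """Look up which private ip:port an external port belongs to, if any."""
    return in_map.get(ext_port)

# Both machines chose the same local port, 23456.
a = translate_out(("10.0.0.2", 23456))
b = translate_out(("10.0.0.3", 23456))
print(a, b)                     # ('1.2.3.4', 50000) ('1.2.3.4', 50001)
print(translate_in(b[1]))       # ('10.0.0.3', 23456)
```

Because each connection gets its own external port, replies can be matched back to the right machine even when the local port numbers collide.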