To give you an idea of my background, I work as an Internet developer in an IT department.
So, last week a couple of us got to discussing networking at the office, and we got into rather a heated debate. I should point out that the two guys I was debating with are very intelligent, and knowledgeable about computers. They are both ex-programmers - one a hardcore C/C++/C# guy, the other from the Java world.
On top of that, one of them is my boss, and the other is his boss’s boss, therefore 3 levels above me, so I had to be somewhat diplomatic. When I say “heated”, it was actually pretty cordial - but we’re all geeks and therefore passionate about our subject.
We were talking about sockets & ports. One of them made the statement that when a client connects to a web server, even though the initial connection is to port 80, it is immediately thereafter shunted to another port, to “free up” port 80 so that other clients can connect, without “overloading” it.
To which I replied, that’s bollocks.
Or rather, I politely pointed out that I didn’t agree (since he’s a director). I asserted that all incoming connections arrive on port 80 (except for SSL on 443), and they stay there for the duration of the TCP/IP conversation (or session, if that’s the right word). I know that a server can listen on any port, but I’m talking about the normal case for this example - a web server. Once the connection is established, all traffic is between some (usually) randomly assigned port on the client, and the well-known port 80 on the server.
I consulted the TCP RFC (RFC 793) to back up my argument - sure enough, a “socket” is defined as an IP address plus a port number, and a connection is identified by the pair of sockets at its two ends; if either port number changes, it’s a different socket and therefore a separate connection. I read through the relevant RFC sections a couple of times, but they didn’t seem to nail down the “shunting” question specifically enough.
I also pointed out that if subsequent traffic were to be sent to a port other than 80, most firewalls would block it, since they only allow a narrow subset of ports (usually 80, 443, 20 & 21, plus maybe a small number of others). Therefore it must be continuing to hit the same port.
I further asserted that multiple connections to port 80 are possible because of differing, usually dynamically-assigned ports on the client side, so it’s perfectly legal to have this:
Client 1:25001 -> Server:80
Client 2:28321 -> Server:80
since the client ports are different, and therefore are distinguishable at the server.
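To make the point concrete, here is a minimal Python sketch of the situation above. It is only an illustration under some assumptions: everything runs on the loopback address, the OS picks a free port that stands in for port 80 (binding to 80 itself usually needs admin rights), and the function name demo_shared_server_port is made up for this example.

```python
import socket


def demo_shared_server_port(n_clients=2):
    """Open several client connections to one listening port.

    Returns the listening port and, for each connection, a pair of
    (client-side port, server-side port) as seen by the server.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS for any free port; it stands in for port 80 here.
    listener.bind(("127.0.0.1", 0))
    listener.listen(n_clients)
    server_port = listener.getsockname()[1]

    clients, accepted, rows = [], [], []
    try:
        for _ in range(n_clients):
            client = socket.create_connection(("127.0.0.1", server_port))
            conn, peer = listener.accept()
            clients.append(client)
            accepted.append(conn)
            # peer[1] is the client's port; getsockname() is the local
            # (server-side) endpoint of the accepted connection.
            rows.append((peer[1], conn.getsockname()[1]))
    finally:
        for s in clients + accepted + [listener]:
            s.close()
    return server_port, rows


if __name__ == "__main__":
    port, rows = demo_shared_server_port(2)
    for client_port, server_side_port in rows:
        print(f"Client:{client_port} -> Server:{server_side_port}")
```

Each accepted connection gets its own socket object in the program, but the server-side port it reports is the listening port every time; only the client-side ports differ.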
I think the confusion our director has is with load distribution that happens at the application level - it’s very common for an app listening on a TCP/IP socket to receive incoming requests, then immediately dispatch them to separate “worker” threads, leaving the main thread to pull new incoming messages. This is how, for example, Microsoft’s web server IIS works. I managed to get them to agree that threads have nothing to do with sockets and ports, and that the TCP/IP stack only cares that “someone” is listening on a particular port number, and it is that application (in this case, the web server) that receives the traffic.
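The dispatch pattern I mean looks roughly like the following Python sketch. This is not how IIS is actually implemented, just the general accept-then-hand-off idea; the names handle, serve and run_demo are hypothetical, the server runs on loopback with an OS-assigned port, and the "work" is simply upper-casing whatever one small message the client sends.

```python
import socket
import threading


def handle(conn):
    """Worker: serve one client on its own thread, then close."""
    with conn:
        data = conn.recv(1024)  # assumes one small message per client
        conn.sendall(data.upper())


def serve(listener, max_conns):
    """Main loop: accept a connection, hand it to a worker, accept again."""
    workers = []
    for _ in range(max_conns):
        conn, _addr = listener.accept()
        t = threading.Thread(target=handle, args=(conn,))
        t.start()
        workers.append(t)
    for t in workers:
        t.join()


def run_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # OS-assigned port, stands in for 80
    listener.listen(5)
    port = listener.getsockname()[1]

    server = threading.Thread(target=serve, args=(listener, 2))
    server.start()

    replies = []
    for msg in (b"hello", b"world"):
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(msg)
            replies.append(client.recv(1024))

    server.join()
    listener.close()
    return replies


if __name__ == "__main__":
    print(run_demo())
```

The hand-off happens entirely inside the application: the worker thread gets the already-accepted connection, and nothing about the port numbers changes when the thread does.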
I think I managed to finally convince them of all this. However, one nagging issue remains: my manager says that the term ‘port’ used to have an alternative meaning way back in Windows’ history. It meant something like a “connection”, the sum total of client IP + client port + server IP + server port, and that Windows maintains a “collection” of such things for balancing network requests. He said they were called something like “logical ports” or “virtual ports”. Note that these have nothing to do with hardware ports like COM1:, and were different from regular TCP/IP ports. He said they were sort of like file descriptors - this rang a bell with me, as I remember in the Winsock headers, some of the #define’s begin with FD_ for file descriptor.
OK, you made it this far through my meandering post - I probably should summarize what the actual General Questions are here:
- Is my description right about how the connection goes to a particular port, and stays there for the life of the connection?
- Has anyone heard of these strange “ports” from way back when, which aren’t the same as TCP/IP ports?
Any clarification much appreciated…