It’s a lot more complex than you probably thought. One of the big reasons is that TCP requires acknowledgement so that the server knows that you received the last batch of data. This means that all of the networking stuff in between the server and you (that you are trying to ignore) actually has a very significant impact on the server’s scheduling.
The server doesn’t just throw packets onto the internet and hope for the best. Each chunk is given a sequence number, and your phone (or computer, tablet, refrigerator, whatever) has to acknowledge that chunk. The server won’t serve you another chunk of data until you do. If your device doesn’t acknowledge that it received it, the server will assume it got lost somewhere and will re-transmit it.
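To make that send-ack-resend loop concrete, here’s a rough sketch in Python. This is not how you’d write a real download (TCP does all of this inside the kernel, and it actually keeps a window of unacknowledged data in flight rather than strictly one chunk at a time); it’s just the stop-and-wait idea from the paragraph above, written out over UDP with a made-up receiver address.

```python
import socket

CHUNK = 1024          # bytes per chunk
ACK_TIMEOUT = 1.0     # seconds to wait for an ack before retransmitting
RECEIVER = ("203.0.113.5", 9000)   # hypothetical receiver address

def send_file(path):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(ACK_TIMEOUT)
    seq = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            # Tag the chunk with a sequence number so the receiver can ack it.
            packet = seq.to_bytes(4, "big") + chunk
            while True:
                sock.sendto(packet, RECEIVER)
                try:
                    ack, _ = sock.recvfrom(4)
                    if int.from_bytes(ack, "big") == seq:
                        break          # receiver confirmed this chunk, move on
                except socket.timeout:
                    pass               # no ack: assume it got lost, send it again
            seq += 1
```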
Let’s take a relatively simple example of a server that doesn’t really have any sort of optimized scheduling. Keep in mind that even this is very much oversimplified.
So, your device requests a file. There is some handshaking that goes on to open a “channel” (a TCP connection) between your device and the server. Typically the server will spawn off a thread to handle your request. If a lot of other threads are being opened and handled, your thread might get bumped a bit by the server’s operating system as it handles higher-priority threads first. You are requesting a file, so now the operating system has to start accessing that file. If the server has a lot of disks and no one is accessing the disk where your file resides, then great! Your thread gets to grab that data right away. But if someone else is accessing that disk, then your thread has to wait. Once it gets the data, it sends it back out to you. Your packet gets queued up in the server’s network driver. If there are a lot of packets queued up ahead of it, then your packet has to wait until those packets go out first. Then your packet finally goes out onto the network.
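Here’s a minimal sketch of that kind of server in Python: one thread per connection and no clever scheduling at all. The port and the “send me a filename” protocol are invented for the example; a real server would speak HTTP and do far more error handling.

```python
import socket
import threading

def handle_client(conn):
    try:
        name = conn.recv(1024).decode().strip()   # client asks for a file by name
        with open(name, "rb") as f:               # may have to wait its turn for the disk
            while True:
                chunk = f.read(64 * 1024)
                if not chunk:
                    break
                # sendall() blocks once the kernel's send buffer fills up,
                # which is exactly the "waiting on the phone's acks" part.
                conn.sendall(chunk)
    finally:
        conn.close()

def serve(port=9000):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", port))
    srv.listen()
    while True:
        conn, _ = srv.accept()                    # handshake done, new "channel" opened
        threading.Thread(target=handle_client, args=(conn,)).start()
```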
This is where all of that networking stuff that you don’t want to consider comes into play. Your packet has to go through all kinds of network wiring, switches, etc. to finally get to your phone. Then your phone has to acknowledge that it received it. Your thread is going to sit there completely idle in the server while all of this happens. The server at largeunspecifiedfile.com is literally doing nothing for your file transfer, because your phone hasn’t acknowledged that it received the packet yet. All of that networking stuff that you don’t want to know about probably has the largest impact of anything on the server’s scheduling.
Finally the acknowledgment comes back from your phone, which wakes up the thread. The thread waits in line in the task queue so that it can be run. Maybe the thread already fetched the next chunk of data off the disk, or maybe it has to fetch that data now and might get queued up for disk access. One way or the other, the thread sends out the next chunk of data, and once again it goes to sleep waiting for your phone to respond.
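If you wanted the disk read to overlap with that idle wait, one hypothetical tweak to the handler sketched above is a small reader thread that keeps a queue of chunks topped up, so the next chunk is usually already in memory by the time the connection is ready for it:

```python
import queue
import threading

def handle_client_prefetch(conn, name, depth=4):
    chunks = queue.Queue(maxsize=depth)   # a few chunks of read-ahead

    def reader():
        with open(name, "rb") as f:
            while True:
                chunk = f.read(64 * 1024)
                chunks.put(chunk)         # blocks once the queue is full
                if not chunk:
                    return                # empty bytes object marks end of file

    threading.Thread(target=reader, daemon=True).start()
    while True:
        chunk = chunks.get()
        if not chunk:
            break
        conn.sendall(chunk)               # may block while waiting on acks/window space
    conn.close()
```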
As you can see, the scheduling algorithm that largeunspecifiedfile.com uses isn’t prioritizing connection time or bandwidth. It’s mostly “hurry up and wait”, where the server does what it can as quickly as it can, then waits for your phone to tell it that it received what the server sent so that it can send the next chunk of data.
What you may think are intentional behaviors aren’t so intentional; it’s mostly a question of which resource maxes out first. If the network connection is fast, then your phone might ack packets more quickly than the server can fetch data off the disk, and the disk accesses become the bottleneck. One way to get around that is to use multiple disks holding the same data, so that different tasks grab their data from different disks. Things only slow down once there are more tasks than disks, the disk accesses start to collide, and tasks have to queue up to get their data.
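As a toy illustration, spreading requests across replicated disks can be as simple as rotating through the mount points. The mount points below are invented, and a real server would track per-disk load rather than blindly round-robining:

```python
import itertools

DISK_MOUNTS = ["/mnt/disk0", "/mnt/disk1", "/mnt/disk2"]   # each holds a full copy of the data
_next_disk = itertools.cycle(DISK_MOUNTS)

def open_replica(filename):
    # Any disk can serve the read, so just take the next one in rotation.
    mount = next(_next_disk)
    return open(f"{mount}/{filename}", "rb")
```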
Now that you’ve sped up the disk accesses, maybe the CPU and memory become the bottlenecks. You solve that with a bigger CPU with more cores and more memory. Or maybe you run a bunch of servers together in a cluster so that the work is spread across multiple machines. To your phone, it’s all just a data transfer from something called largeunspecifiedfile.com. Your phone likely isn’t even aware that a bunch of different machines are working together to give it the best download performance.
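One common way a single name ends up backed by many machines is plain DNS: the name publishes several addresses, and different clients land on different ones. You can see whatever addresses your resolver hands back for a host with a couple of lines of Python (example.com here is just a stand-in):

```python
import socket

# Print every address the resolver returns for the name; a load-balanced
# service will often return several, and which one you end up talking to
# is invisible to the application doing the download.
for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 443, proto=socket.IPPROTO_TCP):
    print(sockaddr[0])
```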
The network itself can also become the bottleneck. Once the data is out on the internet, the server no longer has any control over it, and no matter what optimizations it uses to make things faster, it still has to wait for your phone to ack each chunk of data. Unless the server is overloaded, this will usually be the determining factor for the server’s scheduling: it’s mostly off handling other requests while your thread sits idle because the ack hasn’t come back from the phone yet.