What do internet services care more about: speed or connect time?

(Okay, one more shot.) No, I am not asking how the connection between my phone and my provider works. Being a phone has nothing to do with it, the ISP has nothing to do with it. The question (which I must have mistakenly thought was simple) was:

1.) I go to largeunspecifiedfile.com.

2.) I pick a file for download at largeunspecifiedfile.com.

3.) The software running largeunspecifiedfile.com says to itself “I have x amount of bandwidth available and y number of connections available. Do I care more about spreading bandwidth around to more users, or getting rid of this task as fast as possible?”

Anything outside of how the software at largeunspecifiedfile.com answers that question isn’t part of my question.

I think I see where you are going.

TCP may be the mechanism that implements the wishes of the “throttle”; all you want to know is what the hand that sets the throttle has chosen.

(assuming TCP is used here…there are definitely other protocols in use)

In most cases your traffic to/from a content provider will go through plenty of Internet plumbing, a load balancer, then eventually resolve to a physical server. Your machine will establish a TCP connection with that server, often 10 or 20 hops away. There will be a three-way handshake at the beginning of the connection to establish TCP settings, then your machine asks for the file.
The server will start sending the file as fast as it can, limited by I/O, CPU, buffer size, and so on. The thing that will throttle everything will be TCP.

TCP will be throttled by network devices all along the way that will twiddle settings in the packets (e.g. Receive Window size) or drop packets (a valid way to say “hold-on, cowboy!” to the sender). These are controlled by different organizations with differing goals. The site owner will have their settings; their ISP will have theirs; your ISP will apply settings, and you yourself can apply quality-of-service settings. All of these will affect the stream.

With that said, I apologize that I don’t have insight into the optimization goals of the ISP at either end or the content provider.

Moderator Note

This is pretty rude, especially when directed at someone who is just trying to help you. Try not to be so snippy in this forum. Stick to the facts.

Look at it from the point of view of the provider.

If they are streaming - i.e. you are watching a show as it downloads - then there is no real incentive to stream faster than enough to build a suitable buffer for interruptions.
If you are downloading - then they will download as fast as they can, given total circumstances.
Then consider you are nobody special - everyone (presumably) is an equal partner. So your download speed will also be determined by everyone else using the same download service/server(s).
(I think the days when the number of TCP connections was an issue are past - "the server is too busy now, try later.")

I presume the question is - "will they prioritize my shorter remaining download to get me off their list?" Again, consider it from the server's point of view. Odds are there are hundreds, or thousands, of downloaders. Completing one download faster because there's less remaining will not ease the load on the server significantly. Also, programmatically, "give everyone equal time" is the simplest algorithm to program. (Or more likely, for live streams, set the minimum useful stream speed by priority - more for 4K than for 1080P, for example.)

So the short answer is likely - everyone online at the same time gets the most they can handle until the server is maxed out, then it allocates based on giving everyone about the same. "You can have all you can handle up to 120Mbps" or something.
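That "everyone gets what they can handle until the server is maxed, then equal shares" policy is known as max-min fairness. Here's a minimal sketch of it in Python; the client demands and the 240 Mbps capacity are made-up numbers for illustration, not anything from a real server:

```python
# Sketch of max-min fair bandwidth allocation: each client gets what it can
# handle, capped so that the total never exceeds server capacity.
# All demands/capacity figures here are hypothetical.

def max_min_fair(demands, capacity):
    """Return per-client allocations (same order as demands)."""
    alloc = [0.0] * len(demands)
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    left = float(capacity)
    n = len(order)
    for idx, i in enumerate(order):
        fair_share = left / (n - idx)        # equal split of what's left
        alloc[i] = min(demands[i], fair_share)
        left -= alloc[i]
    return alloc

# Three clients that can handle 50, 120 and 300 Mbps on a 240 Mbps uplink:
print(max_min_fair([50, 120, 300], 240))    # -> [50.0, 95.0, 95.0]
```

The slow client gets everything it can use, and the leftover capacity is split evenly among the clients that could take more.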

You might also have traffic prioritized for “premium” customers. Who knows.

I’m also going to guess that 99% of the time, the server is not maxed out, unless the server company is really stupid and cheap or there’s a major live event. So you are getting the max your current connection allows. As others point out, despite your protests, your local connection speed depends on a lot of things.

At the server end, there is a server application - the NGINX web server, for example. This application has sockets (i.e., connections) that communicate with the operating system network stack, which manages the data from the sockets and sends/receives from the physical network interface.

The network stack has an output queue, and socket queues. The output queue is filled by taking packets from all the available socket queues. In most cases, and for most network stacks, this will be some form of fair-access algorithm so that all sockets with data to send get an equal slice of the available output bandwidth. However, TCP stack features (such as Quality of Service or rate shaping) may adjust this so that some sources/destinations get a larger or smaller slice of the available bandwidth.
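The fair-access idea above can be shown with a toy round-robin drain over per-socket queues. This is just an illustration of the scheduling pattern, not how any particular network stack is actually implemented:

```python
# Toy round-robin "fair access": fill one output queue by taking one packet
# at a time from each socket queue that still has data. Purely illustrative.

from collections import deque

def drain_round_robin(socket_queues):
    """Interleave packets from all queues into one output order."""
    queues = [deque(q) for q in socket_queues]
    output = []
    while any(queues):
        for q in queues:
            if q:                       # skip sockets with nothing to send
                output.append(q.popleft())
    return output

# Two busy sockets and one with a single packet queued:
print(drain_round_robin([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]))
# -> ['a1', 'b1', 'c1', 'a2', 'c2', 'a3']
```

Every socket with data gets an equal slice of the output; a QoS-aware stack would instead weight some queues more heavily than others.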

For the default TCP stack on most servers, if you are the only current connection, then you get all the available bandwidth. Otherwise, you will share bandwidth with all active connections. But this is not a given, because network traffic can be managed at many points along the path; not always directly on the server, but it might be.

If you throw out all the complexities about streaming and quality of service and so on, then the answer is minimum time. If a server has 10 users downloading a file simultaneously, and it will take the server 10 minutes to transmit that information, it’s better for everyone if the first user gets maximum bandwidth for the first minute, the second user max bandwidth for the second minute, etc. As opposed to sharing the bandwidth equally and forcing all of them to take the full 10 minutes.
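The arithmetic behind that claim is easy to check: serving the downloads one at a time gives the same worst-case finish time but a much better average. A quick back-of-the-envelope in Python, using the 10-users/10-minutes numbers from the example:

```python
# Serial vs. equal-share completion times for 10 identical downloads that
# take 10 server-minutes in total. Numbers match the example above.

n_users, total_minutes = 10, 10

# Serve users one at a time at full bandwidth: user i finishes at minute i+1.
serial_finish = [(i + 1) * total_minutes / n_users for i in range(n_users)]

# Share bandwidth equally: everyone finishes together at the end.
shared_finish = [total_minutes] * n_users

print(sum(serial_finish) / n_users)    # average wait: 5.5 minutes
print(sum(shared_finish) / n_users)    # average wait: 10.0 minutes
```

The last user waits 10 minutes either way, but serial service nearly halves the average wait.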

But the complexities can’t be thrown out, and this isn’t generally how it works out in practice. I’ve seen Bittorrent clients do this, but general web servers do not usually have the information needed to prioritize in this way. It’s more likely to just be split based on who can accept the bandwidth.

Thanks for the reply. I don’t mind that the answer and situation is complex, as long as it is related to the scheduling done by the download site, which is where my curiosity lay.

As si_blakely notes, generally the decision is more on the side of the TCP stack. It aims for some sense of fairness, taking into account traffic priority and other things. Generally it’s going to try not to starve anyone, but bandwidth will also generally go to the client that can accept it. And there will be higher priority packets based on various factors.

It is indeed a complex question. As originally posed, it’s more of a spherical cow hypothetical.

This is basically the idea behind Quality of Service (QoS), as mentioned a couple posts above.

For a single 1GB file, maybe you can have a definitive answer for a given set of circumstances for the source.

But as soon as there's a single complication - say the person sending the file also wants to video chat at the same time because getting the file is part of some job - there are now some issues to balance. That video stream needs a fairly guaranteed rate - nobody wants the video call stuttering the entire time. But maybe you're willing to deal with a little stuttering? Or a lot? If your kid is streaming YouTube or something, you need to factor that in too. Balancing the loads becomes pretty important all around.

So, as above, if you have only the single connection, you max out the connection. That’s how baseline TCP works. But that’s as useful as the hypothetical “spherical cow” you get in physics problems. Those spherical cows don’t really exist in the real world. Your phone alone is sending and receiving all sorts of information constantly, for example, so you effectively never have a dedicated connection anymore.

It’s a lot more complex than you probably thought. One of the big reasons is that TCP requires acknowledgement so that the server knows that you received the last batch of data. This means that all of the networking stuff in between the server and you (that you are trying to ignore) actually has a very significant impact on the server’s scheduling.

The server doesn’t just throw packets onto the internet and hope for the best. Each chunk is given a sequence number, and your phone (or computer, tablet, refrigerator, whatever) has to acknowledge that chunk. The server won’t serve you another chunk of data until you do. If your device doesn’t acknowledge that it received it, the server will assume it got lost somewhere and will re-transmit it.
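That send/ack/retransmit loop can be sketched as a toy stop-and-wait sender. Real TCP keeps a whole window of data in flight (as later posts discuss), but this shows the basic rule the post describes: never advance until the current chunk is acknowledged. The loss rate and seed are made up for the demo:

```python
# Toy stop-and-wait transfer: the sender won't ship chunk n+1 until chunk n
# is acknowledged, and retransmits when the ack never arrives. Illustrative
# only; real TCP pipelines many segments at once.

import random

def stop_and_wait(chunks, loss_rate=0.3, seed=1):
    rng = random.Random(seed)   # fixed seed so the "network" is repeatable
    log = []
    seq = 0
    while seq < len(chunks):
        log.append(f"send seq={seq}")
        if rng.random() < loss_rate:               # simulate a lost packet/ack
            log.append(f"timeout seq={seq}, retransmit")
            continue                               # resend the same chunk
        log.append(f"ack seq={seq}")
        seq += 1                                   # only now move on
    return log

for event in stop_and_wait(["chunk-a", "chunk-b"]):
    print(event)
```

Each chunk may be sent several times, but exactly one ack per sequence number moves the transfer forward.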

Let’s take a relatively simple example of a server that doesn’t really have any sort of optimized scheduling. Keep in mind that even this is very much oversimplified.

So, your device requests a file. There is some handshaking that goes on to open a "channel" (a TCP connection) between your device and the server. Typically the server will spawn off a thread to handle your request. If there are a lot of other threads being opened and handled, your thread might get bumped a bit by the server's operating system as it handles other higher-priority threads first. You are requesting a file, so now the operating system has to start accessing that file. If the server has a lot of disks and no one is accessing the disk where your file resides, then great! Your thread gets to grab that data right away. But if someone else is accessing that disk, then your thread has to wait. Once it gets the data, it sends it back out to you. Your packet gets queued up in the server's network driver. If there are a lot of packets queued up ahead of it, then your packet has to wait until those packets go out first. Then your packet finally goes out onto the network.

This is where all of that networking stuff that you don't want to consider comes into play. Your packet has to go through all kinds of network wiring, switches, etc. to finally get to your phone. Then your phone has to acknowledge that it received it. Your thread is going to sit there completely idle in the server while all of this happens. The server at largeunspecifiedfile.com is literally doing nothing for your file transfer, because your phone hasn't acknowledged that it received the packet yet. All of that networking stuff that you don't want to know about has had probably the largest impact out of everything affecting the server's scheduling algorithm.

Finally the acknowledge comes back from your phone, which wakes up the thread. The thread waits in line in the task queue so that it can be run. Maybe the thread already fetched the next chunk of data of the disk, or maybe the thread now has to fetch that data and might have to get queued for disk access. One way or the other, the thread sends out the next chunk of data, and once again the thread goes to sleep waiting for your phone to respond.

As you can see, the scheduling algorithm that largeunspecifiedfile.com uses isn’t prioritizing connection time or bandwidth. It’s mostly “hurry up and wait”, where the server does what it can as quickly as it can, then waits for your phone to tell it that it received what the server sent so that it can send the next chunk of data.

What you may think are intentional behaviors aren’t so intentional, it’s just a question of maxing out different resources. If the network connection is fast, then your phone might ack data packets more quickly than the server can fetch them off of the disk. In this case, the disk accesses become the bottleneck. One way to get around that is to use multiple disks with the same data so that different tasks grab their data from different disks, and things only slow down when there are more tasks than disks and the disk accesses start to collide and force tasks to queue up to get their data.

Now that you’ve sped up the disk accesses, maybe the CPU and memory become the bottlenecks. You solve that with a bigger CPU with more cores and more memory. Or maybe you run a bunch of servers all together in a cluster so that it spreads out the work between multiple machines. To your phone, all you see is a data transfer from something called largeunspecifiedfile.com. Your phone likely isn’t even aware that a bunch of different machines are all working together to give it the best download performance.

The network can also become the bottleneck. Once you get onto the internet, the server no longer has any control over this, and no matter what optimizations it uses to make things faster, it still has to wait for your phone to ack each chunk of data. Unless the server is overloaded, this will usually be the determining factor for the server’s scheduling algorithm. It’s mostly sitting there handling other requests while your thread is stuck idle because the ack hasn’t come back from the phone yet.

It's really damned remarkable that the TCP "acknowledge last packet before sending next packet" system has scaled as well as it has. That is a very chatty and latency-intensive way to do things.

There is an ack bit in each packet, so you can ack the last sequence number while requesting the next one. You only send out a dedicated ack packet if you don’t have anything else to send on that connection.

Well, delayed ACKs have been around for a long time.

The TCP window size can scale up to 1 GB, so there can be quite a lot of data in flight before anything actually waits for an ACK to come in. Fast, high-latency connections (say, over a geosync satellite) can get there.
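The amount of data that needs to be in flight to keep a link busy is the bandwidth-delay product: link speed times round-trip time. A quick calculation, with illustrative link speeds and RTTs (the satellite and terrestrial figures are typical ballpark values, not measurements):

```python
# Bandwidth-delay product: how many bytes must be "in flight" to keep a link
# saturated while waiting for ACKs. Link speeds and RTTs are illustrative.

def bdp_bytes(bandwidth_bps, rtt_seconds):
    return bandwidth_bps * rtt_seconds / 8    # bits in flight -> bytes

# Geosync satellite: ~600 ms round trip at 100 Mbit/s
print(bdp_bytes(100e6, 0.6) / 1e6, "MB")      # 7.5 MB in flight

# Fast terrestrial path: 10 Gbit/s at 80 ms RTT
print(bdp_bytes(10e9, 0.08) / 1e6, "MB")      # 100 MB - window scaling needed
```

Both are far beyond TCP's original 64 KB window limit, which is why the window-scaling option exists.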

Probably not good enough for interplanetary use, though…

The key term of art is sliding window protocol.

You have a ring buffer containing multiple packets, and allow multiple packets in flight. The ACK return will indicate packets so far successfully received, allowing the output ring buffer to roll around.

With no packet loss you can max out the connection if the output buffer is large enough.
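The mechanics above can be sketched in a few lines: send until the window is full, then slide forward when a cumulative ACK arrives. This toy version assumes every in-flight packet is acked at once and no losses occur, so it only shows the window bookkeeping, not real TCP behavior:

```python
# Minimal sliding-window sender: up to `window` unacked packets in flight,
# advancing on cumulative ACKs. A no-loss sketch, not real TCP.

def sliding_window_send(n_packets, window):
    base = 0          # oldest unacknowledged sequence number
    next_seq = 0      # next sequence number to send
    events = []
    while base < n_packets:
        # fill the window with as many packets as it allows
        while next_seq < n_packets and next_seq < base + window:
            events.append(f"send {next_seq}")
            next_seq += 1
        # assume everything in flight gets acked cumulatively
        events.append(f"ack through {next_seq - 1}")
        base = next_seq                  # slide the window forward
    return events

print(sliding_window_send(5, 3))
# -> ['send 0', 'send 1', 'send 2', 'ack through 2',
#     'send 3', 'send 4', 'ack through 4']
```

With a window of 3, the sender ships three packets back-to-back instead of stalling after each one; with no loss and a big enough buffer, this is what lets the connection run at full rate.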

Yes, there's usually a sliding window, even going back to (slow) phone modem transfer days. Instead of waiting for each packet's ACK you have a window - I will send up to, say, 10MB before I stop to see if they are actually getting through.

(I was ignoring the possibility some transfers had different scheduling requirements - I assume this was an “all are equal” mental exercise.)

Yes and no. Who wants to wait forever just to get a blurt of data at once? What if the first guy suddenly wanders into a slower connection area? At least if everyone is getting their fair share, everyone is getting something. The scenario suggested would mean you’d request a download, be put on the queue (which is full) and wait until everyone ahead of you has finished downloading - might be a bit faster, but staring at a download doing nothing for a long time is not encouraging in terms of customer relations, which presumably is the goal of commercial servers.

No one wants to be the tenth person, but their experience is no worse than it would have been otherwise. And the other 9 people get a strictly better experience.

I agree that this is mostly not how it works in practice, and you probably wouldn’t want to do it for generic downloads. I’ve only ever seen the behavior on Bittorrent–where it’s already expected that you might have to wait arbitrary times before chunks start flowing in. It’s better for one person to receive the last chunk they’ve been waiting on than for someone else to get a chunk when their file is only 50% complete.

This is how it worked when my wife was shopping for shoes, because a store with 3 workers has limited bandwidth. You queue up, and when an attendant is free, you get their undivided attention until you are done; then they serve the next person. If there were only one attendant it would be even more frustrating when the person ahead has inordinately long demands.

Which is a roundabout way of saying that since demand and download speed are not necessarily predictable, trying to be too clever with algorithms may not pay off. The real goal is to be flooding the server's output at the maximum bandwidth, since maximum bandwidth is one of the factors that determines the cost of the connection to the business. (Not yet mentioned is latency. If the first burst takes a while to be acknowledged, it's not in the server's interest to sit idle until the acknowledgements start rolling back; start serving someone else.)

I found this article

which reviews many resource allocation techniques for 5G, making me think there is no one specific scheme mandated. Therefore there may not be a short, unequivocal answer to @Darren_Garrison's question.

I wasn’t asking about 5G, though. I (very, very, very mistakenly) mentioned using my cell phone, but the specific network and device being used isn’t part of my question. It could just as well be a desktop computer on a cable modem, satellite connection (traditional geosynch or Starlink) or a T1 (if those are still a thing). Forget about the 5G and cell phone part.