Is anyone clever enough to reinvent Ben's Method for improving disk performance?

As fragmentation increases, the method’s value decreases.

But even in the best case, with sequential sectors/blocks, to avoid the performance penalty of reading against rotation (e.g. after reading X, reading the block just before X requires a full rotation), the OS would have to issue its requests in reverse order, which just recreates the original situation.

Not a bad puzzle, at least for someone like me with computer experience but no disk-specific experience. FWIW, I only saw the thread just now, and figured it out before your #40 post, so the hints were sufficient, I’d say.

Given the hints about block ordering, the reasoning ended up pretty simple. Normally when you seek, you have to wait some time until the block passes under the head. How do you arrange things so that you pre-fetch during that otherwise wasted time? Backwards is the obvious choice.
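To make that concrete, here is a toy simulation (my own sketch, nothing from Ben’s project; the geometry, landing angle, and host think-time are all invented numbers). It models a host that requests blocks one at a time and a controller that buffers whatever wanted sectors pass under the head while it waits out rotational latency, then compares the forward layout against the backwards one:

```python
SECTORS = 16      # sectors per track; 1 time unit = 1 sector time (assumed geometry)
HOST_DELAY = 2    # host think-time between consuming block i and requesting i + 1
LAND = 5          # angular position of the head when the seek completes (arbitrary)

def simulate(layout, latency_prefetch, nblocks=8):
    """Total time for the host to read blocks 0..nblocks-1 one request at a time.

    layout(b) gives block b's angular sector position.  The platter advances
    one sector per time unit.  With latency_prefetch the controller buffers
    every wanted sector that happens to pass under the head while it waits
    out the rotational latency for the current request.
    """
    pos_to_block = {layout(b): b for b in range(nblocks)}
    buffered, t = set(), 0
    for b in range(nblocks):
        if b:
            t += HOST_DELAY                      # host digests block b-1
        if b in buffered:
            continue                             # served from controller buffer
        wait = (layout(b) - (LAND + t)) % SECTORS
        if latency_prefetch:
            for dt in range(wait):               # sectors passing during the wait
                p = (LAND + t + dt) % SECTORS
                if p in pos_to_block:
                    buffered.add(pos_to_block[p])
        t += wait + 1                            # rotational wait + 1 sector read
    return t

forward = lambda b: b % SECTORS                  # blocks laid down with rotation
backward = lambda b: (-b) % SECTORS              # Ben's layout: against rotation

print(simulate(forward, False))   # naive: roughly one revolution per block
print(simulate(forward, True))    # forward: the latency window helps only a little
print(simulate(backward, True))   # backwards: the latency window is all future data
```

With the backwards layout, the wait for the first block sweeps the head across every remaining block of the run, so the rest are served from buffer; the forward layout picks up only the tail end of the run during that window.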

But it doesn’t become worse than the normal method. (Anyway, users concerned with performance will run defrag programs targeted at intensively used files.)

This makes no sense at all. All I can imagine is that you envision a case where the host has issued a read request for block N and then issues a read request for N+1 while block N is being read. But even then, had the seek not been unlucky (i.e. landing between N and N+1, bad luck which happens equally often in either method) the controller saw … N+2, N+1, N, … and the newly desired data (N+1) would have already been pre-fetched into the controller’s buffer.

Be kind, my experience with reading/writing disks was a long time ago and the technology was pretty basic, meaning my assembly program literally controlled almost every aspect of the operation, moving the head, responding to interrupts, etc. So I don’t have familiarity with how the controller software might be optimizing across the list of known read requests.

But having said that, my assumption is that for the general case, the read requests will not all arrive at the same point in time. As a program is processing a file, the OS is sending read requests to the controller as the need arises. If the read requests are flowing from first sector/block to last over time and the layout on the disk is the opposite of rotation, it means that frequently the next requested sector/block will be prior to the sector/block that was just read. And that request may not have been known to the system prior to the one currently in process.

Does Ben’s method have anything to do with prime numbers?

I’ll try to address your question in more detail, RaftPeople. Let’s first consider reads; then writes.

In a typical Unix OS circa 1980-1990, when a process requests a read of block N, it will then block (sleep) until that read completes. Since the inter-sector delay will not generally be long enough to allow a round-trip through the kernel and application and back to the disk with an N+1 request based on N, it sounds like you are imagining some interleave, e.g. 1, x, 2, x, 3, x, 4, x as mentioned earlier. In fact, however, such interleaving is undesirable: you run at only half-speed in the sequential case. Instead applications will read in large chunks, e.g. {N,N+1,N+2,N+3} all in one chunk; this will be supplemented by disk controller pre-fetching and, optionally, kernel-driven speculative reads. In any case you either meet the rotational deadline or you don’t (you ‘slip a rev’); slippage is the same in either case (very slightly better for Ben, in fact).
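The half-speed point can be checked with a quick sketch (this assumes the classic fill-every-f-th-slot interleave scheme, sector-time units, and a host that always keeps up; the 16-sector track is an invented number):

```python
SECTORS = 16

def interleave_layout(S, f):
    """Classic f:1 interleave: assign logical sectors to every f-th physical
    slot, restarting at the next free slot whenever we land on an occupied one."""
    slots = [None] * S
    j = 0
    for logical in range(S):
        while slots[j % S] is not None:
            j += 1
        slots[j % S] = logical
        j += f
    return slots                    # slots[p] = logical sector at physical slot p

def track_read_time(S, f):
    """Sector-times to read a full track in logical order, assuming the host
    always requests the next sector within one sector time."""
    pos = {logical: p for p, logical in enumerate(interleave_layout(S, f))}
    t = 1                                          # read logical sector 0
    for i in range(1, S):
        t += (pos[i] - pos[i - 1] - 1) % S + 1     # rotational gap + read
    return t

print(track_read_time(SECTORS, 1))   # one revolution
print(track_read_time(SECTORS, 2))   # two revolutions: half speed sequentially
```

So a 2:1 interleave doubles the sequential transfer time, which is exactly why the chunked-read approach is preferable when the host can keep up.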

But this is largely beside the point. When the disk has been on-track for a sufficient period, the future data you speak of {N+1, N+2, …} will already have been pre-fetched with Ben’s N+2, N+1, N, N-1 ordering.

Random reads are the same in either case.

So much for reads; let’s talk about writes.

Again, the application written for high performance may issue its writes with large quantum chunks, e.g. {N,N+1,N+2,…,N+7} for a 64k write. (Just an example. Neither the actual block-size, nor the ordering granularity is essential to the method.) The disk controller will lay down that write data in the order {N+7, N+6, N+5, …, N}.
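A toy illustration of that write ordering (the 16-sector track and the slot chosen for the chunk’s first block are arbitrary assumptions of mine, not anything from Ben’s design):

```python
SECTORS = 16
BASE_SLOT = 10    # arbitrary physical slot holding the chunk's first block

def ben_layout(base, k):
    """Under Ben's scheme block base+i sits i sectors *against* rotation,
    so each logical block physically precedes its successor."""
    return {(BASE_SLOT - i) % SECTORS: base + i for i in range(k)}

def one_pass_order(head, sector_to_block):
    """Blocks in the order their sectors pass under the head starting from
    angular position `head` -- the order a single-pass write lays them down."""
    return [sector_to_block[(head + s) % SECTORS]
            for s in range(SECTORS)
            if (head + s) % SECTORS in sector_to_block]

# Starting the pass at the trailing edge of an 8-block (64k) chunk, the
# controller writes the highest-numbered block first: N+7, N+6, ..., N.
print(one_pass_order(3, ben_layout(100, 8)))
```

The whole chunk still goes down in one contiguous pass; only the logical-to-physical ordering is reversed.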

But suppose that the application does issue writes in smaller granules.

In the Unix OS, writes can be asynchronous. Setting aside the possibility of a permanent disk I/O error in the future (and anyway write failures often go unnoticed until read-back), the application can be reawakened once the kernel (or controller) has committed to the write. Disk queues are often full of asynchronous writes under the Unix operating system, and it is routine to re-order those requests to maximize disk performance. Thus performing the writes {N}, {N+1}, {N+2}, {N+3} in the opposite order of request arrival is only a new variation on the re-ordering that disk firmware (at whatever level) does even without Ben’s method.
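A sketch of that kind of re-ordering, as a greedy shortest-rotational-latency scheduler (the geometry, head position, and sector assignments are invented for illustration):

```python
SECTORS = 16

def schedule_writes(head, pending):
    """Greedy rotational ordering: repeatedly service whichever pending
    write's sector will next pass under the head.  `pending` maps
    block -> physical sector.  Returns (service order, total sector-times)."""
    pending, order, t = dict(pending), [], 0
    while pending:
        b = min(pending, key=lambda blk: (pending[blk] - head) % SECTORS)
        t += (pending[b] - head) % SECTORS + 1   # rotational wait + 1 write
        head = (pending[b] + 1) % SECTORS
        order.append(b)
        del pending[b]
    return order, t

# Four async writes queued in arrival order N, N+1, N+2, N+3, laid out
# against rotation (block N+i at sector 10-i); head currently at sector 6.
order, total = schedule_writes(6, {100: 10, 101: 9, 102: 8, 103: 7})
print(order, total)
```

The scheduler services the queue in the reverse of arrival order and commits all four writes in well under one revolution, which is the point: reversing the commit order is nothing the firmware doesn’t already do.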

It may not look like it, but that’s just a method of interleaving.

I understand that you can (retroactively :slight_smile: ) compose a definition of “interleaving” that includes Ben’s Method, but that’s just semantic play, and is of limited interest.

I’ll give you enough credit to assume you grasp that the intent and function of Ben’s “interleave” are fully different from the interleaves that you studied … wherever. Or, yes or no: are you claiming that the particular advantages of Ben’s Method were taught at your school?

I see your point that requesting N+1 would require the program to have processed N first, and by that time the disk has moved on anyway. I hadn’t thought it all the way through; valid point.

It’s fun to look for the counterexample, so I’d ask you: under what conditions do you think this method is worse?

Possibilities:
1 - When the request for N+1 arrives after positioning for N but prior to completion of reading N.
One thought is parallel processes reading the same data at the same time, but that’s an edge case.

2 - When the request for N+1 doesn’t require waiting for N to be processed, for example a high-level request to read the entire file. In this case the high-level request to load the whole thing could be supported by a lower-level process that reads the set of sectors/blocks making up the file and issues read requests based on that list.

If there were multiple entries pointing to sequential blocks, the requests would arrive in sequential order but the blocks would physically arrive at the head in reverse order. This seems like a counter, but I don’t know how frequently that would happen.

No, it isn’t semantic play. You have an interleave order of -1. Such interleaves were to be used when the controller was slow.

Specifically? No. In general? Yes. Ben’s Method actually has no particular advantages and one huge disadvantage: when the buffer gets filled before you reach the right sector you have to discard data. These days buffers are large but that was not the case decades ago. It’s also a mistake to assume that the next sectors you will require will be the (logically) next sectors. If you are working with random-access files, the next sectors required are just as likely to be the ones (logically) before the one you want. Sectors were often 256 bytes - more than enough to hold a record.

One method already mentioned was to read the whole track/cylinder; another was to read the sectors until you got to the one you wanted (but see above); a third was, once you’d got to the right place, to keep reading sectors until the head was moved, the buffer filled, or the track was completely read. This last is the converse of Ben’s Method and is computationally easier. There are more. I never got to the level of complexity that markn+ did. And my textbooks are long gone.

I think you’re moving the goalposts here. The premise was that this would be useful for sequential files, and that somehow the controller or low-level software could know that sequential access was in progress.

On Windows, the CreateFile() call (which also handles opening existing files for reading) has an optional flag called FILE_FLAG_SEQUENTIAL_SCAN that informs the cache manager that a file will be read mostly from start to end. The closest Unix analogue I know of is posix_fadvise() with POSIX_FADV_SEQUENTIAL; absent such a hint, sequential access would have to be detected.
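For reference, a minimal sketch of the POSIX hint (Python on a Linux-like system; the temp file is just a stand-in for a real data file):

```python
import os
import tempfile

# Advise the kernel that we will scan this file front to back -- roughly the
# POSIX counterpart of FILE_FLAG_SEQUENTIAL_SCAN; it encourages read-ahead.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"example data")
    tmp.flush()
    fd = os.open(tmp.name, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        data = os.read(fd, 64)
    finally:
        os.close(fd)
print(data)
```

The advice is purely a hint; the read succeeds the same way with or without it.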

I agree that it’s nifty. It’s the kind of thing one of my professors may have mentioned in passing, I don’t know. I’ve lost my notes and my old CS textbooks are busy propping up my secondary monitor. Ben’s Method is probably too specific to have been implemented in a “normal” OS, and would have become moot after LBA became the norm.

I’m glad someone finally conceded it was a “nifty” idea. :slight_smile: But I’m curious whether it was actually mentioned. Nobody in the thread seems to have remembered it.

But LBA, if that just means treating a file system as a simple linear array, was already quite normal for Unix disk drivers, at least in the late 1980s. (Applications and kernel code which estimated a cylinder number just used that for heuristics.) I don’t see why “LBA” affects the efficacy of Ben’s method.

How would that method perform during write operations?

Seems like the OS can’t predict what the app is going to do, so it can’t re-order the writes in all cases; sequential writes would therefore seem slower (e.g. write #1, then wait almost a full rotation to write #2, which sits just before it).

Asked and answered; see #46. The problem would arise only if request {N+1} arrived precisely while {N} was being written; this is unlikely since seek and rotation delays are much longer than the actual write time. Anyway, if the application is writing two blocks it might be writing k blocks, and the worst case is for all k blocks to be written in two revolutions, not k revolutions.
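Putting rough numbers on that (sector-time units and a 16-sector track are assumed; this just restates the two scenarios, the objection’s feared cost versus the two-revolution worst case claimed above, for a chunk that fits on one track):

```python
SECTORS = 16   # one revolution = 16 sector-times (assumed geometry)

def feared_cost(k):
    """The objection's scenario: every single-block write just misses its
    sector and waits out a full revolution, so k blocks cost k revolutions."""
    return k * SECTORS

def two_rev_bound(k):
    """The claim above: whatever the head misses on its first pass over the
    chunk it picks up on the second, so two revolutions bound the cost."""
    return 2 * SECTORS

print(feared_cost(8), two_rev_bound(8))   # sector-times for an 8-block chunk
```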

Ben proposed that his method be optional, perhaps activated on a per-cylinder basis. (This might have led to two different pre-fetching policies, but that would still be simpler than a calibration-based method like markn+'s.)

The main reason Ben’s method was rejected, I think, was Fear of the Unknown — a concern I shared to some extent. Whether the particular project where Ben offered his suggestion suffered abnormal fears would be beyond the scope of this discussion.

I was assuming there was a write buffer that would compound the problem, but some googling tells me it’s not common, due to the danger of a failure occurring after the app already thinks the write has completed.

Googling around yesterday, I found a lot of papers with very detailed analysis of workloads and performance of various methods. With the level of detailed math and simulations going into some of these, it would surprise me if they hadn’t explored pretty much every possibility.

So I’m still wondering under what conditions this method would be worse.

At least five Dopers have suggested that the idea has almost certainly already been published and discarded. Pursuing the “puzzle” of “re-inventing Ben’s method” seemed like a waste of time to them.

Yet the number of mentions of this method that have actually been cited to date is … Zero. With a Z.

Regardless, the interesting question is: under which workloads is this method better, and under which workloads is it worse?

To evaluate this method you need to answer those two questions.

Is it your opinion that the workloads where it is worse are relatively few?

I’m really not sure. I can say that for such a radical idea to have been embraced at Acme Computers (not its real name) where it was proposed, a long hierarchy of managers would have weighed in. IIRC no specific objection was ever raised beyond an inchoate fear of the unknown. (Such radical changes have an adverse risk-reward ratio.) At the time, I myself was consulting for Acme and doing a wide-ranging study of disk performance issues, but that was thirty years ago and if I came up with a “bust” for Ben’s method I’ve forgotten it.

Disk calibration, as markn+ implemented (see upthread) would have reduced the advantages of Ben’s method, and this was also proposed on that project, though I’m pretty sure it was not adopted. Note that the firmware required to support such calibration (and steer the “elevator” etc.) would be much more complicated than the firmware needed to support Ben’s method.

I just realized that your answer about reading/writing N and then N+1 doesn’t seem valid, because according to what I’m reading, interleaving was designed to allow for this exact situation.

So you would need to compare performance between the reverse layout and interleaving. Obviously sequential read/write isn’t the only type of activity, but it’s significant.