I have no idea about this thread because it isn’t something I am familiar with but why are you responding to yourself?
Most of us who do this for a living do so under pretty strict NDAs (Non Disclosure Agreements) so we can’t really talk about what we have or haven’t implemented. Secondly, if it hasn’t been published, how did you hear about it? Are you “Ben”?
I should think any disk controller which replaces I/O requests from the host would be short lived.
Again, the problem is that there are so many ways to optimize for particular workloads that it’s difficult to imagine which one you’re thinking of. I would almost guarantee that at this point in time, whatever it is, someone has thought of it and discarded it because it only works for certain patterns.
Some general ideas:
Does it just start reading from wherever the head lands on the track and then fills in the front of the buffer when it gets there? (very common)
Does it place consecutive blocks on different tracks so that the head moves towards the middle of the platter (thinking it would reduce the average next seek time)?
Are the tracks themselves “curved” so that the head moves back towards the center of the disk?
Are you going to eventually tell us what Ben’s method is? The last thread kind of came off as “I know this really genius thing but I’m not going to tell you what it is.” It would be frustrating if there’s no reveal at the end.
I expect that when you make the reveal, many of us will go, “Oh, that old thing?”
I’m sure that’s what you’d like us all to believe but I’m not falling for that old trick. I’m going to implement Ben’s method immediately before you can, dominate the world disk market, make millions of dollars and retire somewhere in the Turks and Caicos islands.
He did say it wasn’t patented, didn’t he?
It makes about as much sense as giving credit for a ‘method’ to a person who never published or implemented it. Realistically, if it’s useful someone has probably already tried it, if it’s not useful then what’s the point? No, I don’t think that everything has already been invented, but I do think that the set of ‘disk optimization methods that might be guessed and discussed on non-technical message boards’ has been pretty thoroughly plundered already.
So, instead of attempting to understand what I wrote, you assume I’m an idiot. Got it.
Was it too far-fetched to imagine there might be some explicit or implicit cooperation between the disk controller designer and the kernel programmer?
The blocks(*) of a sequential file are typically laid out so that they pass under the read head in the order
1, 2, 3, 4, 5, 6, 7, 8, 9, …
An alternative, used when the inter-sector gaps are too brief for the necessary processing, is something like
1, 11, 2, 12, 3, 13, 4, 14, 5, 15, 6, …
Ben’s method uses an ordering different from either of these.
(* - Irrelevant for our purpose here is whether a “block” is a sector or some other-sized chunk.)
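To make the two layouts above concrete, here is a small sketch (my own illustration, nothing to do with Ben’s method) that computes the physical ordering of a track for a given interleave factor; factor 1 gives the plain sequential layout, factor 2 gives the classic 1, 11, 2, 12, … pattern:

```python
def interleave(n_sectors: int, factor: int) -> list[int]:
    """Return the logical block found at each physical slot of a track.

    factor=1 places blocks sequentially; factor=2 places each block
    two slots after the previous one, skipping ahead when a slot is
    already taken, which yields the 2:1 layout described above.
    """
    track = [0] * n_sectors
    slot = 0
    for block in range(1, n_sectors + 1):
        while track[slot]:                 # skip slots already filled
            slot = (slot + 1) % n_sectors
        track[slot] = block
        slot = (slot + factor) % n_sectors
    return track

print(interleave(6, 1))   # [1, 2, 3, 4, 5, 6]
print(interleave(10, 2))  # [1, 6, 2, 7, 3, 8, 4, 9, 5, 10]
```

With 20 sectors and factor 2 this reproduces the 1, 11, 2, 12, 3, 13, … ordering quoted in the post.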
No need to take offense, I was simply replying to what you said. It’s been years since I’ve worked at the low level (SCSI CDB, ATA, etc) but I don’t remember any hints that an operation could be replaced by something else. I suppose in a captive market like IBM SLEDs or something they could have done something like that.
First of all, the OP’s “method” is so vaguely worded it cannot really be compared to any actual method in use.
Second of all, prefetching from a disk has been well known since there have been disks. So way before “the late 1980s”. Lots of ways of doing it, including in software and in disk firmware. It is not remotely novel.
It’s like claiming to have invented the idea of putting tasks to be done on a queue. Only less specific. (Although I came across a patent for that given in the late 90s.)
So it’s just another method of organising interleaving. Big whoop. We looked at that in our first or second year at university. IIRC - it’s been 30 years - we looked at various intervals, varying sector counts (inner tracks having fewer sectors), and much, much more.
My guess is that there’s something clever here that may or may not really be novel to people in the field (I am not in the field, so it’s probably going to seem clever to me), but that the problem is far too underspecified for people to guess at what you’re getting at, septimus.
Are the blocks interleaved in some fashion that optimizes cache filling and emptying?
For example, if the cache is 64K, and the blocks are 8K, and it takes (say) 4 blocks worth of time to empty the cache, then arrange them like: 1,2,3,4,5,6,7,8,13,14,15,16,17,18,19,20.
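If I’m reading that guess right, the ordering can be generated mechanically: lay down a cache-full of consecutive blocks, then skip ahead by however many blocks pass while the cache drains. A tiny sketch, using the hypothetical numbers from the post (8-block cache, 4-block drain):

```python
def cache_aware_order(cache_blocks: int, drain_blocks: int,
                      groups: int) -> list[int]:
    """Physical ordering under the guessed scheme: each group of
    cache_blocks consecutive blocks is followed by a gap of
    drain_blocks slots while the cache empties."""
    order, start = [], 1
    for _ in range(groups):
        order += list(range(start, start + cache_blocks))
        start += cache_blocks + drain_blocks
    return order

print(cache_aware_order(8, 4, 2))
# [1, 2, 3, 4, 5, 6, 7, 8, 13, 14, 15, 16, 17, 18, 19, 20]
```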
It’s exciting to hear there are other disk controller experts here. But I’m sorry none of you re-invented Ben’s Method with a minimum of hints. My #26 was so explicit:
… that a short string of integers will now constitute a “solution”. The “brass ring” is off the table by now, but still — five brownie points to whoever figures it out first!
Replies might have been less condescending if my credentials were on display. To detail the variety of work I’ve done with disk systems might seem an exaggeration, and disks were seldom my main focus. I wrote a large immodest paragraph that would impress anyone who believed it, but erased it — we don’t need a “pissing contest.” I did however already mention one resume bullet that is easy to prove:
Among the wide variety of tasks I did as a computer system consultant was reviewing others’ inventions. I think I had a fairly good eye for novelty and “niftiness.” Whatever the limitations of Ben’s Method, it was novel and nifty. (Niftier than the majority of my own 30+ patents.) As for non-obviousness … well, despite all the prattle here, nobody’s come close to writing down the single sentence, or even simple list of integers, that would describe Ben’s Method.
I won’t leave it unsolved for more than another day or two, but want to give people a final chance for the pleasure of their own ‘Aha!’ I’ll PM the solution to anyone unable to wait for two days and promising not to spoil it for others. Please PM back to me with feedback showing how the method was already well-known. :rolleyes:
Uh … the 5-minute Edit window had expired? Yours is the most peculiar post in the thread so far … and there have been some peculiar posts!
At last! An actual “contest” submission! And an idea with possible merit. But No, that isn’t Ben’s Method. However, as you have done here, Ben’s Method can also be presented with a simple ordered list of integers.
That’s one basis of clustering. Another being a minimum addressability issue. For instance, you might have an interleave of 1+2+3+4, 9+10+11+12, 5+6+7+8, and 13+14+15+16. An advantage of clustering is that it can be logical, not physical. The drive controller can map the logical clusters to physical sectors in what the manufacturer thinks is the optimum manner. It also allows the drive to invisibly swap out bad sectors. I think this came to the consumer market with IDE drives, but it would not surprise me if there were MFM and RLL precursors.
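The logical-to-physical cluster mapping described here might be sketched like so. Everything in it is hypothetical and illustrative (0-indexed sectors rather than the 1-indexed blocks in the post; made-up spare-slot number), but it shows both the interleaved cluster layout and the invisible bad-cluster swap:

```python
CLUSTER = 4  # sectors per cluster (illustrative)

def physical_sector(logical_sector: int, cluster_map: dict[int, int]) -> int:
    """Translate a logical sector to a physical one via the controller's
    cluster map (identity for clusters not present in the map)."""
    cluster, offset = divmod(logical_sector, CLUSTER)
    return cluster_map.get(cluster, cluster) * CLUSTER + offset

# The interleave from the post: physical slots hold logical clusters
# 0, 2, 1, 3 -- i.e. blocks 1-4, 9-12, 5-8, 13-16 in physical order.
layout = {0: 0, 2: 1, 1: 2, 3: 3}

# Invisible bad-sector handling: remap logical cluster 1 to a spare
# physical slot (50 is a made-up spare-region slot).
remapped = dict(layout)
remapped[1] = 50
```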
It sure didn’t come across that way. It really came across as you replying to yourself. I mean go have a look at it and tell me that doesn’t look like you’re replying to yourself. It is no big deal, I thought it was kind of funny. I’m glad I won the most peculiar post award though. Does it come with a t-shirt?
FWIW, you’re not the only one that was confused, especially the part of the post where he does quote and reply to himself and questions the use of the word “quantum.” I couldn’t figure out what was going on.
I am confused by the confusion of y’all!
septimus used the word “quantum” in an unfamiliar way that might confuse.
Someone (septimus !) noticed the potential confusion and provided a clar·i·fi·ca·tion.
Does this … uh … clarify? Or just confuse further? :eek:
Well, duh, I understand now after the recent comments, but I was confused by it, too, at the time of reading. Sheesh.
Sadly, my fascination with the whole thing (the reply) has kind of passed. I’m looking forward to the big reveal of Ben’s method just for curiosity, this isn’t my area of expertise at all.
I was thinking the same thing. For most problems in software/hardware, it’s easy to consider a boatload of permutations and combinations of optimizing for various specific variables or use cases, but for any one of those variables/use cases that is highly optimized, there’s another one that just went the opposite direction.
It’s pretty rare to come up with something truly novel/clever that is generally applicable, which is why I don’t think software should be patentable (but that’s a different discussion).
Regarding the OP:
We know it’s some form of interleaving and that it helps when the situation is multiple long sequential reads, so the thing that interests me is trying to predict which use cases are hurt by this method. My first thought would be many small random reads, but it’s possible the interleaving method wouldn’t be any worse for that.
Let’s first recapitulate. Recall that the disk is kept busy with a queue of requests; we don’t have the luxury of doing arbitrary speculative reads; there may be many actually-requested reads pending. We seek to a track containing part of a sequential file, having been directed to read, for example, blocks 7-8-9 of that file. Since we have not calibrated the disk, the seek will complete at an effectively random point on that track. If we land in the middle of the requested target – say block 8 – it is a toss-up whether we use Ben’s method or not; so let’s suppose we land in the non-requested area.
The track can then be viewed as having three parts: pre-target, target, and post-target. If we arrive at the track when blocks 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 are appearing in that order, then {2,3,4,5,6} will be the “pre-target” and {10,11} will be the “post-target”. Blocks 2 to 6 (the “pre-target”) will be worthless to us: the sequential file is being read sequentially, so reading of these blocks was already completed in the past. It is the reads of 10-11-… (the “post-target”) that may have value, but it isn’t clear how much of that data should be read before initiating the next seek to read data the host is already waiting for.
In case anyone wants to treat the above preface as a final hint and take a minute now to guess the method, I’ve placed Ben’s Method inside spoiler tags.
[SPOILER]Ben arranges the data blocks in the order 13, 12, 11, 10, 9, 8, 7, 6, 5, … The pre-target — data that can be read for free — is now precisely the data that we want to pre-fetch speculatively. Once we have read it and the target data, there is no need to dawdle: the post-target data is worthless (just as the pre-target normally is); we hurry the seek to the next target, perhaps getting useful pre-target reads there as well.
Instead of pre-target reading being useless and post-target speculative reads risking delay to a separate requested read, we “have our cake and eat it too!” The free pre-target reads are useful, and we initiate the system-requested seek ASAP, perhaps even getting more useful pre-target reads.[/SPOILER]
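For anyone who has opened the spoiler above, here is a toy check of the claimed benefit: average the number of “free” useful blocks over every possible landing slot, for the forward layout versus the reversed one. This is my own grossly simplified model (one track, unit-time blocks, seek away the instant the target is done), not anything resembling actual controller firmware:

[SPOILER]
```python
def avg_useful_prefetch(layout, target):
    """Average, over every landing slot, of the non-target blocks the
    head reads 'for free' before the whole target has passed under it,
    counting only blocks numbered above the target (i.e. data a
    sequential reader will ask for next)."""
    n, tgt, last = len(layout), set(target), max(target)
    total = 0
    for start in range(n):                 # head may land at any slot
        remaining, useful, pos = set(tgt), 0, start
        while remaining:                   # rotate until target done
            b = layout[pos]
            remaining.discard(b)
            if b > last:
                useful += 1                # a free, useful prefetch
            pos = (pos + 1) % n
        total += useful
    return total / n

track = list(range(2, 12))                 # forward layout 2, 3, ..., 11
target = [7, 8, 9]
print(avg_useful_prefetch(track, target))        # 0.7
print(avg_useful_prefetch(track[::-1], target))  # 1.7
```
[/SPOILER]

In this toy model the reversed layout prefetches more useful data for free, on average, than the forward one; of course it says nothing about workloads that are not long sequential reads.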
It’s easy to find some cases where Ben’s Method might yield a loss rather than a win. (I’ve certainly conceded that Ben’s bosses did not adopt his Method and he never patented or published it.) But it still seems novel, and nifty. If thread participants were already familiar with it, they did an excellent job of keeping that knowledge secret.
Or was it too trivial? I’m told in this thread that the Method was probably well known decades ago. Please link to such discussions. Was Ben, after all, a plagiarist?