This is based on real life, and I’m curious about the answer.

We have a tale of two programs A, and B, and a programmer who likes to live dangerously and doesn’t care about race conditions.

Program A is a generation program, over time it will generate a set of N files, sequentially. It will not generate file n until it generates file n-1, but you can assume that the moment it starts generating file n, it will take t_n [sampled from] T_1 seconds, where T_1 is some distribution. (We can assume uniformly distributed between x seconds minimum and y seconds maximum to make it simpler).

Program B is an evaluation program. It randomly samples a file from the set N **with replacement**. Overall, it will evaluate M (not necessarily unique) files over the course of its run. Each evaluation will end in t_m [sampled from] T_2 seconds, where T_2 is a distinct distribution from T_1, but is distributed similarly (i.e. distributed uniformly between some min and max time).

The trick, of course, is that Program A is generating while Program B is evaluating. So a new file may appear while B is evaluating a file. No files will ever be deleted, but it IS a problem if B selects file n – the one A is currently generating.

Assuming program A gets a head start of a few files (we’ll call that k), what’s the probability that program B will finish all M evaluations without picking the file A is currently working on?

There are two obvious “win situations” here:

Program A finishes first, in which case the probability from then on of picking the file A is working on is 0.

Program B finishes first, in which case even if A is working, we win.

You can assume that there’s no discontinuity where A and B finish an iteration simultaneously, just assume A has created n+1 by that point.

Obviously at the moment B is selecting a file (which we’ll assume takes no time for simplicity), it has a probability (n-1)/n of selecting a “good” file if A is still running (and n/n if A has terminated), but I really can’t think of how to solve this. It seems like it should have a closed-form a-priori solution, even if it’s kinda gnarly.