How do I write a program to test whether another program has executed and finished writing its output file?
I face this problem when, for example, conducting parametric studies in which a specialized program solves a numerical problem such as building a computational fluid dynamics model or a finite element model. These specialized programs are usually designed to solve a problem once. Suppose there are 6 parameters I want to study, such as 6 different dimensions in a valve I am designing that I know will influence the flow. I want to be able to analyze 20,000 different designs spread throughout this 6-dimensional design space. It might seem that I would have to open the specialized program, enter a specific design, run it, record how well the design worked, and then do it all over again 19,999 more times.
But I can use some other program to run a loop in which it generates a point in that 6-dimensional space, calls out to the operating system to open a session of the numerical analysis program, feeds it this design point, instructs it to write its result into a file, waits for it to finish, collects the written result file, adds the design point and the result as an entry in an overall table of designs and results, and then repeats.
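To make the loop concrete, here is a minimal sketch of it in Python. Everything named here is an assumption for illustration only: `FAKE_SOLVER` is a hypothetical stand-in for the real numerical program (it just sums the squares of the parameters), and `run_study` and the file names are invented. The real inner program would be launched with whatever command line it actually accepts.

```python
import json
import random
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the numerical program: it reads a design point
# from a JSON file (argv[1]) and writes one number to a result file (argv[2]).
FAKE_SOLVER = (
    "import json, sys; "
    "d = json.load(open(sys.argv[1])); "
    "open(sys.argv[2], 'w').write(str(sum(x * x for x in d)))"
)

def run_study(n_designs, n_params=6, seed=0):
    """Run the inner program once per design; return rows of params + result."""
    rng = random.Random(seed)
    rows = []
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        for i in range(n_designs):
            # Generate one point in the design space.
            design = [rng.uniform(0.0, 1.0) for _ in range(n_params)]
            design_file = work / f"design_{i}.json"
            result_file = work / f"result_{i}.txt"
            design_file.write_text(json.dumps(design))
            # subprocess.run blocks until the child process has exited;
            # check=True raises CalledProcessError on a nonzero exit code.
            subprocess.run(
                [sys.executable, "-c", FAKE_SOLVER,
                 str(design_file), str(result_file)],
                check=True,
            )
            # Collect the result and add one row to the overall table.
            rows.append(design + [float(result_file.read_text())])
    return rows

if __name__ == "__main__":
    for row in run_study(3):
        print(row)
```

Using a separate result file name per design is a deliberate choice in this sketch: a stale file left over from a previous run can then never be mistaken for the current result.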
I do this frequently, and it works. I use a few different kinds of software for the outer program, and a few others for the inner program. Generally they interact by writing small text files that the other will find and read, and by the outer program passing command-line commands to the OS.
But parts of this process are messy and waste time, and that is what I am asking about.
Usually I can have the outer, looping program run a session of the inner, numerical program such that the outer program waits for the inner program to stop executing. But the inner program might hang or crash, which makes things messy: the outer program has to decide what a reasonable processing time is, and if the OS doesn't return control within that time, it has to kill any processes associated with the name of the inner program. I wish there were a way to detect, for example, that the inner program has not ended its session but the CPU has been mostly idle for a few seconds.
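For reference, the timeout-and-kill pattern described above might look like the following sketch, assuming the outer program is written in Python. The function name `run_with_timeout` is made up for illustration. One difference from killing by name: the outer program keeps a handle to the exact child it started, so it can kill that one process without touching other sessions of the same program.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it had to be killed."""
    proc = subprocess.Popen(cmd)
    try:
        # Blocks until the child exits or timeout_s seconds have passed.
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Kill this specific child via its handle, not by program name.
        proc.kill()
        # Wait again so the OS can release the process entry.
        proc.wait()
        return None

if __name__ == "__main__":
    # A child that finishes quickly, then one that "hangs" past the limit.
    print(run_with_timeout([sys.executable, "-c", "pass"], 30))
    print(run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(60)"], 1))
```

This only kills the direct child; if the inner program starts helper processes of its own, those would need separate handling. For the idle-CPU idea, the third-party psutil package exposes per-process CPU usage through `psutil.Process(pid).cpu_percent(interval=...)`, which might serve as a starting point, though I have not shown it here.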
Also, if the outer program tests for the existence of the numerical result file, the test goes positive as soon as the inner program has started writing it. If the outer program waits for the file to appear and then reads it immediately, it gets only the beginning of the file, because the read happens before the inner program has finished writing. I think this is especially a problem on Windows, where the inner program's session seems to end before another program can reliably read the file, as if the filesystem has not yet been synced.
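The kind of workaround I mean is roughly the following sketch: instead of testing only for existence, poll the file until its size and modification time have stopped changing for a quiet period. This is a heuristic under stated assumptions, not a guarantee; a writer that pauses longer than the quiet period would fool it. The name `wait_until_stable` and the default intervals are invented for illustration.

```python
import os
import time

def wait_until_stable(path, quiet_s=2.0, poll_s=0.25, timeout_s=60.0):
    """Return True once path exists and its size and mtime have been
    unchanged for quiet_s seconds; return False if timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    last_sig = None
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        try:
            st = os.stat(path)
            sig = (st.st_size, st.st_mtime_ns)
        except FileNotFoundError:
            sig = None  # file has not appeared yet
        now = time.monotonic()
        if sig is None or sig != last_sig:
            # Missing or still changing: restart the quiet-period clock.
            last_sig = sig
            stable_since = now
        elif now - stable_since >= quiet_s:
            return True
        time.sleep(poll_s)
    return False
```

A typical call would be `wait_until_stable("result.txt")` followed by the read, where `result.txt` is a placeholder name. If the outer program already blocks until the inner process exits, reading after the exit should normally be the more reliable signal, with this check as an extra guard.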
I feel uneducated about how to test reliably for these things. Instead I am stacking multiple tests, adding waits to be more certain the file write has finished, and so forth.
What do I look for or where do I go to learn more about these kinds of details?