Also in genetics…every year or two, one of the graduate students manages to reinvent the fork bomb.
Many genetics analyses are (technical term) embarrassingly parallel. Some tasks iterate over whole chromosomes, others over sets of 10,000 SNPs, or whatever. Either way, there are often lots of files that all need the same thing done to them.
Combine that with a supercomputer, which is really just a bunch of regular computers connected together and managed by some fancy software. So, tell the fancy software do-thing block1 and it will dispatch the job to a free computer to do the thing. If there are 10,000 blocks, the obvious thing is to write a script that tells the fancy software to dispatch 10,000 jobs.
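Concretely, with the Slurm scheduler that the sbatch commands later in this post belong to, one dispatch is a single command (do-thing being the placeholder name from above):

# Submit one block's worth of work; Slurm picks a free computer to run it
sbatch do-thing block1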
So, people will write a script to start a bunch of jobs. But sometimes the jobs are slightly different, and it would be nice to use the same script for different kinds of jobs. The solution: write the script so that exactly what it should do is given on the command line:
launcher-script.sh analyze-script.sh --output run1 --test all --input
or
launcher-script.sh analyze-script.sh --output run2 --test all --input
Intent: launcher-script.sh will dispatch a bunch of jobs, each one running analyze-script.sh, but in a way where the arguments to analyze-script.sh can be easily changed.

launcher-script.sh:
#!/bin/bash
# My script to launch a new job for each block to analyze
for i in /data/inputfiles/block*
do
    sbatch $0 $i
done
Those who have done a bit of bash scripting will notice that somebody has used $0 as the argument to sbatch (the program that schedules a job). In bash, $0 is an automatic variable that is replaced by the name of the program that was run. So when launcher-script.sh runs, it schedules itself once for each input file, and when each of those copies runs, it in turn schedules more instances of launcher-script.sh:
sbatch launcher-script.sh /data/inputfiles/block1
sbatch launcher-script.sh /data/inputfiles/block2
etc. In a cool sci-fi movie world, this would cause the supercomputer to shoot out sparks and smoke. In the real world, the second time it happens the user hits the limit on the number of jobs a single user can submit, eventually notices the problem, and scancels all of their own jobs. The first time it happens there is no job limit yet: the scheduler grinds to a halt (or just dies), an admin has to clean up all of the jobs, and then implement a per-user job limit.
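For what it's worth, both the cleanup and the limit are one-liners in Slurm. The user name and QOS name below are made up for illustration:

# The user's cleanup: cancel every job they have submitted
scancel --user=gradstudent

# The admin's fix: cap how many jobs one user can have queued in the 'normal' QOS
sacctmgr modify qos normal set MaxSubmitJobsPerUser=1000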
What should have been used was the variable $@, which is automatically replaced with everything that came after the name of the program on the command line. Working properly, launcher-script.sh would create lines like
sbatch analyze-script.sh --output run2 --test all --input /data/inputfiles/block1
sbatch analyze-script.sh --output run2 --test all --input /data/inputfiles/block2
etc. Each of the analyze-script.sh jobs would do their thing, and then exit.
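For completeness, here is a sketch of the corrected launcher-script.sh. Quoting "$@" and "$i" goes slightly beyond the one-character fix, but it keeps arguments and file names with spaces intact:

#!/bin/bash
# My script to launch a new job for each block to analyze
# Usage: launcher-script.sh analyze-script.sh --output run1 --test all --input
for i in /data/inputfiles/block*
do
    # "$@" expands to everything after the script's own name on the command
    # line, so this submits analyze-script.sh (not the launcher) for each block
    sbatch "$@" "$i"
done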