What Will The New Supercomputer Do For Us?

I heard about the new supercomputer on the news this morning.

Here’s a link to the Chicago Tribune article.

The article doesn’t say much. Anyone care to enlighten me about it? :)

I love the word PETAFLOP. If I get a new pet dog, I think I’ll name him that.

IIRC, HAL was born at the Champaign, IL facility.

There are all sorts of scientific computing problems that require far more power than your average desktop. This sort of supercomputer isn’t really a single computer in any sense, but rather a cluster of lots and lots of fairly ordinary computers connected by very high-speed interfaces. In this case, there are 24,000 “nodes”, each of which has a couple of quad-core server processors and a total of 32 GB of memory.
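Taking those numbers at face value (two quad-core processors and 32 GB per node, straight from the description above), a quick back-of-the-envelope tally of the aggregate machine:

```python
nodes = 24_000           # nodes in the cluster (from the description above)
cores_per_node = 2 * 4   # assuming two quad-core processors per node
ram_per_node_gb = 32     # memory per node, in GB

total_cores = nodes * cores_per_node            # 192,000 cores
total_ram_tb = nodes * ram_per_node_gb / 1024   # roughly 750 TB of RAM

print(f"{total_cores:,} cores and about {total_ram_tb:,.0f} TB of RAM in aggregate")
```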

The users log on to the cluster and submit jobs. The cluster then queues and divvies up each job – some will run on a single node, others can be run across many many nodes. Come back a few hours, days, or weeks later, and the job will be completed. (At least, that’s how it works on the computing cluster at my university.)
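To make the “queue it up and come back later” workflow concrete, here’s a toy Python sketch of a scheduler divvying queued jobs across a few simulated nodes; real clusters use batch schedulers (PBS, Slurm, and the like), and the job names and runtimes here are made up:

```python
import queue
import threading
import time

job_queue = queue.Queue()

def node_worker(node_id):
    """Each simulated 'node' pulls the next queued job and runs it."""
    while True:
        try:
            job_name, runtime = job_queue.get_nowait()
        except queue.Empty:
            return                                # queue drained, node goes idle
        time.sleep(runtime)                       # pretend to compute
        print(f"node {node_id} finished {job_name}")
        job_queue.task_done()

# Users submit jobs; the scheduler queues them in the order received.
for i in range(6):
    job_queue.put((f"job-{i}", 0.1))

# Divvy the queue across a handful of "nodes".
nodes = [threading.Thread(target=node_worker, args=(n,)) for n in range(3)]
for t in nodes:
    t.start()
for t in nodes:
    t.join()
```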

So what sort of tasks require this much computing power? I am starting to work with next-generation DNA sequencing data. The sequencing machines produce billions of short reads (50 nucleotides long), but biologists are usually interested in much longer sequences (RNA transcripts might be 1,000 - 100,000 nucleotides long; genomes are billions of nucleotides). So this “shotgun” strategy requires a lot of brute-force computing power to solve billion-piece puzzles. And that’s just the initial assembly; later analysis steps (like comparing many genomes to each other) also take tremendous amounts of computing power.
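As a toy illustration of the billion-piece-puzzle idea (and emphatically not how real assemblers work - they use much smarter graph-based algorithms), here’s a greedy overlap-merging sketch in Python on a handful of made-up reads:

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that matches a prefix of `b`."""
    best = 0
    for k in range(min_len, min(len(a), len(b)) + 1):
        if a[-k:] == b[:k]:
            best = k
    return best

def greedy_assemble(reads):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_pair, best_len = None, 0
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b)
                    if k > best_len:
                        best_pair, best_len = (i, j), k
        if best_pair is None:        # no overlaps left; just concatenate
            return "".join(reads)
        i, j = best_pair
        merged = reads[i] + reads[j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
    return reads[0]

# Three overlapping toy "reads" reassemble into one longer sequence.
print(greedy_assemble(["ATGGCGT", "GCGTACCT", "ACCTGAAA"]))  # ATGGCGTACCTGAAA
```

Doing anything like this for billions of 50-nucleotide reads is what eats the CPU hours.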

But, conveniently enough, these sorts of problems are very easy to split into smaller pieces, so many CPUs can be used in parallel. So the sorts of problems that would take weeks to solve on a powerful desktop computer can be finished in hours, using a very small fraction of the supercomputer’s resources. And now the bioinformaticists are happy to find problems that take weeks to solve with a larger portion of the cluster’s resources…
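A minimal sketch of what “easy to split” means in practice, using Python’s multiprocessing on one machine as a stand-in for a cluster scheduler (the chunks and the per-chunk work function are invented for illustration):

```python
from multiprocessing import Pool

def process_chunk(reads):
    """Stand-in for an expensive per-chunk task, e.g. aligning a batch of reads."""
    return sum(len(r) for r in reads)   # pretend "work": count the bases in the chunk

if __name__ == "__main__":
    # Pretend dataset: 1,000 chunks of reads that can each be processed independently.
    chunks = [["ACGT" * 10] * 100 for _ in range(1000)]

    # Because the chunks are independent, adding workers scales almost linearly.
    with Pool(processes=8) as pool:
        results = pool.map(process_chunk, chunks)

    print("total bases processed:", sum(results))
```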

This is just in one area. There are all sorts of applications in physics, chemistry, economics, mathematics, and many more fields.

As lazybratsche says, this is a very, very large cluster of mostly otherwise-normal computers. A difference is the presence of Nvidia Tesla-based nodes as well. These processors are specifically designed for fast numerical work and aren’t seen in ordinary computers - although their genesis was in graphics processors.

There are a few nasty secrets about these large computers. Many jobs are what are known as “embarrassingly parallel”: they simply involve lots and lots of copies of the same program running on lots and lots of nodes, usually with different parameters or initial data sets. A lot of the time these very large computers are never actually run as a single supercomputer; they are divided out among a large number of researchers, each using a small fraction of the system. But it is easier to get money, and attracts more prestige (which is heavily related to the getting of the money), if you build one huge facility rather than giving each research group its own cluster of computers.

Beyond the embarrassingly parallel, there are computational problems that can be parallelised, but mostly you don’t get a 10 times speedup for ten times the processors, and sometimes the falloff is depressing (the sketch below gives the usual back-of-the-envelope rule). It is hard to write parallel code, and very hard to write parallel code that scales well to a lot of processors. But it can be and is done for some problems. Some even gain speedups in excess of the number of additional processors. A special class of problems are those that need lots of memory, or that access very large datasets.
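The usual rule of thumb for that falloff is Amdahl’s law: the serial fraction of a program caps the speedup no matter how many processors you throw at it. A minimal sketch (the 95% parallel fraction is just an assumed number):

```python
def amdahl_speedup(parallel_fraction, n_procs):
    """Amdahl's law: speedup is limited by the serial fraction of the work."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_procs)

# Even with 95% of the work parallelised, 10 processors give nowhere near 10x,
# and piling on more processors eventually buys almost nothing.
for n in (10, 100, 1000):
    print(f"{n} processors -> {amdahl_speedup(0.95, n):.1f}x speedup")
# 10 -> 6.9x, 100 -> 16.8x, 1000 -> 19.6x
```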

Computational science is odd in a way. There are corners of lots of science that are amenable to attack by big compute, and these benefit greatly. But there are others that don’t benefit at all.

Traditional areas for big computing are things like analysis of fluid dynamics, complex structures, chemistry, and molecular dynamics. As computers got bigger, hitherto infeasible tasks became possible: more complex chemistry, simulation of quantum chromodynamics, more complex molecular dynamics - with the ability to step up to biological processes, and thus areas such as drug design. Very large scale searching of data is another - and DNA and protein sequencing is a big one. Protein folding, as a special case of molecular dynamics, remains a big and difficult problem. In fluid dynamics, turbulent processes remain extremely difficult.

Many problems in science grow very quickly in size with their complexity. Computational chemistry at the quantum electrodynamics level grows with the fifth power of the number of atoms. Double the number of atoms, and you need 32 times the computation. If you want to play with big systems you need enormous computational power. These are some areas where the scientists can use pretty much all the computational power you can ever give them.
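To put that fifth-power growth in concrete terms (the atom counts below are arbitrary examples):

```python
def relative_cost(n_atoms, base_atoms=10, exponent=5):
    """Cost relative to a baseline system, assuming cost grows as N**exponent."""
    return (n_atoms / base_atoms) ** exponent

for n in (10, 20, 40, 80):
    print(f"{n} atoms -> {relative_cost(n):,.0f}x the work of the 10-atom system")
# 10 -> 1x, 20 -> 32x, 40 -> 1,024x, 80 -> 32,768x
```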

In some areas, the advance in fidelity of simulations is alone enough to make for paradigm shifts in the level of understanding.

On a sour note, I do worry that some areas of science get funded to a level that does not reflect the value of the science done, simply because they can exploit large computational facilities. These facilities are often funded as a prestige thing, and to some extent are funded outside of the normal peer-reviewed process for research funding, giving the areas that can benefit a double helping.

How do these supercomputers’ abilities compare to the various @home distributed schemes? I imagine there are different types of overhead (e.g. splitting things up and reassembling), and many uses don’t warrant the time to divide (particularly as there isn’t a sizeable @home userbase for the problem), but when comparing, say, folding problems, does the @home model dwarf the dedicated supercomputer? Is there a tipping point/number of participants that makes it worth it? How is that number calculated?

If you just compare raw floating point operations per second, @home and top supers are in similar territory.

But when you factor in everything else (internal communications, etc.) that goes into a super, you get something much more powerful for some classes of problems.

To quibble with this point: yes, each group could get their own little cluster, but in my field that would be an exceedingly inefficient use of resources. These days just about every bio lab is starting to play with next-gen sequencing data for all sorts of purposes. But a typical research group might only do one or two such experiments in a year, and thus only need a few days’ time on the sort of “cluster” that can fit somewhere under a desk. Even a department cluster would sit unused most of the time (and the department would have to hire someone to set up and administer the thing). And the bioinformatics groups would rather not muck about with the IT side of things.

Plus, it’s pretty difficult to use NIH or NSF funds to buy general-purpose computers, unless the grant is specifically based on computing research. It’s a lot easier to justify the cost of a computing service.

The @home projects are also at the mercy of their public enthusiasm. Listening for signals from ET, or trying to find cures for cancer, are exciting projects, and lots of folks are eager to help out on those, but simulating the merger of two black holes, for instance, is a lot harder to get people on board for. So if you’ve got one of those less-sexy projects, it’s easier to convince a handful of fellow scientists to let you use their supercluster than it is to convince millions of laymen to let you use their idle PCs.

Wasn’t there a project at some point that was going to create a distributed computer across everybody’s PC for use by any project?

Obligatory “Can it run Crysis?” comment

Do you mean BOINC?

AFAIK the @home strategy has a few key compromises that put it at a disadvantage relative to a supercomputing cluster. Basically, each individual chunk of simulation is limited to something that can be completed on an ordinary desktop, i.e. a single CPU, a gig of RAM, and amounts of data that can reasonably be transferred over a home broadband connection. But on a supercomputing cluster, a typical node might have eight processors and 32 gigs of RAM, and thus can run bigger simulations. Because of the dedicated storage and high-speed interconnects, each node can easily crunch and create many terabytes worth of data. So the supercomputing cluster can handle bigger and higher-resolution simulations than a distributed computing network.
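A rough illustration of why shipping the data around matters (all of the bandwidth numbers here are assumptions, not measurements of any particular system):

```python
# Time to move 1 GB of input to where the computation happens.
chunk_gb = 1.0                    # data shipped for one chunk of work
home_mbit_per_s = 25.0            # assumed home broadband download speed
interconnect_gbit_per_s = 40.0    # assumed cluster interconnect speed

home_seconds = chunk_gb * 8_000 / home_mbit_per_s         # GB -> megabits
cluster_seconds = chunk_gb * 8 / interconnect_gbit_per_s  # GB -> gigabits

print(f"home broadband: ~{home_seconds / 60:.0f} minutes per GB")
print(f"cluster interconnect: ~{cluster_seconds:.1f} seconds per GB")
```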

Apparently.

The article got this wrong so I don’t blame you–but saying that the computer reaches a “sustained speed of a petaflop” is incorrect.

Originally, the root word was FLOPS, which stands for “floating point operation per second”–note that the S is not a pluralizer, but stands for “second”.

Later, people sometimes dropped the S to use FLOP as a quantity instead of a rate. One could then say that a certain computation requires 1 petaflop to complete, or turn it back into a rate by saying a computer runs at 1 petaflop per second.

But it makes no sense to say that something runs at a rate of 1 petaflop. It is like saying you have a car that reaches a speed of 150 miles. You need to say 150 mph or miles/hour to turn it into a rate.
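To see the rate/quantity distinction in numbers (the job size and machine speed below are made up):

```python
# FLOP counts a quantity of work; FLOPS, i.e. FLOP per second, is a rate.
work_flop = 3.0e15        # a job needing 3 quadrillion floating point operations
machine_flops = 1.0e15    # a machine sustaining 1 petaFLOPS = 1e15 operations per second

runtime_seconds = work_flop / machine_flops
print(f"runtime: {runtime_seconds:.0f} seconds")   # 3 seconds
```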