Distributed Computing: What is the best option?

Quick background for the uninitiated: Distributed computing is the practice where an organization makes available a program that runs in the background on a client computer during idle time, processes small chunks of data, and sends the results back to a central server. It is an increasingly popular alternative to supercomputers, as it can harness a lot of power for very little money, and users are typically unaffected by running these programs. Recently, some scandal arose when a company made “free internet” offers whose connection software ran a distributed computing client without the user’s knowledge or consent. The most famous example of distributed computing, and by far the largest currently in use, is SETI@Home, run by Berkeley (in conjunction with other SETI institutes), which uses client processors to analyze radio telescope data for possible patterns signaling intelligent life. There are distributed computing programs for everything from economics to gene mapping to SETI, and for other tasks usually handled by a supercomputer (or network of supercomputers); there are even some businesses selling processing time (or buying it). The amount of computing power “wasted” as business and private computers and even servers sit idle, say, overnight, is staggering, and the cost of running a distributed computing program is negligible.
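For anyone wondering what these clients actually do under the hood, here is a minimal sketch of the fetch/process/report loop they all share. The server URL, endpoints, and work-unit format are made up for illustration; every real project has its own protocol.

```python
import json
import time
import urllib.request

SERVER = "http://example.org/work"  # hypothetical project server, illustration only

def fetch_work_unit():
    """Ask the server for a small, self-contained chunk of data."""
    with urllib.request.urlopen(f"{SERVER}/next") as resp:
        return json.load(resp)

def process(unit):
    """Stand-in for the real science code (signal analysis, protein folding, etc.)."""
    return {"id": unit["id"], "result": sum(unit["data"])}

def report(result):
    """Send the finished result back to the server."""
    req = urllib.request.Request(
        f"{SERVER}/result",
        data=json.dumps(result).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

while True:
    report(process(fetch_work_unit()))
    time.sleep(1)  # real clients also throttle to idle time so the user never notices
```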

The question for debate is: what is the most practical, “best” overall option? SETI@Home, I will propose, is an almost complete waste of a lot of processor time that could be put to better use. It tears at my heart to put a Stanford program in front of a Berkeley program, but Stanford’s Folding@Home program (research into protein folding) is, I feel, a more worthwhile investment for humankind. I even rank the popular United Devices clients above SETI@Home; they are run by a private company and sometimes do non-scientific research (but they give out nice rewards, got $100 from them once ;-). Many businesses also run in-house clients that do batch processing on their databases or perform other functions while their computers run overnight or under a light load.

Distributed computing is one of the most exciting technologies on the Internet. It gives small, low-funded research groups access to supercomputer-class processing power at a tiny fraction of the cost. While it is inevitable that some of this will be used by companies (buying and selling processing time for their own large-scale projects), it is vital to programs like SETI@Home and Folding@Home, as well as other scientific applications (gene mapping and such).

I feel that a strong push should be made by the government to “sell” the public at large on donating computing power to universities. While SETI@Home has a rather large following and has accomplished much as a proof of concept, it has also (predictably) produced no results. I feel that it has served its purpose as a pioneer technology, and efforts should now be concentrated on using distributed computing for practical science, such as the Folding@Home and United Devices-run projects. These promise a better use of computing power and results that will be applicable in the very near future.

While stargazing for ET may seem noble and makes many nerds happy to donate their processing power, that power can be better spent on medical and geophysical processing. The results are far more tangible and applicable to our society than some “dream” of catching an errant radio signal, sent millions of years ago, that may or may not signal ET life with advanced technology.

So, the question is: what is the best use for distributed computing? Publicly, I push for university research programs (other than useless ones such as SETI@Home). Businesses, I feel, have the right to use their processing power as they deem fit, whether they decide to donate it (as many businesses already do) or use it internally to boost their computing power. I certainly feel that all government computers should run a university-run distributed computing client - all those machines sitting idle at government offices, and even at the universities running the programs, are major wastes of CPU time and power. I would even support local legislation for this. And I wish the best to companies like UD, who offer a mixture: a lot of scientific research in exchange for some business transactions.

So, in conclusion, we should actively push distributed computing by our nation’s universities and government agencies, before the rest of the world does and leaves us far behind. There is nothing to lose and much to gain.

I further conclude that SETI@Home was a necessary tool in promoting distributed computing, but that it is currently a waste of processing power, which should be diverted to more applicable research.

(BTW, we kicked Stanford’s ass yesterday, it was great)

It is the user’s computer, and they can decide to donate the processing power as they wish. SETI appeals to many people, and other projects can use their own marketing to attract as much processing power as they can, much as a charity runs fundraisers. You can’t hijack people’s machines without their consent (we know what that is called).

What Shagnasty said. They’re my cycles, I’ll donate them as I please.

The SETI@Home project has grown into a distributed computing framework called BOINC. You run the BOINC client, register with whatever projects you wish, and allocate your CPU cycles however you like. Several different projects operate in the BOINC framework, and more are steadily being added.
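To make the allocation part concrete: each attached project receives cycles in proportion to the share you assign it. Here is a toy illustration of that arithmetic in Python - just the idea, not BOINC’s actual scheduler, and the project names are placeholders:

```python
# Toy illustration of resource shares: each project receives donated
# cycles in proportion to its weight. Project names are placeholders.
shares = {"project_a": 100, "project_b": 300}

def cpu_fraction(project):
    """Fraction of donated CPU time this project should receive."""
    return shares[project] / sum(shares.values())

for name in shares:
    print(f"{name}: {cpu_fraction(name):.0%} of donated cycles")
# project_a: 25% of donated cycles
# project_b: 75% of donated cycles
```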

I agree that many business and government computers have wasted CPU cycles. State governments would be wise to pool their spare cycles and provide them to their state universities. But this kind of cross-department resource sharing is tricky to manage in large bureaucracies, and so will probably never happen.

Corporations, especially large ones with centrally managed computer resources, would also benefit from a BOINC-like system. BOINC is designed to be easy to manage over large networks (I’m not sure how well it works in practice, but they are explicitly working in that direction).

One of the primary problems with distributed computing is that developing the code is difficult. The data and code must be divided into independent chunks, with no communication among the chunks. This can be very difficult, if not impossible, for some computational tasks. The additional software development costs may or may not justify the more efficient use of computer resources.
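A toy example of that constraint, sketched in Python with local processes standing in for remote clients: the input is split into pieces that need nothing from each other, and the partial results are combined at the end. Problems that don’t decompose this cleanly (tightly coupled simulations, for instance) are poor fits for the model.

```python
from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for real work; note each chunk needs nothing from the others.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Split the input into independent pieces...
    chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]
    with Pool() as pool:
        partials = pool.map(process_chunk, chunks)  # ...process each in isolation...
    print(sum(partials))  # ...and combine the partial results at the end.
```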

I also think SETI (and much of the space program) is a lot of wasted effort on behalf of dorks, but it’s none of my business how Paul Allen or other dorks choose to spend their own money or time (it is my business when the government gets into space programs more or less explicitly inspired by bad SF shows, but that’s another topic).

I run F@H and think it’s time well spent (I like eating beef, so any efforts that may lead to better understanding of proteins/prions/BSE are much appreciated by me). I may check out some other options as I have processor power to spare.