which explains how SETI@home works. Instead of using processing power just to win a dumb contest (as in Prime95 and others), you use it to process radio frequencies received from space on a 300 meter telescope! Now that’s cool! The link above explains it better.
What I’d really like to somehow help develop is distributed computing that’s available to anyone who needs it. Say you have a huge multimedia file that takes 20 hours to process on a P3 450. Instead, process it on 400 PCs all over the world and get it done in 20 minutes or so. You could pay for your processing time based on how much time you volunteered previously, like on a ratio system (1:100 or something).
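Just to make the idea concrete, here’s a toy sketch in Python; the job format, the function names, and the 1:100 accounting are all made up for illustration, not a real system:

def split_job(total_frames, num_workers):
    # Divide one big multimedia job into roughly equal, independent chunks.
    base, extra = divmod(total_frames, num_workers)
    chunks, start = [], 0
    for i in range(num_workers):
        size = base + (1 if i < extra else 0)
        chunks.append((start, start + size))  # (first frame, one past last)
        start += size
    return chunks

def usable_hours(donated_hours, ratio=100):
    # At a 1:100 ratio, every 100 hours you donate buys 1 hour of cluster time.
    return donated_hours / ratio

print(len(split_job(72000, 400)))  # 400 chunks of a 72000-frame job
print(usable_hours(2000))          # 2000 donated hours -> 20.0 hours to spend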
So for those of you who have an interest in distributed computing: does this sound feasible? Has it been done? If you or anyone you know has an interest in developing such a thing I’d like to volunteer my time and computing resources.
I should correct myself: the contests that clients like Distributed.net and Prime95 run aren’t “dumb”; they actually give the winner(s) a sum of money, which they donate to the charity of their choice. This isn’t dumb, sorry.
As one of their employees keeps pointing out in the SETI threads, Entropia is in this space as well. Sounds like an interesting product.
The hard part, of course, is partitioning your problem so that these loosely coupled machines can work on portions of it independently. Some things lend themselves to it, some don’t.
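To illustrate what “lends itself to it” means: anything where the portions don’t need to talk to each other. A minimal Python sketch (the worker function is just a stand-in for real work):

from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for real work: each chunk is processed with no knowledge
    # of any other chunk, which is what makes the problem partitionable.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]
    with Pool() as pool:
        partials = pool.map(process_chunk, chunks)  # chunks run independently
    print(sum(partials))  # combining the pieces is the only coordination needed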
I’ve been thinking along the same lines. I’m sure this could work as a peer-to-peer network, much like the internet is; something parallel to Napster. We could call it Sharester. I’m sure this would work, and man, people need it. There are times when my CPU is 98% used, and even more times when it’s 2% used, so I could certainly give away a few million clock cycles if anyone gets around to developing such a protocol/system. Let me know; I’ll offer my support.
There are quite a number of distributed computing projects currently in the works. For instance, check out The Beowulf Project (http://www.beowulf.org/), a distributed computing project for Linux. It centers around combining the computing power of ordinary networked workstations.
One of the primary problems (there are, of course, many others) with distributed computing is that, in order to take advantage of the additional distributed computing power, a task must be able to be broken up into many smaller pieces, with each processor tackling a piece of the puzzle. Tasks that are carried out serially, i.e. tasks where performing step 2 is dependent upon the outcome of step 1, aren’t able to take advantage of this parallel computing scheme.
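Here’s a toy example of the serial case, where no number of extra machines helps, because each step needs the previous step’s output before it can begin:

def serial_iterate(x0, steps):
    # Each iteration depends on the previous result, so step i+1 cannot
    # start until step i has finished -- there is nothing to hand out.
    x = x0
    for _ in range(steps):
        x = (x * x + 1) % 1_000_003
    return x

print(serial_iterate(2, 10))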
Also, as you increase the number of processing nodes, more and more processing power is lost to the overhead involved in keeping all of these nodes working together efficiently towards a common goal.
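This is roughly what Amdahl’s law describes: if some fraction of the job is inherently serial (coordination, combining results), the speedup flattens out no matter how many nodes you add. A quick back-of-the-envelope calculation:

def speedup(p, n):
    # Amdahl's law: with parallel fraction p and n nodes,
    # speedup = 1 / ((1 - p) + p / n).
    return 1.0 / ((1.0 - p) + p / n)

for n in (4, 40, 400, 4000):
    print(n, round(speedup(0.95, n), 1))
# Even with 95% of the work parallelizable, 400 nodes give only ~19x,
# and 4000 nodes barely improve on that (~19.9x).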
The SETI project works well under a distributed computing environment because the SETI servers can simply send out a packet of data to a node computer to analyze. The results obtained on one machine do not affect the results obtained on a different machine that is analyzing a different piece of data.
How secure would a globally distributed network be? With SETI@home, the information is freely available anyway, so you can’t ‘steal’ it. But personal data is not free, and someone could steal it. How much information could you really glean if you set up a packet sniffer on your machine and logged everything that went in and out? It seems that during a heavy period, where you would end up with a thousand packets from as many machines, you couldn’t do much with them (like a thousand fragments of as many pictures, you don’t get enough of any one to see anything). But during less active periods, you might get quite a few packets from only a few machines. Could you piece together what each was working on?
Thanks for the links and input. Actually, someone also pointed out to me that you would need a client on these nodes that is designed to perform a specific task, as with SETI or Distributed.net. So to use many remote machines to perform a number of tasks, you would need a rather large client running in memory. Aside from the fact that people might not like that, even a large client would be able to perform only a limited number of specific tasks. Distributed.net and SETI are able to run a small client, and while each has only one project, many people are interested in their respective projects; hence their success. I guess my idea wasn’t so great after all.
Also, I didn’t know companies were already trying to charge for distributed computing, but that doesn’t surprise me much.
I’m bumping this up, 'cause I just discovered Beowulf clusters and am itching to build one. It’s probably worth it for the experience, but what could I use it for? I’m not a programmer, so I’m not about to write software for it. Anybody got any ideas?
There just isn’t a great need for this kind of service. The kind of applications that require distributed processing aren’t likely to be encountered by even a small segment of the computing community, especially with the ever-increasing power of the personal CPU. A business model based on “distributed processing for all” is doomed to failure, IMHO.
Derleth points out that by shipping your data across the network, someone could steal it, but the problem is actually a bit bigger than that: someone doesn’t have to sniff the network to get your data – you’re sending it right to them so they can work on it. You’d better trust everyone who does any processing for you.
In addition to the person stealing your data: how can you be guaranteed that the results they send back are actually valid? If I’m sending my large-scale calculations off in chunks to be calculated in a distributed manner, I need to be assured that nobody crunching numbers for me has a rogue client that is generating bogus results.
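One standard defense (and, as I understand it, roughly what SETI@home does) is redundancy: send the same chunk to several volunteers and only accept an answer a majority agrees on. A toy sketch, with made-up names:

from collections import Counter

def accept_result(results, quorum):
    # results: the answers returned by different volunteers for one chunk.
    # Accept the most common answer only if at least 'quorum' clients agree;
    # otherwise hand the chunk out again.
    answer, votes = Counter(results).most_common(1)[0]
    return answer if votes >= quorum else None

print(accept_result([42, 42, 42, 13], quorum=3))  # 42: rogue client outvoted
print(accept_result([42, 13], quorum=2))          # None: no agreement, redo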
It seems conceivable that, for specific operations, data might be able to be used even if it’s encrypted. For example, if you’re doing frequency analysis on data which is encoded with a simple substitution cipher, the frequency analysis could potentially be done on the ciphered data without decoding it, then the results could be decoded. Of course, that’s not good encryption, but I wonder if the concept can be generalized to work with strong encryption in certain cases.
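To make that concrete, here’s the substitution-cipher version of the idea: the untrusted worker counts letter frequencies on the ciphertext alone, and only the owner, who knows the key, shifts the labels on the results back:

from collections import Counter

KEY = 3  # a Caesar shift -- weak on purpose, just to show the principle

def encrypt(text, key=KEY):
    return "".join(chr((ord(c) - 97 + key) % 26 + 97) for c in text)

plaintext = "attackatdawn"
ciphertext = encrypt(plaintext)

# Untrusted worker: does the frequency analysis without knowing the key.
cipher_counts = Counter(ciphertext)

# Owner: decodes the *results*, not the data, by shifting the labels back.
plain_counts = {chr((ord(c) - 97 - KEY) % 26 + 97): n
                for c, n in cipher_counts.items()}
print(plain_counts == dict(Counter(plaintext)))  # True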
Distributed computing only makes sense when a huge amount of computing needs to be done on a very small data set; otherwise the overhead of sending the data and results over the network becomes overwhelming. SETI@home sends about half a megabyte of data to you, which takes less than a minute. Then your computer spends many hours doing every possible analysis on this data and returns the results. But how often do you encounter a computing task like that? Usually the tasks done by computers are limited by I/O: disk access, network speed, video card speed, waiting for human input, etc. There are some CPU-limited tasks, like rendering CAD images, but even then it takes time to send the data over the network and then download the result.
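To put numbers on it, using the SETI@home figures above (the connection speed is my assumption):

data_bits = 0.5 * 8 * 1_000_000   # half a megabyte of data, in bits
transfer_s = data_bits / 128_000  # ~31 s on a 128 kbps line (my assumption)
compute_s = 20 * 3600             # call it 20 hours of crunching

print(round(transfer_s), "seconds to transfer")
print(round(compute_s / transfer_s), "seconds of computing per second of I/O")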