Differences with centralized frameworks for Distributed Computing

Differences with centralized frameworks

Many centralized frameworks exist today. In essence, a server distributes tasks to clients and collects the results once the clients have finished.
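The distribute-and-collect cycle described above can be sketched in a few lines. This is only an illustrative model, not the code of any of the projects discussed here: the class `CentralServer`, its methods `request_task` and `submit_result`, and the function `client_loop` are hypothetical names, and the network is replaced by direct function calls.

```python
from collections import deque


class CentralServer:
    """Hypothetical server: hands out work units and stores the results."""

    def __init__(self, work_units):
        self.pending = deque(work_units)  # work units not yet handed out
        self.results = {}                 # work unit -> result sent back

    def request_task(self):
        # Return the next work unit, or None when nothing is left.
        return self.pending.popleft() if self.pending else None

    def submit_result(self, unit, result):
        self.results[unit] = result


def client_loop(server, compute):
    """Hypothetical client: fetch a unit, compute, report back, repeat."""
    while True:
        unit = server.request_task()
        if unit is None:
            break
        server.submit_result(unit, compute(unit))


# Toy run: the "analysis" is just squaring the work unit number.
server = CentralServer(range(5))
client_loop(server, lambda n: n * n)
print(server.results)
```

Note that the clients never talk to each other: all coordination goes through the single server, which is what distinguishes this model from the peer-to-peer approach of GPU.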

SETI@home [10], the first successful distributed computing framework, works as follows: an old supercomputer distributes data from a radio telescope to ordinary computers run by three million volunteers. A small program installed on these computers analyzes the data in the background, using little CPU power while the user is working and full CPU power when the screensaver is active.

The data is analyzed with a Fast Fourier Transform in order to search for Gaussians and peaks that might be of extraterrestrial origin. The results are then sent back to the old supercomputer, and possibly interesting ones are post-processed by scientists.
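To illustrate the idea of looking for a peak in a frequency spectrum, here is a minimal sketch using a textbook radix-2 Fast Fourier Transform. It is not the actual SETI@home analysis code: the functions `fft` and `strongest_bin` and the synthetic test signal are assumptions made for this example only.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def strongest_bin(samples):
    """Index of the frequency bin with the most power (bin 0 is skipped)."""
    spectrum = fft(samples)
    half = len(samples) // 2
    powers = [abs(c) ** 2 for c in spectrum[1:half]]
    return 1 + max(range(len(powers)), key=powers.__getitem__)


# Synthetic signal: a pure sine wave with 5 cycles over 64 samples.
n = 64
signal = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
peak_bin = strongest_bin(signal)
print(peak_bin)
```

For this synthetic input the power is concentrated in bin 5, the frequency of the sine wave. A real search would additionally have to separate such peaks from noise.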

Many other projects have followed in the tracks of SETI@home, for example Folding@home [11], Climateprediction.net [12], distributed.net [13] and Chessbrain [14]. BOINC (Berkeley Open Infrastructure for Network Computing [15]) is currently attempting to unify many projects under the same infrastructure. As of today (February 2004), BOINC is in beta test, and we can expect to see it running on millions of machines soon.

Figure 3: Architecture of the BOINC [15] project, an extensible centralized model

Being distributed in nature, the GPU project cannot match the scale of these big centralized projects. With Gnutella, each user reaches only around 2000 computers² (with a Time-To-Live stamp of 7), about the size of a normal Beowulf cluster. More computers could be built into a cluster, but each machine would still see only about 2000 of them, because every Gnutella packet carries a countdown counter that is decremented each time a computer is reached; once the counter reaches zero, the packet is destroyed and no further nodes can be seen [3].
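The effect of the countdown counter can be sketched as a breadth-first flood over the network graph. This is a simplified model rather than the Gnutella protocol itself: the function `reachable_nodes` and the ten-node chain topology are assumptions chosen for illustration, and duplicate packets are simply ignored.

```python
from collections import deque


def reachable_nodes(neighbours, origin, ttl):
    """Nodes reached by a packet flooded from `origin` with hop budget `ttl`."""
    seen = {origin}
    frontier = deque([(origin, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue  # counter reached zero: the packet is not forwarded
        for peer in neighbours[node]:
            if peer not in seen:
                seen.add(peer)
                frontier.append((peer, remaining - 1))  # one hop used up
    seen.discard(origin)
    return seen


# Toy topology: ten nodes in a chain, 0 - 1 - 2 - ... - 9.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
horizon = reachable_nodes(chain, 0, 7)
print(sorted(horizon))
```

With a Time-To-Live of 7, node 0 sees nodes 1 to 7 only. Nodes 8 and 9 are part of the same network but lie beyond its horizon, which is the limitation described above.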

Additionally, clients have to stay online and operational; GPUs cannot simply disconnect and crunch data offline because they have to keep the network operational by forwarding jobs and answers. This disadvantage might slowly fade away, thanks to the new ADSL connections that provide 24-hour access for a reasonable monthly fee.

The main advantage of the GPU framework is that everyone can use it for his or her own purposes. Users running Chessbrain on their home computers follow one match only (in February 2002, against Grandmaster Peter Nielsen). Users running GPU can occasionally play against the entire GPU framework (note, however, that the provided chess plugin, chessbackend.dll, is an example of a frontend and does not yet implement parallelism).

We can think of scientists with small budgets developing plugins for the framework or of developers implementing distributed databases in a similar way. Through the "autoupdate" routines, the plugin will eventually spread to all GPUs. Note that the choice to update is left to the user.

Some other, weaker differences and considerations are:

A centralized model is normally backed by an institution that constantly produces data for the computing network.
In a centralized model, volunteers can group into teams and compete both individually and as a team. This is an important motivating factor.
Motivation for GPU should arise from the fact that everyone is part of a bigger entity, represented by the virtual supercomputer. Volunteers compute for others, but can ask for computational time as well.

Tiziano Mengotti 2004-03-27