Tuesday 29th June 2004

Distributed computing

Posted at: Tuesday 29th June 2004 by Ian Betteridge

You might just use your bog-standard 2.4GHz PC for writing inane messages to your friends, but it could become the most powerful computer in the world, or at least part of it. Ian Betteridge explains how

But the first distributed project to truly capture the imagination was nothing to do with calculating prime numbers or breaking encryption. Instead, it managed to combine two classic geek interests: space and computers. SETI@home was launched in 1999 and used distributed computing methods to analyse the radio background of the sky, searching for any hint of a regular intelligent signal. This is exactly the right kind of job for distributed computing, as it requires massive amounts of data processing, but is easily split into small chunks that even the least powerful machines can handle.

The project was conceived by an Australian computer science student, David Gedye, who was inspired by a film about the Apollo moon landings. Gedye noted how the Apollo project involved a lot of ordinary people in science, mainly through TV, and looked for a way to involve the public in the search for extraterrestrial intelligence (SETI).

But the true genius behind SETI@home was to release clients for Windows, Linux and Mac OS, which allowed virtually any machine in the world to take part. Unsurprisingly, it took off, and within a year over two million computers around the world were taking part.

How is it done?
While the original distributed computing projects used dedicated operating systems, allowing tasks to be split across machines without any intervention from a central server, modern projects tend to use simpler client/server architectures. So if you want to contribute to a project, you just download the client software, which then runs in the background on your machine.

Typically, the client software contacts a central server, which allocates it a chunk of data, and then processes that chunk while your machine is idle or when you're not using much of its power. Once your machine has finished its task, it sends the results back to the server, which hands out another task and reassembles the results coming back from all the clients into a useful form.
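To make that request, crunch and return cycle concrete, here is a minimal sketch in Python that simulates it inside a single program. Everything in it is an assumption made for illustration: the `Server` class, the `crunch` function and the chunk size are hypothetical, and no real project's client or protocol works exactly like this.

```python
from collections import deque


class Server:
    """Hypothetical coordinator: hands out chunks and collects results."""

    def __init__(self, data, chunk_size):
        # Split the full data set into small, independent work units,
        # each identified by the position of its first item.
        self.pending = deque(
            (start, data[start:start + chunk_size])
            for start in range(0, len(data), chunk_size)
        )
        self.results = {}

    def next_task(self):
        # Return (task_id, chunk), or None when nothing is left to hand out.
        return self.pending.popleft() if self.pending else None

    def submit(self, task_id, result):
        # Store a finished result under the id of the task it belongs to.
        self.results[task_id] = result

    def reassemble(self):
        # Combine the per-chunk results in their original order.
        return [self.results[task_id] for task_id in sorted(self.results)]


def crunch(chunk):
    # Stand-in for the real number crunching: a sum of squares.
    return sum(x * x for x in chunk)


def run_client(server):
    # Ask for work, process it, send the result back, and repeat.
    done = 0
    while (task := server.next_task()) is not None:
        task_id, chunk = task
        server.submit(task_id, crunch(chunk))
        done += 1
    return done


server = Server(list(range(10)), chunk_size=4)
completed = run_client(server)
partials = server.reassemble()
print(completed, partials, sum(partials))
```

A real client would talk to the server over the network and would only work when the machine is idle, but the division of labour is the same: the server splits and reassembles, the client just crunches.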

What the client actually does depends on the type of project, but in all cases it's some variety of number crunching. For example, SETI@home searches radio telescope data for a concentrated signal on a very narrow bandwidth, while Folding@home simulates how proteins 'fold' (assemble themselves into their working three-dimensional shapes).
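As a rough illustration of what hunting for a narrow-bandwidth signal involves, the Python sketch below buries a single tone in random noise and then finds it again by measuring how much power sits in each frequency bin. It is a toy under stated assumptions, not the SETI@home algorithm: the sample count, the tone's position and strength, and the function names are all made up for the example.

```python
import cmath
import math
import random


def power_spectrum(samples):
    # Naive discrete Fourier transform: the power in each frequency bin.
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        total = sum(
            s * cmath.exp(-2j * math.pi * k * t / n)
            for t, s in enumerate(samples)
        )
        spectrum.append(abs(total) ** 2 / n)
    return spectrum


def strongest_bin(spectrum):
    # Return the bin holding the most power, and how far that power
    # stands above the average across all bins.
    peak = max(range(len(spectrum)), key=spectrum.__getitem__)
    mean = sum(spectrum) / len(spectrum)
    return peak, spectrum[peak] / mean


random.seed(1)   # fixed seed so the run is repeatable
n = 256          # number of samples (assumed)
tone_bin = 40    # where the hidden tone sits (assumed)
samples = [
    random.gauss(0, 1) + math.sin(2 * math.pi * tone_bin * t / n)
    for t in range(n)
]

peak, ratio = strongest_bin(power_spectrum(samples))
print(f"strongest bin: {peak}, {ratio:.1f} times the average power")
```

Random noise spreads its power thinly across every bin, while a steady tone piles all of its power into one, which is why a signal too faint to see in the raw samples can still stand out in the spectrum.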

The data itself comes from a variety of sources, depending on the project. However, SETI@home is fairly typical of the way that a distributed computing project works. Every day, the world's largest radio telescope, at Arecibo in Puerto Rico, collects around 50GB of radio data for a variety of space science projects. Ironically, given the role the Internet plays in distributed computing, Arecibo doesn't have a high-speed net connection, so this data has to be sent out on tape via snail mail. The data is passed to a dozen Sun servers running Solaris, which range from small rack-mounted units to larger E450s with four CPUs and 6GB of RAM each, and is then stored on two NAS (network-attached storage) servers with several terabytes each. The servers handle requests for data coming in from the 500,000 clients active at any one time and reassemble completed tasks, flagging any that have revealed potentially interesting data for later analysis.
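Those two figures give a feel for how thinly the work is spread. The back-of-envelope Python below divides the daily 50GB by the 500,000 active clients. It assumes decimal gigabytes, that every byte collected goes to SETI@home (the telescope's data actually serves several projects, so this is an upper bound) and, by default, that each chunk is sent to only one client. The `copies_per_chunk` parameter is hypothetical, included only to show how sending chunks out more than once would change the answer.

```python
def raw_bytes_per_client(daily_bytes, active_clients, copies_per_chunk=1):
    # Average amount of raw data each client would receive per day.
    return daily_bytes * copies_per_chunk / active_clients


daily_bytes = 50 * 10**9    # around 50GB a day (decimal gigabytes assumed)
active_clients = 500_000    # clients active at any one time

per_client = raw_bytes_per_client(daily_bytes, active_clients)
print(f"{per_client / 1000:.0f}KB per client per day")
```

On those assumptions each machine would see only around 100KB of raw data a day, a small enough slice for even a modest PC to handle.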
