Computer cluster

A computer cluster is a group of locally connected computers that work together as a unit. One of the more popular implementations is a cluster whose nodes run Linux as the operating system and use free software to implement the parallelism; this configuration is often referred to as a Beowulf cluster. Sun Microsystems has also released a clustering product called Grid Engine. OpenSSI is another clustering project that provides single-system image (SSI) capabilities. It builds on HP's NonStop Clusters for UnixWare technology and other open source technology to provide a full, highly reliable SSI environment for Linux.

Cluster types

There are fundamentally four types of clusters:

  • Director-based clusters
  • Two-node clusters
  • Multi-node clusters
  • Massively parallel clusters

All mature (or highly available) cluster implementations attempt to eliminate single points of failure. Director-based clusters and Beowulf clusters are typically implemented for performance reasons, while two-node clusters are typically implemented for fault tolerance.
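
The fault-tolerance role of a two-node cluster can be illustrated with a minimal active/standby sketch. The names `Node` and `TwoNodeCluster` are hypothetical and do not correspond to any product mentioned in this article; a real cluster would detect failures with heartbeats and would also have to protect shared state, which this sketch omits.

```python
class Node:
    """Hypothetical cluster member that can be marked as failed."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class TwoNodeCluster:
    """Hypothetical active/standby pair: requests go to the active node."""

    def __init__(self, primary, standby):
        self.active, self.standby = primary, standby

    def handle(self, request):
        try:
            return self.active.handle(request)
        except ConnectionError:
            # Fail over: promote the standby and retry the request once.
            self.active, self.standby = self.standby, self.active
            return self.active.handle(request)


cluster = TwoNodeCluster(Node("node-1"), Node("node-2"))
first = cluster.handle("req-1")
cluster.active.alive = False  # simulate a failure of the active node
second = cluster.handle("req-2")
print(first)
print(second)
```

If both nodes are down, the retry raises `ConnectionError`, which reflects the fact that a two-node pair removes only a single point of failure.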

A cluster of computers is referred to as a server farm when the computers are used to mimic the operations of a single server machine.

Cluster implementations

The TOP500 project [1] publishes a list of the 500 fastest supercomputers, which include many clusters, twice a year. TOP500 is a collaboration between the University of Mannheim, the University of Tennessee, and the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory. As of the November 2004 list, the top supercomputer is the Department of Energy's BlueGene/L system, with a performance of 70.72 TFlops, ahead of the number 2 system by over 18 TFlops.

Clustering can provide significant performance benefits relative to price. The System X supercomputer at Virginia Tech, the third most powerful supercomputer on Earth as of November 2003, was built as a cluster of 1,100 Apple Power Macintosh G5s running Mac OS X. The total cost of the system was $5.2 million, a tenth of the cost of slower mainframe supercomputers. The Power Mac G5s, which were sold off, have since been replaced with Apple's Xserve G5 machines, which are smaller and so reduce the physical size of the cluster. The Xserves still run Mac OS X.

The central concept of a Beowulf cluster is the use of commercial off-the-shelf (COTS) machines to produce a cost-effective alternative to a traditional supercomputer. One project that took this to an extreme was the Stone Soupercomputer.

Cluster history

The first commodity clustering product was ARCnet, developed by Datapoint in 1977. ARCnet was not a commercial success, and clustering did not really take off until DEC released its VAXcluster product in the 1980s for the VAX/VMS operating system. The ARCnet and VAXcluster products supported not only parallel computing but also shared file systems and peripheral devices. They were intended to provide the advantages of parallel processing while maintaining data reliability and uniqueness.

Cluster technologies

In the GNU/Linux world, several cluster software packages are available, such as the Linux Virtual Server (LVS), distcc, Kerrighed, Mosix and its free counterpart openMosix. LVS clusters are a form of director-based cluster in which incoming requests for services are distributed across multiple cluster nodes. Mosix and openMosix provide automatic process migration in a homogeneous cluster of GNU/Linux machines, while distcc provides parallel compilation when using GCC. Kerrighed is a GNU/Linux-based operating system that presents a cluster of PCs as a single SMP machine.
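
The director-based idea described above, in which one front-end node hands incoming requests to back-end nodes, can be sketched with a simple round-robin policy. This is only an illustration and not how LVS itself is implemented; the class name `RoundRobinDirector` and the node names are hypothetical.

```python
from itertools import cycle


class RoundRobinDirector:
    """Hypothetical director that gives each request to the next node in turn."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = cycle(list(nodes))

    def dispatch(self, request):
        # Pick the next node in the rotation and pair it with the request.
        node = next(self._nodes)
        return node, request


director = RoundRobinDirector(["node-a", "node-b", "node-c"])
assignments = [director.dispatch(f"request-{i}")[0] for i in range(5)]
print(assignments)
```

Round-robin ignores how busy each node is. A director could instead choose the node with the fewest active connections, at the cost of tracking per-node state.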

DragonFly BSD, a recent fork of FreeBSD 4.8, is being redesigned at its core to enable native clustering capabilities.

MPI (Message Passing Interface) is a widely available communications library interface that enables parallel programs to be written in languages such as C and Fortran. It is used, for example, in the climate modeling program MM5.
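
The message-passing style that MPI supports can be illustrated without MPI itself. The sketch below emulates it with Python threads and queues: a coordinator scatters chunks of data to workers, each worker computes a partial sum, and the coordinator reduces the partial results. It does not use the MPI API, and the function names `worker` and `parallel_sum` are assumptions made for illustration.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    # Receive one chunk of work, compute a partial sum, and send it back.
    chunk = inbox.get()
    outbox.put((rank, sum(chunk)))


def parallel_sum(data, num_workers=4):
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()
    # "Scatter": send each worker a strided slice of the data.
    for rank in range(num_workers):
        inboxes[rank].put(data[rank::num_workers])
    # "Reduce": collect one partial result per worker and combine them.
    total = sum(results.get()[1] for _ in range(num_workers))
    for t in threads:
        t.join()
    return total


print(parallel_sum(list(range(1, 101))))
```

On a real cluster the workers would be separate processes on different nodes, and the queues would be replaced by network messages.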

MSCS is Microsoft's high-availability cluster service for Windows. Based on technology developed by Digital Equipment Corporation, it was originally released as a separately purchased add-on product for Windows NT 4.0, was shipped as a point-and-click install with Windows 2000 Enterprise, and is built into Windows Server 2003. The original product supported two cluster nodes connected via Ethernet and a shared SCSI channel, with separate quorum and resource disk drives. The current version supports eight nodes in a single cluster, typically connected to a SAN. A rich API supports cluster-aware applications, and generic templates provide support for applications that are not cluster-aware.

Grid computing represents the next step in cluster computing. The key difference between grids and traditional clusters is that grids are heterogeneous, supporting different hardware varieties and different operating systems within the same grid, while traditional clusters are homogeneous, with all cluster members running the same OS and subject to fairly strict hardware compatibility requirements. Also, a true grid can spread out and encompass user desktops, while clusters are generally confined to data centers. Unfortunately, many vendors have attempted to cash in on the interest in grid computing by rebranding their existing traditional cluster products as grids.


The contents of this article are licensed from Wikipedia.org under the
GNU Free Documentation License.