Clusters

Every SMP system has a maximum size: a limit on the number of processors, the amount of memory, and the number of disks and network interfaces that can be attached to the system. To go beyond these limits, the software architecture must support a cluster.

All the largest computer systems are structured as clusters. The largest known database, the Wal-Mart system, runs on an NCR/Teradata cluster of 500 Intel processors accessing approximately 1,500 disks. Other large databases are supported by IBM MVS/Sysplex systems, Tandem Himalaya systems, or Digital VMScluster systems.

Clusters work by partitioning the database and application among several nodes, or computers. Each node stores a part of the database and performs a part of the workload. The cluster grows by adding nodes and redistributing the storage and work to the new nodes. This growth and redistribution should be automatic and require no changes to the application.
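As a minimal sketch of this idea, the Python fragment below hash-partitions a key-value store across nodes and redistributes keys when a node is added. The class and node names are illustrative assumptions, not the interface of any cluster product named above; real systems move whole partitions in the background rather than individual keys.

    import hashlib

    class HashPartitionedStore:
        """Illustrative key-value store partitioned by hash across nodes."""

        def __init__(self, node_names):
            # each node holds a shard: a dictionary standing in for its disks
            self.shards = {name: {} for name in node_names}
            self.nodes = list(node_names)

        def _node_for(self, key):
            # a deterministic hash maps every key to exactly one node
            h = int(hashlib.md5(key.encode()).hexdigest(), 16)
            return self.nodes[h % len(self.nodes)]

        def put(self, key, value):
            self.shards[self._node_for(key)][key] = value

        def get(self, key):
            return self.shards[self._node_for(key)].get(key)

        def add_node(self, name):
            # growth step: register the node, then move every key whose
            # hash now maps elsewhere; put/get are unchanged for callers
            self.shards[name] = {}
            self.nodes.append(name)
            for node, shard in list(self.shards.items()):
                for key in list(shard):
                    home = self._node_for(key)
                    if home != node:
                        self.shards[home][key] = shard.pop(key)

    store = HashPartitionedStore(["node0", "node1"])
    store.put("cust:42", "Smith")
    store.add_node("node2")          # redistribute; no application change
    assert store.get("cust:42") == "Smith"

A production design would also bound how much data each growth step moves, for example with a consistent-hashing scheme, so that adding one node touches only a small fraction of the database.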

Partitioned data is used in centralized systems to scale up capacity or bandwidth. When one disk is too small for a database, the database system partitions the database over many disks. The same technique is used when traffic on a file or database exceeds the bandwidth of one disk. This partitioning is transparent to the application: the application accesses all data on all the disks as though it were on a single disk. Similarly, clients are partitioned among networks to scale up network bandwidth. Clusters generalize this partitioning to spread the computation among cluster nodes. The key challenge is to make this partitioning transparent to the customer, so that a cluster is as easy to manage and program as a single system.
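The disk-level transparency can be sketched the same way. The hypothetical StripedFile below (an assumption for illustration, not any database system's actual striping code) spreads blocks round-robin over several disks, while the caller addresses one linear sequence of block numbers, exactly as if there were a single large disk.

    class StripedFile:
        """Illustrative block-striped file: blocks spread round-robin
        over disks, presented to the application as one linear file."""

        def __init__(self, ndisks, block_size=4096):
            self.block_size = block_size
            # each "disk" is a list of blocks; a real system uses devices
            self.disks = [[] for _ in range(ndisks)]

        def write_block(self, blockno, data):
            disk = blockno % len(self.disks)      # which disk holds the block
            slot = blockno // len(self.disks)     # position on that disk
            shard = self.disks[disk]
            while len(shard) <= slot:
                shard.append(b"\x00" * self.block_size)
            shard[slot] = data

        def read_block(self, blockno):
            # the caller sees blocks 0..N-1 as though on a single disk
            return self.disks[blockno % len(self.disks)][blockno // len(self.disks)]

    f = StripedFile(ndisks=4)
    f.write_block(10, b"page ten")
    assert f.read_block(10) == b"page ten"

A cluster extends this mapping one level: the block-to-disk function becomes a key-to-node function, and the routing layer, rather than the application, decides where each piece of data and work lives.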