This is the first in a three part series of articles that descibes how to create a high availability cluster on Linux.
Below we focus on the basic concepts, discuss configuration prerequisites, and create a basic design.
Actual configuration will take place in the second article in this series. We will conclude in the third article with a presentation of test cases and options for optimization.
Exactly what is a cluster and why do we need them?
In broad terms clustering which is the formation a cluster refers to having two or more computers also known as nodes synchronize timing, processes, and data when working together to complete a task. Clusters can be created at the hardware and software levels. In addition clusters may be formed at any level of the datacenter stack from physical components (controller cards whether storage or network), hypervisors, Operating Systems, and lastly at the userspace level in either the middleware such as the web server or database or in the end user application.
There are three basic motivators for creating a cluster. High performance Computing (HPC), network traffic Load Balancing, and service resilience in the form of High Availability (HA).
HPC or Grid Computing is when a task is broken down into subcomponents and execution of each component is distributed across the nodes in the cluster to improve the speed of computation. The outputs from each node are finally reconstitued to produce the final result.
Load Balancing occurs when production traffic is distributed between nodes in the cluster to reduce the load on any single system. Various algorithms determine the method by which the distribution occurs such as round-robin or least connections etc. Balancing can be done to improve performance and or provide resiliency at the network level. Network load balancing is similar to bonding or teaming at the host level with the exception being that in a cluster case it happens across physical nodes.
High Availability or Failover clustering is done to improve the fault tolerance characteristics of a particular service. In this case services are built up from resource groups that are managed between the nodes. In the event of a fault on any node running a service the affected node is taken out of the cluster that is it is fenced and the service relocated to another healthy node. Common resources found in HA clusters are shared storage devices and virtual IP addresses. Also there will be a fence device with optional heuristics that is used to trigger the reboot or IO restriction for faulted nodes.
Red Hat Cluster Suite (RCS)
RCS is Red Hat's software for creating a HA cluster at the operating system level using Red Hat Enterpise Linux (RHEL).
Read the full tutorial at creating a red hat cluster