
Monday 13 June 2016

Manage High Availability Clusters Using Pacemaker



Computer Cluster 

A computer cluster consists of a set of loosely or tightly connected computers that work together so that, in many respects, they can be viewed as a single system. Unlike ‘grid computers’,  computer clusters have each node set to perform the same task, controlled and scheduled by software.

The components of a cluster are usually connected to each other through fast local area networks ("LAN"), with each node (computer used as a server) running its own instance of an operating system. In most circumstances, all of the nodes use the same hardware and the same operating system, although in some setups (e.g. using Open Source Cluster Application Resources (OSCAR)), different operating systems can be used on each computer, and/or different hardware.

They are usually deployed to improve performance and availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.

Computer clusters emerged as a result of the convergence of a number of computing trends, including the availability of low-cost microprocessors, high-speed networks, and software for high-performance distributed computing. They have a wide range of applicability and deployment, ranging from small business clusters with a handful of nodes to some of the fastest supercomputers in the world, such as IBM's Sequoia. The range of applications is limited in practice, however, since software generally has to be written or configured specifically to run on a cluster; clusters are therefore not a drop-in replacement for everyday, single-machine computing.


Pacemaker

Pacemaker is an open-source high-availability cluster resource manager that has been used on computer clusters since 2004. Until about 2007, it was part of the Linux-HA project; it was then split out to become its own project.
It implements several APIs for controlling resources, but its preferred API for this purpose is the Open Cluster Framework resource agent API. Pacemaker is generally used with Corosync or Heartbeat.

Pacemaker’s key features include:

  • Detection and recovery of node and service-level failures
  • Storage agnostic, no requirement for shared storage
  • Resource agnostic, anything that can be scripted can be clustered
  • Supports STONITH for ensuring data integrity
  • Supports large and small clusters
  • Supports both quorate and resource driven clusters 
  • Supports practically any redundancy configuration
  • Automatically replicated configuration that can be updated from any node
  • Ability to specify cluster-wide service ordering, colocation and anti-colocation
  • Unified, scriptable, cluster shell
  • Support for advanced service types (a pcs sketch of both follows this list):
        Clones: for services which need to be active on multiple nodes
        Multi-state: for services with multiple modes
           (e.g. master/slave, primary/secondary)
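
As a quick illustration of the last two service types, here is a minimal sketch using the pcs tool described later in this post. The resource names and agent choices are hypothetical, and pcs 0.9 syntax (as shipped with RHEL/CentOS 7 around 2016) is assumed:

    # Clone: run a web server resource on every node in the cluster
    pcs resource create WebServer ocf:heartbeat:apache \
        configfile=/etc/httpd/conf/httpd.conf op monitor interval=30s
    pcs resource clone WebServer

    # Multi-state (master/slave): a DRBD-backed resource with one master copy
    pcs resource create DrbdData ocf:linbit:drbd drbd_resource=r0 \
        op monitor interval=30s
    pcs resource master DrbdDataClone DrbdData \
        master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true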

Pieces of a Cluster

At a high level, the cluster is made up of three pieces:

Non-cluster-aware components. These pieces include the resources themselves, scripts that start, stop and monitor them, and a local daemon that masks the differences between the different standards these scripts implement.

Resource management. Pacemaker provides the brain that processes and reacts to events regarding the cluster. These events include nodes joining or leaving the cluster; resource events caused by failures, maintenance or scheduled activities; and other administrative actions. Pacemaker will compute the ideal state of the cluster and plot a path to achieve it after any of these events. This may include moving resources, stopping nodes and even forcing them offline with remote power switches.

Low-level infrastructure. Corosync provides reliable messaging, membership and quorum information about the cluster.
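
To make the first piece more concrete, here is a minimal sketch of what such a script, an OCF-style resource agent, can look like. The daemon name, paths and pidfile parameter are hypothetical, and a real agent would also implement actions such as meta-data and validate-all; the point is only to show the start/stop/monitor contract and the exit codes Pacemaker relies on:

    #!/bin/sh
    # Minimal OCF-style resource agent sketch for a hypothetical "mydaemon".
    # Pacemaker's local resource daemon calls this script with the action
    # name as the first argument and interprets the exit code.

    PIDFILE="${OCF_RESKEY_pidfile:-/var/run/mydaemon.pid}"  # hypothetical parameter

    case "$1" in
      start)
        /usr/local/sbin/mydaemon --pidfile "$PIDFILE" && exit 0  # 0 = OCF_SUCCESS
        exit 1                                                   # 1 = OCF_ERR_GENERIC
        ;;
      stop)
        [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")" 2>/dev/null
        rm -f "$PIDFILE"
        exit 0
        ;;
      monitor)
        if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
          exit 0      # running
        fi
        exit 7        # 7 = OCF_NOT_RUNNING
        ;;
      *)
        exit 3        # 3 = OCF_ERR_UNIMPLEMENTED
        ;;
    esac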


Node Configurations

While the components do not vary, a cluster can support different types of node configurations. The most common size for an HA cluster is a two-node cluster, since that is the minimum required to provide redundancy, but many clusters consist of many more, sometimes dozens of nodes. Such configurations can sometimes be categorized into one of the following models:

Active/active — Traffic intended for the failed node is either passed onto an existing node or load balanced across the remaining nodes. This is usually only possible when the nodes use a homogeneous software configuration.

Active/passive — Provides a fully redundant instance of each node, which is only brought online when its associated primary node fails. This configuration typically requires the most extra hardware.

N+1 — Provides a single extra node that is brought online to take over the role of the node that has failed. In the case of heterogeneous software configuration on each primary node, the extra node must be universally capable of assuming any of the roles of the primary nodes it is responsible for. This normally refers to clusters that have multiple services running simultaneously; in the single service case, this degenerates to active/passive.

N+M — In cases where a single cluster is managing many services, having only one dedicated failover node might not offer sufficient redundancy. In such cases, multiple (M) standby servers are included and available. The number of standby servers is a trade-off between cost and reliability requirements.

N-to-1 — Allows the failover standby node to become the active one temporarily, until the original node can be restored or brought back online, at which point the services or instances must be failed-back to it in order to restore high availability.

N-to-N — A combination of active/active and N+M clusters, N to N clusters redistribute the services, instances or connections from the failed node among the remaining active nodes, thus eliminating (as with active/active) the need for a 'standby' node, but introducing a need for extra capacity on all active nodes.

The term logical host or cluster logical host is used to describe the network address that is used to access services provided by the cluster. This logical host identity is not tied to a single cluster node; it is a network address/hostname that is linked with the service(s) provided by the cluster. If a cluster node with a running database goes down, for example, the database will be restarted on another cluster node.
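
With Pacemaker, such a logical host is typically implemented as a floating IP address resource. The sketch below assumes pcs 0.9 syntax and hypothetical resource names and addresses; the IPaddr2 agent moves the address to whichever node currently runs the service, and the constraints keep the service and the address together:

    # A floating IP address that clients use to reach the clustered service
    pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
        ip=192.168.122.120 cidr_netmask=24 op monitor interval=30s

    # Keep a (hypothetical) WebServer resource on the same node as the IP,
    # and make sure the IP is started before the web server
    pcs constraint colocation add WebServer with ClusterIP INFINITY
    pcs constraint order ClusterIP then WebServer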

The following picture from the Pacemaker documentation shows an Active/Active configuration:


Using PCS Command Line to Manage Pacemaker and Corosync

The pcs command line interface provides the ability to control and configure Corosync and Pacemaker.

The general format of the pcs command is as follows:
pcs [-f file] [-h] [commands]
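
The -f option is useful for staging changes against a saved copy of the cluster configuration (the CIB) rather than the live cluster, and then pushing them in one step. A minimal sketch with a hypothetical file name:

    # Save the current cluster configuration to a local file
    pcs cluster cib my_config.xml

    # Make changes against the file instead of the running cluster
    pcs -f my_config.xml resource create TestResource ocf:heartbeat:Dummy

    # Push the staged configuration back to the live cluster
    pcs cluster cib-push my_config.xml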

The main pcs commands are as follows:
  • cluster: Configure cluster options and nodes.
  • resource: Create and manage cluster resources.
  • stonith: Configure fence devices for use with Pacemaker.
  • constraint: Manage resource constraints.
  • property: Set Pacemaker properties.
  • status: View current cluster and resource status.
  • config: Display complete cluster configuration in user-readable form.

Hope this helps clarify how Pacemaker fits into a high-availability cluster setup.