Cluster Service Architecture
This lesson describes the components of the Cluster Service architecture and how they interact.
--------------------------------------------------------------------------------
Cluster Service Components
To implement a cluster, you must install Cluster Service on each node. Each component of Cluster Service has specific responsibilities in maintaining the operation of the cluster, as described below. Figure 1.2 shows the components.
Figure 1.2 Cluster Service components
Checkpoint Manager
The Checkpoint Manager ensures that Cluster Service can successfully failover cluster-unaware applications by performing Registry checkpointing. The Checkpoint Manager monitors a resource's Registry data and saves any changes, called checkpoint data, to the quorum recovery log. The Checkpoint Manager also writes a checkpoint to the quorum disk when the resource is taken offline. When a node takes ownership of a new resource, the Checkpoint Manager updates the resource's Registry data before bringing the resource online.
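The checkpoint flow can be sketched as follows. This is an illustrative Python model with hypothetical names, not the actual Cluster Service implementation: one node saves a snapshot of a resource's Registry data to the quorum disk, and the node that later takes ownership restores that snapshot before bringing the resource online.

```python
import copy

class CheckpointManager:
    """Sketch of Registry checkpointing (names are hypothetical).

    Saves snapshots of a resource's Registry data ("checkpoint data")
    to storage on the quorum disk; a node taking ownership restores
    the latest checkpoint before bringing the resource online.
    """

    def __init__(self, quorum_storage):
        self.quorum_storage = quorum_storage  # shared storage on the quorum disk

    def save_checkpoint(self, resource_id, registry_data):
        # Persist a snapshot of the resource's Registry subtree.
        self.quorum_storage[resource_id] = copy.deepcopy(registry_data)

    def restore_checkpoint(self, resource_id):
        # Called on the new owner node before the resource goes online.
        return copy.deepcopy(self.quorum_storage.get(resource_id, {}))

# Node A saves a checkpoint; node B (the new owner) restores it.
quorum_storage = {}
cm = CheckpointManager(quorum_storage)
cm.save_checkpoint("SQL1", {"Port": 1433, "DataPath": "D:\\sql"})
restored = cm.restore_checkpoint("SQL1")
```

The deep copies model the key property: the restoring node gets its own consistent copy of the checkpoint data rather than a live reference.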
Communications Manager
The Communications Manager (also known as the Cluster Network Driver) manages the communication between the nodes in the cluster. It maintains continual communication with the other nodes by using Remote Procedure Calls (RPCs). If a node fails, the Communications Manager notifies the other nodes in the cluster as part of the failover process.
In addition, the Communications Manager delivers cluster heartbeat messages (data packets sent between nodes to verify the health of the cluster), responds to cluster connection requests, and notifies the entire cluster when resources are brought online or taken offline.
Configuration Database Manager
The Configuration Database Manager, or Database Manager, manages and maintains information about the cluster configuration in the configuration database. The configuration database, which is stored in the Registry of each node in the cluster, contains information about all of the entities of the cluster, including the cluster itself, resources, and groups. The Database Managers on each node cooperate to ensure that updates to the database are consistent and accurate.
The quorum resource stores the most current version of the configuration database in the form of recovery logs and Registry checkpoint files. This allows nodes that join a cluster to receive the latest version of the configuration database for their local Registry.
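The catch-up step for a joining node can be sketched as a log replay. The class and entry format below are invented for illustration; the point is that the quorum recovery log holds an ordered sequence of changes, and a joining node applies any entries newer than what its local Registry already reflects.

```python
class ConfigDatabase:
    """Sketch (hypothetical names): a node's local copy of the cluster
    configuration database, caught up by replaying the quorum recovery log."""

    def __init__(self):
        self.data = {}
        self.sequence = 0  # last applied log sequence number

    def apply(self, entry):
        seq, key, value = entry
        if seq > self.sequence:   # skip entries already reflected locally
            self.data[key] = value
            self.sequence = seq

def replay_quorum_log(db, log):
    # The quorum resource holds the authoritative, ordered change history.
    for entry in log:
        db.apply(entry)

quorum_log = [(1, "ClusterName", "CLUS1"),
              (2, "GroupCount", 3),
              (3, "GroupCount", 4)]
joining_node_db = ConfigDatabase()
replay_quorum_log(joining_node_db, quorum_log)
```

Because entries are applied in sequence order, the joining node ends up with the latest value for each key, matching the cluster-wide current version.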
Event Processor
The Event Processor initializes Cluster Service and passes messages to and from the nodes of the cluster. These messages, called event signals, are associated with activities, such as status changes and requests to open or close applications. Event signals include important information that must be disseminated to the other cluster components. In addition, the Event Processor supports the cluster API eventing mechanism, enabling developers to write cluster-aware applications that can send and receive cluster events.
Event Log Manager
The Event Log Manager ensures that each node of the cluster has the same event log entries. To accomplish this, it replicates the event log of one node to all other nodes in the cluster.
Failover Manager
The Failover Manager decides which node in a cluster takes control of a resource in the event of a failover. If the cluster consists of more than two nodes, the Failover Managers of each online node will negotiate control of the resources from the failed node.
If the nodes in a cluster cannot communicate with each other but are otherwise functioning properly, the Failover Manager still initiates the failover process. Each node assumes that the other has failed and attempts to arbitrate for the quorum resource. Because only one node can control the quorum resource at a time, the quorum resource guarantees that only one node brings a given resource online. The node that cannot reach the quorum resource takes its resources offline, and the node that controls the quorum resource brings those resources online.
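This split-brain resolution can be reduced to a very small sketch. The function below is illustrative only (it is not the Cluster Service arbitration protocol, which involves SCSI reservations on the quorum disk): whichever node wins control of the quorum resource keeps resources online, and every other partitioned node takes its resources offline.

```python
def arbitrate(nodes, quorum_owner):
    """Sketch of split-brain resolution: when partitioned nodes cannot
    reach each other, only the node that controls the quorum resource
    keeps resources online (names and return values are illustrative)."""
    decisions = {}
    for node in nodes:
        decisions[node] = "online" if node == quorum_owner else "offline"
    return decisions

# Two-node partition: NodeA wins arbitration for the quorum disk.
result = arbitrate(["NodeA", "NodeB"], quorum_owner="NodeA")
```

The invariant this models is the one the text describes: a resource is never online on two partitioned nodes at once.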
Global Update Manager
The Global Update Manager provides a single interface that other Cluster Service components use to initiate and manage updates. It propagates state changes to all active nodes in the cluster and guarantees that each update is atomic (all or none): either every active node applies the change, or none of them does. This global update service keeps the view of cluster state consistent across all members.
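The all-or-none property can be sketched as below. This is a simplified illustration, not the actual global update protocol Cluster Service uses: the update is applied to each active node in turn, and if any node cannot apply it, the nodes that already did are rolled back so no partial update survives.

```python
class Node:
    """Minimal stand-in for a cluster node holding replicated state."""
    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.state = name, healthy, {}

def global_update(nodes, key, value):
    """All-or-none update sketch (illustrative, not the real protocol):
    apply the change to every node; if any node fails to apply it,
    undo it on the nodes that already did."""
    applied = []
    for node in nodes:
        if not node.healthy:
            for done in applied:       # roll back so *no* node keeps the change
                del done.state[key]
            return False
        node.state[key] = value
        applied.append(node)
    return True

# One unhealthy node forces the whole update to be abandoned.
nodes = [Node("A"), Node("B", healthy=False)]
ok = global_update(nodes, "GroupCount", 4)
```

After the failed attempt, neither node's state contains the change, which is exactly the atomicity guarantee described above.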
Log Manager
The Log Manager writes changes to recovery logs stored on the quorum resource. The recovery logs—also known as quorum logs—contain the transactions that have been made against the quorum.
Membership Manager
The Membership Manager tracks and manages cluster membership. When a node fails, the Membership Manager triggers a regroup event, causing all remaining nodes to update their membership lists. When a failed node comes back online, the membership lists get updated, reflecting the availability of the node. Also, depending on your configuration, control of the resources originally managed on this node may be returned to the node.
Node Manager
The Node Manager assigns resource group control to nodes based on group preference lists and node availability. All node managers communicate to detect failure in the cluster using heartbeat messages that are delivered by the Communications Manager.
The Node Manager works closely with the Membership Manager. When a node fails, the Node Manager tells the Membership Manager to trigger a regroup event. This causes each node in the cluster to update its view of the current cluster membership. When a failed node comes back online, a regroup event is triggered to refresh the membership list.
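The effect of a regroup event on each node's view can be sketched in a few lines. The function is hypothetical; it simply shows that every surviving node rebuilds the same membership list when a node fails or rejoins.

```python
def regroup(membership, failed=None, rejoined=None):
    """Sketch of a regroup event (hypothetical names): rebuild the
    membership list after a node failure or rejoin, so every node
    converges on the same view of the cluster."""
    members = set(membership)
    if failed:
        members.discard(failed)
    if rejoined:
        members.add(rejoined)
    return sorted(members)

# NodeC fails, then later rejoins the cluster.
view_after_failure = regroup(["NodeA", "NodeB", "NodeC"], failed="NodeC")
view_after_rejoin = regroup(view_after_failure, rejoined="NodeC")
```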
Object Manager
The Object Manager maintains an in-memory database of all cluster objects, such as nodes, groups, and resources. It uses this database to manage Cluster Service objects by creating, searching, enumerating, and maintaining reference count objects of different types.
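Reference counting of cluster objects can be illustrated with a small in-memory table. The class and method names here are invented for the sketch: opening an object increments its count, closing decrements it, and the object is removed from the table only when the last reference is released.

```python
class ObjectManager:
    """Sketch (hypothetical names) of an in-memory table of
    reference-counted cluster objects such as nodes, groups, and
    resources."""

    def __init__(self):
        self.objects = {}  # name -> [object, refcount]

    def create(self, name, obj):
        self.objects[name] = [obj, 1]   # creator holds the first reference

    def open(self, name):
        entry = self.objects[name]
        entry[1] += 1                   # each open adds a reference
        return entry[0]

    def close(self, name):
        entry = self.objects[name]
        entry[1] -= 1
        if entry[1] == 0:
            del self.objects[name]      # last reference released: destroy

om = ObjectManager()
om.create("GroupA", {"type": "group"})
om.open("GroupA")    # a second component references the group
om.close("GroupA")   # first release: object survives
om.close("GroupA")   # final release: object is destroyed
```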
Resource Manager
The Resource Manager is responsible for managing cluster resources and their dependencies. It oversees resources and initiates appropriate actions, such as starting and stopping them. It is also responsible for initiating resource group failover, and it receives resource and cluster state information from the Resource Monitors and the Node Manager.
Resource Monitors
A Resource Monitor provides a communication mechanism between Cluster Service and a resource DLL. By default, a single Resource Monitor is enabled per node. However, you can optionally enable additional Resource Monitors. Resource Monitors each run in their own process and communicate with Cluster Service through RPCs. In addition, Resource Monitors enable resources to run separately from other Cluster Service resources. This design protects Cluster Service from individual failures among the cluster resources.
Resource Monitors verify that each resource in the cluster is operating properly by making callbacks into the resource DLLs. They monitor the state of the resources and notify Cluster Service of any changes. To verify the health and availability of a resource, a Resource Monitor periodically calls the LooksAlive (quick check) and IsAlive (thorough check) entry points implemented by the resource DLL.
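The two-tier polling pattern can be sketched as follows. The tick counts and the ratio of quick to thorough checks below are invented for illustration; the idea is that the cheap LooksAlive check runs on every polling tick, the more expensive IsAlive check runs less often, and a failure of either check is reported immediately.

```python
def poll_resource(looks_alive, is_alive, checks, is_alive_every=4):
    """Sketch of Resource Monitor polling (intervals invented for
    illustration): run the cheap LooksAlive check every tick and the
    thorough IsAlive check every few ticks; report the first failure."""
    for tick in range(checks):
        if not looks_alive():
            return "failed"
        if tick % is_alive_every == 0 and not is_alive():
            return "failed"
    return "online"

# A resource whose thorough check fails on the second IsAlive call
# (tick 4), even though the quick check keeps passing.
is_alive_results = iter([True, False])
state = poll_resource(lambda: True, lambda: next(is_alive_results), checks=8)
```

Splitting the checks this way keeps routine monitoring cheap while still catching failures that only a deeper probe can detect.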
Resource DLLs
A resource DLL provides an interface for Cluster Service to communicate with the various types of applications it supports, including cluster management applications, cluster-aware applications, and cluster-unaware applications. For example, Cluster Service uses resource DLLs, through Resource Monitors, to bring resources online and to monitor their health. All resource DLLs provided by Microsoft for cluster-aware applications share a single Resource Monitor. Third-party resource DLLs typically run in a separate Resource Monitor of their own.
Cluster management application These are applications that make calls to Cluster Service using the cluster APIs, such as Cluster Administrator.
Cluster-aware applications These are applications that run on a node in the cluster and take advantage of the features provided by Cluster Service. One important advantage they offer is the ability to fail over to other nodes in the cluster in the event of a failure. Because they can use their own application-specific resource DLLs, they are considered cluster-aware. They use the Cluster API to update and request cluster information.
There are two types of cluster-aware applications: those that interact with the cluster (and not cluster resources) and those that are managed as cluster resources. The Cluster Administrator application is an example of a cluster-aware application that interacts with the cluster. Exchange 2000 is an example of a cluster-aware application that is managed as a cluster resource.
Cluster-unaware applications Like cluster-aware applications, cluster-unaware applications run on a node in the cluster and use a resource DLL to communicate with Cluster Service. However, this DLL is the default resource DLL that comes with Cluster Service. These applications do not use the cluster API and are not aware of the cluster. Therefore, they do not take full advantage of the features of Cluster Service. Because cluster-unaware applications communicate with Cluster Service through a resource DLL, they can potentially be moved to another node in the event of a failover.
How the Nodes Communicate
The nodes in a cluster can communicate in three ways:
Remote Procedure Calls Cluster Service uses RPCs on IP sockets with UDP packets to communicate information between the nodes that are active and online. For example, changes to a resource on one node are communicated to the other nodes by using an RPC. If a message must be sent to a node that is offline, a different communication method must be used.
Cluster heartbeats The nodes of a cluster verify that the others are online and active by periodically transmitting datagrams to each other, as shown in Figure 1.3. These datagrams, called heartbeats, are single UDP packets sent periodically by the nodes' Node Managers, and they confirm that a node's network interface is still active. If a node repeatedly fails to respond to heartbeats, it is considered to have failed and is marked as unavailable.
The first server in a cluster to come online is, by default, responsible for sending heartbeats to the other nodes. However, this process begins only when another node joins the cluster. The other nodes are responsible for replying to each heartbeat transmitted by the original node.
Figure 1.3 Cluster node heartbeats
The first node sends heartbeats approximately every 0.5 seconds. The second node typically responds to each heartbeat within 0.2 seconds. Each heartbeat datagram is 48 bytes in size.
If a node fails to respond to a heartbeat, the original node begins a process of sending 18 heartbeats to the perceived failed node as follows:
Four heartbeats at approximately 0.7-second intervals
Three heartbeats within the next approximately 0.75 seconds
Two heartbeats at approximately 0.3-second intervals
Five heartbeats within the next approximately 0.9 seconds
Two heartbeats at approximately 0.3-second intervals
Two heartbeats approximately 0.3 seconds later
If the second node fails to respond to any of these heartbeats, it is marked as failed. The total time for the above process to complete is approximately 5.3 seconds.
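The retry sequence above can be modeled in code. Note that the per-heartbeat intervals in the schedule below are approximations derived from the text (for example, three heartbeats within about 0.75 seconds is modeled as 0.25-second spacing), so this sketch captures only the structure of the process, not the exact timings:

```python
def detect_failure(respond, schedule):
    """Sketch of the 18-heartbeat retry sequence: after a missed
    heartbeat, send each burst of retries; if the peer answers any
    retry it is still alive, otherwise mark it failed."""
    sent = 0
    for count, interval in schedule:  # (heartbeats in burst, approx. spacing in s)
        for _ in range(count):
            sent += 1
            if respond():
                return "alive", sent
    return "failed", sent

# The schedule from the text, as (count, approximate interval) pairs.
retry_schedule = [(4, 0.7), (3, 0.25), (2, 0.3), (5, 0.18), (2, 0.3), (2, 0.3)]
status, sent = detect_failure(lambda: False, retry_schedule)
```

A peer that answers any retry in the sequence is treated as alive; only a peer that misses all 18 heartbeats is marked as failed.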
If the first node to come online fails, the second node begins the process described above within approximately 0.7 seconds of the last heartbeat received from the first node.
Quorum resource If a node is offline when a Cluster Service configuration change is made, the changes are stored in the quorum log on the quorum resource. These changes are then made available when the offline node is brought online. As a result, the quorum resource provides a third level of communication for nodes in a cluster.
Lesson Summary
In this lesson, you learned about each Cluster Service component and the function it performs in managing and maintaining a cluster. These components are
Checkpoint Manager
Communications Manager
Configuration Database Manager
Event Processor
Event Log Manager
Failover Manager
Global Update Manager
Log Manager
Membership Manager
Node Manager
Object Manager
Resource Manager
Resource Monitors
Resource DLLs
This lesson also explained the difference between cluster-aware and cluster-unaware applications and how each interacts with the cluster in the event of a failure. Finally, this lesson described the three methods of communication between the nodes in a cluster: RPCs, cluster heartbeats, and the quorum resource.