Heficed Floating IPs

What are floating IPs?

Floating IPs allow you to have IP redundancy in case of a system failure. This is achieved by monitoring your servers and automatically routing IP addresses to another server if an issue is detected.

You are provided with a pack of scripts and configuration options that use the Heficed Terminal API to achieve IP redundancy.

The floating IP solution uses Corosync / Pacemaker as a monitoring system and our API as an IP migration tool.

The system can migrate subnets of any size with the help of dynamic routing. Automatic IP migration between different locations is also possible for subnets of size /24 or larger.
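As a rough illustration of the size rule above, the sketch below checks whether a subnet qualifies for migration between locations. It assumes that "greater than or equal to /24" refers to subnet size, meaning a prefix length of 24 or shorter; the function name is hypothetical and is not part of the Heficed scripts.

```python
import ipaddress

# Assumption: a subnet qualifies when it is a /24 or larger block,
# i.e. its prefix length is 24 or shorter (/24, /23, /22, ...).
MAX_PREFIX_FOR_CROSS_LOCATION = 24


def can_migrate_between_locations(subnet: str) -> bool:
    """Return True if the IPv4 subnet is large enough for cross-location migration."""
    network = ipaddress.IPv4Network(subnet, strict=True)
    return network.prefixlen <= MAX_PREFIX_FOR_CROSS_LOCATION
```

For example, `can_migrate_between_locations("203.0.113.0/24")` is true under this assumption, while a /25 carved out of the same range is not.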

Follow the instructions below to learn how to configure and use the Floating IP solution:

STEP 1. Prerequisites

To start, you need at least two active machines in our infrastructure with CentOS installed. That could be Cloud Hosting or Bare Metal servers.

You also need a subnet of IP addresses to use as the floating subnet.

Once you have the necessary servers ready, gather information about them from your Terminal:

  • Subnet address – the subnet you want to be floating
  • Subnet CIDR – floating subnet mask in CIDR
  • Hostnames – the hostname shown in the Terminal for each machine you will use in the High Availability (HA) cluster
  • Product type – the product type of each machine you will use in the HA cluster (Cloud Hosting or Bare Metal)
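Before entering the subnet address and CIDR into any configuration, it can help to confirm that the two values describe a valid network. The following is a minimal sketch using Python's standard ipaddress module; the function name is hypothetical and the addresses are documentation examples, not real Heficed ranges.

```python
import ipaddress


def floating_subnet(address: str, cidr: int) -> ipaddress.IPv4Network:
    """Combine a subnet address and CIDR mask into a network object.

    Raises ValueError if the address is malformed or has host bits set,
    for example 203.0.113.5 with a /24 mask.
    """
    return ipaddress.IPv4Network(f"{address}/{cidr}", strict=True)
```

The returned object exposes `num_addresses`, which is a quick way to confirm that the subnet is the size you expect.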

Lastly, since you will be using the Heficed API, you need to acquire these specific variables:

  • Tenant ID
  • Client ID
  • Client Secret

You can learn how to obtain these values in our API documentation.

STEP 2. Installation

Reassignment script

First, download the scripts that are required for the Floating IP solution.

Next, set up the subnet reassignment script and its configuration.

NOTE: The following steps use our chosen file paths. Feel free to adjust the paths of the Python script and its configuration, but make sure to change them in every script that references them.
The Python Requests library must be installed on your servers.
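One way to confirm that the Requests library is available to the Python interpreter on each server is sketched below. The helper name is hypothetical; it relies only on the standard importlib module and does not import Requests itself.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


# Report whether the dependency of the reassignment script is present.
print("requests installed:", module_available("requests"))
```

If this prints `False`, install the library with your package manager or pip before continuing.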

Make a new directory for the scripts on both machines:

mkdir /opt/floatingIP

Place assign-ip.py and api.conf in the following location on both machines:

/opt/floatingIP/

Proceed to edit and fill the api.conf with the required values. All values should be the same on both servers except the hostname and product type.
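The rule that only the hostname and product type may differ between the two servers can be checked mechanically. The sketch below compares two configurations that have already been loaded into dictionaries. The key names are hypothetical, since the actual api.conf format is not shown here, so adapt them to the real file.

```python
# Hypothetical key names; the real api.conf may use different ones.
NODE_SPECIFIC_KEYS = {"hostname", "product_type"}


def unexpected_differences(conf_a: dict, conf_b: dict) -> set:
    """Return the keys that differ between nodes but should be identical."""
    all_keys = set(conf_a) | set(conf_b)
    return {
        key
        for key in all_keys
        if key not in NODE_SPECIFIC_KEYS and conf_a.get(key) != conf_b.get(key)
    }
```

An empty result means the two files agree on every shared value.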

Install Corosync, Pacemaker and PCS

The next step is to install Corosync, Pacemaker, and PCS on your machines.

Install the software packages on both machines:

yum install pacemaker pcs

The PCS utility creates a new system user during the installation named hacluster with a disabled password. We need to define a password for this user on both servers. This is required for successful PCS synchronization and subnet migration between cluster nodes.

On both machines, run:

passwd hacluster

NOTE: Please use the same password on both machines. This password will also be required in further configuration steps.

Set Up the Cluster

Once Corosync, Pacemaker, and PCS are installed on both servers, you can set up the cluster. To enable and start the PCS daemon, run the following on both machines:

systemctl enable pcsd.service
systemctl start pcsd.service

Authenticate the cluster nodes using the username hacluster and the same password you defined in the previous step. You will need to enter the primary IP address for each node. From the primary machine, run:

pcs cluster auth first_machine_primary_IP_address second_machine_primary_IP_address

The output should look like this:

Username: hacluster
Password: 
first_machine_primary_IP_address: Authorized
second_machine_primary_IP_address: Authorized

On the primary machine, generate the Corosync configuration file by running:

pcs cluster setup --name webcluster first_machine_primary_IP_address second_machine_primary_IP_address

The output will look like this:

Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop pacemaker.service
Redirecting to /bin/systemctl stop corosync.service
Killing any remaining services...
Removing all cluster configuration files...
first_machine_primary_IP_address: Succeeded
second_machine_primary_IP_address: Succeeded
Synchronizing pcsd certificates on nodes first_machine_primary_IP_address, second_machine_primary_IP_address...
first_machine_primary_IP_address: Success
second_machine_primary_IP_address: Success

Restaring pcsd on the nodes in order to reload the certificates...
first_machine_primary_IP_address: Success
second_machine_primary_IP_address: Success

The new configuration file will be generated at /etc/corosync/corosync.conf based on the parameters provided to the pcs cluster setup command. In this example, the cluster name was webcluster, but you can choose any name you want.

Next, you need to start your cluster. Run the following command from the primary machine:

pcs cluster start --all

Output:

first_machine_primary_IP_address: Starting Cluster...
second_machine_primary_IP_address: Starting Cluster...

You can check if both nodes have connected to the cluster by running the following command on any of the cluster servers:

pcs status corosync

Output:

Membership information
----------------------
 Nodeid Votes Name
 2 1 secondary_private_IP_address
 1 1 primary_private_IP_address (local)
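If you want to check the membership from a script rather than by eye, the output above can be parsed as in the following sketch. It assumes the layout shown in this article (node ID, votes, name per row); the function name is hypothetical and other pcs versions may format the table differently.

```python
SAMPLE = """\
Membership information
----------------------
 Nodeid Votes Name
 2 1 secondary_private_IP_address
 1 1 primary_private_IP_address (local)
"""


def parse_membership(output: str) -> list:
    """Extract (node_id, votes, name) tuples from `pcs status corosync` output."""
    members = []
    for line in output.splitlines():
        fields = line.split()
        # Member rows start with two integers: the node ID and its vote count.
        if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
            members.append((int(fields[0]), int(fields[1]), fields[2]))
    return members
```

A two-node cluster is fully connected when the returned list has two entries.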

To get more information about the current status of the cluster, run:

pcs cluster status

The output should be similar to this:

 Last updated: Fri Dec 11 11:59:09 2015 Last change: Fri Dec 11 11:59:00 2015 by hacluster via crmd on secondary
 Stack: corosync
 Current DC: secondary (version 1.1.13-a14efad) - partition with quorum
 2 nodes and 0 resources configured
 Online: [ primary secondary ]

PCSD Status:
 primary (primary_private_IP_address): Online
 secondary (secondary_private_IP_address): Online

Now, enable the Corosync and Pacemaker services so that they start on system boot. Run the following on both machines:

systemctl enable corosync.service
systemctl enable pacemaker.service

In our configuration, we recommend disabling STONITH (Shoot The Other Node In The Head). Run the following command on one of the machines:

pcs property set stonith-enabled=false

Create a Floating IP Reassignment Resource Agent

The last thing you need to configure is the resource agent that will execute the IP reassignment script when a failure is detected in the primary cluster node.

The resource agent is responsible for creating an interface between the cluster and the resource itself. In this case, the resource is the assign-ip.py script. The cluster requires the resource agent to execute the right procedures when given a start, stop or monitor command.
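To make the start, stop, and monitor contract more concrete, here is a toy model of the dispatch logic, written in Python purely for illustration. It is not the floatip agent shipped with the solution: it keeps its state in memory instead of calling assign-ip.py, and the class name is hypothetical. The numeric exit codes are the standard OCF values.

```python
# Standard OCF exit codes used by resource agents.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


class ToyFloatIPAgent:
    """Illustrative agent that tracks state in memory only."""

    def __init__(self):
        self.active = False

    def start(self):
        # A real agent would trigger the IP reassignment at this point.
        self.active = True
        return OCF_SUCCESS

    def stop(self):
        self.active = False
        return OCF_SUCCESS

    def monitor(self):
        # A cleanly stopped resource is reported as "not running", not as an error.
        return OCF_SUCCESS if self.active else OCF_NOT_RUNNING

    def dispatch(self, action):
        handlers = {"start": self.start, "stop": self.stop, "monitor": self.monitor}
        handler = handlers.get(action)
        if handler is None:
            return OCF_ERR_UNIMPLEMENTED
        return handler()
```

The cluster decides what to do next based only on these return codes, which is why a real agent must report them accurately.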

The resource agent in this example follows the OCF (Open Cluster Framework) standard. We will create a new OCF resource agent to manage the assign-ip.py service on both machines.

First, create the directory that will contain the resource agent. The directory name will be used by Pacemaker as an identifier for this custom agent.

Run the following on both machines:

mkdir /usr/lib/ocf/resource.d/heficed

Next, take the floatip resource agent script and place it in the newly created directory on both machines:

/usr/lib/ocf/resource.d/heficed/

Now, make the script executable using the following command on both machines:

chmod +x /usr/lib/ocf/resource.d/heficed/floatip

Next, register the resource agent within the cluster using the PCS utility. The following command should be executed from one of the nodes:

pcs resource create FloatIP ocf:heficed:floatip

The resource should now be registered and active in the cluster. You can check the registered resources from any of the nodes using the following command:

pcs status

Output:

...
2 nodes and 1 resource configured

Online: [ primary secondary ]

Full list of resources:

 FloatIP (ocf::heficed:floatip): Started primary

...
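For automated checks, the resource line in this output can be parsed to find out which node currently holds the floating IP. The sketch below assumes the line format shown above and that the agent specification starts with a class name followed by a colon (such as ocf:); the function name is hypothetical.

```python
import re

SAMPLE = """\
2 nodes and 1 resource configured

Online: [ primary secondary ]

Full list of resources:

 FloatIP (ocf::heficed:floatip): Started primary

PCSD Status:
 primary (primary_private_IP_address): Online
"""

# name (class:...agent): State [node]
RESOURCE_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+\((?P<agent>\w+:[^)]+)\):\s+(?P<state>\S+)(?:\s+(?P<node>\S+))?"
)


def resource_states(output: str) -> dict:
    """Map each resource name to a (state, node) tuple; node is None if absent."""
    states = {}
    for line in output.splitlines():
        match = RESOURCE_LINE.match(line)
        if match:
            states[match.group("name")] = (match.group("state"), match.group("node"))
    return states
```

Lines such as the PCSD status entries are skipped because the text in parentheses does not look like an agent specification.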

STEP 3. Test the system

To test whether the system is working, run the floatip script in bash with command tracing enabled (the -x option):

bash -x /usr/lib/ocf/resource.d/heficed/floatip $command

$command is the action that the HA system passes to the script. The script must support these four commands:

  • start – start the resource
  • stop – stop the resource
  • monitor – monitor the health of the resource
  • meta-data – provide information about this resource as an XML snippet

To check the status code after the script completes, enter:

echo $?

More information about OCF resource agents can be found in the OCF resource agent documentation.

If the script returns the correct codes and does not show any errors, the system should work correctly. Otherwise, please debug as needed.
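When judging whether a code is "correct", it helps to translate the number into its OCF name. The sketch below maps the standard OCF exit codes to their names and wraps the manual test from this step in a function. The helper names are hypothetical, and `run_action` assumes the agent can be run directly with bash, as in the command above.

```python
import subprocess

# Standard OCF exit codes and their conventional names.
OCF_EXIT_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}


def describe_exit_code(code: int) -> str:
    """Translate a numeric exit code into its OCF name."""
    return OCF_EXIT_CODES.get(code, f"unknown ({code})")


def run_action(agent_path: str, action: str) -> int:
    """Run one agent action and return its exit code (the value of `echo $?`)."""
    completed = subprocess.run(["bash", agent_path, action], check=False)
    return completed.returncode
```

For instance, `describe_exit_code(run_action("/usr/lib/ocf/resource.d/heficed/floatip", "monitor"))` reports the monitor result by name. Note that OCF_NOT_RUNNING from monitor is the expected answer on a node where the resource is stopped, so it does not indicate a failure on its own.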

If you encounter any difficulties or have any further questions, feel free to contact our Customer Support team by submitting a ticket via your Terminal or by emailing us directly at support@heficed.com.
