The High-Availability Tango: Why Bother?

Picture this: It's 3 AM, and your primary server decides it's had enough and goes down faster than a lead balloon. Without a proper HA setup, you're in for a world of hurt. That's where our dynamic duo comes in:

  • Keepalived: The watchful guardian that manages our virtual IP.
  • Nginx: Our trusty load balancer and reverse proxy.

Together, they ensure that even if one server throws a tantrum, another steps in seamlessly. It's like having a stunt double for your server - the show must go on!

Setting the Stage: What You'll Need

Before we jump into the nitty-gritty, let's make sure we have all our ducks in a row:

  • Two or more servers (let's call them node1 and node2)
  • A virtual IP address (VIP) that will float between our servers
  • Root access (because we're doing some serious business here)
  • A basic understanding of Linux and networking (if you can spell IP, you're halfway there)

Step 1: Installing Our Star Players

First things first, let's get Nginx and Keepalived installed on both of our nodes. We'll assume you're using a Debian-based system because, well, we have to start somewhere.


sudo apt update
sudo apt install nginx keepalived

Easy peasy, lemon squeezy. Now we have the tools, let's put them to work!
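Before moving on, it may be worth confirming that both binaries actually landed on each node. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not something either package ships. On Debian both binaries live in /usr/sbin, so run it as root (or with sudo) to get a meaningful answer.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}

# Report what is still missing on this node.
missing=$(missing_tools nginx keepalived)
if [ -n "$missing" ]; then
    echo "Not installed yet: $missing"
else
    echo "nginx and keepalived are both installed"
fi
```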

Step 2: Configuring Nginx - The Load Balancing Maestro

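Nginx will be our frontline warrior, distributing incoming requests and ensuring smooth sailing. Let's set it up as a load balancer. On Debian, the http context already exists in /etc/nginx/nginx.conf, so merge the upstream and server blocks below into it rather than adding a second http block: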


http {
    upstream backend {
        server 192.168.1.10:8080;
        server 192.168.1.11:8080;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}

This configuration tells Nginx to distribute traffic between our two backend servers, using round-robin by default. Don't forget to replace the IP addresses with your actual backend IPs, and apply the same configuration on both nodes!
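A typo in the configuration can take the load balancer down on the next reload, so it helps to test the file first. Here is a small sketch of that habit: `nginx -t` and `systemctl reload nginx` are the real commands, while `reload_if_valid` and the two override variables are hypothetical names added so the logic can be dry-run.

```shell
#!/bin/sh
# Hypothetical helper: reload Nginx only if the configuration parses.
# NGINX_BIN and RELOAD_CMD can be overridden for a dry run.
NGINX_BIN="${NGINX_BIN:-nginx}"
RELOAD_CMD="${RELOAD_CMD:-systemctl reload nginx}"

reload_if_valid() {
    # "nginx -t" parses the configuration without applying it.
    if $NGINX_BIN -t; then
        # Left unquoted on purpose so the command and its arguments split.
        $RELOAD_CMD
    else
        echo "config test failed; not reloading" >&2
        return 1
    fi
}
```

Run `reload_if_valid` as root after each edit; if the test fails, the running Nginx keeps serving with its previous configuration.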

Step 3: Keepalived Configuration - The High-Availability Puppet Master

Now for the real magic - Keepalived. This bad boy will manage our virtual IP and ensure that it always points to a healthy server. Let's configure it on both nodes:

On node1 (the master):


vrrp_script chk_nginx {
    script "pidof nginx"
    interval 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass supersecretpassword
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_nginx
    }
}

On node2 (the backup):


vrrp_script chk_nginx {
    script "pidof nginx"
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass supersecretpassword
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_nginx
    }
}

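The main differences here are the state and priority settings: node1 is set as MASTER with a higher priority, while node2 is the BACKUP with a lower priority. Everything else, including virtual_router_id and auth_pass, must match on both nodes. Replace eth0 with the actual name of your network interface, and keep auth_pass to eight characters or fewer, since Keepalived ignores anything beyond that.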

Step 4: Starting the Show

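Time to bring our creation to life! On Debian, installing the packages usually starts the services right away, so restart them on both nodes to pick up the new configuration:


sudo systemctl restart nginx
sudo systemctl restart keepalived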

If all goes well, you should now have a functioning high-availability setup. The virtual IP (192.168.1.100 in our example) will be assigned to the master node.
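To see at a glance whether a given node holds the VIP, the output of `ip` can be filtered. This is a rough sketch: `has_vip` is a hypothetical helper, and the interface name and address are the ones from the example configuration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given address appears in the
# `ip -o -4 addr show` output supplied on stdin.
has_vip() {
    # -F treats the pattern as a fixed string, so the dots are literal;
    # the trailing "/" keeps 192.168.1.10 from matching 192.168.1.100.
    grep -Fq "inet $1/"
}

# Canned sample line so the sketch runs anywhere. On a real node, use:
#   ip -o -4 addr show dev eth0 | has_vip 192.168.1.100
sample='2: eth0    inet 192.168.1.100/32 scope global eth0'
printf '%s\n' "$sample" | has_vip 192.168.1.100 && echo "VIP is here"
```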

Step 5: Testing - Because Trust, but Verify

Now for the moment of truth. Let's make sure our setup can handle a server going down:

  1. Check which node currently holds the virtual IP:

ip addr show eth0

  2. On the active node, stop Nginx:

sudo systemctl stop nginx

  3. Check the IP assignment again - it should have moved to the other node!

If everything worked as expected, congratulations! You've just set up a basic but effective high-availability cluster.

Diving Deeper: Fine-tuning and Advanced Concepts

Now that we have a working setup, let's explore some ways to make it even better:

1. Customizing Health Checks

Instead of just checking if Nginx is running, we can create more sophisticated health checks:


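#!/bin/bash
# /etc/keepalived/check_nginx.sh
# Exit 0 if Nginx is running and answering HTTP locally, non-zero otherwise.

# pidof matches on the process name only, so this script's own name
# (which contains "nginx") cannot produce a false positive.
if ! pidof nginx > /dev/null; then
    exit 1
fi

# -f makes curl fail on HTTP errors (4xx/5xx); --max-time keeps the check
# shorter than the 2-second interval used in this guide.
# The script's exit status is curl's exit status.
curl -fs --max-time 1 -o /dev/null http://localhost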

Make the script executable (sudo chmod +x /etc/keepalived/check_nginx.sh), then update your Keepalived configuration to use it:


vrrp_script chk_nginx {
    script "/etc/keepalived/check_nginx.sh"
    interval 2
    weight 2
}
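The weight setting changes how a failed check plays out. With no weight, as in Step 3, a failing script puts the instance into the FAULT state. With a positive weight, a passing script adds that value to the node's priority, so the weight needs to be larger than the priority gap between the nodes for failover to happen. Below is a simplified model of that arithmetic using the priorities from this guide (101 and 100); `effective_priority` is a hypothetical name, and the model ignores Keepalived's rise and fall counters.

```shell
#!/bin/sh
# Simplified model: a passing check adds a positive weight to the
# base priority; a failing check leaves the base priority unchanged.
effective_priority() {
    base=$1; weight=$2; check_ok=$3   # check_ok: 1 = script exited 0
    if [ "$check_ok" -eq 1 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

echo "node1 healthy: $(effective_priority 101 2 1)"   # 103
echo "node2 healthy: $(effective_priority 100 2 1)"   # 102
echo "node1 failing: $(effective_priority 101 2 0)"   # 101, below node2's 102
```

When node1's check fails, its priority drops to 101 while node2 stays at 102, so node2 takes over the VIP.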

2. Implementing Notification Scripts

Want to know when failover occurs? Let's add a notification script:


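#!/bin/bash
# /etc/keepalived/notify.sh
# Keepalived invokes notify scripts with four arguments:
#   $1 = "GROUP" or "INSTANCE", $2 = group/instance name,
#   $3 = new state, $4 = priority

case "$3" in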
    "MASTER")
        echo "$(date) - Became MASTER" >> /var/log/keepalived.log
        # Add your notification logic here (e.g., send an email or Slack message)
        ;;
    "BACKUP")
        echo "$(date) - Became BACKUP" >> /var/log/keepalived.log
        ;;
    "FAULT")
        echo "$(date) - Entered FAULT state" >> /var/log/keepalived.log
        ;;
esac

Add this to your Keepalived configuration:


vrrp_instance VI_1 {
    ...
    notify /etc/keepalived/notify.sh
    ...
}

3. Multiple Virtual IPs

Need to manage multiple services? You can set up multiple virtual IPs:


vrrp_instance VI_1 {
    ...
    virtual_ipaddress {
        192.168.1.100
        192.168.1.101
        192.168.1.102
    }
    ...
}

Common Pitfalls and How to Avoid Them

Even the best-laid plans can go awry. Here are some common issues and how to tackle them:

1. The Split-Brain Syndrome

If your nodes can't communicate (for example, because VRRP advertisements are dropped between them), they might both think they're the master and both claim the VIP. To prevent this:

  • Use a dedicated network for Keepalived communication
  • Implement fencing mechanisms (like STONITH - Shoot The Other Node In The Head)
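Detecting the condition is a useful first step even before any fencing is in place. The sketch below assumes you already collect each node's current state somewhere central (for instance from a notification log or a per-node VIP check). That collection step and the `count_masters` name are assumptions, not Keepalived features.

```shell
#!/bin/sh
# Hypothetical check: count how many of the reported node states are
# MASTER. More than one suggests split-brain; zero means nobody holds the VIP.
count_masters() {
    n=0
    for state in "$@"; do
        if [ "$state" = "MASTER" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example with two nodes that both report MASTER.
masters=$(count_masters MASTER MASTER)
if [ "$masters" -gt 1 ]; then
    echo "split-brain suspected: $masters nodes claim MASTER"
fi
```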

2. Inconsistent Configurations

Keep your Nginx configuration identical across all nodes, and keep the Keepalived configurations identical apart from the intentional differences (state and priority). Consider using configuration management tools like Ansible to maintain consistency.

3. Firewall Woes

Make sure your firewall allows VRRP traffic (protocol 112) between your nodes:


sudo iptables -A INPUT -p vrrp -j ACCEPT
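The rule above accepts VRRP from any source. A tighter variant could accept it only from the peer node and only for the VRRP multicast group (224.0.0.18), which is what the default, non-unicast setup in this guide uses. The sketch below only prints the command so it can be reviewed before being run; `vrrp_rule` is a hypothetical helper and 192.168.1.2 is an assumed address for the peer node.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an iptables rule that accepts
# VRRP only from one peer, addressed to the VRRP multicast group.
vrrp_rule() {
    peer=$1
    echo "iptables -A INPUT -p vrrp -s $peer -d 224.0.0.18 -j ACCEPT"
}

# 192.168.1.2 is a placeholder for the other node's real address.
vrrp_rule 192.168.1.2
```

Run the printed command with sudo on each node, substituting the other node's address.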

Taking It to the Next Level: Container Orchestration

Ready for the big leagues? Consider integrating your HA setup with container orchestration systems like Kubernetes. While Kubernetes has its own HA mechanisms, Keepalived can still play a role in managing external access to your cluster.

Wrapping Up: The Path to Five Nines

Congratulations! You've taken a significant step towards achieving the coveted "five nines" of uptime. Remember, high availability is not a set-it-and-forget-it solution. Regular testing, monitoring, and maintenance are crucial to ensuring your setup remains robust.

Some key takeaways:

  • Always test your failover mechanisms regularly
  • Monitor your Keepalived and Nginx logs for any anomalies
  • Keep your configurations in version control
  • Document your setup and failover procedures for your team

With Keepalived and Nginx in your arsenal, you're well-equipped to tackle the challenges of high availability. Sweet dreams, and may your servers never sleep!

"The best way to avoid failure is to fail constantly." - Netflix

Now go forth and conquer those uptime metrics! And remember, in the world of high availability, paranoia is just good planning.