How to configure MySQL High Availability with DRBD & Heartbeat on CentOS
In this tutorial, we’ll go through the process of installing, configuring and testing DRBD, Heartbeat and MySQL in a two-node cluster. This is a general configuration intended for learning.
Requirements:
This setup needs two servers. Before starting this step-by-step process, you should be comfortable with installing and configuring MySQL and other software packages.
I am using the CentOS 5.5 distribution for this tutorial. The two servers run as VirtualBox virtual machines, which is convenient for testing this type of setup.
Server 1:
Host Name: CentOSvm1
IP Address eth0: 192.168.15.10/24
Server 2:
Host Name: CentOSvm2
IP Address eth0: 192.168.15.11/24
1. Installation
yum install kmod-drbd heartbeat mysql mysql-devel mysql-server
Run this on both nodes. It installs DRBD, the DRBD kernel module, Heartbeat and MySQL, which will work just fine for learning DRBD and Heartbeat. Now that all our software is installed, we can start the configuration.
2. Configuring DRBD
We’re ready to start configuring DRBD on our two node system. Centosvm1 will become our primary node, making Centosvm2 our secondary node. These are our steps for configuring DRBD:
i. Create partitions on both nodes.
ii. Create drbd.conf
iii. Configure global drbd options.
iv. Configure resource which consists of:
o Disk partitions on Centosvm1 and Centosvm2.
o Network connection between nodes.
o Error handling
o Synchronization
3. On each node, use fdisk to create a type 83 linux partition:
fdisk /dev/sdb
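After creating the partition, do not create a filesystem on it (for example with mkfs.ext3); the filesystem is created later on the DRBD device. The partition you create here must be the one named on the disk line in drbd.conf; the sample configuration in this tutorial uses /dev/sda3.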
4. Editing the configuration file.
Now edit the drbd.conf file. It is located at /etc/drbd.conf; make sure you back it up first.
vi /etc/drbd.conf
# Here is a sample but working configuration file.
global {
    minor-count 1;
}
resource mysql {
    protocol C; # There are A, B and C protocols. Stick with C.
    # incon-degr-cmd "echo 'DRBD Degraded!' | wall; sleep 60 ; halt -f";
    # If a cluster starts up in degraded mode, it will echo a message to all
    # users. It'll wait 60 seconds then halt the system.
    on centosvm1 { # Must match the output of uname -n on this node.
        device /dev/drbd0; # The name of our drbd device.
        disk /dev/sda3; # Partition we wish drbd to use.
        address 192.168.15.10:7788; # Centosvm1 IP address and port number.
        meta-disk internal; # Stores metadata at the end of the disk partition.
    }
    on centosvm2 { # Must match the output of uname -n on this node.
        device /dev/drbd0; # Our drbd device, must match centosvm1.
        disk /dev/sda3; # Partition drbd should use.
        address 192.168.15.11:7788; # IP address of Centosvm2, and port number.
        meta-disk internal; # Stores metadata at the end of the disk partition.
    }
    disk {
        on-io-error detach; # What to do when the lower level device errors.
    }
    net {
        max-buffers 2048; # Data block buffers used before writing to disk.
        ko-count 4; # Peer is dead if this count is exceeded.
        #on-disconnect reconnect; # Peer disconnected, try to reconnect.
    }
    syncer {
        rate 10M; # Synchronization rate, in megabytes per second. Good for a 100 Mbit network.
        #group 1; # Used for grouping resources, parallel sync.
        al-extents 257; # Must be prime, number of active sets.
    }
    startup {
        wfc-timeout 0; # drbd init script will wait infinitely on resources.
        degr-wfc-timeout 120; # 2 minutes.
    }
}
# End of resource mysql
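DRBD picks the "on" section whose name matches the node's own host name as printed by uname -n, so a mismatch is an easy mistake to make (note that the servers above are listed as CentOSvm1 and CentOSvm2 while the file uses lowercase names). Here is a small sketch for checking this. The function names are made up for this example, and the comparison is assumed to be case-sensitive:

```shell
#!/bin/sh
# Hypothetical helpers, not part of DRBD.

# List the host names used in "on <host> {" sections of a DRBD config file.
# $1 is the path to the file, so this can be tried on a copy first.
drbd_conf_hosts() {
    sed -n 's/^[[:space:]]*on[[:space:]]\{1,\}\([^[:space:]{]\{1,\}\).*/\1/p' "$1"
}

# Succeed only if host $2 (default: this node's uname -n) has an "on" section
# in config file $1. The match is exact, so upper/lower case matters.
drbd_conf_has_host() {
    conf=$1
    host=${2:-$(uname -n)}
    drbd_conf_hosts "$conf" | grep -qxF -- "$host"
}
```

For example, drbd_conf_has_host /etc/drbd.conf || echo "no section for $(uname -n)" warns you before you try to bring the resource up. Running drbdadm dump all is another quick way to let DRBD itself parse the file.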
5. Bringing up DRBD
The software, drbd.conf and the partitions are now in place. Make sure only Centosvm1 is running. Log in as root, then issue the following command:
[root@Centosvm1 ~]# drbdadm create-md mysql
After that, reboot Centosvm1 and log in as root again. Issue the following command:
cat /proc/drbd
The output should show Centosvm1 in the secondary state; we will fix this later by promoting it to primary.
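If you would rather not read /proc/drbd by eye, the interesting fields can be pulled out with a few lines of awk. This is only a sketch: drbd_status is a made-up name, and it assumes the status line for a device looks like "0: cs:... ro:.../... ds:.../...", with the roles in "ro:" on DRBD 8.3 and in "st:" on older 8.x releases:

```shell
#!/bin/sh
# Hypothetical helper, not part of DRBD.
# Print "connection-state local-role peer-role" for one DRBD minor.
# Reads /proc/drbd-style text on stdin; $1 is the minor number (default 0).
drbd_status() {
    minor=${1:-0}
    awk -v m="$minor:" '
        $1 == m {
            cs = ""; roles = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)          # connection state
                if ($i ~ /^(ro|st):/) roles = substr($i, 4)  # local/peer roles
            }
            split(roles, r, "/")
            print cs, r[1], r[2]
        }'
}
```

Use it as drbd_status 0 < /proc/drbd. The second word is the role of the node you are on, the third is the role of the peer.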
5. Configuring the second server.
Now start up the second node, Centosvm2, and issue the following command:
[root@Centosvm2 ~]# drbdadm create-md mysql
When it completes, issue another command:
cat /proc/drbd
The output should now show both servers in the secondary state. Next we will promote Centosvm1 to primary.
6. Promoting first node to Primary
Log in to the first node, Centosvm1, as root and issue the following command:
[root@Centosvm1 ~]# drbdadm -- --overwrite-data-of-peer primary mysql
Now verify that the first node really was promoted to primary by running the following command again:
cat /proc/drbd
The output should now show Centosvm1 as primary and Centosvm2 as secondary.
You’ve now created a two-node cluster. It’s very basic, and failover is not automatic. We need to take care of that with Heartbeat. First, we need to test DRBD.
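Promoting with --overwrite-data-of-peer starts a full synchronization to the second node, which can take a while. A minimal sketch for waiting until it is finished is shown here. The function names and the default timings are assumptions for this example; it relies on DRBD reporting ds:UpToDate/UpToDate in /proc/drbd once both disks are in sync:

```shell
#!/bin/sh
# Hypothetical helpers, not part of DRBD.

# Succeed when minor $1 (default 0) reports both disks as UpToDate.
# Reads /proc/drbd-style text on stdin.
drbd_in_sync() {
    grep -E "^ *${1:-0}: " | grep -q 'ds:UpToDate/UpToDate'
}

# Poll a status file until the device is in sync or we run out of attempts.
# $1 minor (default 0), $2 attempts (default 60), $3 seconds between
# attempts (default 10), $4 status file (default /proc/drbd).
wait_for_sync() {
    minor=${1:-0}
    tries=${2:-60}
    delay=${3:-10}
    src=${4:-/proc/drbd}
    i=0
    while [ "$i" -lt "$tries" ]; do
        drbd_in_sync "$minor" < "$src" && return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, wait_for_sync 0 360 10 && echo "initial sync done" checks every 10 seconds for up to an hour.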
7. Testing DRBD
To have a working system, we need to create a filesystem on Centosvm1. We do that just like normal; the only difference is that we use the /dev/drbd0 device instead of /dev/sda3:
[root@Centosvm1 ~]# mkfs.ext3 -L mysql /dev/drbd0
If you run mkfs on Centosvm2 by mistake, you will get an error like this:
[root@Centosvm2 ~]# mkfs.ext3 /dev/drbd0
mke2fs 1.35 (28-Feb-2004)
mkfs.ext3: Wrong medium type while trying to determine filesystem size
That happens because Centosvm2 is secondary, and /dev/drbd0 cannot be written to on a secondary node. Switch to Centosvm1.
Once that’s done, we’ll do some simple tests. Create the mount point /mnt/mysql on both nodes. On Centosvm1, mount /dev/drbd0 on /mnt/mysql, change to that directory, then touch a few test files and create a directory. To check that the files have been replicated, we need to unmount /mnt/mysql, make Centosvm1 secondary, promote Centosvm2 to primary, mount /dev/drbd0 on /mnt/mysql there, and check that the files are on Centosvm2. These steps are:
[root@Centosvm1 ~]# umount /mnt/mysql
[root@Centosvm1 ~]# drbdadm secondary mysql
Switch to Centosvm2, then:
[root@Centosvm2 ~]# drbdadm primary mysql
[root@Centosvm2 ~]# mount /dev/drbd0 /mnt/mysql
Check /mnt/mysql and see what’s in there. You should see the files and directories you created on Centosvm1! You’ll probably notice we didn’t make a filesystem on Centosvm2 for /dev/drbd0. That’s because /dev/drbd0 is replicated, so when we created the filesystem on Centosvm1, it was also created on Centosvm2. As a matter of fact, anything we do on Centosvm1:/dev/drbd0 is automatically replicated to Centosvm2:/dev/drbd0.
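Looking at file names only tells you the files exist. To also confirm that the contents arrived intact, you can write a checksum list next to the test files and check it on the other node. This is just a sketch: the function names, file names and MANIFEST.md5 are all made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for the replication test.

# Create a few test files plus a checksum manifest inside directory $1.
# Run this on the primary node with the DRBD device mounted on $1.
make_test_files() {
    dir=$1
    mkdir -p "$dir/testdir" || return 1
    echo "written on $(uname -n)" > "$dir/test1.txt"
    echo "second test file" > "$dir/testdir/test2.txt"
    ( cd "$dir" && md5sum test1.txt testdir/test2.txt > MANIFEST.md5 )
}

# Verify the files in directory $1 against the manifest.
# Run this on the other node after it has been promoted and has mounted
# the device. Prints nothing; the exit status tells you the result.
verify_test_files() {
    ( cd "$1" && md5sum --status -c MANIFEST.md5 )
}
```

On Centosvm1 you would run make_test_files /mnt/mysql before unmounting, and on Centosvm2 verify_test_files /mnt/mysql && echo "replication OK" after mounting.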
Next, we’ll configure MySQL to use our DRBD device. We’ll practice manually failing MySQL over between nodes before automating it with Heartbeat. You want to make sure you understand how the entire system works before automation. That way, if there was a problem with our test files not showing up on Centosvm2, then we know there’s a problem with DRBD. If we tried to test the entire system as one large piece, it would be much more difficult to figure out which piece of the puzzle was giving us our problem. For practice, return Centosvm1 to primary node, and double check your files.
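When you repeat the manual switch a few times, it helps to keep the commands in a small script. The sketch below wraps the same umount, drbdadm and mount commands used above in two functions. The function names and the DRY_RUN switch are assumptions made for this example; by default the commands are only printed, so you can read them before running anything for real.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual role switch for the "mysql" resource.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
RESOURCE=mysql
DEVICE=/dev/drbd0
MOUNTPOINT=/mnt/mysql

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the node that is currently primary.
release_primary() {
    run umount "$MOUNTPOINT" &&
    run drbdadm secondary "$RESOURCE"
}

# Run on the node that should take over.
take_primary() {
    run drbdadm primary "$RESOURCE" &&
    run mount "$DEVICE" "$MOUNTPOINT"
}
```

Call release_primary on the current primary first, then take_primary on the other node. Set DRY_RUN=0 (as root) once the printed commands look right.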
Split Brain Recovery
If both nodes end up modifying the data independently of each other (for example, both were primary while disconnected), you have a split brain. Use the following steps to recover from a split brain condition:
Manual split brain recovery
DRBD detects split brain at the time connectivity becomes available again and the peer nodes exchange the initial DRBD protocol handshake. If DRBD detects that both nodes are (or were at some point, while disconnected) in the primary role, it immediately tears down the replication connection. The tell-tale sign of this is a message like the following appearing in the system log:
Split-Brain detected, dropping connection!
After split brain has been detected, one node will always have the resource in a StandAlone connection state. The other might either also be in the StandAlone state (if both nodes detected the split brain simultaneously), or in WFConnection (if the peer tore down the connection before the other node had a chance to detect split brain).
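A quick way to check whether a node has already seen a split brain is to search the system log for that message. This is a minimal sketch; the function names are made up, and /var/log/messages as the default log file is an assumption based on a stock CentOS syslog setup:

```shell
#!/bin/sh
# Hypothetical helpers, not part of DRBD.

# Succeed if log file $1 (default /var/log/messages) contains DRBD's
# split-brain message.
split_brain_logged() {
    grep -q 'Split-Brain detected' "${1:-/var/log/messages}"
}

# Print the most recent split-brain log line, if there is one.
last_split_brain() {
    grep 'Split-Brain detected' "${1:-/var/log/messages}" | tail -n 1
}
```

For example, split_brain_logged && last_split_brain shows when it last happened. Then look at cat /proc/drbd on each node to see whether it is in StandAlone or WFConnection.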
At this point, unless you configured DRBD to automatically recover from split brain, you must manually intervene by selecting one node whose modifications will be discarded (this node is referred to as the split brain victim). This intervention is made with the following commands, where resource is the name of your resource (mysql in this tutorial):
drbdadm secondary resource
drbdadm -- --discard-my-data connect resource
On the other node (the split brain survivor), if its connection state is also StandAlone, you would enter:
drbdadm connect resource
You may omit this step if the node is already in the WFConnection state; it will then reconnect automatically.
If the resource affected by the split brain is a stacked resource, use drbdadm --stacked instead of just drbdadm.
Upon connection, your split brain victim immediately changes its connection state to SyncTarget, and has its modifications overwritten by the remaining primary node.
Note: The split brain victim is not subjected to a full device synchronization. Instead, it has its local modifications rolled back, and any modifications made on the split brain survivor propagate to the victim.
After re-synchronization has completed, the split brain is considered resolved and the two nodes form a fully consistent, redundant replicated storage system again.
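The recovery steps above can be sketched as two small functions, one for each node. The function names and the DRY_RUN switch are assumptions made for this example; the drbdadm commands themselves are the ones given above. By default the commands are only printed, which is a good idea here because the victim's changes are thrown away for good.

```shell
#!/bin/sh
# Hypothetical wrapper around manual split brain recovery.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the split brain victim: demote, then reconnect and discard local changes.
# $1 is the resource name.
recover_victim() {
    run drbdadm secondary "$1" &&
    run drbdadm -- --discard-my-data connect "$1"
}

# On the survivor: reconnect only if the connection state is StandAlone.
# $1 is the resource name, $2 the current connection state as shown in the
# cs: field of /proc/drbd.
recover_survivor() {
    case $2 in
        StandAlone)   run drbdadm connect "$1" ;;
        WFConnection) echo "already waiting for the peer, nothing to do" ;;
        *)            echo "unexpected connection state: $2" >&2; return 1 ;;
    esac
}
```

For this tutorial you would run recover_victim mysql on the node whose changes you are willing to lose, and recover_survivor mysql StandAlone on the other node if it shows cs:StandAlone.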