DRBD over OpenVPN
This outlines how to create a two-node DRBD setup where one node is not active but keeps an up-to-date copy of the data volume.
For added fun the setup runs over OpenVPN, but that is not necessary.
The detail below involves configuring two hosts, named rosaria (the server) and paimon (the backup).
OpenVPN Configuration
This is a simple shared secret configuration between the two hosts.
First, create the key:
openvpn --genkey secret /etc/openvpn/static.key
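Both ends of the tunnel use the same key, so copy it to the backup host over a secure channel. Assuming SSH access between the hosts, something like:
scp /etc/openvpn/static.key paimon:/etc/openvpn/static.key
Keep the key readable by root only on both hosts.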
On the server, generally the one with a static IP address (/etc/openvpn/openvpn-rosaria.conf):
dev tun0
proto udp
port 1194
# 192.168.56.1 is rosaria
# 192.168.56.2 is paimon
ifconfig 192.168.56.1 192.168.56.2
# Our pre-shared static key
secret static.key
cipher AES-256-CBC
# LZO compression
comp-lzo
# 3 -- medium output, good for normal operation.
verb 3
On the client (/etc/openvpn/openvpn-paimon.conf):
dev tun0
proto udp
port 1194
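# rosaria's real (non-tunnel) address; adjust for your network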
remote 192.168.1.8
# 192.168.56.1 is rosaria
# 192.168.56.2 is paimon
ifconfig 192.168.56.2 192.168.56.1
# Our pre-shared static key
secret static.key
cipher AES-256-CBC
# LZO compression
comp-lzo
# 3 -- medium output, good for normal operation.
verb 3
On Alpine or Gentoo Linux (or any distribution using OpenRC), configure init.d so that the VPN starts at boot time:
On rosaria:
cd /etc/init.d
ln -s openvpn openvpn-rosaria
rc-update add openvpn-rosaria default
On paimon:
cd /etc/init.d
ln -s openvpn openvpn-paimon
rc-update add openvpn-paimon default
Reboot both hosts to verify correct operation of the init scripts, or start the VPN the normal way:
On rosaria:
/etc/init.d/openvpn-rosaria start
and on paimon:
/etc/init.d/openvpn-paimon start
Ping one host from the other to verify that the tunnel is connected and working. For troubleshooting, have a search of the internets!
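For example, using the tunnel addresses configured above:
rosaria# ping -c 3 192.168.56.2
Replies indicate the tunnel is passing traffic in both directions.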
Logical Volumes with LVM
The physical device used here is /dev/vdb; it will require alteration depending on the environment.
On both nodes:
pvcreate /dev/vdb
vgcreate local /dev/vdb
lvcreate -L 128M -n drbdmeta local
lvcreate -L 64G -n replicated0 local
rc-update add lvm boot
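A quick sanity check that the volumes exist (sizes and attributes will vary):
lvs local
should list both the drbdmeta and replicated0 logical volumes.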
The above is just a summary; for more detail on LVM please see lvmhowtorev2.
Configure DRBD
Configuration files are the same on both hosts. Make sure that the correct hostnames are used; in the example below the two hostnames are rosaria and paimon.
/etc/drbd.d/global_common.conf
global {
usage-count yes;
udev-always-use-vnr;
}
common {
handlers {
# No handlers.
}
startup {
}
options {
}
disk {
}
net {
cram-hmac-alg sha1;
shared-secret "joshua";
sndbuf-size 10M;
use-rle;
protocol A;
}
syncer {
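# Note: DRBD 8.4 deprecates this syncer section; rate is now
# "resync-rate" in the disk section and csums-alg is a net option.
# drbdadm 8.4 still understands the old form.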
rate 1M;
csums-alg md5;
}
}
/etc/drbd.d/0_replicated0.res
resource replicated0 {
device /dev/drbd0;
disk /dev/local/replicated0;
meta-disk /dev/local/drbdmeta[0];
on rosaria {
address 192.168.56.1:7789;
}
on paimon {
address 192.168.56.2:7789;
}
net {
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
after-sb-2pri disconnect;
}
}
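Before creating metadata it is worth sanity-checking the configuration on both hosts; drbdadm dump parses the files and prints the resolved configuration, complaining about any syntax errors:
drbdadm dump replicated0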
Create the metadata and bring up the replication pair (both hosts):
drbdadm create-md replicated0
drbdadm up replicated0
Check the DRBD status:
drbdadm status
The output should not have "Connecting" in it; if it does, check that a TCP connection can be made between the hosts (netcat is your friend; see the sketch after the example output). Example output may look like this:
replicated0 role:Secondary
disk:Inconsistent
peer role:Secondary
replication:Established peer-disk:Inconsistent
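If the status is stuck in "Connecting", test that the hosts can reach each other on the DRBD port. A sketch using netcat (bring the resource down first with drbdadm down replicated0 so port 7789 is free; option syntax varies between netcat implementations):
rosaria# nc -l -p 7789
paimon# nc 192.168.56.1 7789
Anything typed on one side should appear on the other.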
Notice that on both nodes the output is the same; both nodes are currently "Secondaries". To make rosaria the primary node (changes on rosaria are copied to paimon):
drbdadm -- --overwrite-data-of-peer primary replicated0
Now rosaria is Primary and paimon is Secondary; rosaria is copying the empty data to paimon:
rosaria# drbdadm status
replicated0 role:Primary
disk:UpToDate
peer role:Secondary
replication:SyncSource peer-disk:Inconsistent done:0.02
And on paimon:
paimon# drbdadm status
replicated0 role:Secondary
disk:Inconsistent
peer role:Primary
replication:SyncTarget peer-disk:UpToDate done:0.19
After a little while the two peers should be consistent... a cute progress bar and more detail are available by cat'ing the special file /proc/drbd:
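paimon# cat /proc/drbd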
version: 8.4.11 (api:1/proto:86-101)
srcversion: 2CC17D07553A98E96473D42
0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate A r-----
ns:0 nr:0 dw:0 dr:8963072 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:58145792
[=>..................] sync'ed: 13.4% (56780/65536)M
finish: 0:24:47 speed: 39,064 (33,568) want: 41,000 K/sec
Create a file system on the Primary node, rosaria:
rosaria# mkfs.ext4 /dev/drbd0
At this point mkfs will discard all blocks in the file system, which speeds up the sync operation; it did not take 24 minutes during this test.
Make a mount point and mount the new filesystem:
mkdir /mnt/replicated0
mount /dev/drbd0 /mnt/replicated0
DRBD is now correctly configured using protocol A, which is not the most reliable protocol. Read about the protocols and their differences at LINBIT.
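If stronger write guarantees are wanted, at the cost of write latency over the VPN, protocol C (fully synchronous) can be used instead. A sketch, editing the net section of global_common.conf shown earlier:
net {
        # keep the existing options, change only the protocol
        protocol C;
}
Then apply the new setting on both hosts:
drbdadm adjust replicated0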
Starting the service after a reboot
When a node is restarted the service will not automatically resume the role it had before the reboot. To restore the remote site after a reboot, a few steps must be undertaken.
In this section the operations are performed manually; there are tools that can perform them automatically, but those are out of scope. Having a good understanding of this mechanism is important in any case, as it provides background that is useful when diagnosing problems.
After a reboot the hosts will not be running DRBD, and when started both rosaria and paimon will be offline. To restore normal operation, bring up the DRBD resource on rosaria first:
drbdadm up replicated0
replicated0 on rosaria is now in Secondary mode; next switch it into Primary mode:
drbdadm primary replicated0
At this point /dev/drbd0 can be mounted on rosaria:
mount /dev/drbd0 /mnt/replicated0
On paimon, bring the resource up:
drbdadm up replicated0
Check on both hosts to see if a connection has been established:
rosaria# drbdadm status
replicated0 role:Primary
disk:UpToDate
peer role:Secondary
replication:Established peer-disk:UpToDate
and on paimon:
paimon# drbdadm status
replicated0 role:Secondary
disk:UpToDate
peer role:Primary
replication:Established peer-disk:UpToDate
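The per-host sequence can be wrapped in a small helper script. A sketch for the primary side (a hypothetical restore-primary.sh, not part of DRBD; adjust the resource name and mount point to taste):
#!/bin/sh
# restore-primary.sh -- bring this host back up as the DRBD Primary
set -e
drbdadm up replicated0              # attach the backing disk and connect
drbdadm primary replicated0         # promote to Primary
mount /dev/drbd0 /mnt/replicated0   # make the data available again
The secondary host only needs the drbdadm up step.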
Swapping rosaria and paimon roles
During normal operation rosaria is the primary node and will serve all requests, then replicate changes to paimon as time allows. What happens when rosaria goes offline without warning? How is paimon promoted to take responsibility?
The first test is the well-managed case: swap roles whilst the sun is shining and everything is working properly.
Step 1: unmount the file system on rosaria.
umount /mnt/replicated0
Step 2: switch rosaria to Secondary role
drbdadm secondary replicated0
Step 3: switch paimon to Primary role
drbdadm primary replicated0
Step 4: mount the filesystem on paimon (the new primary)
mount /dev/drbd0 /mnt/replicated0
Done! The reverse can be applied to revert to the previous setup, as shown below.
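For the record, the reverse sequence to make rosaria Primary again:
paimon# umount /mnt/replicated0
paimon# drbdadm secondary replicated0
rosaria# drbdadm primary replicated0
rosaria# mount /dev/drbd0 /mnt/replicated0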
Disaster Testing
Now to check that the setup will work when the connection between the two hosts is interrupted unexpectedly.
Step 1: stop the primary node (rosaria) by disconnecting the network or simply terminating the power:
Ctrl+C ;-)
Step 2: check the status on the secondary (paimon):
paimon# drbdadm status
...