Installing DRBD on Debian 9
DRBD (Distributed Replicated Block Device) has been a native Linux kernel feature since 2.6.33, providing Software Defined Storage (SDS). This article is a documented reference from testing DRBD for VM replication: MD RAID + DRBD + KVM + smartmontools as a virtualization setup, using offsite storage for backups and effectively fulfilling the 3-2-1 backup rule. The goal was quick DR with an adaptable offsite range and tolerance for the bandwidth of commodity networking hardware. This article only deals with DRBD-related setup, management and the results I ended up with.

Asynchronous
The docs made me aware of DRBD Proxy, a feature aimed at async replication, since low bandwidth can cause I/O to block if the sndbuf/socket buffer fills up while syncing - something I experienced a lot. This behavior comes from DRBD being designed primarily for synchronous replication on good networking hardware. Simply changing the protocol from A to C in the configuration below switches on full sync, so that part alone is just a minor configuration change. With async you can use any kind of filesystem, but full sync with HA (dual-primary) will require a cluster-aware filesystem.
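For reference, these are roughly the knobs involved, shown as a net section sketch for the common or resource configuration (the option names are from the DRBD 8.4 docs, but the values here are arbitrary placeholders, not something I tuned or tested):

net {
    protocol A;               # async; change to C for full sync
    sndbuf-size 1024k;        # larger send buffer before the sender starts blocking
    on-congestion pull-ahead; # fall behind (Ahead/Behind) instead of blocking when congested
    congestion-fill 512k;     # how full the send buffer may get before pulling ahead
}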
Split-brain
A temporary failure on the primary node may drop it to secondary mode, leaving the cluster Secondary/Secondary and requiring manual intervention to promote one node to primary and get things going again. This can actually be beneficial for robustness and the control we want, but it did require some minor hand-holding. There are ways to automate this, but I didn't need them.
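A minimal sketch of that manual intervention, assuming resource r0 and that you have already decided which node's data to keep (the discard commands are the standard DRBD 8.4 split-brain recovery steps; adjust to your situation):

// Simple case: both nodes came back as Secondary - just promote the one you want.
# drbdadm primary r0

// Actual split-brain (connection state StandAlone): throw away the changes on one node.
// On the node whose changes you discard:
# drbdadm secondary r0
# drbdadm connect --discard-my-data r0
// On the surviving node, if it is also StandAlone:
# drbdadm connect r0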
INSTALLING DRBD8 on Debian 6+:

# apt-get install drbd8-utils

PREPARING STORAGE AND NETWORK
After you have installed DRBD, you must set aside a roughly identically sized storage area on both cluster nodes. This will become the lower-level device for your DRBD resource. You may use any type of block device found on your system for this purpose. Typical examples include:
* a hard drive partition (or a full physical hard drive),
* a software RAID device,
* an LVM Logical Volume or any other block device configured by the Linux device-mapper infrastructure,
* any other block device type found on your system.
I simply used parted to prepare space on existing drives for testing in the lab.

It is generally not recommended to run DRBD replication via routers, for reasons of fairly obvious performance drawbacks (adversely affecting both throughput and latency). DRBD (by convention) uses TCP ports from 7788 upwards, with every resource listening on a separate port. DRBD uses two TCP connections for every resource configured. For proper DRBD functionality, it is required that these connections are allowed by your firewall configuration.
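As a minimal firewall sketch (assuming plain iptables, the peer address 10.1.1.32 used in the resource example below, and a handful of resources on ports 7788-7799):

# iptables -A INPUT -p tcp -s 10.1.1.32 --dport 7788:7799 -j ACCEPT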
CONFIGURATION
All aspects of DRBD are controlled in its configuration file, /etc/drbd.conf. By convention, /etc/drbd.d/global_common.conf contains the global and common sections of the DRBD configuration, whereas the .res files contain one resource section each. It is also possible to use drbd.conf as a flat configuration file without any include statements at all. Such a configuration, however, quickly becomes cluttered and hard to manage, which is why the multiple-file approach is the preferred one. Regardless of which approach you employ, you should always make sure that drbd.conf, and any other files it includes, are exactly identical on all participating cluster nodes.

Simple DRBD configuration (/etc/drbd.d/global_common.conf):

global {
    usage-count no;
}
common {
    net {
        protocol A;
    }
}

Simple DRBD resource configuration (/etc/drbd.d/r0.res - r0 is arbitrary):

resource r0 {
    on Server1 {
        device    /dev/drbd1;
        disk      /dev/sda7;
        address   10.1.1.31:7789;
        meta-disk internal;
    }
    on Server2 {
        device    /dev/drbd1;
        disk      /dev/sda7;
        address   10.1.1.32:7789;
        meta-disk internal;
    }
}

ENABLING THE RESOURCE
After you have completed the initial resource configuration as outlined in the previous sections, you can bring up your resource. Each of the following steps must be completed on both nodes.

# dd if=/dev/zero of=/dev/sda7 bs=1M count=128 (wipe any old signatures at the start of the partition)
# drbdadm create-md r0
# drbdadm up r0
# cat /proc/drbd
^ Observe /proc/drbd. An Inconsistent/Inconsistent disk state is expected at this point. By now, DRBD has successfully allocated both disk and network resources and is ready for operation. What it does not know yet is which of your nodes should be used as the source of the initial device synchronization.

INITIAL DEVICE SYNCHRONIZATION
There are two more steps required for DRBD to become fully operational:

Select an initial sync source. If you are dealing with a newly-initialized, empty disk, this choice is entirely arbitrary. If one of your nodes already has valuable data that you need to preserve, however, it is of crucial importance that you select that node as your synchronization source. If you do the initial device synchronization in the wrong direction, you will lose that data. Exercise caution.

Start the initial full synchronization. This step must be performed on only one node, only on initial resource configuration, and only on the node you selected as the synchronization source. To perform this step, issue:

# drbdadm primary --force r0

After issuing this command, the initial full synchronization will commence. You will be able to monitor its progress via /proc/drbd. It may take some time depending on the size of the device. Your DRBD device is fully operational even before the initial synchronization has completed (albeit with slightly reduced performance). You may now create a filesystem on the device, use it as a raw block device, mount it, and perform any other operation you would with an accessible block device.

!!!IMPORTANT!!! Do not get confused about your new block device at this point! Realize that a new block device has been made for you to use (/dev/drbd1, also available as /dev/drbd/by-res/r0). Format this one on the primary node (# mkfs.xfs /dev/drbd1) and let DRBD take care of the actual block device that carries it (/dev/sda7). When formatted and ready for use, use fstab to mount it wherever you want for consistent use.

A prepared script that runs 'drbdadm primary r0' and mounts the device on the secondary node should be made for quick failover, along with defining and starting the relevant VM and anything else you may have on top of the block device.
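Something along these lines, as a minimal sketch - the mount point /srv/vm and the VM name vm1 are placeholders from this lab, not a polished production script:

#!/bin/sh
# Quick failover on the surviving node: promote the resource, mount it, start the VM.
set -e
drbdadm primary r0
mount /dev/drbd/by-res/r0 /srv/vm
virsh define /srv/vm/vm1.xml
virsh start vm1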
ADMINISTRATION

// Monitor status.
# watch cat /proc/drbd

The resource-specific output from /proc/drbd contains various pieces of information about the resource:

cs (connection state). Status of the network connection.
ro (roles). Roles of the nodes. The role of the local node is displayed first, followed by the role of the partner node after the slash.

ds (disk state):
Diskless. No local block device has been assigned to the DRBD driver. This may mean that the resource has never attached to its backing device, that it has been manually detached using drbdadm detach, or that it automatically detached after a lower-level I/O error.
Attaching. Transient state while reading meta data.
Failed. Transient state following an I/O failure report by the local block device. Next state: Diskless.
Negotiating. Transient state when an Attach is carried out on an already-Connected DRBD device.
Inconsistent. The data is inconsistent. This status occurs immediately upon creation of a new resource, on both nodes (before the initial full sync). Also, this status is found on one node (the synchronization target) during synchronization.
Outdated. Resource data is consistent, but outdated.
DUnknown. This state is used for the peer disk if no network connection is available.
Consistent. Consistent data of a node without connection. When the connection is established, it is decided whether the data is UpToDate or Outdated.
UpToDate. Consistent, up-to-date state of the data. This is the normal state.

I/O flags (r-----):
d: I/O blocked for a reason internal to DRBD, such as a transient disk state.
b: Backing device I/O is blocking.
n: Congestion on the network socket.
a: Simultaneous combination of blocking device I/O and network congestion.

Performance indicators. The second line of /proc/drbd information for each resource contains the following counters and gauges:
ns (network send). Volume of net data sent to the partner via the network connection; in Kibyte.
nr (network receive). Volume of net data received by the partner via the network connection; in Kibyte.
dw (disk write). Net data written on the local hard disk; in Kibyte.
dr (disk read). Net data read from the local hard disk; in Kibyte.
al (activity log). Number of updates of the activity log area of the meta data.
bm (bit map). Number of updates of the bitmap area of the meta data.
lo (local count). Number of open requests to the local I/O sub-system issued by DRBD.
pe (pending). Number of requests sent to the partner, but that have not yet been answered by the latter.
ua (unacknowledged). Number of requests received by the partner via the network connection, but that have not yet been answered.
ap (application pending). Number of block I/O requests forwarded to DRBD, but not yet answered by DRBD.
ep (epochs). Number of epoch objects. Usually 1. Might increase under I/O load when using either the barrier or the none write ordering method.
wo (write order). Currently used write ordering method: b (barrier), f (flush), d (drain) or n (none).
oos (out of sync). Amount of storage currently out of sync; in Kibibytes.

// SECONDARY FAILOVER WHEN PRIMARY IS DEAD.
# drbdadm primary r0
# mount /dev/drbd/by-res/r0 <mountpoint>
^ Start services based on the resource, e.g. the VM.

// When the primary comes back to life.
# umount /dev/drbd/by-res/r0
# drbdadm secondary r0

NOTE: The documentation says there will of course be a certain data loss (with async), but the data that is already on the second node will be consistent.

USEFUL REFERENCES
https://docs.linbit.com/doc/users-guide-84/p-build-install-configure/
https://docs.linbit.com/doc/users-guide-84/ch-admin/
https://docs.linbit.com/doc/users-guide-84/ch-configure/
Labtest Summary
Somewhat disappointing write throughput (even if I ran it on horrible hardware for torture testing). For testing I simply used CrystalDiskMark from inside the VM, so YMMV. I tried a Hyper-V Server 2016 replica setup afterwards; besides being easier to set up, it also had far better write throughput in the guest, even during replication. 1st-gen guests are very fast outside replication, while 2nd-gen guests seem to have a consistent speed, but not on the positive end of the scale. All drivers were properly updated.
On a consumer SSD without RAID, with the continuous DRBD async KVM setup, I got:
926 MB/s read (weird, I know) and 91 MB/s write from a qcow2 guest (365 MB/s write without DRBD). I think this was due to heavy network congestion on the commodity networking I forced it onto. I think DRBD tends to do things with sync in mind (the algorithms used) even when async is configured, and that you should plan hardware as if you were going to use full sync.
With Hyper-V 1st-Gen VHDX:
During replication: 542 MB/s read, 179 MB/s write.
Between replications: 538 MB/s read, 357 MB/s write.
2nd-Gen VHDX:
During AND between replications:
532 MB/s read, 180 MB/s write.
Block-commit live-backup on KVM (see my KVM reference for a script template, and the sketch below) only paused the VM for about a second and otherwise caused no ill effects at all. The same goes for Export-VM in Hyper-V (see my PS reference for a script template). Both types of live-backup were very consistent and reliable.

I need to look into faster storage and bandwidth for DRBD if I'm going to use it seriously, but I'm very happy with what I have learned, and the results were as expected apart from the performance. With Hyper-V Server being as easy to set up and manage - even in a workgroup (see my PS reference) - it's hard not to just go for that instead in this DR scenario. Hyper-V Server 2016 is completely free, without any licensing model attached.
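For context, the block-commit flow I'm referring to looks roughly like this (a minimal sketch, not my actual script; the domain vm1, disk target vda and the paths are placeholders):

// Disk-only external snapshot: guest writes go to a temporary overlay from here on.
# virsh snapshot-create-as vm1 backup-snap --disk-only --atomic --no-metadata --diskspec vda,file=/var/lib/libvirt/images/vm1-overlay.qcow2
// Copy the now-quiet base image to backup storage.
# cp /var/lib/libvirt/images/vm1.qcow2 /backup/vm1.qcow2
// Merge the overlay back into the base image and pivot the guest back to it.
# virsh blockcommit vm1 vda --active --pivot --wait
// Remove the leftover overlay.
# rm /var/lib/libvirt/images/vm1-overlay.qcow2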
GNU/Linux offers a lot of flexibility and freedom, and you don't have to fight an expected ecosystem (domain/AD). Still, I have to say that in this usage scenario Hyper-V is just easier and makes more sense, especially if you have a proper RAID controller. GNU/Linux offers enterprise-grade software RAID (MD) that can easily replace a FakeRAID if that's all you have, plus smartmontools for health monitoring. Hyper-V Server will, however, let you install drivers in the normal way if you still want to use embedded RAID.