Tuesday, July 5, 2011

Configuring DRBD on Suse Linux - Distributed Replicated Block Device [Single Primary Node]



Here's the step-by-step approach I took to set up DRBD (single-primary mode) between two identical servers, each running openSUSE 11.4 (x86_64). The website www.drbd.org is well organized and has very detailed documentation, so the setup was easy to follow and understand.

My goal is to try the single-primary mode first and then move on to an HA (High Availability) cluster setup using OCFS2 (Oracle Cluster File System 2). On that cluster I want to set up two Oracle instances and see if I can get two regular instances to behave like a "DataGuard" pair, without having to do a real Oracle-specific DataGuard configuration.

The primary host is named "silicon" and the secondary "graphics".


Download and Install the Software


Go to software.opensuse.org and search for the package "drbd". You can find both 32-bit and 64-bit packages; download the one that's compatible with your OS.


silicon: # cd /home/raj/downloads/drbd

silicon:/home/raj/downloads/drbd # wget "http://software.opensuse.org/search/download?base=openSUSE%3A11.4&file=openSUSE%3A%2F11.4%2Fstandard%2Fx86_64%2Fdrbd-8.3.8.1-4.3.x86_64.rpm"

silicon:/home/raj/downloads/drbd # rpm -Uvh drbd-8.3.8.1-4.3.x86_64.rpm
Preparing...                ########################################### [100%]
   1:drbd                   ########################################### [100%]

silicon:/home/raj/downloads/drbd # rpm -qa | grep drbd | xargs rpm -qi
Name        : drbd                         Relocations: (not relocatable)
Version     : 8.3.8.1                           Vendor: openSUSE
Release     : 4.3                           Build Date: Tue Feb 22 18:54:32 2011
Install Date: Sat Jun 25 16:47:05 2011         Build Host: build35
Group       : Productivity/Clustering/HA    Source RPM: drbd-8.3.8.1-4.3.src.rpm
Size        : 461140                           License: GPLv2+
Signature   : RSA/8, Tue Feb 22 18:54:58 2011, Key ID b88b2fd43dbdc284
Packager    : http://bugs.opensuse.org
URL         : http://www.drbd.org/
Summary     : Distributed Replicated Block Device
Description :
Drbd is a distributed replicated block device. It mirrors a block
device over the network to another machine. Think of it as networked
raid 1. It is a building block for setting up clusters.

Authors:
--------
    Philipp Reisner <philipp.reisner@linbit.com>
    Lars Ellenberg <lars.ellenberg@linbit.com>
Distribution: openSUSE 11.4

Upgrade DRBD version to 8.3.9

At some point I ran into the error below. While searching for a solution I found the article "How To Upgrade DRBD Userland Version To 8.3.9 Under OpenSUSE 11.4", which clearly explains the reason for the error and the steps to fix it.


"Starting DRBD resources: DRBD module version: 8.3.9
  userland version: 8.3.8
   you should upgrade your drbd tools!"
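
The mismatch can be spotted before starting DRBD by comparing the kernel module version with the installed package version. Below is a minimal sketch, assuming modinfo and rpm are available. The helper names drbd_xyz and drbd_versions_match are made up for illustration, and the rpm query is only meaningful while the tools come from the RPM (not after a source build).

```shell
#!/bin/sh
# Hypothetical helper: reduce a version string such as "8.3.8.1-4.3" to "8.3.8".
drbd_xyz() {
    printf '%s\n' "$1" | sed -E 's/^([0-9]+\.[0-9]+\.[0-9]+).*/\1/'
}

# Hypothetical helper: compare two version strings on their first three components.
drbd_versions_match() {
    [ "$(drbd_xyz "$1")" = "$(drbd_xyz "$2")" ]
}

# On a real node, compare module and package versions (skipped if the drbd module is absent).
if command -v modinfo >/dev/null 2>&1 && modinfo drbd >/dev/null 2>&1; then
    module_ver=$(modinfo -F version drbd)
    userland_ver=$(rpm -q --qf '%{VERSION}\n' drbd 2>/dev/null) || userland_ver=unknown
    if drbd_versions_match "$module_ver" "$userland_ver"; then
        echo "module $module_ver and userland $userland_ver match"
    else
        echo "module $module_ver, userland $userland_ver: upgrade the drbd tools"
    fi
fi
```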

Below are the steps I followed to build the DRBD 8.3.9 userland tools against the existing kernel source, so that they match the 8.3.9 kernel module. They are no different from the steps detailed in the article mentioned above; I include them here for continuity.


-- Install dependencies first.
silicon: # zypper install kernel-source gcc flex make ncurses-devel

-- Create .config file of the current running kernel
silicon:/usr/src/linux # cd /usr/src/linux
silicon:/usr/src/linux # cp /boot/config-2.6.37.6-0.5-desktop ./.config
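
The config file name above is tied to my kernel release. A small sketch, assuming the distribution keeps its configs as /boot/config-<release>, that derives the path from uname -r instead of hard-coding it. The helper name kernel_config_path is hypothetical, and the script only prints the copy command so it can be checked before running it.

```shell
#!/bin/sh
# Hypothetical helper: build the /boot/config path for a kernel release
# (defaults to the running kernel as reported by uname -r).
kernel_config_path() {
    printf '/boot/config-%s\n' "${1:-$(uname -r)}"
}

# Print the copy command for the running kernel instead of executing it.
echo "cp $(kernel_config_path) /usr/src/linux/.config"
```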

-- Run menuconfig
silicon:/usr/src/linux # make menuconfig
silicon:/usr/src/linux # cd ../

silicon:/usr/src #  wget http://oss.linbit.com/drbd/8.3/drbd-8.3.9.tar.gz
asking libproxy about url 'http://oss.linbit.com/drbd/8.3/drbd-8.3.9.tar.gz'
libproxy suggest to use 'direct://'
--2011-06-25 23:56:12--  http://oss.linbit.com/drbd/8.3/drbd-8.3.9.tar.gz
Resolving oss.linbit.com... 212.69.161.111
Connecting to oss.linbit.com|212.69.161.111|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 497886 (486K) [application/x-gzip]
Saving to: `drbd-8.3.9.tar.gz'

100%[=========================================================================>] 497,886      454K/s   in 1.1s

2011-06-25 23:56:14 (454 KB/s) - `drbd-8.3.9.tar.gz' saved [497886/497886]

-- Unpack the downloaded tar file, then configure, build and install
silicon:/usr/src # tar -xzf drbd-8.3.9.tar.gz
silicon:/usr/src # cd drbd-8.3.9/
silicon:/usr/src/drbd-8.3.9 # ./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-utils --with-pacemaker
silicon:/usr/src/drbd-8.3.9 # make KDIR=/usr/src/linux
silicon:/usr/src/drbd-8.3.9 # make install

By now the DRBD userland tools are built and installed at version 8.3.9, matching the kernel module.

Configuration of DRBD

You don't necessarily have to use LVM logical volumes; you can instead use plain partitions such as /dev/sda1, /dev/sdb1, etc. I decided to go with LVM since all my free space is already added to a volume group.


-- Create a Logical Volume
silicon:/etc/drbd.d # lvcreate -L 3G -n lvPQ1_dfs05 vgPQ1
  Logical volume "lvPQ1_dfs05" created

-- Include below in /etc/drbd.conf
include "/etc/drbd.d/global_common.conf";
include "/etc/drbd.d/*.res";

-- Edit resource.res
/dev/drbd00 is the device name that DRBD will use for the replicated block device. In my initial setup I want to use the host "silicon" as primary, with IP address 192.168.5.100; my secondary node "graphics" has IP address 192.168.5.101.

silicon:/etc/drbd.d # cat resource.res
resource data00 {
  on silicon {
    device    /dev/drbd00;
    disk      /dev/vgPQ1/lvPQ1_dfs05;
    address   192.168.5.100:7789;
    meta-disk internal;
  }
  on graphics {
    device    /dev/drbd00;
    disk      /dev/vgPQ1/lvPQ1_dfs05;
    address   192.168.5.101:7789;
    meta-disk internal;
  }
}
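
DRBD matches each "on <host>" section against the node's own host name (what uname -n prints), so a typo there stops the resource from coming up on that node. A minimal sketch that lists the host names in a resource file and checks the local one; the helper names res_hosts and host_in_res are made up for illustration, and the awk pattern assumes the "on <host> {" layout used above.

```shell
#!/bin/sh
# Hypothetical helper: list host names used in "on <host> {" sections of a resource file.
res_hosts() {
    awk '$1 == "on" && $3 == "{" { print $2 }' "$1"
}

# Hypothetical helper: succeed if the host name ($2, default: uname -n) has an "on" section in file $1.
host_in_res() {
    res_hosts "$1" | grep -qxF -- "${2:-$(uname -n)}"
}

res=/etc/drbd.d/resource.res
if [ -r "$res" ]; then
    if host_in_res "$res"; then
        echo "$(uname -n) has a section in $res"
    else
        echo "no 'on $(uname -n)' section in $res"
    fi
fi
```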

-- Now scp both resource.res and drbd.conf files to the secondary node - graphics 

silicon:/etc/drbd.d # scp resource.res 192.168.2.101:/etc/drbd.d/
resource.res                                                                     100%  305     0.3KB/s   00:00

silicon:/etc # scp drbd.conf 192.168.2.101:/etc/
drbd.conf                                                                        100%  177     0.2KB/s   00:00

-- Create the logical volume on the secondary node.
graphics:~ # lvcreate -L 3G -n lvPQ1_dfs05 vgPQ1
  Logical volume "lvPQ1_dfs05" created

Create, Load, Sync, Attach and Run DRBD module.

-- Below will create the metadata on the device.
silicon:/etc/drbd.d # drbdadm create-md data00

  --==  Thank you for participating in the global usage survey  ==--
The server's response is:
you are the 3752th user to install this version
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success

-- Load the module to kernel.
silicon:/etc/drbd.d # modprobe drbd


-- Verify if the module is loaded.
silicon:/etc/drbd.d # lsmod | grep drbd

drbd                  272649  0
lru_cache               8007  1 drbd

-- Attach, set the syncer parameters and connect the device
silicon:/etc/drbd.d # drbdadm attach data00
silicon:/etc/drbd.d # drbdadm syncer data00

silicon:/etc/drbd.d # drbdadm connect data00

-- By running cat /proc/drbd you can monitor the progress of the sync. 

silicon:/etc/drbd.d # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:3145596

-- Repeat all the above steps on the secondary node. (graphics)
graphics:/etc/drbd.d # drbdadm create-md data00

graphics:/etc/drbd.d # drbdadm attach data00
graphics:/etc/drbd.d # drbdadm syncer data00
graphics:/etc/drbd.d # drbdadm connect data00
graphics:/etc/drbd.d # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:3145596
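
The fields that matter in this output are cs (connection state), ro (roles, local/peer) and ds (disk states, local/peer). A small sketch that pulls one of them out of /proc/drbd-style text; the helper name drbd_field is hypothetical, and it only looks at the first device line, which is enough for a single resource like data00.

```shell
#!/bin/sh
# Hypothetical helper: print one field (cs, ro, ds) from the first device line.
# $1 = field name, $2 = file to read (default /proc/drbd)
drbd_field() {
    awk -v key="$1" '
        /^ *[0-9]+: cs:/ {
            for (i = 1; i <= NF; i++)
                if (index($i, key ":") == 1) { print substr($i, length(key) + 2); exit }
        }' "${2:-/proc/drbd}"
}

# Only report when DRBD is actually loaded on this machine.
if [ -r /proc/drbd ]; then
    echo "connection: $(drbd_field cs)  roles: $(drbd_field ro)  disks: $(drbd_field ds)"
fi
```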

-- Once both devices are connected, it's time to sync them.

Run the command below only on the node which you want to make primary. I chose "silicon".

silicon:/etc/drbd.d # drbdadm -- --overwrite-data-of-peer primary data00

silicon:/etc/drbd.d # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:6016 nr:0 dw:0 dr:6688 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:3139580
        [>....................] sync'ed:  0.3% (3139580/3145596)K
        finish: 0:32:42 speed: 1,504 (1,504) K/sec

In my case the 3 GB volume took around 30 minutes. Both nodes are connected through a crossover cable (Cat 5e). The maximum speed I got on this connection was around 70 MBps, yet the DRBD sync ran at only about 1.3 to 1.5 MB/sec, and the status output shows a "want" rate of 250 K/sec (see below). I have yet to figure out how to get better throughput.

graphics:/etc/drbd.d # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:48640 dw:48640 dr:0 al:0 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:3096956
        [>....................] sync'ed:  1.6% (3096956/3145596)K
        finish: 0:38:18 speed: 1,332 (1,312) want: 250 K/sec
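
DRBD 8.3 takes the resynchronization rate from a syncer section in the configuration, and the DRBD User's Guide suggests roughly 30% of the available replication bandwidth as a starting point. The sketch below applies that rule of thumb to the ~70 MBps link mentioned above and prints a stanza that could go into the resource or the common section. The helper names are hypothetical, and I have not measured the resulting rate on this setup.

```shell
#!/bin/sh
# Hypothetical helper: 30% of the measured link throughput in MB/s, rounded down.
syncer_rate_mb() {
    echo $(( $1 * 30 / 100 ))
}

# Hypothetical helper: print a DRBD 8.3 syncer stanza for a link throughput in MB/s.
syncer_stanza() {
    printf 'syncer {\n  rate %sM;\n}\n' "$(syncer_rate_mb "$1")"
}

# Assumed input: the ~70 MB/s measured over the crossover cable.
syncer_stanza 70
```

After the stanza is added on both nodes, running drbdadm adjust data00 should make DRBD pick up the changed setting.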

You can watch the progress in real time by running cat /proc/drbd under the "watch" command.

watch cat /proc/drbd

Once the sync is complete, the status shows cs as "Connected" and ds as "UpToDate/UpToDate", with ro still "Primary/Secondary". Here's another way to check the status.
silicon:/etc/drbd.d # /etc/init.d/drbd status

drbd driver loaded OK; device status:
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
m:res     cs         ro                 ds                 p  mounted  fstype
0:data00  Connected  Primary/Secondary  UpToDate/UpToDate  C
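
Rather than watching the output by hand, a script can block until both sides report UpToDate before it goes on to create the file system. A minimal sketch, assuming a single DRBD resource on the node (the grep would match any resource); the helper names drbd_in_sync and wait_for_sync are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the status file reports both disks as UpToDate.
drbd_in_sync() {
    grep -q 'ds:UpToDate/UpToDate' "${1:-/proc/drbd}"
}

# Hypothetical helper: poll until the sync has finished.
# $1 = status file (default /proc/drbd), $2 = seconds between checks (default 10)
wait_for_sync() {
    until drbd_in_sync "$1"; do
        sleep "${2:-10}"
    done
}

# Only wait when DRBD is actually loaded on this machine.
if [ -r /proc/drbd ]; then
    wait_for_sync /proc/drbd 10 && echo "data00 is UpToDate on both nodes"
fi
```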

Now create the file system on the DRBD device, not on the logical volume or the disk partition.

silicon:~ # mkfs -t ext4 /dev/drbd00
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
196608 inodes, 786399 blocks
39319 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=805306368
24 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 39 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

After the file system is created, mount the device.

silicon:~ # mount /dev/drbd00  /u05
silicon:~ # mount  | grep drbd
/dev/drbd00 on /u05 type ext4 (rw,relatime,barrier=1,data=ordered)

Create a file on the mounted DRBD device. This is to test the role switchover: the file should be visible on the other node after the primary is switched to secondary and vice versa.
silicon:/u05 # cat /var/log/messages > afile

Switch roles between Primary and Secondary.

Disconnect the devices on both the primary and secondary hosts. You can also run drbdadm down data00, which disconnects and detaches the device in one step.
silicon:/etc/drbd.d # drbdadm disconnect data00
silicon:/etc/drbd.d # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:116108 nr:8 dw:116116 dr:1441 al:36 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

graphics:/etc/drbd.d # drbdadm disconnect data00
graphics:/etc/drbd.d # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:116108 dw:116108 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Now change the role on graphics to primary and connect the device.
graphics:/etc/drbd.d # drbdadm primary data00

graphics:/etc/drbd.d # drbdadm connect data00
graphics:/etc/drbd.d # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:116108 dr:1008 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

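Run the below on the now secondary node, silicon. Before this step the file system has to be unmounted on silicon and the node demoted with drbdadm secondary data00.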
silicon # drbdadm -- --discard-my-data connect data00
silicon # drbd-overview.pl
  0:data00  Connected Secondary/Primary UpToDate/UpToDate C r-----

Mount the file system on graphics and look for the file created on silicon.
graphics:/etc/drbd.d # mount /dev/drbd00 /u05
graphics:/etc/drbd.d # cd /u05
graphics:/u05 # ls -ltr
total 288
drwx------ 2 root root  16384 Jun 29 20:47 lost+found
-rw-r--r-- 1 root root 275360 Jun 29 20:47 messages
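
For a planned switch while both nodes stay connected, the DRBD documentation describes a shorter sequence than the disconnect-based procedure shown above: unmount and demote the current primary, then promote and mount on the peer. The sketch below only prints that sequence for review and does not run anything; the helper name switchover_plan and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the commands for a planned role switch.
# $1 = resource, $2 = device, $3 = mount point, $4 = current primary, $5 = new primary
switchover_plan() {
    printf '%s # umount %s\n' "$4" "$3"
    printf '%s # drbdadm secondary %s\n' "$4" "$1"
    printf '%s # drbdadm primary %s\n' "$5" "$1"
    printf '%s # mount %s %s\n' "$5" "$2" "$3"
}

# Values taken from this setup.
switchover_plan data00 /dev/drbd00 /u05 silicon graphics
```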

Hope this helps. Regards, Raj.

3 comments:

  1. Hi Raj,

    To speed up the sync, set the syncer rate higher.

    For example, if you have 4 gigabit ports bonded, you can set the syncer rate to at least 300M with the command:

    drbdsetup syncer -r 300M

    best regards,

    Javier

  2. Thanks Javier. I will try that out.
