Tuesday 4 October 2011

drbd split brain recovery




I was trying to reboot the Red Hat cluster for backups and GFS performance testing, but while restarting I got DRBD split-brain messages in /var/log/messages. Below is the output of /proc/drbd on each node (we have a primary/primary DRBD config):




Primary:
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by
argus@docidtxt03, 2009-03-09 18:04:20
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r---
    ns:0 nr:0 dw:94815205 dr:14861358 al:19160 bm:18136 lo:0 pe:0 ua:0
ap:0 ep:1 wo:b oos:18911216


Secondary:
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by
argus@docidtxt04, 2009-03-04 16:23:58
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown A r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:14 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:55544 
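The fields to look at above are cs: (connection state) and ro: (roles). A quick sketch of pulling them out with awk, using a sample line copied from the primary's output so it runs without a DRBD node; on a real node you would read /proc/drbd directly:

```shell
# Sample resource line from /proc/drbd (copied from the output above).
sample=' 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r---'

# Walk the fields and strip the cs:/ro: prefixes.
state=$(echo "$sample" | awk '{for (i=1;i<=NF;i++) if ($i ~ /^cs:/) {sub(/^cs:/,"",$i); print $i}}')
roles=$(echo "$sample" | awk '{for (i=1;i<=NF;i++) if ($i ~ /^ro:/) {sub(/^ro:/,"",$i); print $i}}')

echo "state=$state roles=$roles"
```

Here cs:StandAlone means the node has given up reconnecting after detecting the split brain, while cs:WFConnection on the other node means it is still waiting for its peer.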




The following commands were used to recover from the situation:


drbdadm -- --discard-my-data connect all (on node with "bad" data)
drbdadm connect all (on node with "good" data)
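In a primary/primary setup, DRBD will normally only accept --discard-my-data on a node that is in the secondary role, so the "bad" node usually has to be disconnected and demoted first. A sketch of the full sequence under that assumption, using the resource name r0 from later in this post (substitute all to act on every resource):

```shell
## On the node with the "bad" data -- any writes made here since the
## split brain will be thrown away and resynced from the peer:
drbdadm disconnect r0                      # leave StandAlone/WFConnection
drbdadm secondary r0                       # --discard-my-data needs secondary
drbdadm -- --discard-my-data connect r0    # reconnect, discarding local changes

## On the node with the "good" data (only needed if this side also
## dropped to StandAlone rather than waiting in WFConnection):
drbdadm connect r0
```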




And now the resources are connected. 


Then verify the status in /proc/drbd. If the resources aren't connecting, check iptables: either stop the firewall or allow the ports used by DRBD, then run the commands above again. Otherwise, demote one of the nodes to secondary:
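A sketch of the firewall check mentioned above. Note that 7788 is only the conventional DRBD default; the actual port is whatever is configured on the address lines of your /etc/drbd.conf, so check there first:

```shell
## Look for REJECT/DROP rules that could block the replication port.
iptables -L -n | grep 7788

## Either allow the DRBD port on both nodes...
iptables -I INPUT -p tcp --dport 7788 -j ACCEPT

## ...or (quick and dirty, for testing only) stop the firewall entirely:
service iptables stop
```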


drbdadm secondary r0 (on the bad-data node), and after the successful sync do




drbdadm primary r0 (on both nodes)
