Doerzbach Engineering GmbH - Changing Coordinator Disks online in Veritas Cluster Server (VCS) without vxfenswap

Sunday, February 5, 2012, 10:26 - Technical Informations, Veritas Cluster Server
Posted by Administrator

Starting with Veritas Cluster Server 5.0 MP3, there is an official tool, "vxfenswap", to change the coordinator disks online. Before that release there was no such tool and no official statement on how to change the coordinator disks while applications stay online, although there is a simple procedure that normally works without any problems.

The steps are:

1) Check the cluster state (LLT, GAB)


root@node1 # lltstat -n
LLT node information:
    Node                 State    Links
     0 node1              OPEN        3
     1 node2              OPEN        3
     2 node3              OPEN        3
root@node1 # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   f3b50f membership 012
Port b gen   f3b51a membership 012
Port h gen   f3b51a membership 012
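
The LLT check in step 1 can also be scripted. The following is a minimal sketch, assuming the `lltstat -n` output layout shown above (optionally with a leading `*` marking the local node). The function name `llt_all_open` and its expected-node-count argument are hypothetical helpers, not part of VCS.

```shell
#!/bin/bash
# Hypothetical helper: reads "lltstat -n" output on stdin and succeeds only
# if exactly $1 nodes are listed and every one of them is in state OPEN.
llt_all_open() {
  expected="$1"
  awk -v want="$expected" '
    # The local node may be prefixed with "*", which shifts the fields by one.
    { i = ($1 == "*") ? 2 : 1 }
    # Node lines start with a numeric node ID; the state is two fields later.
    $i ~ /^[0-9]+$/ { total++; if ($(i + 2) == "OPEN") n_open++ }
    END { exit !(total == want && n_open == want) }
  '
}
```

On a cluster node it could be used as `lltstat -n | llt_all_open 3 && echo "LLT ok"`.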

2) Freeze all service groups and systems persistently.


root@node1 # haconf -makerw
root@node1 # hagrp -list | awk '{print $1}' | sort -u | while read g ; do hagrp -freeze $g -persistent ; done
VCS WARNING V-16-1-50894 Command (hagrp -freeze -persistent ClusterService ) failed. The Group ClusterService cannot be frozen

root@node1 # hasys -list | while read s ; do hasys -freeze -persistent $s ; done
root@node1 # haconf -dump -makero
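
The warning above appears because the loop also tries to freeze ClusterService. A minimal sketch of a filter that lists each service group once and skips ClusterService is shown here. It assumes that `hagrp -list` prints one "group system" pair per line; `freezable_groups` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads "hagrp -list" output on stdin and prints each
# group name once, leaving out ClusterService (which cannot be frozen).
freezable_groups() {
  awk '{ print $1 }' | sort -u | grep -vx 'ClusterService'
}

# Possible use on a cluster node (same freeze command as in step 2):
#   hagrp -list | freezable_groups | while read -r g; do
#     hagrp -freeze "$g" -persistent
#   done
```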

3) Stop the cluster monitoring


root@node1 # hastop -all -force

4) Stop the Fencing driver on all cluster nodes


root@nodeXXX # /etc/init.d/vxfen stop
Stopping VxFEN:

5) Stop the GAB driver on all cluster nodes


root@nodeXXX # /etc/init.d/gab stop
Stopping GAB:

6) Stop the LLT driver on all cluster nodes


root@nodeXXX # /etc/init.d/llt stop
Stopping LLT:
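
Steps 4 to 6 must run in this order on every node, because the fencing driver sits on top of GAB and GAB sits on top of LLT. As a small sketch, the order can be captured in one place; `stack_stop_commands` is a hypothetical helper that only prints the init script calls used above.

```shell
#!/bin/bash
# Hypothetical helper: prints the stop commands for the cluster stack,
# top-down (fencing first, then GAB, then LLT).
stack_stop_commands() {
  for svc in vxfen gab llt; do
    printf '/etc/init.d/%s stop\n' "$svc"
  done
}
```

Assuming root ssh access between the nodes (an assumption, not something the procedure requires), the output could be fed to each node, for example with `stack_stop_commands | ssh root@node2 sh`.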

7) Import the coordinator diskgroup on one node


root@node1 # vxdg -ftC import `cat /etc/vxfendg`

8) Set the coordinator flag off


root@node1 # vxdg -g `cat /etc/vxfendg` set coordinator=off

9) Remove unwanted coordinator disks


root@node1 # vxdg -g `cat /etc/vxfendg` rmdisk <unwanteddiskname>

where <unwanteddiskname> is the disk name shown in the output of the command "vxprint -g `cat /etc/vxfendg`".

10) Add new coordinator disks


root@node1 # vxdctl enable
root@node1 # vxdisksetup -i <newdevicename>
root@node1 # vxdg -g `cat /etc/vxfendg` adddisk <newdiskname>=<newdevicename>

where <newdevicename> is the value in the "DEVICE" column of the output of the command "vxdisk list" and <newdiskname> is a name you choose for the disk in the diskgroup.

11) Rescan the partitions of the new coordinator disk on all systems

Volume Manager changes the partition table when it initializes a new disk, and the other nodes do not know about this change, so we have to rescan the partition table on the other cluster nodes.

First get the diskid for the new coordinator disk on node1:


root@node1 # vxdisk list <newdevicename> |grep '^disk:'
disk:      name= id=1327668260.266.node1
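
To avoid copying the disk ID by hand, it can be cut out of that line. This is a minimal sketch, assuming the "disk:" line has the layout shown above; `extract_disk_id` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads "vxdisk list <device>" output on stdin and
# prints the value of the id= field from the line that starts with "disk:".
extract_disk_id() {
  sed -n 's/^disk:.*[[:space:]]id=\([^[:space:]]*\).*$/\1/p'
}

# Possible use on node1:
#   diskid=$(vxdisk list <newdevicename> | extract_disk_id)
```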

Then create the script rescan_partitions.sh on all other nodes with the following content:


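#!/bin/bash
# Usage: rescan_partitions.sh <diskid>
# Rereads the partition table of the coordinator disk with the given disk ID
# and lets Volume Manager rediscover it.

[ -n "$1" ] || { echo "usage: $0 <diskid>" >&2; exit 1; }

vxdctl enable
vxdisk -o alldgs list | grep "$(cat /etc/vxfendg)" | while read -r disk rest; do
  # Only handle the disk whose "vxdisk list" output contains the given disk ID
  if vxdisk list "$disk" | grep "$1" >/dev/null; then
    if [ "$(uname)" = "Linux" ]; then
      # The first field of each enabled path line is used as the OS device name
      vxdisk list "$disk" | grep state=enabled | while read -r dev re; do
        grep "$dev" /proc/partitions
        blockdev --rereadpt "/dev/$dev"
      done
    fi
    # Remove the disk from Volume Manager and rediscover it
    vxdisk rm "$disk"
    vxdctl enable
  fi
done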

And run it with the disk ID as parameter on the other nodes:


root@nodexxx # ./rescan_partitions.sh 1327668260.266.node1
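
Before the coordinator flag is switched back on, it may be worth confirming that the diskgroup still holds a valid set of coordinator disks. I/O fencing expects an odd number of coordinator disks, with at least three. This check is not part of the original procedure; the sketch below assumes the usual "DEVICE TYPE DISK GROUP STATUS" column layout of `vxdisk -o alldgs list`, and `coordinator_count_ok` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads "vxdisk -o alldgs list" output on stdin and
# succeeds only if the disk group named in $1 has an odd number of disks,
# and at least three of them.
coordinator_count_ok() {
  dg="$1"
  awk -v dg="$dg" '
    # Column 4 is the disk group; deported groups are shown as "(name)".
    { g = $4; gsub(/[()]/, "", g) }
    g == dg { n++ }
    END { exit !(n >= 3 && n % 2 == 1) }
  '
}

# Possible use on node1:
#   vxdisk -o alldgs list | coordinator_count_ok "$(cat /etc/vxfendg)"
```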

12) Set the coordinator flag on


root@node1 # vxdg -g `cat /etc/vxfendg` set coordinator=on

13) Deport the coordinator diskgroup


root@node1 # vxdg -g `cat /etc/vxfendg` deport

14) Start LLT on all nodes:


root@nodex # /etc/init.d/llt start
Starting LLT: 
LLT: loading module...
WARNING:  No modules found for 2.6.9-55.ELsmp, using compatible modules for 2.6.9-34.ELsmp.
LLT: configuring module...

15) Start GAB on all nodes:


root@nodex # /etc/init.d/gab start
Starting GAB: 
WARNING:  No modules found for 2.6.9-55.ELsmp, using compatible modules for 2.6.9-34.ELsmp.

16) Start Fencing Driver on all nodes


root@nodex # /etc/init.d/vxfen start
Starting VxFEN: 
WARNING:  No modules found for 2.6.9-55.ELsmp, using compatible modules for 2.6.9-34.ELsmp.
Starting vxfen.. Done
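
Once fencing has started on all nodes, `gabconfig -a` should list port b (the fencing port) with all nodes as members, next to port a for GAB itself. A minimal sketch for reading the membership of one port follows. It assumes the single-line "Port x gen ... membership ..." layout shown in step 1, which holds for small clusters; `gab_port_membership` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: reads "gabconfig -a" output on stdin and prints the
# membership string of the port given in $1 (nothing if the port is absent).
gab_port_membership() {
  port="$1"
  awk -v p="$port" '$1 == "Port" && $2 == p && $5 == "membership" { print $6 }'
}

# Possible use on a three-node cluster:
#   [ "$(gabconfig -a | gab_port_membership b)" = "012" ] && echo "fencing up"
```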

17) Start the cluster monitoring on all nodes


root@nodex # hastart

18) Unfreeze all persistently frozen service groups and systems.

Wait for the cluster startup to complete. Check it with the command below. There should be no output:


root@node1 # hastatus -summ|grep '^D'
D  BWP          Proxy                LAN-PBWP           node1                
D  MIP          Proxy                LAN-PMIP           node2

Here, some resources were still not probed. Just wait another minute and recheck.
As soon as all resources are probed, you can safely unfreeze the service groups and systems:


root@node1 # haconf -makerw
root@node1 # hagrp -list | awk '{print $1}' | sort -u | while read g ; do hagrp -unfreeze $g -persistent ; done
VCS WARNING V-16-1-40202 Group is not persistently frozen
root@node1 # hasys -list | while read s ; do hasys -unfreeze -persistent $s ; done
root@node1 # haconf -dump -makero
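
The waiting part of step 18 can be automated instead of rechecking by hand. This is a minimal sketch, assuming that lines starting with "D" in `hastatus -summ` mark resources that are not yet probed, as in the output above; `count_unprobed` is a hypothetical helper name and the 10-second interval is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical helper: reads "hastatus -summ" output on stdin and prints
# the number of lines that start with "D" (resources not yet probed).
count_unprobed() {
  awk '/^D/ { n++ } END { print n + 0 }'
}

# Possible use before unfreezing:
#   while [ "$(hastatus -summ | count_unprobed)" -gt 0 ]; do
#     sleep 10
#   done
```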
