This article has been archived and is no longer updated by Apple.

Xsan 2.1.1: Script to reboot Intel-based Xserve MDCs after failover

View the script to reboot Intel-based Xserve MDCs after failover.

The script in this article immediately reboots an Xsan Metadata Controller (MDC) that has lost control of an Xsan volume after a failover. Whenever a failover occurs, for any reason, the MDC that takes control of the volume executes the script. The script uses Lights Out Management (LOM) to send a reboot signal to the previously active MDC. This ensures that the previously active MDC cannot modify the Xsan volume’s metadata after the failover.

Xsan includes multiple safeguards to avoid a situation where an Xsan volume is active on more than one MDC. This script is not required. It is provided as an example for sites where an additional hardware-based method is desired to prevent this unlikely scenario.

Use of this script is optional and at your own risk. Verify system requirements before installing. Apple does not provide support for modifications of this script.

Notes

  • This script is designed for Xsans with two MDCs running Xsan 2.1.1. Do not deploy this script if your Xsan volume has more than two MDCs.

  • Both MDCs must be Intel-based Xserves because of the LOM commands this script uses to send the reboot signal.

  • This script can be used in Xsans with more than one volume. Each volume should be configured to run on both MDCs with the same failover priority.

  • It is recommended that only Xsan and Open Directory services be hosted on Xsan MDCs. If Open Directory services are running on the MDCs, it is important that an Open Directory Replica be available at all times. This will ensure continual availability of Open Directory services should an MDC be rebooted by this script.

  • Important: The MDC’s internal hard drives should be formatted as Mac OS Extended with journaling enabled.

Installing the script

Perform these steps on both MDCs:

  1. Use the steps described in this article to configure the LOM Addresses.

  2. Create a LOM password file by executing this command:

    sudo sh -c "echo PASSWORD > /private/var/root/Other_MDCs_LOM_Password"

    Replace ‘PASSWORD’ with the LOM administrator’s password on the other MDC.

  3. Restrict access to the LOM password file by executing the following commands in Terminal:

    sudo chmod 400 /private/var/root/Other_MDCs_LOM_Password

    sudo chown root:wheel /private/var/root/Other_MDCs_LOM_Password

  4. Back up the original script by executing this Terminal command:

    sudo mv /Library/Filesystems/Xsan/bin/cvfail /Library/Filesystems/Xsan/bin/cvfail.bak

  5. To create the new script, copy the text below beginning with the line “#!/bin/sh” to the “# end script” line. Paste the text into a new plain text document in TextEdit, using these guidelines.

    #!/bin/sh
    # cvfail
    # This script is intended for use in Xsan 2.1.1.
    # This script may be replaced by a future software update.

    # -------------- edit the variables below this line ----------------
    # For more information on these settings: http://support.apple.com/kb/HT3620

    # Set reset_enabled to 'yes' to reset the other MDC upon failover.
    # Set to 'no' to disable reset on both MDCs before performing maintenance.
    reset_enabled='no'

    # IP address of other MDC's Lights Out Management (LOM) interface
    lom_ip='Other_MDCs_LOM_IP_address'

    # LOM admin user on other MDC
    lom_username='Other_MDCs_LOM_admin_username'

    # LOM password for other MDC is stored in this file
    lom_password_file='/private/var/root/Other_MDCs_LOM_Password'

    # name of SAN
    san_name='My Xsan'

    # ------------------ do not edit below this line ------------------

    hostname="$1"
    fsm_port="$2"
    fs_name="$3"
    last_reset='/private/tmp/.cvfail'
    reset_interval=15

    sendNotification() {
        # Note: notification method is subject to change in future versions
        if [ ! "$subject" ]; then
            command="xsan:command = sendFailover
    xsan:hostname = $hostname
    xsan:volume = $fs_name"
        else
            command="xsan:command = sendNotification
    xsan:messageSubject = $subject
    xsan:messageBody = $body
    xsan:messageType = failover"
        fi
        echo "$command" | /usr/sbin/serveradmin command &
    }

    messageBody() {
        input=`echo $1 | /usr/bin/tr -d '\n\r' | /usr/bin/sed "s/\"//g; s/\'//g"`
        if [ "$body" ]; then
            body="$body $input"
        else
            body="$input"
        fi
    }

    # do not reset the other MDC when in maintenance mode
    if [ "`echo $reset_enabled | /usr/bin/awk '{print tolower($0)}'`" != 'yes' ]; then
        echo "cvfail $fs_name: Maintenance mode. MDC will not be reset."
        sendNotification
        exit 0
    fi

    # do not reset the other MDC if it has already been reset within reset interval
    if [ -f "$last_reset" ]; then
        eval $(/usr/bin/stat -s "$last_reset")
        time_since_last_reset=$(($(/bin/date +%s) - $st_ctime))
        if [ $time_since_last_reset -le $reset_interval ]; then
            echo "cvfail $fs_name: MDC already reset. Will not reset again."
            sendNotification
            exit 0
        fi
    fi

    # check the password file
    if [ ! -r "$lom_password_file" ]; then
        echo "cvfail $fs_name: $lom_password_file: Cannot read file or file does not exist."
        subject="$san_name: Volume $fs_name did not fail over"
        messageBody "The failover script for the volume $fs_name in $san_name"
        messageBody "did not complete successfully on $hostname because"
        messageBody "the password file ($lom_password_file) could not be read or does not exist."
        sendNotification
        exit 1
    fi

    # reset the other MDC
    echo "cvfail $fs_name: Sending reset command as '$lom_username' to LOM IP: $lom_ip"
    ipmitool_output=`/usr/bin/ipmitool -I lan -U "$lom_username" \
        -f "$lom_password_file" -H "$lom_ip" chassis power reset 2>&1`

    # send the appropriate notification
    if [ $? -eq 0 ]; then
        echo "cvfail $fs_name: MDC reset succeeded."
        /usr/bin/touch "$last_reset"
        sendNotification
        exit 0
    else
        echo "cvfail $fs_name: MDC reset failed. ipmitool: $ipmitool_output"
        subject="$san_name: Volume $fs_name did not fail over"
        messageBody "The failover script for the volume $fs_name in $san_name"
        messageBody "did not complete successfully on $hostname because"
        messageBody "an ipmitool error occurred: $ipmitool_output"
        sendNotification
        exit 1
    fi
    # end script

  6. Edit the variables in the script between the lines containing “edit the variables below this line” and “do not edit below this line”. The values should appear as unformatted plain text in the saved script. Remember that the values entered here pertain to the other MDC. Keep the single quotation marks around each value.

    • reset_enabled='yes'

      Leave this value set to yes on both MDCs during normal operation.

      Important: To prevent the script from rebooting the other MDC during planned maintenance, change this value to no on both MDCs before performing maintenance. Planned maintenance tasks include running Software Update on either MDC, starting a volume, changing volume settings, or forcing failover.

    • san_name='My Xsan'

      Replace My Xsan with the name of the SAN found by selecting Overview from the left-hand column in Xsan Admin and observing the Name value.

  7. Save the new script in the following location:

    /Library/Filesystems/Xsan/bin/cvfail

  8. Execute the following commands in Terminal:

    sudo chmod 544 /Library/Filesystems/Xsan/bin/cvfail

    sudo chown root:wheel /Library/Filesystems/Xsan/bin/cvfail
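
The reset_enabled value described in step 6 has to be switched to no on both MDCs before planned maintenance and back to yes afterward. Rather than editing the file by hand each time, the change could be scripted. The following is a minimal sketch, not part of Xsan: the function names set_reset_enabled and show_reset_enabled are hypothetical, and the sketch assumes that the installed script still contains a line beginning with reset_enabled= exactly as in the text above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Xsan: switch reset_enabled in a cvfail script.
# Run as root on each MDC, because the installed script is owned by root.

# Usage: set_reset_enabled yes|no [path_to_cvfail]
set_reset_enabled() {
    value="$1"
    script="${2:-/Library/Filesystems/Xsan/bin/cvfail}"

    case "$value" in
        yes|no) ;;
        *) echo "usage: set_reset_enabled yes|no [path_to_cvfail]" >&2
           return 2 ;;
    esac

    if [ ! -f "$script" ]; then
        echo "$script: no such file" >&2
        return 1
    fi

    tmp="$script.$$.tmp"
    # Rewrite only the assignment line; all other lines pass through unchanged.
    if ! sed "s/^reset_enabled=.*/reset_enabled='$value'/" "$script" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Copy the contents back instead of moving the file, so that the
    # script keeps its existing owner and permissions.
    cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}

# Print the current setting, for example: reset_enabled='no'
show_reset_enabled() {
    grep '^reset_enabled=' "${1:-/Library/Filesystems/Xsan/bin/cvfail}"
}
```

With these functions loaded in a root shell, running set_reset_enabled no before maintenance and set_reset_enabled yes afterward, on each MDC, would apply the change; show_reset_enabled prints the line so the result can be confirmed.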

Learn more

Testing the scripts

Once the scripts have been deployed on both MDCs, you can test their functionality by taking the following steps.

  1. Open Xsan Admin and select Volumes under SAN Assets. Observe the Hosted By value for the Xsan Volume you wish to test.

  2. Click the action button and select Force Failover.

    Note: If you are running Xsan Admin on the MDC that was running the volume, the system will reboot immediately. If you are running Xsan Admin on another Mac, Xsan Admin may become unresponsive after the Force Failover command. If Xsan Admin’s “Failing-over volume” progress dialog does not close after one minute, force Xsan Admin to quit.

  3. Reconnect to your SAN in Xsan Admin, if needed. Observe the Hosted By value for the Xsan volume that was failed over. If the failover was successful, the value is now the name of the other MDC.

Note: Testing failover may result in a momentary disruption of Xsan volume availability. Avoid possible disruptions to production by testing failover during non-production hours.
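
Before forcing a failover, it may help to confirm that the files from the installation steps have the expected permissions and that the LOM credentials are accepted. The following is a sketch under stated assumptions: the function names are hypothetical, the expected mode strings are simply what chmod 400 and chmod 544 produce in ls -l output, and the LOM query uses the same ipmitool options as the script with the read-only subcommand chassis power status in place of chassis power reset.

```shell
#!/bin/sh
# Hypothetical pre-test checks; the function names are illustrative only.

# Print the 10-character mode string of a file, for example "-r--------".
mode_string() {
    ls -ld "$1" | cut -c1-10
}

# Succeed only if the file exists and has the expected mode string.
check_mode() {
    path="$1"
    expected="$2"
    if [ ! -e "$path" ]; then
        echo "MISSING  $path"
        return 1
    fi
    actual=`mode_string "$path"`
    if [ "$actual" = "$expected" ]; then
        echo "OK       $path ($actual)"
        return 0
    fi
    echo "MISMATCH $path (found $actual, expected $expected)"
    return 1
}

# Ask the other MDC's LOM for its power state. This only queries the
# chassis; it does not reset or power off the other MDC.
# Usage: lom_power_status username password_file lom_ip
lom_power_status() {
    /usr/bin/ipmitool -I lan -U "$1" -f "$2" -H "$3" chassis power status
}

# Example, run in a root shell on each MDC (the password file is in
# root's home directory and is not visible to other users):
#   check_mode /private/var/root/Other_MDCs_LOM_Password "-r--------"
#   check_mode /Library/Filesystems/Xsan/bin/cvfail "-r-xr--r--"
#   sh -n /Library/Filesystems/Xsan/bin/cvfail
#   lom_power_status Other_MDCs_LOM_admin_username \
#       /private/var/root/Other_MDCs_LOM_Password Other_MDCs_LOM_IP_address
```

The sh -n line only parses the script for syntax errors without executing it, which can catch mistakes introduced while pasting or editing. If lom_power_status reports an error, the reset issued by the failover script would most likely fail in the same way.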
