Monday, January 10, 2011

Use Nagios? Got an Android phone?

If you're like me, you use Nagios to monitor just about anything and everything on your network.  It's a great tool to keep your finger on the pulse of your systems, so to speak, but if you rely on SMS or email notifications, sometimes they can get lost in the shuffle with everything else.  I get a lot of texts and tons of email from various processes and scripts, so the signal-to-noise ratio is pretty low.

I started looking for a solution that wouldn't involve creating a bunch of filters and such, which of course brought me to look for a client.  After doing a cursory search to see what was out there, I ran across two candidates, Nagroid and NagMonDroid.  I didn't care much for NagMonDroid, not to its discredit, as it looked like a fine client.  I just didn't like the interface, and Nagroid worked better for my needs.

The setup process for Nagroid was pretty simple, but I did tweak my config on the Nagios side a bit to fine-tune things.  Of the slew of servers and devices that I need to monitor, I would classify the majority as "tier 2", and I don't want a notification in the middle of the night if one of them goes awry.  For those systems, email is perfectly fine, as it can wait until the morning.  For my mission-critical or "tier 1" systems, I want something that will wake the dead at 2am if need be.

In order to make this work, I created a secondary login for Nagios (in my case user_mobile) and then created a group that contained both my primary and mobile-only logins.  I then disabled all notifications for the mobile login.  This let me set tier 1 systems to notify the group, resulting in both an email and a Nagroid notification, while tier 2 systems notify only my primary login, resulting in email only.
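
For what it's worth, the Nagios side of that setup might look something like the fragment below.  Treat it as a rough sketch rather than my exact config: only user_mobile comes from the description above.  The names user_primary, tier1-admins, db01 and test01, the addresses, and the email address are placeholders, and the generic-contact and linux-server templates are assumed to exist as they do in the stock Nagios sample configuration.

```cfg
# Primary login: receives email notifications as usual
define contact{
        use             generic-contact         ; template from the sample config (assumed)
        contact_name    user_primary            ; placeholder name
        alias           Primary login
        email           admin@example.com       ; placeholder address
        }

# Mobile-only login: used for Nagroid, notifications switched off
define contact{
        use                             generic-contact
        contact_name                    user_mobile
        alias                           Mobile login
        host_notifications_enabled      0
        service_notifications_enabled   0
        }

# Group holding both logins, used for tier 1 systems
define contactgroup{
        contactgroup_name       tier1-admins    ; placeholder name
        alias                   Tier 1 admins
        members                 user_primary,user_mobile
        }

# Tier 1 host: notifies the group
define host{
        use             linux-server            ; template from the sample config (assumed)
        host_name       db01                    ; placeholder host
        address         192.0.2.10
        contact_groups  tier1-admins
        }

# Tier 2 host: notifies the primary login only
define host{
        use             linux-server
        host_name       test01                  ; placeholder host
        address         192.0.2.20
        contacts        user_primary
        }
```

Services on those hosts would carry the same contact_groups or contacts settings if you want them split by tier as well.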

While Nagroid has met my needs, I encourage you to check both clients out.  I've included the QR codes below that will take you to the appropriate market pages for each.


Nagroid

NagMonDroid

Tuesday, January 4, 2011

ZRM for MySQL and NFS locking

I ran into this issue a couple of years ago when we decided to implement a VTL solution for our backups.  We use ZRM for MySQL for our database backups, and it has been extremely dependable.  We were initially backing up to local storage on the backup server, but then moved our target over to a Data Domain DDR530 data restorer.  We accessed the DDR530 via NFS and took the necessary precautions regarding file locking, only to still have problems backing up to it.  After a bit of digging and a minor patch to the ZRM code to disable flocks, backups worked perfectly, but purges remained a problem.  All of this was due to forces at work under the hood of the DDR530, which threw a wrench into our attempts to sidestep locking.

Unable to get the supplied purge script from ZRM to play well with the DDR530, I set out to create a suitable stand-in.  It's nothing fancy, but if for some reason you cannot get the supplied purge process to work, feel free to give this a spin.  It will get the job done without a lot of fuss.

#!/bin/bash

##################################################
# purge-zrm-backups
#
# purges backups without file locking
#
##################################################

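BACKUPDIR="/path/to/backups"
PURGELOG="/var/log/mysql-zrm/purgelog"
CURDATE=$(date +%s)

# Print an ISO 8601 timestamp for log lines
timestamp() { date -Iseconds; }

echo "$(timestamp) -- Starting purge session" >> "$PURGELOG"

for buset in "$BACKUPDIR"/*
do
   for budate in "$buset"/*
   do
      # Skip anything that does not have an index file
      [ -f "$budate/index" ] || continue

      # The retention policy is assumed to be in weeks (e.g. 2W);
      # keep the leading digits so values like 10W are not cut to 1
      KEEP=$(grep retention-policy "$budate/index" | awk -F= '{print $2}')
      WEEKS=${KEEP%%[!0-9]*}
      TSTAMP=$(grep backup-date-epoch "$budate/index" | awk -F= '{print $2}')

      # Never purge when either value is missing; an empty retention
      # would otherwise put the cutoff at "now" and purge everything
      if [ -z "$WEEKS" ] || [ -z "$TSTAMP" ]; then
         echo "$(timestamp) -- | Skipping $budate (incomplete index)" >> "$PURGELOG"
         continue
      fi

      TICKS=$((WEEKS * 7 * 86400))
      CUTOFF=$((CURDATE - TICKS))

      if [ "$TSTAMP" -lt "$CUTOFF" ]; then
         echo "$(timestamp) -- | Purging $budate" >> "$PURGELOG"
         # Remove in the background and send any errors to the log
         rm -rf -- "$budate" >> "$PURGELOG" 2>&1 &
      fi
   done
done

# Wait for background removals before logging completion
wait

echo "$(timestamp) -- Finished purge session" >> "$PURGELOG"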

exit 0
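
One thing to keep in mind is that the script treats every retention policy as a number of weeks.  If your backup sets mix units, a helper along these lines could convert the policy to seconds instead.  This is only a sketch: retention_seconds is a name I made up, and the D/W/M/Y suffixes (with months counted as 30 days and years as 365) are an assumption, so check them against your own index files before trusting it.

```shell
# Convert a retention policy such as 10D, 2W, 6M or 1Y into seconds.
# Assumption: M is counted as 30 days and Y as 365 days.
# Prints nothing and returns 1 if the value cannot be parsed.
retention_seconds() {
   local policy=$1
   local count=${policy%[DWMY]}     # strip one trailing unit letter
   local unit=${policy#"$count"}    # whatever remains is the unit
   local days

   # Reject empty or non-numeric counts
   case $count in
      ''|*[!0-9]*) return 1 ;;
   esac

   case $unit in
      D) days=1 ;;
      W) days=7 ;;
      M) days=30 ;;
      Y) days=365 ;;
      *) return 1 ;;
   esac

   echo $((count * days * 86400))
}
```

In the purge loop, something like TICKS=$(retention_seconds "$KEEP") || continue could then stand in for the WEEKS arithmetic, skipping any backup whose policy cannot be read rather than guessing.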

Testing.... 1, 2, 3...

Well, if you stumbled in here, thanks for dropping by.  I've been working with Linux and Unix since 1995 and have picked up a few interesting things along the way (as well as found lots of things not to do).  For the past few years I've been working a lot on cross-platform integration between Linux and Windows systems, so don't think I've fallen and bumped my head if you see a stray post about the evil empire in here.

Hopefully I can share some pointers that will save you some time.  Feel free to comment on anything you see here as there's always room for improvement.