If you’re playing around with Heartbeat, DRBD and NFS, you might find the following annoying errors in your log files:
Filesystem[5133]: INFO: Running stop for /dev/drbd0 on /nfs
Filesystem[5133]: INFO: Trying to unmount /nfs
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
Filesystem[5133]: ERROR: Couldn't unmount /nfs; trying cleanup with SIGTERM
Filesystem[5133]: INFO: No processes on /nfs were signalled
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
umount: /nfs: device is busy
Filesystem[5133]: ERROR: Couldn't unmount /nfs; trying cleanup with SIGTERM
Filesystem[5133]: INFO: No processes on /nfs were signalled
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
Filesystem[5133]: ERROR: Couldn't unmount /nfs; trying cleanup with SIGTERM
Filesystem[5133]: INFO: No processes on /nfs were signalled
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
Filesystem[5133]: ERROR: Couldn't unmount /nfs; trying cleanup with SIGKILL
Filesystem[5133]: INFO: No processes on /nfs were signalled
lrmd[4616]: info: RA output: (FS_repdata:stop:stderr) umount: /nfs: device is busy
umount: /nfs: device is busy
In my case, the cause was that the nfs service (/etc/init.d/nfs) was told to stop but didn’t shut down all of its processes. The leftover processes still had file handles pointing to that mount, so the unmount failed and the stop for DRBD failed with it.
I had to edit the init.d script to fix this. You can test for the problem by running /etc/init.d/nfs stop and then searching for any remaining nfs children that are still alive and holding a lock on your DRBD device.
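The check described above can be sketched roughly as follows. This is only an illustration, not part of the init script: the helper names leftover_procs and check_stopped are made up for this example, and it assumes pgrep (from procps) is installed on the node.

```shell
#!/bin/sh
# Hypothetical helpers, not part of /etc/init.d/nfs.

# Print the PIDs of processes whose name is exactly $1 (nothing if none).
leftover_procs() {
    pgrep -x "$1" || true
}

# Report whether anything named $1 is still alive; returns 0 when clean.
check_stopped() {
    pids=$(leftover_procs "$1")
    if [ -n "$pids" ]; then
        echo "$1 still running, PIDs:" $pids
        return 1
    fi
    echo "$1 fully stopped"
    return 0
}

# Typical use on the node, as root:
#   /etc/init.d/nfs stop
#   check_stopped nfsd
#   check_stopped rpc.mountd
```

If check_stopped nfsd still reports PIDs after the service was stopped, those are the processes keeping /nfs busy.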
This was the original /etc/init.d/nfs script:
[snip]
echo -n $"Shutting down NFS mountd: "
killproc rpc.mountd
echo
echo -n $"Shutting down NFS daemon: "
killproc nfsd -2
echo
That didn’t kill every nfs process, which made every dependent action fail. Here’s my modified version:
echo -n $"Shutting down NFS mountd: "
killproc rpc.mountd
echo
echo -n $"Shutting down NFS daemon: "
killproc nfsd -2
# Force kill if the above didn't go as planned
sleep 5
killproc nfsd -KILL
echo
This properly kills all of your nfs processes, so the rest of the Heartbeat resources can be stopped cleanly.
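As a variation on the same idea, the fixed sleep 5 could be replaced by a loop that only escalates to SIGKILL if the processes are still around after a timeout. The sketch below is not what I put in the init script. The function name stop_with_escalation is made up, it uses pgrep/pkill rather than the killproc helper, and it assumes SIGINT (signal 2) is the polite signal to try first.

```shell
#!/bin/sh
# Hypothetical helper: ask processes named $1 to stop with SIGINT, wait up to
# $2 seconds (default 5), then fall back to SIGKILL.
# Returns 0 once no process with that name remains.
stop_with_escalation() {
    name=$1
    timeout=${2:-5}
    pkill -INT -x "$name" 2>/dev/null || true
    waited=0
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            # Polite stop did not work in time: force kill
            pkill -KILL -x "$name" 2>/dev/null || true
            sleep 1
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # Succeed only if nothing with that name is left
    ! pgrep -x "$name" >/dev/null 2>&1
}

# Example (as root): stop_with_escalation nfsd 5
```

The advantage over an unconditional sleep is that a clean shutdown returns immediately, and the exit status tells the caller whether anything survived even the SIGKILL.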