<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Hans Rakers &#187; drbd</title>
	<atom:link href="http://hans.rakers.org/tag/drbd/feed/" rel="self" type="application/rss+xml" />
	<link>http://hans.rakers.org</link>
	<description>Personal blog</description>
	<lastBuildDate>Wed, 07 Sep 2011 12:28:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Issues regarding HA-NFS with Heartbeat and DRBD on Slackware 12.0</title>
		<link>http://hans.rakers.org/2007/10/issues-regarding-ha-nfs-with-heartbeat-and-drbd-on-slackware-120/</link>
		<comments>http://hans.rakers.org/2007/10/issues-regarding-ha-nfs-with-heartbeat-and-drbd-on-slackware-120/#comments</comments>
		<pubDate>Thu, 04 Oct 2007 10:45:25 +0000</pubDate>
		<dc:creator>Hans</dc:creator>
				<category><![CDATA[SysAdmin]]></category>
		<category><![CDATA[drbd]]></category>
		<category><![CDATA[HA-NFS]]></category>
		<category><![CDATA[haresources]]></category>
		<category><![CDATA[heartbeat]]></category>
		<category><![CDATA[NFS]]></category>
		<category><![CDATA[slackware]]></category>

		<guid isPermaLink="false">http://hans.rakers.org/2007/10/issues-regarding-ha-nfs-with-heartbeat-and-drbd-on-slackware-120/</guid>
		<description><![CDATA[<a href="http://hans.rakers.org/2007/10/issues-regarding-ha-nfs-with-heartbeat-and-drbd-on-slackware-120/" title="Issues regarding HA-NFS with Heartbeat and DRBD on Slackware 12.0"></a>Our NFS server setup at our datacenter consists of two SuperMicro SC933 chassis, each with dual 3 GHz Intel Xeon processors, 2 GB of memory, and fifteen 200 GB SATA disks connected to an Areca ARC-1160 16-port SATA RAID controller. High Availability by redundancy &#8230;<p class="read-more"><a href="http://hans.rakers.org/2007/10/issues-regarding-ha-nfs-with-heartbeat-and-drbd-on-slackware-120/">Read more &#187;</a></p>]]></description>
			<content:encoded><![CDATA[<a href="http://hans.rakers.org/2007/10/issues-regarding-ha-nfs-with-heartbeat-and-drbd-on-slackware-120/" title="Issues regarding HA-NFS with Heartbeat and DRBD on Slackware 12.0"></a><p>Our NFS server setup at our datacenter consists of two <a href="http://www.supermicro.com/products/chassis/3U/933/SC933T-R760.cfm" target="_blank">SuperMicro SC933</a> chassis, each with dual 3 GHz Intel Xeon processors, 2 GB of memory, and fifteen 200 GB SATA disks connected to an Areca ARC-1160 16-port SATA RAID controller. High availability through redundancy and failover is handled by <a href="http://www.linux-ha.org/" target="_blank">Heartbeat</a> and <a href="http://www.drbd.org/" target="_blank">DRBD</a>. This setup serves the document roots for our web cluster over NFS, so it is obviously very important that it always works <img src='http://hans.rakers.org/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>These systems run Slackware Linux, which has historically been my distro of choice for critical systems. When deploying Heartbeat on Slackware I ran into some issues which I&#8217;d like to share here. I won&#8217;t go into basic stuff like actually compiling and installing DRBD and Heartbeat, since that is pretty well documented in various other places, starting with <a href="http://www.linux-ha.org/" target="_blank">the Linux-HA site</a> (home of Heartbeat).<br />
<span id="more-20"></span><br />
Heartbeat starts HA services as defined in the file &#8216;haresources&#8217;, but in doing so, Heartbeat seems a bit SysV-init biased. SysV-init based systems only start a service when they are told to do so, by linking its init script into a certain runlevel. So on a SysV-style system, if you don&#8217;t want to start NFS services at boot time, you just remove them from their runlevels and leave the init.d script intact.</p>
<p>Because Slackware has more of a BSD-style init, it starts most of its service daemons from /etc/rc.d/rc.M, including RPC services and NFS. We don&#8217;t want these services started by rc.M at boot time, because we want Heartbeat to manage them. Normally on Slackware, if you don&#8217;t want a certain service started at boot time, you would simply &#8216;chmod a-x&#8217; the specific rc script in /etc/rc.d, but that is not an option here, since Heartbeat still needs to be able to start and stop the service from its &#8216;haresources&#8217;. My solution was to leave all scripts executable, and rename the rc scripts of the services I wanted to be managed by Heartbeat:</p>
<blockquote><p><code>cd /etc/rc.d<br />
mv rc.rpc rc.rpc-hb<br />
mv rc.nfsd rc.nfsd-hb<br />
mv rc.samba rc.samba-hb</code></p></blockquote>
<p>Now the RPC services, NFS and Samba will no longer be started at boot time, since rc.M only looks for the existence of the rc scripts without the added &#8216;-hb&#8217; part.</p>
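<p>A quick way to double-check the renames is to look for any managed service that still has an executable script under its original name. The following is only a minimal sketch: the function name <code>boot_started_scripts</code> is made up for illustration, and it assumes rc.M tests each script with <code>-x</code>, as the stock Slackware rc.M appears to do.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the rc scripts that rc.M would still find
# (and start) under their original names.
# $1 is the rc directory, the remaining arguments are service names.
boot_started_scripts() {
    rc_dir=$1
    shift
    for svc in "$@"; do
        # Assumption: rc.M starts rc.<service> only if it is executable.
        if [ -x "$rc_dir/rc.$svc" ]; then
            echo "rc.$svc"
        fi
    done
}
```

<p>After the renames above, <code>boot_started_scripts /etc/rc.d rpc nfsd samba</code> should print nothing. Any name it does print is a script that would still be started at boot.</p>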
<p>Next we tell Heartbeat the names of the rc scripts to start/stop by putting them in &#8216;haresources&#8217;. My &#8216;haresources&#8217; file looks like this:</p>
<blockquote><p><code>fs1     drbddisk::shared drbddisk::backups \<br />
Filesystem::/dev/drbd0::/var/nfsroot/shared::reiserfs \<br />
Filesystem::/dev/drbd1::/var/nfsroot/backups::xfs \<br />
Delay::2::0 \<br />
rc.samba-hb \<br />
rc.rpc-hb \<br />
rc.nfsd-hb \<br />
Delay::3::0 \<br />
IPaddr::10.0.0.150/16/eth0</code></p></blockquote>
<p>As you can see, I have Heartbeat managing two DRBD volumes (&#8216;shared&#8217; and &#8216;backups&#8217;), NFS and Samba, and one shared IP address.</p>
<p>To have DRBD and Heartbeat started at boot time, I added the following to &#8216;rc.local&#8217;:</p>
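<p>Since a mistyped script name in &#8216;haresources&#8217; only shows up at failover time, it can help to list the plain rc script names the file refers to and compare them with what is actually in /etc/rc.d. This is a rough sketch, not part of Heartbeat: the function name <code>plain_resources</code> is made up, and it assumes the simple layout shown above (one node name followed by resources, with backslash continuations).</p>

```shell
#!/bin/sh
# Hypothetical helper: print every resource in a haresources file that
# has no '::' arguments, i.e. the plain rc script names.
# $1 is the path to the haresources file.
plain_resources() {
    # Join backslash-continued lines, skip comment lines, drop the node
    # name (first field) and print the fields that contain no '::'.
    sed -e ':a' -e '/\\$/N; s/\\\n/ /; ta' "$1" |
    awk '!/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i !~ /::/) print $i }'
}
```

<p>For the file above this prints rc.samba-hb, rc.rpc-hb and rc.nfsd-hb, one per line. Each of those names should exist as an executable script in /etc/rc.d. Note that a bare IP address, which Heartbeat also accepts as a resource, would show up in this list too.</p>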
<blockquote><p><code>/etc/rc.d/drbd start<br />
/etc/rc.d/heartbeat start</code></p></blockquote>
<p>And to stop them at reboot, I added this to &#8216;rc.local_shutdown&#8217;:</p>
<blockquote><p><code>/etc/rc.d/heartbeat stop<br />
/etc/rc.d/drbd stop</code></p></blockquote>
<p>Also, don&#8217;t forget to move /var/lib/nfs to your DRBD volume for proper locking, and alter your rc.rpc-hb to add a cluster name to the startup of statd. More background info on this can be found at <a href="http://www.linux-ha.org/DRBD/NFS">Linux-HA</a> (steps 4e and 4f).</p>
<p>Now for all this to work properly, there is one more thing, which I found out the hard way <img src='http://hans.rakers.org/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' />  When Heartbeat releases its resources, it stops all services mentioned in &#8216;haresources&#8217; by calling their related rc scripts with &#8216;stop&#8217;. I ran into some strange failover behaviour, and found the following in my logs:</p>
<blockquote><p><code>heartbeat: 2007/10/01_13:16:42 info: Running /etc/rc.d/rc.samba  stop<br />
heartbeat: 2007/10/01_13:16:42 ERROR: Return code 1 from /etc/rc.d/rc.samba<br />
heartbeat: 2007/10/01_13:16:42 ERROR: Resource script for rc.samba probably not LSB-compliant.<br />
heartbeat: 2007/10/01_13:16:42 WARN: it (rc.samba) MUST succeed on a stop when already stopped<br />
heartbeat: 2007/10/01_13:16:42 WARN: Machine reboot narrowly avoided!<br />
</code></p></blockquote>
<p>Apparently, Heartbeat will sometimes call an rc script with &#8216;stop&#8217; while the service is already in a stopped state. Now, looking at our rc.samba-hb, we see the following stop function:</p>
<blockquote><p><code>samba_stop() {<br />
killall smbd nmbd<br />
}<br />
</code></p></blockquote>
<p>This call to killall returns a non-zero exit code when there are no processes to kill, which in turn makes the rc script exit with a non-zero exit code. This makes Heartbeat think something failed, resulting in the above error messages. The fix for this is rather simple, though maybe a bit hackish. Change the samba_stop function by adding an &#8216;exit 0&#8217;:</p>
<blockquote><p><code>samba_stop() {<br />
killall smbd nmbd<br />
exit 0<br />
}<br />
</code></p></blockquote>
<p>This should make Heartbeat happy. There are probably other rc scripts around that do not comply with this requirement, so check your startup scripts.</p>
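<p>To find such scripts before Heartbeat does, one option is to call &#8216;stop&#8217; twice in a row and look at the exit code of the second call. The sketch below is only an illustration and both function names are my own. It also shows a variant of samba_stop that ignores the exit code of killall without leaving the whole script via &#8216;exit&#8217;; like the &#8216;exit 0&#8217; fix, it hides every killall failure, not just the &#8220;nothing to kill&#8221; case.</p>

```shell
#!/bin/sh
# Variant of the stop function above: always report success, but return
# to the caller instead of exiting the script.
samba_stop() {
    killall smbd nmbd 2>/dev/null || true
}

# Hypothetical checker: run "<script> stop" twice. The second call hits
# an already stopped service, which is the case Heartbeat complains about.
# Warning: this really stops the service, so only run it on a node that
# is not serving clients.
stop_is_idempotent() {
    script=$1
    sh "$script" stop >/dev/null 2>&1
    if sh "$script" stop >/dev/null 2>&1; then
        echo "ok: $script"
    else
        echo "FAIL: $script returns non-zero on a repeated stop"
        return 1
    fi
}
```

<p>For example, <code>stop_is_idempotent /etc/rc.d/rc.samba-hb</code> on the standby node should print &#8220;ok&#8221; once the stop function has been fixed.</p>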
<p>Finally, watch out with DRBD 0.7.24 on a system with a kernel &gt;= 2.6.22. I still use the DRBD 0.7 branch, and when I deployed 0.7.24 on kernel 2.6.22.9, I ran into a whole bunch of trouble. The load average would suddenly spike enormously, and my systems became unresponsive to shutdown or reboot commands. I found a <a href="http://www.gossamer-threads.com/lists/drbd/users/12598">drbd-user mailing list posting from someone with similar issues</a>. Apparently it&#8217;s a known issue with DRBD 0.7.24 on kernel &gt;= 2.6.22 and XFS, which is fixed only in subversion!<br />
After fetching and installing the latest subversion revision of the drbd-0.7 branch as described <a href="http://www.drbd.org/latest.html">here</a>, the problem was solved.</p>
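<p>If you want a node to warn about this combination by itself, a small version check is enough. This is a sketch under a few assumptions: the function names are made up, <code>sort -V</code> requires a reasonably recent GNU coreutils, and the check only looks at the release number 0.7.24, so it cannot tell a stock 0.7.24 from a build that already contains the subversion fix.</p>

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check: $1 is the kernel version, $2 the DRBD version.
# Prints a warning for the combination described above.
drbd_xfs_warning() {
    kernel=$1
    drbd=$2
    if version_ge "$kernel" 2.6.22 && [ "$drbd" = "0.7.24" ]; then
        echo "warning: DRBD $drbd on kernel $kernel may hang with XFS"
    fi
}
```

<p>A call could look like <code>drbd_xfs_warning "$(uname -r)" 0.7.24</code>, with the DRBD version filled in by hand or taken from the version line that /proc/drbd normally shows.</p>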
]]></content:encoded>
			<wfw:commentRss>http://hans.rakers.org/2007/10/issues-regarding-ha-nfs-with-heartbeat-and-drbd-on-slackware-120/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

