IDE RAID - OpenBSD 3.3 current




Introduction

[ref: raid(4), raidctl(8)]
[ref: Andreas F. Bobak's "Installing OpenBSD 3.3 on a Soft-RAID Array"]

These notes expand on the configuration of the RAIDframe disk driver [raid(4)] described in Andreas F. Bobak's mini-howto. There is also another good howto by Sven Kirmess, which I discovered after first writing these notes.

The main purpose of these notes is to record the process as I perform it on a machine where I am setting up RAIDframe on IDE drives. That is to say, I want to be able to rebuild the system if it dies horribly one day, or if I get paid to put another one together.

These notes take you through the major phases of preparing and creating a RAIDframe-enabled system using standard IDE drives.

Testing Environment: OpenBSD 3.3-current (2003-07-09), Pentium 4 motherboard with onboard IDE RAID controller (ASUS P4B533)

Retrieving Current

The source is required for building RAID support into the kernel. We will be building the RAID configuration from source, so the first part of these instructions covers acquiring the source and building a release. Maybe one day this can be expanded and put into its own walk-through. The 'release' build is primarily to get the result into a state similar to the distribution binaries.

The primary reason for generating a binary distribution is so that the rest of the process also works with the available binary distributions (i.e. buy the CD or download snapshots).

If you will be using the release source code available on CD to create your custom kernel, you can skip the source retrieval section.

For my configuration, it is faster to pull the source down at a non-local site and compress the files for transfer to the local site. If you have a fast enough pipe, or can get the source pre-compressed, please skip this stage.

Using anoncvs, get the current source of OpenBSD. Doing it this way is not necessary; this is just a note of how I did it.

cd /some/location

export CVSROOT=:pserver:(anoncvs site of your choice)

cvs login

cvs get src

cvs get XF4

cvs get ports

Compressing current so I can get it faster down the pipe to my work machines.

gtar -cjf src.tbz2 src/

gtar -cjf XF4.tbz2 XF4/

gtar -cjf ports.tbz2 ports/

Bring it across the wire on ftp or sftp (which is apparently better for these things than scp)

Decompressing on local system

cd /usr

gtar -xjf /path-to-file/src.tbz2

gtar -xjf /path-to-file/ports.tbz2

gtar -xjf /path-to-file/XF4.tbz2
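
Since the archives cross the wire before being unpacked, it may be worth confirming they arrived intact. The following is a minimal sketch and not part of the procedure above: it uses the standard cksum utility, and the helper names write_manifest and verify_manifest, as well as the manifest file name in the usage note, are assumptions made for illustration.

```sh
#!/bin/sh
# Sketch: record and verify checksums for the transfer archives.
# write_manifest and verify_manifest are hypothetical helper names.

# write_manifest MANIFEST FILE...
# Store the cksum output (CRC, size, name) of each file.
write_manifest() {
    manifest=$1
    shift
    cksum "$@" > "$manifest"
}

# verify_manifest MANIFEST
# Recompute the checksum of every listed file; succeed only if all match.
# File names are resolved relative to the current directory.
verify_manifest() {
    manifest=$1
    while read -r crc size name; do
        actual=$(cksum "$name") || return 1
        if [ "$actual" != "$crc $size $name" ]; then
            echo "MISMATCH: $name" >&2
            return 1
        fi
    done < "$manifest"
}
```

On the remote site this would be run as "write_manifest archives.cksum src.tbz2 XF4.tbz2 ports.tbz2" next to the archives, and on the local site as "verify_manifest archives.cksum" in the directory the files were transferred to.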

Create Kernel with RAIDframe support

[ref: FAQ et. al.]

There is a good amount of documentation already available on customising the kernel, so we are just quick-stepping through the bits you have to do. For an explanation of what we are doing here, please refer to the FAQ and other documentation.

# cd /usr/src/sys/arch/$ARCH/conf
# cp GENERIC RAIDKERN

modify the RAIDKERN file to include the following options

option RAID_AUTOCONFIG
pseudo-device raid 4 # RAIDframe disk driver

The primary modifications are the inclusion of the pseudo-device and RAID_AUTOCONFIG. For completeness, I also included the following in my kernel.

option BUFCACHEPERCENT=30
option NMBCLUSTERS=8192
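
If you would rather not hand-edit the config file, the two required lines can be appended by a small script. This is only a sketch under the assumption that appending at the end of a copy of GENERIC is acceptable to config; make_raid_config is a hypothetical helper name, not something from the FAQ.

```sh
#!/bin/sh
# Sketch: derive a RAID kernel config from GENERIC without hand editing.
# make_raid_config is a hypothetical helper name.

# make_raid_config GENERIC_FILE OUTPUT_FILE
make_raid_config() {
    cp "$1" "$2" || return 1
    # Append each line only if an active (uncommented) copy is not present.
    grep -q '^option[[:space:]][[:space:]]*RAID_AUTOCONFIG' "$2" ||
        echo 'option RAID_AUTOCONFIG' >> "$2"
    grep -q '^pseudo-device[[:space:]][[:space:]]*raid' "$2" ||
        echo 'pseudo-device raid 4 # RAIDframe disk driver' >> "$2"
}
```

From the conf directory this would be called as "make_raid_config GENERIC RAIDKERN". Running it again on its own output adds nothing, so it is safe to repeat.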

continue customisation by doing the following:

# config RAIDKERN
# cd ../compile/RAIDKERN
# make depend && make

On my system, the build machine is the same arch as the proposed RAID machine, so I can test the kernel, and use it for building binaries by doing the following.

# cp /bsd /bsd.old
# cp bsd /bsd
# cp bsd /bsd.raid

Restart the machine to ensure the kernel build worked.

Building the release/distribution Binaries

[ref: (jross_at_openvistas.net post ), release(8), http://www.geocities.com/easybakeoven88/release.html]

To ensure that the binaries are in synch with the RAID kernel, we now need to build binaries 'current' with the 'current' kernel created above.

Create the script mybuild.sh, inspired by the posting noted above.

#!/bin/sh
#
# OpenBSD - Release Building Shell Script v2.4
# Created by FenderQ - November 9 2002
# 
# SEE ALSO release(8)
# 

DEST="/usr/new-root"
RELEASE="/usr/new-release"

build_kernel() {
    echo "********** Build and install a new kernel **********"
    cd /usr/src/sys/arch/i386/conf
    config GENERIC
    cd ../compile/GENERIC
    make clean depend bsd
    cp /bsd /bsd.old && cp bsd / && chown root:wheel /bsd
}

build_system() {
    echo "********** Build a new system **********"
    rm -rf /usr/obj/*
    cd /usr/src && nice make obj
    nice make build
    cd /dev && ./MAKEDEV all
}

make_release() {
    echo "********** Make and validate the system release **********"
    cd /usr/src/distrib/crunch && make obj depend all install
    export DESTDIR=$DEST RELEASEDIR=$RELEASE
    rm -rf $DESTDIR
    mkdir -p $DESTDIR $RELEASEDIR
    cd /usr/src/etc && nice make release
    cd /usr/src/distrib/sets && sh checkflist
    unset DESTDIR RELEASEDIR
}

build_XF4() {
    echo "********** Build and install XF4 **********"
    rm -rf /usr/Xbuild
    mkdir -p /usr/Xbuild
    cd /usr/ports/lang/tcl/8.3 && make install
    cd /usr/ports/x11/tk/8.3 && make install
    cd /usr/Xbuild && lndir /usr/XF4 && nice make build
}

make_release_XF4() {
    echo "********** Make and validate the XF4 release **********"
    export DESTDIR=$DEST RELEASEDIR=$RELEASE
    rm -rf $DESTDIR
    mkdir -p $DESTDIR $RELEASEDIR
    # Run from the XF4 build tree that build_XF4 populated.
    cd /usr/Xbuild && nice make release
    unset DESTDIR RELEASEDIR
}

clean_everything() {
    echo "********** Clean everything **********"
    rm -rf /usr/obj/* $DEST $RELEASE
}

usage() {
    echo "Usage: $0 options" 
    echo 
    echo "Options:"
    echo 
    echo "  kernel             - Build and install GENERIC kernel"
    echo "  system             - Build a new system"
    echo "  release            - Make and validate the system release"
    echo "  xwindow            - Build and install XF4"
    echo "  xwindow-release    - Make and validate the XF4 release"
    echo "  clean              - Clean everything"
    echo
}

if [ `whoami` != "root" ]; then
    echo "You probably should be root instead of `whoami` to run this safely." 
    exit 1
fi

START=`date`
echo
echo "***** OpenBSD - Release Building *****"
echo
echo "Dest: $DEST"
echo "Release: $RELEASE"
echo "CVS Revision Tag: $CVSTAG"
echo "CVS Server: $CVSROOT"
echo

if [ $# = 0 ]; then usage; exit 1; fi

for i in $*
    do
    case $i in
        kernel)
            build_kernel
            ;;
        system)
            build_system
            ;;
        release)
            make_release
            ;;
        xwindow)
            build_XF4
            ;;
        xwindow-release)
            make_release_XF4
            ;;
        clean)
            clean_everything
            ;;
        *)
            echo "********** Abort! Abort! **********"
            echo "Invalid option encountered: $i"
            echo "Exiting......."
            echo
            exit 1
            ;; 
    esac
    done
    
echo
echo "Start Time  : $START"
echo "Finish Time : `date`"
echo
	

Notes:

/usr is an 80GB Hard-drive with 60GB free space on this test configuration (i.e. make sure you have enough space on the partition you intend to place the files onto)

Depending on how sluggish your system is, this can be speedy or it's time to weed the garden. You could of course just key in the commands above manually without resorting to the script.

After everything is completed, we should have a release snapshot with the GENERIC kernel in /usr/new-release (the script's $RELEASE, called $MYRELEASEDIR below).
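
Because the build needs plenty of room, a quick free-space check before starting can save a wasted afternoon. This is a minimal sketch using "df -Pk"; the helper names free_kb and has_space are assumptions, and the threshold in the usage note is an arbitrary placeholder rather than a measured requirement.

```sh
#!/bin/sh
# Sketch: check free space before a long build.
# free_kb and has_space are hypothetical helper names.

# free_kb DIR: print the free space, in kilobytes, of the filesystem holding DIR.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_space DIR NEED_KB: succeed if DIR has at least NEED_KB kilobytes free.
has_space() {
    avail=$(free_kb "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

For example, "has_space /usr 10485760 || echo 'less than 10GB free on /usr'" would warn before the script is started.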

Create minimal RAIDframe install set

We will create a new tarball "site33.tgz" to be our minimal install set as the completion of this section of the guide.

To save on some keyboard retyping, I'm setting up some environment variables that will be used later in the process (i.e. if you use different directories it'll be easier to make the change here.)

# export MYRELEASEDIR=/usr/new-release
# export MYBUILDDIR=/usr/new-minimal-package
# export MYRAIDRELEASE=/usr/home/ftp/pub/OpenBSD/3.3/i386
# export MYRAIDKERNEL=/usr/src/sys/arch/i386/compile/RAIDKERN/bsd

MYRELEASEDIR is where the snapshot, cd-release, or src-build release is located.

MYBUILDDIR is the empty location where we are going to build our minimal release.

MYRAIDRELEASE is where we are going to put the tgz/compressed release build for ftp or cd-writing.

MYRAIDKERNEL is the location of the RAID-enabled kernel.

Now, on to the extraction process. We will:

Create Build Directory

# mkdir -p $MYBUILDDIR
# cd $MYBUILDDIR

Extract Files for the Build

# tar xvzfp $MYRELEASEDIR/etc33.tgz './etc/*'
# tar xvzfp $MYRELEASEDIR/base33.tgz \
'./bin/*' '*/ex' '*/MAKEDEV' '*zoneinfo*' '*/vi' '*/raidc*' \
'*/find' ./usr/bin/reset ./usr/bin/tset '*/ld.so' '*/libcurses.so*' '*/libc.so*' \
'*/libterm*' '*/termcap*' '*/terminfo*' \
'*/chroot' '*libexec/getty' '*/mtree' '*/slog*' '*bin/ssh*' \
'*/mdec/*' './var/*' '*/encrypt' '*/pwd_mkdb*' ./sbin/ancontrol ./sbin/chown ./sbin/dhclient \
./sbin/dhclient-script ./sbin/disklabel ./sbin/dmesg ./sbin/fdisk './sbin/fsck*' \
./sbin/halt ./sbin/ifconfig ./sbin/init ./sbin/kbd ./sbin/mkfifo ./sbin/mknod \
'./sbin/mount*' ./sbin/newfs ./sbin/ping '*/libcrypto.so*' '*/libkrb5.so*' '*/crontab' \
'*/egrep' '*/fgrep' '*/compress' '*bin/ftp' '*/grep' '*/gunzip' '*/gzcat' '*/gzip' '*/less' '*/more' \
'*/rsh' '*/sed' '*bin/sysc*' './sbin/route' './sbin/ttyflags' './sbin/swapctl'

We're splitting up the extraction process so we can make some notes on what is being pulled out here (and because my ssh sessions weren't handling unlimited command-line pastes, duh).

# tar xvzfp $MYRELEASEDIR/base33.tgz \
'./usr/sbin/kvm_mkdb' './usr/sbin/dev_mkdb' './usr/bin/mktemp' './usr/bin/install' \
'./usr/sbin/syslogd' './usr/bin/wc' './sbin/savecore' \
'./sbin/quotacheck' './usr/sbin/quotaoff' './usr/sbin/quotaon' \
'./usr/bin/cmp' './sbin/ccdconfig' './usr/sbin/inetd' '*/libz.so*' './usr/sbin/cron' \
'*/libwrap.so*' './usr/lib/libutil.so*' './usr/lib/libkrb5.so*' \
'./usr/lib/libas*' './usr/lib/libcom_err.so*' './usr/lib/libdes.so*' \
'./usr/libexec/auth/login_*' './usr/bin/login' '*/libedit.*' './sbin/umount'

We are adding the following so we can have more complete ssh support in the minimal environment (i.e. support for scp and sftp). This is very useful when you forget to include something, but you can just leave it out.

# tar xvzfp $MYRELEASEDIR/base33.tgz \
'./usr/bin/scp' './usr/libexec/sftp-server' './usr/bin/ftp'

Tools that I soon found were necessary, at least during my testing period:

# tar xvzfp $MYRELEASEDIR/base33.tgz \
'./usr/bin/which' './usr/bin/touch' './usr/bin/logname' './usr/bin/whoami' \
'./usr/bin/du'
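
With this many patterns it is easy for one of them to match nothing without anyone noticing. As a rough check, the build directory can be tested for the files you expect to be there. This is a minimal sketch; missing_files is a hypothetical helper name, and the paths in the usage note are just a few taken from the extraction lists above.

```sh
#!/bin/sh
# Sketch: report expected files that did not make it into the build tree.
# missing_files is a hypothetical helper name.

# missing_files ROOT PATH...
# Print each PATH that does not exist under ROOT; fail if any are missing.
missing_files() {
    root=$1
    shift
    rc=0
    for f in "$@"; do
        if [ ! -e "$root/$f" ]; then
            echo "$f"
            rc=1
        fi
    done
    return $rc
}
```

For example: "missing_files $MYBUILDDIR sbin/init sbin/disklabel usr/bin/scp usr/bin/which". Anything printed needs another look at the matching tar pattern.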

Copy the standard 'GENERIC' kernel as bsd.old into the working directory and copy our raid enabled kernel into the current/build directory.

# cp $MYRELEASEDIR/bsd ./bsd.old
# cp $MYRAIDKERNEL ./bsd

Make some directories required during the boot process

For the minimal install we can get rid of some large directories in ./var such as www. We will also need to create some directories that are 'accessed' during full startup by the ./etc/rc script.

# mkdir ./usr/local
# rm -rf ./var/www
# mkdir ./tmp
# mkdir ./root

Create customisation files

Create a dummy ./etc/rc.conf.local

#!/bin/sh -
#
sendmail_flags=NO
check_quotas=NO   # NO may be desirable in some YP environments
ntpd=NO           # run ntpd if it exists

Just ensuring a few things are turned off in our environment.

Now we'll make a few basic login environment files for root. I'm not too good at this stuff, so I have copied others' work.

Create ./root/.profile with the following preliminary contents

# $OpenBSD: dot.profile,v 1.3 2003/03/20 01:43:31 david Exp $
#
# sh/ksh initialization

PATH=/sbin:/usr/sbin:/bin:/usr/bin
export PATH
HOME=/root
export HOME
umask 022

alias ll='ls -al'
export TERM=vt220

if [ -x /usr/bin/tset ]; then
     eval `/usr/bin/tset -sQ \?$TERM`
fi

Create ./root/.login with the following preliminary contents

# $OpenBSD: dot.login,v 1.9 2003/03/20 01:43:31 david Exp $
#
# csh login file

set tterm='?'$TERM
set noglob
onintr finish
eval `tset -s -Q $tterm`
finish:
unset noglob
unset tterm
onintr

if ( `logname` == `whoami` ) then
      echo "Don't login as root, use su"
endif

Create ./root/raid0.conf.new with the following preliminary contents

START array
# numRow numCol numSpare
1 2 0

START disks
/dev/wd0d
/dev/wd1d

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
128 1 1 1

START queue
fifo 100

[ref: raidctl(8)]

The raid configuration file is well documented in the raidctl(8) man page. Take special note of the devices wd0d and wd1d. In my configuration, these are the partitions on which we will install our raid system. You can change them now or later, as fits your configuration.

The file will be used in two parts of the installation: first to create the raid configuration, and second by the system to 'auto-detect' the raid-enabled system.
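
If the component names differ on your machine, the same file can be generated rather than edited. The sketch below simply reproduces the layout shown above with the component list filled in from its arguments; gen_raid_conf is a hypothetical helper name, and only the two-component case matches what was tested in these notes.

```sh
#!/bin/sh
# Sketch: print a RAID 1 configuration like the one above.
# gen_raid_conf is a hypothetical helper name.

# gen_raid_conf COMPONENT...
# One row, no spares, one column per component given on the command line.
gen_raid_conf() {
    echo "START array"
    echo "# numRow numCol numSpare"
    echo "1 $# 0"
    echo
    echo "START disks"
    for dev in "$@"; do
        echo "$dev"
    done
    echo
    echo "START layout"
    echo "# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1"
    echo "128 1 1 1"
    echo
    echo "START queue"
    echo "fifo 100"
}
```

For example: "gen_raid_conf /dev/wd0d /dev/wd1d > ./root/raid0.conf.new".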

If you hate superfluous error messages then you can also perform the following 'optional' instructions, to 'create' dummy files:

touch ./var/log/messages
touch ./var/log/authlog
touch ./var/log/secure
touch ./var/cron/log
touch ./var/log/daemon
touch ./var/log/xferlog
touch ./var/log/lpd-errors
touch ./var/log/maillog

Create the minimal install tarball

# tar -czf $MYRAIDRELEASE/site33.tgz *

We are now ready to ftp-install or create a cdr with our minimal RAID enabled kernel & short-list binaries.
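
Before carrying the tarball to the new machine, it may be worth listing it to confirm the pieces the later steps depend on are really inside. This is a minimal sketch; set_missing is a hypothetical helper name, and it assumes a tar that accepts -tzf.

```sh
#!/bin/sh
# Sketch: confirm that named entries are present in an install set.
# set_missing is a hypothetical helper name.

# set_missing TARBALL NAME...
# Print each NAME that is not an entry in TARBALL; fail if any are missing.
# A leading "./" and a trailing "/" on archive entries are ignored.
set_missing() {
    tarball=$1
    shift
    listing=$(tar -tzf "$tarball" | sed -e 's|^\./||' -e 's|/$||')
    rc=0
    for name in "$@"; do
        if ! printf '%s\n' "$listing" | grep -qxF -- "$name"; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}
```

For example: "set_missing $MYRAIDRELEASE/site33.tgz bsd bsd.old etc/rc.conf.local root/raid0.conf.new". If tar cannot read the file at all, every name is reported as missing.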

To complete preparations for an ftp install, we'd probably need to copy the release files onto the MYRAIDRELEASE directory:

# cp $MYRELEASEDIR/* $MYRAIDRELEASE/

For a boot media you can either create a boot floppy, or burn the contents of $MYRAIDRELEASE onto a CDR. To create a boot floppy, use something like the below.

# dd if=$MYRAIDRELEASE/floppy33.fs of=/dev/fd0c

To burn a CDR, make sure you use $MYRAIDRELEASE/cdrom33.fs as the boot image and include your standard install binaries from $MYRELEASEDIR and the raid-enabled tarball from $MYRAIDRELEASE.

Machine Install Phase 1 - Installing the minimal pre-raided configuration

Installing RAID is a two-phase process: first we build a minimal system that can start the RAIDframe disk driver, and then we build the RAID-enabled system on top of that.

On the new machine, we assume you have two or more identical drives for RAID connected to an IDE RAID controller. In our example we install and configure on the 1st (primary) drive of this configuration; the other drives in the set will be created from this install drive.

My two drives are wd0 and wd1.

Partitioning the Hard-Drive

When you reach the disk partitioning step, create a small (50M) 'a' partition and a similarly small (32M) 'b' swap partition. Specify 'd' as type "RAID".

Partition  FS-TYPE  Intended Mount Point  Size
wd0a       4.2BSD   /                     50M
wd0b       swap     swap                  32M
wd0c       unused   [reserved]
wd0d       RAID                           *

For example:

> a a
offset: [63]
size: [ ... ] 50m
Rounding to nearest cylinder: xxxxx
FS type: [4.2BSD]
mount point: [none] /
> a b
offset: [ ... ]
size: [ ... ] 32m
Rounding to nearest cylinder: xxxxx
FS type: [swap]
> a d
offset: [ ... ]
size: [ ... ]
FS type: [4.2BSD] RAID
mount point: [none]
> w
> q

Remember that in our example we are not touching the other drive(s).

Files to install

When you get the list of file sets to install, specify only the minimal set we created above (site33.tgz).

Do not select any other set. Remember that we are only using a 50M '/' partition in our example.

After the installation is complete, restart to make sure the installed system actually works.

If the installation is functional, you should be able to SSH into the new machine which will also let you upload any other (read: missing) binaries required for your installation.

Image partition information onto 2nd Drive

With a working, minimal system we now need to configure the 2nd drive.

We are going to make a 'copy' of the disklabel information from our primary drive and modify it for use on our 2nd drive.

# disklabel wd0 > /root/disklabel.wd0
# cp /root/disklabel.wd0 /root/disklabel.wd1
# vi /root/disklabel.wd1

Edit the file /root/disklabel.wd1 and change the entry that showed:

# /dev/rwd0c:

To now show as:

# /dev/rwd1c:
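
The same edit can be made without opening vi. This is a minimal sketch using sed; retarget_label is a hypothetical helper name, and it touches nothing except the header comment line shown above.

```sh
#!/bin/sh
# Sketch: rewrite the device named in a saved disklabel's header comment.
# retarget_label is a hypothetical helper name.

# retarget_label FROM TO < old-label > new-label
# Change the "# /dev/rFROMc:" header line to name disk TO instead.
retarget_label() {
    sed "s|^# /dev/r${1}c:|# /dev/r${2}c:|"
}
```

For example: "retarget_label wd0 wd1 < /root/disklabel.wd0 > /root/disklabel.wd1", which replaces the cp and vi steps.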

To copy the disklabel onto the 2nd drive, use the commands below to initialise a new Master Boot Record (MBR) on the drive and then write the new partitioning information.

# fdisk -i wd1
# disklabel -R -r wd1 /root/disklabel.wd1

Our 2nd drive is now 'partitioned' and ready for use.

Create the filesystem on the '/' partition by doing the following

# newfs /dev/wd1a

Copy the root partition onto the 2nd drive:

Before we make the 2nd drive bootable, we'll need to ensure all files from the 1st drive are copied across. Let's make a mount point for the 2nd drive and mount it.

# mkdir /mnt
# mount /dev/wd1a /mnt

Now, we need to copy all our files across from wd0a to wd1a

# cd /
# pax -r -w -p e -v bin boot bsd dev etc root sbin tmp usr var /mnt/

Make the 2nd drive bootable

Now, we'll copy the boot files across and make the 2nd drive bootable

# cp /usr/mdec/boot /mnt/boot
# /usr/mdec/installboot -v /mnt/boot /usr/mdec/biosboot wd1

Depending on your system, you may or may not need to perform the below configuration. Fix the fstab on the 2nd drive by editing /mnt/etc/fstab to modify the line referring to

/dev/wd0a / ffs rw 1 1

to now refer to:

/dev/wd1a / ffs rw 1 1

On my test configuration, the above change works fine so long as there is a detected and active wd0. If wd0 has failed completely, wd1 will start as wd0 even though it is on a 2nd controller (i.e. fstab needs to remain as in the original). Oh well, easier to edit in single-user mode.

Power Recycle Testing

Test #1:

We now test to see whether we have configured the 2nd drive correctly. Use "halt" to stop the machine and when you restart, specify the 2nd drive as the boot drive:

boot > boot hd1a:/bsd

Watch the process: the system should boot from the 2nd drive; if it cannot, it may automatically boot from the 'primary' drive instead.

Once this is working flawlessly, we are now assured that the boot configuration is working and can continue with the 2nd test.

Test #2:

The 2nd test will be a physical powerdown/ drive disconnection.

Use "halt -p" to powerdown the system and after the system has halted, physically disconnect the primary hard-drive (wd0) and restart the machine to test whether the 2nd drive will start correctly.

Creating the RAID Array

[ref: raidctl(8)]

During our minimal configuration we created a ./root/raid0.conf.new file, which we now use to configure the raid array.

raidctl -C /root/raid0.conf.new raid0

Initialize the component labels with an ID. We are arbitrarily picking an easy number '100'

raidctl -I 100 raid0

Initialize the parity set

raidctl -iv raid0

If you make a mistake above, you can restart the process by "undoing" raid with the command "raidctl -u raid0"
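
Since the order of these three commands matters, it can help to print them for review before typing anything. The sketch below only echoes the commands used above and runs nothing; raid_create_plan is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: print the array creation commands in order, without running them.
# raid_create_plan is a hypothetical helper name.

# raid_create_plan CONF DEV SERIAL
raid_create_plan() {
    echo "raidctl -C $1 $2"    # configure the array from the config file
    echo "raidctl -I $3 $2"    # initialize the component labels with an ID
    echo "raidctl -iv $2"      # initialize the parity set, verbosely
}
```

For example, "raid_create_plan /root/raid0.conf.new raid0 100" prints exactly the three commands shown in this section.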

From the manpage:

-i dev Initialize the RAID device. In particular, (rewrite) the parity on the selected device. This MUST be done for all RAID sets before the RAID device is labeled and before file systems are created on the RAID device.

Now, at this point you will be probably better off going and making dinner, possibly eating it as well. On this test system it took about 2 hours to get past this stage. Strangely enough, the 2nd time around (i.e. I screwed up a few things and restarted often) this process only took about 30 minutes (?)

Partitioning your RAID configuration

Once completed with the initial configuration of the raid0 environment, we create the partitions using disklabel.

disklabel -E raid0

Create your partition information as you would a normal drive (except mount points are not yet associated,) so keep a list of your partition 'letters'. As per standard configuration 'b' is the swap partition, 'c' is reserved.

For my configuration, and for this documentation, I have created the following partitions, all of FS-TYPE 4.2BSD (except swap, of course).

Partition  Intended Mount Point  Size
raid0a     /
raid0b     swap
raid0c     [reserved]
raid0d     /tmp
raid0e     /var
raid0f     /usr
raid0h     /home

There's an interesting note about 'swap' that we'll discover later. If you can, avoid putting 'swap' on the 'raid' system. [Oops, I have to mention that I don't have much history on this, so make your own call.]

Format the filesystem(s) you have created:

# newfs raid0a; newfs raid0d; newfs raid0e; newfs raid0f; newfs raid0h

To continue the configuration, we turn on auto-configuration for raid0. Making it bootable/rootable will be done later.

raidctl -A yes raid0

Installing OpenBSD current on the raided array

We are now ready to install our OpenBSD binaries onto the raided environment by mounting the raid partitions, extracting the release files and configuring the system.

Mount raided partitions

To install the binaries we'll create mount points, and mount the raid partitions for installation of software. We'll put '/' on /mnt and subsequent subpartitions under that.

mount /dev/raid0a /mnt
cd /mnt

Within the new 'root' environment (/mnt) create the mount points for the other partitions in your configuration.

mkdir ./usr
mkdir ./tmp
mkdir ./var
mkdir ./home

Now we can mount the partitions as per our desired configuration, noted above.

mount /dev/raid0d /mnt/tmp
mount /dev/raid0e /mnt/var
mount /dev/raid0f /mnt/usr
mount /dev/raid0h /mnt/home

Extract Release Files

Now on this configuration, I've made ./tmp a 2GB partition so we can ftp the binaries onto tmp before installation (or you can of course just grab the files from a mounted CDR.) Install the binaries by going into the new 'root' and untar'ing the files

cd /mnt
tar -zxvpf ./tmp/etc33.tgz
tar -zxvpf ./tmp/base33.tgz
tar -zxvpf ./tmp/comp33.tgz
tar -zxvpf ./tmp/man33.tgz

Of course, you can also install the rest of your binaries:

cd /mnt
tar -zxvpf ./tmp/game33.tgz
tar -zxvpf ./tmp/misc33.tgz

I haven't compiled the X environment yet, which is fine since my new raid box is going to be faster than my build machine.

Configure System

Now we have a binary-complete RAID configuration, except that it doesn't have any of the system configuration we got from doing the proper install onto the hard disk, so we need to copy those files across onto the raid0 configuration. In this case we are concerned with at least the /dev directory, the /boot file and the bsd* kernel files, as well as a number of configuration files.

cd /
pax -r -w -p e -v dev boot bsd* /mnt/
cd /etc
cp -p fstab my* hostname.* hosts resolv.conf dhc* /mnt/etc/

Edit the /mnt/etc/fstab file to refer to the 'correct' configuration, by changing the line that referred to:

/dev/wd0a / ffs rw 1 1

To now refer to our new slicing:

/dev/raid0a /     ffs rw 1 1
/dev/raid0d /tmp  ffs rw 1 1
/dev/raid0e /var  ffs rw 1 1
/dev/raid0f /usr  ffs rw 1 1
/dev/raid0h /home ffs rw 1 1
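
If your partition letters differ from mine, the fstab lines can be generated from a short list instead of typed. This is a minimal sketch; gen_fstab is a hypothetical helper name, and it emits the same fields as the entries above, separated by single spaces.

```sh
#!/bin/sh
# Sketch: generate fstab lines from "letter mount-point" pairs.
# gen_fstab is a hypothetical helper name.

# gen_fstab DEVICE < pairs
# Read "partition-letter mount-point" pairs on stdin, one per line, and
# print one fstab line per pair. Blank lines are skipped.
gen_fstab() {
    while read -r letter mnt; do
        [ -n "$letter" ] || continue
        printf '/dev/%s%s %s ffs rw 1 1\n' "$1" "$letter" "$mnt"
    done
}
```

For example, feeding it the lines "a /", "d /tmp", "e /var", "f /usr" and "h /home" with "gen_fstab raid0" produces the five entries shown above.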

Configure a new root password

After this, we need to set a root password for the raid0 environment by first chrooting into the new environment.

/mnt/usr/sbin/chroot /mnt
passwd
exit

Finalise RAIDframe Configuration

Now we are ready to change raid0 so that it is recognised as a root partition.

raidctl -A root raid0

This can be reversed by using

raidctl -A yes raid0

Tell the system that raid0 is available for auto-configuration by copying the config file into the /etc directory.

# cp /root/raid0.conf.new /etc/raid0.conf

This tells the minimal configuration (via /etc/rc) to boot raid0 if the kernel (in raid0:/) is available.

Reboot to test whether our hard work has paid off.

Test: are we really raided ?

So, how can you tell whether the machine has gone into raid0 ?

The most obvious is which password gets you into root. (That is, of course, if you've selected a different root password for the raid0 configuration from the wd0a configuration)

You can tell by seeing if your booted environment is the full install (i.e. just cat /etc/fstab would be a good start)

The fallback is to watch the startup process (check with dmesg) to see if something like the below has shown up.

Kernelized RAIDframe activated
  [.. stuff left out ..]
raid0 (root): (RAID Level 1) total number of sectors is 77987712 (38079 MB) as root
dkcsum: wd0 matched BIOS disk 80
dkcsum: wd1 matched BIOS disk 81
rootdev=0x1300 rrootdev=0x3600 rawdev=0x3602
dev = 0x1304, block = 68, fs = /var
panic: ffs_blkfree: freeing free frag
OpenBSD 3.3-current (RAIDKERN) #1: Fri Jul 11 12:03:08 TOT 2003
root@[build-machine-name]:/usr/src/sys/arch/i386/compile/RAIDKERN

When it works, remember that the system is not wd0a, but raid0a which means if you are using ssh to connect, the signature has changed. (this is good)
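
Another quick check is to ask which device is mounted on '/'. The sketch below parses mount output, assuming lines of the form "DEVICE on MOUNTPOINT type ..."; root_device is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: find the device mounted on "/" from mount-style output.
# root_device is a hypothetical helper name.

# root_device < mount-output
# Assumes lines of the form "DEVICE on MOUNTPOINT type ...".
root_device() {
    awk '$2 == "on" && $3 == "/" { print $1; exit }'
}
```

Running "mount | root_device" on the raided system should print /dev/raid0a rather than /dev/wd0a.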

Testing RAID configuration

To verify that the raid0 configuration is actually working, i.e. mirroring files in this example, we need to do some power-down, remove-drive testing.

Power Recycle Testing

1st. Power down the system and physically disconnect drive 1 (wd0)

Restart the machine to ensure the raid0 array is still the boot system, although performing from the 2nd drive.

At the root prompt verify the system is running by typing

# raidctl -s raid0

you should get a message similar to the below.

raid0 Components:
/dev/wd0d: optimal
component1: failed
No spares.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

Note that 'component1: failed' line above indicates a failure.
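
Rather than reading the status by eye each time, the failed lines can be picked out with awk. This is a minimal sketch that assumes the "name: status" layout shown above; failed_components is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: list failed components from "raidctl -s" output.
# failed_components is a hypothetical helper name.

# failed_components < status-output
# Print the name of every component whose status is "failed".
# Succeeds only if no component is failed.
failed_components() {
    awk -F': ' '$2 == "failed" { print $1; bad = 1 } END { exit bad }'
}
```

For example: "raidctl -s raid0 | failed_components || echo 'raid0 is degraded'".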

What I did next is probably not 'proper', I powered down the system again to reconnect the primary drive and disconnected the 2nd drive.

The system showed the same message as above (note the earlier remark about how this system designates the 1st 'active' drive it finds as wd0, regardless of controller).

Anyhow, reconnect both drives and restart the system.

Because of my smarty activities above, the system automatically complains about 'parity' and starts checking the system. At this point I go for lunch and come back afterwards to find an AOK, working system.

Remember how we put 'swap' into the raided environment? Well, there you go: the swap space was different between the two drives and forced a dirty-parity check.

Kernelized RAIDframe activated
   [ ... stuff left out ... ]
raid0 (root): (RAID Level 1) total number of sectors is 77987712 (38079 MB) as root
dkcsum: wd0 matched BIOS disk 80
dkcsum: wd1 matched BIOS disk 81
rootdev=0x1300 rrootdev=0x3600 rawdev=0x3602
WARNING: / was not properly unmounted

Hmmm. Because I walked away for a while, we really should check the status of the system with: raidctl -s raid0

raid0 Components:
/dev/wd0d: optimal
/dev/wd1d: optimal
No spares.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

OK, seems the thing figured itself out (although probably not the optimal way to do this.) I guess, in a manual sense I should have chosen to:

Verify the parity is ok using : raidctl -P raid0

-P dev
Check the status of the parity on the RAID set, and initialize (re-write) the parity if the parity is not known to be up-to-date. This is normally used after a system crash (and before a fsck(8)) to ensure the integrity of the parity.

If a new drive has been put in, to rebuild that drive from the working drive.

-R component dev
Fails the specified component, if necessary, and immediately begins a reconstruction back to component. This is useful for reconstructing back onto a component after it has been replaced following a failure.

For example, if drive 2 has failed we would recreate the partition information on drive 2 (as above) and then use the following command(?):

# raidctl -R /dev/wd1d raid0

Forced Fail

A test mentioned in the man page is to perform a forced 'Fail' (which just fails the raid0 component and does not reconstruct the drive as partitioned from the beginning). The command is supposed to begin reconstruction immediately.

-F component dev
Fails the specified component of the device, and immediately begin a reconstruction of the failed disk onto an available hot spare. This is one of the mechanisms used to start the reconstruction process if a component does have a hardware failure.

So, let's go for it. Better to find out now than later, when it doesn't work on a critical system.

# raidctl -F /dev/wd1d raid0

OK, if you try it on a two drive system, then you'll probably get a message like this:

raidctl -F /dev/wd1d raid0
/bsd: raid0: Failing disk r0 c1.
/bsd: raid0: Failing disk r0 c1.
/bsd: Unable to reconstruct disk at row 0 col 1 because no spares are available.
/bsd: Unable to reconstruct disk at row 0 col 1 because no spares are available.

Ooops, ok it seems we really needed a spare for that command to do both the fail and reconstruct.

Forced Reconstruction

So, I guess it's a good time to test whether the "-R" really does work:

# raidctl -R /dev/wd1d raid0
/bsd: Closing the opened device: /dev/wd1d
/bsd: Closing the opened device: /dev/wd1d
/bsd: About to (re-)open the device for rebuilding: /dev/wd1d
/bsd: About to (re-)open the device for rebuilding: /dev/wd1d
/bsd: RECON: Initiating in-place reconstruction on
/bsd: RECON: Initiating in-place reconstruction on
/bsd: row 0 col 1 -> spare at row 0 col 1.
/bsd: row 0 col 1 -> spare at row 0 col 1.
/bsd: Quiescence reached...
/bsd: Quiescence reached...

OK, something seems to be happening. To make sure we know what stage our 'reconstruction' is at, keep a watch on 'raidctl -s raid0' and you will get a progress report such as the following:

raid0 Components:
/dev/wd0d: optimal
/dev/wd1d: reconstructing
No spares.
Parity status: clean
Reconstruction is 9% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

The parts that should be changing are the status of /dev/wd1d and the Reconstruction percentage. raidctl performs the work in the background, and when it finally finishes it should give you a message similar to the below.

/bsd: Quiescence reached...
/bsd: Quiescence reached...
/bsd: Number of I/Os: 31
/bsd: Number of I/Os: 31
/bsd: Elapsed time (us): 1224582350
/bsd: Elapsed time (us): 1224582350
/bsd: User I/Os per second: 0
/bsd: User I/Os per second: 0
/bsd: Average user response time: 0 us
/bsd: Average user response time: 0 us
/bsd: Total sectors moved: 404
/bsd: Total sectors moved: 404
/bsd: Average access size (sect): 13
/bsd: Average access size (sect): 13
/bsd: Achieved data rate: 0.0 MB/sec
/bsd: Achieved data rate: 0.0 MB/sec
/bsd: Reconstruction of disk at row 0 col 1 completed.
/bsd: Reconstruction of disk at row 0 col 1 completed.
/bsd: Recon time was 1224.573513 seconds, accumulated XOR time was 0 us (0.000000).
/bsd: Recon time was 1224.573513 seconds, accumulated XOR time was 0 us (0.000000).
/bsd: (start time 1057885838 sec 269003 usec, end time 1057887062 sec 842516 usec)
/bsd: (start time 1057885838 sec 269003 usec, end time 1057887062 sec 842516 usec)
/bsd: Total head-sep stall count was 0.
/bsd: Total head-sep stall count was 0.
/bsd: RAIDframe: 1138971 recon event waits, 0 recon delays.
/bsd: RAIDframe: 1138971 recon event waits, 0 recon delays.
/bsd: RAIDframe: 10000 max exec ticks.
/bsd: RAIDframe: 10000 max exec ticks.
/bsd: RAIDframe: 10000 max exec ticks.
/bsd: RAIDframe: 10000 max exec ticks.
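
To follow the rebuild without reading the whole status each time, the percentage can be pulled out of the "raidctl -s" output. This is a minimal sketch that assumes the "Reconstruction is N% complete." wording shown above; recon_percent is a hypothetical helper name, and the polling loop in the comment is untested.

```sh
#!/bin/sh
# Sketch: extract the reconstruction percentage from "raidctl -s" output.
# recon_percent is a hypothetical helper name.

# recon_percent < status-output
# Print N from the "Reconstruction is N% complete." line, if present.
recon_percent() {
    sed -n 's/^Reconstruction is \([0-9][0-9]*\)% complete\.$/\1/p'
}

# Possible use while a rebuild is running (not executed here):
#   while [ "$(raidctl -s raid0 | recon_percent)" -lt 100 ]; do
#       sleep 60
#   done
```

Against the progress report above, "raidctl -s raid0 | recon_percent" would print 9.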

Coffee Please

And there we should have it, we've rebuilt our failed drive.

# raidctl -s raid0
raid0 Components:
/dev/wd0d: optimal
/dev/wd1d: optimal
No spares.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

Now for that well deserved cup of coffee. (snore, snore, snore)

Afterthoughts - What have we learned

Word: quiescence

n 1: a state of quiet (but possibly temporary) inaction [syn: dormancy, quiescency] 2: quiet and inactive restfulness [syn: quiescency, dormancy, sleeping]

Source: WordNet ® 1.6, © 1997 Princeton University (via dictionary.com)

ldd - binaries have a lot of dependencies, so copying file by file is harder than it looks; it becomes trial and error unless you use ldd to find the dependencies of the files you are bound to need.

raidctl(8) - is very well documented, with pleasant examples. After seeing the potential from Andreas F. Bobak's mini-howto, the man pages were very helpful in completing things I couldn't get from the howto. It was the source for a number of misunderstandings on my part.

Author and Copyright

Copyright (c) 2003 Samiuela LV Taufa. All Rights Reserved.

I reserve the right to be totally incorrect even at the best advice of betters. In other words, I'm probably wrong in enough places for you to call me an idiot, but don't 'cause you'll hurt my sensibilities, just tell me where I went wrong and I'll try again.

You are permitted and encouraged to use this guide for fun or for profit as you see fit. If you republish this work in whatever form, it would be nice (though not enforceable) to be credited.

RAIDFrame - cheap RAID and OpenBSD

Copyright  © 2000/1/2 NoMoa Publishers All rights reserved. Caveat Emptor