Today I will share some thoughts on software RAID under Linux with you.
Sometimes you need to store data relatively safely and with high availability. That is where RAID solutions come into play. RAID 1 (mirroring) based solutions in particular provide better availability of your data. But don't forget to make backups anyway :)
Some days ago I installed RAID 1 on Debian Lenny (amd64 arch), without any trouble so far. <UPD Dec 2015: the reference to the initial tutorial from Jerry is missing here>
Both partitions used in the RAID array /dev/md0 should have the boot flag set if you want to boot from RAID.
This worked for me.
My configuration is the same as Jerry's except for the partition sizes and the choice of LVM. So the system boots from the RAID partition /dev/md0. Swap is on RAID too, which is a controversial solution, but it is the best one for me, since the availability of the system is the primary goal.
So how do you check the state of the RAID after installation? The simplest way is to look at the /proc file system with cat /proc/mdstat. Here is my configuration.
$ cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2 sdb2
      476560128 blocks [2/2] [UU]

md0 : active raid1 sda1 sdb1
      497856 blocks [2/2] [UU]

unused devices: <none>
Here you see two RAID arrays, md0 and md1. Information about the member devices and their state is present as well. In [UU] each U stands for one member disk that is up in the array. On a disk failure you would see something like [U_] and sdb2(F).
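The [UU] / [U_] field can also be checked by a script instead of by eye. Here is a minimal sketch, assuming the mdstat layout shown above; degraded_arrays is a name I made up for illustration, and the sample input is a hand-edited copy of my output with sdb2 marked as failed, not a real log.

```shell
#!/bin/sh
# degraded_arrays (hypothetical helper): read mdstat-style text from the
# given file, or from stdin, and print every array whose member-status
# field contains an underscore, i.e. has a missing or failed member.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                    # header line: remember the array name
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }   # status field with at least one "_"
    ' "$@"
}

# Demo on sample text (hand-edited, sdb2 marked as failed).
degraded_arrays <<'EOF'
Personalities : [raid1]
md1 : active raid1 sda2 sdb2(F)
      476560128 blocks [2/1] [U_]
md0 : active raid1 sda1 sdb1
      497856 blocks [2/2] [UU]
unused devices: <none>
EOF
```

On a live system the call would be degraded_arrays /proc/mdstat; empty output means every array has all its members up.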
The next example uses mdadm, the userspace tool for managing the kernel's md (software RAID) driver in recent kernel versions. Mine is 2.6.26-2-amd64 (the Lenny default).
The --detail or -D option with a device name gives more information. Here is my example.
# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90
  Creation Time : Wed Jan 6 00:51:37 2010
     Raid Level : raid1
     Array Size : 476560128 (454.48 GiB 488.00 GB)
  Used Dev Size : 476560128 (454.48 GiB 488.00 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Mon Jan 11 02:18:01 2010
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : fe3bbbfd:2c6963e7:9785a408:be715448
         Events : 0.8

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
If you want your system to be able to boot even if one of the disks fails completely, you need to write the GRUB loader to every hard disk separately.
Here is the example for hd0 (the first hard disk).
$ grub
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"... 17 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd0)1+17 p (hd0,0)/grub/stage2 /grub/menu.lst"... succeeded
Done.
Run the same for the second disk as well.
grub> root (hd1,0)
...
grub> setup (hd1)
Your GRUB configuration should also enable loading from the first disk and, alternatively, from the second one.
default 0
fallback 1

# is there after installation
title Debian GNU/Linux, kernel 2.6.18-6-686 Raid (hd0)
root (hd0,0)
kernel /boot/vmlinuz-2.6.18-6-686 root=/dev/md0 ro
initrd /boot/initrd.img-2.6.18-6-686

# manually created
title Debian GNU/Linux, kernel 2.6.18-6-686 Raid (hd1)
root (hd1,0)
kernel /boot/vmlinuz-2.6.18-6-686 root=/dev/md0 ro
initrd /boot/initrd.img-2.6.18-6-686
The default option declares that the first entry (number 0) is booted by default. The fallback option makes GRUB load the fallback entry (here number 1, on hd1) when the first one cannot start.
Once you run your "high"-availability RAID, you are probably very interested in being informed about problems with it, e.g. on a failure of one (hopefully only one) HD device.
So far I know only the two most common solutions for monitoring software RAID. The first is based on the Nagios tool, but it is not described here because I have not tried it yet. The second is to involve mdadm again.
The --monitor option makes mdadm periodically poll your md arrays and inform you about every event that occurs. In this mode mdadm never exits on its own, so it is normally run in the background.
Here is an example of sending a mail on an event. Of course sendmail must be configured.
mdadm --monitor --mail=firstname.lastname@example.org --scan --delay=1800 -ft
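This runs a monitor daemon which scans all RAID arrays with the given delay (here 1800 seconds). The -f option (--daemonise) starts it as a daemon process and the -t option (--test) generates test messages at startup.
If you want to run mdadm from a cron job instead, use the option -1 (--oneshot), which checks the arrays once and then exits.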
mdadm --monitor --scan -1
This also sends mail to the address, or runs the program, specified in the /etc/mdadm.conf file (on Debian it is /etc/mdadm/mdadm.conf).
Please see the mdadm manual for more details.
You can test your physical drives with the hdparm utility, e.g.
hdparm -tT /dev/hda
But hdparm would not work correctly on your mdX RAID arrays.
So one of the simplest methods is to time a dd run with the time utility.
time dd if=/dev/md0 of=/dev/null bs=1024k count=1000
This test asks for 1 GB of data from your RAID array, but for me it reports copying only something like "509804544 Bytes (510 MB) in 5,73989 s, 88,8 MB/s". So why only half of the gigabyte? Not because the data is split over two devices: md0 is simply only 497856 blocks of 1 KiB, which is exactly 509804544 bytes, so dd reaches the end of the device and stops. That is also why it needs only about half of the time it takes to read 1 gigabyte from the physical /dev/sda directly. Compare the MB/s figures rather than the seconds, or read from the bigger /dev/md1 instead.
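The byte count in that dd report can be checked against the size of md0 listed in /proc/mdstat. A small sketch of the arithmetic, assuming mdstat counts blocks of 1 KiB; the variable names are mine.

```shell
#!/bin/sh
# Compare what dd was asked to read with what /dev/md0 can deliver.
md0_blocks=497856                          # md0 size from /proc/mdstat, in 1 KiB blocks
md0_bytes=$((md0_blocks * 1024))           # size of md0 in bytes
requested_bytes=$((1000 * 1024 * 1024))    # dd bs=1024k count=1000

echo "md0 size:  $md0_bytes bytes"
echo "requested: $requested_bytes bytes"
```

The first number is 509804544, the same as in the dd report, while the request was for 1048576000 bytes. So dd read the whole of md0 once and stopped at the end of the device.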
time dd if=/dev/zero bs=1024k count=1000 of=/home/1Gb.file
This test writes a 1 GB file to the RAID partition /home. This way you can compare write performance on RAID and non-RAID partitions, but don't expect any considerable advantage on mirroring systems ;)
The last, more professional but also more complex, tool is iozone. It can perform a variety of tests, which can't be explained here in detail.
Please read the manual if you really need to test your RAID that way. Alternatively you can start iozone in automatic mode (iozone -a), which performs a variety of tests (relatively long-running) and prints the results on the console.
Look at the Strided Read column, which should reflect the software RAID bonus.
Failure and Recovery
RAID 1 should protect you from data loss and improve your availability. So it is a very good idea to have fairly good knowledge of what to do on a failure. The best way to get this knowledge is to simulate a failure: try to remove one disk from the array and put it back again.
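Such a drill could look roughly like the sketch below. It only prints the mdadm commands and does not run them, because failing a member on a live array is not something to do by accident. print_failure_drill is a hypothetical helper name, and the defaults /dev/md1 and /dev/sdb2 are taken from my examples above, so adjust them to your layout.

```shell
#!/bin/sh
# print_failure_drill (hypothetical helper): print, but do not execute,
# the commands for a simulated failure of one RAID 1 member:
# mark it faulty, remove it, add it back, then watch the rebuild.
print_failure_drill() {
    array=${1:-/dev/md1}      # array to test; default assumed from the examples above
    member=${2:-/dev/sdb2}    # member partition to fail; default assumed as well
    cat <<EOF
mdadm --manage $array --fail $member
mdadm --manage $array --remove $member
mdadm --manage $array --add $member
cat /proc/mdstat
EOF
}

print_failure_drill
```

Run the printed commands as root, one at a time, and look at /proc/mdstat after each step. After the last one the array should show a recovery in progress until both members are in sync again.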