SnapRAID Notes
I was initially thinking of using SnapRAID to parity-protect my DrivePool NAS drive array, so I did some research and testing on it. Since StableBit already delivers pooling and SMART monitoring, I would use SnapRAID's parity functionality simply to complement it, in case a data drive dies without warning. SnapRAID consists of just a few zipped files; nothing is installed. I place mine in
C:\SnapRAID
Be aware of the limitations before you implement it, especially this one: "Only file, time-stamps, symlinks and hardlinks are saved. Permissions, ownership and extended attributes are not saved."
snapraid.conf (adapted for 2-parity and StableBit DrivePool):
parity C:\Disker\pdisk1\snapraid.parity
2-parity C:\Disker\pdisk2\snapraid.2-parity
content C:\Disker\pdisk1\snapraid.content
content C:\Disker\pdisk2\snapraid.content
content C:\SnapRAID\snapraid.content
data disk0 C:\Disker\disk0\PoolPart.N
data disk1 C:\Disker\disk1\PoolPart.N
data disk2 C:\Disker\disk2\PoolPart.N
data disk3 C:\Disker\disk3\PoolPart.N
data disk4 C:\Disker\disk4\PoolPart.N
data disk5 C:\Disker\disk5\PoolPart.N
data disk6 C:\Disker\disk6\PoolPart.N
data disk7 C:\Disker\disk7\PoolPart.N
exclude *.covefs
exclude *.unrecoverable
exclude Thumbs.db
exclude $RECYCLE.BIN
exclude \System Volume Information
Example scripts follow; schedule them with taskschd.msc. They use sendmail for Windows to send e-mail.
SnapRAID_SYNC.ps1 (schedule this to run e.g. every 6-24 hours):
This script can safely run even if a disk dies, because "sync" stops on its own when a disk looks empty and the script does not pass -E. From the manual: -E, --force-empty forces the insecure operation of syncing a disk with all the original files missing. If SnapRAID detects that all the files originally present on the disk are missing or rewritten, it stops unless you specify this option. This makes it easy to detect when a data file system is not mounted. The option can be used only with "sync".
# Keep only the lines of the SMART report that contain FAIL.
$SmartFailFlag = snapraid smart | findstr FAIL
if ($null -ne $SmartFailFlag) {
    # A disk is failing, do not sync. E-mails from Scanner and RAID
    # software should already be on their way to the storage engineer to
    # handle the drive swap and a rebuild.
    # Send an e-mail to the storage engineer.
    $tmpfile = New-TemporaryFile
    Add-Content -Value "From: [email protected]" -Path $tmpfile.FullName
    Add-Content -Value "To: [email protected]" -Path $tmpfile.FullName
    # The trailing `n leaves the empty line that separates headers from body.
    Add-Content -Value "Subject: SnapRAID SMART warning from W2`n" -Path $tmpfile.FullName
    Add-Content -Value "SnapRAID script held back sync because of a SMART FAIL flag.`n" -Path $tmpfile.FullName
    Add-Content -Value $SmartFailFlag -Path $tmpfile.FullName
    Get-Content -Path $tmpfile.FullName | c:\sendmail\sendmail.exe -t
    Remove-Item $tmpfile.FullName -Force
} else {
    # Proceed with sync.
    Write-Output "SnapRAID sync running ..."
    snapraid.exe sync -l "C:\SnapRAID\SYNCLOG.txt" > "C:\SnapRAID\SYNCREPORT.txt"
}
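The SMART gate in the script above depends on findstr. As a minimal sketch, the same filter can be written in plain PowerShell so it can be reused and tried out without a failing disk. The function name Select-SmartFailLine is a hypothetical helper, not part of SnapRAID, and the match is case-sensitive to mirror findstr without /I.

```powershell
# Hypothetical helper (not part of SnapRAID): returns the lines that contain
# FAIL (case-sensitive, like findstr without /I), or nothing if none match.
function Select-SmartFailLine {
    param(
        [string[]]$SmartOutput
    )
    $SmartOutput | Where-Object { $_ -cmatch 'FAIL' }
}

# Possible use in the script (assumes snapraid is on the PATH):
#   $SmartFailFlag = Select-SmartFailLine -SmartOutput (snapraid smart)
#   if ($null -ne $SmartFailFlag) { ... }
```

Because the function emits nothing when no line matches, the existing null check in the script keeps working unchanged.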
What happens if a disk breaks during a "sync"? You are still able to recover data. In the worst case, you will be able to recover as much data as if the disk had broken before the "sync". But if the "sync" process has already run for some time, SnapRAID is able to use the partially synced data to recover more.
Healthy RAM:
Before running the first "sync", check your RAM memory with a program like memtest86.
Bad RAM is the most frequent cause of data loss when using SnapRAID!
SnapRAID_SCRUB.ps1 (I run this manually every few months or so since I use StableBit Scanner as well):
NOTE: By default, every run of scrub checks about 8% of the array, skipping data already scrubbed in the previous 10 days. To run a complete scrub you can use
-p 100 -o 0
which means 100 percent and all files older than 0 days (effectively all).
$SmartFailFlag = snapraid smart | findstr FAIL
if ($null -ne $SmartFailFlag) {
    # A disk is failing, do not scrub. E-mails from Scanner and RAID
    # software should already be on their way to the storage engineer to
    # handle the drive swap and a rebuild.
    # Send an e-mail to the storage engineer.
    $tmpfile = New-TemporaryFile
    Add-Content -Value "From: [email protected]" -Path $tmpfile.FullName
    Add-Content -Value "To: [email protected]" -Path $tmpfile.FullName
    # The trailing `n leaves the empty line that separates headers from body.
    Add-Content -Value "Subject: SnapRAID SMART warning from W2`n" -Path $tmpfile.FullName
    Add-Content -Value "SnapRAID script held back scrub because of a SMART FAIL flag.`n" -Path $tmpfile.FullName
    Add-Content -Value $SmartFailFlag -Path $tmpfile.FullName
    Get-Content -Path $tmpfile.FullName | c:\sendmail\sendmail.exe -t
    Remove-Item $tmpfile.FullName -Force
} else {
    # Proceed with scrub (full pass: 100 percent, no minimum age).
    Write-Output "SnapRAID scrub running ..."
    snapraid.exe scrub -p 100 -o 0 -l "C:\SnapRAID\SCRUBLOG.txt" > "C:\SnapRAID\SCRUBREPORT.txt"
}
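The sync and scrub scripts build the same alert e-mail line by line through a temporary file. A minimal sketch of how that could be factored into one shared function is shown here; New-SmartAlertMessage, its parameters and the example.com addresses are assumptions made for illustration and are not taken from the original scripts.

```powershell
# Hypothetical helper: builds the message that "sendmail.exe -t" reads from
# stdin. Headers come first, then one empty line, then the body.
function New-SmartAlertMessage {
    param(
        [Parameter(Mandatory)][string]$From,
        [Parameter(Mandatory)][string]$To,
        [Parameter(Mandatory)][ValidateSet('sync', 'scrub')][string]$Operation,
        [Parameter(Mandatory)][string[]]$SmartLines,
        [string]$HostName = $env:COMPUTERNAME
    )
    @(
        "From: $From"
        "To: $To"
        "Subject: SnapRAID SMART warning from $HostName"
        ""
        "SnapRAID script held back $Operation because of a SMART FAIL flag."
        ""
    ) + $SmartLines
}

# Possible use inside the "if" branch (addresses are placeholders):
#   New-SmartAlertMessage -From 'nas@example.com' -To 'admin@example.com' `
#       -Operation scrub -SmartLines $SmartFailFlag |
#       c:\sendmail\sendmail.exe -t
```

Piping the lines straight into sendmail would also remove the need for the temporary file, at the cost of keeping both scripts dependent on one shared function.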
NOTE: Scrubbing has two purposes:
1) Discover silent data corruption, which is its most distinctive and most important trait, but should be very rare.
2) Discover if a disk is beginning to fail and trigger SMART errors as early as possible (like StableBit Scanner).
Quick reference for using SnapRAID
SYSTEM REQUIREMENTS
SnapRAID works best on a 64-bit operating system, where it can access more than 4 GiB of memory and use faster hashing and parity algorithms. That said, if you have less than 30 TB of data, it also performs very well on a 32-bit operating system. The 64-bit version requires 1 GiB of memory for 7 TB of data. For best performance, connect all the disks with SATA rather than USB.
As a rule of thumb, you can stay with one parity disk (RAID5) for up to four data disks, and then use one parity disk for each group of seven data disks, as in the table:
Parities                      Data disks
1 / Single Parity / RAID5     2 - 4
2 / Double Parity / RAID6     5 - 14
3 / Triple Parity             15 - 21
4 / Quad Parity               22 - 28
5 / Penta Parity              29 - 35
6 / Hexa Parity               36 - 42

GENERAL COMMANDS
If silent or input/output errors are found during the scrub process, the corresponding blocks are marked as bad in the "content" file and listed by the "status" command.
> snapraid status
To fix them, use the "fix" command, filtering for bad blocks with the -e option:
> snapraid -e fix
At the next "scrub" the errors will disappear from the "status" report if they were really fixed. To make it fast, you can use -p 0 to scrub only the blocks marked as bad.
> snapraid -p 0 scrub
The "diff" command lists all the files modified since the last "sync" that need to have their parity data recomputed.
> snapraid diff
You can read and write while "sync" is running. If this interferes, the process continues anyway and just skips the part that was written. This obviously could affect a potential recovery, but it is the same as if you had written just after the "sync" completed.

ADDING A NEW DRIVE TO PROTECT
#############################
To add a new data disk to the array, add the new "disk" option (spelled "data" in the snapraid.conf above) to the configuration file, and then run a "sync" command.
> snapraid sync

REMOVING A PROTECTED DRIVE
##########################
1) In the configuration file, change the related "disk" option to point to an empty directory.
2) Remove from the configuration file any "content" option pointing to that disk.
3) Run a "sync" command with the "-E" option:
> snapraid sync -E
The "-E" option tells SnapRAID to proceed even when it detects an empty disk.
4) When the "sync" command terminates, remove the "disk" option from the configuration file.
Your array is now without any reference to the removed disk.

ADDING ANOTHER PARITY DRIVE
###########################
To add a new parity level, add the proper "N-parity" option to the configuration file, and then run the "sync" command with the "-F" option:
> snapraid -F sync
The "-F" option tells SnapRAID to recompute the full parity. Note that you are always protected during the process, because the existing parity is not modified.
If you wish to remove a parity, you can simply remove the highest "N-parity" option from the configuration and then delete the parity file. Take care: after removing a parity file you cannot reuse it, because it becomes outdated after the next "sync" command.

RECOVERY
########
To recover a deleted file, simply run the "fix" command with a filter for that file, like:
> snapraid fix -f my_just_deleted_file
To undelete a directory use:
> snapraid fix -f my_just_deleted_dir/
To undelete all the missing files use:
> snapraid fix -m
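
The rule-of-thumb table in the system requirements can be expressed as a small lookup, which is handy when planning how many parity disks an array needs. This is only a sketch of that table: Get-RecommendedParityCount is a hypothetical helper name, not a SnapRAID command, and it covers only the 2 to 42 data disks the table lists.

```powershell
# Hypothetical helper (not part of SnapRAID): maps a number of data disks to
# the parity level suggested by the rule-of-thumb table.
function Get-RecommendedParityCount {
    param(
        [Parameter(Mandatory)]
        [ValidateRange(2, 42)]   # the table only covers 2 to 42 data disks
        [int]$DataDisks
    )
    if ($DataDisks -le 4)  { return 1 }   # single parity: 2 - 4 disks
    if ($DataDisks -le 14) { return 2 }   # double parity: 5 - 14 disks
    # From 15 disks up: one parity disk per started group of seven data disks.
    return [int][math]::Ceiling([double]$DataDisks / 7)
}

# The array in the snapraid.conf above has 8 data disks (disk0 to disk7):
Get-RecommendedParityCount -DataDisks 8
```

For the eight data disks in the configuration above this gives 2, which matches the 2-parity setup used there.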
With SnapRAID you can use up to 6 parity drives, meaning you can lose up to 6 drives at once. It also helps maintain data integrity by verifying the checksums of the protected data it has synced (scrub). This complements how DrivePool already runs internal checksum control on data you choose to duplicate in the pool.