A year or so ago I discovered that one drive had failed. I drove direct to a supplier and bought a 3TB drive of a comparable nature. Installed, and prayed to the gods of storage that the array rebuild completed before the second one failed. It did. Sweat wiped from brow, I enabled email warnings and went back to ignoring it.
Every time there's a power cut I get an email. That's all, so far. But hard disks being hard disks... one of them is bound to fail some time, likely the old original one now that they're mismatched.
Given how nervous a mirror setup makes me feel when one fails with no other backup... my plan is to add another 3TB drive to the array via USB.
My other plan is to replace the not-yet-failed 2TB drive with another 3TB drive and scale the storage up into the unused 1TB.
My final plan is to extract all the info from my still-working USB 2TB drive and get that into the mirrored NAS storage.
That'll leave me with one happy external 2TB general purpose drive for moving big bits of data around, and one 3-way RAID-1 mirror NAS which I'm unlikely to be made nervous by.
Not a bad place to be. Unlike now.
A few things:
SSH access is root@nas with password = 'soho' + admin's password, e.g. if admin's password is 'p4ssw0rd' then root's is 'sohop4ssw0rd'
The link to enable/disable that in the firmware version I have is: http://nas/diagnostics.html
The data filesystems are XFS on LVM on Linux software RAID in mirrored (RAID-1) mode. Root is a mirrored ext2 filesystem for everything else. Firmware lives in flash on the mainboard.
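For reference, that layering can be inspected from the SSH shell with read-only commands; a sketch only, since exact device names vary by unit:

```shell
# Inspect the storage stack on the NAS, bottom layer up (read-only).
cat /proc/mdstat              # md RAID state: members, [UU] health, resync progress
mdadm --detail /dev/md1       # per-array detail for the data mirror
pvs; vgs; lvs                 # LVM physical volumes, volume groups, logical volumes
mount | grep -E 'xfs|ext2'    # which filesystems sit on which dm/md devices
```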
Instructions: http://blog.millard.org/2011/04/non-des ... on-in.html
The 3TB drive in it is this: ST3000VX000 Seagate SV35 Drive 3TB 7200RPM SATA3
The 2TB drive in it is some low-end drive without the required reliability ratings! :-(
New drive to match: https://www.pbtech.co.nz/product/HDDSE6 ... D--durable
Although I don't care much about speed, I'm not sure it's a great idea to have a third drive over USB: it'll be quite a bit slower than the internal SATA drives, though perhaps not much slower than the LAN connection?
RAID info: http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html and https://raid.wiki.kernel.org/index.php/Linux_Raid and https://help.ubuntu.com/community/Insta ... ftwareRAID
2TB + 3TB RAID with X data on it
2TB USB with Y data on it
3TB USB to buy
3TB internal to buy - mark drive as newer
T+ Back up all content to new 3TB external
EZ Swap 2TB for 3TB internal - mark OK and keep old 2TB as a point-in-time backup with low reliability
T+ Rebuild mirror to new 3TB internal at 2TB size - safe: 2 backups
EZ Resize md from 2TB to 3TB - taking care not to exceed the byte count of any of the three drives - safe: 2 backups
T+ Add 3TB external to the RAID mirror set if possible - low risk, though only one unreliable backup during this process
T^ Consolidate data from the unduplicated 2TB external into the new, larger 3-way mirror RAID array
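The drive-swap and rebuild steps above would look roughly like this with mdadm; a sketch under assumptions only, since the device names (/dev/sda2, /dev/sdc2) are placeholders and the real members must be checked in /proc/mdstat first:

```shell
# Retire the old 2TB drive from the mirror, then rebuild onto the new 3TB one.
mdadm /dev/md1 --fail /dev/sda2 --remove /dev/sda2   # drop the old member
# (power down, physically swap the drive, power up)
mdadm /dev/md1 --add /dev/sdc2   # new drive's data partition joins the mirror
cat /proc/mdstat                 # watch the resync until [2/2] [UU] again
```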
EDIT: Ordered! Didn't really want to burn money on drives, but also don't want to lose data, and it's inevitable without proper precautions, so...
Order Number: 2012053116103714
Date Ordered: Thursday 31 May, 2012
1 x Iomega 35431 4TB ix2-200 StorCentre Cloud Edition NAS
Sub-Total : US$154.10
EMS Rates (Shipping to ES): US$36.50
Grand Total : US$190.60
So it's only 4.5 years old, actually! And the other drive died 2 years ago? Let's see if I can find a record of that... yep!
SV35 7200 RPM 3 year warranty
159+GST = 182.85nzd
6th of April 2015
So this one is only 1.5 years old now! It went into use the same day; worth noting, though, that the bad drive may have failed up to 6 months earlier.
And in a week or two or less the drives will arrive and I can begin the above process of ensuring I don't lose anything! About time, really.
Seagate SV35 Drive 3TB 7200RPM SATA3 HDD - durable reliability and performance tuned to the high-write workloads of today's 24/7 video surveillance systems (3-year warranty)
Part #: HDDSE6030 $163.00
Seagate 3TB Expansion Desktop
Part #: HDDSEA0433 $129.57
Total Delivery Cost: $5.00
Sub Total: $305.01
This may represent a bit of a stuff up, though, as I may not be able to use it as the third part of a mirror. Hmmm. Still, I can get onto a fresher drive, and upgrade the size to 3TB. Then maybe just rsync the shares every once in a while. And plan to replace the older drive in a year or two, if it doesn't fail sooner.
I'm 3/4 through backing up the NAS to the new USB drive over wifi and rsync/cp with a few permissions fixups and other learnings/chores along the way. That should finish sometime today or tomorrow.
When I get home I can shut it down and install the new drive, let it sync, then try to resize it out to the full 3T size. Then I can have a crack at getting the USB 3T into the software RAID array as a third mirror. Wish me luck at this point :-)
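Getting the USB 3TB in as a third mirror would, in principle, be an mdadm grow operation; a hedged sketch, assuming the USB disk appears as /dev/sdc (unverified on this firmware, and older mdadm versions want the two steps separate):

```shell
# Grow the 2-way mirror to a 3-way mirror using the USB drive.
mdadm /dev/md1 --add /dev/sdc2           # new disk joins as a spare first
mdadm --grow /dev/md1 --raid-devices=3   # promote: the spare becomes a third mirror
cat /proc/mdstat                         # shows [3/2] [UU_] until the resync finishes
```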
Desktop USB drive failed clickity click a week or two ago.
I bought a WD Red Plus 4TB and put it in the Seagate case; it initially worked, then failed outright and I couldn't read anything off it at all. Buggered. Returned it after sitting a powerful magnet on it overnight, twice, at two angles. Online sources say that's not effective, but I had to try something.
Got a Seagate Ironwolf Pro 4TB to replace it (a much better drive than the ones in the NAS and the USB enclosure) and put it in the same case. Tried an ext4 fsck and got read errors after a few minutes, then tonnes more. Hmmm. The drive survived, but I couldn't get long-term reliability out of it.
Swapped the 1.5A plug pack for a 2A plug pack; that seemed better, but again I got USB-related read errors, so I figured I had a bad box, not a bad drive. Got an Orico USB-C 3.1 case with a 2A supply and fitted the old USB drive to it: still fucked. Fitted the new Ironwolf to it: all good. It did 60% of a fsck with zero errors, then the USB connection failed with thousands of sequential errors, maybe heat related. Reformatted with a new partition table and a fresh ext4 FS, called it good, got a fan on it and on the NAS, and started copying to it a share at a time (1TB done of 1.4TB total as we speak). No errors, nothing in dmesg, all good. Seems quieter too.
When I initially started reading from the NAS I got permission issues, so I got on it as root and corrected ownership of various files. One unimportant file was in a bad state journal-wise, and one that I'd missed had wrong perms, so I corrected that too. While doing the corrections I noticed the drives getting busy. It turned out to be rebuilding the mirror!!! No idea why, but it had lost sync with one drive and had to resync. It's possible the rebuild never finished when I installed the second 3TB drive a while back (after the above posts), though I thought it did. In any case, it finished and succeeded, has been good since, and if it survives another 400G of reads it'll have done well enough.
Resize LVM and XFS in the NAS to take advantage of the double 3TB guts vs the 4TB USB drive that's now mirroring it.
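That resize runs bottom-up through each layer of the stack. A sketch under assumptions: the VG/LV names (vg0/lv0) and the mount point are placeholders, and only the real ones from pvs/lvs/mount should be used:

```shell
# Grow each layer in order: md device -> LVM PV -> LV -> XFS.
mdadm --grow /dev/md1 --size=max     # use the full capacity on both members
pvresize /dev/md1                    # let LVM see the bigger physical volume
lvextend -l +100%FREE /dev/vg0/lv0   # grow the LV into the freed-up space
xfs_growfs /mnt/pools/A/A0           # XFS grows online, while mounted
```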
Following step: get a Sabrent Rocket 4TB NVMe stick in from the US, install it in the third NUC internal slot, and create a single-disk mirror-vdev ZFS pool. Then mirror the content from the USB drive over to it. Then erase the USB drive and join it to the ZFS mirror vdev, giving a two-way SSD-HDD mirror of the data inside and outside the machine. Then get a 4-bay QNAP TR-004 box, move the disk from the USB enclosure into it, and add another fresh 4TB Ironwolf to that and to the vdev. That gives me a 3-way 4TB ZFS mirror: internal to my machine on SSD, and external over USB on two HDDs. If I run out, I can add two more HDDs and one more SSD (SATA inside the box with an HBA card of some sort), put a further mirror vdev across them in the same zpool, and just keep writing to it as needed.
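The ZFS steps above sketch out roughly like this; pool and device names are placeholders, and the key idea is that a pool starts as a single-disk vdev and each `attach` widens that same vdev into a deeper mirror:

```shell
# 1. Create the pool on the NVMe alone (single-disk vdev to start).
zpool create tank /dev/nvme1n1
# 2. Copy the data over (rsync etc.), wipe the USB disk, then attach it
#    to the same vdev, turning it into a 2-way mirror that resilvers.
zpool attach tank /dev/nvme1n1 /dev/sdb
# 3. Later, attach a third disk to the vdev for a 3-way mirror.
zpool attach tank /dev/nvme1n1 /dev/sdc
# 4. If space runs out, add a second mirror vdev; the pool stripes over both.
zpool add tank mirror /dev/sdd /dev/sde
zpool status tank   # shows vdev layout and resilver state
```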
If I outgrow that and can't fit any further SSDs in the NUC, then I will build a more serious ZFS-based NAS of some sort. Likely just raw Linux + ZFS + NFSv4 + syncthing + B2/rclone, maybe with Cockpit and the ZFS manager module, or maybe just as-is.
With a 4TB ZFS volume in the machine I can push out all the stuff that's on the very full 1TB M.2 stick, and maybe put the second 4TB in that slot instead, or an 8TB if they make a TLC or MLC one by then. The latest Samsung 980 PROs are now TLC, not MLC, and the Rocket TLC drives have massive TBW endurance.
Alternatively, I could use that slot for a ZIL and L2ARC, but I think it can operate just fine without those, so we'll see. I don't need epic performance, just epic reliability and good-enough performance.
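If I did want them later, a ZIL (SLOG) and L2ARC can be added to an existing pool at any time, so skipping them now costs nothing; a sketch with hypothetical partition names:

```shell
# Add a separate intent log and a read cache to an existing pool 'tank'.
zpool add tank log /dev/nvme2n1p1     # SLOG: absorbs synchronous writes
zpool add tank cache /dev/nvme2n1p2   # L2ARC: extends the ARC read cache
# Both are auxiliary devices and can be removed again without data loss:
zpool remove tank /dev/nvme2n1p2
```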
3-way mirror disks = enough. Then you're not in a panic if you lose one drive and you still have a chance if you lose two (as I did last week! gasp, most of it survived, at least, phew!).
I had forgotten that I'd upgraded it to SV35 surveillance drives. Good move, Fred. Ironwolf Pro is a better move, though. :-D
Will keep this updated, but looking back at the start, that Iomega IX2-200 is nearly 9 years old! Not a bad run, still pulling 60 Megabytes per second off it over the gigabit LAN connection, likely maxed out disks or mainboard inside it. It's only a tiny embedded processor with bugger all RAM after all. :-D
I think I can get another year out of it before I build a monster, maybe more with the above scheme.
Onward and upward! :-D
Code: Select all
root@NAS:/# cat /proc/sys/dev/raid/speed_limit_min
1000
root@NAS:/# echo 140000 > /proc/sys/dev/raid/speed_limit_min
Code: Select all
root@NAS:/# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid1 sdb2 sda2
      2909285488 blocks super 1.0 [2/2] [UU]
      [>....................]  resync =  0.6% (19343680/2909285488) finish=343.1min speed=140372K/sec

md0 : active raid1 sdb1 sda1
      20980800 blocks [2/2] [UU]

unused devices: <none>
Code: Select all
root@NAS:/# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid1 sdb2 sda2
      2909285488 blocks super 1.0 [2/2] [UU]
      [>....................]  resync =  0.0% (856960/2909285488) finish=46970.7min speed=1031K/sec

md0 : active raid1 sdb1 sda1
      20980800 blocks [2/2] [UU]

unused devices: <none>
The question remains: why is it resyncing at all? Not sure. Last night I did the resize operations following this guide, just for ease / lack of having to think at 3am, and it was all good after the reboot:
http://blog.millard.org/2011/04/non-des ... on-in.html
Funny that the guide is 10 years old and still useful to someone :-D
But then I tried to solve an issue:
Code: Select all
fred@mako:~$ l /mnt/nas/Archive/cheetah.preusa/full.server.backup.22.23.august.36hour.period/var/www/freeems.org/www/.git/hooks
ls: reading directory '/mnt/nas/Archive/cheetah.preusa/full.server.backup.22.23.august.36hour.period/var/www/freeems.org/www/.git/hooks': Input/output error
total 0
Code: Select all
root@NAS:/# ls -al /mnt/pools/A/A0/Archive/cheetah.preusa/full.server.backup.22.23.august.36hour.period/var/www/freeems.org/www/.git/hooks
ls: reading directory /mnt/pools/A/A0/Archive/cheetah.preusa/full.server.backup.22.23.august.36hour.period/var/www/freeems.org/www/.git/hooks: Structure needs cleaning
total 0
So I tried a telinit 1 and it kicked me out and sat there doing something (or nothing) until this morning. This morning there was no drive activity, so I held the power button down. Maybe that's the answer: maybe some clean-shutdown flag wasn't set and it's just checking everything out... let me search...
and look at the logs and here's the answer:
Code: Select all
Jul 5 06:28:14 NAS kernel: md: md1 stopped.
Jul 5 06:28:14 NAS kernel: md: bind<sda2>
Jul 5 06:28:14 NAS kernel: md: bind<sdb2>
Jul 5 06:28:14 NAS kernel: raid1: md1 is not clean -- starting background reconstruction
Jul 5 06:28:14 NAS kernel: raid1: raid set md1 active with 2 out of 2 mirrors
Jul 5 06:28:14 NAS kernel: md1: detected capacity change from 0 to 2979108339712
Jul 5 06:28:14 NAS kernel: md: resync of RAID array md1
Jul 5 06:28:14 NAS kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Jul 5 06:28:14 NAS kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
Jul 5 06:28:14 NAS kernel: md: using 128k window, over a total of 2909285488 blocks.
Jul 5 06:28:16 NAS kernel: md1: unknown partition table
Jul 5 06:28:19 NAS kernel: XFS mounting filesystem dm-1
Jul 5 06:28:19 NAS kernel: Starting XFS recovery on filesystem: dm-1 (logdev: internal)
Jul 5 06:28:19 NAS kernel: Ending XFS recovery on filesystem: dm-1 (logdev: internal)
Jul 5 06:28:19 NAS kernel: XFS mounting filesystem dm-2
Jul 5 06:28:20 NAS kernel: Starting XFS recovery on filesystem: dm-2 (logdev: internal)
Jul 5 06:28:20 NAS kernel: Ending XFS recovery on filesystem: dm-2 (logdev: internal)
I wonder why zero. Oh well, I'm on track to do something better anyway and the third mirror completed last night so I don't have much fear now.
Seems like the forced shutdown doesn't leave the RAID superblock in the correct state; if a clean reboot results in no repeat performance, I'll call it good.
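To confirm that theory after a clean reboot, the array state can be checked directly and a non-destructive scrub kicked off by hand; a sketch:

```shell
# After a clean shutdown/reboot, verify the mirror came up clean.
mdadm --detail /dev/md1 | grep -i state   # expect "State : clean"
# Optionally force a consistency check (non-destructive read scrub):
echo check > /sys/block/md1/md/sync_action
cat /proc/mdstat                          # shows the check's progress
```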
Looking forward to putting my data into a reasonable structure and with good reliability/safety settings.