Hello, I currently have a home server, mainly for media, with an SSD for the system and two 6TB hard drives set up in RAID 1 using mdadm; that’s the most I can fit in the case. I have been getting interested in ZFS and want to expand my storage since it’s getting pretty full. I have two 12TB external hard drives. My question is: can I create a pool (I think that’s what they are called) using all 4 of these drives in a raidz configuration, or is this a bad idea?
(6TB+6TB) + 12TB + 12TB, should give me 24TB, and should work even if one of the 6TB or 12TB fails if I understand this correctly.
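For anyone wanting to check that 24TB figure, here is a rough Python sketch of the arithmetic. It assumes a single-parity raidz vdev treats every member as being the size of its smallest member and gives up one member's worth of space to parity; the function name is made up for illustration, and real pools lose a bit more to metadata and TB vs TiB rounding.

```python
def raidz_usable_tb(member_sizes_tb, parity=1):
    """Rough usable capacity of one raidz vdev, in TB.

    Assumption: every member counts as the size of the smallest member,
    and `parity` members' worth of space goes to redundancy.
    """
    if len(member_sizes_tb) <= parity:
        raise ValueError("need more members than parity devices")
    smallest = min(member_sizes_tb)
    return smallest * (len(member_sizes_tb) - parity)


# Layout from the post: the two 6TB drives joined into one 12TB member,
# plus the two 12TB external drives.
print(raidz_usable_tb([6 + 6, 12, 12]))

# For comparison: all four drives as separate members, where the 12TB
# drives are only used up to 6TB each.
print(raidz_usable_tb([6, 6, 12, 12]))
```

Under these assumptions the joined layout comes out at 24TB, while putting the four drives in as separate members would only give 18TB.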
How would one go about doing this? Would you mdadm the 2 6TB ones into a raid 0 and then create a pool over that?
I am also just dipping my toes into NixOS now, so a resource that covers that would be useful, since the home server is currently running Debian. This server will be left at my parents’ house and I would like it to need minimal onsite support. My parents just need to be able to turn the screen on and use the browser.
Thank you
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:
Fewer Letters  More Letters
LVM            (Linux) Logical Volume Manager for filesystem mapping
NAS            Network-Attached Storage
RAID           Redundant Array of Independent Disks for mass storage
SATA           Serial AT Attachment interface for mass storage
SSD            Solid State Drive mass storage
ZFS            Solaris/Linux filesystem focusing on data integrity
6 acronyms in this thread; the most compressed thread commented on today has 30 acronyms.
[Thread #542 for this sub, first seen 25th Feb 2024, 08:45]
I believe ZFS works best when it has direct access to the disks, so having an md device underneath it is not best practice. Not sure how well ZFS handles external disks, but that is something to consider. As for the drive sizes and redundancy, each drive size should have its own vdev. So you should be looking at a vdev of the 2x6TB in a mirror and a vdev of the 2x12TB in a mirror for maximum redundancy against drive failure, totaling 18TB usable in your pool. Later on, if you need more space, you can create new vdevs and add them to the pool.
If you’re not worried about redundancy, then you could bypass ZFS and just set up a RAID-0 through mdadm or add the disks to an LVM VG to use all the capacity, but remember that you might lose the whole volume if a disk dies. Keep in mind that this would include accidentally unplugging an external disk.
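As a quick check of the 18TB figure, here is a minimal Python sketch. It assumes a mirror vdev is as large as its smallest drive and that a pool simply adds up its vdevs; the names are hypothetical and overhead is ignored.

```python
def mirror_usable_tb(drive_sizes_tb):
    """A mirror stores the same data on every drive, so it is only as
    large as its smallest drive."""
    return min(drive_sizes_tb)


def pool_usable_tb(mirror_vdevs):
    """A pool combines the space of all of its vdevs."""
    return sum(mirror_usable_tb(vdev) for vdev in mirror_vdevs)


# Two mirror vdevs: one of the 2x6TB drives, one of the 2x12TB drives.
pool = [[6, 6], [12, 12]]
print(pool_usable_tb(pool))  # 6 + 12 = 18
```

Adding another mirror vdev later is then just one more entry in the list.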
ZFS is fine with external disks and indirect access to hardware in my limited experience. Performance would not be as optimal as it could be but data integrity shouldn’t be a problem. If I didn’t need the space and this was my primary storage array, I’d probably opt for the increased reliability of 2 mirror vdevs. I’ve done something similar to what OP is suggesting with LVM combining multiple disks on my off-site backup though. I combined 1T+3T+4T disks into a single 8TB volume. Deliciously bastardized, not a single integrity issue, no hardware failures either over the several years it ran.
Cool. Yeah, as a professional I am constantly aware of data integrity and have most of my shit stored on redundant drives. I had a WoW Guild Officer who shared his home setup with like 8x12TB drives in Windows Storage Spaces with no redundancy that was like 80% full. I had to ask how he slept at night knowing he could lose 80TB of data at any time.
Personally, my TrueNAS has 5x1.92TB SSDs set up in two mirror vdevs and a hot spare for my iSCSI LUNs, and 8x1.2TB 10K drives in a raidz2 (2 disk parity) for my NAS storage.
Yeah, I definitely want the redundancy. Most of what I will be storing will not be life changing if lost, just a big inconvenience, and for the life changing stuff I plan on having it backed up to cloud storage.
You’ve got some decent answers already, but since you’re getting interested in ZFS, I wanted to make sure you know about discourse.practicalzfs.com. It’s the successor to the ZFS subreddit and it’s a great place to get expert advice.
Thank you for that, I will have to have a look into it, since I am quite new and not completely sure how to go about things in a way I won’t regret half a year or so down the line.
Is there any particular reason you’re interested in using ZFS?
How do you intend to move over the data on the 2x6 array if you create a new pool with all the drives?
mdadm RAID1 is easy to handle, fairly safe from write holes and easy to upgrade.
If it were me I’d upgrade to a 2x12 array (mdadm RAID1 or ZFS mirror, whichever you want), stored internally, and use the old 6 TB drives as cold storage external backups with Borg Backup. Not necessarily for the media files, but you must have some important data that you don’t want to lose (passwords, 2FA codes, emails, phone photos etc.).
I wouldn’t trust USB-connected arrays much. Most USB enclosures and adapters aren’t designed for 24/7 connectivity, and arrays (especially ZFS) are sensitive to the slightest error. Mixing USB drives with any ZFS pool is a recipe for headache IMO.
I could accept using the 2x6 as a RAID1 or mirror by themselves but that’s it. Don’t mix them with the internal drives.
Not that there’s much you could do with that drive setup, since the sizes are mismatched. You could try unraid or snapraid+mergerfs which can do parity with mismatched drives but it’s meh.
Oh and never use RAID0 as the bottom layer of anything, when a drive breaks you lose everything.
If there’s an offline backup, they could create a degraded RAIDz1 with the 2 12T disks, copy the data from the 6Ts over, create the 12T linear volume out of the 6Ts, add it to the degraded RAIDz1 and wait for it to resilver. If no hardware fails and they don’t punch in a wrong keystroke, it should work.
“Most USB enclosures and adapters aren’t designed for 24/7 connectivity”
This is true. I’m using 8 external USB drives in two RAIDz1s and I had to ensure their controllers don’t overheat. For example, I’ve had 4 WD Elements standing vertically, stacked next to each other. The inner two’s controllers would overheat during the initial data transfer and disconnect. Spacing them apart resolved this for my ambient environment. In the other pool, I had a new WD Elements overheat on its own, without taking ambient heat. I resolved that by adhering a small heatsink to the SATA-USB controller in the enclosure. I also drilled a hole in the enclosure immediately above the heatsink for better ventilation. I later applied this mod to the rest of the drives of the same model.
“Mixing USB drives with any ZFS pool is a recipe for headache IMO.”
Crucially, however, between the issues above and accidental cable unplugging, ZFS hasn’t lost any data or caused any undue headache. If anything, getting back to a working state has been easier on some occasions, as it would automatically detect that a missing drive is back, resilver if needed and go on its merry way. The headache I’ve observed most of the time has been of this sort: a message that the zpool is not healthy, a drive has shown errors and/or is missing; resolve drive issues if any, reconnect the drive; no affected applications, no downtime. The much less often observed issue, probably twice over the last 5 years, has been of this sort: applications are down, the zpool isn’t reading/writing or is missing, more than one drive is disconnected due to a cable snafu; shut down, reconnect the drives, boot; ZFS detects the drives and proceeds as if nothing happened. All in, I’ve had to manipulate ZFS around 5 times over the last 5 years, mostly during the initial data transfer load. The previous LVM + mdraid setup I had required more work to get back in shape after a drive was kicked out for one reason or another. So yes, USB can definitely present issues that you wouldn’t see in an internal application, especially if some of your USB enclosure controllers are shit, but in my anecdotal experience ZFS is very capable of handling those gracefully and with less manual intervention than the standard Linux solutions. If anything, ZFS has been less sensitive to hardware problems.
I feel like what you’re saying here, in effect, is “USB connected drives in a RAID are a bad idea, but if you’re going to do it, ZFS is the way to go.”
Hahaha. Good one!
Well, not quite. More like “USB connected drives in RAID could be less reliable than internal, and software can deal with it. ZFS makes that easier than LVM+mdraid.” The downside of LVM+mdraid in my experience is that it needs more commands typed in to repair an array if something’s gone wrong. It probably doesn’t break much more than ZFS would under the same hardware conditions and it probably can recover from the same conditions ZFS could. USB drives can present more failure modes than internal ones, but one of the points of RAID is to mitigate hardware failures. So I’m considering USB drives as just shittier drives whose shittiness the software should be able to hide. So far that has been borne out in practice in my anecdata. I’ve used both LVMRAID (LVM+built-in mdraid) and ZFS with questionable USB drives and both have handled them without data loss and with only rare downtime, less than once a year. ZFS requires less attention. With all of that said, ZFS does of course provide data integrity checking and correction, which is a significant plus over LVM+mdraid. It’s already saved me from data corruption due to RAM I had no idea had a problem. RAM that passed Memtest86+'s first pass. Little did I know that it fails on subsequent passes… Yes, the first and subsequent passes are different. So I’d use ZFS with USB or internal disks whenever I have the choice to. 😂
Make sure you understand volume block size before you start using it. It has a big impact on compression, performance and even disk utilization. In certain configurations you may be surprised to find out that as much as 25% of your disk space (in addition to parity) is effectively gone, and it is nontrivial to change the block size after the fact.
That’s definitely something I have to look into. The NixOS page on ZFS had a link to a ZFS cheat sheet of sorts that I have been trying to wrap my head around. Thanks for pointing it out though.
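To make the block size warning a bit more concrete, here is a rough Python sketch of how a raidz vdev rounds up the space allocated for each block. It assumes 4K sectors (ashift=12) and follows, as far as I understand it, the allocation arithmetic OpenZFS uses for raidz: parity is added per stripe row and the total is padded to a multiple of parity + 1 sectors. The function name is made up for illustration and compression is ignored.

```python
def raidz_allocated_sectors(block_bytes, ndisks, parity=1, sector_bytes=4096):
    """Sectors a raidz vdev allocates for one logical block (rough model)."""
    data_disks = ndisks - parity
    if data_disks < 1:
        raise ValueError("need more disks than parity devices")
    # Data sectors, rounded up to whole sectors.
    data = -(-block_bytes // sector_bytes)
    # One set of parity sectors per stripe row of data.
    parity_sectors = parity * -(-data // data_disks)
    total = data + parity_sectors
    # Allocations are padded up to a multiple of (parity + 1) sectors.
    pad = parity + 1
    return -(-total // pad) * pad


# Three-member raidz1, a few block sizes.
for kib in (8, 16, 128):
    data = kib * 1024 // 4096
    alloc = raidz_allocated_sectors(kib * 1024, ndisks=3)
    print(f"{kib}K block: {data} data sectors, {alloc} allocated, "
          f"{data / alloc:.0%} efficient")
```

In this model an 8K block on a three-member raidz1 takes 4 sectors to store 2 sectors of data, so only half the raw space is usable instead of the two thirds you would expect from single parity, which is a quarter less than expected. Larger blocks get back to two thirds.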
The simplest way to do exactly what you want is to use LVM to create a linear volume (equivalent to JBOD) from the two 6TB disks. Then create a zpool with a single RAIDz1 vdev with that along with the other 2 12TB disks. You could use mdraid to do a RAID0 as you suggested too. The result would be similar. In fact that could be easier for you if you already know how to use mdraid and you don’t know LVM.
You could also do it all with ZFS albeit with more lost space. You could create a zpool with 2 vdevs. One with a 6TB mirror comprised of the 2 6TB drives. The other a 12TB mirror. The redundancy in ZFS is at the vdev level. A zpool contains one or more vdevs. It combines their space like a JBOD. You can mix and match the size and type of the vdevs. You can have mirrors with RAIDz, just mirrors, just RAIDz, etc. My suggestion for having two mirrors, 6TB and 12TB results in 18TB usable space. This is straightforward, easy to manage and easy to expand. You just add another vdev to the pool with whatever topology you like. If you want to maximize the space with what you got, you can do your idea instead. It’s got a bit more setup and a bit less redundancy but it’ll work fine.
EDIT: Ensure your external drives work reliably under load for extended periods! Either load test them with something and/or watch for errors while transferring data to the new zpool. If you see an error, check dmesg for anything related to a USB drive. If you encounter such a problem it might be a controller bug, controller overheating or a bad cable. Controller refers to the enclosure’s SATA-to-USB bridge. I have 5x WD Elements, one WD MyBook and 2x NexStat 3.1. They’re all using ASMedia. The WD Elements are prone to overheating if there’s no appropriate ventilation. I’ve had to adhere tiny heatsinks to two of them in order to resolve overheating, as they operate at a higher ambient temperature. Crucially, all of this overheating has occurred under load. Without loading them, it all looks fine and dandy. ZFS did not lose me any data when any of this happened or while I addressed it.
That is a good point about stress testing them. If memory serves me well, one of the 12TB drives would disconnect a while back, maybe 2 or so years ago, when I was using Windows and doing large backups. I think the consensus seems to be to just mirror the 6TB drives and mirror the 12TB drives separately. It’s probably what I will end up doing, since in the end I am tripling the amount of storage and it really allows me to lose two drives, albeit two different drives, before data loss. I feel I may be getting a bit greedy with what I have and should just be happy with what I am getting. I’m looking at getting an upgrade in about a year or two either way.
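If you want to automate the "check dmesg" part, here is a minimal Python sketch that filters kernel log text for lines that look USB or I/O related. The patterns and the sample lines are my own assumptions for illustration; real kernel messages vary by driver and kernel version, so treat any hit as a hint to go read the full log.

```python
import re

# Assumed patterns, not an exhaustive or authoritative list.
SUSPECT_PATTERNS = [
    re.compile(r"\busb\b.*\b(disconnect|reset)\b", re.IGNORECASE),
    re.compile(r"I/O error", re.IGNORECASE),
]


def suspicious_lines(log_text):
    """Return the log lines that match any of the suspect patterns."""
    return [
        line
        for line in log_text.splitlines()
        if any(p.search(line) for p in SUSPECT_PATTERNS)
    ]


# Made-up sample log for demonstration only.
sample = """\
[ 1200.1] usb 2-1: USB disconnect, device number 3
[ 1200.5] EXT4-fs (sda1): mounted filesystem
[ 1201.0] blk_update_request: I/O error, dev sdc, sector 2048
"""

for line in suspicious_lines(sample):
    print(line)
```

To run it against the real log you could feed it the output of `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`, which may need root depending on your system settings.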
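The "two drives, but two different ones" caveat can be spelled out by brute force. This is a small Python sketch, assuming a pool of two mirror vdevs survives as long as every mirror still has at least one working drive; the drive labels are made up.

```python
from itertools import combinations

# Hypothetical labels for the two mirror vdevs.
MIRRORS = {
    "mirror-6TB": {"6TB-a", "6TB-b"},
    "mirror-12TB": {"12TB-a", "12TB-b"},
}


def pool_survives(failed_drives):
    """The pool survives if every mirror keeps at least one working drive."""
    failed = set(failed_drives)
    return all(members - failed for members in MIRRORS.values())


drives = sorted(d for members in MIRRORS.values() for d in members)
pairs = list(combinations(drives, 2))
survivable = [pair for pair in pairs if pool_survives(pair)]

print(f"{len(survivable)} of {len(pairs)} two-drive failures are survivable")
for pair in pairs:
    print(pair, "ok" if pool_survives(pair) else "POOL LOST")
```

Any single failure is survivable, and so are the four combinations that take one drive from each mirror; losing both drives of the same mirror loses the pool.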
Apparently I’ve heatsinked the MyBook as well. This is what that looks like without the cover. The heatsink is from a Raspberry Pi kit. If I were you, knowing what I know, I’d just slap heatsinks on the 12T disks preventatively instead of testing them.