this post was submitted on 28 Dec 2023

Selfhosted

Hey fellow Selfhosters! I need some help, I think, and searching isn't yielding what I'm hoping for.

I recently built a new NAS for my network with 4x 18TB drives in a ZFS raidz1 pool. Until now I had been using an external 12TB USB hard drive attached to a different machine.

I've been attempting to use rsync to copy the 12TB drive over to the new pool, and things go great for the first 30-45 minutes. At that point, the copy speed drops off and the 4 files currently in progress sit at 100% done. Each time, I've eventually had to reboot the machine because the zpool no longer appears to be accessible. After a reboot, the pool appears fine, with no faults, and I can resume rsync for a while.

EDIT: Of note, the rsync process seems to stall and I can't get it to respect SIGINT or Ctrl+C. I can SSH in separately, but running zpool status hangs with no output.

While the workaround seems to be partially successful, the point of using rsync is to make it fairly hands-free, and it's been a week-long process to copy the 3TB that I have now. I don't think my zpool should be disappearing like that! Makes me nervous about the long-term viability. I don't think I'm ready to drop down on Unraid.

rsync is being initiated from the NAS to copy from the old server. Am I better off "pushing" than "pulling"? I can't imagine it'd make much difference.

Could my drives be bad? How could I tell? They're attached to a 10 port SATA card, could that be defective? How would I tell?

Thanks for any help! I've dabbled in Linux for a long time, but I'm far from proficient, so I don't really know the intricacies of dmesg et al.

[–] [email protected] 1 points 1 year ago (1 children)

I don't have practical experience with ZFS, but my understanding is that it uses RAM a lot... if that's new, it might be worth checking the RAM by booting up memtest (for example) and just ruling that out.

Maybe also worth watching the system with nmon or htop (running in another tmux / screen pane) at the beginning of the next session, then when you think it's jammed up, see what looks different...
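If the box locks up before you can read htop, a tiny logger could leave a trail to look at after the reboot. This is only a rough sketch: the function names and the log path are made up for illustration, and it assumes a Linux system with /proc/meminfo and /proc/loadavg. Point the log at a disk that is not on the zpool, since the pool is the thing that hangs.

```python
import os
import time


def parse_meminfo(text):
    """Turn /proc/meminfo text into {field: int}. Most fields are in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def sample():
    """Return one timestamped line with load average and available memory."""
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    with open("/proc/loadavg") as f:
        load1 = f.read().split()[0]
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "{} load1={} MemAvailable={}kB".format(
        stamp, load1, mem.get("MemAvailable")
    )


def watch(log_path, interval_s=10, count=None):
    """Append a sample every interval_s seconds; count=None runs until stopped."""
    taken = 0
    with open(log_path, "a") as log:
        while count is None or taken < count:
            log.write(sample() + "\n")
            log.flush()
            os.fsync(log.fileno())  # push to disk so the line survives a hard reboot
            taken += 1
            time.sleep(interval_s)
```

Something like watch("/var/tmp/rsync-watch.log") in a tmux pane before starting rsync would do it (the path is just an example). After the next stall, the last few lines show whether memory or load was climbing right before things froze.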

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

Awesome, thanks for giving some clues. It's a new build, but I didn't focus hugely on RAM, I think it's only 32GB. I'll try this out.

Edit: I did some reading about L2ARC, so pending some of these tests, I'm planning to get up to 64GB RAM and then extend with an L2ARC SSD, assuming no other hardware errors.

[–] sonstwas 4 points 1 year ago* (last edited 1 year ago) (2 children)

Based on this thread it's the deduplication that requires a lot of RAM.

See also: https://wiki.freebsd.org/ZFSTuningGuide

Edit: from my understanding, the pool shouldn't become inaccessible though; it should only get slow. So there might be another issue.

Edit2: here's a guide to check whether your system is limited by zfs' memory consumption: https://github.com/openzfs/zfs/issues/10251
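For a quick look at how much memory the ARC is actually holding, here's a rough sketch that reads the kstat file OpenZFS exposes on Linux at /proc/spl/kstat/zfs/arcstats. The function names are made up for illustration, and it assumes the usual layout of two header lines followed by "name type data" rows.

```python
ARCSTATS_PATH = "/proc/spl/kstat/zfs/arcstats"  # OpenZFS on Linux


def parse_arcstats(text):
    """Parse arcstats text into {name: int}, skipping the two header lines."""
    stats = {}
    for line in text.splitlines()[2:]:
        fields = line.split()
        if len(fields) == 3 and fields[2].isdigit():
            stats[fields[0]] = int(fields[2])
    return stats


def arc_summary(stats):
    """Format current ARC size, target size (c) and ceiling (c_max) in GiB."""
    gib = 1024 ** 3
    return "ARC size {:.1f} GiB, target {:.1f} GiB, max {:.1f} GiB".format(
        stats["size"] / gib, stats["c"] / gib, stats["c_max"] / gib
    )


def read_arc_summary(path=ARCSTATS_PATH):
    """Read the live kstat file and return the one-line summary."""
    with open(path) as f:
        return arc_summary(parse_arcstats(f.read()))
```

Calling print(read_arc_summary()) on the NAS while rsync is running shows whether the ARC is sitting at its ceiling. An ARC that is near c_max is normal on its own; it only points at a problem if the rest of the system is starved at the same time.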

[–] [email protected] 4 points 1 year ago

Just another thought... Maybe just format the drives as a massive EXT4 JBOD (just for a temp test) and copy the data again - just to see if ZFS is the problem... maybe it's something else altogether? Maybe - and I hope not - the USB source drive is failing after long reads?

[–] [email protected] 3 points 11 months ago

I believe there's another issue. ZFS has been using nearly all RAM (which is fine, I only need RAM for system and ZFS anyway, there's nothing else running on this box), but I was pretty convinced while I was looking that I don't have dedup turned on. Thanks for your suggestions and links!
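To double-check the dedup setting rather than rely on memory, a small sketch like this could list any dataset where it is enabled. It shells out to zfs get -H -o name,value dedup, which prints tab-separated name and value pairs; the function names and the pool name "tank" in the comment are made up for illustration.

```python
import subprocess


def parse_dedup_output(text):
    """Parse `zfs get -H -o name,value dedup` output into {dataset: value}."""
    settings = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition("\t")
        settings[name] = value.strip()
    return settings


def datasets_with_dedup(settings):
    """Return the datasets whose dedup property is anything other than 'off'."""
    return sorted(name for name, value in settings.items() if value != "off")


def check_dedup():
    """Run zfs get against all datasets and report the ones with dedup enabled."""
    result = subprocess.run(
        ["zfs", "get", "-H", "-o", "name,value", "dedup"],
        capture_output=True, text=True, check=True,
    )
    # Example of a parsed result (hypothetical pool): {"tank": "off", "tank/media": "off"}
    return datasets_with_dedup(parse_dedup_output(result.stdout))
```

An empty list from check_dedup() would confirm that dedup is off everywhere, which fits the idea that something other than dedup memory pressure is behind the hangs.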