this post was submitted on 12 Nov 2023

Data Hoarder

I'm looking to add some large enterprise HDDs to my array and was wondering whether a long SMART test would be sufficient before putting a drive into service.

I run Windows with SnapRAID and use HDD Sentinel, so I could also run read and/or write tests.

I'm curious what testing methods others use after a drive is shipped to them and before putting it into service.

[–] [email protected] 1 points 1 year ago

Here is my over-the-top method.

++++++++++++++++++++++++++++++++++++++++++++++++++++

My Testing methodology

This is something I developed to stress both new and used drives so that any issues show up before the drive goes into service.
Testing can take anywhere from 4 to 7 days depending on hardware. I have a dedicated testing server set up.

I use a server with ECC RAM installed, but if your RAM has been tested with MemTest86+ then you are probably fine.

  1. SMART Test, check stats

smartctl -i /dev/sdxx

smartctl -A /dev/sdxx

smartctl -t long /dev/sdxx
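
One thing worth adding here: `smartctl -t long` only starts the test and returns right away. Once it has finished, `smartctl -l selftest /dev/sdxx` shows the result. For the "check stats" part, here is a rough sketch of a filter for the `-A` output. The function name `check_smart_attrs` is made up for illustration, and it assumes the ATA/SATA attribute table layout (raw value in the 10th column); SAS drives print a different report, so this would not apply to them.

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints any of
# the watched attributes (5, 197, 198) whose raw value is non-zero.
# Assumes the ATA attribute table, where column 10 holds the raw value.
# Returns 1 if at least one watched attribute is non-zero, 0 otherwise.
check_smart_attrs() {
    awk '
        $1 ~ /^(5|197|198)$/ && ($10 + 0) > 0 {
            print $2 "=" $10
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Example use (needs root and a real device, so it is left commented out):
#   sudo smartctl -A /dev/sdxx | check_smart_attrs || echo "look closer at this drive"
#   sudo smartctl -l selftest /dev/sdxx   # after the long test has finished
```

Reallocated, pending, and offline-uncorrectable sectors are the counters most people watch on a drive that just arrived, but which ones matter is a judgment call and the list above is only a starting point.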

  2. BadBlocks - This is a complete write and read test; it will destroy all data on the drive.

badblocks -b 4096 -c 65535 -wsv /dev/sdxx > $disk.log
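
When no `-o` file is given, badblocks prints the list of bad block numbers on standard output, one per line, so the redirected log should be empty for a clean drive. A minimal sketch of turning that log into a pass/fail verdict follows. The function name `badblocks_verdict` is hypothetical, and the log path you pass in is whatever file the redirect above actually produced.

```shell
# Hypothetical helper: summarize a badblocks log (one bad block number per line).
# Returns 0 when the log lists no bad blocks, 1 when it does, 2 if it is missing.
badblocks_verdict() {
    bb_log=$1
    if [ ! -f "$bb_log" ]; then
        echo "MISSING: $bb_log"
        return 2
    fi
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    bb_count=$(grep -c '^[0-9][0-9]*$' "$bb_log" || true)
    if [ "$bb_count" -eq 0 ]; then
        echo "PASS: no bad blocks listed in $bb_log"
        return 0
    fi
    echo "FAIL: $bb_count bad block(s) listed in $bb_log"
    return 1
}
```

Something like `badblocks_verdict sdxx.log` after the run finishes gives a one-line result you can paste into your test notes.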

  3. Real-world surface testing: format to ZFS. Yes, you want compression on. I have found checksum errors that having compression off would have missed. (I noticed this completely by accident. I had a drive that produced checksum errors whenever it was in a pool, so I pulled it and ran my test without compression. It passed just fine. I put it back into the pool and the errors appeared again. The pool had compression on, so I pulled the drive and reran my test with compression on, and got checksum errors. I have asked around and no one knows why this happens, but it does. It may have been a bug in early versions of ZOL that is no longer present.)

zpool create -f -o ashift=12 -O logbias=throughput -O compress=lz4 -O dedup=off -O atime=off -O xattr=sa TESTR001 /dev/sdxx

zpool export TESTR001

sudo zpool import -d /dev/disk/by-id TESTR001

sudo chmod -R ugo+rw /TESTR001

  4. Fill test using F3, then 5. ZFS scrub to check for any read, write, or checksum errors.

sudo f3write /TESTR001 && f3read /TESTR001 && sudo zpool scrub TESTR001
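
Note that `zpool scrub` returns as soon as the scrub has started, so the error counters only mean something once it has finished. On OpenZFS 2.0 or later, `zpool wait -t scrub TESTR001` should block until then; on older versions you have to poll `zpool status`. Below is a rough sketch of checking the final status output. The function name `zpool_status_clean` is made up, and it assumes the usual `NAME STATE READ WRITE CKSUM` config table and the `errors: No known data errors` summary line.

```shell
# Hypothetical helper: reads `zpool status` output on stdin.
# Succeeds only if the "No known data errors" line is present and every
# device row shows 0 in the READ, WRITE and CKSUM columns.
zpool_status_clean() {
    awk '
        /^errors: No known data errors/ { clean = 1 }
        NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ {
            if ($3 != "0" || $4 != "0" || $5 != "0") {
                print "errors on " $1 ": READ=" $3 " WRITE=" $4 " CKSUM=" $5
                bad = 1
            }
        }
        END {
            if (clean && !bad) exit 0
            exit 1
        }
    '
}

# Example use (needs a real pool, so it is left commented out):
#   sudo zpool status TESTR001 | zpool_status_clean && echo "scrub clean"
```

Comparing the counters as strings rather than numbers is deliberate, since large counts can be abbreviated in the status output.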

If everything passes, the drive goes into my good pile. If something fails, I contact the seller to get a partial refund for the drive or a return label to send it back. I record the WWN and serial number of each drive, along with a copy of any test notes:

8TB wwn-0x5000cca03bac1768 - Failed, 26 read errors, non-recoverable, drive is unsafe to use.

8TB wwn-0x5000cca03bd38ca8 - Failed, checksum errors, possibly recoverable, drive use is not recommended.

++++++++++++++++++++++++++++++++++++++++++++++++++++