It's A Digital Disease!

This is a sub that aims to bring data hoarders together to share their passion with like-minded people.

founded 2 years ago
1
1
LTO Tape speed (zerobytes.monster)
submitted 6 hours ago by [email protected] to c/[email protected]
 
 
The original post: /r/datahoarder by /u/DiskBytes on 2025-04-03 13:01:14.

Hi, I'm writing to LTO using tar and mbuffer, but even with mbuffer I notice the tape slowing down and speeding up, though it never comes to a full stop and waits. Stop/start is shoe-shining, right? Is slowing down and speeding up again OK?

This probably has to do with the file sizes and buffer sizes. I've allocated 6 GB to mbuffer, and I'm copying from a SATA drive to an LTO drive on a SAS card.

I'm wondering whether it would help with speed to ditch mbuffer and/or move the SATA drive onto the SAS card.

Thanks.

2
 
 
The original post: /r/datahoarder by /u/Neil_Hester on 2025-04-03 12:45:12.

Hi, I'm a long-time user of StableBit DrivePool (and Drive Bender before that), which I chose simply because I could add disks of varying sizes that I had lying around, or occasionally buy high-capacity drives cheaply to top up the system or replace failing drives. I really like this idea, so I built myself an HBA-attached enclosure to house 12x 3.5" spinning drives and squeezed a few more onto the motherboard SATA connectors of the PC I dedicated to being the storage server.

I decided against using MS Storage Spaces because I read so many bad experiences from users that it kinda put me off.

I would like to know if there is a better solution out there these days that can still accept random-sized drives, as I like to use them until they literally die (my drive pool is entirely duplicated for this reason). Drive Bender and DrivePool have always felt a bit clunky and slow to connect to and use from my video editing PC over a direct network connection (10GbE Mellanox cards) compared to local drives. I would also really like to increase speed by adding some SSDs as cache drives for both reads and writes, if that's even possible and/or beneficial. I've read that cache drives aren't very well implemented in DrivePool and only work for writes.

So, taking my requirements into account, is there anything else out there I should consider, or should I just continue to plod along with DrivePool?

Thanks 👍🏻

3
 
 
The original post: /r/datahoarder by /u/Neil_Hester on 2025-04-03 12:42:51.

4
 
 
The original post: /r/datahoarder by /u/CheneyQWER on 2025-04-03 12:11:50.

We know that the 20TB and 24TB are already BarraCuda, but what about the 22TB and 28TB?

5
 
 
The original post: /r/datahoarder by /u/OtakuGuru_official on 2025-04-03 10:05:46.
6
 
 
The original post: /r/datahoarder by /u/silverhand31 on 2025-04-03 09:49:57.

Hi guys, I'm looking for a way to download a whole website (just the homepage is fine) programmatically, given a URL.

I know I can open the website, right-click, choose "Save page as", and everything will be stored locally. But I want to do that with code.

I don't need fancy speed, so an existing CLI tool would be fine for me.

I was also thinking about downloading it via web.archive.org (I don't need up-to-date content). I hope there are tools for that?

Do you have any pointers on how I should go about this?

Thanks.

(I have a proxy/VPN to avoid blocking.)

7
 
 
The original post: /r/datahoarder by /u/tbok1992 on 2025-04-03 05:27:32.

So, a site called ShareCG is going down very soon. If you're not familiar, the site is notable for having a lot of free 3D models and assets, especially for DAZ Studio and Poser, and its disappearance means that a lot of stuff could become permanently lost. This is, of course, inadvisable.

So, I'm wondering: is anyone here making any effort to archive it? Or is there any interest in starting one?

I'd presume that putting a lot of the stuff up on the Internet Archive to keep it circulating might violate some of the legal terms, but like, I think that's probably preferable to it being lost forever, IDK.

I myself am currently manually downloading stuff from notable creators (because I don't know much about how to use scripts to do it, and I only have one 2TB SSD), ideally for potential future distribution, but it's slow going because, well, I'm doing it manually, so...

8
 
 
The original post: /r/datahoarder by /u/Cyber_Akuma on 2025-04-03 03:22:25.

I know there are a lot of listings on places like Amazon, but many of them are no-name brands and/or random third-party sellers from China selling what they claim are name brands... also, a lot of those are PNY, which I have had many, many issues with in the past.

Can anyone recommend places or listings for decent ones? I don't need them to be fast, or even big; I just need a bunch of reliable ones to give to others.

9
 
 
The original post: /r/datahoarder by /u/miwashi on 2025-04-03 02:55:52.

I was wondering what the problem could be. For small files it works fine, but for large files of around 1 GB it hangs when copying files from the box. I tried on 4 computers, but got the same problem.

10
 
 
The original post: /r/datahoarder by /u/newoodworker on 2025-04-03 02:04:35.

Hi,

TL;DR: Looking for relatively cheap 6-8 bay alternatives to the UNAS-Pro that would give better storage flexibility and 10GbE Ethernet. Rack mount is a plus.

Looking for something similar to a UNAS-Pro: a fairly simple, relatively cheap NAS with 10GbE Ethernet. The biggest drawback of the UNAS-Pro is the lack of configurability of storage/storage pools. I have 4x 10TB drives and 4x 8TB drives, which is enough for what I want to store. I would rather run 2 pools, with the flexibility to have more redundancy for important files on one pool, and more space in the other pool for Linux ISOs and other files which I could afford to lose.

I do have a "NAS motherboard", which would be well suited for TrueNAS/Unraid, but it is also a fairly powerful board which I would rather dedicate to compute workloads than run compute + NAS on the same hardware. I would like to run Proxmox to host different compute options, and running TrueNAS/Unraid as a VM within the Proxmox host has some undesirable limitations.

11
1
How To Track Data (zerobytes.monster)
submitted 14 hours ago by [email protected] to c/[email protected]
 
 
The original post: /r/datahoarder by /u/inthetrees101 on 2025-04-03 02:11:17.

I started a new job as a nurse in an emergency department and I want to keep track of some metrics. Right now I'm thinking chief complaint, IV started, age, and gender. (I'm open to others.)

What would be the best way to keep track of all of this? Simply use an Excel spreadsheet?

12
 
 
The original post: /r/datahoarder by /u/DrKersh on 2025-04-03 02:03:38.

Original Title: looking for an hdd enclosure of 4 / 5 drives or 2/3 x2 that makes the least noise possible from fans and even hdd vibrations, and if possible, doesn't auto-spin down the drives or let me configure it.


Hi

I am looking for an external HDD enclosure. I would like to use it with at least 3 drives, but possibly 4 or 5 in the future, so I think a 4- or 5-drive enclosure would be better? Or maybe a 2- or 3-drive one, and buy 2 or 3 of them?

I would really love it if the enclosure made the least noise possible, and that includes the sound of the HDDs' vibration and spinning. I tried just putting them inside the computer case, but the noise they make is driving me insane, so I want to put them in an external case 3 meters away from me and connect them through a long USB cable.

Also, if possible, the enclosure shouldn't spin down my drives if I don't want it to, or if I have Windows set not to do so. I think this will be impossible, but well, it is what it is.

I'm not looking for a NAS, just the cheapest external enclosure with those features.

I've seen a lot of brands like ORICO, Fantech, TerraMaster, and Sabrent, but after losing hours trying to find specs or opinions, I ended up even more confused than before.

It will be used for torrenting and watching media when I'm on the computer. I don't need to run it while I'm not. Just that.

13
 
 
The original post: /r/datahoarder by /u/VviFMCgY on 2025-04-03 01:41:04.

I have 2 NASes at home, both running Core. I really have no issues to speak of; everything just works.

I plan to stay on Core until I can't. What are you doing?

14
 
 
The original post: /r/datahoarder by /u/Tron_Livesx on 2025-04-03 00:41:29.
15
 
 
The original post: /r/datahoarder by /u/dlm2137 on 2025-04-03 00:09:55.

Anyone gamed out how these will affect hard drive prices yet? I see there are several countries from SE Asia on the list.

16
 
 
The original post: /r/datahoarder by /u/AxlJones on 2025-04-02 19:32:16.

Does anyone have a nice tool to scrape everything off their YouTube account? Favorite videos, subscriptions, uploaded videos, etc.?

17
 
 
The original post: /r/datahoarder by /u/ArtLongjumping487 on 2025-04-02 16:02:40.

Is there anyone here with a good lead?

18
 
 
The original post: /r/datahoarder by /u/chillinewman on 2025-04-02 15:04:09.

Does anyone know the story behind this? I'm surprised I don't see anyone talking about it.

The URL was: https://www.youtube.com/hubblespacetelescope

19
 
 
The original post: /r/datahoarder by /u/daxliniere on 2025-04-02 18:10:37.

Hey everyone,

I am trying to purchase a pair of Toshiba MG drives for a small NAS. I had an 'experience' with a seller on Amazon: a Toshiba drive sealed in a WD anti-static bag, the recovery software R-Studio showed extensive usage, and the S.M.A.R.T. data had been reset! 😯 (Thankfully, it was refunded.) So I'm now wary of third-party sellers on Amazon/eBay, even if they claim the drives are new and come with warranty.

I'm looking at sizes between 12 and 16TB and having trouble finding reasonable prices. There seems to be a monopoly in the UK, as Scan seems to be one of the only known companies able to get stock of these sizes.

Can anyone recommend a good deal from a reliable outlet, please?

Thank you for reading this far. :)

20
 
 
The original post: /r/datahoarder by /u/SuperCiao on 2025-04-02 17:34:09.

https://preview.redd.it/qv6j6cp6jgse1.png?width=866&format=png&auto=webp&s=70fde72ef385583dfb49acc9ad6a2a7ba839372a

Xreveal PRO is stuck at 99%, but the operation reports success. Is that normal?

21
 
 
The original post: /r/datahoarder by /u/Dnasrphotography on 2025-04-02 17:06:56.

I'm a photographer and currently have about 6.5TB of data, nothing crazy yet. I have a 4TB SSD that I edit off of, formatted exFAT; a Seagate 8TB HDD for archiving, also exFAT (it holds a backup of my SSD and everything else); and another, newer Seagate 8TB HDD formatted APFS that right now is just a second backup of the SSD. The exFAT HDD and the 4TB SSD are backed up to Backblaze as well.

I'm wondering if I should back up the first Seagate exFAT HDD to the Seagate APFS HDD, then have Best Buy (I have a Plus membership, so this would be free) offload the data, reformat the exFAT drive to APFS, and reload the data. I do have a Windows laptop, but I seldom use it, and figured the SSD could be kept as exFAT for that reason.

I wanted to get some suggestions and feedback. I wasn't sure whether having both HDDs as APFS would make better use of the space on the drives and make reads and writes a lot faster, or whether it's worth the trade-off in my situation.

22
 
 
The original post: /r/datahoarder by /u/kschaffner on 2025-04-02 15:47:00.
23
 
 
The original post: /r/datahoarder by /u/kcombinator on 2025-04-02 14:52:06.

I’m trying ArchiveBox, and it has a lot of nice ideas, but it is completely inadequate. I need something that can fetch in parallel. I have ~25k unique bookmarks dating back almost 15 years and just want to preserve what I still can. Does anyone have any recommendations?

24
 
 
The original post: /r/datahoarder by /u/TheFeshy on 2025-04-02 14:32:28.

I've seen the "Do I need ECC RAM" question come up from time to time, so I thought I'd share my experience with it.

The common wisdom is this: cosmic ray bit flips are rare. And the chances that they happen in a bit of memory you actually care about are rarer still. And from a data hoarder perspective, the chances that they occur in a bit of memory you're just about to write to disk are vanishingly small. So it's not really worth the jump in price to enterprise equipment, which is often the only way to get ECC RAM (Even when the RAM itself isn't much more expensive.)

Well, I've been data hoarding since the late '90s, and all but the last 5 years were on consumer-grade, non-ECC equipment. And I've finally gotten around to using a program that will go through my hoard and compare it with existing Linux ISO torrent files, to see if I've got the same version. Then I can re-share stuff that's been sitting around for a decade or more. It's been a fun project.

This program allows you to identify less-than-perfect matches, in case you've got a torrent with many Linux ISOs and only one doesn't match, or there are some junk files you've lost track of, or whatever.

I was finding that, sometimes, I'd get a folder of Linux ISOs where they all match except one. And stranger still, I'd get some ISOs that were showing 99% match, but only had one file! So I started looking into this, and did a binary comparison of a freshly downloaded copy and my original. I found they didn't match by a single byte! But all these files were on ZFS initially, and now Ceph - both check for bitrot on every read, and both got regular scrubs to check as well. So how could I be seeing bitrot?

What I found is this (six random examples from my byte-by-byte comparisons). See the pattern?

Offset    F1 F2
--------- -- --
5BE77DA0  29 69
1FF937DA0 A8 E8
234777DA0 24 64
29DE37DA0 0B 4B
2B7537DA0 3A 7A
2F88D7DA0 9F DF

If you do, consider your geek card renewed. The difference between the byte from the first copy and the byte from the second copy is always 0100 0000.
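
For anyone who wants to check the pattern rather than eyeball it, here is a minimal Python sketch. It is not from the original post: the byte pairs are copied from the table above, while the helper names and the idea of also looking at the low 16 bits of each offset are assumptions added for illustration.

```python
# Rows copied from the table above: (offset, byte in copy 1, byte in copy 2).
SAMPLES = [
    (0x5BE77DA0, 0x29, 0x69),
    (0x1FF937DA0, 0xA8, 0xE8),
    (0x234777DA0, 0x24, 0x64),
    (0x29DE37DA0, 0x0B, 0x4B),
    (0x2B7537DA0, 0x3A, 0x7A),
    (0x2F88D7DA0, 0x9F, 0xDF),
]


def flipped_bits(a: int, b: int) -> int:
    """XOR leaves a 1 in every bit position where the two bytes differ."""
    return a ^ b


def diff_offsets(left: bytes, right: bytes):
    """Yield (offset, left_byte, right_byte) wherever two buffers differ.

    Hypothetical helper: it works on equal-length in-memory buffers, so a
    real comparison of multi-gigabyte ISOs would need to read in chunks.
    """
    for offset, (x, y) in enumerate(zip(left, right)):
        if x != y:
            yield offset, x, y


# Every distinct XOR mask seen across the sampled rows.
masks = {flipped_bits(f1, f2) for _, f1, f2 in SAMPLES}
# The low 16 bits of every sampled offset.
low_bits = {offset & 0xFFFF for offset, _, _ in SAMPLES}

for offset, f1, f2 in SAMPLES:
    print(f"{offset:09X}  {f1:02X} ^ {f2:02X} = {flipped_bits(f1, f2):08b}")
```

Every row XORs to the same mask, which is the single-bit difference described above. The sampled offsets also all end in the same four hex digits, as the table shows.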

I noticed another thing: all the files have write dates in 2011 or 2012.

That's when it hit me: I RMA'd a stick of RAM about that time. Late 2012, according to my email records.

I had been doing a ZFS scrub, and found an error. Bitrot! I thought. ZFS worked! During the next scrub, it found two such errors, and I started to worry about my disks. Then it found more in a scrub later, and I got suspicious. So I ran memtest on the RAM for 12 hours, and it showed no errors. Just like when I tested it when it was new. Maybe it really is my disks then?

Then I did another ZFS scrub, which found more errors, so out of paranoia I ran memtest for 48 hours. That was many loops through all its tests, and it found 2 errors in all those loops. So most times it did the whole loop fine, but sometimes it failed a single test with a single error.

That was enough to replace the RAM under warranty, and I got no more scrub errors on the next scrub. Problem solved.

Except... except. Any file written during that time was cached in that RAM first. And if the checksums that ZFS computes are calculated on the RAM copy of the data with a bad bit (say, a single bit in a single byte that sometimes comes up 1 when it should be 0), the checksum is computed over bad data. So ZFS preserves that bad data with checksum integrity.
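
The failure mode in the paragraph above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ZFS code: SHA-256 stands in for whatever checksum the filesystem actually uses, and `write_block`, `scrub`, and the stuck-bit mask are hypothetical names invented for the example.

```python
import hashlib


def write_block(data: bytes, stuck_mask: int = 0x00, bad_index: int = 0):
    """Simulate a write path where the data sits in RAM before checksumming.

    Assumption for illustration: one RAM bit is forced to 1 (stuck_mask)
    at bad_index, and the checksum is computed afterwards on that copy.
    """
    in_ram = bytearray(data)
    in_ram[bad_index] |= stuck_mask          # corruption happens in RAM
    stored = bytes(in_ram)                   # what actually reaches the disk
    checksum = hashlib.sha256(stored).hexdigest()  # computed on the bad copy
    return stored, checksum


def scrub(stored: bytes, checksum: str) -> bool:
    """A scrub can only confirm that the stored data matches its checksum."""
    return hashlib.sha256(stored).hexdigest() == checksum


original = b")" + b" rest of the file"       # first byte is 0x29
stored, checksum = write_block(original, stuck_mask=0x40)

print("data changed:", stored != original)
print("scrub passes:", scrub(stored, checksum))
```

In this sketch the stored block no longer matches the original, yet the scrub still passes, because the checksum was taken after the bit flipped. The checksum protects what was written, not what was meant to be written.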

A cosmic ray flip at just the wrong time would be a single file in your hoard - maybe you'd never notice. The statistical analysis at the start of this post is true.

But a subtly bad stick of RAM? It might sit in your system for years - two in my case - and any file written in those two years might now be suspect.

And any file with a date later than that is also suspect, since it might have been written to, modified, copied, or touched from a file in your suspect date range.

I've found dozens of files with a single bad byte, based on the small percentage I've been able to compare against internet versions.

And the problem is not easy to sort out! I have backups of important stuff, sure - but I'm now looking at thirteen years of edits to possible bad files, to compare to backups. And I don't keep backup version history that old. And for Linux ISOs, while many files are easy to replace, replacing every file is a much bigger task.

So, TL;DR: Yes, folks, in my opinion you want ECC RAM on your storage machine(s.) Lest you wind up looking at every file written since the first Obama administration with suspicion, like I now do.

25
 
 
The original post: /r/datahoarder by /u/DryDealer3816 on 2025-04-02 14:14:20.

What would you do?

I thought of this because I'm currently downloading Professor Leonard's Calculus playlist because I don't want it to go anywhere before I have a chance to watch it 🥺. So if they announced YouTube is getting wiped in a year (and they didn't do anything to try and stop the obviously incoming download frenzy) what would you do?

I'm not sure if I'm allowed to make a post like this here, if I'm not, my apologies. I didn't see anything in the rules that would suggest this kind of post is forbidden.
