This is an automated archive.

The original was posted on /r/datahoarder by /u/__markb on 2024-01-19 11:01:53+00:00.


Hi dear data hoarders!

This is not a typical "hoarding" scenario, but given the amount of data this community handles, I think you'll have some very sound advice.

I have video media projects - about 2000 new projects a year. Each month I get notice of projects that are no longer needed and can be archived off; on average this comes to about 1000 projects a year.

A single project can range from 4GB all the way up to 60TB. A lot of the projects are under 1TB, but even at 1000 projects a year I'm archiving about 1PB yearly.

Currently - and only because it's based on an old process - we burn each project off to spanned discs. The physical media was cheap and we were able to store it offsite in perfect conditions - dark, temperature monitored, etc.

Previously, recalls for this media to be brought back online were rare - maybe once a year. We are now seeing that trend increase.

Protocol says we can't keep it online or nearline; it needs to be removed from the active system. If we could change this we would, but it's not a simple task.

However, we do have access to a growing multi-petabyte storage system designed to be the digital offsite archive. It was approved for paper digitisation archival, and has now been extended to us for digital media.

The interface for the storage doesn't allow folders, so each archive needs to be self-contained - zip, sfx, tar, iso, etc.

Questions:

What I'd like advice on is the easiest way to archive these large amounts of data, given that each project has to end up as one self-contained file. As in, ProjectA and ProjectB cannot be in the same zip; they would be ProjectA.zip and ProjectB.zip.

How or what tools can I use to achieve this with the least amount of interaction? What would maintain data integrity best? (I've put a rough sketch of what I'm imagining just after these questions.)

Are there any recommendations on the best compression-versus-time-to-compress trade-offs?
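To make the question concrete, this is the rough direction I've been imagining - a minimal sketch using Python's standard library, where the paths, names, and manifest layout are placeholders rather than our actual setup:

```python
# Sketch: one uncompressed tar per project, plus a SHA-256 recorded at
# creation time so integrity can be checked years later. Paths are
# placeholders, not our real locations.
import hashlib
import tarfile
from pathlib import Path

SOURCE_ROOT = Path("/mnt/active/projects")    # hypothetical source location
ARCHIVE_ROOT = Path("/mnt/archive/outbound")  # hypothetical staging area

def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream a file through SHA-256 without loading it into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def archive_project(project_dir: Path) -> Path:
    """Pack one project directory into its own self-contained tar."""
    tar_path = ARCHIVE_ROOT / f"{project_dir.name}.tar"
    # Mode "w" = no compression: speed over size, since space isn't a concern.
    with tarfile.open(tar_path, "w") as tar:
        tar.add(project_dir, arcname=project_dir.name)
    return tar_path

if __name__ == "__main__":
    manifest = ARCHIVE_ROOT / "manifest.sha256"
    with manifest.open("a") as m:
        for project_dir in sorted(SOURCE_ROOT.iterdir()):
            if not project_dir.is_dir():
                continue
            tar_path = archive_project(project_dir)
            m.write(f"{sha256_of(tar_path)}  {tar_path.name}\n")
```

The thinking is that skipping compression keeps the packing step I/O-bound, and the checksum manifest travels alongside the archives so they can be verified after any recall.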

What I've researched:

So I'm not coming in without having tried anything first. I've tried 7zipping the projects: 14TB takes about 100 hours on the Store setting. I can lower that to about 40 hours with the LZMA2 compression option.
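For reference, the two settings I tested map roughly to invocations like these (sketched via Python's subprocess, assuming the 7z command-line tool is on PATH; I believe -mx=0 is Store and -m0=lzma2 selects LZMA2, but double-check the flags before trusting them):

```python
import subprocess

project = "ProjectA"  # placeholder project name

# Store setting: no compression at all (-mx=0) - this was my ~100 hour run.
subprocess.run(["7z", "a", "-mx=0", f"{project}.7z", project], check=True)

# LZMA2 method instead - this was the run that came down to ~40 hours.
subprocess.run(["7z", "a", "-m0=lzma2", f"{project}.7z", project], check=True)
```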

There is no need to save space - though if it makes things faster, that's a bonus.

I also tried PeaZip, but didn't let it run long enough to see the results, as the UI was frozen with no indication of progress.

I did try ImgBurn for directory-to-ISO, but that took longer than the 40 hours with 7zip, so it didn't seem a viable option.

We do have an option in the archival software (which gathers all the media for the project) to create a TAR file - however, I've had very little interaction with that format and I'm not 100% sure about its long-term integrity.
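On the TAR integrity worry, my current thinking is that the format itself is old, simple, and widely readable, and that the practical safeguard is verifying each archive right after creation and keeping checksums next to it. A minimal verification sketch (same assumptions as above - standard-library Python, with the manifest format from the earlier sketch):

```python
import hashlib
import tarfile
from pathlib import Path

def verify_archive(tar_path: Path, expected_sha256: str) -> bool:
    """Re-hash the tar and confirm every member can actually be read back."""
    digest = hashlib.sha256()
    with tar_path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256:
        return False
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if member.isfile():
                extracted = tar.extractfile(member)
                # Read and discard: proves the member is intact and readable.
                while extracted.read(1024 * 1024):
                    pass
    return True
```

Reading every member back is slow on multi-TB archives, but it only has to happen once, at archive time, before the original is deleted.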


Again, mods, if this is too far from data hoarding I'm happy to remove it - but I thought it might be helpful if there are answers here for those who are hoarding larger media or automating their processes.
