This is an automated archive.
The original was posted on /r/datahoarder by /u/Grimy81 on 2024-01-24 01:14:23+00:00.
Morning folks,
I have the joy of coming up with a solution to verify that the hundreds of millions of files/folders that were copied to an Azure File Share do indeed match the source files. From what I've found, MD5 would not be my friend, as it doesn't scale to these kinds of numbers. Perhaps xxHash or something like that for the Windows world? Both source and destination can be mapped from a Windows host, if that is a factor.
There is a check built in to azcopy that should have been used at the time, but this is now somewhat irrelevant: the 120 TB worth of files is already in the AFS space, and the post-copy check was going to take something like 3 years based on test results.
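For illustration only, here is a minimal Python sketch of the kind of check being asked about, not a tested solution at this scale. It assumes both trees are reachable as ordinary paths from one host (for example, mapped drives), compares file sizes first so that hashing is only done when sizes agree, and uses BLAKE2b from the standard library as a stand-in for xxHash, which would need a third-party package. The names `file_digest` and `compare_trees` are hypothetical.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # read in 1 MiB blocks so large files are never loaded whole


def file_digest(path):
    """Return a hex digest of the file contents.

    BLAKE2b is used here only because it ships with Python; a faster
    non-cryptographic hash such as xxHash could be swapped in.
    """
    h = hashlib.blake2b(digest_size=16)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def compare_trees(src_root, dst_root):
    """Yield (relative_path, reason) for each source file that does not match."""
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.isfile(dst):
                yield rel, "missing"
            elif os.path.getsize(src) != os.path.getsize(dst):
                # Cheap metadata check: no need to read either file.
                yield rel, "size differs"
            elif file_digest(src) != file_digest(dst):
                yield rel, "content differs"
```

Something like `for rel, reason in compare_trees(r"S:\data", r"Z:\data"): print(rel, reason)` would list the mismatches (both paths are placeholders). Note that the sketch only walks the source side, so extra files on the destination are not reported, and any content hash still has to read every byte back from the share, so the choice of hash algorithm may matter less than read throughput.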
Any tips would be much appreciated.