this post was submitted on 25 Oct 2024
Curated Tumblr
As per my other post, this person isn't doing any of that.
But, since you asked for papers on generic matching algorithms, I found this during the silent conniption fit you sent me into by suggesting that some random Tumblr user plugged a Tumblr bot directly into a state-of-the-art genomics db.
https://link.springer.com/article/10.1007/s11227-022-04673-3
Please note that while, yes, they ran this test on a standard office computer, they were only searching against 12 million characters.
A single tebibyte of characters would be more like 1 trillion characters. A pebibyte would be more like 1 ~~quintillion~~ quadrillion.
... which means much, much, much longer processing times.
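To put rough numbers on that gap, here's a quick back-of-the-envelope sketch. It assumes one character is stored as one byte, and the `scale_factor` helper is just a name made up for illustration. It only compares data sizes; it says nothing about how the paper's algorithm actually scales, so treat the ratios as a lower bound on "how much more work", not a timing prediction.

```python
# Assumption: one character == one byte (plain ASCII-style text).
PAPER_CHARS = 12_000_000   # size of the test set mentioned above
TEBIBYTE = 2**40           # 1,099,511,627,776 bytes, roughly 1 trillion
PEBIBYTE = 2**50           # 1,125,899,906,842,624 bytes, roughly 1 quadrillion


def scale_factor(target_chars: int, baseline_chars: int = PAPER_CHARS) -> float:
    """Return how many times larger the target text is than the baseline."""
    return target_chars / baseline_chars


for name, size in [("TiB", TEBIBYTE), ("PiB", PEBIBYTE)]:
    ratio = scale_factor(size)
    print(f"{name}: {size:,} chars, about {ratio:,.0f}x the 12 million char test")
```

So a tebibyte is on the order of 90 thousand times the test set, and a pebibyte is on the order of 90 million times.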
Edit: Used the wrong word for stupendously large numbers that start with q.