Curated Tumblr

For preserving the least toxic and most culturally relevant Tumblr heritage posts.

Take that, atheists (files.catbox.moe)

submitted 6 months ago by [email protected] to c/curatedtumblr

[–] fartsparkles 99 points 6 months ago (21 children)

What is up with the bird at the end?

[–] [email protected] 110 points 6 months ago* (last edited 6 months ago) (19 children)

A bot strips away all spaces and letters that aren't A, T, C or G, then treats the rest like a genetic sequence and checks it against some database.
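The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the bot's actual code: the function name `to_dna` is made up, and folding lowercase letters to uppercase is an assumption.

```python
def to_dna(text: str) -> str:
    """Keep only the letters A, T, C and G, dropping everything else.

    Assumption: lowercase letters count too, so the text is uppercased first.
    """
    return "".join(ch for ch in text.upper() if ch in "ATCG")


# The parent comment, reduced to the bases it happens to contain.
print(to_dna("What is up with the bird at the end?"))  # ATTTATT
```

The resulting string is what would then be searched against a sequence database; that lookup is not shown here.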

Presumably, it runs through many terabytes of data for each comment, as the Gallinula chloropus alone has about 51 billion base pairs, or some 15 GiB. The Genome Ark DB, which has sequences of two common moorhens, contains over 1 PiB. I wonder if a bored sequencing lab employee just wrote it to give their database and computing servers something to do when there is no task running.

No, I won't download the genome and check how close the "closest match" is, but statistically, a given 93-base sequence is expected to recur once every 4^93^ = 2^186^ bits, or once per 10^40^ PiB. Solving (4-1)^m^ × C(93, m) ≥ 4^93^ ÷ (pebi × 8) for m, where C(93, m) is the number of ways to choose m of the 93 positions, one can expect the 93-base sequence to appear at least once in a 1 PiB database only if m ≥ 32 mismatches, or over ⅓ of the bases, are allowed. Not great.

This assumes true randomness, which holds neither for naturally occurring DNA nor for letters in English text, but it should be in the right ballpark. Maybe fewer mismatches are needed if you account for insertions/deletions.
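For anyone who wants to check the estimate above, here is a minimal Python sketch of the same inequality under the same uniform-randomness assumption. The names `SEQ_LEN`, `DB_BITS` and `min_mismatches` are mine. Like the comment, it counts a 1 PiB database as 2^50^ × 8 bits.

```python
from math import comb

SEQ_LEN = 93          # bases in the filtered comment, per the estimate above
DB_BITS = 2**50 * 8   # 1 PiB expressed in bits


def min_mismatches(seq_len: int = SEQ_LEN, db_bits: int = DB_BITS) -> int:
    """Smallest m with 3^m * C(seq_len, m) >= 4^seq_len / db_bits."""
    # Exact integer division: both sides are powers of two (2^186 / 2^53).
    threshold = 4**seq_len // db_bits
    for m in range(seq_len + 1):
        # 3^m ways to change m chosen bases, C(seq_len, m) ways to choose them.
        if 3**m * comb(seq_len, m) >= threshold:
            return m
    raise ValueError("no mismatch count satisfies the inequality")


print(min_mismatches())                # 32
print(f"{4**SEQ_LEN / DB_BITS:.2e}")   # PiB between exact matches, about 1.09e+40
```

Python integers are arbitrary precision, so the comparison is exact. Note that the inequality counts sequences with exactly m mismatches rather than at most m, as in the formula above.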

[–] breakfastburrito 3 points 6 months ago

It’s probably just ncbi
