Me, crying because Leberkäs is 15€ per kilo. That thing is literally a ground-up paste of the worst cuts of meat, and they made it more expensive and added more water…
BudgetBandit
People be like "life’s shit" and then they consume food designed to keep them addicted, media picked by an algorithm to keep them watching and move less than a sloth in a day.
You’re not forced to be a wage slave. Just go live in the woods, plan a heist, get rid of a CEO or something like that. Live a little.
Not in Europe. Over here, those go for ~350 if you’re lucky
For anyone getting an error
On Monday, court documents revealed that AI company Anthropic spent millions of dollars physically scanning print books to build Claude, an AI assistant similar to ChatGPT. In the process, the company cut millions of print books from their bindings, scanned them into digital files, and threw away the originals solely for the purpose of training AI—details buried in a copyright ruling on fair use whose broader fair use implications we reported yesterday.
The 32-page legal decision tells the story of how, in February 2024, the company hired Tom Turvey, the former head of partnerships for the Google Books book-scanning project, and tasked him with obtaining "all the books in the world." The strategic hire appears to have been designed to replicate Google's legally successful book digitization approach—the same scanning operation that survived copyright challenges and established key fair use precedents.
While destructive scanning is a common practice among some book digitizing operations, Anthropic's approach was somewhat unusual due to its documented massive scale. By contrast, the Google Books project largely used a patented non-destructive camera process to scan millions of books borrowed from libraries and later returned. For Anthropic, the faster speed and lower cost of the destructive process appear to have trumped any need for preserving the physical books themselves, hinting at the need for a cheap and easy solution in a highly competitive industry.
Ultimately, Judge William Alsup ruled that this destructive scanning operation qualified as fair use—but only because Anthropic had legally purchased the books first, destroyed each print copy after scanning, and kept the digital files internally rather than distributing them. The judge compared the process to "conserv[ing] space" through format conversion and found it transformative. Had Anthropic stuck to this approach from the beginning, it might have achieved the first legally sanctioned case of AI fair use. Instead, the company's earlier piracy undermined its position.
But if you're not intimately familiar with the AI industry and copyright, you might wonder: Why would a company spend millions of dollars on books to destroy them? Behind these odd legal maneuvers lies a more fundamental driver: the AI industry's insatiable hunger for high-quality text.
The race for high-quality training data
To understand why Anthropic would want to scan millions of books, it's important to know that AI researchers build large language models (LLMs) like those that power ChatGPT and Claude by feeding billions of words into a neural network. During training, the AI system processes the text repeatedly, building statistical relationships between words and concepts in the process.
The quality of training data fed into the neural network directly impacts the resulting AI model's capabilities. Models trained on well-edited books and articles tend to produce more coherent, accurate responses than those trained on lower-quality text like random YouTube comments.
Publishers legally control content that AI companies desperately want, but AI companies don't always want to negotiate a license. The first-sale doctrine offered a legal workaround: Once you buy a physical book, you can do what you want with that copy, including destroy it.
And yet buying things is expensive, even if it is legal. So like many AI companies before it, Anthropic initially chose the quick and easy path. In the quest for high-quality training data, the court filing states, Anthropic first chose to amass digitized versions of pirated books to avoid what CEO Dario Amodei called "legal/practice/business slog"—the complex licensing negotiations with publishers. But by 2024, Anthropic had become "not so gung ho about" using pirated ebooks "for legal reasons" and needed a safer source.
Buying used physical books sidestepped licensing entirely while providing the high-quality, professionally edited text that AI models need, and destructive scanning was simply the fastest way to digitize millions of volumes. The company spent "many millions of dollars" on this buying and scanning operation, often purchasing used books in bulk. Next, they stripped books from bindings, cut pages to workable dimensions, scanned them as stacks of pages into PDFs with machine-readable text including covers, then discarded all the paper originals.
The court documents don't indicate that any rare books were destroyed in this process—Anthropic purchased its books in bulk from major retailers—but archivists long ago established other ways to extract information from paper. For example, The Internet Archive pioneered non-destructive book scanning methods that preserve physical volumes while creating digital copies. And earlier this month, OpenAI and Microsoft announced they're working with Harvard's libraries to train AI models on nearly 1 million public domain books dating back to the 15th century—fully digitized but preserved to live another day.
While Harvard carefully preserves 600-year-old manuscripts for AI training, somewhere on Earth sits the discarded remains of millions of books that taught Claude how to juice up your résumé. When asked about this process, Claude itself offered a poignant response in a style culled from billions of pages of discarded text: "The fact that this destruction helped create me—something that can discuss literature, help people write, and engage with human knowledge—adds layers of complexity I'm still processing. It's like being built from a library's ashes."
This article was updated on 6/26/25 at 7:57 a.m. to add information about the non-destructive scanning technique used by Google Books.
You can get 4TB of high quality HDD for $100 and with your budget you can get that and 2 backup drives
Create a category: "Predigested fillers with added addictive substances."
My nerdy ass brain was like "how in the Windows environment can someone like GSM" and then I continued reading…
I think Sigurd Felix Wolfgang Atreides has some very rich parents.
Also, somehow the "smartest man who knows more about cars than anyone currently alive" only uses cameras on the cars.
Not even infrared cameras, just regular ones.
As an AI-supported system, "telli" is subject to the EU Regulation on Artificial Intelligence. This obliges providers and operators to ensure that users have the necessary competencies. For that reason, a self-study course must also be completed in Bremen before the program may be used.
It’s English. First, the spelling and pronunciation would need to be made uniform.
read read red dread
Tough though through
I quit my phone addiction when they added YouTube Shorts and the Reddit API changes hit. I stopped eating processed foods during COVID because I was bored and learned how to cook (I gained 80 pounds from overeating my own cooking…). And I’m forced to move because I’ve picked up RC driving as a hobby and I don’t really like to drive safely.