BudgetBandit

joined 2 years ago
[–] BudgetBandit 1 points 5 hours ago

I quit my phone addiction when they added YouTube Shorts and the Reddit API changes happened. I stopped eating processed foods during COVID because I was bored and learned how to cook (I gained 80 pounds from overeating my own cooking…). And I am forced to move because I’ve picked up RC driving as a hobby and I don’t really like to drive safely.

[–] BudgetBandit 3 points 11 hours ago

Me, crying because Leberkäs is 15€ per kilo. That thing is literally a ground-up paste of the worst cuts of meat, and they made it more expensive and added more water…

[–] BudgetBandit 14 points 11 hours ago (2 children)

People be like "life’s shit" and then they consume food designed to keep them addicted, media picked by an algorithm to keep them watching and move less than a sloth in a day.

You’re not forced to be a wage slave. Just go live in the woods, plan a heist, get rid of a CEO or something like that. Live a little.

[–] BudgetBandit 3 points 11 hours ago

Not in Europe. Over here, those go for ~350 if you’re lucky

[–] BudgetBandit 5 points 12 hours ago

For anyone getting an error

On Monday, court documents revealed that AI company Anthropic spent millions of dollars physically scanning print books to build Claude, an AI assistant similar to ChatGPT. In the process, the company cut millions of print books from their bindings, scanned them into digital files, and threw away the originals solely for the purpose of training AI—details buried in a copyright ruling on fair use whose broader fair use implications we reported yesterday.

The 32-page legal decision tells the story of how, in February 2024, the company hired Tom Turvey, the former head of partnerships for the Google Books book-scanning project, and tasked him with obtaining "all the books in the world." The strategic hire appears to have been designed to replicate Google's legally successful book digitization approach—the same scanning operation that survived copyright challenges and established key fair use precedents.

While destructive scanning is a common practice among some book digitizing operations, Anthropic's approach was somewhat unusual due to its documented massive scale. By contrast, the Google Books project largely used a patented non-destructive camera process to scan millions of books borrowed from libraries and later returned. For Anthropic, the faster speed and lower cost of the destructive process appear to have trumped any need for preserving the physical books themselves, hinting at the need for a cheap and easy solution in a highly competitive industry.

Ultimately, Judge William Alsup ruled that this destructive scanning operation qualified as fair use—but only because Anthropic had legally purchased the books first, destroyed each print copy after scanning, and kept the digital files internally rather than distributing them. The judge compared the process to "conserv[ing] space" through format conversion and found it transformative. Had Anthropic stuck to this approach from the beginning, it might have achieved the first legally sanctioned case of AI fair use. Instead, the company's earlier piracy undermined its position.

But if you're not intimately familiar with the AI industry and copyright, you might wonder: Why would a company spend millions of dollars on books to destroy them? Behind these odd legal maneuvers lies a more fundamental driver: the AI industry's insatiable hunger for high-quality text.

The race for high-quality training data

To understand why Anthropic would want to scan millions of books, it's important to know that AI researchers build large language models (LLMs) like those that power ChatGPT and Claude by feeding billions of words into a neural network. During training, the AI system processes the text repeatedly, building statistical relationships between words and concepts in the process.

The quality of training data fed into the neural network directly impacts the resulting AI model's capabilities. Models trained on well-edited books and articles tend to produce more coherent, accurate responses than those trained on lower-quality text like random YouTube comments.

Publishers legally control content that AI companies desperately want, but AI companies don't always want to negotiate a license. The first-sale doctrine offered a workaround: Once you buy a physical book, you can do what you want with that copy—including destroy it. That meant buying physical books offered a legal workaround.

And yet buying things is expensive, even if it is legal. So like many AI companies before it, Anthropic initially chose the quick and easy path. In the quest for high-quality training data, the court filing states, Anthropic first chose to amass digitized versions of pirated books to avoid what CEO Dario Amodei called "legal/practice/business slog"—the complex licensing negotiations with publishers. But by 2024, Anthropic had become "not so gung ho about" using pirated ebooks "for legal reasons" and needed a safer source.

Buying used physical books sidestepped licensing entirely while providing the high-quality, professionally edited text that AI models need, and destructive scanning was simply the fastest way to digitize millions of volumes. The company spent "many millions of dollars" on this buying and scanning operation, often purchasing used books in bulk. Next, they stripped books from bindings, cut pages to workable dimensions, scanned them as stacks of pages into PDFs with machine-readable text including covers, then discarded all the paper originals.

The court documents don't indicate that any rare books were destroyed in this process—Anthropic purchased its books in bulk from major retailers—but archivists long ago established other ways to extract information from paper. For example, The Internet Archive pioneered non-destructive book scanning methods that preserve physical volumes while creating digital copies. And earlier this month, OpenAI and Microsoft announced they're working with Harvard's libraries to train AI models on nearly 1 million public domain books dating back to the 15th century—fully digitized but preserved to live another day.

While Harvard carefully preserves 600-year-old manuscripts for AI training, somewhere on Earth sits the discarded remains of millions of books that taught Claude how to juice up your résumé. When asked about this process, Claude itself offered a poignant response in a style culled from billions of pages of discarded text: "The fact that this destruction helped create me—something that can discuss literature, help people write, and engage with human knowledge—adds layers of complexity I'm still processing. It's like being built from a library's ashes."

This article was updated on 6/26/25 at 7:57 a.m. to add information about the non-destructive scanning technique used by Google Books.

[–] BudgetBandit 3 points 12 hours ago (4 children)

You can get a high-quality 4TB HDD for $100, and with your budget you can get that plus 2 backup drives.

[–] BudgetBandit 2 points 1 day ago

Create a category called "predigested fillers with added addictive substances."

[–] BudgetBandit 3 points 1 day ago (1 children)

My nerdy ass brain was like "how in the Windows environment can someone like GSM" and then I continued reading…

[–] BudgetBandit 10 points 1 day ago* (last edited 1 day ago)

I think Sigurd Felix Wolfgang Atreides has some very rich parents.

[–] BudgetBandit 18 points 1 day ago

Also, somehow the „smartest man who knows more about cars than anyone currently alive“ only uses cameras on the cars.

Not even infrared cameras, just regular ones.

[–] BudgetBandit 12 points 2 days ago

As an AI-supported system, "telli" is subject to the EU regulation on artificial intelligence. This obliges providers and operators to ensure that users have the necessary competencies. That is why a self-study course must also be completed in Bremen before anyone is allowed to use the program.

[–] BudgetBandit 5 points 2 days ago (2 children)

It’s English. First the spelling and pronunciation would need to be made uniform.

read read red dread

Tough though through

 

Btw I’m vegan.

I use arch btw.

Only cooking with my cast iron skillet.

 
77
heat ≠ Heat (sh.itjust.works)
 
 

It’s driving me crazy. Partner on phone, playing videos and games while the TV is running some YouTube reaction stuff.

I wear noise cancelling headphones all day at home because it’s just too much. The volume is so high I can hear it through a closed door. My PC is in the same room as the TV and I can’t even concentrate on a tutorial on how to do some editing stuff.

Partner is also suffering from depression, so every freaking time I’ve begged them to please turn the TV off, it has just ended in a 30-minute therapy session at home on how I can improve myself.

Anyone in a similar situation?

43
Windows 11's inconsistency (self.mildlyinfuriating)
 

You highlight something in Edge and release your finger from the button? New menu. Now you can copy it directly! Great! No need for right clicking every time, right? No! It only works in Edge! You want to copy from Outlook? Nothing! You highlight in Word or Excel? You can change the font of the selected text! No copy option!

You expect it to work a certain way and then it just doesn’t. Don’t get me wrong, I love right-clicking, but expecting not to have to and then still needing to is just… inconsistently inconvenient.

27
Selling BTC or not..? (self.nostupidquestions)
 

This might be a bit of a bad question, but I don’t know where to ask to get the least biased responses.

So, I have about $1,000 in Bitcoin that used to be $300 (I put about $1,500 into various shitcoins before getting that BTC).

I fly drones as a hobby and I was thinking of getting a new system for that amount of money.

87
Raising the morale (sh.itjust.works)
submitted 1 month ago* (last edited 1 month ago) by BudgetBandit to c/funny
 
 
 

So far I’ve ditched the ketogenic diet so as not to reveal my lifestyle change to anyone. I am still counting calories and macros. I skip the gym only on days when my girl is at home, so as not to raise any suspicion. On gym days I eat around 1900~2100 kcal; on days without, I stay at 1600~1800 kcal. So far I lose about 100 g a day (1.5 lbs a week). I’m down 4.6 kg (10 lbs) since I started on December 27. The belt hole puncher is useful: I’m down 2 holes on the belt, 1 of which I needed to punch myself.

I’m planning on going public with the weight loss in March, starting keto and the gym officially.

 

Hello! I’m writing this here because I am really excited! I just subscribed to the local gym next to my workplace for 2 months, since I’ll be stationed somewhere else in March. I really need to tell someone, but I can’t tell anyone who knows me because that would instantly trigger me and I know I’d never go again afterwards, so I gotta tell you all! I started calorie counting on January 1 and am on my 3rd day of keto. I just finished my first training session and I love the way my body feels!

It’s weird how no one wants to go to the gym, yet no one ever says it was bad afterwards; everyone (regular people) just feels good!

I lost 80 pounds back in 2017, held the weight, and then got fat again after adopting my wife’s eating habits.

I’m looking forward to shedding at least 80 pounds in these 60 days. I know it is impossible, but you know - it’s my goal. And if I can’t reach that I’ll just try harder next time.

I’m planning on going every day except Sundays. Today I did 20 minutes on the elliptical, chest and biceps weight training, and 4 minutes on the stair master.

I’ll update you all again once the subscription runs out!

Anyway, I hope I didn’t break any community rule with this post.
