this post was submitted on 24 Jun 2025

Technology

[–] [email protected] 21 points 2 days ago (9 children)

...no?

That's exactly what the ruling prohibits - it's fair use to train AI models on any copies of books that you legally acquired, but never when those books were illegally acquired, as was the case with the books that Anthropic used in their training here.

This satirical torrent client would be violating the laws just as much as one without any slow training built in.

[–] RvTV95XBeo -1 points 2 days ago (8 children)

But if one person buys a book, trains an "AI model" to recite it, then distributes that model we good?

[–] [email protected] 7 points 2 days ago (6 children)

I don't think anyone would consider complete verbatim recitation of the material to be anything but a copyright violation, since what you produce is exactly the same thing as the original.

Fair use requires the derivative work to be transformative, and no transformation occurs when you verbatim recite something.

[–] RvTV95XBeo -3 points 2 days ago (4 children)

"Recite the complete works of Shakespeare but replace every thirteenth thou with this"

[–] [email protected] 7 points 2 days ago (1 children)

existing copyright law covers exactly this. if you were to do the same, it would also not be fair use or transformative

[–] [email protected] 2 points 2 days ago

Well, except Shakespeare is already public domain.

[–] [email protected] 1 points 1 day ago

A court will decide such cases. Most AI models aren't trained for the purpose of whitewashing content, even if some people imply that's all they do. But if you actually trained a model for this explicit purpose, you would most likely not get away with it if someone dragged you in front of a court.

It's similar to the defense that some file hosting websites (e.g. MEGA) used against claims of hosting and distributing copyrighted content. In those cases it was very clear what their real goals were (especially in court), and yet it did not kill all file sharing websites, because not all of them were built with the intention of distributing illegal material under the guise of legitimate operation.

[–] [email protected] 1 points 1 day ago

I'm picking up what you're throwing down, but using as an example something that's been in the public domain for centuries was kind of silly in a teehee way.

[–] [email protected] 3 points 2 days ago

I'd be impressed with any model that succeeds with that, but assuming one does, the complete works of Shakespeare are not copyright protected. They fell into the public domain a very long time ago.

For any works still under copyright protection, it would probably take a trial to determine whether a given work is transformative enough to be considered fair use. I'd imagine that this one would not clear that bar.
