this post was submitted on 12 Jul 2024
Technology
This is a very weird assumption you are making, man. The quoted text you sent above pretty much says the opposite. It says everyone who wants to train their models with copyrighted data needs to get permission from the copyright holders. That is great for me, period. No one, neither a big company nor the open source community, gets to steal the work of people producing art, code, etc. I honestly don't get why you assume all the data scraped before would be exempt. Again, very weird assumption.
As for ML algorithms having uses, of course they do. Hell, pretty much every company I have worked with has used them for decades. But take a look at the examples you provided. None of them requires you or your company to scrape a bunch of information from random people on the internet, especially not copyrighted art, literature, or code. And that's the point here: you are acting like all of that stops with these laws, but that's ridiculous.
The article is pro-corpo. I'm looking at the bill, and it's quite clear where it's headed.
None of what I mentioned is possible without the LLM that's at its heart. Just training an LLM costs a million or two in compute. We don't get the next generation for free if laws like this tack on an extra 80 million. 6 million for Reddit, and that was when you could scrape it for free, and that's just a drop in the bucket.