Technology

OpenAI being Sued for "Stealing" People's Content Online (www.firstpost.com)

submitted 1 year ago by [email protected] to c/[email protected]

OpenAI's ChatGPT and Sam Altman are in massive trouble. OpenAI is getting sued in the US for illegally using content from the internet to train its large language models (LLMs).

[–] [email protected] 5 points 1 year ago (2 children)

There's a line here that is a little ambiguous.

If I create a program that's designed to learn to play video games, do I need to specifically get the consent of the developers of all games that I have legal access to? Do I need to be able to redistribute a piece of IP before I can make use of it to train an AI?

That doesn't seem right.

Do I need to own a copyright before I can use Dark Reader on a webpage? To use accessibility software? Ad blockers?

Do I need to own a piece of music in order to learn to play it? To learn about composing from it and take it as a source of inspiration?

It seems to me that if you're putting your content out there for all the world to see, the world seeing that through the lens of a program they wrote and making use of that experience to teach their program to understand language and visual representations ought to be within the realm of the reasonable and expected.

We live in a world where our data is gathered sneakily on a regular basis in order to build massively invasive personality profiles on us that do us no good and make a massive profit for others. Everybody's data is already being stolen. But this uses information that's out there for anyone to take and hands us something of incredible value in return that gives tremendous power to individuals. It learns from us and we learn from it. Seems like a fair trade.

LLMs are a tremendous resource that we really need to protect public access to.

[–] [email protected] 3 points 1 year ago

It's not a fair trade if there is no consent.

[–] [email protected] 2 points 1 year ago

The main difference is that if you as a person learn music, or play video games, or anything else where you take someone else's work and make it your own, you are making it your own.

ChatGPT and other AI like it only regurgitate their training data in a mishmash of almost, but not quite, nonsense. There's no "reimagining" and no creativity; it's just a literal rehash of the training data.

It's even in the name: the P in ChatGPT stands for "pre-trained". It isn't learning anything new; it's just spitting out bits and pieces of what you originally fed it, and that is copyright infringement with extra steps.