This is why code AND cloud services shouldn't be copyrightable or licensable without some kind of transparency legislation to ensure people are honest. Either forced open source, or some kind of code-review submission to a government authority that can be unsealed in legal disputes.
Obviously nobody fully knows where all that training data came from. They used web-scraping tools like there was no tomorrow; with that amount of information you can't tell where all the training material comes from. Which doesn't mean the tool is unreliable, but that we don't truly know why it's this good, unless you can somehow access all the layers of the digital brains operating these machines, and that isn't doable with a closed-source model, so we can only speculate. This is what's called a black box, and we use it because we trust the output enough to do so. Knowing in detail the process behind each query would thus be taxing. Anyway... I'm starting to see more and more AI-generated content. YouTube is slowly but surely losing significance and importance, as I don't search for information there any longer, AI being one of the reasons for this.
"We let the English robot read every book in the library." <assorted gasps, clutched pearls>
"We showed the drawing robot a bunch of stuff from the internet." <assorted gasps, clutched pearls>
"We... totally didn't show the video robot our DVD collection, probably." <assorted gasps, clutched pearls>
If you can't win, why try?
Of fucking course generative models were trained on copyrighted data. How else would they exist? We only broke through forty years of dead-end tweaking and guesswork by shoveling as much information as possible into the biggest networks we could run.
Now this tech is halfway to magic, and you expect me to dislike it because of copyright? I don't even respect copyright normally! I'm not about to fall for Hollywood screaming about some new advancement, again, if there's essence of Disney in the pile of math that turns "Shrek fighting Vader" into a real video file.
It's been barely two years since this stopped being tiny blurry pictures in research PDFs, and now Sora can spit out a finished high-def video shot with no actors or animators whatsoever. We've made a thousand pictures worth about twenty words. This is fucking awesome and I refuse to pretend otherwise.
These companies don't even give a shit. They used AI to threaten all the people who make their movies the hard way. Studios think this tech will let them get rid of creative people, instead of letting creative people get rid of studios. These empty suits can only imagine you'll give them even more money once you can wish movies into existence.