this post was submitted on 21 May 2024
512 points (95.4% liked)
Technology
60828 readers
4313 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related content.
- Be excellent to each other!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, to ask if your bot can be added please contact us.
- Check for duplicates before posting, duplicates may be removed
Approved Bots
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
The training doesn't use csam, 0% chance big tech would use that in their dataset. The models are somewhat able to link concept like red and car, even if it had never seen a red car before.
Well, with models like SD at least, the datasets are large enough and the employees are few enough that it is impossible to have a human filter every image. They scrape them from the web and try to filter with AI, but there is still a chance of bad images getting through. This is why most companies install filters after the model as well as in the training process.