this post was submitted on 21 Jun 2023
8 points (100.0% liked)
Excellent Reads
Are you tired of clickbait and the current state of journalism? This community is meant to remind you that excellent journalism still happens. While not sticking to a specific topic, the focus will be on high-quality articles and discussion around their topics.
Politics is allowed, but should not be the main focus of the community.
Submissions should be articles of medium length or longer, meaning they should take 5 minutes or more to read. Article series also qualify.
Please either submit an archive link, or include it in your summary.
Rules:
- Common Sense. Civility, etc.
- Server rules.
founded 1 year ago
Great article. Other discussions of AI training keep coming back to how data collected from social media from now on might be poisoned and can't be trusted by default now that all the new chatbots are around, and how RLHF will be needed to compensate, making training that much more expensive and difficult. The final line of this article puts the data poisoning problem into full perspective.
I never thought about it like that, but you're right: data quality matters. I saw a discussion on another board about how all of the Reddit data that we use in our searches might become extremely valuable, since the majority of it was genuinely human.
Of course, obligatory fuck u/spez for his handling of what we all created, but there's no reason we can't do it again here.