this post was submitted on 26 Jul 2024
230 points (96.7% liked)

science

14885 readers
305 users here now

A community to post scientific articles, news, and civil discussion.

rule #1: be kind

<--- rules currently under construction, see current pinned post.

2024-11-11

founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
[โ€“] [email protected] 8 points 4 months ago* (last edited 4 months ago) (1 children)

In this case, the models are given part of the text from the training data and asked to predict the next word. This appears to work decently well on the pre-2023 internet as it brought us ChatGPT and friends.

This paper is claiming that when you train LLMs on output from other LLMs, it produces garbage. The problem is that the evaluation of the quality of the guess is based on the training data, not some external, intelligent judge.

[โ€“] [email protected] 2 points 4 months ago

ah I get what you're saying., thanks! "Good" means that what the machine outputs should be statistically similar (based on comparing billions of parameters) to the provided training data, so if the training data gradually gains more examples of e.g. noses being attached to the wrong side of the head, the model also grows more likely to generate similar output.