this post was submitted on 12 Apr 2024
1001 points (98.6% liked)

Technology

57472 readers
3622 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 2 points 4 months ago

Yeah. So with the pretrained models they aren't instruct tuned so instead of "write an ad for a Coca Cola Twitter post emphasizing the brand focus of 'enjoy life'" you need to do things that will work for autocompletion like:

As an example of our top shelf social media copywriting services, consider the following Cleo winning tweet for the client Coca-Cola which emphasized their brand focus of "enjoy life":

In terms of the pre- and post-processing, you can use cheaper and faster models to just convert a query or response from formatting for the pretrained model into one that is more chat/instruct formatted. You can also check for and filter out jailbreaking or inappropriate content at those layers too.

Basically the pretrained models are just much better at being more 'human' and unless what you are getting them to do is to complete word problems or the exact things models are optimized around currently (which I think poorly map to real world use cases), for a like to like model I prefer the pretrained.

Though ultimately the biggest advantage is the overall model sophistication - a pretrained simpler and older model isn't better than a chat/instruct tuned more modern larger model.