this post was submitted on 26 Jul 2023

LocalLLaMA

Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

I'm trying to learn more about LLMs, but I haven't found any explanation for what determines which prompt template format a model requires.

For example, meta-llama's llama-2 requires this format:

...[INST] and <<SYS>> tags, BOS and EOS tokens...

But if I instead download TheBloke's version of llama-2, the prompt template should instead be:

SYSTEM: ...

USER: {prompt}

ASSISTANT:

I thought this would have been determined by how the original training data was formatted, but afaik TheBloke only converted the llama-2 models from one format to another. Looking at the documentation for the GGML format, I don't see anything related to the prompt being embedded in the model file.
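To make the comparison concrete, here is a minimal Python sketch of both templates as plain string formatting. The function names are made up for illustration. The llama-2 layout follows the [INST] / <<SYS>> convention from Meta's reference chat code as far as I know it, and the BOS token is left out on the assumption that the tokenizer adds it. The single newlines in the SYSTEM/USER/ASSISTANT variant are also an assumption, so check the model card for the exact spacing.

```python
def llama2_chat_prompt(system: str, user: str) -> str:
    """Single-turn prompt in the llama-2 chat style.

    Assumption: the BOS token (<s>) is added by the tokenizer,
    so it is not written into the string here.
    """
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user.strip()} [/INST]"


def system_user_assistant_prompt(system: str, user: str) -> str:
    """Prompt in the SYSTEM/USER/ASSISTANT style quoted above.

    Assumption: one newline between the parts; the exact
    whitespace may differ per model card.
    """
    return f"SYSTEM: {system}\nUSER: {user}\nASSISTANT:"


if __name__ == "__main__":
    system = "You are a helpful assistant."
    user = "What determines a model's prompt format?"
    # Same content, two different wrappers around it.
    print(llama2_chat_prompt(system, user))
    print("---")
    print(system_user_assistant_prompt(system, user))
```

Either way the model only ever sees the resulting text, which fits with the GGML file not needing to store a template: the wrapper is applied by whatever code builds the prompt.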

Anyone who understands this stuff who could point me in the right direction?

[–] [email protected] 1 points 1 year ago (1 children)

Thanks! I'm going to do some experiments and see if I get different results. I've been using TheBloke's format and it worked mostly well, but perhaps switching to meta-llama's format will eliminate the occasional bugs I've had.

[–] [email protected] 2 points 1 year ago (1 children)

That's probably the most reasonable thing you can do.

I'm not sure how much of a difference to expect between exactly the correct prompt and something roughly in that direction. I've been tinkering with instruction-tuned models (from the previous/first llama), and sometimes it doesn't seem to matter. I also sometimes used a 'wrong' prompt for days and couldn't tell. Maybe the models are 'intelligent' enough to compensate for that. I'm not sure. I usually try to get it right to get all the performance out of it.