this post was submitted on 03 Feb 2024
9 points (100.0% liked)

Free Open-Source Artificial Intelligence

I was wondering if anyone here has gotten Code Llama 70b running, or knows of any guides/tutorials on how to do so. I tried setting it up myself with a quantized version, and it loaded, but I think I must have misconfigured it, since I only got nonsensical output. One thing I definitely don't understand is the prompt templates: did they change them? Also, if this type of post isn't allowed or is off topic, please let me know; I have never posted in this sublemmy before.

top 2 comments
[–] noneabove1182 2 points 10 months ago

If you're using text-generation-webui, there's a bug: if your max new tokens is equal to your prompt truncation length, it removes all of your input and therefore generates nonsense, since no prompt is left.

Reduce your max new tokens and your prompt should actually get passed to the backend. This is more noticeable in models with only 4k context (since a lot of people default max new tokens to 4k).
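The arithmetic behind this can be sketched as follows (a minimal illustration with assumed numbers, not text-generation-webui's actual code; the variable names are hypothetical):

```python
# A model with a 4k context window, truncation length set to match.
truncation_length = 4096  # max tokens kept after prompt truncation
max_new_tokens = 4096     # tokens reserved for the model's reply

# Tokens left over for the prompt itself.
prompt_budget = truncation_length - max_new_tokens
print(prompt_budget)  # 0: the whole prompt is truncated away

# Lowering max new tokens leaves room for the prompt again.
max_new_tokens = 512
prompt_budget = truncation_length - max_new_tokens
print(prompt_budget)  # 3584 tokens of prompt reach the model
```

So with a 4k-context model and max new tokens also at 4k, the prompt budget is exactly zero, which is why the output looks like nonsense.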

[–] [email protected] 2 points 10 months ago* (last edited 10 months ago)

I’m just using Ollama with Ollama WebUI. You’ll have to use the right tag when pulling Code Llama to make sure you get the 70b variant.
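As a sketch, pulling the 70b tag explicitly might look like this (exact tag names are assumptions; check the Ollama model library for the tags that actually exist):

```shell
# Pull the 70b build explicitly; a plain "codellama" defaults to a smaller size.
ollama pull codellama:70b

# Or run an instruct-tuned variant directly (tag assumed from the Ollama library):
ollama run codellama:70b-instruct
```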