12
submitted 8 months ago by [email protected] to c/[email protected]

Could someone recommend an LLM for the Nvidia GTX 1080? I've used the gptq_model-4bit-128g quant of Luna AI from TheBloke, and I get a response every 30-60s and only 4-5 prompts before it starts to repeat or hallucinate.
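For context, the rough equivalent of my setup as a standalone script would look something like this (a sketch assuming the transformers + auto-gptq loader; the repo name, prompt format, and sampling values are placeholders, not my exact settings):

```python
# rough sketch of loading a TheBloke GPTQ model with transformers + auto-gptq/optimum
# (the repo name below is from memory -- substitute whichever GPTQ model you downloaded)
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Luna-AI-Llama2-Uncensored-GPTQ"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "USER: What can you tell me about the GTX 1080?\nASSISTANT:"  # guessed prompt format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.15,  # the usual knob people suggest for looping output
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```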

top 7 comments
[-] SkySyrup 4 points 8 months ago

try openorca-mistral-7b, it should fit on your GPU. Try using exllamav2 to speed up inference.
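Outside the webui, the exllamav2 Python API looks roughly like this (a minimal sketch; the model path is a placeholder and the class names may have shifted between versions, so check the repo's own examples):

```python
# rough sketch of running a GPTQ/EXL2 model with exllamav2 directly
# (model_dir is a placeholder -- point it at the folder you downloaded)
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "models/Mistral-7B-OpenOrca-GPTQ"  # placeholder path
config.prepare()

model = ExLlamaV2(config)
model.load()

tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9
settings.token_repetition_penalty = 1.1  # helps against looping

print(generator.generate_simple("Hello, my name is", settings, 128))
```

In the webui itself you'd just pick the exllamav2 loader when loading the model.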

[-] [email protected] 2 points 8 months ago
[-] SkySyrup 3 points 8 months ago
[-] [email protected] 2 points 8 months ago

Yes it does, and it fits the GPU just fine. It didn't hallucinate, but it was slow, like 60s+ on the first run, though it did its job. Thanks.

[-] SkySyrup 2 points 8 months ago

good to hear it worked, it’s weird it’s so slow. I’m lucky to have access to a 3060, which isn’t that far off from a 1080, and I get at least 40 t/s on it. Are you running on CPU or are you using exllama?
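If you want to sanity-check it, something like this rough helper (assuming a transformers-style model and tokenizer; everything here is just a placeholder sketch) will tell you where the weights actually live and what tokens/s you're really getting:

```python
import time

def tokens_per_second(model, tokenizer, prompt, max_new_tokens=128):
    """Rough throughput check: time one generation and count the new tokens."""
    # if this prints "cpu", the model never made it onto the GPU
    print(next(model.parameters()).device)

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    elapsed = time.perf_counter() - start

    new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
    return new_tokens / elapsed
```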

[-] [email protected] 1 points 8 months ago

It's running on GPU, the Task Manager shows 92% GPU utilization, and I chose ExLlamaV2.

[-] SkySyrup 2 points 8 months ago

that’s really weird, I’m not sure how to help you there unfortunately :(

Oobabooga Text Generation

Community for Oobabooga / Pygmalion / TavernAI / AI text generation

Let's rebuild our knowledge base here!

The Ooba community is still dark on Reddit, so we're starting from scratch: https://www.reddit.com/r/Oobabooga/

Subscribe, engage, post, comment!

Helpful links:

https://github.com/oobabooga/text-generation-webui

https://zoltanai.github.io/character-editor/

https://www.chub.ai/characters

Remember to mark NSFW posts.

Other AI communities to check out:

[email protected]
