this post was submitted on 13 Dec 2023
8 points (100.0% liked)

LocalLLaMA


Community to discuss about LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago

I have a laptop with a Ryzen 7 5700U and 16 GB RAM, running Fedora 38 Linux.
I'm looking to run a local uncensored LLM, and I'd like to know the best model and software to run it with.
I'm currently running KoboldAI with Erebus 2.7B. It's okay in terms of speed, but I'm wondering if there's anything better out there. If possible, I would prefer something that isn't web-UI based, to lower the overhead.
I'm not very well versed in all the lingo yet, so please keep it simple.
Thanks!

top 4 comments
[–] [email protected] 5 points 1 year ago* (last edited 1 year ago)

Take a look at GPT4All; it's very user-friendly.

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago)

I like KoboldCpp. It is easy to set up and runs well on limited resources.

With something like that, you should be able to fit a much larger and better model into your RAM if you use the quantized versions. Look for models in GGUF format on Huggingface. Q4_K_M is a good compromise between size and quality.
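To see why quantization lets a 13B model fit in 16 GB of RAM, here's a rough back-of-envelope sketch: a quantized model file takes about parameters × bits-per-weight / 8 bytes, plus some overhead. The bits-per-weight figures below are approximate assumptions for common GGUF quantization levels, not exact values.

```python
# Approximate effective bits per weight for common GGUF quant levels.
# These are rough assumptions; actual file sizes vary by model.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.85,
    "Q3_K_M": 3.9,
}

def approx_size_gib(params_billions: float, quant: str) -> float:
    """Approximate GGUF file size in GiB for a given parameter count."""
    bytes_total = params_billions * 1e9 * BITS_PER_WEIGHT[quant] / 8
    return bytes_total / 2**30

# A 13B model at Q4_K_M comes out around 7-8 GiB, so it fits in
# 16 GB of RAM with room left over for context and the OS.
print(f"{approx_size_gib(13, 'Q4_K_M'):.1f} GiB")
```

By the same estimate, a 7B model at Q4_K_M is only around 4 GiB, which is why the quantized 7B and 13B models below are realistic targets for a 16 GB laptop.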

Which model depends on your exact use case. I like Mythomax-L2-13b or Llama2-13B-Tiefighter for roleplay, and Mistral 7B (Dolphin 2.1 Mistral 7B) or Toppy-M for more factual things. All of those are uncensored.

[–] [email protected] 3 points 1 year ago

Hope you had some success. Don't hesitate to ask if you have further questions.

[–] [email protected] 1 points 11 months ago

As an alternative, you could look at distributed/shared inferencing. There's https://horde.koboldai.net/ (which you probably know) and petals.dev.

I haven't tested them, though.