this post was submitted on 29 Jan 2024
64 points (91.0% liked)

LocalLLaMA

2402 readers
1 users here now

Community to discuss about LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago
MODERATORS
all 16 comments
sorted by: hot top controversial new old
[–] [email protected] 21 points 11 months ago (2 children)

I don't like to sound like a broken clock, but all the llama models have restrictions on their use that mean they aren't open source.

[–] [email protected] 18 points 11 months ago (1 children)

And they don't provide the source... So it's neither open nor source. I get why and how Meta tries to make themselves look better. And I'm grateful for having access to such models. But I think words have meanings and journalists should do better than repeat that phrasing and help watering down the meaning of 'open source'. (Which technically doesn't mean free or without restrictions, but is often used synonymously.)

[–] planish 7 points 11 months ago (1 children)

Don't they provide the source for the code to actually run the model? Otherwise how are people loading it up and running it? Are they shipping executables along with model weights?

[–] [email protected] 5 points 11 months ago* (last edited 11 months ago)

What they mean by that is probably the fact that you can download the model, run it on your own hardware and adapt it. Contrary to what OpenAI does, who just offer a service and don't give access to the model itself, you can just use ChatGPT through their servers.

Most of the models come with a Github repo with code to run it and benchmarks. But it's more or less just boilerplate code to get it running in one of the well-established machine learning frameworks. Maybe a few customizations and the exact setup to get a new model architecture running. It would usually be something like Huggingface's Transformers library. There are a few other big projects which are used by people. If researchers come up with new maths, concepts and new architectures, it eventually gets implemented there.

But the code that gets released alongside new models it usually meant for scientific repeatability and not necessarily for actual use. It might contain customizations that make it difficult to incorporate it into other things, usually isn't maintained after the release and most of the times it is based on old versions of libraries, that were state of the art when they started with their research. So that's usually not what gets used by people in the end.

Interestingly enough companies all use different phrasing. Mistral AI claims to be commited to be "open & transparent" yet they like to drop torrent files to new models that come with zero explanation and code. And OpenAI still carries the word "open" in their company name, but at this point openness is more a hint of an idea from their very early days.

Anyways, inference code and the model aren't the same thing. It would be more like if we were talking about cake recipes and you provide me with the schematics of a kitchen aid.

[–] [email protected] 1 points 11 months ago (1 children)

How well do the OpenLlama models perform against Llama2? AIUI the training data uses for OpenLlama is the same?

[–] [email protected] 7 points 11 months ago* (last edited 11 months ago)

The training data for OpenLlama is called RedPajama if I'm not mistaken. And a reproduction of what Meta used to train the first LLaMA. Back then they listed the datasets in the scientific paper. Nowadays they and their competitors don't do that anymore.

OpenLlama performs about as good (slightly worse) as the first official LLaMA. And both perform worse than Llama2. It's not day and night, but i think a noticeable improvement. And Llama2 has twice the context length which is a huge improvement for some use-cases.

If you're looking for models with a different license, there are some more. Mistral is Apache 2.0 and there are several more with permissive licenses.

If you're looking for info on what datasets the big players use, forget it (my opinion). The companies are all involved in legal battles over copyright and have stopped publishing what they use. Many (except for Meta) have kept it a (trade) secret from the beginning and never shared such information. It's unscientific because it doesn't allow for repeatability. But AI is expensive and everyone is currently trying to get obscenely rich with it or strives for world domination.

But datasets are available, like the RedPajama one, several other collections for various purposes... Lots of datasets for fine-tuning and a whole community around that. Just for the base/foundation models, we don't have access to a current state of the art dataset for that.

[–] [email protected] 7 points 11 months ago (3 children)

Anyone using the code specific models, -- how are you prompting them? Are you using any integration into vim emacs or other truly open source and offline text editor/IDE; not electron or proton based? I've compiled VS code before, but it is basically useless in that form, and the binary version sends network traffic like crazy.

[–] [email protected] 7 points 11 months ago* (last edited 11 months ago) (2 children)

I've downloaded the 13B codellama from huggingface, passed it my NVIDIA 2070 via cuda, and have interfaced either through the terminal or lmstudio.

Usually my prompts include the specific code block and a wordy explanation about what I'm trying to do.

It's okay, but it's not as accurate as chatgpt, and tends to repeat itself a lot more.

For editor integration, i just opted for codeium in neovim. It's a pretty good alternative to copilot imho.

[–] [email protected] 2 points 11 months ago (1 children)

Hugging face have an llm plug-in for code completion in neovim btw!

[–] [email protected] 1 points 11 months ago* (last edited 11 months ago) (1 children)

Oh nice! Got a link for anyone that comes across this? Save me and others a search plz?

EDIT: NM. Got it. Gonna give it a try later.

LLM powered development for Neovim

[–] [email protected] 2 points 11 months ago (1 children)

If you use ollama you can try to use the fork that I am using. This is my config to make it work: https://github.com/Amzd/nvim.config/blob/main/lua/plugins/llm.lua

[–] [email protected] 0 points 11 months ago

Nice. Thanks. I'll save this post in case I use ollama in the future. Right now I use a codellama model and a mythomax model, but am not running them via a localhost server, just outputted in the terminal or LMStudio.

This looks interesting though. Thanks!

[–] [email protected] 2 points 11 months ago (1 children)

Why use it though if it's not as good, and repeats itself.

[–] [email protected] 9 points 11 months ago* (last edited 11 months ago)

Because it doesn't call out to the internet. I even put lmstudio behind firejail to prevent it from doing so. Thusly any code I feed it (albeit pretty trivial code) doesn't add to chatgpt's overarching data set.

It still can produce usable results. It's just not as consistent. Whenever it gets into a repetitive loop, I just restart it, resetting the initial context, which generally prevents it from repeating itself, at least initially. To be fair, I've also experienced this with chatgpt, just not as often.

TLDR; It's more private and still useful.

[–] [email protected] 2 points 11 months ago

You might like VSCodium with continue or privy