LocalLLaMA

Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.

Rules:

Rule 1 - No harassment or personal character attacks of community members, i.e. no name-calling, no generalizing entire groups of people that make up our community, no baseless personal insults.

Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency, i.e. no comparing the usefulness of models to that of NFTs, no claiming the resource usage required to train a model is anything close to that of maintaining a blockchain or mining for crypto, no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.

Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms, i.e. statements such as "LLMs are basically just simple text prediction like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>".

Rule 4 - No implying that models are devoid of purpose or potential for enriching people's lives.

Meta releases ‘Code Llama 70B’, an open-source behemoth to rival private AI development (venturebeat.com)

submitted 1 year ago by [email protected] to c/localllama

[–] [email protected] 18 points 1 year ago (1 children)

And they don't provide the source... So it's neither open nor source. I get why and how Meta tries to make themselves look better. And I'm grateful for having access to such models. But I think words have meanings, and journalists should do better than repeat that phrasing and help water down the meaning of 'open source'. (Which technically doesn't mean free or without restrictions, but is often used synonymously.)

[–] planish 7 points 1 year ago (1 children)

Don't they provide the source for the code to actually run the model? Otherwise how are people loading it up and running it? Are they shipping executables along with model weights?

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago)

What they mean by that is probably the fact that you can download the model, run it on your own hardware and adapt it. That is in contrast to OpenAI, who just offer a service and don't give access to the model itself: you can only use ChatGPT through their servers.

Most of the models come with a GitHub repo with code to run it and benchmarks. But it's more or less just boilerplate code to get it running in one of the well-established machine learning frameworks, maybe with a few customizations and the exact setup needed to get a new model architecture running. It would usually be something like Hugging Face's Transformers library. There are a few other big projects which are used by people. If researchers come up with new maths, concepts and architectures, these eventually get implemented there.

But the code that gets released alongside new models is usually meant for scientific repeatability and not necessarily for actual use. It might contain customizations that make it difficult to incorporate into other things, it usually isn't maintained after the release, and most of the time it is based on old versions of libraries that were state of the art when the researchers started their work. So that's usually not what gets used by people in the end.

Interestingly enough, companies all use different phrasing. Mistral AI claims to be committed to being "open & transparent", yet they like to drop torrent files for new models that come with no explanation or code. And OpenAI still carries the word "open" in their company name, but at this point openness is more a hint of an idea from their very early days.

Anyways, inference code and the model aren't the same thing. It would be like if we were talking about cake recipes and you provided me with the schematics of a KitchenAid.
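To make that distinction a bit more concrete, here's a toy sketch (not anything Meta ships, and nowhere near a real LLM). The `generate` function and both "weights" dictionaries are made up purely for illustration. The point is just that the same few lines of inference code behave completely differently depending on which weights you feed them, so publishing the runner tells you very little about how the model came to be.

```python
# Toy illustration only: a tiny bigram "language model".
# The inference code is generic; the behaviour lives in the weights.

def generate(weights, start, steps):
    """Greedy decoding: repeatedly pick the highest-scoring next token."""
    tokens = [start]
    for _ in range(steps):
        scores = weights.get(tokens[-1])
        if not scores:
            break  # no known continuation for the last token
        tokens.append(max(scores, key=scores.get))
    return tokens

# Two hypothetical sets of "weights" (made-up numbers, not from any real model).
weights_a = {"open": {"source": 0.9, "weights": 0.1}, "source": {"code": 1.0}}
weights_b = {"open": {"source": 0.2, "weights": 0.8}, "weights": {"only": 1.0}}

print(generate(weights_a, "open", 2))  # ['open', 'source', 'code']
print(generate(weights_b, "open", 2))  # ['open', 'weights', 'only']
```

In this analogy, `generate` is the kitchen appliance and the dictionaries are the finished cake. The recipe would be the training data and training code that produced those numbers, and that part is what usually doesn't get released.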