LocalLLaMA

Community for discussing LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago

Meta's Llama 3 will force OpenAI and other AI giants to up their game (www.itpro.com)

submitted 10 months ago by [email protected] to c/localllama

[–] [email protected] 7 points 10 months ago (4 children)

I think we need to reevaluate what it means for a model to be FOSS. There isn't a good answer, and it would be nice if some free organization would release guidelines on AI.

[–] [email protected] 5 points 10 months ago

Reading the license, there are three things.

There must be attribution. Finetunes, merges, etc need to have “Llama 3” at the beginning of the model name. This is probably consistent with FOSS.

Your use of Llama has to "adhere to the Acceptable Use Policy for the Llama Materials". AFAIK, it's an open question whether ethical licenses can be considered FOSS.

Finally, you must not use it if you had more than 700 million active users in March 2024 (the calendar month before the release). I'm not sure about the legal definition of "active user". I doubt that covers very many companies, though. In practice, it's probably less of a restriction than copyleft, but still, strictly speaking, that's not FOSS.

[–] [email protected] 3 points 10 months ago* (last edited 10 months ago) (1 children)

I think the OSI started about last June to work on that:

https://opensource.org/deepdive

[–] [email protected] 0 points 10 months ago (1 children)

I don't have much faith in "open source", and the Open Source Initiative is just a money and labor extraction machine.

[–] [email protected] 3 points 10 months ago

I just linked them because the OSI was the entity that initially coined the term.

[–] [email protected] 2 points 10 months ago (1 children)

I would think access to the training data, or at least no restrictions on what you can do with the model, would be a good definition.

[–] [email protected] 3 points 10 months ago (1 children)

"access to the training data"

That's just not realistic. There are too many legal problems with that.

Besides, Llama 3 was trained on 15 trillion tokens. Whatcha gonna do with something like that?

[–] [email protected] 1 points 10 months ago* (last edited 10 months ago)

Hmm. Sure, the legal issues are why it is the way it is. That doesn't necessarily mean it should be that way... But it's more complicated than that.

With the dataset, I'm sure people could figure out something to do with it. There are community-curated datasets, previous attempts to recreate models like RedPajama... Sure, this is a lot more, but other people are making progress, too. And if not that, we could at least have a look at it, do some research, statistics... Maybe use parts of it for something else. That's the spirit of the free software movement.

I'm a bit split on the topic. FOSS doesn't translate directly to ML models. Not being able to recreate something isn't how it's supposed to be. But it's not software either, and it works differently. Releasing datasets would give us some progress and give the tools to people other than just the big tech companies who are free to violate copyright law. But we're still missing the millions to afford the compute to train a model anyways.

[–] [email protected] 2 points 10 months ago* (last edited 10 months ago) (1 children)

No, we don't. Someone misusing the term doesn't change what it is. It stands for FREE AND OPEN SOURCE SOFTWARE. If it isn't all three of those things, then it isn't FOSS.

[–] [email protected] 3 points 10 months ago

For a model to be free we need the license to permit what exactly? We need to define the problem.