this post was submitted on 22 Nov 2024
705 points (98.2% liked)

Comic Strips

Comic Strips is a community for those who love comic stories.

founded 1 year ago
submitted 2 days ago* (last edited 2 days ago) by Joker to c/[email protected]
[–] [email protected] 5 points 1 day ago (2 children)

I don't know much about AI models, but that's still more than other vendors are giving away, right? Especially "Open"AI. A lot of people just care if they can use the model for free.

How useful would the training data be? Training of the largest Llama model was done on a cluster of over 100,000 Nvidia H100s so I'm not sure how many people would want to repeat that.

[–] [email protected] 7 points 1 day ago

scientific institutions and governments could rent enough GPUs to train their own models, with potentially public funding and public accountability, and also it’d be nice to know if the data llama was trained with was literally just facebook user data. i’m not really in the camp of "if user content is on my site then the content belongs to me".

[–] [email protected] 2 points 1 day ago

Without the same training data, you wouldn't be able to recreate the results even if you had the computing power. Thus it's not fully open source. Training data is part of the source needed to create the result, the LLM. It's like having to add your own lines of code to an open source program to make it work because the company doesn't provide them.