this post was submitted on 29 Jan 2025
18 points (95.0% liked)

Investing

A community for discussing investing news.

[–] lurch 3 points 1 week ago (2 children)

it's not open source tho.

and while nvidia stock price tanked a bit, it also had smaller bumps back up again. could even out over the next weeks.

[–] [email protected] 4 points 1 week ago (2 children)

Idk why people keep saying this - they published their methodology and the code that runs the model with the weights. The only things they didn't publish with it are likely copyrighted works that can't be freely shared. It's 'open-sourced' in all the ways that matter.

And nvidia bounced after the US signaled intent to block or investigate deepseek, not necessarily because the model isn't a threat

[–] [email protected] 2 points 1 week ago (1 children)

unless i can compile it myself, it's not foss

[–] [email protected] 2 points 1 week ago (1 children)

They aren't going to break the law for you. If you want to train your own LLM you'll have to source your own copyrighted dataset for the task.

Jellyfin doesn't come with a media library, you have to 'rip' your own DVDs and home videos. Same deal.

[–] [email protected] 0 points 1 week ago* (last edited 1 week ago) (1 children)

that's precisely my point. if you have to break the law to be able to compile it yourself, it's not foss.

even if regular joes like you or me had the means to mass collect the data they did.

[–] [email protected] 1 points 1 week ago

This might be controversial, but you and I both have the means to mass collect data, or find illicit datasets already collected. The kind of data collection that we don't have access to (the kind that's taken from your phone without your consent) isn't really helpful for training LLMs. But, again, if you have the means to replicate their methodology to begin with then you likely already have all of the material. You're not going to recreate their model on consumer hardware anyway.

They're just not advertising where that data is (and neither should anyone here)

> if you have to break the law to be able to compile it yourself, its not foss.

Not if you consider apps like jellyfin or plex to be FOSS, but even that comparison is apples and oranges because training a model that big isn't something you can do on your own hardware. Just because they haven't given you the data to alter the model doesn't mean they haven't given you everything you need to use it with your own data and your own hardware. I get that people inherently distrust AI companies (and Chinese companies especially, but I won't get into that here), but I think it's misplaced here.

[–] lurch 1 points 1 week ago (1 children)

a list with references to the training data plus what they added would be the bare minimum to call it open source, in my opinion, but a lot of people see this more strictly than i do.

[–] [email protected] 2 points 1 week ago (1 children)

None of the flagship models publish their training data because they're all trained on less-than-legal datasets.

It's a little like complaining that jellyfin doesn't publish any media with their code - not only is that not legal but it's implied that you're responsible for obtaining your own.

If you're someone who can and does compile and re-train your own 64B parameter LLMs, you almost certainly have your own dataset for that purpose (in fact huggingface has many).

[–] lurch 1 points 1 week ago

still doesn't make it magically open source.

debian would probably split the package into a non-free part and an open source part, for this reason.

[–] [email protected] 2 points 1 week ago (1 children)

So I guess you're investing heavily in Nvidia.

[–] lurch 1 points 1 week ago

i hold my two shares of course. i have a very diversified portfolio. i consider stocks gambling and i won't bet everything on one horse. where's the fun in that?