this post was submitted on 28 Jan 2025

So, I was reading the privacy notice and the terms of use, and I found some sketchy stuff in there (data used for advertising, keystroke collection). How bad is it? Is it like ChatGPT or worse? Is there anything I can do about it?

[–] [email protected] 1 points 4 days ago (1 children)
[–] [email protected] 0 points 4 days ago (1 children)

It looks like that has 4GB of VRAM, so depending on the rest of your system you might actually be able to run some quantised models! You'll need software that supports "offloading", i.e. keeping the parts of the model that don't fit in your GPU's VRAM in system RAM instead, like LM Studio (https://lmstudio.ai/).

I recommend checking out Unsloth's models; they specifically try to fine-tune for use on older/slower hardware like yours. Their HuggingFace is here: https://huggingface.co/unsloth

This is the version of Deepseek you'll wanna try: https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF

Click one of the options on the right hand side to download the model file:

Very basically speaking, the lower the "bits", the smaller the file (and the dumber the model), and therefore the less VRAM and system RAM you'll need to run it. If you get one of the 2-bit versions, you might be able to fit the whole thing inside your GPU - the 2-bit models are only ~3.2GB! You can probably run 4-bit though, even on your hardware.
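As a rough back-of-envelope sketch of the "bits" point above: file size is approximately parameter count times bits per weight, divided by 8. Everything numeric below is an assumption for illustration (about 8 billion weights, 32 equally sized layers, and a 4GB card with some VRAM held back for context and the desktop). Real "2-bit" GGUF files keep some tensors at higher precision, which is why the actual download is closer to ~3.2GB than the naive 2GB this prints.

```python
def approx_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough model size in GB: parameters * bits per weight / 8 bits per byte."""
    return n_params * bits_per_weight / 8 / 1e9


def layers_that_fit(model_gb: float, n_layers: int, vram_budget_gb: float) -> int:
    """Estimate how many (assumed equally sized) layers fit in the VRAM budget."""
    per_layer_gb = model_gb / n_layers
    return min(n_layers, int(vram_budget_gb // per_layer_gb))


N_PARAMS = 8e9      # assumption: ~8 billion weights
N_LAYERS = 32       # assumption: layer count of a Llama-8B-class model
VRAM_BUDGET = 3.5   # assumption: 4GB card minus headroom for context/desktop

for bits in (2, 4, 8):
    size = approx_size_gb(N_PARAMS, bits)
    on_gpu = layers_that_fit(size, N_LAYERS, VRAM_BUDGET)
    print(f"{bits}-bit: ~{size:.1f} GB, ~{on_gpu}/{N_LAYERS} layers on GPU")
```

Whatever doesn't fit on the GPU is what gets offloaded to system RAM, so the fewer bits you pick, the more of the model runs at GPU speed.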

[–] [email protected] 2 points 4 days ago (1 children)

Wow, that's a thorough explanation. Thanks! I also have 16 gigs of ram and an i7 6th gen

[–] [email protected] 1 points 4 days ago* (last edited 4 days ago) (1 children)

No problem - and, that's not thorough, that's the cut down version haha!

Yeah, that hardware's a little old so the token generation might be slow-ish (your RAM speed will make a big difference, so make sure you have the fastest RAM the system will support), but you should be able to run smaller models without issue 😊 Glad to help, I hope you manage to get something up and running!
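To put a rough number on the RAM speed point: for the part of the model that runs from system RAM, each new token has to read more or less all of those weights from memory, so a common rule of thumb is that tokens per second can't exceed memory bandwidth divided by model size. A minimal sketch, where the DDR4 speeds, the dual-channel setup and the ~4.9GB model size are illustrative assumptions, not measurements from your machine:

```python
def peak_bandwidth_gbs(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s: transfers/s * bytes per transfer * channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def max_tokens_per_s(bandwidth_gbs: float, model_gb: float) -> float:
    """Upper bound on tokens/s if every token reads all weights from RAM once."""
    return bandwidth_gbs / model_gb


MODEL_GB = 4.9  # assumption: roughly the size of a 4-bit 8B GGUF file

for speed in (2133, 2400, 3200):  # assumption: example DDR4 transfer rates in MT/s
    bw = peak_bandwidth_gbs(speed)
    tps = max_tokens_per_s(bw, MODEL_GB)
    print(f"DDR4-{speed}: ~{bw:.1f} GB/s peak, at most ~{tps:.1f} tokens/s")
```

Real-world speeds will land below these ceilings, but the ratio between rows is the useful bit: faster RAM raises the limit roughly in proportion.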

[–] [email protected] 1 points 2 days ago (1 children)

Unfortunately I'm getting this error :(

[–] [email protected] 1 points 2 days ago (1 children)

From that thread, switching runtimes in LM Studio might help. On Windows the shortcut is apparently Ctrl+Shift+R. There are three main kinds: Vulkan, CUDA, and CPU. Vulkan is a cross-vendor GPU API (it's the usual choice on AMD, but it runs on nVidia and Intel too); CUDA is an nVidia-only thing; and CPU is a backup to use when the other two aren't working, for it is sssslllllooooooowwwwwww.

In the thread one of the posters said they got it running on CUDA, and I imagine that would work well for you since it's an nVidia chip; or, if it's already using CUDA try llama.cpp or Vulkan.

[–] [email protected] 1 points 2 days ago (1 children)

Yeah but for some reason it raises an error :(

[–] [email protected] 1 points 2 days ago (1 children)

Which runtimes did you try, specifically?

[–] [email protected] 1 points 2 days ago (1 children)

CUDA gives the error I told you about before; Vulkan works once and then it also stops working. I didn't try the CPU runtime 'cause I thought it would be so slow that there'd be no point

[–] [email protected] 1 points 2 days ago

Okay no worries, I'd at least try llama.cpp just to see how fast it is and to verify it works. If it doesn't work, or only works once and then quits, maybe the problem is LM Studio. In that case you might want to try GPT4All (https://www.nomic.ai/gpt4all); this is the one I started with way back in the day.

If you want to post the logs from LM Studio after it crashes, I'm happy to take a look and see if I can spot the issue as well 🙌