this post was submitted on 13 Dec 2024

LocalLLaMA


Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

Howdy!

(moved this comment from the noob question thread because it got no replies)

I'm not a total noob when it comes to general compute and AI. I've been using online models for some time, but I've never tried to run one locally.

I'm thinking about buying a new computer for gaming and for running/testing/developing LLMs (not training, only inference and in-context learning). My understanding is that ROCm is becoming decent (and I also hate Nvidia), so I'm thinking a Radeon RX 7900 XTX might be a good start. If I buy the right motherboard I should be able to add a second XTX later, provided I go with watercooling.

So first, what do you think about this? Are the 24 gigs of VRAM worth the extra bucks? Or should I just go for a mid-range GPU like the Arc B580?
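
For a rough sense of whether the 24 GB matter, here's a back-of-the-envelope estimate of weights-only VRAM use (the model sizes and the ~4.5 bits per parameter for 4-bit GGUF quants are assumptions, and KV cache plus runtime overhead come on top):

```python
# Back-of-the-envelope: VRAM for model weights = params * bits_per_param / 8.
# Ignores KV cache and runtime overhead, so real usage is somewhat higher.
def weights_gib(params_billion: float, bits_per_param: float) -> float:
    return params_billion * 1e9 * bits_per_param / 8 / 1024**3

for name, params, bits in [
    ("8B  @ FP16", 8, 16),
    ("8B  @ ~Q4 ", 8, 4.5),    # ~4.5 bits/param is typical for 4-bit GGUF quants
    ("32B @ ~Q4 ", 32, 4.5),
    ("70B @ ~Q4 ", 70, 4.5),
]:
    print(f"{name}: ~{weights_gib(params, bits):.1f} GiB")
```

By that estimate a 24 GB card fits ~30B-class models at 4-bit with room left for context, while 70B-class models need a second card, heavier quantization, or CPU offload; a 12 GB card like the B580 tops out somewhere around the 8B-14B range at 4-bit.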

I'm also curious about experimenting with a GPU-less setup, i.e. CPU plus lots of RAM. What kind of models do you think I'd be able to run with decent performance on something like a Ryzen 7 9800X3D with 128/256 GB of DDR5? How would that compare to the Radeon RX 7900 XTX? And is it possible to use both the CPU and the GPU when running inference with a single model, or is it either/or?
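
On that last point, it doesn't have to be either/or: llama.cpp (and wrappers like ollama or llama-cpp-python) can split a model between VRAM and system RAM by offloading only some of the layers to the GPU. A minimal sketch using llama-cpp-python; the model path and the layer count are placeholders:

```python
from llama_cpp import Llama

# n_gpu_layers controls the CPU/GPU split:
#   0 = pure CPU, -1 = put every layer on the GPU,
#   anything in between leaves the remaining layers in system RAM.
llm = Llama(
    model_path="models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=20,   # offload 20 layers to the GPU, run the rest on the CPU
    n_ctx=4096,
)

out = llm("Explain CPU/GPU offloading in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

The CPU-only case is just n_gpu_layers=0; token generation is then mostly limited by memory bandwidth rather than core count, which is why a pure DDR5 setup tends to be much slower than a 24 GB GPU for anything that fits in VRAM.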

Also, wouldn't it be better if noobs posted questions in the main thread? They'd probably reach more people that way. It's not like there's a ton of activity here.

[–] [email protected] 2 points 4 days ago (1 children)

i have an XTX. it has a TDP of 400 watts, so if you install two of them you've basically built a medium-power space heater. you'll need shitloads of cooling and a pretty beefy power supply.

performance-wise it's pretty good. over 100 tokens a second with llama3 and it runs SDXL-Turbo about as fast as i can type.

word of warning, if you run Linux you need to manually set the fan curves. i had to RMA my first XTX because it didn't spin the fans up and cooked itself. the VRAM reached 115°C and started failing.
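
For anyone who hits the same problem: on Linux the amdgpu driver exposes the fan and temperature sensors through hwmon in sysfs, so you can watch the VRAM temperature and force a fan speed yourself. A rough sketch, run as root; the card index, hwmon path, and temperature thresholds below are assumptions to adjust for your machine:

```python
import glob
import time

# Locate the amdgpu hwmon directory (path differs per system; card0 is an assumption).
hwmon = glob.glob("/sys/class/drm/card0/device/hwmon/hwmon*")[0]

def temp_c(name: str) -> float:
    with open(f"{hwmon}/{name}") as f:
        return int(f.read()) / 1000  # sysfs reports millidegrees Celsius

# pwm1_enable: 1 = manual fan control, 2 = automatic. Write "2" back when done,
# otherwise the card keeps whatever PWM value you last set.
with open(f"{hwmon}/pwm1_enable", "w") as f:
    f.write("1")

while True:
    edge = temp_c("temp1_input")  # edge temperature
    mem = temp_c("temp3_input")   # memory (VRAM) temperature
    pwm = 255 if mem > 90 else 180 if mem > 75 else 110  # crude three-step curve
    with open(f"{hwmon}/pwm1", "w") as f:                # PWM range is 0-255
        f.write(str(pwm))
    print(f"edge {edge:.0f}°C  mem {mem:.0f}°C  -> pwm {pwm}")
    time.sleep(5)
```

In practice a tool like CoreCtrl or LACT is a safer way to keep a persistent fan curve than a hand-rolled loop, but the sysfs files are handy for checking that the fans actually respond.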

[–] atzanteol 2 points 4 days ago (1 children)

That's crazy - does it not have any thermal protection? I've had CPUs overheat and they tend to throttle or shut down before anything gets damaged.

[–] [email protected] 2 points 4 days ago

no, if the fans don't work it just runs full speed until it dies. it's limited to 300W from the factory, but that's about it. it pulls insane amounts of power.
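
For reference, the board power limit can be read (and lowered, as root) through the same amdgpu hwmon interface; values are in microwatts. A quick sketch, with the same caveat that the hwmon path is machine-specific:

```python
import glob

hwmon = glob.glob("/sys/class/drm/card0/device/hwmon/hwmon*")[0]  # path is an assumption

def watts(name: str) -> float:
    with open(f"{hwmon}/{name}") as f:
        return int(f.read()) / 1_000_000  # sysfs values are in microwatts

print("current power cap:", watts("power1_cap"), "W")
print("maximum power cap:", watts("power1_cap_max"), "W")
# Writing a lower value (in microwatts) to power1_cap, as root, caps the draw,
# which is one way to stop a dual-GPU box from acting like a space heater.
```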

from what I understand the 4090 is worse but that's also much larger so it probably handles the heat better.