this post was submitted on 23 Aug 2024
29 points (85.4% liked)

Linux

48034 readers
1583 users here now

From Wikipedia, the free encyclopedia

Linux is a family of open source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991 by Linus Torvalds. Linux is typically packaged in a Linux distribution (or distro for short).

Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the GNU Project. Many Linux distributions use the word "Linux" in their name, but the Free Software Foundation uses the name GNU/Linux to emphasize the importance of GNU software, causing some controversy.

Rules

Related Communities

Community icon by Alpár-Etele Méder, licensed under CC BY 3.0

founded 5 years ago
MODERATORS
 

Should I struggle through constant crashes to get my 7900gre with 16gb of vram working, possibly through the headache of ONNX? Can anyone report their own success or offer advice? AMD on linux is generally lovely, SD with AMD on linux, not so much. It was much better with my RTX2080 on linux but gaming was horrible with NVIDIA drivers. I feel I could do more with the 16GB AMD card if stability wasn't so bad. I currently have both cards running to the horror of my PSU. A1111 does NOT want to see the NVIDIA card, only the AMD. Something about the version of pytorch? More work to be done there.

  • Having a much better time back on Cinnamon default instead of Wayland. Oops!

** It heard me. Crashed again on an x/y plot but due to being away from Wayland I was able to see the terminal dump: amdgpu thermal overload! shutdown initiated! That'll do it! Finally something easy to fix. Wonder why thermal throttling isn't kicking in to control runaway? Will stress it once more and clock the temps this time.

Temps were exceeding 115C, phew! No idea why the default amdgpu driver has no fan control but they're ripping like they should now. Monitoring temps has restored system stability. Using multiple amd/nvidia dedicated venv folders and careful driver choice/installation were the keys to multigpu success.

you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 1 points 2 months ago (1 children)

If you don't understand how models use host caching, start there.

Outside of that, you're asking people to simplify everything into a quick answer, and there is none.

ONNX is the "universal" standard, ensure you didn't accidentally convert the input model into something else by accident, but more importantly, ensure when you run it and automatically convert, that the works are actually done on the GPU. ONNX defaults to CPU.

[–] [email protected] -1 points 2 months ago (1 children)

I started reading into the ONNX business here https://rocm.blogs.amd.com/artificial-intelligence/stable-diffusion-onnx-runtime/README.html Didn't take long to see that was beyond me. Has anyone distilled an easy to use model converter/conversion process? One I saw required a HF token for the process, yeesh

[–] [email protected] 2 points 2 months ago (1 children)

No.

As I said, you're trying to distill people's profession into an easy to digest guide about "make it work". Nothin like that exists.

Same way you can't just get a job doing "doctor stuff", or "build junk".

[–] [email protected] 2 points 2 months ago

Since only one of us is feeling helpful, here is a 6 minute video for the rest of us to enjoy https://www.youtube.com/watch?v=lRBsmnBE9ZA