this post was submitted on 11 Jul 2023
16 points (100.0% liked)
LocalLLaMA
Community to discuss about LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
you are viewing a single comment's thread
Note this is koboldcpp.exe and not KoboldAI.
The GitHub page describes arguments for GPU acceleration, but it is fuzzy on what the arguments do and never explains what values they expect. I understand the --gpulayers arg, but the two ints after --useclblast are lost on me. I defaulted to "[path]\koboldcpp.exe --useclblast 0 0 --gpulayers 40", but it seems to be ignoring GPU acceleration entirely, and I'm clueless where the problem lies. I figured it would be easier to ask for a guide and start my GGML setup from scratch.
Those are the OpenCL platform and device identifiers; you can use clinfo to find out which numbers correspond to which platform and device on your system.
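A quick sketch of how that maps to the command line (the platform/device indices and output shape below are illustrative only; your clinfo listing will differ):

```shell
# List OpenCL platforms and devices with their indices (requires clinfo).
clinfo -l

# Illustrative output shape (yours will differ):
#   Platform #0: AMD Accelerated Parallel Processing
#    `-- Device #0: gfx1030
#   Platform #1: Intel(R) OpenCL
#    `-- Device #0: Intel(R) Core(TM) i7 CPU

# Then pass the platform index and device index of your GPU,
# e.g. platform 0, device 0:
koboldcpp.exe --useclblast 0 0 --gpulayers 40
```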
Also note that if you're building kobold.cpp yourself, you need to build with LLAMA_CLBLAST=1 for OpenCL support to exist in the first place. Or LLAMA_CUBLAS for CUDA.
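For reference, a build along those lines might look like this (a sketch; the exact make targets depend on the repo version you check out):

```shell
git clone https://github.com/LostRuins/koboldcpp
cd koboldcpp

# Build with the OpenCL (CLBlast) backend:
make LLAMA_CLBLAST=1

# Or, for NVIDIA cards, build with the CUDA backend instead:
make LLAMA_CUBLAS=1
```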