this post was submitted on 14 Jul 2023
7 points (100.0% liked)

LocalLLaMA

Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

Apologies for the basic question, but what's the difference between GGML and GPTQ? Do these just refer to different compression methods? Which would you choose if you're using a 3090 Ti GPU?

[–] markon@lemmy.world 1 point 1 year ago

Also, llama.cpp gives very fast performance with GGML models compared to running them through Hugging Face Transformers, and it's sometimes faster than ExLlama too.
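
For example, with the llama-cpp-python bindings you can load a GGML file and offload layers onto the 3090 Ti. A minimal sketch (the model path, layer count, and prompt are placeholders, not specifics from this thread):

```python
# pip install llama-cpp-python (built with CUDA support for GPU offload)
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-13b.ggmlv3.q4_K_M.bin",  # placeholder: any local GGML file
    n_gpu_layers=40,  # layers offloaded to the GPU; 0 = CPU only, tune for your VRAM
    n_ctx=2048,       # context window size
)

# Run a simple completion and print the generated text.
output = llm("Q: What is quantization? A:", max_tokens=64, stop=["Q:"])
print(output["choices"][0]["text"])
```

With 24 GB of VRAM on a 3090 Ti you can usually offload all layers of a 13B quantized model, which is where llama.cpp's speed advantage shows up.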