I am just learning in this space and I could be wrong about this one, but... The GGML and GPTQ models are nice for getting started with AI in Oobabooga. The range of available models is kinda odd to navigate, and it is hard to understand in context how they compare across all the different quantization types, settings, and features. I still don't understand a lot of it. One of the main things I didn't (and still don't fully) understand is how some models do not have a quantization stated like GGML/GPTQ, but still work using Transformers. I tried some of these by chance at first, and then avoided them because they take longer to initially load.
Yesterday I created my first LoRAs and learned through trial and error that the only models I can train a LoRA on are the ones that use Transformers and can be set to 8-bit mode. Even using GGML/GPTQ models with 8-bit quantization, I could not use them to make a LoRA. It could be my software setup, but I think there is either a fundamental aspect of these models I haven't learned yet, or it is a limitation in Oobabooga's implementation. Either way, the key takeaway is to try making a LoRA with a Transformers-based model loaded in Oobabooga, and be sure the "load in 8 bit" box is checked.
I didn't know what to expect with this, and haven't come across many examples, so I put off trying it until now. I have a 12th gen i7 with 20 logical cores and a 16GB 3080Ti in a laptop. I can convert an entire novel into a text file and load it in the raw text tab for training in Oobabooga. If my machine has some assistance with cooling, I can create the LoRA in 40 minutes using the default settings and a 7B model. This has a mild effect. IIRC the default LoRA rank is 32. If this is turned up to 96-128, it will have a more noticeable effect on personality. It still won't substantially improve the Q&A accuracy, but it may improve the quality to some extent.
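To give a rough sense of what turning that setting up does, here is a small back-of-the-envelope sketch. It assumes the setting in question is the LoRA rank, that the 7B model is LLaMA-style (hidden size 4096, 32 layers), and that adapters are attached to two attention projection matrices per layer. Those shapes and the choice of two matrices are assumptions for illustration, not something confirmed from Oobabooga's settings.

```python
def lora_param_count(rank, d_in, d_out, n_layers, matrices_per_layer):
    """Extra trainable parameters added by LoRA adapters.

    Each adapted weight matrix of shape (d_out, d_in) gets two small
    matrices: A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    per_matrix = rank * (d_in + d_out)
    return per_matrix * matrices_per_layer * n_layers


# Assumed shapes for a LLaMA-style 7B model: hidden size 4096, 32 layers,
# and adapters on two attention projections per layer (an assumption).
for rank in (32, 96, 128):
    n = lora_param_count(rank, 4096, 4096, 32, 2)
    print(f"rank {rank:>3}: {n:,} trainable parameters")
```

Under these assumptions the adapter grows linearly with rank, so going from 32 to 128 means roughly four times as many trainable parameters, which fits with the higher setting having a more noticeable effect (and needing more memory and time).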
I first tested with a relatively small Wikipedia article on Leto II (Dune character) that I manually formatted for this purpose. This didn't change anything substantially. Then I tried the entire God Emperor of Dune e-book as raw text. This had garbage results, probably due to all the nonsense before the book even starts and the terrible text formatting extracted from an eBook. The last dataset I tried was the book text only, with everything reflowed using a Linux bash script I wrote to alter newline characters, fix spacing, and remove page gaps. Then I manually edited with find and replace to remove special characters and any formatting oddballs I could find. This was the first LoRA I made where the 7B model's tendency to hallucinate seemed more evident than issues with my LoRA. For instance, picking the name of a random, irrelevant character that occurs 3 times in 2 sentences of the training text and prompting about it results in random unrelated output. The overall character identity is also weak despite a strong character profile and a 1.8MB text file for the LoRA.
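For anyone wanting to try the same kind of cleanup, here is a rough Python sketch of the steps described above. It is not the bash script mentioned in the post. The function name `clean_book_text` and the specific rules (what counts as a page-number line, which characters get replaced or dropped) are assumptions for illustration and will need adjusting for a particular e-book.

```python
import re

# Map common typographic characters to plain ASCII before stripping the rest.
ASCII_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
})


def clean_book_text(raw: str) -> str:
    """Reflow hard-wrapped e-book text into one line per paragraph."""
    # Normalize line endings and turn form feeds (page breaks) into newlines.
    text = raw.replace("\r\n", "\n").replace("\r", "\n").replace("\f", "\n")
    text = text.translate(ASCII_MAP)
    # Blank out lines that contain nothing but a page number.
    text = re.sub(r"^[ \t]*\d+[ \t]*$", "", text, flags=re.MULTILINE)
    # One or more blank lines mark a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for p in paragraphs:
        p = re.sub(r"\s+", " ", p)           # join wrapped lines, collapse spacing
        p = re.sub(r"[^\x20-\x7E]", "", p)   # drop remaining special characters
        p = re.sub(r" {2,}", " ", p).strip()
        if p:
            cleaned.append(p)
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "Chapter One\r\n\r\nThe quick brown\nfox jumps over\n\n\n12\n\n\fthe lazy   dog.\n"
    print(clean_book_text(sample))
```

One limitation worth knowing: a paragraph that was split across a page break still comes out as two paragraphs, since the script cannot tell a page gap from a real paragraph break. That part still needs a manual pass, along with trimming the front matter before the book actually starts.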
This is just the perspective from a beginner's first attempt. Actually tuning this with a bit of experience will produce far better results. I'm just trying to say, if you're new to this and just poking around, try making a LoRA. It is quite easy to do.