I run my Nvidia stuff in containers so I don't have to deal with all the stupid driver shenanigans.
The 3060 is a nice cheap card for running okay-sized models, but if you can find a way to stretch for a 3090 or a 7900 XTX, you'll be able to run these 33B models at decent quant levels.
First few quants are up: https://huggingface.co/bartowski/WizardCoder-33B-V1.1-exl2
The 4.25 bpw quant should fit nicely into 24 GB (3090, 4090).
Smaller sizes are still being created: 3.5, 3.0, and 2.4 bpw.
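For anyone wondering why 4.25 lands comfortably under 24 GB, here's a rough back-of-the-envelope sketch. It assumes a nominal 33B parameter count and treats the bpw figure as a flat average over every weight, which is a simplification (real files will differ a bit). The helper name `weight_gib` is made up for this example, and whatever is left over has to cover the context cache and activations.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB.

    Assumption: every parameter is stored at the same average bit width.
    """
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 2**30  # bits -> bytes -> GiB


PARAMS_B = 33    # nominal parameter count in billions (assumption)
VRAM_GIB = 24    # 3090 / 4090

for bpw in (4.25, 3.5, 3.0, 2.4):
    size = weight_gib(PARAMS_B, bpw)
    headroom = VRAM_GIB - size  # left over for cache, activations, overhead
    print(f"{bpw:>4} bpw: ~{size:.1f} GiB weights, ~{headroom:.1f} GiB headroom")
```

The smaller quants trade quality for headroom, so they're mostly interesting if you want longer context or have less than 24 GB to work with.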
I live in Ontario, where it gets down to -30 C in the harshest conditions.
We have a heat pump and a furnace, and they alternate based on efficiency.
Somewhere between -5 and +5 C it switches from the heat pump to the furnace.
I think you could get by with the heat pump a bit colder than that, but it really loses out on efficiency vs burning gas unless you invest in a geothermal heat pump.
Seems relatively uncensored, willing to answer most questions.
It's definitely a little odd. I'm glad they did any kind of official release for 0.2, but yeah, information is sorely lacking and it would be nice to have more, especially with how revolutionary the previous one was. Is this incremental? Is it a huge change? Is it just more fine-tuning? Did they start from scratch? We'll never know 🤷‍♂️
The only concern I had was, my god, it's a lot of faith to put in this random Twitter account. Hope they never get hacked lol. But otherwise yes, it's a wonderful idea, and it would be a good feature for Hugging Face to speed up downloads/uploads.
Yeah, this seems less focused on creativity. There are a lot of really good models out there tuned for storytelling that will far exceed generalized SoTA models.
Better finetuning is such an important factor. I feel like the future is all of us having our own personal tunes for models that work well with our lives, and being able to iterate and teach them more basically every day is also really helpful, so the more barriers we can take down the better!
Hmm, I've had interesting results from both of those base models but haven't tried the combo yet. Will start some ExLlamaV2 quants to test.
What's it doing well at?
Quant link for anyone who wants it: https://huggingface.co/bartowski/OpenHermes-2.5-neural-chat-7b-v3-1-7B-exl2
I don't have a lot of experience with either at this time. I've used them here and there for programming questions, but usually I stick to 7B models because I use them for code completion, and I only find that useful if it completes the code before I do lol
That said, I've had good answers overall from either whenever I've decided to pull them out. It feels like WizardCoder should be better since it's so much newer, but overall it hasn't been that different. Wish Phind would release an update :(