lollms-webui is the jankiest of the images, but it's newish to the scene and I'm working with the dev a bit to get it nicer (the main problem right now is that it requires CLI prompts, which he'll be removing). Koboldcpp and text-gen are in a good place though, happy with how those are running.
Thanks! I'll check these out when I get to my server. I host a small LLM that helps bots sound more human while doing trivial tasks on Twitch.
Awesome work! Going to try out koboldcpp right away. Currently running llama.cpp in Docker on my workstation because it would be such a mess to get the CUDA toolkit installed natively.
Out of curiosity, isn't conda a bit redundant in docker since it already is an isolated environment?
Yes, that's a good one for an FAQ because I get it a lot, and it's a very good question haha. The reason I use it is image size: the base nvidia devel image is needed for a lot of compilation during Python package installation, and it's huge. So instead I use conda and transfer it to the nvidia runtime image, which is... also pretty big, but it saves several GB of space, so it's a worthwhile hack :)
But yes, avoiding CUDA messes on my bare machine is definitely my biggest motivation.
I would love to have some GUI with optional vector database support that I could feed my docs into.
You want h2oGPT, or just use LangChain with a CLI.
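For anyone wondering what "vector database support" boils down to, here is a toy sketch in plain Python. It is not how h2oGPT or LangChain do it: the `embed` function is just word counts standing in for a real embedding model, `TinyVectorStore` is a made-up name, and the sample docs are invented for illustration. The idea is only to show the store-then-search-by-similarity loop.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a real setup would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (doc, vector) pairs and ranks by cosine."""

    def __init__(self):
        self.items = []

    def add(self, doc):
        self.items.append((doc, embed(doc)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


# Invented sample docs, loosely themed on this thread.
store = TinyVectorStore()
store.add("koboldcpp serves a local model with a web UI")
store.add("conda keeps the CUDA runtime image smaller")
store.add("the Twitch bot answers chat with a small LLM")

top = store.search("how do I shrink the CUDA image?", k=1)
print(top)
```

In a real pipeline the retrieved docs would be pasted into the LLM prompt as context; swapping `embed` for an actual embedding model and the list for a proper vector database is what the tools above handle for you.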