For people self hosting LLMs.. I have a couple docker images I maintain

noneabove1182@sh.itjust.works · 1 year ago

noneabove1182@sh.itjust.works · edit-2 1 year ago

lollms-webui is the jankiest of the images, but that one’s newish to the scene and I’m working with the dev a bit to get it nicer (the main current problem is the requirement for CLI prompts, which he’ll be removing). Koboldcpp and text-gen are in a good place though, and I’m happy with how those are running.

1 year ago

Thanks! I’ll check these out when I get to my server. I host a small LLM that helps bots sound more human while doing trivial tasks on Twitch.

ffhein@lemmy.world · 1 year ago

Awesome work! Going to try out koboldcpp right away. Currently running llama.cpp in docker on my workstation because it would be such a mess to get the CUDA toolkit installed natively…

Out of curiosity, isn’t conda a bit redundant in docker since it already is an isolated environment?

noneabove1182@sh.itjust.works · edit-2 1 year ago

Yes, that’s a good one for an FAQ because I get it a lot, and it’s a very good question haha. The reason I use it is image size: the base nvidia devel image is needed for a lot of compilation during Python package installation, and it is huge. So instead I use conda in the devel image and transfer the environment to the nvidia runtime image, which is… also pretty big, but it saves several GB of space, so it’s a worthwhile hack :)

But yes, avoiding CUDA messes on my bare machine is definitely my biggest motivation.
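
For anyone curious what that devel-to-runtime hack can look like, here is a rough multi-stage sketch. It is not the actual Dockerfile from these images: the CUDA image tags, the Python version, the Miniforge installer (swapped in here for whichever conda distribution is really used), the /opt/env path, and requirements.txt are all assumptions for illustration.

```dockerfile
# Stage 1: the big devel image, which ships the compilers and CUDA headers
# needed to build Python packages (tag is an assumption).
FROM nvidia/cuda:12.1.1-devel-ubuntu22.04 AS builder

RUN apt-get update \
    && apt-get install -y --no-install-recommends wget ca-certificates build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install conda via Miniforge (assumed choice of installer).
RUN wget -q https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O /tmp/miniforge.sh \
    && bash /tmp/miniforge.sh -b -p /opt/conda \
    && rm /tmp/miniforge.sh

# requirements.txt is a hypothetical dependency list.
COPY requirements.txt /tmp/requirements.txt

# Create the environment at a fixed prefix so it can be copied as one directory.
RUN /opt/conda/bin/conda create -y -p /opt/env python=3.10 \
    && /opt/conda/bin/conda run -p /opt/env pip install --no-cache-dir -r /tmp/requirements.txt

# Stage 2: the smaller runtime image; only the finished environment comes along.
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04
COPY --from=builder /opt/env /opt/env
ENV PATH=/opt/env/bin:$PATH
```

The environment is copied to the same path it was created at, since conda environments generally expect to stay at their original prefix. The compilers and headers from the devel image never reach the final image, which is where the space saving comes from.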

ffhein@lemmy.world · 1 year ago

Ah, nice.

Btw. perhaps you’d like to add:

build: .

to docker-compose.yml so you can just write “docker-compose build” instead of having to do it with a separate docker command. I would submit a PR for it, but I have made a bunch of other changes to that file, so it’s probably faster if you do it.
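
A minimal sketch of where that line would sit, assuming the Dockerfile lives next to docker-compose.yml. The service name and image tag below are hypothetical, not taken from the repo:

```yaml
services:
  koboldcpp:                 # hypothetical service name
    build: .                 # build from the Dockerfile in this directory
    image: koboldcpp:local   # hypothetical tag applied to the built image
```

With that in place, “docker-compose build” builds the image, and “docker-compose up --build” rebuilds it before starting the container.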

Madiator2011@lm.madiator.cloud · 1 year ago

I would love to have some GUI with optional vector database support that I could feed my docs into.

Falcon@lemmy.world · 9 months ago

You want h2oGPT, or just use LangChain from the CLI.