I’ve been looking into self-hosting LLMs or Stable Diffusion models using something like LocalAI and/or Ollama, with LibreChat as a frontend.
Some questions to get a nice discussion going:
- Any of you have experience with this?
- What are your motivations?
- What are you using in terms of hardware?
- Considerations regarding energy efficiency and associated costs?
- What about renting a GPU? Privacy implications?
I’ve installed Ollama on my gaming rig (RTX 4090 with 128 GB of RAM), M3 MacBook Pro, and M2 MacBook Air. I’m running Open WebUI on my server, which can connect to multiple Ollama instances. Open WebUI has its own Ollama-compatible API, which I use for projects. I’ll only boot up my gaming rig if I need to use larger models; otherwise the M3 MacBook Pro can handle most tasks.
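
For anyone curious what "using the API for projects" looks like in practice, here's a rough Python sketch of calling an Ollama-compatible generate endpoint. The base URL, model name, and timeout are placeholders for whatever your own Ollama or Open WebUI instance exposes, so adjust them to your setup:

```python
import requests

# Placeholders: point this at your own instance. Ollama's native API
# listens on http://localhost:11434 by default; Open WebUI proxies a
# compatible endpoint behind its own URL (and usually an API key).
BASE_URL = "http://localhost:11434"
MODEL = "llama3"  # whatever model you've pulled locally


def generate(prompt: str) -> str:
    """Send a single, non-streaming generation request."""
    resp = requests.post(
        f"{BASE_URL}/api/generate",
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]


if __name__ == "__main__":
    print(generate("Summarize why someone might self-host an LLM."))
```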