Another reason to self-host your own AI

SuspiciousCarrot78@aussie.zone · 2 months ago

superglue@lemmy.dbzer0.com · 2 months ago

Does anyone have a recommendation for a local model that runs well on a 5070 with 12GB of VRAM? It would pretty much only be used for help with homelabbing and simple scripts.

monoboy@lemmy.zip · 2 months ago

Qwen 3.6-35B-A3B (which OP mentioned) would work great, as long as you have some system RAM to offload part of it to.

SuspiciousCarrot78@aussie.zone · 2 months ago

There’s an argument to be had about a MoE versus a small dense model; it depends on what exactly you need it to do. I would be tempted to run a smaller dense model (like Qwen 3-14B or Qwen 3.5 9B) because, at a reasonable quant, it might fit mostly or entirely on the GPU, which would give you excellent speeds.
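For a rough sense of what "fits on the GPU" means, here is a back-of-the-envelope sketch. It only estimates the size of the weights from the parameter count and the bits per weight. The 4.5 bits per weight (roughly a 4-bit quant plus its metadata) and the 1.5 GiB of headroom for context and runtime buffers are my assumptions, not measured numbers, and the function names are made up for illustration. Real usage depends on the runtime, the quant format, and the context length.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_on_gpu(params_billion: float, bits_per_weight: float,
                vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in VRAM.

    overhead_gib is a guess covering KV cache and runtime buffers;
    it grows with context length in practice.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    VRAM_GIB = 12    # the 12GB card from the question
    BITS = 4.5       # assumed effective bits per weight for a 4-bit quant
    # Parameter counts are read off the model names mentioned in the thread.
    for name, params_b in [("dense 9B", 9), ("dense 14B", 14), ("MoE 35B", 35)]:
        size = weight_gib(params_b, BITS)
        verdict = "fits" if fits_on_gpu(params_b, BITS, VRAM_GIB) else "needs offload"
        print(f"{name}: ~{size:.1f} GiB of weights, {verdict}")
```

Under those assumptions the 9B and 14B dense models land under 12 GiB, while the 35B MoE does not, which is why it needs system RAM to offload to. The MoE only activates a few billion parameters per token, so it can still be usable when partly offloaded, but all of its weights have to live somewhere.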

PS: I’m actually in the process of designing an expert system (not an LLM) for pretty much the task you described. The intention is that you would still interact with it like a large language model, but the actual brains underneath would be something more traditional.