PewDiePie releases Codex/ClaudeCode/Cursor killer, Odysseous (FOSS)

appauled@sh.itjust.works · 2 months ago

onlinepersona@programming.dev · 2 months ago

How many GPUs do you even need to have a usable, self-hosted AI? It looks like he has 6 on his rig. Probably each costs 2k or something. That’s not peanuts. I have a 12GB VRAM card. It probably can’t generate anything in any meaningful amount of time. Which brings me to the question: who is this for?

Regardless, impressive what he vibe-coded there.

realitaetsverlust@piefed.zip · 2 months ago

I use a 6700 XTX and it’s working perfectly fine, depending on the model. Gemma4 takes a long time to generate answers, but the Qwen series is quick and starts generating answers in ~10 seconds.

onlinepersona@programming.dev · 2 months ago

What’s the quality of the answers though? And how much context can it hold? I imagine it’s only good for small, short questions, but have no concept of what is needed for that.

I’m assuming you’re using a 12B or 24B Qwen model. The ones from DeepSeek go up to hundreds of billions of params and I can’t tell if a bigger number is better or just meaningless posturing.

realitaetsverlust@piefed.zip · 2 months ago

I’m using the 35b models.

Quality for Qwen is mostly fine - sometimes it does hallucinate some shit while thinking, but it does correct itself almost every time. The answers themselves are, for the most part, precise and useful. Not what you know from the cloud models, obviously, but it’s absolutely fine for everyday use. What is actually annoying is the web search - not sure if that’s a Qwen problem or a problem with Open WebUI, but it takes a long time to finish the search.

I once had a situation where a model ran into an “infinite loop” while thinking, repeating the same line over and over again. And once, Qwen just started outputting Chinese halfway through the answer lol.

When it comes to context, I’m gonna be very honest - I don’t know. I have never hit any kind of problems or limits because of that, since I’m not using AI on a long-term project. I use it for small, concise cases and that’s it.

irmadlad@lemmy.world · 2 months ago

Didn’t downvote. I use AI, and not ashamed of it. I don’t write huge programs and I damn sure don’t release anything to the public mainly because, in the back of my mind, I can just see some poor chap using my code and now smoke is coming out of his server. It works for me. Usually it’s ‘write a script that does _________’ or Docker compose files. It seems pretty accurate for those uses and if I need a bash command sequence explained, it’s good for that too.

I also use AI when I master my audio tracks before I upload them. I am clinically deaf and there are some frequencies that I just can’t hear well enough to make a judgement call. It’s pretty good at that too.

onlinepersona@programming.dev · 2 months ago

Thanks for the response. It’s interesting to read about the experience of others.

Encrypt-Keeper@lemmy.world · 2 months ago

My MacBook Air with 24GB of unified RAM is enough to run something simple and useful.

KyuubiNoKitsune@lemmy.blahaj.zone · 2 months ago

That’s like what, 5 or 6k?

Encrypt-Keeper@lemmy.world · 2 months ago

Like 1k

KyuubiNoKitsune@lemmy.blahaj.zone · 2 months ago

Reasonable price!

new_world_odor@lemmy.world · 2 months ago

I have an RX 5600 XT (6 GB), 32 GB of RAM, and a Ryzen 3600. The system hasn’t been updated since I built it during COVID. QwenV3-vl35B is the heftiest thing I can run; it gets around 2 tokens/sec in LM Studio. It’s easier than most people seem to think.

onlinepersona@programming.dev · 2 months ago

How do you not run out of VRAM? Does it offload to system RAM?

new_world_odor@lemmy.world · 2 months ago

Yes, it offloads into system RAM. Oh, and I forgot to mention that’s with the context set around 25k. That can vary greatly per model though; it’s taken some experimentation to figure that out.
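
For anyone wondering why a 35B model on a 6 GB card has to spill into system RAM, here is a rough back-of-the-envelope sketch. It only counts the model weights, and the 4-bit quantization and the 1 GiB of VRAM held back for everything else are assumptions for illustration, not settings anyone in this thread reported. The function names are made up for the example. Context (the KV cache) adds more on top and depends on the model architecture, so it is left out.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB (weights only, no KV cache)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def split_across_memory(weights_gib: float, vram_gib: float,
                        vram_reserved_gib: float = 1.0) -> tuple[float, float]:
    """Return (GiB kept on the GPU, GiB offloaded to system RAM).

    vram_reserved_gib is an assumed allowance for the desktop, buffers, etc.
    """
    usable = max(vram_gib - vram_reserved_gib, 0.0)
    on_gpu = min(weights_gib, usable)
    return on_gpu, weights_gib - on_gpu


if __name__ == "__main__":
    # Assumed: 35B parameters at 4 bits per weight, on a 6 GiB card.
    weights = weight_gib(35, 4)
    on_gpu, in_ram = split_across_memory(weights, vram_gib=6)
    print(f"weights: {weights:.1f} GiB")
    print(f"on GPU:  {on_gpu:.1f} GiB")
    print(f"in RAM:  {in_ram:.1f} GiB")
```

Under those assumptions most of the weights end up in system RAM, which would fit with the low tokens/sec figure, since system RAM is slower for this than VRAM.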

onlinepersona@programming.dev · 2 months ago

Thank you. That’s good to know.

Dultas@lemmy.world · 2 months ago

I think in one video it looked like 16 cards. I think he did multiple bifurcations of the PCIe lanes. I think he is / was using it for protein folding as well.

onlinepersona@programming.dev · 2 months ago

That’s definitely not my level of disposable wealth/income. I can barely afford one card.

MY trillion $Dollar Project is finally OUT!