Recent post re: AI as utility

https://www.tomsguide.com/ai/people-will-buy-intelligence-from-us-on-a-meter-chatgpts-ceo-sam-altman-has-critics-worried-with-his-ai-vision

Myself, I’m a fan of local LLM / self-hosted ML… but if you ever needed a clarion call that a hard pivot is coming (soon) for online/cloud-based AI… Altman et al. are making some concerning mouth noises (to say nothing of broader concerns with OAI, Anthropic etc.).

Right now, I’m sketching out a plan where my Raspberry Pi (always on, 2-3 W) uses a magic packet to wake up my modest AI server (Lenovo P330 with Tesla P4) if/when needed (Qwen 3.6-35B-A3B); no point in chugging down 80-100 W, 24/7, for no good reason.
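
For anyone wanting to try the same thing, here is a minimal sketch of the Pi side in Python. A Wake-on-LAN magic packet is six 0xFF bytes followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. The MAC in the usage line, the broadcast address, and port 9 are placeholder assumptions, and the server's BIOS and network card need Wake-on-LAN enabled for any of this to work.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the packet as a UDP broadcast (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Usage would be something like `send_magic_packet("00:11:22:33:44:55")`, with the placeholder swapped for the server's real MAC address.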

If the trend continues in the direction it appears to be heading (increasing costs, environmental impacts etc.), then I’d feel a lot better hosting my own as first port of call and replacing simpler tasks with more traditional programs. YMMV.
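
To put rough numbers on the 80-100 W versus 2-3 W point, here is a back-of-envelope sketch. The midpoints (90 W and 2.5 W) come from the figures above; the two hours of daily server use is purely a hypothetical assumption, and the draw of the sleeping server is ignored.

```python
def annual_kwh(watts: float, hours_per_day: float = 24.0) -> float:
    """Energy used in a year, in kWh, by a device drawing `watts` for `hours_per_day`."""
    return watts * hours_per_day * 365 / 1000


SERVER_W = 90.0      # midpoint of the 80-100 W quoted above
PI_W = 2.5           # midpoint of the 2-3 W quoted above
ACTIVE_HOURS = 2.0   # ASSUMPTION: hypothetical hours per day the server is awake

always_on = annual_kwh(SERVER_W)
on_demand = annual_kwh(PI_W) + annual_kwh(SERVER_W, ACTIVE_HOURS)

print(f"always on: {always_on:.1f} kWh/year")
print(f"on demand: {on_demand:.1f} kWh/year")
```

With those assumed numbers it works out to roughly 788 kWh a year always on versus about 88 kWh a year waking on demand. Change `ACTIVE_HOURS` to match your own usage.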

  • irmadlad@lemmy.world · 7 days ago · 16 up, 1 down

    ‘People will buy intelligence from us on a meter’

    We have governmental surveillance and we have surveillance capitalism. Surveillance capitalism works so well that governments are now very interested in the data it collects, which is alarming. Unfounded conspiracy theory: it’s probably one of the reasons that governments don’t seem interested in regulating AI. If I had the proper equipment to run AI entirely locally, and efficiently enough that the expenditure would justify it, I would.

    • SuspiciousCarrot78@aussie.zone (OP) · 7 days ago · 4 up, 1 down

      You probably could. A Tesla P4 or P40 (old data centre cards) is more than up to the job. My Lenovo tiny hosts a P4 (card cost $100 on eBay; the Lenovo itself was $200ish) and runs Qwen3.5-35B-A3B at about 20 tok/s. Smaller models are even faster.

      https://www.youtube.com/watch?v=8F_5pdcD3HY

      If you’re not bound by the one liter shoebox design, then the P40 is still a great and inexpensive card.

      I think I mentioned elsewhere but right now I’m trying to figure out if I can use a magic packet from the Raspberry Pi to wake up the Lenovo as needed rather than leaving it on all the time.
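
      One piece that probably belongs in that plan: after the wake-up packet goes out, the Pi has to wait until the server has booted and the model server is actually listening. A rough sketch, assuming the inference server exposes a TCP port (the host, port, and timings are all hypothetical):

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll until a TCP connection to host:port succeeds, or give up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

      The idea would be to send the magic packet, call something like `wait_for_port("192.168.1.50", 8080)` with your own address and port, and only forward the request once it returns `True`.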

      • irmadlad@lemmy.world · 6 days ago · 2 up

        Thing is, if I were going to do in house AI, I’d want to do it up right and from what I can gather, a system like that is going to cost me some jack.

  • pogmommy@lemmy.ml · 7 days ago · 14 up, 6 down

    My issue with the orphan-crushing machine isn’t only that it’s not in my children’s bedroom

  • sobchak@programming.dev · 6 days ago · 6 up

    I think they know it’s a somewhat viable option and is part of the reason they’re doing the hardware cartel/circlejerk thing.

  • Decronym@lemmy.decronym.xyz (bot) · 4 days ago (edited) · 2 up

    Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

    Fewer Letters   More Letters
    ARP             Address Resolution Protocol, translates IPs to MAC addresses
    IP              Internet Protocol
    RPi             Raspberry Pi brand of SBC
    SBC             Single-Board Computer

    [Thread #321 for this comm, first seen 30th May 2026, 09:50]

  • Auli@lemmy.ca · 6 days ago · 6 up, 2 down

    Sure, but all these self-hosted AIs are still made by companies who used massive amounts of power and water to train them.

    • KatherinaReichelt@feddit.org · 6 days ago · 9 up, 1 down

      Which is an interesting dilemma: those AIs are already trained. That power and water was used. If you use them, you will not pollute anything. But you may encourage those companies to train another AI.

  • Noxy@pawb.social · 7 days ago · 17 up, 13 down

    not gonna self host bullshit that wastes resources and makes me dumber.

  • somegeek@programming.dev · 5 days ago (edited) · 2 up, 1 down

    I started working toward self-hosting an LLM for my small company using ollama and opencode as the agent. But I realized a good model like GLM 5 requires 250GB of RAM and 24GB VRAM with a 4080?? I don’t know, this is what the LLM told me itself.

    I ended up using qwen-code2.7-7b-16k.

    Currently the best thing I have is my laptop: 16GB RAM, i7 9750H, GTX 1650.

    How do you guys selfhost? What models do you use that are actually good?

    • SuspiciousCarrot78@aussie.zone (OP) · 4 days ago (edited) · 2 up, 1 down

      I mean…that entirely depends on your use case - and I hate saying that. For me and what I do, Qwen SLM (esp Qwen3-4B 2507 instruct and Qwen3.5-2B) are exceptional. But I’m not trying to do Claude at home.

      Best bet? Spend $10 on OpenRouter and try different models. In a head to head with ChatGPT 5.4 mini (excellent for coding BTW), I’ve found Qwen 3.5 27B more than able to hold its own for coding tasks… IF you narrowly gate it/confine it. The last batch of Qwens really are something. Dunno about the 3.7 series.

      Having said ALL that, I’m really tempted to go back in time and code myself a deterministic expert system, with user updatable knowledge cascade, tool calling and a minimal amount of Markov chain word garnish for flavour. I think we used to just call that “a program” lol.

      Really tempted actually, because if 50% of LLM use cases are basically Super Google but not shit… well, I can make that myself. I just need to point my autism at it.

      PS: this might help

      https://www.youtube.com/watch?v=0AqpaFm11oI

  • superglue@lemmy.dbzer0.com · 7 days ago · 5 up, 1 down

    Does anyone have a recommendation for a local model that can run well on a 5070 12GB? It pretty much would only get used for help with homelabbing and simple scripts.

    • monoboy@lemmy.zip · 7 days ago · 7 up, 1 down

      Qwen 3.6-35B-A3B (which OP mentioned) would work great as long as you have some system RAM to offload it to.

    • SuspiciousCarrot78@aussie.zone (OP) · 7 days ago · 5 up, 1 down

      There’s an argument to be had regarding a MoE versus a small dense model. I guess it depends on what exactly you need doing with it. I would be tempted to run a smaller dense model (like a Qwen 3-14B or a Qwen 3.5 9B) as at a reasonable quant, it might fit mostly or entirely on the GPU, thereby giving you excellent speeds.
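
      As a very rough way to sanity-check "fits on the GPU", you can estimate the size of the weights as parameter count times bits per weight. This is only a sketch: the 4.5 bits per weight (a ballpark for a 4-bit quant plus its scaling data) and the 1.5 GiB set aside for context and runtime overhead are assumptions, not measured values.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_on_gpu(params_billion: float, bits_per_weight: float,
                vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if weights plus an ASSUMED fixed overhead fit in the given VRAM."""
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


# Hypothetical comparison for a 12 GB card at an assumed ~4.5 bits per weight.
for name, params in [("9B dense", 9), ("14B dense", 14), ("35B MoE", 35)]:
    size = weight_gib(params, 4.5)
    verdict = "fits" if fits_on_gpu(params, 4.5, 12) else "needs offload"
    print(f"{name}: ~{size:.1f} GiB of weights, {verdict}")
```

      Note that a MoE still has to hold all of its weights somewhere even though only a few billion parameters are active per token, which is why the 35B needs system RAM on a 12 GB card under these assumptions while the 9B and 14B do not.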

      PS: I’m actually in the process of designing an expert system (not an LLM) for pretty much the task you described. The intention is that you would still interact with it like a large language model, but the actual brains underneath it would be something more traditional.
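
      To give a flavour of what "something more traditional" could look like, here is a toy sketch and nothing more: a deterministic keyword-rule lookup behind a chat-style `answer()` function. The rules, names, and matching scheme are all made up for illustration and are not the actual design.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    keywords: frozenset  # every keyword must appear in the question
    answer: str


# Hypothetical knowledge base; a real one would be user-updatable.
RULES = [
    Rule(frozenset({"wake", "lan"}), "Send a magic packet to the machine's MAC address."),
    Rule(frozenset({"disk", "full"}), "Check usage with df -h, then hunt for big directories with du."),
]

FALLBACK = "No rule matches; try rephrasing or add a rule."


def answer(question: str, rules=RULES) -> str:
    """Return the answer of the first rule whose keywords all occur in the question."""
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    for rule in rules:
        if rule.keywords <= words:
            return rule.answer
    return FALLBACK
```

      Same question in, same answer out, every time, and when it does not know something it says so instead of guessing.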

  • heartSagan5@lemmy.zip · 5 days ago · 1 up, 1 down

    And are you sure you’re self-hosting or is it a plugin (that you’re self-hosting)? Also, I don’t invite SkyNet into my perimeter.

  • litchralee@sh.itjust.works · 7 days ago · 4 up, 7 down

    I’d like to draw a comparison: a cozy wood fire versus central heating. In the right time and place (eg camping in the woods), a wood fire is both very practical and very useful. Meanwhile, most homes built in the past 70+ years in the USA have central heating (or are somewhere that doesn’t need heating at all) and the benefits are quite obvious: automatic temperature regulation, supplied by a utility, and low or no local emissions. And yet, there will still be rural homes that are heated exclusively by a wood stove, located in the middle of the living room, whose iron construction stores and radiates heat well after the fire has gone out.

    Do I bemoan individual homes that use a wood fire? No, not really. The reality is that a grand, overwhelming majority of people don’t have wood fires anymore. Even when air quality is poor, prohibiting wood fires in a few rural homes isn’t exactly what would clear up the air.

    Now, it would be a vastly different story if city-dwellers all had wood fires. When every home in a neighborhood is building and burning a wood fire, the results are disastrous: horrific PM2.5 in the air, soot coating everything, substantially reduced energy efficiency, and mass logging just to keep the wood supply. A mole-hill quickly becomes a mountain of problems when it’s at scale.

    So to that end, I would very much like to see commercial-scale AI reined in, as the external costs have already gotten out of hand. What they have built is more correctly called a wildfire, not a wood fire. But where does that leave small-scale AI/LLM users? They can weigh the costs and benefits for themselves, provided that they don’t harm other people or resources in the process.

    But that brings us back to a cozy wood fire versus central heating: at small scale, a wood fire struggles to heat an entire modern American home (ie 2500 sq ft; or 232 sq m). Yet central heating does it with ease. Who then will be interested in this endeavor? Probably only those with a love for the camping aesthetic, and other enthusiasts.

    At this point, it has become clearer what the utility of small LLMs is, and they do pale in comparison to larger ones. If small LLMs are what sensibly survives into the future, then that’s essentially a cap on their capabilities, given a want to avoid burning the planet to run anything larger. The only way out would be substantial developments in the energy efficiency of small LLMs, but that’s not where the interest is.

    No one is seeking to build a more efficient wood fire.

    • pound_heap@lemmy.dbzer0.com · 7 days ago · 4 up, 1 down

      People are downvoting you, but I like your idea of drawing an analogy with heating, because it is something most of us rely on, and if LLMs and related technology keep evolving as they do, most of us will probably rely on them more or less, sooner or later. Regardless of what AI haters would say.

      But your wood fire/central heating analogy is bad. I would compare large LLM vendors to the hot water heating utilities common in Eastern Europe, and small LLMs to various heating devices. Utility companies can set prices, decide who gets connected to the hot water pipe, and set the water temperature. There are regulations that limit the power of such utility companies, allow customers to choose the supplier, etc. The same should happen with LLM providers: competition and anti-monopoly laws should protect customers who choose to use them.

      Alternatively, customers may choose not to use utility-supplied heating. They can purchase space heaters or hand warmers, install split systems, or burn wood; they are free to pick the technology, power source, size, and appearance of such devices. They can take responsibility for heating their homes, willing to invest their time and money in order to be independent of the central heating utility. Small LLMs are like that: people can run their own, with capabilities dependent on investment, or they can pay smaller providers or resellers to get more flexibility and/or privacy and avoid capital investments. They could spend time tuning small models and harnesses to do some simple tasks, and they wouldn’t need to “buy intelligence” from OpenAI and others.