A new paper suggests diminishing returns from larger and larger generative AI models. Dr Mike Pound discusses.

The Paper (No “Zero-Shot” Without Exponential Data): https://arxiv.org/abs/2404.04125

  • chrash0@lemmy.world
    6 months ago

    gotem!

    seriously tho, you don’t think OpenAI is tracking this? architectural improvements and training strategies are developing all the time

    • barsoap@lemm.ee (OP)
      6 months ago

      …and aren’t making progress on that front: a linear increase in generalisation still requires a more than linear increase in the amount of data (exponential, according to the paper).

      Also, btw, it’s not as if we don’t know that our current architectures won’t lead to proper intelligence. tl;dr: While current architectures can learn and represent information, they cannot develop learning strategies or decide smartly on how to represent a particular bit of information. All the improvements that are happening are in that “how to learn better” area; we have no idea whatsoever how to make the jump to teaching an AI to learn how to learn. AlphaZero is able to learn to play a game, yes, but it can’t learn arbitrary information: once you throw something other than a game at it, it has no idea how to make sense of anything.
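To put rough numbers on the "linear gains need more than linear data" point, here is a minimal sketch of a log-linear relationship of the kind the paper describes. The coefficients `A` and `B` and the function names are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import math

# Hypothetical coefficients, for illustration only (not from the paper).
A = 0.0  # accuracy when a concept appears once in the training data
B = 0.1  # accuracy gained per 10x increase in concept frequency


def accuracy(n_examples: float) -> float:
    """Log-linear model: accuracy grows with the logarithm of the data."""
    return A + B * math.log10(n_examples)


def examples_needed(target_accuracy: float) -> float:
    """Invert the model: how many examples a given accuracy requires."""
    return 10 ** ((target_accuracy - A) / B)


# Equal (linear) steps in accuracy...
targets = [0.2, 0.3, 0.4, 0.5]
needed = [examples_needed(t) for t in targets]

# ...each cost a constant multiplicative factor in data.
ratios = [needed[i + 1] / needed[i] for i in range(len(needed) - 1)]

for t, n in zip(targets, needed):
    print(f"accuracy {t:.1f} needs roughly {n:,.0f} examples")
```

Under these assumed coefficients, every additional 0.1 of accuracy multiplies the required data by about ten, which is what "exponential data for linear improvement" means in practice.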

      • chrash0@lemmy.world
        6 months ago

        “we don’t know how” != “it’s not possible”

        i think OpenAI more than anyone knows the challenges with scaling data and training. anyone working on AI knows the line: “a baby can learn to recognize elephants from a single instance”. reducing training data and time is fundamental to advancement. don’t get me wrong, it’s great to put numbers to these things. i just don’t think this paper is super groundbreaking or profound. a bit clickbaity and sensational for Computerphile

        • barsoap@lemm.ee (OP)
          6 months ago

          …and a baby doesn’t use the same architecture, not even close, as generative AIs. Babies are T3 systems: they aren’t simply systems which have rules on how to learn, they are systems which have rules on how to develop learning strategies that they then use to learn.

          I’m not doubting, in the slightest, that AI can get there: it’s definitely possible. It’s just not possible with the current approaches, and the iterative refinements that “oh OpenAI is constantly coming up with new topologies” refers to are just more of the same. Show me a topology that can come up with topologies, then we’ll have a chance to break through the need for exponential amounts of data.