‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says::Pressure grows on artificial intelligence firms over the content used to train their products

      • kiagam@lemmy.world
        English
        arrow-up
        14
        ·
        7 months ago

        we should use those who break it as a beacon to rally around and change the stupid rule

      • Grabbels@lemmy.world
        English
        arrow-up
        2
        arrow-down
        2
        ·
        7 months ago

        Except they pocket millions of dollars by breaking that rule and the original creators of their “essential data” don’t get a single cent while their creations indirectly show up in content generated by AI. If it really was about changing the rules they wouldn’t be so obvious in making it profitable, but rather use that money to make it available for the greater good AND pay the people that made their training data. Right now they’re hell-bent on commercialising their products as fast as possible.

        If their statement is that stealing literally all the content on the internet is the only way to make AI work (instead of for example using their profits to pay for a selection of all that data and only using that) then the business model is wrong and illegal. It’s as simple as that.

        I don’t get why people are so hell-bent on defending OpenAI in this case; if I were to launch a food-delivery service that’s affordable for everyone, but I shoplifted all my ingredients “because it’s the only way”, most would agree that’s wrong and my business is illegal. Why is this OpenAI case any different? Because AI is an essential development? Oh, and affordable food isn’t?

        • afraid_of_zombies@lemmy.world
          English
          arrow-up
          1
          arrow-down
          3
          ·
          7 months ago

          I am not defending OpenAI; I am attacking copyright. Do you have freedom of speech if you have nothing to say? Do you have it if you are a total asshole? Do you have it if you are the nicest human who ever lived? Do you have it if you have no desire to use it?

  • unreasonabro@lemmy.world
    English
    arrow-up
    36
    ·
    7 months ago

    finally capitalism will notice how many times it has shot itself in the foot with its ridiculous, greedy infinite copyright scheme

    As a musician, people not involved in the making of my music make all my money nowadays instead of me anyway. burn it all down

  • Blackmist@feddit.uk
    English
    arrow-up
    29
    arrow-down
    3
    ·
    7 months ago

    Maybe you shouldn’t have done it then.

    I can’t make a Jellyfin server full of content without copyrighted material either, but the key difference here is I’m not then trying to sell that to investors.

      • Shazbot@lemmy.world
        English
        arrow-up
        14
        ·
        7 months ago

        Reading these comments has shown me that most users don’t realize that not all working artists are using 1099s and filing as an individual. Once you have stable income and assets (e.g. equipment) there are tax and legal benefits to incorporating your business. Removing copyright protections for large corporations will impact successful small artists who just wanted a few tax breaks.

      • BURN@lemmy.world
        English
        arrow-up
        7
        arrow-down
        1
        ·
        7 months ago

        They protect artists AND protect corporations, and you can’t have one without the other. It’s much better the way it is compared to no copyright at all.

          • BURN@lemmy.world
            English
            arrow-up
            7
            arrow-down
            1
            ·
            7 months ago

            They’re screwed less than they would be if copyright was abolished. It’s not a perfect system by far, but an overly restrictive system is 100x better than an open system of stealing from others.

            • agitatedpotato@lemmy.world
              English
              arrow-up
              1
              ·
              edit-2
              7 months ago

              So without copyright, if an artist makes a cool picture and Coca-Cola uses it to sell soda and decides not to give the artist any money, now they have no legal recourse, and that’s better? I don’t think the issue is copyright inherently so much as who holds and enforces those rights. If all copyrights were necessarily held by the people who actually made what is copyrighted, many of the problems would be gone.

      • HelloThere@sh.itjust.works
        English
        arrow-up
        4
        ·
        edit-2
        7 months ago

        I’m no fan of the current copyright law - the Statute of Anne was much better - but let’s not kid ourselves that some of the richest companies in the world have any desire whatsoever to change it.

        • Gutless2615@ttrpg.network
          English
          arrow-up
          1
          arrow-down
          3
          ·
          7 months ago

          My brother in Christ I’m begging you to look just a little bit into the history of copyright expansion.

      • Fisk400@feddit.nu
        English
        arrow-up
        1
        ·
        7 months ago

        As long as capitalism exists in society, just being able to go yoink and take everyone’s art will never be a practical rule set.

    • S410@lemmy.ml
      English
      arrow-up
      1
      arrow-down
      1
      ·
      7 months ago

      Every work is protected by copyright, unless stated otherwise by the author.
      If you want to create a capable system, you want real data and you want a wide range of it, including data that is rarely considered to be a protected work, despite being one.
      I can guarantee you that you’re going to have a pretty hard time finding a dataset with diverse data containing things like napkin doodles or bathroom stall writing that’s compiled with permission of every copyright holder involved.

      • HelloThere@sh.itjust.works
        English
        arrow-up
        2
        ·
        edit-2
        7 months ago

        I never said it was going to be easy - and clearly that is why OpenAI didn’t bother.

        If they want to advocate for changes to copyright law then I’m all ears, but let’s not pretend they actually have any interest in that.

    • NeatNit@discuss.tchncs.de
      English
      arrow-up
      27
      arrow-down
      1
      ·
      7 months ago

      hijacking this comment

      OpenAI was IMHO well within its rights to use copyrighted materials when it was just doing research. They were* doing research on how far large language models can be pushed, where’s the ceiling for that. It’s genuinely good research, and if copyrighted works are used just to research and what gets published is the findings of the experiments, that’s perfectly okay in my book - and, I think, in the law as well. In this case, the LLM is an intermediate step, and the published research papers are the “product”.

      The unacceptable turning point is when they took all the intermediate results of that research and flipped them into a product. That’s not the same, and most or all of us here can agree - this isn’t okay, and it’s probably illegal.

      * disclaimer: I’m half-remembering things I’ve heard a long time ago, so even if I phrase things definitively I might be wrong

      • dasgoat@lemmy.world
        English
        arrow-up
        1
        arrow-down
        1
        ·
        edit-2
        7 months ago

        True, with the acknowledgement that this was their plan all along and the research part was always intended to be used as a basis for a product. They just used the term ‘research’ as a workaround that allowed them to do basically whatever to copyrighted materials, fully knowing that they were building a marketable product at every step of their research

        That is how these people essentially function, they’re the tax loophole guys that make sure Amazon pays less in taxes than you and I do. They are scammers who have no regard for ethics and they can and will use whatever they can to reach their goal. If that involves lying about how you’re doing research when in actuality you’re doing product development, they will do that without hesitation. The fact that this product now exists means lawmakers are faced with a reality where the crimes are kind of in the past and all they can do is try and legislate around this thing that now exists. And they will do that poorly because they don’t understand AI.

        And this just goes into fraud in regards to research and copyright. Recently it came out that LAION-5B, an image dataset used to train Stable Diffusion, contained at least 1000 images of child pornography. We don’t know what OpenAI did to mitigate the risk of their seemingly indiscriminate web scrapers picking up harmful content.

        AI is not a future, it’s a product that essentially functions to repeat garbled junk out of things we have already created, all the while creating a massive burden on society with its many, many drawbacks. There are little to no arguments FOR AI, and many, many, MANY to stop and think about what these fascist billionaire ghouls are burdening society with now. Looking at you, Peter Thiel. You absolute ghoul.

        • NeatNit@discuss.tchncs.de
          English
          arrow-up
          2
          arrow-down
          1
          ·
          7 months ago

          True, with the acknowledgement that this was their plan all along and the research part was always intended to be used as a basis for a product. They just used the term ‘research’ as a workaround that allowed them to do basically whatever to copyrighted materials, fully knowing that they were building a marketable product at every step of their research

          I really don’t think so. I do believe OpenAI was founded with genuine good intentions. But around the time it transitioned from a non-profit to a for-profit, those good intentions were getting corrupted, culminating in the OpenAI of today.

          The company’s unique structure, with a non-profit’s board of directors controlling the company, was supposed to subdue or prevent short-term gain interests from taking precedence over long-term AI safety and other such things. I don’t know any of the details beyond that. We all know it failed, but I still believe the whole thing was set up in good faith, way back when. Their corruption was a gradual process.

          There are little to no arguments FOR AI

          Outright not true. There’s so freaking many! Here’s some examples off the top of my head:

          • Just today, my sister told me how ChatGPT (her first time using it) identified a song for her based on her vague description of it. She has been looking for this song for months with no success, even though she had pretty good key details: it was a duet, released around 2008-2012, and she even remembered a certain line from it. Other tools simply failed, and ChatGPT found it instantly. AI is just a great tool for these kinds of tasks.
          • If you have a huge amount of data to sift through, looking for something specific but that isn’t presented in a specific format - e.g. find all arguments for and against assisted dying in this database of 200,000 articles with no useful tags - then AI is the perfect springboard. It can filter huge datasets down to just a tiny fragment, which is small enough to then be processed by humans.
          • Using AI to identify potential problems and pitfalls in your work, which can’t realistically be caught by directly programmed QA tools. I have no particular example in mind right now, unfortunately, but this is a legitimate use case for AI.
          • Also today, I stumbled upon Rapid, a map editing tool for OpenStreetMap which uses AI to predict and suggest things to add - with the expectation that the user would make sure the suggestions are good before accepting them. I haven’t formed a full opinion about it in particular (and I’m especially wary because it was made by Facebook), but these kinds of productivity boosters are another legitimate use case for AI. Also in this category is GitHub’s Copilot, which is its own can of worms, but if Copilot’s training data wasn’t stolen the way it was, I don’t think I’d have many problems with it. It looks like a fantastic tool (I’ve never used it myself) with very few downsides for society as a whole. Again, other than the way it was trained.

          As for generative AI and pictures especially, I can’t as easily offer non-creepy uses for it, but I recommend you see this video which offers a very frank take on the matter: https://nebula.tv/videos/austinmcconnell-i-used-ai-in-a-video-there-was-backlash if you have access to Nebula, https://www.youtube.com/watch?v=iRSg6gjOOWA otherwise.
          Personally I’m still undecided on this sub-topic.

          Deepfakes etc. are just plain horrifying, you won’t hear me give them any wiggle room.

          Don’t get me wrong - I am not saying OpenAI isn’t today rotten at the core - it is! But that doesn’t mean ALL instances of AI that could ever be are evil.

          • dasgoat@lemmy.world
            English
            arrow-up
            1
            arrow-down
            3
            ·
            7 months ago

            ‘It’s just this one that is rotten to the core’

            ‘Oh and this one’

            ‘Oh this one too huh’

            ‘Oh shit the other one as well’

            Yeah you’re not convincing me of shit. I haven’t even mentioned the goddamn digital slavery these operations are running, or how this shit is polluting our planet so someone somewhere can get some AI Childporn? Fuck that shit.

            You’re afraid to look behind the curtains because you want to ride the hypetrain. Have fun while it lasts, I hope it burns every motherfucker who thought this shit was a good idea to the motherfucking ground.

  • McArthur@lemmy.world
    English
    arrow-up
    22
    arrow-down
    2
    ·
    7 months ago

    It feels to me like every other post on Lemmy is talking about how copyright is bad and should be changed, or piracy is caused by fragmentation and difficulty accessing information (streaming sites). Then whenever this topic comes up everyone completely flips. But in my mind all this would do is fragment the AI market much like streaming services (suddenly you have 10 different models with different licenses), and make it harder for non mega corps without infinite money to fund their own LLMs (of good quality).

    Like seriously, can’t we just stay consistent and keep saying copyright bad even in this case? It’s not really an AI problem that jobs are affected, just a capitalism problem. Throw in some good social safety nets and tax these big AI companies and we wouldn’t even have to worry about the artists’ well-being.

    • HiddenLayer5@lemmy.ml
      English
      arrow-up
      18
      ·
      edit-2
      7 months ago

      I think looking at copyright in a vacuum is unhelpful because it’s only one part of the problem. IMO, the reason people are okay with piracy of name brand media but are not okay with OpenAI using human-created artwork is from the same logic of not liking companies and capitalism in general. People don’t like the fact that AI is extracting value from individual artists to make the rich even richer while not giving anything in return to the individual artists, in the same way we object to massive and extremely profitable media companies paying their artists peanuts. It’s also extremely hypocritical that the government and by extension “copyright” seems to care much more that OpenAI is using name brand media than it cares about OpenAI scraping the internet for independent artists’ work.

      Something else to consider is that AI is also undermining copyleft licenses. We saw this with GitHub Copilot, a 100% proprietary product that was trained on all of GitHub’s user-generated code, including GPL and other copyleft licensed code. The art equivalent would be CC-BY-SA licenses where derivatives have to also be creative commons.

      • McArthur@lemmy.world
        English
        arrow-up
        1
        ·
        7 months ago

        Maybe I’m optimistic but I think your comparison to big media companies paying their artists peanuts highlights to me that the best outcome is to let AI go wild and just… Provide some form of government support (I don’t care what form, that’s another discussion). Because in the end the more stuff we can train AI on freely the faster we automate away labour.

        I think another good comparison is reparations. If you could come to me with some plan that perfectly pays out the correct amount of money to every person on earth that was impacted by slavery and other racist policies to make up what they missed out on, I’d probably be fine with it. But that is such a complex (impossible, I’d say) task that it can’t be done, and so I end up being against reparations and instead just say “give everyone money, it might overcompensate some, but better that than undercompensating others”. Why bother figuring out such a complex, costly and bureaucratic way to repay artists when we could just give everyone robust social services paid for by taxing AI products an amount equal to however much money they have removed from the work force with automation?

    • MrSqueezles@lemm.ee
      English
      arrow-up
      5
      arrow-down
      1
      ·
      7 months ago

      Journalist: Read a press release. Write it in my own words. See some Tweets. Put them together in a page padded with my commentary. Learn from, reference, and quote copyrighted material everywhere.

      AI: I do that too.

      Journalists: How dare AI learn! Especially from copyrighted material!

      • Boiglenoight@lemmy.world
        English
        arrow-up
        2
        arrow-down
        1
        ·
        7 months ago

        Journalists need to survive. AI is a tool for profit, with no need to eat, sleep, or pay for kids’ clothes or textbooks.

    • rottingleaf@lemmy.zip
      English
      arrow-up
      1
      ·
      7 months ago

      Which jobs are going to be affected really?

      One thing is for certain, the “open” web is going to become a junkyard even more than it is now.

  • afraid_of_zombies@lemmy.world
    English
    arrow-up
    28
    arrow-down
    9
    ·
    7 months ago

    If the copyright people had their way we wouldn’t be able to write a single word without paying them. This whole thing is clearly a fucking money grab. It is not struggling artists being wiped out, it is big corporations suing a well funded startup.

  • Treczoks@lemmy.world
    English
    arrow-up
    20
    arrow-down
    3
    ·
    7 months ago

    If a business relies on breaking the law as the foundation of its business model, it is not a business but an organized crime syndicate. A Mafia.

  • phillaholic@lemm.ee
    English
    arrow-up
    19
    arrow-down
    4
    ·
    7 months ago

    A ton of people need to read some basic background on how copyright, trademark, and patents protect people. Having none of those things would be horrible for modern society. It would wipe out millions of jobs and medical advancements, and put control into the hands of the companies who can steal and strongarm the best. If you want to live in a world run by Mafia style big business then sure.

    • 31337@sh.itjust.works
      English
      arrow-up
      2
      ·
      7 months ago

      Meh, patents are monopolies over ideas, do much more harm than good, and help big business much more than they help the little guy. Being able to own an idea seems crazy to me.

      I marginally support copyright laws, just because they provide a legal framework to enforce copyleft licenses. Though, I think copyright is abused too much on places like YouTube. In regards to training generative AI, the goal is not to copy works, and that would make the models less useful. It’s very much fair use.

      Trademarks are generally good, but sometimes abused as well.

      • phillaholic@lemm.ee
        English
        arrow-up
        1
        ·
        7 months ago

        Patents don’t let you own an idea. They give you an exclusive right to use the idea for a limited time in exchange for detailed documentation on how your idea works. Once the patent expires everyone can use it. But while it’s under patent anyone can look up the full documentation and learn from it. Without this, big business could reverse engineer the little guy’s invention and just steal it.

        • 31337@sh.itjust.works
          English
          arrow-up
          1
          ·
          edit-2
          7 months ago

          Goes both ways. As someone who has tried bringing new products to market, it’s extremely annoying that nearly everything you can think of already has a similar patent. I’ve also reverse engineered a few things (circuits and disassembled code), as a little guy, working for a small business. I don’t think people usually scan patents to learn things, and reverse engineering usually isn’t too hard.

          If I were a capitalist, I’d argue that if a big business “steals” an idea, and implements it more effectively and efficiently than the small business, then the small business should probably fail.

          • phillaholic@lemm.ee
            English
            arrow-up
            1
            ·
            7 months ago

            Amazon is practically a case study on your last point. They routinely copy the products of competitors that use their platform to sell, taking most of the profits for themselves and sometimes putting those others out of business. I don’t see that as a good thing, it’s anticompetitive and eventually the big business just squeezes for more profit.

    • xenoclast@lemmy.world
      English
      arrow-up
      2
      ·
      edit-2
      7 months ago

      I agree with you in part… It’s moot anyway. It’s the current law of the land. The glue of society and all that. It’s illegal now so they shouldn’t do it.

      If you have enough money (required) and make a solid legal argument to change the laws (optional: depends on how much money you start with) then they can do it… But for now they should STFU and shut the fuck down.

    • BlueMagma@sh.itjust.works
      English
      arrow-up
      2
      arrow-down
      1
      ·
      7 months ago

      I see and understand your point regarding trademark, but I don’t understand how removing copyright or patents would have this effect, could you elaborate?

      • mihnt@lemmy.world
        English
        arrow-up
        13
        ·
        7 months ago

        Small business comes up with something, big business takes idea and puts it in all their stores/factories. Small business loses out because they can’t compete. Small business goes poof trying to compete.

        • BlueMagma@sh.itjust.works
          English
          arrow-up
          4
          arrow-down
          2
          ·
          7 months ago

          Isn’t that what is already happening with our current system? The little guy never has the resources to fight a legal battle against the big guy and enforce his “intellectual property”.

          And the opposite would be true in a world without patents: small businesses could win because they would be free to reuse and adapt big businesses’ ideas.

          It feels very simplistic to reduce patents to “protection of the little business”, in our current world they mostly protect the big ones.

          Also this small example doesn’t explain how removing copyrights would so negatively affect our society.

          • aesthelete@lemmy.world
            English
            arrow-up
            6
            ·
            edit-2
            7 months ago

            There’s a reason why the sharks on Shark Tank ask if ideas are patented. Without a patent, your idea can be ripped off without any recompense.

            Sure there are problems with some patents, such as software patents, but the system should be reformed rather than completely tossed.

          • BURN@lemmy.world
            English
            arrow-up
            6
            ·
            7 months ago

            I mean we’ve seen it work multiple times against Apple where a smaller company has been able to enforce their patent against them.

          • mihnt@lemmy.world
            English
            arrow-up
            2
            ·
            7 months ago

            Well, I was just giving an example of something that is bad about not having a patent system. Personally, I think the patent system is a good thing, but it needs a lot of reworking and we don’t and probably won’t ever have the proper government to fix it, what with all the big businesses living in the politicians’ pockets.

  • Venia Silente@lemm.ee
    English
    arrow-up
    18
    arrow-down
    4
    ·
    7 months ago

    “Impossible”? They just need to ask for permission from each source. It’s not like they don’t already know who the sources are, since the AIs are issuing HTTP(S) requests to fetch them.

  • kingthrillgore@lemmy.ml
    English
    arrow-up
    11
    ·
    7 months ago

    It’s almost like we had a thing where copyrighted things used to end up but they extended the dates because money

  • randon31415@lemmy.world
    English
    arrow-up
    6
    arrow-down
    1
    ·
    7 months ago

    I wonder, if the act of picking cotton had been copyrighted, would we have got the cotton gin? We have automated most non-creative pursuits and displaced their workers. Is it because people can take joy out of creative pursuits that we balk at the automation? If you have a particular style in picking items to fulfill Amazon orders, should that be copyrighted and protected from being used elsewhere?