Google’s AI Chatbot Is Trained by Humans Who Say They’re Overworked, Underpaid and Frustrated

LollerCorleone@kbin.social · 1 year ago

MoogleMaestro@kbin.social · 1 year ago

Surprising absolutely nobody.

This is going to be a big issue until some major government steps in to do something about the inequities of GAN-based AI training models and the human exploitation it is currently revolving around. Humans should own rights to the inputs being fed into these generative models and companies should be paying royalties to use them!

IncognitoErgoSum@kbin.social · 1 year ago

Wow, that’s a great way to immediately drain all of the potential out of what could be a really amazing technology, and absolutely prevent any open source competitor from ever coming into existence, so in the best case we’ll all be paying Google and OpenAI monthly forever for access to knowledge that ought to be free. What we need are unions and laws that enforce better labor conditions across the board.

MoogleMaestro@kbin.social · edit-2 1 year ago

Wow, that’s a great way to immediately drain all of the potential out of what could be a really amazing technology, and absolutely prevent any open source competitor from ever coming into existence, so in the best case we’ll all be paying Google and OpenAI monthly forever for access to knowledge that ought to be free.

I mean, if people cannot afford to pay for the rights to certain works, they shouldn’t use them as data. It’s actually very simple to say that you need to own the rights to the inputs in order to own the rights over the outputs, and I don’t think it “stifles” anything. For example, if you don’t own the rights to the original copy of Star Wars, you obviously wouldn’t own any rights over the output of an upscaled Star Wars. Same goes for writing or other “transformative” media, and it has been this way for a long time (see: audio sampling).

This would keep AI companies honest. I have no problems with them recreating the voice of Darth Vader via AI since it was an ethically condoned business and the assets were properly licensed and sourced. Other AI projects haven’t been doing this, and voice-over artists have been (rightfully) calling them out.

Edit: Also, working in open source means having a proper understanding of licensing and ownership. Open source doesn’t mean “free this and free that” – in fact, many AI-based code assistance tools are actually hurting the open source initiative by not properly respecting the licenses of the code bases they’re studying from.

IncognitoErgoSum@kbin.social · 1 year ago

Also, working in open source means having a proper understanding of licensing and ownership. Open source doesn’t mean “free this and free that” – in fact, many AI-based code assistance tools are actually hurting the open source initiative by not properly respecting the licenses of the code bases they’re studying from.

Don’t be patronizing. I’ve been involved in open source for 20+ years, and I know plenty about licensing.

What you’re talking about is changing copyright law so that you’ll have to license content in order for an AI to learn concepts from that content (in other words, to be able to summarize it, learn facts from it, learn an art style, and so on). This isn’t how copyright law currently works, and I hope to god it stays that way.

For example, if you don’t own the rights to the original copy of Star Wars, you obviously wouldn’t own any rights over the output of an upscaled Star Wars. Same goes for writing or other “transformative” media, and it has been this way for a long time (see: audio sampling).

That’s not the same thing as training an AI on Star Wars. If you feed Star Wars into an upscaling AI, the AI is processing each frame and creating an output that’s a derivative work of that frame, and the result of that isn’t something you would be allowed to release without a license. If you train it on Star Wars, the AI would learn general concepts from Star Wars, and not be able to produce an upscaled version of the movie verbatim (although depending on the AI, it may be able to produce images in the general style of Star Wars or summarize the movie).

An appropriate analogy for what’s going on here would be reading a book and then talking about the facts I learned from that book, which is in no way a violation of copyright law. If I started quoting long sections of that book verbatim, I would need a license from the author, but that’s not how AI works. It’s not learning the sentences those people type verbatim, it’s picking up concepts and facts from them. Even if I were to memorize the book from cover to cover, I would be in the clear as long as I didn’t actually start reproducing the book in some way. Neural networks are learning machines, not databases. Their purpose isn’t to reproduce information verbatim.

If you’re still not clear on the difference between training on data and processing it, let me know and I’ll try to clarify further.
