The legal grounds are that the AI is trained using voice lines that can indeed be copyrighted material. Not the voice itself, but the delivered lines.
The problem with that approach is that the resulting AI doesn’t contain any identifiable “copies” of the material that was used to train it. No copying, no copyright. The AI model is not a legally recognizable derivative work.
If the future output of the model that happens to sound very similar to the original voice actor counts as a copyright violation, then human sound-alikes and impersonators would also be in violation and things become a huge mess.
That’s a HUGE assumption you’ve made, and certainly not something that has been tested in court, let alone found to be true.
In the context of existing legal precedent, there’s an argument to be made that the resulting model is itself a derivative work of the copyright-protected works, even if it does not literally contain an identifiable copy, as it is a derivative of the work in the common meaning of the term.
A key distinction here is that a human brain is not a work, and in that sense, a human brain learning things is not a derivative work.
No, I know how these neural nets are trained and how they’re structured. They really don’t contain any identifiable copies of the material used to train them.
Sure, this is brand new tech. It takes time for the court cases to churn their way through the system. If that’s going to be the ultimate arbiter, though, then what’s to discuss in the meantime?
Go back and read my comment in full, please. I addressed that directly.
Also, neural network weights are just a bunch of numbers, and I’m pretty sure data can’t be copyrighted. And yes, images and sounds and video stored on a computer are numbers too, but those can be played back or viewed by a human in a meaningful way, and as such represent a work.
Just being “a bunch of numbers” doesn’t stop it from being a work, it doesn’t stop it from being a derivative work, and you absolutely can copyright data – all digitally encoded works are “just data”.
A trained AI is not a measurement of the natural world. It is a thing that has been created from the processing of other things – in the common sense of the word, it is derivative of those works. What remains, IMO, is the question of whether it would be a work, or something else, and whether that something else would be distinct enough from being a work to matter.
I suggest reading my entire comment.
It’s only a work if your brain is a work. We agree that in a digitized picture, those numbers represent the picture itself and thus constitute a work (which you would have known if you read beyond the first sentence of my comment). The weights that make up a neural network represent encodings into neurons, and as such should be treated the same way as neural encodings in a brain.
I did, buddy. You’re just wrong. You can copyright data. A work can be “just data”. Again, we’re not talking about a set of measurements of the natural world.
Okay, I see that to you a generative model is brain-like, but that’s a hot take – it’s not a legally accepted fact that a trained model is not a work.
You understand that, right? You do get that this hasn’t been debated in court, and what you think is correct is not necessarily how the legal system will rule on the matter, yeah?
Because the argument that a trained generative model is a work is also pretty coherent. It’s a thing that you can distribute, even monetise. It isn’t a person, it isn’t an intelligence, it’s essentially part of a program, and it’s the output of labour performed by someone.
The fact that something models neurons does not mean it can’t be a work. That’s not… coherent. You’ve jumped from A to Z and your argument to get there is “human brain has neurons”. Like, okay? Does that somehow mean anything that is vaguely neuron-like is not a work? So if I make a mechanical neuron, I can’t copyright it? I can’t patent it?
No, that’s absurd.
In that case all work would be derivative.
No? No. Not all work is analogous to training a generative model. That’s a really bizarre thing to say, and I’m shocked to hear it from you.
I’m finally reading a comment from someone who actually knows how machine learning works. Too many people craft their argument before learning about the technology. Well, maybe they think reading a few blog articles counts as research.
Unfortunately, the courts and legislatures may craft their opinions and laws, respectively, without knowing how machine learning actually works.
That’s a decent theoretical legal basis, but the voice lines are property of the game company rather than the voice actors.
If this interpretation of copyright law on AI models turns out to be the outcome of the two (three?) big AI lawsuits related to Stable Diffusion, most AI companies will be completely fucked. Everything from Stable Diffusion to ChatGPT 4 will instantly be in trouble.
Making derivatives of existing game assets is a core part of modding. I don’t see how this is any different from splicing existing voice lines to make them say whatever you want them to say.
Maybe it’s morally wrong to use the work of voice actors for NSFW purposes without their consent, but I’m not sure if it’s illegal from a copyright standpoint.