• 0 Posts
  • 87 Comments
Joined 1 year ago
Cake day: June 1st, 2023



  • In the movie industry, everyone usually signs a work-for-hire contract that specifies who will have the rights to the completed film.

    However, in a recent case the director (Alex Merkin) did not sign a contract and then tried to claim copyright afterwards. The court said that directors have no inherent copyright over film:

    We answer that question in the negative on the facts of the present case, finding that the Copyright Act’s terms, structure, and history support the conclusion that Merkin’s contributions to the film do not themselves constitute a “work of authorship” amenable to copyright protection. … As a general rule, the author is the party who actually creates the work, that is, the person who translates an idea into a fixed, tangible expression entitled to copyright protection. … But a director’s contribution to an integrated “work of authorship” such as a film is not itself a “work of authorship” subject to its own copyright protection.



  • Simple question:

    If you are a college student, learning to write professionally, is it fair use to download copyrighted books from Z-Library in order to become a better writer? If you are a musician, is it fair use to download mp3s from The Pirate Bay in order to learn about musical styles? How about film students, can they torrent Disney movies as part of their education?

    I’m certain that every court in the US would rule that this is not fair use. It’s not fair use even if pirated content ultimately teaches a student how to create original, groundbreaking works of writing, music, and film.

    Simply being a student does not give someone a free pass to pirate content. The same is true of training an AI, and there are already reports that pirated material is in the OpenAI training set.

    If OpenAI could claim fair use, then almost by definition The Pirate Bay could claim fair use too.


  • Again, it’s not a question of reproducing books in an LLM. The allegation is that the OpenAI developers downloaded books illegally to train their AI.

    You need to pay for your copy of a book. That’s true if you are a student teaching yourself to write, and it’s also true if you are an AI developer training an AI to write. In the latter case, you might also need to pay for a special license.

    Is it possible that the OpenAI developers can bring the receipts showing they paid for each and every book and/or license they needed to train their AI? Sure, it’s possible. If so, the lawyers who brought the suit would look pretty silly for not even bothering to check.

    But OpenAI used a whole lot of books, which cost a whole lot of money. So I wouldn’t hold my breath.


  • the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use upon the potential market.

    Yes, and I named three of those factors:

    the key questions are often whether the use of the work (a) is commercial, or (b) may substitute for the original work. Furthermore, the amount of the work copied is also considered.

    And while you don’t need to meet all the criteria, the odds are pretty long when you fail three of the four (commercial nature, copying a complete work rather than a portion, and negative effect on the market for the original).

    Think of it this way: if it were legal to download books in order to train an AI, then it would also be legal to download books in order to train a human student. After all, why would a human have fewer rights than an AI?

    Do you really think courts are going to decide that it’s ok to download books from The Pirate Bay or Z-Library, provided they are being read by the next generation of writers?


  • If a musician doesn’t have the right to their own work, it’s because someone offered to pay them for the rights and they accepted.

    Is that in their favor? I think so, considering the alternative is to not get paid and not have rights to their work.

    And not to go too far off topic, but publicly funded research is generally not aimed at drug development, it is aimed at discovering the basic science behind how the body works (human body or otherwise).

    If you want a clinical trial that proves a particular drug can actually help patients, you will need to find a company to pay for it. The government almost never pays for clinical trials (I think the COVID vaccine might have been an exception). Clinical trials are far more expensive than basic science, and patents are the carrot to get the private sector to pay for them.




  • I know the model doesn’t contain a copy of the training data, but it doesn’t matter.

    If the copyrighted data is downloaded at any point during training, that’s an IP violation. Even if it is immediately deleted after being processed by the model.

    As an analogy, if you illegally download a Disney movie, watch it, write a movie review, and then delete the file … then you still violated copyright. The movie review doesn’t contain the Disney movie and your computer no longer has a copy of the Disney movie. But at one point it did, and that’s all that matters.


  • If they bought physical books then the lawsuit might happen, but it would be much harder to win.

    If they bought e-books, then it might not have helped the AI developers. When you buy an e-book you are just buying a license, and the license might restrict what you can do with the text. If an e-book license prohibits AI training (and they will in the future, if they don’t already) then buying the e-book makes no difference.

    Anyway, I expect that in the future publishers will make sets of curated data available for AI developers who are willing to pay. Authors who want to participate will get royalties, and developers will have a clear license to use the data they paid for.


  • When determining whether something is fair use, the key questions are often whether the use of the work (a) is commercial, or (b) may substitute for the original work. Furthermore, the amount of the work copied is also considered.

    Search engine scrapers are fair use, because they only copy a snippet of a work and a search result cannot substitute for the work itself. Likewise if you copy an excerpt of a movie in order to critique it, because consumers don’t watch reviews as a substitute for watching movies.

    On the other hand, OpenAI is accused of copying entire works, and OpenAI is explicitly intended as a replacement for hiring actual writers. I think it is unlikely to be considered fair use.

    And in practice, fair use is not easy to establish.


  • The question “what is sufficient” basically amounts to convincing an official that the final work reflects some form of your creative expression.

    So for instance, if you are hired to take AI-generated output and crop it to a 29:10 image, that probably won’t be eligible for copyright. You aren’t expressing your creativity, you are doing something anyone else could do.

    On the other hand, if you take AI-generated output and edit it in Photoshop to the point that everyone says “Hey, that looks like a ThunderingJerboa image”, then you would almost certainly be eligible for copyright.

    Everyone else falls in between, trying to convince someone that they are more like the latter case. Which is good, because it means actual artists will be rewarded.




  • You put as much effort into it as you would anything else.

    Copyright is not meant to reward effort. This is a common misconception. Thirty years ago there was a landmark SCOTUS case about copyrighting a phone book. Back then, collecting and verifying phone numbers and addresses took a tremendous amount of effort. Somebody immediately copied the phone book, and the creators of the phone book argued that their effort should be rewarded with copyright protection.

    The courts shot that down. Copyright is not about effort, it’s about creative expression. Creative expression can require major effort (Sistine Chapel) or take very little effort (duck lips photo). Either way, it’s rewarded with a copyright.

    Assembling a database is not creative expression. Neither is judging whether an AI-generated work is suitable. Nor is pointing out what you’d like to see in a new AI-generated work. So no matter how much effort one puts into those activities, they are not eligible for copyright.

    To the extent that an artist takes an AI-generated work and adds their own creative expression to it, they can claim copyright over the final result. But in all the cases in which AI-generated works were ruled ineligible, the operator was simply providing prompts and/or approving the final result.


  • It’s not actually called “theft” or “stealing”, it’s called “infringement” or “violation”. Infringement is to intellectual property as trespassing is to real estate. The owners are still able to use their property, but their rights to it have nevertheless been violated.

    Also, corporations cannot create intellectual property. They can only offer to buy it from the natural persons who created it. Without IP protection, creators would lose the only protections they have against corporations and other entrenched interests.

    Imagine seeing all your family photos plastered on a McDonald’s billboard, or in a political ad for a candidate you despise. Imagine being told, “Sorry, you can’t stop them from using your photos however they want”. That’s a world without IP protection.



  • No, under copyright law it would be your work and your work alone.

    Someone who is providing suggestions or prompts to you is not eligible to share the copyright, no matter how detailed those suggestions are. They must actually create part of the work themselves.

    So for instance, if you are in a recording studio then you will have the full copyright over music that you record, no matter how much advice or how many suggestions you get from other people in the studio with you. Your instruments/voice/lyrics, your copyright.

    Otherwise copyright law would be a constant legal quagmire with those who gave you suggestions/prompts/feedback! Remember, an idea cannot be copyrighted, and prompts are ideas.

    In the case of Stable Diffusion, the copyright would go to Stable Diffusion alone if it were a human. But Stable Diffusion is not a human, so there is no copyright at all.