AI and Law

Kyle Wiggers, over at TechCrunch, informs us “The current legal cases against generative AI are just the beginning.” Microsoft, GitHub and OpenAI are currently being sued in a class action over Copilot, a code-generating AI system trained on public code. The suit claims that the companies are violating copyright law by creating a system that generates licensed code snippets yet provides no credit. Midjourney and Stability AI are being sued over claims that their generative AI art tools were trained on images from the web. Stability AI has also been taken to court by Getty Images for using their data for training without permission.

A trend seems to be emerging.

This lengthy and detailed article tells us:

At issue, mainly, is generative AI’s tendency to replicate images, text and more—including copyrighted content—from the data that was used to train it. In a recent example, an AI tool used by CNET to write explanatory articles was found to have plagiarized articles written by humans, articles presumably swept up in its training dataset. Meanwhile, an academic study published in December found that image-generating AI models like DALL-E 2 and Stable Diffusion can and do replicate aspects of images from their training data.

Key Points

  • Legal experts warn that generative AI tools could put companies at risk of copyright violation. Some websites have banned AI-generated content for fear of legal reprisals. Others counter that without a “smoking gun” (a system that exactly reproduces the material it was trained on) legal action will be difficult. 
  • Generative AI systems can produce works “in the style of” a particular artist, but copyrighting style has proved notoriously difficult.
  • Everyone seems to agree that copyright law must be updated to reflect the new technology. Legal precedent, so far, seems to tilt in favor of generative AI. For example, the U.S. Court of Appeals ruled that Google’s scanning of millions of copyrighted books without a license constituted fair use.
  • Wiggers also introduces us to a term from the Federal Trade Commission: algorithmic disgorgement. While this may bring to mind a digital ipecac, it is a doctrine holding that companies cannot profit from illegally collected data, either directly or through the algorithms trained on it, akin to the “fruit of the poisonous tree” construct.