The News Media Alliance (NMA) claims that AI developers heavily rely on illegally scraping copyrighted material from news publications and journalists to train their models, resulting in copyright infringement and competition with news outlets. In an effort to combat these issues, the NMA recommends measures to restrict the ingestion of copyrighted materials and protect publishers. OpenAI's ChatGPT, Google's Bard, and Anthropic's Claude, among other AI models, have faced copyright infringement claims related to their training methods.


The News Media Alliance (NMA) has accused AI developers of illegally scraping copyrighted material from news publications and journalists to train their models, leading to copyright infringement and competition with news outlets.


In a 77-page white paper and accompanying submission to the United States Copyright Office, the NMA claims that the data sets used to train AI models include significantly more content from news publishers than from other sources. As a result, the NMA says, AI models reproduce publisher material in their outputs without permission, infringing copyright.


The NMA argues that AI developers benefit significantly from news publisher content in terms of users, data, brand creation, and advertising revenue, while news publishers face reduced revenues, fewer employment opportunities, and tarnished relationships with their audiences.


To address these issues, the NMA has recommended several measures, including:


  • Declaring that using a publication's content to monetize AI systems harms publishers

  • Implementing various licensing models and transparency measures to restrict the ingestion of copyrighted materials

  • Adopting measures to exclude protected content from third-party websites


The NMA acknowledges the benefits of generative AI, noting that publications and journalists can use it for proofreading, idea generation, and search engine optimization.


The developers of several AI models, including OpenAI's ChatGPT, Google's Bard, and Anthropic's Claude, have faced copyright infringement claims in court over their training methods. Comedian Sarah Silverman sued OpenAI and Meta in July, alleging that the two firms used her copyrighted work to train their AI systems without permission.


Google has stated that it will assume legal responsibility if its customers are alleged to have infringed copyright by using its generative AI products on Google Cloud and Workspace. However, Google's Bard search tool is not covered by this legal protection promise. OpenAI and Google did not immediately respond to a request for comment.


This dispute highlights the ongoing challenges and controversies surrounding the use of AI models trained on copyrighted materials and their impact on copyright, ownership, and competition in various industries.


(BRAYDEN LINDREA, COINTELEGRAPH, 2023)