**Sarah Silverman, Christopher Golden, and Richard Kadrey Sue OpenAI for Copyright Infringement**
In a new class action lawsuit against OpenAI, Sarah Silverman, Christopher Golden, and Richard Kadrey are claiming that the company’s ChatGPT was trained on copyrighted books without permission or compensation. The lawsuit, brought by the same attorney representing authors Paul Tremblay and Mona Awad in a similar case, alleges that OpenAI copied and used their books, including Silverman’s memoir “The Bedwetter,” Golden’s “Ararat,” and Kadrey’s “Sandman Slim,” to train ChatGPT.
ChatGPT, a large language model (LLM) developed by OpenAI, generates human-like responses to text inputs after being trained on vast amounts of text data. The lawsuit argues that because the model’s output depends on its training dataset, which allegedly includes copyrighted material, OpenAI has infringed the authors’ rights.
The plaintiffs’ legal team asserts that ChatGPT itself can be used to prove that OpenAI relied on copyrighted works. The suit claims that ChatGPT is able to accurately summarize the plaintiffs’ books because those books were copied and included in the model’s training data. The suit argues that, although the summaries may contain minor errors, their overall accuracy suggests that ChatGPT retains knowledge of the works it was trained on.
The lawsuit also offers theories about how OpenAI obtained the copyrighted works it allegedly used to train ChatGPT. It points to a July 2020 OpenAI paper that mentions two book collections, Books1 and Books2, used for training. The suit speculates that Books1 may have come from Project Gutenberg, an archive of e-books with expired copyrights, while Books2 may have originated from illegal “shadow library” websites.
Furthermore, the lawsuit references a March 2023 paper by OpenAI that provided no information about the dataset used to train GPT-4. In the paper, OpenAI stated that, because of competitive and safety concerns, it would not disclose details about dataset construction.
Rolling Stone reached out to OpenAI for comment, but the company had not responded at the time of writing.
**Opinion Piece: Editor’s Notes**
It is not surprising to see more authors filing copyright infringement lawsuits against artificial intelligence companies. The rise of powerful language models like ChatGPT raises important legal and ethical questions about intellectual property rights and fair use. Although OpenAI has not yet commented on the lawsuit, its response will likely shape the future of AI development and how copyrighted materials are handled in training datasets.
As language models become increasingly sophisticated, it becomes imperative to address these legal concerns and ensure that AI development respects the rights of creators. Finding a balance between technological advancements and intellectual property protection will be crucial for the continued progress of AI research.