Meta CEO Mark Zuckerberg approved the use of pirated books to train AI: Court filing

Source Cryptopolitan

According to a court filing, Meta Platforms trained its AI models using pirated versions of copyrighted books with the approval of its CEO, Mark Zuckerberg.

According to newly disclosed court papers, a group of authors alleges that the social media giant knew it was using pirated works to train its AI systems.

Internal documents from Meta “reveal” the claims

In their court filing, the authors said internal documents produced by Meta during the discovery process showed that the company knew the books were pirated. According to The Guardian, Meta CEO Zuckerberg backed the use of the LibGen dataset, a huge online archive of books. This was despite warnings within the company’s AI executive team that it is a dataset “we know to be pirated.”

US author Ta-Nehisi Coates, comedian Sarah Silverman, and other writers suing the company for copyright infringement made the accusations in filings unsealed on Wednesday in a California federal court.

The authors took Meta to court in 2023 on allegations that the social media company was misusing their books to train AI models, specifically Llama, its large language model that powers its chatbots.

Originating in Russia, Library Genesis, or LibGen, is a “shadow library” that claims to contain millions of novels, nonfiction books, and science magazine articles.

In 2024, a New York federal court ordered LibGen’s anonymous operators to pay a group of publishers $30 million in damages for copyright infringement.

The Meta case is one of many alleging that copyrighted work by authors, artists, and others was used to train generative AI tools such as the ChatGPT chatbot without the owners’ consent. Creative professionals have warned that using their work without consent endangers their business models.

According to Reuters, defendants have, however, argued that they made fair use of copyrighted material.

The judge allowed the authors to file an amended complaint

In the Meta case, the authors reportedly asked the court on Wednesday for permission to file an updated complaint. They argued that new evidence showed the social network firm used the AI training dataset LibGen, which includes millions of pirated works, and distributed it through peer-to-peer torrents.

According to them, Zuckerberg “approved Meta’s use of the LibGen dataset notwithstanding concerns within Meta’s AI executive team (and others at Meta) that LibGen is ‘a dataset we know to be pirated.’”

The filing also cites a memo that referred to Zuckerberg’s initials, noting that “after escalation to MZ” Meta’s AI team “has been approved to use LibGen.”

Last year, a US district judge, Vince Chhabria, dismissed claims that text generated by Meta’s AI models infringed the authors’ copyrights and that Meta unlawfully stripped the books’ copyright management information. Copyright management information is information about a work, including its title, the name of its author, and its copyright owner.

The plaintiffs were, however, permitted to amend their claims. In their arguments this week, the authors said the evidence bolstered their infringement claims and justified reviving their copyright management information claim and adding a new computer fraud allegation.

During a Thursday hearing, the judge said he would allow the authors to file an amended complaint, although he was skeptical about the merits of the fraud claim.
