On May 9, 2024, in X Corp. v. Bright Data Ltd., the U.S. District Court for the Northern District of California dismissed X’s claims alleging that Bright Data’s access to X’s systems, and scraping and selling of publicly available data from X’s platform, violated its Terms of Service and other policies. Bright Data’s business involves various data scraping products and services, such as scraping data on behalf of its customers, selling tools that enable customers to scrape data, and providing IP proxies for customers to use when scraping data.
As an initial matter, the court divided its analysis in two, first looking at X’s claims based on Bright Data’s improper access to X’s systems and then at X’s claims based on Bright Data’s scraping and selling of X’s data. The court easily dismissed the claims based on improper access to X’s systems, including claims for trespass to chattels, violation of California Business and Professions Code Section 17200 (prohibiting unlawful, unfair and fraudulent business acts), tortious interference with contract, and breach of contract. There was nothing particularly surprising in the court’s analysis of these claims, since the dismissals primarily resulted from X’s failure to plead adequate damages arising from Bright Data’s allegedly improper access.
The more interesting aspect of Bright Data came next, where the court considered X’s claim that Bright Data’s scraping and selling of data violated X’s Terms of Service and held that X’s claims were preempted by federal copyright law.
Section 301(a) of the Copyright Act states that federal copyright law preempts state-law claims when a plaintiff’s work “come[s] within the subject matter of copyright” and state law grants “legal or equitable rights that are equivalent to any of the exclusive rights within the general scope of copyright.” The court in Bright Data noted that there are two types of copyright preemption: (i) express preemption with respect to state law rights that are “equivalent” to federal copyright rights, and (ii) conflict preemption, where enforcement of the state law would more generally undermine the purpose of federal copyright law. The court in Bright Data focused on this second type of preemption.
Under X’s Terms of Service, users retain ownership of the images, text, and other content that they upload to the X platform, and grant X a broad, non-exclusive, royalty-free license to display users’ content on the X platform and to use that content in various other ways, including by disseminating the user content through X’s paid tools. The Terms of Service also prohibit scraping and selling any content from the X platform.
As a first step in its preemption analysis, the court notes that, under the Terms of Service, X does not own the user content that Bright Data scraped and sold, and X merely has a non-exclusive license to use the content. Under federal copyright law, a non-exclusive licensee is permitted to use the licensed works within the scope of the license, but has no right to enforce the copyright in, or exclude others from using, the works. Despite this, the claims that X asserted in Bright Data sought to enforce contractual terms that would give X the ability to exclude others from using the content in a manner that would otherwise be available only to the owner (or, possibly, the exclusive licensee) of the copyright in the content, not to a mere non-exclusive licensee.
(As an aside, the court speculates that the reason that X does not require users to assign ownership of their content to X, or grant X an exclusive license to use content, under its Terms of Service is that doing so would impact X’s ability to claim the benefit of the safe harbors under Section 512(a) of the Digital Millennium Copyright Act and Section 230 of the Communications Decency Act, two federal statutes that protect website operators from liability arising from content posted by users. The court notes: “X Corp. wants it both ways: to keep its safe harbors yet exercise a copyright owner’s right to exclude, wresting fees from those who wish to extract and copy X users’ content.” In fact, it is not necessarily true that the availability of these statutory safe harbors turns on whether the website operator holds title (or an exclusive license) to the user content at issue. The more likely reason that X does not claim ownership of user content is that doing so would be viewed as overreach and result in significant backlash among its users. But as this issue is not central to the copyright preemption analysis, we leave it for another day.)
The court then goes on to hold that X’s claim that Bright Data breached the Terms of Service by scraping and selling user content is preempted by federal copyright law in three ways, namely that enforcement of X’s contractual rights would:
- Interfere with copyright owners’ exclusive rights and frustrate federal copyright law, since allowing X to enforce its Terms of Service would give it, a non-exclusive licensee, the ability to exclude others from exercising those rights;
- Frustrate the operation of fair use, which would (absent a contract preventing it) allow use of the user content for certain permitted purposes (note that, as observed by guest blogger Guy Rub in Eric Goldman’s blog, the court seems to treat fair use as something akin to an affirmative right, rather than as the affirmative defense to an infringement claim that it actually is); and
- Restrict use of the type of information that Congress intended to keep free from restraint, namely uncopyrightable items such as likes, user names, short comments, etc. (It is not entirely clear how this can form a basis for preemption given that these items fall outside the scope of copyright law … but we digress.)
Overall, the court presents an interesting new addition to web scraping law, one not addressed in a case that the court still found six occasions to cite, hiQ Labs, Inc. v. LinkedIn Corp. (see one of our previous articles on that case). In light of Bright Data, website operators should be aware that contractual prohibitions contained in terms of service may not be effective to prevent scraping of user content that they make publicly available.