On May 9, 2024, a California district court dismissed, with leave to amend, the complaint brought by social media platform X Corp. (formerly Twitter) against data provider Bright Data Ltd. (“Bright Data”) over Bright Data’s alleged scraping of publicly available data from X for use in data products sold
Web Publisher Seeks Injunctive Relief to Address Web Scraper’s Domain Name Maneuvers Intended to Avoid Court Order
Late last year, Chegg Inc. (“Chegg”), an online learning platform, obtained a preliminary injunction based on allegations that the various operators of the Homeworkify website (“Defendants”) – which allows users to view Chegg’s paywalled solutions without creating an account – violated the Computer Fraud and Abuse Act (CFAA). (Chegg …
California Court Issues Noteworthy Decision on Breach of Contract Claims in Web Scraping Dispute
On January 23, 2024, a California district court released its opinion in a closely watched scraping dispute between the social media platform Meta and data provider Bright Data Ltd. (“Bright Data”) over Bright Data’s alleged scraping of publicly available data from Facebook and Instagram for use in data products sold to third…
Another Web Scraping Dispute Focused on Travel Data
- Flight and travel data has always been valuable for data aggregators and online travel services and has prompted litigation over the years.
- Latest suit from Air Canada against a rewards travel search site raises some interesting liability issues under the CFAA.
- The implications of this case, if the plaintiffs are successful, could impact the legal analysis of web scraping in a variety of circumstances, including for the training of generative AI models.
In a recent post, we recounted the myriad issues raised by recently filed data scraping suits involving job listings, company reviews and employment data. Soon after, another interesting scraping suit was filed, this time by a major airline against an award travel search site that aggregates fare and award travel data. Air Canada alleges that Defendant Localhost LLC (“Localhost” or “Defendant”), operator of the Seats.aero website, unlawfully bypassed technical measures and violated Air Canada’s website terms when it scraped “vast amounts” of flight data without permission and purportedly caused slowdowns to Air Canada’s site and other problems. (Air Canada v. Localhost LLC, No. 23-01177 (D. Del. Filed Oct. 19, 2023)).[1]
The complaint alleges that Localhost harvested data from Air Canada’s site and systems to populate the seats.aero site, which claims to be “the fastest search engine for award travel.”
It also alleges that, in addition to scraping the Air Canada website, Localhost engaged in “API scraping” by impersonating authorized requests to Air Canada’s application programming interface.
Help Wanted: Site Owner Brings New Scraping Suits Focused on Job Posting and Employment Data
UPDATE: On February 5, 2024, the California district court granted the defendant Aspen Technology Labs, Inc.’s motion to dismiss Jobiak LLC’s web scraping complaint for lack of personal jurisdiction, with leave to amend. The court found that Jobiak had not adequately alleged that its copyright and tort-related claims arose out of the defendant’s forum-related activities and that there were no allegations that Jobiak’s database or website was hosted on servers in the California forum. On March 8, 2024, the court dismissed the action with prejudice, as Jobiak did not submit an amended complaint within the time allowed by the court.
In recent years there has been a great demand for information about job listings, company reviews and employment data. Recruiters, consultants, analysts and employment-related service providers, amongst others, are aggressively scraping job-posting sites to extract that type of information. Recall, for example, the long-running, landmark hiQ scraping litigation over the scraping of public LinkedIn data.
The two most recent disputes regarding scraping of employment and job-related data were brought by Jobiak LLC (“Jobiak”), an AI-based recruitment platform. Jobiak filed two nearly identical scraping suits in California district court alleging that competitors unlawfully scraped its database and copied its optimized job listings without authorization. (Jobiak LLC v. Botmakers LLC, No. 23-08604 (C.D. Cal. Filed Oct. 12, 2023); Jobiak LLC v. Aspen Technology Labs, Inc., No. 23-08728 (C.D. Cal. Filed Oct. 17, 2023)).
Data Scraper’s Declaratory Action Seeking Green Light to Scrape LinkedIn Survives Motion to Dismiss
On November 15, 2022, a California district court declined to dismiss a declaratory judgment action brought by a data scraper, 3taps, Inc. (“3taps”), against LinkedIn Corp. (“LinkedIn”). (3taps, Inc. v. LinkedIn Corp., No. 18-00855 (N.D. Cal. Nov. 15, 2022)). 3taps is seeking an order to clarify whether the federal Computer Fraud and Abuse Act (CFAA) (or its California state law counterpart) prevents it from accessing and using publicly-available data on LinkedIn, and whether scraping such data would also subject it to an action brought by LinkedIn for breach of contract or trespass.
This is not 3taps’s first experience with scraping litigation (see prior post). But if this dispute sounds strangely familiar and reminiscent of the long-running dispute between hiQ Labs and LinkedIn (which we’ve followed closely), it is. The 3taps action traces its origin, in part, to the original hiQ ruling in August 2017, in which the same judge first granted a preliminary injunction in favor of hiQ, enjoining LinkedIn from blocking hiQ’s access to LinkedIn members’ public profiles. Following that ruling, 3taps sent a letter to LinkedIn stating that it also intended to scrape publicly available data from LinkedIn. LinkedIn responded that, although it was not considering legal action against 3taps, “any further access by 3taps to the LinkedIn website and LinkedIn’s servers is without LinkedIn’s or its members’ authorization.” Thus, the hiQ ruling, 3taps’s letter to LinkedIn, and LinkedIn’s reply were the genesis of the current declaratory judgment action filed by 3taps against LinkedIn.[1]
District Court Decision Brings New Life to CFAA to Combat Unwanted Scraping
On October 24, 2022, a Delaware district court held that certain claims under the Computer Fraud and Abuse Act (CFAA) relating to the controversial practice of web scraping were sufficient to survive the defendant’s motion to dismiss. (Ryanair DAC v. Booking Holdings Inc., No. 20-01191 (D. Del. Oct. 24, 2022)). The opinion potentially breathes life into the use of the CFAA to combat unwanted scraping.
In the case, Ryanair DAC (“Ryanair”), a European low-fare airline, brought various claims against Booking Holdings Inc. (and its well-known suite of online travel and hotel booking websites) (collectively, “Defendants”) for allegedly scraping the ticketing portion of the Ryanair site. Ryanair asserted that the ticketing portion of the site is only accessible to logged-in users and therefore the data on the site is not public data.
The decision is important as it offers answers (at least from one district court) to several unsettled legal issues about the scope of CFAA liability related to screen scraping. In particular, the decision addresses:
- the potential for vicarious liability under the CFAA (which is important as many entities retain third party service providers to perform scraping)
- how a data scraper’s use of evasive measures (e.g., spoofed email addresses, rotating IP addresses) may be considered under a CFAA claim centered on an “intent to defraud”
- clarification as to the potential role of technical website-access limitations in analyzing CFAA “unauthorized access” liability
To find answers to these questions, the court’s opinion distills the holdings of two important CFAA rulings from this year – the Supreme Court’s holding in Van Buren, which adopted a narrow interpretation of “exceeds authorized access” under the CFAA, and the Ninth Circuit’s holding in the hiQ screen scraping case, where that court found that the concept of “without authorization” under the CFAA does not apply to “public” websites.
DOJ Revises Policy for CFAA Prosecution to Reflect Developments in Web Scraping and Other Matters
On May 19, 2022, the Department of Justice (DOJ) announced that it had revised its policy regarding prosecution under the federal anti-hacking statute, the Computer Fraud and Abuse Act (CFAA). Since the DOJ last made changes to its CFAA policy in 2014, there have been a number of relevant developments in technology and business practices, most notably related to web scraping. Among other things, the revised policy reflects aspects of the evolving views of this sometimes-controversial statute and the outcome of two major CFAA court decisions in the last year (the Ninth Circuit hiQ decision and the Supreme Court’s Van Buren decision), both of which adopted a narrow interpretation of the CFAA in situations beyond a traditional outside computer hacker scenario.
While the DOJ’s revised CFAA policy is only binding on federal CFAA criminal prosecution decisions (and could be amended by subsequent Administrations) and does not directly affect state prosecutions (including under the many state versions of the CFAA) or civil litigation in the area, it is likely to be relevant and influential in those situations as well, and in particular, with respect to web scraping. It seems that even the DOJ has conceded that the big hiQ and Van Buren court decisions have mostly (but not entirely) eliminated the threat of criminal prosecution under the CFAA when it comes to the scraping of “public” data. Still, as described below, the DOJ’s revisions to its policy, as written, are not entirely consistent with the hiQ decision.
Southwest Airlines Wins Injunction Barring Travel Site from Scraping
UPDATE: On December 23, 2021, the parties reached a settlement, as Southwest filed an unopposed motion for entry of final judgment and a permanent injunction containing the same restrictions as the preliminary injunction issued in September. Under the proposed permanent injunction, Kiwi would be barred from scraping flight and fare information from Southwest’s site, publishing any Southwest flight or fare information on Kiwi’s site or app (or selling any Southwest flights), or otherwise using Southwest’s site for any commercial purpose or in a manner that violates Southwest’s site terms.
UPDATE: On November 1, 2021, the parties filed a Joint Notice of Settlement indicating that they have reached a settlement agreement in principle. The terms of the settlement were not disclosed.
UPDATE: On October 28, 2021, the defendant Kiwi.com, Inc. filed a notice of appeal to the Fifth Circuit seeking review of the district court’s ruling granting Southwest Airlines Co.’s motion for a preliminary injunction.
On September 30, 2021, a Texas district court granted Southwest Airlines Co.’s (“Southwest”) request for a preliminary injunction against online travel site Kiwi.com, Inc. (“Kiwi”), barring Kiwi from, among other things, scraping fare data from Southwest’s website and committing other acts that violate Southwest’s terms. (Southwest Airlines Co. v. Kiwi.com, Inc., No. 21-00098 (N.D. Tex. Sept. 30, 2021)). Southwest is no stranger to seeking and, in most cases, obtaining injunctive relief against businesses that have harvested its fare data without authorization – dating as far back as the 2000s (see, e.g., Southwest Airlines Co. v. BoardFirst, LLC, No. 06-0891 (N.D. Tex. Sept. 12, 2007) (a case cited in the current court opinion)), and as recently as two years ago, when we wrote about a 2019 settlement Southwest entered into with an online entity that scraped Southwest’s site and had offered a fare notification service, all contrary to Southwest’s terms.
In this case, the Texas court found that Southwest had established a likelihood of success on the merits of its breach of contract claim. Rejecting Kiwi’s arguments that it did not assent to Southwest’s terms, the court found that Kiwi had knowledge of and assented to the terms in multiple ways, including by agreeing to the terms when purchasing tickets on Southwest’s site. In all, the court found the existence of a valid contract and Kiwi’s likely breach of the terms, which prohibit scraping Southwest’s flight data and selling Southwest flights without authorization. The court also found that Southwest made a sufficient showing that Kiwi’s scraping and unauthorized sale of tickets, if not barred, would result in irreparable harm. In ultimately granting Southwest’s request for a preliminary injunction, the Texas court also found that Southwest had demonstrated that the threatened injury if the injunction were denied outweighed any harm to Kiwi that would result if the injunction were granted, and that the injunction would be in the public interest.
What makes this result particularly notable is that the preliminary injunction is based on the likelihood of success on the merits of Southwest’s breach of contract claim and Kiwi’s alleged violation of Southwest’s site terms, as opposed to other recent scraping disputes, which have centered on claims of unauthorized access under the federal Computer Fraud and Abuse Act (CFAA).
Trove of Online LinkedIn User Data Fuels LinkedIn’s Anti-Scraping Position
Last week, the Italian data protection authority (the “GPDP”) opened an investigation after reports that a dataset allegedly containing data compiled from 500 million LinkedIn profiles and other websites was available for sale on a hacker forum. Apparently, this data represents more than two-thirds of LinkedIn’s estimated 740 million users. The hacker reportedly posted approximately two million records visibly online as evidence of the dataset, and offered to sell the rest for an undisclosed bitcoin payment.
According to a statement by LinkedIn, the company investigated the posting and determined that it is “an aggregation of data from a number of websites and companies,” including publicly viewable LinkedIn member profile data that apparently was scraped from LinkedIn’s site. LinkedIn stated that it was not a data breach because no private member profile data was included in the dataset it was able to review. LinkedIn stated that such scraping of data violated its terms.
The posting of this scraped data immediately reminds us of the ongoing scraping dispute between LinkedIn and data analytics start-up hiQ Labs, Inc. (“hiQ”). The principal issue in the case concerns the scope of Computer Fraud and Abuse Act (CFAA) liability associated with web scraping of publicly available social media profile data. In a prior ruling, the Ninth Circuit affirmed the lower court’s order granting a preliminary injunction barring LinkedIn from blocking hiQ from accessing and scraping publicly available LinkedIn member profiles.