On May 9, 2024, a California district court dismissed, with leave to amend, the complaint brought by social media platform X Corp. (formerly Twitter) against data provider Bright Data Ltd. (“Bright Data”) over Bright Data’s alleged scraping of publicly available data from X for use in data products sold

Generative AI has been most synonymous in the public mind with “AI” since the commercial breakout of ChatGPT in November 2022. Consumers and businesses have seen the fruits of impressive innovation in various generative models’ ability to create audio, video, images and text, analyze and transform data, perform Q&A chatbot

On January 23, 2024, a California district court released its opinion in a closely watched scraping dispute between the social media platform Meta and data provider Bright Data Ltd. (“Bright Data”) over Bright Data’s alleged scraping of publicly available data from Facebook and Instagram for use in data products sold to third

Last week, OpenAI rolled out ChatGPT Team, a flexible subscription structure for small-to-medium-sized businesses (with two or more users) that are not large enough to warrant the expense of a ChatGPT Enterprise subscription (which requires a minimum of 150 licensed users). Although less expensive than its Enterprise counterpart, ChatGPT Team provides access to the latest OpenAI models with the robust privacy, security and confidentiality protections that previously applied only to the ChatGPT Enterprise subscription and that are far more protective than the terms that govern ordinary personal accounts. This development could be the proverbial “game changer” for smaller businesses: for the first time, they can access tools previously available only to OpenAI Enterprise customers, under OpenAI’s more favorable Business Terms and the privacy policies listed on the Enterprise Privacy page, without making the financial or technical commitment required under an Enterprise relationship.

Thus, for example, ChatGPT Team customers would be covered by the Business Terms’ non-training commitment (OpenAI’s Team announcement states: “We never train on your business data or conversations”), and by other data security controls, as well as OpenAI’s “Copyright Shield,” which offers indemnity for customers in the event that a generated output infringes third-party IP.[1] Moreover, under the enterprise-level privacy protections, customers can also create custom GPT models that are for in-house use and not shared with anyone else.

As noted above, until now, the protections under the OpenAI Business Terms were likely beyond reach for many small and medium-sized businesses, either because of the financial commitment required by OpenAI’s Enterprise agreement or because they lacked the technical infrastructure necessary to implement the OpenAI API Service. In the past, such smaller entities might have resorted to having employees use free or paid OpenAI products under individual accounts, with internal precautions (like restrictive AI policies) in place to avoid confidentiality and privacy concerns.[2]

As we’ve seen over the last year, one generative AI provider’s rollout of a new product, tool or contractual protection often results in other providers following suit. Indeed, earlier this week Microsoft announced that it is “expanding Copilot for Microsoft 365 availability to small and medium-sized businesses.” With businesses of all sizes using, testing or developing custom GAI products to stay abreast of the competition, we will watch for future announcements from other providers about more flexible licensing plans for small-to-medium-sized businesses.

On December 19, 2023, AI research company Anthropic announced that it had updated and made publicly available its Commercial Terms of Service (effective Jan 1, 2024) to, among other things, indemnify its enterprise Claude API customers from copyright infringement claims made against them for “their authorized use of our services

  • Flight and travel data has always been valuable for data aggregators and online travel services and has prompted litigation over the years.
  • The latest suit, brought by Air Canada against a rewards travel search site, raises some interesting liability issues under the CFAA.
  • If the plaintiff is successful, this case could affect the legal analysis of web scraping in a variety of circumstances, including for the training of generative AI models.

In a recent post, we recounted the myriad issues raised by recently filed data scraping suits involving job listings, company reviews and employment data. Soon after, another interesting scraping suit was filed, this time by a major airline against an award travel search site that aggregates fare and award travel data. Air Canada alleges that Defendant Localhost LLC (“Localhost” or “Defendant”), operator of the Seats.aero website, unlawfully bypassed technical measures and violated Air Canada’s website terms when it scraped “vast amounts” of flight data without permission and purportedly caused slowdowns to Air Canada’s site and other problems. (Air Canada v. Localhost LLC, No. 23-01177 (D. Del. filed Oct. 19, 2023)).[1]

The complaint alleges that Localhost harvested data from Air Canada’s site and systems to populate the seats.aero site, which claims to be “the fastest search engine for award travel.” 

It also alleges that, in addition to scraping the Air Canada website, Localhost engaged in “API scraping” by impersonating authorized requests to Air Canada’s application programming interface.

In a previous post, we highlighted three key items to look out for when assessing the terms and conditions of generative artificial intelligence (“GAI”) tools: training rights, use restrictions and responsibility for outputs. With respect to responsibility for outputs specifically, we detailed Microsoft’s shift away, through its Copilot Copyright Commitment (discussed in greater detail below), from the blanket disclaimer of all responsibility for GAI tools’ outputs that we initially saw from most GAI providers.

In the latest expansion of intellectual property protection offered by a major GAI provider, OpenAI’s CEO Sam Altman announced to OpenAI “DevDay” conference attendees that “we can defend our customers and pay the costs incurred if you face legal claims around copyright infringement, and this applies both to ChatGPT Enterprise and the API.”

In the first half of 2023, a deluge of new generative artificial intelligence (“GAI”) tools hit the market, with companies ranging from startups to tech giants rolling out new products. In the large language model space alone, we have seen OpenAI’s GPT-4, Meta’s LLaMA, Anthropic’s Claude 2, Microsoft’s Bing AI, and others.

A proliferation of tools has meant a proliferation of terms and conditions. Many popular tools have both a free version and a paid version, each subject to different terms, and several providers also have ‘enterprise’ grade tools available to the largest customers. For businesses looking to trial GAI, the number of options can be daunting.

This article sets out three key items to check when evaluating a GAI tool’s terms and conditions. Although determining which tool is right for a particular business is a complex question that requires an analysis of terms and conditions in their entirety – not to mention nonlegal considerations like pricing and technical capabilities – the items below can provide prospective customers with a starting place, as well as a bellwether to help spot terms and conditions that are more or less aggressive than the market standard.

UPDATE: On February 5, 2024, the California district court granted the defendant Aspen Technology Labs, Inc.’s motion to dismiss Jobiak LLC’s web scraping complaint for lack of personal jurisdiction, with leave to amend. The court found that Jobiak had not adequately alleged that its copyright and tort-related claims arose out of the defendant’s forum-related activities and that there were no allegations that Jobiak’s database or website was hosted on servers in the California forum.  On March 8, 2024, the court dismissed the action with prejudice, as Jobiak did not submit an amended complaint within the time allowed by the court.  

In recent years there has been a great demand for information about job listings, company reviews and employment data.   Recruiters, consultants, analysts and employment-related service providers, amongst others, are aggressively scraping job-posting sites to extract that type of information. Recall, for example, the long-running, landmark hiQ scraping litigation over the scraping of public LinkedIn data.

The two most recent disputes regarding scraping of employment and job-related data were brought by Jobiak LLC (“Jobiak”), an AI-based recruitment platform. Jobiak filed two nearly identical scraping suits in California district court alleging that competitors unlawfully scraped its database and copied its optimized job listings without authorization. (Jobiak LLC v. Botmakers LLC, No. 23-08604 (C.D. Cal. filed Oct. 12, 2023); Jobiak LLC v. Aspen Technology Labs, Inc., No. 23-08728 (C.D. Cal. filed Oct. 17, 2023)).