In recent years there has been a great demand for information about job listings, company reviews and employment data. Recruiters, consultants, analysts and employment-related service providers, amongst others, are aggressively scraping job-posting sites to extract that type of information. Recall, for example, the long-running, landmark hiQ scraping litigation over the scraping of public LinkedIn data.
The two most recent disputes regarding scraping of employment and job-related data were brought by Jobiak LLC (“Jobiak”), an AI-based recruitment platform. Jobiak filed two nearly-identical scraping suits in California district court alleging that competitors unlawfully scraped its database and copied its optimized job listings without authorization. (Jobiak LLC v. Botmakers LLC, No. 23-08604 (C.D. Cal. Filed Oct. 12, 2023); Jobiak LLC v. Aspen Technology Labs, Inc., No. 23-08728 (C.D. Cal. Filed Oct. 17, 2023)).
Jobiak specializes in AI-based optimization services for publishing job postings on “Google for Jobs.” According to Jobiak’s website, Jobiak’s AI-platform “optimizes for over 25 ‘signals’ that factor into Google for Jobs rankings” to achieve improved search results for job posts. It goes on to explain that the “complex schema that Google requires for your job to appear on Google for Jobs” requires certain coding expertise and system level access that may be beyond many recruiters or others. Beyond these services, Jobiak also offers its ‘All Jobs’ AI-based jobs platform at <alljobs.ai> that allows users to search “100% of all jobs in real-time.” Jobiak claims that its database of all online job listings on alljobs.ai has been organized and optimized to be easily searchable and meet Google’s schema requirements. The complaints assert that the defendants unlawfully scraped and used Jobiak’s proprietary job database to populate their own competing sites, leading to financial losses for Jobiak.
Copyright issues are front and center in the recent flurry of complaints filed relating to the training of generative AI models. However, copyright is not often found in complaints that focus on data scraping, where the data being scraped may not be protectable by copyright. (In these cases, it is likely that the job descriptions themselves are provided by third parties, so Jobiak would not have a basis to allege copyright infringement based on the content of those listings alone.) So in what is a rather unusual twist for these types of data scraping complaints, Jobiak asserts that the defendants infringed its registered copyright for its database, “Group Registration for Automated Database Entitled ALL JOBS by Jobiak” (presumably, although not expressly stated in the complaints, the database of real-time job listings at <alljobs.ai>). The complaints describe the copyright as covering “a compilation of database information including descriptions, categories, job listings, and layout designs.” As a related claim, Jobiak asserts a DMCA anticircumvention claim (17 U.S.C. § 1201(a)(1)) for circumventing technological protection measures and gaining access to copyrighted material without permission.
As is common in scraping actions, Jobiak also advanced CFAA claims, alleging multiple theories of liability related to the defendants’ alleged “unauthorized access” via scraping to Jobiak’s systems and circumvention of technical blocking measures (e.g., IP address analysis, rate limits) that caused various harms. The complaints also state computer trespass and unfair competition claims, and request damages and injunctive relief barring the defendants’ access to Jobiak’s database.
- Copyright issues. The complaints’ copyright allegations state that Jobiak holds IP rights in the structure of its database, described as “a compilation of database information including descriptions, categories, job listings, and layout designs.”
Most data scraping actions (as opposed to actions involving copyrightable content) do not involve copyright-related claims, so the instant case raises some interesting issues. For example:
- How would a court interpret the extent of Jobiak’s copyright protection in its job listing keyword selections and overall database/compilation of listings?
- Is the selection and arrangement of the data according to Jobiak’s proprietary technology sufficiently original to warrant protection?
- Does compliance with Google’s schema militate against a finding of originality based on the merger doctrine, or does Jobiak’s proprietary keyword technology make the format of the listings original and copyrightable?
- Does the defendants’ alleged access and copying constitute infringement?
- CFAA and public data. As previously explored in this blog, including our coverage of the hiQ-LinkedIn scraping litigation, the success of a CFAA “unauthorized access” claim depends, in part, on the nature of the scraped data. In its 2022 opinion in that case, the Ninth Circuit limited the applicability of the CFAA as a tool against the scraping of publicly available website data. It is not clear from a reading of the Jobiak complaints whether Jobiak considers the alljobs.ai job listings as “public” data or private data protected by a paywall or other “authorization gate.” The words “public” or “private” do not appear in the complaints and the complaints do not otherwise suggest that the data was not publicly available. However, in Jobiak’s cease and desist letter to defendant Aspen Technology Labs, Jobiak’s counsel suggested that its database is not publicly available: “We also have strong evidence that you hacked into Jobiak’s Products (specifically, but not limited to alljobs.ai) in order to scrape all of Jobiak’s proprietary keyword data,” an allegation that Aspen’s counsel denied in his response to Jobiak’s cease and desist letter.
- Scraping and website terms. As we wrote about when discussing the hiQ-LinkedIn litigation, while hiQ was essentially able to overcome LinkedIn’s CFAA claim over its scraping practices, the court in that case issued several adverse findings against hiQ related to breach of contract claims for purported violations of LinkedIn’s terms that prohibited unauthorized scraping of its site. The Jobiak complaints do not assert a breach of contract claim, though the complaints state that the defendants were using Jobiak’s work for “commercial purposes without permission.”
While the alljobs.ai terms of service prohibit automated access or bots accessing the site, users can seemingly search listings without registering or taking any affirmative step to agree to those terms. Presumably, the same can be true for scrapers. For such unregistered users, website operators try to enforce their terms under a browsewrap model, whereby the operator claims that users are presumed to be bound by a website’s terms by mere use of the site, without the need for any outward manifestation of assent (Jobiak’s terms state: “By using this web site, you are indicating your acceptance to be bound by the terms of these Terms and Conditions”). The enforceability of browsewrap terms is not settled and tends to turn on the facts, such as whether the user had notice of the terms. Perhaps, in this case, Jobiak deemed its other claims more viable and decided not to plead breach of contract, particularly if evidence showed the defendants scraped the job listings without creating an account. The annexed cease and desist letters exchanged between the parties do not indicate that the defendants were registered alljobs.ai account holders. Of course, Jobiak could amend its complaints in the future to add such a claim if the facts support it.
Due to the focus on the training of generative AI models, scraping – and the legal issues around scraping – have become a subject of great interest. These complaints bring up some dynamic legal issues involving data scraping and we will be watching developments closely. If the plaintiff is successful, these cases may open another means for online publishers of data to challenge unauthorized scraping.