Alternative Data

DC Court Ruling Reduces Webscraping Risk

Published: Apr. 17, 2020

Updated: Nov. 07, 2023

In a decision that reduces some risk associated with webscraping, the United States District Court for the District of Columbia ruled that violating a website’s terms of service cannot alone be the basis for a finding that the conduct is “unauthorized” under the Computer Fraud and Abuse Act (“CFAA”). Christian W. Sandvig, et al. v. William P. Barr, 2020 WL 1494065 (D.D.C. 2020) (attached). Although the Sandvig decision is not binding upon courts outside of the District of Columbia, it provides other courts a useful point of reference as they consider how the CFAA might apply to webscraping.


The Sandvig decision arose from a pre-enforcement constitutional challenge filed in 2016 by several academics who intended to conduct research by accessing and using various recruiting websites through fake accounts. Specifically, the plaintiffs planned to use fictitious profiles to study whether the proprietary algorithms of these sites produced discriminatory results. However, creating and using fake accounts or profiles violated the sites’ terms of service. Accordingly, the plaintiffs alleged that their intended use of the websites would subject them to prosecution under the CFAA, which criminalizes obtaining information from a “protected computer” by means of “intentionally access[ing] a computer without authorization or exceed[ing] authorized access. . .” 18 U.S.C. § 1030(a)(2).

Although the plaintiffs made several constitutional claims, all but one were dismissed by the Court in 2018. As a result, the Court’s recent decision addressed only the plaintiffs’ remaining claim that the CFAA’s Access Provision is overbroad and chills First Amendment rights to freedom of speech. Ultimately, the Court dismissed the claim, finding it was moot because plaintiffs’ proposed research activities would not actually violate the CFAA.

The Court’s Interpretation of the CFAA’s Access Provision

In reaching its decision, the Court adopted the Ninth Circuit Court of Appeals’ characterization of the internet as consisting of two “realms”: those portions of websites that are public and those that are private (i.e., where permission is required for access). Id. at 17-18 (citing hiQ Labs, Inc. v. LinkedIn Corp., 938 F.3d 985, 1000 (9th Cir. 2019)). The Court then evaluated whether contractual restrictions, like website terms of service, create a sufficient barrier or “permission requirement” to trigger criminal liability under the CFAA if they are ignored or bypassed. The Court concluded that they do not, finding that: (i) a user commits unauthorized access only when the user bypasses a password, login credential, payment requirement, or other form of “authentication gate”; and (ii) violating public websites’ terms of service does not constitute “exceed[ing] authorized access” under the CFAA. The Court found that because the plaintiffs planned to create accounts with each website and pay the applicable subscription fees, the fact that the accounts violated the websites’ terms of service (i.e., by using fake or fictitious names) would not make the plaintiffs’ access and use of the websites unauthorized or outside the scope of authority under the CFAA.

The Court identified three primary factors that led to its finding. First, the Court asserted that websites’ terms of service provide users inadequate notice for purposes of criminal liability, because they often are “long, dense, and subject to change” and are not communicated prominently (for example, they may appear only as a link at the bottom of a website). Sandvig at 20. Second, the Court reasoned that enabling private website owners to define the scope of criminal liability through their terms of service would be problematic, as it would “risk[] turning each website into its own criminal jurisdiction and each webmaster into his own legislature.” Id. at 21. Finally, the Court explained that certain common law principles favored the Court’s narrow reading of the CFAA.

Again, this decision reduces some of the risk of webscraping information from behind a login page, but it does not eliminate all webscraping risk, and it does not address potential civil exposure for commercial claims or any securities considerations.