The post-API age of social media data access: Past, present, and future

Summary

This article updates Freelon’s 2018 “post-API age” framework by tracing nearly two decades of social media data access across Facebook, Instagram, Twitter/X, YouTube, Reddit, and TikTok. The authors propose a five-period historical schema (API prehistory, laissez-faire, authentication, limited options, academic cooperation) and argue that the present moment is best understood through a typology of four official data access regimes — laissez-faire API, academic API, academic walled garden, and pay-to-play API — supplemented by a fifth category of unofficial methods. Their central claim is that data access is permanently contingent on platform decisions shaped by scandal, regulation, leadership turnover, and commercial pressure, and that researchers therefore require both reformed official channels and legitimized unofficial alternatives.

Key Contributions

  • An extended periodized history of social media data access spanning ~20 years and six platforms, expanding Jünger’s (2021) earlier account.
  • A typology of four official data access regimes plus an unofficial methods category, useful both diagnostically and historiographically.
  • Normative recommendations across six dimensions — access, format, management, usage, costs, and sharing — for reforming platform-researcher relations.
  • A documented critique of specific problematic ToS provisions (continuous deletion, topic restrictions, anti-merging clauses, visibility thresholds).
  • A defense of unofficial scraping as ethically and legally legitimate, anchored in case law (hiQ Labs, Sandvig v. Barr, X v. Bright Data).

Methods

Historical and documentary analysis drawing on official platform documentation, archived web pages, terms of service, developer agreements, and secondary literature. The authors construct a periodization organized around regime shifts, conduct comparative descriptive analysis across six platforms, and develop a typological classification of current access modes. A concluding normative analysis articulates concrete reform proposals.

Findings

  • Meta’s 2018 closure of the Facebook/Instagram APIs after Cambridge Analytica is identified as the defining inflection point inaugurating the post-API age.
  • Twitter’s 2011 authentication requirement and prohibition on dataset sharing obsoleted earlier tools like TwapperKeeper and raised entry barriers.
  • CrowdTangle (2020–2024) partially restored Facebook access but suffered from inconsistent history, rate limits, and no comment data; its successor, the Meta Content Library, enforces 15,000–25,000 member visibility thresholds and a restrictive clean-room environment.
  • X’s 2023 pay-to-play pricing ($210,000/month initially) is prohibitively expensive, and prices are stated inconsistently across X’s own documentation.
  • TikTok’s research API (Feb 2023) caps retrieval at 100,000 records/day, uses bearer tokens that expire after two hours, restricts queries to 30-day windows, and requires researchers to continuously refresh their datasets, deleting content that is no longer accessible on the platform.
  • Reddit ended Pushshift public access in 2023 but archived torrents remain; its developer terms are nonetheless the most researcher-friendly of the six.
  • EU regulations (GDPR, DSA) produce geographic inequalities, granting EU-based researchers access denied elsewhere via VLOP obligations.
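
To make the practical weight of the TikTok limits listed above concrete, the following is a minimal planning sketch in Python. It uses only the figures reported in this summary (30-day query windows, 100,000 records/day, two-hour tokens). The function names are hypothetical, no real API is called, and treating a window as 30 inclusive calendar days is an assumption rather than something the paper specifies.

```python
from datetime import date, datetime, timedelta
import math

# Figures taken from the summary above; how the platform actually counts
# them (inclusive days, per-client vs. per-project caps) is an assumption.
MAX_WINDOW_DAYS = 30
DAILY_RECORD_CAP = 100_000
TOKEN_LIFETIME = timedelta(hours=2)


def split_into_windows(start, end, max_days=MAX_WINDOW_DAYS):
    """Split the inclusive date range [start, end] into consecutive
    windows that each cover at most max_days calendar days."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def days_to_collect(expected_records, daily_cap=DAILY_RECORD_CAP):
    """Lower bound on collection days for a given record count under the daily cap."""
    return math.ceil(expected_records / daily_cap)


def token_expired(issued_at, now, lifetime=TOKEN_LIFETIME):
    """True once a token issued at issued_at has reached its lifetime."""
    return now - issued_at >= lifetime


if __name__ == "__main__":
    for window_start, window_end in split_into_windows(date(2023, 1, 1), date(2023, 3, 31)):
        print(window_start, "to", window_end)
    print("days needed for 250,000 records:", days_to_collect(250_000))
```

Under these assumptions, a three-month study period already requires several separate queries, and a dataset of a few million records takes weeks to collect, during which the access token must be renewed many times.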

Connections

This paper is a foundational reference point for the broader topic cluster on platform data access, complementing platform-specific histories and infrastructural analyses like Helmond2026-ll and Rieder2025-ju / Rieder2026-pp, as well as the DSA Article 40 access debates explored in Ohme2026-nv and Tornberg2026-lc. It connects directly to defenses and assessments of unofficial/scraping methods in the style of Davidson2025-uu, represented here by Bouchaud2026-lr, Ventura2026-yc, and Achmann-Denkler2026-lx, and pairs naturally with reflections on the precarity of computational social science in Bruns2025-fz, Bak-Coleman2025-pm, and Murtfeldt2025-wu. For platform-specific empirical follow-ups under post-API conditions, see Pierri2025-hm and Bastos2025-ya on X and Hurcombe2025-cs on Meta-era transitions.

Podcast

A research-radio episode discusses this paper.