Reflections from the Scraping Webinar Organised by the Social Platform Data Access Taskforce
As online platforms become the spaces where public life increasingly unfolds, the ability to study them is not just a research concern, but a public necessity. Yet in the UK, researchers trying to access social platform data face a maze of inconsistent, unpredictable, and often opaque barriers. At the heart of this challenge lies public-interest data scraping, a method essential for gathering reliable evidence from social media.
To spotlight these issues, the Social Platform Data Access Taskforce (https://www.linkedin.com/groups/15691013/) organised a webinar on 01.12.2025 titled “Understanding Barriers and Avenues to Data Scraping”. The webinar brought together 27 researchers from academia and civil society who work directly with publicly accessible social media data. Their discussion offered timely insights into how regulatory, technical, and institutional constraints shape research today, and marked an important step toward more open, accountable data-access frameworks for the UK.
Why scraping matters
To understand key digital policy issues, from online safety and misinformation to electoral integrity and teens’ mental health, researchers need reliable ways to observe what platforms are doing. Platforms sometimes provide access through Application Programming Interfaces (APIs), which are controlled channels that let approved users request specific types of data. But these APIs are often narrow in scope, subject to sudden change, or too limited to support independent scrutiny.
This is why researchers sometimes turn to data scraping, a method of automatically collecting information that is already publicly visible on a platform’s website. Public-interest data scraping can be one of the few viable ways to study real-world platform behaviour at scale. Yet using public-interest data scraping exposes deeper systemic issues. Researchers must operate within unclear terms of service, uneven platform enforcement, and legal or regulatory provisions that are difficult to interpret in practice. Even when collecting only publicly accessible data and applying strong ethical safeguards, there is persistent uncertainty about what qualifies as lawful and responsible access.
However, these ambiguities rarely constrain commercial actors: many companies routinely rely on large-scale scraping or purchase equivalent datasets from third-party vendors (Silverman, 2025). This disparity means that independent, public-interest researchers face significantly higher barriers to accessing the very data needed to evaluate platform behaviour. In turn, regulators and policymakers are left with weaker evidence when designing or assessing digital policy. This limitation is recognised across major policy frameworks and analyses, including the EU’s Digital Services Act, OECD transparency reports, and Ofcom’s research, all of which emphasise that restricted access to platform data undermines the evaluation and enforcement of regulatory measures.
The clarity imperative
A core cross-cutting message from the webinar was the pressing need for clear national guidance on the legal and ethical status of public-interest scraping for research. Currently, institutions differ in how they assess compliance, platforms differ in how they communicate and respond to risk, and researchers are left to reconcile conflicting messages from both. This regulatory ambiguity is not merely an academic inconvenience but a significant barrier. It limits early-career scholars, disadvantages smaller institutions lacking extensive legal support, and risks creating a disparity where countries with clearer pathways for responsible scraping gain a competitive advantage in digital research.
As part of its work, the Social Platform Data Access Taskforce has gathered case studies showing that uncertainty around scraping substantially limits the ability of researchers in universities and civil society to study the online world. One example comes from Stefania Vicari, Senior Lecturer in Digital Sociology at the University of Sheffield, who studies how social media shapes public narratives of women’s health. She noted that while people increasingly rely on these platforms for information, opaque algorithmic systems influence which voices become visible, with implications for research and public understanding alike. Crucially, she emphasised that answering even basic questions about these dynamics requires access to publicly shared content, access that current platform restrictions and costs often put out of reach.
Technical Barriers and the Loss of Transparency
The webinar highlighted that platform governance shifts further complicate the landscape. Measures like increased rate-limiting, stricter access controls, and the withdrawal or commercialisation of APIs are narrowing the practical space for non-commercial research. While often intended to address legitimate concerns like privacy and security, these technical barriers have the unintended effect of limiting independent scrutiny and transparency, precisely when platform accountability is a recognised policy priority. Effective platform governance demands a balance between robust user protection and enabling public-interest access.

Way Forward: A Critical Policy Window
The Taskforce recognises that resolving the ambiguity around data access requires more than ongoing conversation. It calls for a shared commitment to practical, near-term progress. We now have a timely opportunity to strengthen the foundations for responsible research. The webinar underscored that current policy momentum creates space for constructive engagement from the research community.
The DSIT Opportunity
With attention focused on the forthcoming consultation from the Department for Science, Innovation and Technology (DSIT), there is a valuable chance for researchers and institutions to contribute to shaping a balanced, workable framework.
Supporting Legal Clarity
Participants also highlighted the need for clearer guidance on data scraping. Encouraging the Information Commissioner’s Office (ICO) to issue practical, definitive advice, building on existing recommendations, including those from Ofcom, would provide the clarity institutions need and help ensure consistent, confident support for responsible research.
Conclusion
The ability to study online environments is critical for researchers, regulators and policymakers, especially as the UK refines digital legislation. The challenges surrounding public-interest data scraping are symptomatic of a wider misalignment between research needs, platform governance, and regulatory expectations. Addressing this requires coordinated effort, clearer guidance, and sustained engagement across sectors. The insights gathered at the webinar will directly inform the Taskforce’s ongoing work to develop realistic, actionable recommendations for a more trustworthy and less complicated data access ecosystem. Responsible access to social platform data is an essential ingredient in a healthy, evidence-driven digital policy environment.
Sources
Silverman, 2025: https://www.techpolicy.press/why-commercial-tools-can-scrape-social-media-but-researchers-cant
About
Dr. Alexandra Boutopoulou is a Research Fellow at the University of Sheffield and a member of the UKRI Taskforce on Data Access. Drawing on her background in digital and social media strategy and research, her work informs policy debates on responsible access to social media data.