Workflow for YouTube Comment Data Acquisition

Abstract

This dataset accompanies a research project that conducts a corpus-assisted critical discourse analysis of YouTube comments posted on the Cinema Therapy channel, specifically within the playlist “Coping with Coronavirus Quarantine.” The comments were retrieved using the researcher-oriented YouTube Data Tool (YouTube Data API v3), then cleaned and filtered in Python and verified manually. Because YouTube comments frequently contain personal narratives or identifiable contextual details, they are unsuitable for public release. To comply with GDPR and institutional data protection guidelines, the final output therefore does not include any original comment text from the year 2020. Instead, the supporting materials consist of two Python scripts and a README file documenting the data collection, cleaning, and processing workflow. The project examines how commenters position themselves within platform-mediated therapeutic cultures by analyzing their linguistic patterns, critical discourses, and emotional expressions. In doing so, it contributes to a more nuanced understanding of social-media therapy audiences, one that recognizes them as interpretive, agentic participants in networked publics rather than uniformly passive, credulous, or happiness-seeking consumers.
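
As a rough illustration of the kind of cleaning and filtering step the abstract describes, the following Python sketch drops empty and duplicate comments and then keeps only non-text metadata for release. It is not taken from the project's actual scripts: the field names (`id`, `publishedAt`, `text`), the helper functions, and the sample rows are all assumptions made for this example.

```python
import re


def normalize(text):
    """Collapse runs of whitespace and strip leading/trailing spaces."""
    return re.sub(r"\s+", " ", text).strip()


def clean_comments(rows):
    """Drop empty comments and exact duplicates (compared after normalization).

    `rows` is assumed to be a list of dicts with hypothetical keys
    'id', 'publishedAt' (ISO 8601 string) and 'text'.
    """
    seen = set()
    cleaned = []
    for row in rows:
        text = normalize(row.get("text", ""))
        if not text or text in seen:
            continue
        seen.add(text)
        cleaned.append({**row, "text": text})
    return cleaned


def strip_text_for_release(rows):
    """Keep only non-text metadata so that no comment wording is published."""
    return [
        {
            "id": row["id"],
            "year": row["publishedAt"][:4],
            "n_tokens": len(row["text"].split()),
        }
        for row in rows
    ]


# Invented sample rows, for illustration only.
sample = [
    {"id": "c1", "publishedAt": "2020-04-02T10:00:00Z", "text": "  This video   helped me. "},
    {"id": "c2", "publishedAt": "2020-04-03T11:30:00Z", "text": "This video helped me."},
    {"id": "c3", "publishedAt": "2020-05-01T09:15:00Z", "text": "   "},
    {"id": "c4", "publishedAt": "2021-01-12T08:00:00Z", "text": "Watching again a year later."},
]

cleaned = clean_comments(sample)
release = strip_text_for_release(cleaned)
print(release)
```

In this sketch the releasable output carries only an identifier, the posting year, and a token count, which mirrors the abstract's point that original comment text is withheld. A real pipeline would also need the manual verification step mentioned above, which code alone cannot replace.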

This paper was published in CESSDA Data Catalogue OAI-PMH Repository.
