82 research outputs found

    Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls

    Large-scale databases of human activity in social media have captured scientific and policy attention, producing a flood of research and discussion. This paper considers methodological and conceptual challenges for this emergent field, with special attention to the validity and representativeness of social media big data analyses. Persistent issues include the over-emphasis of a single platform, Twitter, sampling biases arising from selection by hashtags, and vague and unrepresentative sampling frames. The socio-cultural complexity of user behavior aimed at algorithmic invisibility (such as subtweeting, mock-retweeting, use of "screen captures" for text, etc.) further complicates interpretation of big data social media. Other challenges include accounting for field effects, i.e. broadly consequential events that do not diffuse only through the network under study but affect the whole society. The application of network methods from other fields to the study of human social activity may not always be appropriate. The paper concludes with a call to action on practical steps to improve our analytic capacity in this promising, rapidly-growing field. Comment: Tufekci, Zeynep. (2014). Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls. In ICWSM '14: Proceedings of the 8th International AAAI Conference on Weblogs and Social Media, 2014. [forthcoming

    Recommendations and User Agency: The Reachability of Collaboratively-Filtered Information

    Recommender systems often rely on models which are trained to maximize accuracy in predicting user preferences. When the systems are deployed, these models determine the availability of content and information to different users. The gap between these objectives gives rise to a potential for unintended consequences, contributing to phenomena such as filter bubbles and polarization. In this work, we consider directly the information availability problem through the lens of user recourse. Using ideas of reachability, we propose a computationally efficient audit for top-N linear recommender models. Furthermore, we describe the relationship between model complexity and the effort necessary for users to exert control over their recommendations. We use this insight to provide a novel perspective on the user cold-start problem. Finally, we demonstrate these concepts with an empirical investigation of a state-of-the-art model trained on a widely used movie ratings dataset. Comment: appeared at FAccT '2

    POTs: Protective Optimization Technologies

    Algorithmic fairness aims to address the economic, moral, social, and political impact that digital systems have on populations through solutions that can be applied by service providers. Fairness frameworks do so, in part, by mapping these problems to a narrow definition and assuming the service providers can be trusted to deploy countermeasures. Not surprisingly, these decisions limit fairness frameworks' ability to capture a variety of harms caused by systems. We characterize fairness limitations using concepts from requirements engineering and from social sciences. We show that the focus on algorithms' inputs and outputs misses harms that arise from systems interacting with the world; that the focus on bias and discrimination omits broader harms on populations and their environments; and that relying on service providers excludes scenarios where they are not cooperative or intentionally adversarial. We propose Protective Optimization Technologies (POTs). POTs provide means for affected parties to address the negative impacts of systems in the environment, expanding avenues for political contestation. POTs intervene from outside the system, do not require service providers to cooperate, and can serve to correct, shift, or expose harms that systems impose on populations and their environments. We illustrate the potential and limitations of POTs in two case studies: countering road congestion caused by traffic-beating applications, and recalibrating credit scoring for loan applicants. Comment: Appears in Conference on Fairness, Accountability, and Transparency (FAT* 2020). Bogdan Kulynych and Rebekah Overdorf contributed equally to this work. Version v1/v2 by Seda Gürses, Rebekah Overdorf, and Ero Balsa was presented at HotPETS 2018 and at PiMLAI 201

    What were the historical reasons for the resistance to recognizing airborne transmission during the COVID‐19 pandemic?

    The question of whether SARS‐CoV‐2 is mainly transmitted by droplets or aerosols has been highly controversial. We sought to explain this controversy through a historical analysis of transmission research in other diseases. For most of human history, the dominant paradigm was that many diseases were carried by the air, often over long distances and in a phantasmagorical way. This miasmatic paradigm was challenged in the mid to late 19th century with the rise of germ theory, and as diseases such as cholera, puerperal fever, and malaria were found to actually transmit in other ways. Motivated by his views on the importance of contact/droplet infection, and the resistance he encountered from the remaining influence of miasma theory, prominent public health official Charles Chapin in 1910 helped initiate a successful paradigm shift, deeming airborne transmission most unlikely. This new paradigm became dominant. However, the lack of understanding of aerosols led to systematic errors in the interpretation of research evidence on transmission pathways. For the next five decades, airborne transmission was considered of negligible or minor importance for all major respiratory diseases, until a demonstration of airborne transmission of tuberculosis (which had been mistakenly thought to be transmitted by droplets) in 1962. The contact/droplet paradigm remained dominant, and only a few diseases were widely accepted as airborne before COVID‐19: those that were clearly transmitted to people not in the same room. The acceleration of interdisciplinary research inspired by the COVID‐19 pandemic has shown that airborne transmission is a major mode of transmission for this disease, and is likely to be significant for many respiratory infectious diseases.

    Keynote: Zeynep Tufekci


    Facebook, Youth and Privacy in Networked Publics

    Media accounts would have us believe that today’s youth are a particularly narcissistic generation. Young adults are often portrayed as exhibitionists who share personal information excessively and only react if “burned” by experience. This paper reports results from 450 surveys of young adults on social network site usage and privacy and surveillance experiences--as well as from a historical archive dating back to 2006. The findings show a complex picture of a generation actively engaging visibility and social boundaries online through privacy and visibility practices. A striking increase in privacy protective activities is documented. I examine whether these changes are in response to personal negative experiences from online disclosure or if they derive from general awareness. I find that students are reacting proactively and adjusting their privacy settings above and beyond the impact of negative personal experiences. Contrary to media reports, young adults do not appear uncaring about privacy and are not waiting until they get burned. Significant racial and gender differences remain in privacy behaviors. Strikingly, about 20% report having deactivated their profile at least once.

    Engineering the public: Big data, surveillance and computational politics

    Digital technologies have given rise to a new combination of big data and computational practices which allow for massive, latent data collection and sophisticated computational modeling, increasing the capacity of those with resources and access to use these tools to carry out highly effective, opaque and unaccountable campaigns of persuasion and social engineering in political, civic and commercial spheres. I examine six intertwined dynamics that pertain to the rise of computational politics: the rise of big data, the shift away from demographics to individualized targeting, the opacity and power of computational modeling, the use of persuasive behavioral science, digital media enabling dynamic real-time experimentation, and the growth of new power brokers who own the data or social media environments. I then examine the consequences of these new mechanisms on the public sphere and political campaigns.