520 research outputs found

    Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4

    Full text link
    Misinformation poses a critical societal challenge, and current approaches have yet to produce an effective solution. We propose focusing on generalization, soft classification, and leveraging recent large language models to create more practical tools in contexts where perfect predictions remain unattainable. We begin by demonstrating that GPT-4 and other language models can outperform existing methods in the literature. Next, we explore their generalization, revealing that GPT-4 and RoBERTa-large exhibit critical differences in failure modes, which offer potential for significant performance improvements. Finally, we show that these models can be employed in soft classification frameworks to better quantify uncertainty. We find that models with inferior hard classification results can achieve superior soft classification performance. Overall, this research lays groundwork for future tools that can drive real-world progress on misinformation

    Developing a mental health index using a machine learning approach: Assessing the impact of mobility and lockdown during the COVID-19 pandemic

    Get PDF
    Governments worldwide have implemented stringent restrictions to curtail the spread of the COVID-19 pandemic. Although beneficial to physical health, these preventive measures could have a profound detrimental effect on the mental health of the population. This study focuses on the impact of lockdowns and mobility restrictions on mental health during the COVID-19 pandemic. We first develop a novel mental health index based on the analysis of data from over three million global tweets using the Microsoft Azure machine learning approach. The computed mental health index scores are then regressed with the lockdown strictness index and Google mobility index using fixed-effects ordinary least squares (OLS) regression. The results reveal that the reduction in workplace mobility, reduction in retail and recreational mobility, and increase in residential mobility (confinement to the residence) have harmed mental health. However, restrictions on mobility to parks, grocery stores, and pharmacy outlets were found to have no significant impact. The proposed mental health index provides a path for theoretical and empirical mental health studies using social media. [Abstract copyright: © 2022 Elsevier Inc. All rights reserved.

    Fake News Detection

    Get PDF

    Ethical and Responsible Behavior in Applied Empirical Research: Four Essays on Academic Practices in the Social Sciences

    Get PDF
    In a world filled with fake news and alternative facts, public trust in science is of utmost importance. Yet scandals like the cases of Diederik Stapel and Ulrich Lichtenthaler have questioned the integrity of scholars and their research results. To address this issue, several scientists investigated academic (mal)practices like plagiarism, HARKing, authorship misuse and data flexibility. The results were devastating and ignited a credibility crisis, especially in the social sciences. Fortunately, we already can see the light at the end of the tunnel as editors, publishers, research societies and universities have started to introduce techniques and infrastructure that ensure ethical and responsible scholarly behavior. For example, artificial intelligence has enabled plagiarism detection software to not only check for copy-pasting but also for content and reference similarities. Moreover, more and more journals motivate or sometimes even require researchers to pre-register their research hypotheses prior to data collection and/or data analysis to prevent HARKing. In the life sciences, contribution disclosure statements force authors to transparently report the contributions of each researcher involved in a research project. In the social sciences, several articles and editorials highlighted that ensuring replicability by means of transparent reporting and data sharing allows detecting and overcoming flexible and questionable data handling practices. This thesis builds upon the existing body of literature and provides guidance for those academic (mal)practices that have been covered only rudimentarily in the social sciences

    The emerging landscape of Social Media Data Collection: anticipating trends and addressing future challenges

    Full text link
    [spa] Las redes sociales se han convertido en una herramienta poderosa para crear y compartir contenido generado por usuarios en todo internet. El amplio uso de las redes sociales ha llevado a generar una enorme cantidad de información, presentando una gran oportunidad para el marketing digital. A través de las redes sociales, las empresas pueden llegar a millones de consumidores potenciales y capturar valiosos datos de los consumidores, que se pueden utilizar para optimizar estrategias y acciones de marketing. Los beneficios y desafíos potenciales de utilizar las redes sociales para el marketing digital también están creciendo en interés entre la comunidad académica. Si bien las redes sociales ofrecen a las empresas la oportunidad de llegar a una gran audiencia y recopilar valiosos datos de los consumidores, el volumen de información generada puede llevar a un marketing sin enfoque y consecuencias negativas como la sobrecarga social. Para aprovechar al máximo el marketing en redes sociales, las empresas necesitan recopilar datos confiables para propósitos específicos como vender productos, aumentar la conciencia de marca o fomentar el compromiso y para predecir los comportamientos futuros de los consumidores. La disponibilidad de datos de calidad puede ayudar a construir la lealtad a la marca, pero la disposición de los consumidores a compartir información depende de su nivel de confianza en la empresa o marca que lo solicita. Por lo tanto, esta tesis tiene como objetivo contribuir a la brecha de investigación a través del análisis bibliométrico del campo, el análisis mixto de perfiles y motivaciones de los usuarios que proporcionan sus datos en redes sociales y una comparación de algoritmos supervisados y no supervisados para agrupar a los consumidores. Esta investigación ha utilizado una base de datos de más de 5,5 millones de colecciones de datos durante un período de 10 años. Los avances tecnológicos ahora permiten el análisis sofisticado y las predicciones confiables basadas en los datos capturados, lo que es especialmente útil para el marketing digital. Varios estudios han explorado el marketing digital a través de las redes sociales, algunos centrándose en un campo específico, mientras que otros adoptan un enfoque multidisciplinario. Sin embargo, debido a la naturaleza rápidamente evolutiva de la disciplina, se requiere un enfoque bibliométrico para capturar y sintetizar la información más actualizada y agregar más valor a los estudios en el campo. Por lo tanto, las contribuciones de esta tesis son las siguientes. En primer lugar, proporciona una revisión exhaustiva de la literatura sobre los métodos para recopilar datos personales de los consumidores de las redes sociales para el marketing digital y establece las tendencias más relevantes a través del análisis de artículos significativos, palabras clave, autores, instituciones y países. En segundo lugar, esta tesis identifica los perfiles de usuario que más mienten y por qué. Específicamente, esta investigación demuestra que algunos perfiles de usuario están más inclinados a cometer errores, mientras que otros proporcionan información falsa intencionalmente. El estudio también muestra que las principales motivaciones detrás de proporcionar información falsa incluyen la diversión y la falta de confianza en las medidas de privacidad y seguridad de los datos. Finalmente, esta tesis tiene como objetivo llenar el vacío en la literatura sobre qué algoritmo, supervisado o no supervisado, puede agrupar mejor a los consumidores que proporcionan sus datos en las redes sociales para predecir su comportamiento futuro

    AI for social good: social media mining of migration discourse

    Get PDF
    The number of international migrants has steadily increased over the years, and it has become one of the pressing issues in today’s globalized world. Our bibliometric review of around 400 articles on Scopus platform indicates an increased interest in migration-related research in recent times but the extant research is scattered at best. AI-based opinion mining research has predominantly noted negative sentiments across various social media platforms. Additionally, we note that prior studies have mostly considered social media data in the context of a particular event or a specific context. These studies offered a nuanced view of the societal opinions regarding that specific event, but this approach might miss the forest for the trees. Hence, this dissertation makes an attempt to go beyond simplistic opinion mining to identify various latent themes of migrant-related social media discourse. The first essay draws insights from the social psychology literature to investigate two facets of Twitter discourse, i.e., perceptions about migrants and behaviors toward migrants. We identified two prevailing perceptions (i.e., sympathy and antipathy) and two dominant behaviors (i.e., solidarity and animosity) of social media users toward migrants. Additionally, this essay has also fine-tuned the binary hate speech detection task, specifically in the context of migrants, by highlighting the granular differences between the perceptual and behavioral aspects of hate speech. The second essay investigates the journey of migrants or refugees from their home to the host country. We draw insights from Gennep's seminal book, i.e., Les Rites de Passage, to identify four phases of their journey: Arrival of Refugees, Temporal stay at Asylums, Rehabilitation, and Integration of Refugees into the host nation. We consider multimodal tweets for this essay. We find that our proposed theoretical framework was relevant for the 2022 Ukrainian refugee crisis – as a use-case. Our third essay points out that a limited sample of annotated data does not provide insights regarding the prevailing societal-level opinions. Hence, this essay employs unsupervised approaches on large-scale societal datasets to explore the prevailing societal-level sentiments on YouTube platform. Specifically, it probes whether negative comments about migrants get endorsed by other users. If yes, does it depend on who the migrants are – especially if they are cultural others? To address these questions, we consider two datasets: YouTube comments before the 2022 Ukrainian refugee crisis, and during the crisis. Second dataset confirms the Cultural Us hypothesis, and our findings are inconclusive for the first dataset. Our final or fourth essay probes social integration of migrants. The first part of this essay probed the unheard and faint voices of migrants to understand their struggle to settle down in the host economy. The second part of this chapter explored the viability of social media platforms as a viable alternative to expensive commercial job portals for vulnerable migrants. Finally, in our concluding chapter, we elucidated the potential of explainable AI, and briefly pointed out the inherent biases of transformer-based models in the context of migrant-related discourse. To sum up, the importance of migration was recognized as one of the essential topics in the United Nation’s Sustainable Development Goals (SDGs). Thus, this dissertation has attempted to make an incremental contribution to the AI for Social Good discourse
    • …
    corecore