673 research outputs found

    Political Depolarization of News Articles Using Attribute-aware Word Embeddings

    Political polarization in the US is on the rise. This polarization negatively affects the public sphere by contributing to the creation of ideological echo chambers. In this paper, we focus on addressing one of the factors that contributes to this polarity, polarized media. We introduce a framework for depolarizing news articles. Given an article on a certain topic with a particular ideological slant (e.g., liberal or conservative), the framework first detects polar language in the article and then generates a new article with the polar language replaced with neutral expressions. To detect polar words, we train a multi-attribute-aware word embedding model that is aware of ideology and topics on 360k full-length media articles. Then, for text generation, we propose a new algorithm called Text Annealing Depolarization Algorithm (TADA). TADA retrieves neutral expressions from the word embedding model that not only decrease ideological polarity but also preserve the original argument of the text, while maintaining grammatical correctness. We evaluate our framework by comparing the depolarized output of our model in two modes, fully-automatic and semi-automatic, on 99 stories spanning 11 topics. Based on feedback from 161 human testers, our framework successfully depolarized 90.1% of paragraphs in semi-automatic mode and 78.3% of paragraphs in fully-automatic mode. Furthermore, 81.2% of the testers agree that the non-polar content information is well-preserved and 79% agree that depolarization does not harm semantic correctness when they compare the original text and the depolarized text. Our work shows that data-driven methods can help to locate political polarity and aid in the depolarization of articles. Comment: In Proceedings of the 15th International AAAI Conference on Weblogs and Social Media (ICWSM 2021)
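The detect-then-replace idea in this abstract can be illustrated with a toy sketch. It is not TADA and not the authors' embedding model: it uses a greedy nearest-neighbour substitution in place of the annealing search, and the `EMBEDDINGS` table, the `threshold` value, and the function names are all hypothetical. The first coordinate of each vector is assumed to encode ideological polarity; the remaining coordinates stand in for meaning.

```python
import math

# Hypothetical toy vectors: (polarity, meaning_1, meaning_2).
# A real system would learn these from a large corpus.
EMBEDDINGS = {
    "scheme": (0.9, 0.8, 0.1),
    "plan": (0.0, 0.8, 0.1),
    "proposal": (0.1, 0.7, 0.2),
    "radical": (0.8, 0.1, 0.9),
    "far-reaching": (0.1, 0.1, 0.9),
    "tax": (0.0, 0.9, 0.0),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neutral_substitute(word, threshold=0.5):
    """Return the closest-in-meaning neutral word, or the word itself."""
    vec = EMBEDDINGS.get(word)
    # Unknown words and words below the polarity threshold are kept.
    if vec is None or abs(vec[0]) < threshold:
        return word
    # Candidates are neutral words, ranked by similarity of the
    # meaning coordinates only (polarity is ignored for ranking).
    candidates = [
        (cosine(vec[1:], other[1:]), cand)
        for cand, other in EMBEDDINGS.items()
        if cand != word and abs(other[0]) < threshold
    ]
    return max(candidates)[1] if candidates else word


def depolarize(text, threshold=0.5):
    """Replace each polar word in a whitespace-tokenized text."""
    return " ".join(neutral_substitute(w, threshold) for w in text.split())


print(depolarize("the radical tax scheme"))
```

Unlike this greedy word-by-word pass, the abstract describes a search that also has to keep the output grammatical, which a lookup of this kind does not attempt.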

    Geo-Information Harvesting from Social Media Data

    As unconventional sources of geo-information, massive imagery and text messages from open platforms and social media form a temporally quasi-seamless, spatially multi-perspective stream, but with unknown and diverse quality. Due to its complementarity to remote sensing data, geo-information from these sources offers promising perspectives, but harvesting is not trivial due to its data characteristics. In this article, we address key aspects in the field, including data availability, analysis-ready data preparation and data management, geo-information extraction from social media text messages and images, and the fusion of social media and remote sensing data. We then showcase some exemplary geographic applications. In addition, we present the first extensive discussion of ethical considerations of social media data in the context of geo-information harvesting and geographic applications. With this effort, we wish to stimulate curiosity and lay the groundwork for researchers who intend to explore social media data for geo-applications. We encourage the community to join forces by sharing their code and data. Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine

    Narrative and computational text analysis in business and economic history

    Recent calls from within economics for increased attention to narrative open the door to possible cross-fertilisation between economics and more humanistically oriented business and economic history. Indeed, arguments for economists to take narratives seriously and incorporate them into economic theory have some similarities with classic calls for a revival of narrative in history and abandonment of ‘scientific’ history. Both share an approach to explaining social phenomena based on the micro-level. This article examines how new methods in computational text analysis can be employed to further the goals of prioritising narrative in economics and history but also challenge a focus on the micro-level. Through a survey of the most frequently used tools of computational text analysis and an overview of their uses to date across the social sciences and humanities, this article shows how such methods can provide economic and business historians tools to respond to and engage with the ‘narrative turn’ in economics while also building on and offering a macro-level corrective to the focus on narrative in history.

    Discriminative Topic Mining via Category-Name Guided Text Embedding

    Mining a set of meaningful and distinctive topics automatically from massive text corpora has broad applications. Existing topic models, however, typically work in a purely unsupervised way and often generate topics that do not fit users' particular needs and yield suboptimal performance on downstream tasks. We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora. This new task not only helps a user understand clearly and distinctively the topics he/she is most interested in, but also directly benefits keyword-driven classification tasks. We develop CatE, a novel category-name guided text embedding method for discriminative topic mining, which effectively leverages minimal user guidance to learn a discriminative embedding space and discover category representative terms in an iterative manner. We conduct a comprehensive set of experiments to show that CatE mines a high-quality set of topics guided by category names only, and benefits a variety of downstream applications including weakly-supervised classification and lexical entailment direction identification. Comment: WWW 2020. (Code: https://github.com/yumeng5/CatE)

    Politische Maschinen: Maschinelles Lernen für das Verständnis von sozialen Maschinen (Political Machines: Machine Learning for Understanding Social Machines)

    This thesis investigates human-algorithm interactions in sociotechnological ecosystems. Specifically, it applies machine learning and statistical methods to uncover political dimensions of algorithmic influence in social media platforms and automated decision-making systems. Based on the results, the study discusses the legal, political and ethical consequences of algorithmic implementations.

    The expansion of isms, 1820-1917 : Data-driven analysis of political language in digitized newspaper collections

    Words with the suffix -ism are reductionist terms that help us navigate complex social issues by using a simple one-word label for them. On the one hand, they are often associated with political ideologies, but on the other they are present in many other domains of language, especially culture, science, and religion. This has not always been the case. This paper studies isms in a historical record of digitized newspapers published from 1820 to 1917 in Finland to find out how the language of isms developed historically. We use diachronic word embeddings and affinity propagation clustering to trace how new isms entered the lexicon and how they relate to one another over time. We are able to show how they became more common and entered more and more domains. Still, the uses of isms as traditions for political action and thinking stand out in our analysis. Peer reviewed