673 research outputs found
Political Depolarization of News Articles Using Attribute-aware Word Embeddings
Political polarization in the US is on the rise. This polarization negatively
affects the public sphere by contributing to the creation of ideological echo
chambers. In this paper, we focus on addressing one of the factors that
contributes to this polarity, polarized media. We introduce a framework for
depolarizing news articles. Given an article on a certain topic with a
particular ideological slant (e.g., liberal or conservative), the framework
first detects polar language in the article and then generates a new article
with the polar language replaced with neutral expressions. To detect polar
words, we train a multi-attribute-aware word embedding model that is aware of
ideology and topics on 360k full-length media articles. Then, for text
generation, we propose a new algorithm called Text Annealing Depolarization
Algorithm (TADA). TADA retrieves neutral expressions from the word embedding
model that not only decrease ideological polarity but also preserve the
original argument of the text, while maintaining grammatical correctness. We
evaluate our framework by comparing the depolarized output of our model in two
modes, fully-automatic and semi-automatic, on 99 stories spanning 11 topics.
Based on feedback from 161 human testers, our framework successfully
depolarized 90.1% of paragraphs in semi-automatic mode and 78.3% of paragraphs
in fully-automatic mode. Furthermore, 81.2% of the testers agree that the
non-polar content information is well-preserved and 79% agree that
depolarization does not harm semantic correctness when they compare the
original text and the depolarized text. Our work shows that data-driven methods
can help to locate political polarity and aid in the depolarization of
articles.
Comment: In Proceedings of the 15th International AAAI Conference on Weblogs
and Social Media (ICWSM 2021)
Geo-Information Harvesting from Social Media Data
As unconventional sources of geo-information, massive imagery and text
messages from open platforms and social media form a temporally quasi-seamless,
spatially multi-perspective stream, but with unknown and diverse quality. Due
to its complementarity to remote sensing data, geo-information from these
sources offers promising perspectives, but harvesting is not trivial due to its
data characteristics. In this article, we address key aspects in the field,
including data availability, analysis-ready data preparation and data
management, geo-information extraction from social media text messages and
images, and the fusion of social media and remote sensing data. We then
showcase some exemplary geographic applications. In addition, we present the
first extensive discussion of ethical considerations of social media data in
the context of geo-information harvesting and geographic applications. With
this effort, we wish to stimulate curiosity and lay the groundwork for
researchers who intend to explore social media data for geo-applications. We
encourage the community to join forces by sharing their code and data.
Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine
Narrative and computational text analysis in business and economic history
Recent calls from within economics for increased attention to narrative open the door to possible cross-fertilisation between economics and more humanistically oriented business and economic history. Indeed, arguments for economists to take narratives seriously and incorporate them into economic theory have some similarities with classic calls for a revival of narrative in history and abandonment of 'scientific' history. Both share an approach to explaining social phenomena based on the micro-level. This article examines how new methods in computational text analysis can be employed to further the goals of prioritising narrative in economics and history but also challenge a focus on the micro-level. Through a survey of the most frequently used tools of computational text analysis and an overview of their uses to date across the social sciences and humanities, this article shows how such methods can provide economic and business historians tools to respond to and engage with the 'narrative turn' in economics while also building on and offering a macro-level corrective to the focus on narrative in history.
Discriminative Topic Mining via Category-Name Guided Text Embedding
Mining a set of meaningful and distinctive topics automatically from massive
text corpora has broad applications. Existing topic models, however, typically
work in a purely unsupervised way and often generate topics that do not fit
users' particular needs and yield suboptimal performance on downstream tasks.
We propose a new task, discriminative topic mining, which leverages a set of
user-provided category names to mine discriminative topics from text corpora.
This new task not only helps a user understand clearly and distinctively the
topics he/she is most interested in, but also directly benefits keyword-driven
classification tasks. We develop CatE, a novel category-name guided text
embedding method for discriminative topic mining, which effectively leverages
minimal user guidance to learn a discriminative embedding space and discover
category representative terms in an iterative manner. We conduct a
comprehensive set of experiments to show that CatE mines a high-quality set of
topics guided by category names only, and benefits a variety of downstream
applications including weakly-supervised classification and lexical entailment
direction identification.
Comment: WWW 2020. (Code: https://github.com/yumeng5/CatE)
Politische Maschinen: Maschinelles Lernen für das Verständnis von sozialen Maschinen (Political Machines: Machine Learning for Understanding Social Machines)
This thesis investigates human-algorithm interactions in sociotechnological ecosystems. Specifically, it applies machine learning and statistical methods to uncover political dimensions of algorithmic influence in social media platforms and automated decision-making systems. Based on the results, the study discusses the legal, political and ethical consequences of algorithmic implementations.
The expansion of isms, 1820-1917: Data-driven analysis of political language in digitized newspaper collections
Words with the suffix -ism are reductionist terms that help us navigate complex social issues by using a simple one-word label for them. On the one hand, they are often associated with political ideologies, but on the other they are present in many other domains of language, especially culture, science, and religion. This has not always been the case. This paper studies isms in a historical record of digitized newspapers published from 1820 to 1917 in Finland to find out how the language of isms developed historically. We use diachronic word embeddings and affinity propagation clustering to trace how new isms entered the lexicon and how they relate to one another over time. We are able to show how they became more common and entered more and more domains. Still, the uses of isms as traditions for political action and thinking stand out in our analysis.