Making Sense of Microposts (#Microposts2016) Computational Social Sciences Track
For the second time, the #Microposts workshop features a track to highlight social science perspectives on micro communication structures in online environments. This paper introduces the #Microposts2016 (Computational) Social Science Track, whose contributions all connect research methods and approaches in computer science and social science. By providing a forum for closer interaction between the two fields, the track is becoming a platform for interdisciplinary projects and new ideas to combine different methodologies and theories. In this year's special track we see a trend of relating Microposts to external demographics or survey data as a way to better understand Microposts in their broader contexts.
A Reverse Approach to Named Entity Extraction and Linking in Microposts
In this paper, we present a pipeline for named entity extraction and linking that is designed specifically for noisy, grammatically inconsistent domains where traditional named entity techniques perform poorly. Our approach leverages a large knowledge base to improve entity recognition, while maintaining the use of traditional NER to identify mentions that are not co-referent with any entities in the knowledge base.
FICLONE: Improving DBpedia Spotlight Using Named Entity Recognition and Collective Disambiguation
In this paper we present FICLONE, which aims to improve the performance of DBpedia Spotlight, not only for the task of semantic annotation (SA), but also for the sub-task of named entity disambiguation (NED). To achieve this aim, we first enhance the spotting phase by combining a named entity recognition system (Stanford NER) with the results of DBpedia Spotlight. Second, we improve the disambiguation phase by using coreference resolution and exploiting a lexicon that associates each surface form with a list of potential Wikipedia entities. Finally, to select the correct entity among the candidates found for one mention, FICLONE relies on collective disambiguation, an approach that has proved successful in many other annotators and that takes into consideration the other mentions in the text. Our experiments show that FICLONE not only substantially improves the performance of DBpedia Spotlight for the NED sub-task but also generally outperforms other state-of-the-art systems. For the SA sub-task, FICLONE also outperforms DBpedia Spotlight on the dataset provided by the DBpedia Spotlight team.
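The general idea of collective disambiguation mentioned in this abstract can be illustrated with a toy sketch. This is not FICLONE's algorithm: the scoring rule (sum of per-mention priors plus a bonus of 1.0 for each related entity pair), the function name `collective_disambiguate`, and the example data are all assumptions made for illustration.

```python
from itertools import product


def collective_disambiguate(candidates, related):
    """Pick one entity per mention so that the joint assignment scores highest.

    candidates: dict mapping mention -> dict of {entity: prior score}
    related:    set of frozensets, each holding two entities assumed related
    """
    mentions = list(candidates)
    best_score, best_assignment = float("-inf"), None
    # Brute force over every combination of candidates (fine for a toy example).
    for combo in product(*(candidates[m] for m in mentions)):
        prior = sum(candidates[m][e] for m, e in zip(mentions, combo))
        # Coherence: count entity pairs in this assignment that are related.
        coherence = sum(
            1.0
            for i in range(len(combo))
            for j in range(i + 1, len(combo))
            if frozenset((combo[i], combo[j])) in related
        )
        score = prior + coherence
        if score > best_score:
            best_score, best_assignment = score, dict(zip(mentions, combo))
    return best_assignment


# Hypothetical data: the prior alone favours the animal reading of "Jaguar".
candidates = {
    "Jaguar": {"Jaguar_(animal)": 0.6, "Jaguar_Cars": 0.4},
    "Land Rover": {"Land_Rover": 1.0},
}
related = {frozenset(("Jaguar_Cars", "Land_Rover"))}

result = collective_disambiguate(candidates, related)
```

Here the co-occurring mention "Land Rover" shifts the choice for "Jaguar" toward the car maker, which is the effect collective approaches rely on. Real annotators replace the brute-force search with graph-based or approximate methods.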
Improving Distantly-Supervised Named Entity Recognition with Self-Collaborative Denoising Learning
Distantly supervised named entity recognition (DS-NER) efficiently reduces
labor costs but meanwhile intrinsically suffers from the label noise due to the
strong assumption of distant supervision. Typically, the wrongly labeled
instances contain both incomplete and inaccurate annotations, while
most prior denoising works are only concerned with one kind of noise and fail
to fully explore useful information in the whole training set. To address this
issue, we propose a robust learning paradigm named Self-Collaborative Denoising
Learning (SCDL), which jointly trains two teacher-student networks in a
mutually beneficial manner to iteratively refine noisy labels. Each
network is designed to exploit reliable labels via self denoising, and two
networks communicate with each other to explore unreliable annotations by
collaborative denoising. Extensive experimental results on five real-world
datasets demonstrate that SCDL is superior to state-of-the-art DS-NER denoising
methods. Comment: EMNLP (12 pages, 4 figures, 6 tables)
Distantly-Supervised Named Entity Recognition with Uncertainty-aware Teacher Learning and Student-student Collaborative Learning
Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates
the burden of annotation, but meanwhile suffers from the label noise. Recent
works attempt to adopt the teacher-student framework to gradually refine the
training labels and improve the overall robustness. However, we argue that
these teacher-student methods achieve limited performance because poor network
calibration produces incorrectly pseudo-labeled samples, leading to error
propagation. Therefore, we attempt to mitigate this issue by proposing: (1)
Uncertainty-aware Teacher Learning that leverages the prediction uncertainty to
guide the selection of pseudo-labels, reducing the number of incorrect
pseudo-labels in the self-training stage. (2) Student-student Collaborative
Learning that allows the transfer of reliable labels between two student
networks instead of completely relying on all pseudo-labels from its teacher.
Meanwhile, this approach allows a full exploration of mislabeled samples rather
than simply filtering unreliable pseudo-labeled samples. Extensive experimental
results on five DS-NER datasets demonstrate that our method is superior to
state-of-the-art teacher-student methods.
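As a rough illustration of uncertainty-guided pseudo-label selection, the sketch below keeps a teacher's predicted label only when its predictive entropy is low. The abstract does not say how uncertainty is estimated, so the entropy measure, the `max_entropy` threshold, and the function names are assumptions, not the authors' method.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_pseudo_labels(token_probs, labels, max_entropy=0.5):
    """Keep the teacher's most likely label only where its entropy is low.

    Returns one entry per token: the label if the teacher is confident,
    otherwise None (the token is left without a pseudo-label).
    """
    selected = []
    for probs in token_probs:
        if entropy(probs) <= max_entropy:
            best = max(range(len(probs)), key=probs.__getitem__)
            selected.append(labels[best])
        else:
            selected.append(None)
    return selected


# Hypothetical teacher outputs for three tokens over the tag set below.
labels = ["O", "PER", "LOC"]
teacher_probs = [
    [0.97, 0.02, 0.01],  # confident "O"
    [0.40, 0.35, 0.25],  # uncertain, so no pseudo-label is assigned
    [0.03, 0.95, 0.02],  # confident "PER"
]
pseudo = select_pseudo_labels(teacher_probs, labels)
```

Tokens marked `None` would be excluded from the student's self-training loss in this sketch. The abstract's second component goes further by letting two students exchange reliable labels instead of discarding such tokens.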
Scholarly use of social media and altmetrics : a review of the literature
Social media has become integrated into the fabric of the scholarly communication system in fundamental
ways: principally through scholarly use of social media platforms and the promotion of new indicators on
the basis of interactions with these platforms. Research and scholarship in this area has accelerated since
the coining and subsequent advocacy for altmetrics—that is, research indicators based on social media
activity. This review provides an extensive account of the state of the art in both scholarly use of social
media and altmetrics. The review consists of two main parts: the first examines the use of social media in
academia, examining the various functions these platforms have in the scholarly communication process
and the factors that affect this use. The second part reviews empirical studies of altmetrics, discussing the
various interpretations of altmetrics, data collection and methodological limitations, and differences
according to platform. The review ends with a critical discussion of the implications of this transformation
in the scholarly communication system.
Implementing Gehl’s Theory to Study Urban Space. The Case of Monotowns
The paper presents a method to operationalize Jan Gehl’s questions for public space into metrics to map Russian monotowns’ urban life in 2017. With the use of social media data, it becomes possible to scale Gehl’s approach from the survey of small urban areas to the analysis of entire cities while maintaining the human scale’s resolution. When underperforming public spaces are detected, we propose a matrix for urban design interventions using Jane Jacobs’ typologies for good city life. Furthermore, this method was deployed to improve the conditions of public spaces in Russian monotowns through a series of architectural briefs for design competitions and urban design guidelines for local administrations. Published version; peer reviewed.