Towards cross-lingual alerting for bursty epidemic events
Background: Online news reports are increasingly becoming a source for
event-based early warning systems that detect natural disasters. Harnessing the
massive volume of information available from multilingual newswire presents as
many challenges as opportunities due to the patterns of reporting complex
spatiotemporal events. Results: In this article we study the problem of
utilising correlated event reports across languages. We track the evolution of
16 disease outbreaks using 5 temporal aberration detection algorithms on
text-mined events classified according to disease and outbreak country. Using
ProMED reports as a silver standard, comparative analysis of news data for 13
languages over a 129-day trial period showed improved sensitivity, F1 and
timeliness across most models using cross-lingual events. We report a detailed
case study analysis for Cholera in Angola 2010 which highlights the challenges
faced in correlating news events with the silver standard. Conclusions: The
results show that automated health surveillance using multilingual text mining
has the potential to turn low-value news into high-value alerts if informed
choices are used to govern the selection of models and data sources. An
implementation of the C2 alerting algorithm using multilingual news is
available at the BioCaster portal http://born.nii.ac.jp/?page=globalroundup
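The abstract above names the C2 alerting algorithm without describing it. As a rough illustration only, here is a minimal sketch of a C2-style detector in the form commonly described for the CDC EARS family: a 7-day baseline, a 2-day guard band, and an alert when the standardized count exceeds 3. These parameter values, the `min_sigma` floor, and the function name `c2_alerts` are assumptions for illustration and are not taken from the paper or the BioCaster implementation.

```python
from statistics import mean, stdev


def c2_alerts(counts, baseline=7, guard=2, threshold=3.0, min_sigma=0.5):
    """Return the indices of days flagged by a C2-style detector.

    counts    -- daily event counts for one disease/country pair
    baseline  -- number of days in the moving baseline (assumed: 7)
    guard     -- days skipped between baseline and test day (assumed: 2)
    threshold -- alert when the standardized score exceeds this (assumed: 3)
    min_sigma -- floor on the standard deviation, an assumption made here
                 to avoid division by zero on a flat baseline
    """
    alerts = []
    for t in range(baseline + guard, len(counts)):
        # Baseline window ends `guard` days before the test day t.
        window = counts[t - guard - baseline : t - guard]
        mu = mean(window)
        sigma = max(stdev(window), min_sigma)
        score = (counts[t] - mu) / sigma
        if score > threshold:
            alerts.append(t)
    return alerts
```

For example, `c2_alerts([1, 2, 1, 2, 1, 2, 1, 1, 1, 10])` tests only the last day (index 9) against the first seven days and flags it, while a constant series produces no alerts.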
Prosodic Event Recognition using Convolutional Neural Networks with Context Information
This paper demonstrates the potential of convolutional neural networks (CNN)
for detecting and classifying prosodic events on words, specifically pitch
accents and phrase boundary tones, from frame-based acoustic features. Typical
approaches use not only feature representations of the word in question but
also its surrounding context. We show that adding position features indicating
the current word benefits the CNN. In addition, this paper discusses the
generalization from a speaker-dependent modelling approach to a
speaker-independent setup. The proposed method is simple and efficient and
yields strong results not only in speaker-dependent but also
speaker-independent cases.Comment: Interspeech 2017 4 pages, 1 figur
Identity and Granularity of Events in Text
In this paper we describe a method to detect event descriptions in
different news articles and to model the semantics of events and their
components using RDF representations. We compare these descriptions to solve a
cross-document event coreference task. Our component approach to event
semantics defines identity and granularity of events at different levels. It
performs close to state-of-the-art approaches on the cross-document event
coreference task, while outperforming other works when assuming similar quality
of event detection. We demonstrate how granularity and identity are
interconnected and we discuss how semantic anomaly could be used to define
differences between coreference, subevent and topical relations. Comment: Invited keynote speech by Piek Vossen at Cicling 201
Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation
Existing approaches to automatic VerbNet-style verb classification are
heavily dependent on feature engineering and therefore limited to languages
with mature NLP pipelines. In this work, we propose a novel cross-lingual
transfer method for inducing VerbNets for multiple languages. To the best of
our knowledge, this is the first study which demonstrates how the architectures
for learning word embeddings can be applied to this challenging
syntactic-semantic task. Our method uses cross-lingual translation pairs to tie
each of the six target languages into a bilingual vector space with English,
jointly specialising the representations to encode the relational information
from English VerbNet. A standard clustering algorithm is then run on top of the
VerbNet-specialised representations, using vector dimensions as features for
learning verb classes. Our results show that the proposed cross-lingual
transfer approach sets new state-of-the-art verb classification performance
across all six target languages explored in this work. Comment: EMNLP 2017 (long paper)
PersoNER: Persian named-entity recognition
© 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent difficulty of training an effective NER pipeline. To bridge this gap, in this paper we target the Persian language, which is spoken by over a hundred million people worldwide. We first present and provide ArmanPersoNERCorpus, the first manually annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNLL scores while outperforming two alternatives based on a CRF and a recurrent neural network.
Exploring Metaphorical Senses and Word Representations for Identifying Metonyms
A metonym is a word with a figurative meaning, similar to a metaphor. Because
metonyms are closely related to metaphors, we apply features that are used
successfully for metaphor recognition to the task of detecting metonyms. On the
ACL SemEval 2007 Task 8 data with gold standard metonym annotations, our system
achieved 86.45% accuracy on the location metonyms. Our code can be found on
GitHub. Comment: 9 pages, 8 pages content