Robust semantic analysis for adaptive speech interfaces
The DUMAS project develops speech-based applications that are adaptable to different users and domains. The paper describes the project's robust semantic analysis strategy, which is used both in the generic framework for developing multilingual speech-based dialogue systems (the main project goal) and in the initial test application, a mobile phone-based e-mail interface.
From Word to Sense Embeddings: A Survey on Vector Representations of Meaning
Over the past years, distributed semantic representations have proved to be
effective and flexible keepers of prior knowledge to be integrated into
downstream applications. This survey focuses on the representation of meaning.
We start from the theoretical background behind word vector space models and
highlight one of their major limitations: the meaning conflation deficiency,
which arises from representing a word with all its possible meanings as a
single vector. Then, we explain how this deficiency can be addressed through a
transition from the word level to the more fine-grained level of word senses
(in its broader acceptation) as a method for modelling unambiguous lexical
meaning. We present a comprehensive overview of the wide range of techniques in
the two main branches of sense representation, i.e., unsupervised and
knowledge-based. Finally, this survey covers the main evaluation procedures and
applications for this type of representation, and provides an analysis of four
of its important aspects: interpretability, sense granularity, adaptability to
different domains and compositionality.
Comment: 46 pages, 8 figures. Published in Journal of Artificial Intelligence
Researc
General Purpose Textual Sentiment Analysis and Emotion Detection Tools
Textual sentiment analysis and emotion detection consist of retrieving the
sentiment or emotion carried by a text or document. This task can be useful in
many domains: opinion mining, prediction, feedback, etc. However, building a
general-purpose tool for sentiment analysis and emotion detection raises
a number of issues: theoretical issues, such as dependence on the domain or
the language, but also practical issues, such as emotion representation for
interoperability. In this paper we present our sentiment/emotion analysis
tools, the way we propose to circumvent the difficulties, and the applications
they are used for.
Comment: Workshop on Emotion and Computing (2013
On Monitoring Language Change with the Support of Corpus Processing
One of the fundamental characteristics of language is that it can change over time. One
method of monitoring the change is to observe its corpora, which are structured language
documentation. Recent developments in technology, especially in the field of Natural
Language Processing, allow robust linguistic processing, which supports the description of
diverse historical changes in the corpora. The intervention of human linguists is inevitable, as
they determine the gold standard, but computer assistance provides considerable support by
incorporating a computational approach to exploring the corpora, especially historical
corpora. This paper proposes a model for corpus development in which corpora are annotated
to support further computational operations such as lexicogrammatical pattern matching,
automatic retrieval, and extraction. The corpus processing operations are performed by local
grammar-based corpus processing software on a contemporary Indonesian corpus. This
paper concludes that data collection and data processing in a corpus are equally crucial
for monitoring language change, and neither can be set aside.
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
Riveter: Measuring Power and Social Dynamics Between Entities
Riveter provides a complete easy-to-use pipeline for analyzing verb
connotations associated with entities in text corpora. We prepopulate the
package with connotation frames of sentiment, power, and agency, which have
demonstrated usefulness for capturing social phenomena, such as gender bias, in
a broad range of corpora. For decades, lexical frameworks have been
foundational tools in computational social science, digital humanities, and
natural language processing, facilitating multifaceted analysis of text
corpora. But working with verb-centric lexica specifically requires natural
language processing skills, reducing their accessibility to other researchers.
By organizing the language processing pipeline, providing complete lexicon
scores and visualizations for all entities in a corpus, and providing
functionality for users to target specific research questions, Riveter greatly
improves the accessibility of verb lexica and can facilitate a broad range of
future research.