51,374 research outputs found
Multimodal person recognition for human-vehicle interaction
Next-generation vehicles will undoubtedly feature biometric person recognition as part of an effort to improve the driving experience. Today's technology prevents such systems from operating satisfactorily under adverse conditions. A proposed framework for achieving person recognition successfully combines different biometric modalities, borne out in two case studies
Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations
Dialog act (DA) recognition is a task that has been widely explored over the
years. Recently, most approaches to the task explored different DNN
architectures to combine the representations of the words in a segment and
generate a segment representation that provides cues for intention. In this
study, we explore means to generate more informative segment representations,
not only by exploring different network architectures, but also by considering
different token representations, not only at the word level, but also at the
character and functional levels. At the word level, in addition to the commonly
used uncontextualized embeddings, we explore the use of contextualized
representations, which provide information concerning word sense and segment
structure. Character-level tokenization is important to capture
intention-related morphological aspects that cannot be captured at the word
level. Finally, the functional level provides an abstraction from words, which
shifts the focus to the structure of the segment. We also explore approaches to
enrich the segment representation with context information from the history of
the dialog, both in terms of the classifications of the surrounding segments
and the turn-taking history. This kind of information has already been proved
important for the disambiguation of DAs in previous studies. Nevertheless, we
are able to capture additional information by considering a summary of the
dialog history and a wider turn-taking context. By combining the best
approaches at each step, we achieve results that surpass the previous
state-of-the-art on generic DA recognition on both SwDA and MRDA, two of the
most widely explored corpora for the task. Furthermore, by considering both
past and future context, simulating annotation scenario, our approach achieves
a performance similar to that of a human annotator on SwDA and surpasses it on
MRDA.Comment: 38 pages, 7 figures, 9 tables, submitted to JAI
Writer Identification Using Inexpensive Signal Processing Techniques
We propose to use novel and classical audio and text signal-processing and
otherwise techniques for "inexpensive" fast writer identification tasks of
scanned hand-written documents "visually". The "inexpensive" refers to the
efficiency of the identification process in terms of CPU cycles while
preserving decent accuracy for preliminary identification. This is a
comparative study of multiple algorithm combinations in a pattern recognition
pipeline implemented in Java around an open-source Modular Audio Recognition
Framework (MARF) that can do a lot more beyond audio. We present our
preliminary experimental findings in such an identification task. We simulate
"visual" identification by "looking" at the hand-written document as a whole
rather than trying to extract fine-grained features out of it prior
classification.Comment: 9 pages; 1 figure; presented at CISSE'09 at
http://conference.cisse2009.org/proceedings.aspx ; includes the the
application source code; based on MARF described in arXiv:0905.123
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in
conversational speech, i.e., speech-act-like units such as Statement, Question,
Backchannel, Agreement, Disagreement, and Apology. Our model detects and
predicts dialogue acts based on lexical, collocational, and prosodic cues, as
well as on the discourse coherence of the dialogue act sequence. The dialogue
model is based on treating the discourse structure of a conversation as a
hidden Markov model and the individual dialogue acts as observations emanating
from the model states. Constraints on the likely sequence of dialogue acts are
modeled via a dialogue act n-gram. The statistical dialogue grammar is combined
with word n-grams, decision trees, and neural networks modeling the
idiosyncratic lexical and prosodic manifestations of each dialogue act. We
develop a probabilistic integration of speech recognition with dialogue
modeling, to improve both speech recognition and dialogue act classification
accuracy. Models are trained and evaluated using a large hand-labeled database
of 1,155 conversations from the Switchboard corpus of spontaneous
human-to-human telephone speech. We achieved good dialogue act labeling
accuracy (65% based on errorful, automatically recognized words and prosody,
and 71% based on word transcripts, compared to a chance baseline accuracy of
35% and human accuracy of 84%) and a small reduction in word recognition error.Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling
changed
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
- …