3,509 research outputs found
Filling Knowledge Gaps in a Broad-Coverage Machine Translation System
Knowledge-based machine translation (KBMT) techniques yield high quality in
domains with detailed semantic models, limited vocabulary, and controlled input
grammar. Scaling up along these dimensions means acquiring large knowledge
resources. It also means behaving reasonably when definitive knowledge is not
yet available. This paper describes how we can fill various KBMT knowledge
gaps, often using robust statistical techniques. We describe quantitative and
qualitative results from JAPANGLOSS, a broad-coverage Japanese-English MT
system.
Comment: 7 pages, Compressed and uuencoded postscript. To appear: IJCAI-9
Computational Models for the Automatic Learning and Recognition of Irish Sign Language
This thesis presents a framework for the automatic recognition of Sign Language
sentences. In previous sign language recognition work, the issues of
user independent recognition, movement epenthesis modeling and automatic
or weakly supervised training have not been fully addressed in a single recognition
framework. This work presents three main contributions in order to
address these issues.
The first contribution is a technique for user independent hand posture
recognition. We present a novel eigenspace Size Function feature which is
implemented to perform user independent recognition of sign language hand
postures.
The second contribution is a framework for the classification and spotting
of spatiotemporal gestures which appear in sign language. We propose a
Gesture Threshold Hidden Markov Model (GT-HMM) to classify gestures
and to identify movement epenthesis without the need for explicit epenthesis
training.
The third contribution is a framework to train the hand posture and spatiotemporal
models using only the weak supervision of sign language videos
and their corresponding text translations. This is achieved through our proposed
Multiple Instance Learning Density Matrix algorithm which automatically
extracts isolated signs from full sentences using the weak and noisy
supervision of text translations. The automatically extracted isolated samples
are then utilised to train our spatiotemporal gesture and hand posture
classifiers.
The work we present in this thesis is an important and significant contribution
to the area of natural sign language recognition as we propose a
robust framework for training a recognition system without the need for
manual labeling.
Evaluation in natural language processing
European Summer School on Language Logic and Information (ESSLLI 2007), Trinity College Dublin, Ireland, 6-17 August 2007
Online Deception Detection Refueled by Real World Data Collection
The lack of large realistic datasets presents a bottleneck in online
deception detection studies. In this paper, we apply a data collection method
based on social network analysis to quickly identify high-quality deceptive and
truthful online reviews from Amazon. The dataset contains more than 10,000
deceptive reviews and is diverse in product domains and reviewers. Using this
dataset, we explore effective general features for online deception detection
that perform well across domains. We demonstrate that with generalized features
- advertising speak and writing complexity scores - deception detection
performance can be further improved by adding additional deceptive reviews from
assorted domains in training. Finally, reviewer level evaluation gives an
interesting insight into different deceptive reviewers' writing styles.
Comment: 10 pages, Accepted to Recent Advances in Natural Language Processing
(RANLP) 201
The relationship between reading comprehension, working memory and language in children with cochlear implants
Working memory, language, and reading comprehension are strongly associated in children with severe and profound hearing impairment treated by cochlear implants (CI). In this study we explore this relationship in sixteen Swedish children with CI. We found that over 60% of the children with CI performed at the level of their hearing peers in a reading comprehension test. Demographic factors were not predictive of reading comprehension, but a complex working memory task was. Reading percentile was significantly correlated to the working memory test, but no other correlations between reading and cognitive/linguistic factors remained significant after age was factored out. Individual results from a comparison of the two best and the two poorest readers corroborate group results, confirming the important role of working memory for reading as measured by comprehension of words and sentences in this group of children.
Spoken content retrieval: A survey of techniques and technologies
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.