76,721 research outputs found
Robust semantic analysis for adaptive speech interfaces
The DUMAS project develops speech-based applications that are adaptable to different users and domains. The paper describes the project's robust semantic analysis strategy, used both in the generic framework for the development of multilingual speech-based dialogue systems which is the main project goal, and in the initial test application, a mobile phone-based e-mail interface
Robust Processing of Natural Language
Previous approaches to robustness in natural language processing usually
treat deviant input by relaxing grammatical constraints whenever a successful
analysis cannot be provided by ``normal'' means. This schema implies, that
error detection always comes prior to error handling, a behaviour which hardly
can compete with its human model, where many erroneous situations are treated
without even noticing them.
The paper analyses the necessary preconditions for achieving a higher degree
of robustness in natural language processing and suggests a quite different
approach based on a procedure for structural disambiguation. It not only offers
the possibility to cope with robustness issues in a more natural way but
eventually might be suited to accommodate quite different aspects of robust
behaviour within a single framework.Comment: 16 pages, LaTeX, uses pstricks.sty, pstricks.tex, pstricks.pro,
pst-node.sty, pst-node.tex, pst-node.pro. To appear in: Proc. KI-95, 19th
German Conference on Artificial Intelligence, Bielefeld (Germany), Lecture
Notes in Computer Science, Springer 199
Towards Deeply Unified Depth-aware Panoptic Segmentation with Bi-directional Guidance Learning
Depth-aware panoptic segmentation is an emerging topic in computer vision
which combines semantic and geometric understanding for more robust scene
interpretation. Recent works pursue unified frameworks to tackle this challenge
but mostly still treat it as two individual learning tasks, which limits their
potential for exploring cross-domain information. We propose a deeply unified
framework for depth-aware panoptic segmentation, which performs joint
segmentation and depth estimation both in a per-segment manner with identical
object queries. To narrow the gap between the two tasks, we further design a
geometric query enhancement method, which is able to integrate scene geometry
into object queries using latent representations. In addition, we propose a
bi-directional guidance learning approach to facilitate cross-task feature
learning by taking advantage of their mutual relations. Our method sets the new
state of the art for depth-aware panoptic segmentation on both Cityscapes-DVPS
and SemKITTI-DVPS datasets. Moreover, our guidance learning approach is shown
to deliver performance improvement even under incomplete supervision labels.Comment: to be published in ICCV 202
SCREEN: Learning a Flat Syntactic and Semantic Spoken Language Analysis Using Artificial Neural Networks
In this paper, we describe a so-called screening approach for learning robust
processing of spontaneously spoken language. A screening approach is a flat
analysis which uses shallow sequences of category representations for analyzing
an utterance at various syntactic, semantic and dialog levels. Rather than
using a deeply structured symbolic analysis, we use a flat connectionist
analysis. This screening approach aims at supporting speech and language
processing by using (1) data-driven learning and (2) robustness of
connectionist networks. In order to test this approach, we have developed the
SCREEN system which is based on this new robust, learned and flat analysis.
In this paper, we focus on a detailed description of SCREEN's architecture,
the flat syntactic and semantic analysis, the interaction with a speech
recognizer, and a detailed evaluation analysis of the robustness under the
influence of noisy or incomplete input. The main result of this paper is that
flat representations allow more robust processing of spontaneous spoken
language than deeply structured representations. In particular, we show how the
fault-tolerance and learning capability of connectionist networks can support a
flat analysis for providing more robust spoken-language processing within an
overall hybrid symbolic/connectionist framework.Comment: 51 pages, Postscript. To be published in Journal of Artificial
Intelligence Research 6(1), 199
The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping
Many tasks performed by autonomous vehicles such as road marking detection,
object tracking, and path planning are simpler in bird's-eye view. Hence,
Inverse Perspective Mapping (IPM) is often applied to remove the perspective
effect from a vehicle's front-facing camera and to remap its images into a 2D
domain, resulting in a top-down view. Unfortunately, however, this leads to
unnatural blurring and stretching of objects at further distance, due to the
resolution of the camera, limiting applicability. In this paper, we present an
adversarial learning approach for generating a significantly improved IPM from
a single camera image in real time. The generated bird's-eye-view images
contain sharper features (e.g. road markings) and a more homogeneous
illumination, while (dynamic) objects are automatically removed from the scene,
thus revealing the underlying road layout in an improved fashion. We
demonstrate our framework using real-world data from the Oxford RobotCar
Dataset and show that scene understanding tasks directly benefit from our
boosted IPM approach.Comment: equal contribution of first two authors, 8 full pages, 6 figures,
accepted at IV 201
WordRank: Learning Word Embeddings via Robust Ranking
Embedding words in a vector space has gained a lot of attention in recent
years. While state-of-the-art methods provide efficient computation of word
similarities via a low-dimensional matrix embedding, their motivation is often
left unclear. In this paper, we argue that word embedding can be naturally
viewed as a ranking problem due to the ranking nature of the evaluation
metrics. Then, based on this insight, we propose a novel framework WordRank
that efficiently estimates word representations via robust ranking, in which
the attention mechanism and robustness to noise are readily achieved via the
DCG-like ranking losses. The performance of WordRank is measured in word
similarity and word analogy benchmarks, and the results are compared to the
state-of-the-art word embedding techniques. Our algorithm is very competitive
to the state-of-the- arts on large corpora, while outperforms them by a
significant margin when the training set is limited (i.e., sparse and noisy).
With 17 million tokens, WordRank performs almost as well as existing methods
using 7.2 billion tokens on a popular word similarity benchmark. Our multi-node
distributed implementation of WordRank is publicly available for general usage.Comment: Conference on Empirical Methods in Natural Language Processing
(EMNLP), November 1-5, 2016, Austin, Texas, US
- …