Attention and Anticipation in Fast Visual-Inertial Navigation
We study a Visual-Inertial Navigation (VIN) problem in which a robot needs to
estimate its state using an on-board camera and an inertial sensor, without any
prior knowledge of the external environment. We consider the case in which the
robot can allocate limited resources to VIN, due to tight computational
constraints. Therefore, we answer the following question: under limited
resources, what are the most relevant visual cues to maximize the performance
of visual-inertial navigation? Our approach has four key ingredients. First, it
is task-driven, in that the selection of the visual cues is guided by a metric
quantifying the VIN performance. Second, it exploits the notion of
anticipation, since it uses a simplified model for forward-simulation of robot
dynamics, predicting the utility of a set of visual cues over a future time
horizon. Third, it is efficient and easy to implement, since it leads to a
greedy algorithm for the selection of the most relevant visual cues. Fourth, it
provides formal performance guarantees: we leverage submodularity to prove that
the greedy selection cannot be far from the optimal (combinatorial) selection.
Simulations and real experiments on agile drones show that our approach ensures
state-of-the-art VIN performance while maintaining a lean processing time. In
the easy scenarios, our approach outperforms appearance-based feature selection
in terms of localization errors. In the most challenging scenarios, it enables
accurate visual-inertial navigation while appearance-based feature selection
fails to track the robot's motion during aggressive maneuvers.
Comment: 20 pages, 7 figures, 2 tables
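The greedy selection mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the log-determinant objective, the rank-one information model for each visual cue, and the names `logdet_gain` and `greedy_select` are assumptions chosen only because log-det of an information matrix is a standard monotone submodular stand-in for an estimation-quality metric.

```python
import numpy as np

def logdet_gain(info, a):
    # Marginal increase in log det(info) from adding the rank-one term a a^T
    # (matrix determinant lemma): log(1 + a^T info^{-1} a).
    return np.log1p(a @ np.linalg.solve(info, a))

def greedy_select(features, k, prior_info):
    # features: (n, d) array, one hypothetical information vector per visual cue.
    # prior_info: (d, d) positive-definite prior information matrix (assumed).
    features = np.asarray(features, dtype=float)
    info = np.array(prior_info, dtype=float)
    remaining = list(range(len(features)))
    selected = []
    for _ in range(min(k, len(features))):
        # Pick the cue with the largest marginal gain given what is already chosen.
        best = max(remaining, key=lambda i: logdet_gain(info, features[i]))
        selected.append(best)
        remaining.remove(best)
        info += np.outer(features[best], features[best])
    return selected, info
```

For a monotone submodular objective under a cardinality constraint, the classical guarantee is that this kind of greedy loop reaches at least a (1 - 1/e) fraction of the optimal value; whether that exact bound applies to the metrics used in the paper is not stated in the abstract.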
Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) are a novel class of deep generative
models that has recently gained significant attention. GANs learn complex and
high-dimensional distributions implicitly over images, audio, and data.
However, there exist major challenges in training GANs, i.e., mode
collapse, non-convergence and instability, due to inappropriate design of
network architecture, use of objective function and selection of optimization
algorithm. Recently, to address these challenges, several solutions for better
design and optimization of GANs have been investigated based on techniques of
re-engineered network architectures, new objective functions and alternative
optimization algorithms. To the best of our knowledge, there is no existing
survey that has particularly focused on broad and systematic developments of
these solutions. In this study, we perform a comprehensive survey of the
advancements in GAN design and optimization solutions proposed to handle GAN
challenges. We first identify key research issues within each design and
optimization technique and then propose a new taxonomy to structure solutions
by key research issues. In accordance with the taxonomy, we provide a detailed
discussion of the different GAN variants proposed within each solution and their
relationships. Finally, based on the insights gained, we present the promising
research directions in this rapidly growing field.
Comment: 42 pages, Figure 13, Table
Large-Margin Determinantal Point Processes
Determinantal point processes (DPPs) offer a powerful approach to modeling
diversity in many applications where the goal is to select a diverse subset. We
study the problem of learning the parameters (the kernel matrix) of a DPP from
labeled training data. We make two contributions. First, we show how to
reparameterize a DPP's kernel matrix with multiple kernel functions, thus
enhancing modeling flexibility. Second, we propose a novel parameter estimation
technique based on the principle of large margin separation. In contrast to the
state-of-the-art method of maximum likelihood estimation, our large-margin loss
function explicitly models errors in selecting the target subsets, and it can
be customized to trade off different types of errors (precision vs. recall).
Extensive empirical studies validate our contributions, including applications
to challenging document and video summarization, where flexibility in modeling
the kernel matrix and balancing different errors is indispensable.
Comment: 15 pages
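To make the role of the kernel matrix concrete, the sketch below evaluates the standard L-ensemble form of a DPP, where a subset Y has probability det(L_Y) / det(L + I). It only shows how a kernel scores subsets; it does not implement the multiple-kernel reparameterization or the large-margin training described in the abstract, and the function name `dpp_log_prob` is hypothetical.

```python
import numpy as np

def dpp_log_prob(L, subset):
    # Log-probability of `subset` under an L-ensemble DPP with kernel L:
    # log det(L_Y) - log det(L + I).
    L = np.asarray(L, dtype=float)
    idx = list(subset)
    if idx:
        # Principal submatrix indexed by the chosen items.
        _, logdet_sub = np.linalg.slogdet(L[np.ix_(idx, idx)])
    else:
        # The determinant of the empty submatrix is 1 by convention.
        logdet_sub = 0.0
    _, logdet_norm = np.linalg.slogdet(L + np.eye(len(L)))
    return logdet_sub - logdet_norm
```

With a kernel such as `[[1.0, 0.9], [0.9, 1.0]]`, the two items are highly similar, so selecting both together is less probable than selecting either one alone, which is the diversity effect the abstract refers to.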