54 research outputs found
IALE: Imitating Active Learner Ensembles
Active learning (AL) prioritizes the labeling of the most informative data
samples. However, the performance of AL heuristics depends on the structure of
the underlying classifier model and the data. We propose an imitation learning
scheme that imitates the selection of the best expert heuristic at each stage
of the AL cycle in a batch-mode pool-based setting. We use DAGGER to train the
policy on a dataset and later apply it to datasets from similar domains. With
multiple AL heuristics as experts, the policy is able to reflect the choices of
the best AL heuristics given the current state of the AL process. Our
experiments on well-known datasets show that we outperform both state-of-the-art
imitation learners and heuristics.
Comment: 17 pages
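As a rough, hypothetical illustration of the batch-mode imitation scheme described above (not the paper's implementation): a regression policy is trained DAGGER-style on states visited by its own selections, with an aggregate of expert AL heuristic scores as the target. The state features, the two expert heuristics, the mean aggregation, and the scikit-learn models are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, SGDRegressor

def uncertainty_score(model, pool_X):
    # Expert heuristic 1: least-confidence uncertainty sampling.
    return 1.0 - model.predict_proba(pool_X).max(axis=1)

def margin_score(model, pool_X):
    # Expert heuristic 2: negative margin between the top-2 class probabilities.
    p = np.sort(model.predict_proba(pool_X), axis=1)
    return -(p[:, -1] - p[:, -2])

def state_features(model, pool_X, n_labeled):
    # Per-sample state: class probabilities plus the current labeled-set size.
    proba = model.predict_proba(pool_X)
    return np.column_stack([proba, np.full((len(pool_X), 1), n_labeled)])

def dagger_round(policy, experts, model, pool_X, n_labeled, batch_size=8):
    # The policy proposes the batch (the states we visit) ...
    s = state_features(model, pool_X, n_labeled)
    picked = np.argsort(-policy.predict(s))[:batch_size]
    # ... and the expert scores supply the regression target for exactly those
    # states (DAGGER aggregates policy-visited states with expert targets).
    expert_target = np.mean([e(model, pool_X) for e in experts], axis=0)
    policy.partial_fit(s[picked], expert_target[picked])
    return picked  # indices to send to the labeling oracle

# Toy usage: warm-start the policy once, then run imitation rounds.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 4)), rng.integers(0, 2, size=200)
model = LogisticRegression().fit(X[:20], y[:20])
policy = SGDRegressor().fit(state_features(model, X, 20), uncertainty_score(model, X))
picked = dagger_round(policy, [uncertainty_score, margin_score], model, X, n_labeled=20)
```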
ViPR: Visual-Odometry-aided Pose Regression for 6DoF Camera Localization
Visual Odometry (VO) accumulates a positional drift in long-term robot
navigation tasks. Although Convolutional Neural Networks (CNNs) improve VO in
various aspects, VO still suffers from moving obstacles, discontinuous
observation of features, and poor textures or visual information. While recent
approaches estimate a 6DoF pose either directly from (a series of) images or by
merging depth maps with optical flow (OF), research that combines absolute pose
regression with OF is limited. We propose ViPR, a novel modular architecture
for long-term 6DoF VO that leverages temporal information and synergies between
absolute pose estimates (from PoseNet-like modules) and relative pose estimates
(from FlowNet-based modules) by combining both through recurrent layers.
Experiments on known datasets and on our own Industry dataset show that our
modular design outperforms the state of the art in long-term navigation tasks.
Comment: Conf. on Computer Vision and Pattern Recognition (CVPR): Joint Workshop on Long-Term Visual Localization, Visual Odometry and Geometric and Learning-based SLAM 2020
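A minimal sketch of the fusion idea described above, assuming a two-layer LSTM over concatenated absolute and relative pose features; layer sizes and pose parameterizations are illustrative assumptions, not ViPR's actual architecture.

```python
import torch
import torch.nn as nn

class RecurrentPoseFusion(nn.Module):
    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.abs_head = nn.Linear(7, feat_dim)   # absolute pose: xyz + quaternion
        self.rel_head = nn.Linear(6, feat_dim)   # relative pose: translation + Euler angles
        self.lstm = nn.LSTM(2 * feat_dim, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, 7)          # refined 6DoF pose per time step

    def forward(self, abs_pose_seq, rel_pose_seq):
        # abs_pose_seq: (B, T, 7), rel_pose_seq: (B, T, 6)
        feats = torch.cat([self.abs_head(abs_pose_seq),
                           self.rel_head(rel_pose_seq)], dim=-1)
        hidden_seq, _ = self.lstm(feats)
        return self.out(hidden_seq)

# Toy forward pass on random pose sequences.
model = RecurrentPoseFusion()
refined = model(torch.randn(2, 10, 7), torch.randn(2, 10, 6))
print(refined.shape)  # torch.Size([2, 10, 7])
```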
Fusing Structure from Motion and Simulation-Augmented Pose Regression from Optical Flow for Challenging Indoor Environments
The localization of objects is a crucial task in various applications such as
robotics, virtual and augmented reality, and the transportation of goods in
warehouses. Recent advances in deep learning have enabled localization using
monocular cameras. While structure from motion (SfM) predicts the
absolute pose from a point cloud, absolute pose regression (APR) methods learn
a semantic understanding of the environment through neural networks. However,
both fields face challenges caused by the environment such as motion blur,
lighting changes, repetitive patterns, and feature-less structures. This study
aims to address these challenges by incorporating additional information and
regularizing the absolute pose using relative pose regression (RPR) methods.
The optical flow between consecutive images is computed using the Lucas-Kanade
algorithm, and the relative pose is predicted using an auxiliary small
recurrent convolutional network. The fusion of absolute and relative poses is a
complex task due to the mismatch between the global and local coordinate
systems. State-of-the-art methods fusing absolute and relative poses use pose
graph optimization (PGO) to regularize the absolute pose predictions using
relative poses. In this work, we propose recurrent fusion networks to optimally
align absolute and relative pose predictions to improve the absolute pose
prediction. We evaluate eight different recurrent units and construct a
simulation environment to pre-train the APR and RPR networks for better
generalized training. Additionally, we record a large database of different
scenarios in a challenging large-scale indoor environment that mimics a
warehouse with transportation robots. We conduct hyperparameter searches and
experiments to show the effectiveness of our recurrent fusion method compared
to PGO.
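The relative-motion input mentioned above can be illustrated with a small optical-flow snippet; this is a generic Lucas-Kanade sketch using OpenCV with assumed corner-detection parameters, not the paper's pipeline, and it expects grayscale uint8 frames.

```python
import cv2
import numpy as np

def lucas_kanade_flow(prev_gray, curr_gray, max_corners=200):
    # Track Shi-Tomasi corners from the previous frame into the current frame.
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                 qualityLevel=0.01, minDistance=7)
    if p0 is None:
        return np.empty((0, 2)), np.empty((0, 2))
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    good = status.ravel() == 1
    return p0[good].reshape(-1, 2), p1[good].reshape(-1, 2)

# The per-frame flow vectors (p1 - p0) would then be fed, e.g. as stacked
# features, into a small recurrent convolutional RPR network.
```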
Active Learning of Ordinal Embeddings: A User Study on Football Data
Humans innately measure distance between instances in an unlabeled dataset
using an unknown similarity function. Distance metrics can only serve as a proxy
for similarity when retrieving similar instances. Learning a good
similarity function from human annotations improves the quality of retrievals.
This work uses deep metric learning to learn these user-defined similarity
functions from a few annotations for a large football trajectory dataset. We
combine an entropy-based active learning method with recent triplet-mining
techniques to collect easy-to-answer but still informative annotations from human
participants and use them to train a deep convolutional network that
generalizes to unseen samples. Our user study shows that our approach improves
the quality of the information retrieval compared to a previous deep metric
learning approach that relies on a Siamese network. Specifically, we shed light
on the strengths and weaknesses of passive sampling heuristics and active
learners alike by analyzing the participants' response efficacy. To this end,
we collect accuracy, algorithmic time complexity, the participants' fatigue and
time-to-response, qualitative self-assessment and statements, as well as the
effects of mixed-expertise annotators and their consistency on model
performance and transfer learning.
Comment: 23 pages, 17 figures
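A hedged sketch of entropy-based triplet selection as described above: query annotators about the triplets whose "which candidate is more similar to the anchor?" answer the current embedding is least sure about. The sigmoid probability model and the temperature are illustrative assumptions.

```python
import torch

def triplet_entropy(emb_anchor, emb_a, emb_b, temperature=1.0):
    # Probability that candidate A is closer to the anchor than candidate B,
    # derived from embedding distances; high entropy = informative question.
    d_a = torch.norm(emb_anchor - emb_a, dim=1)
    d_b = torch.norm(emb_anchor - emb_b, dim=1)
    p_a = torch.sigmoid((d_b - d_a) / temperature)
    p = torch.stack([p_a, 1 - p_a], dim=1)
    return -(p * torch.log(p + 1e-9)).sum(dim=1)

# Select the top-k most ambiguous triplets to show to participants first.
emb = torch.randn(100, 32)               # embeddings of trajectory snippets
anchor, a, b = emb[:50], emb[25:75], emb[50:]
scores = triplet_entropy(anchor, a, b)
ask_first = torch.topk(scores, k=10).indices
```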
Velocity-Based Channel Charting with Spatial Distribution Map Matching
Fingerprint-based localization improves the positioning performance in
challenging, non-line-of-sight (NLoS) dominated indoor environments. However,
fingerprinting models require an expensive life-cycle management including
recording and labeling of radio signals for the initial training and again
whenever the environment changes. Alternatively, channel charting avoids this labeling
effort as it implicitly associates relative coordinates to the recorded radio
signals. Then, with reference real-world coordinates (positions) we can use
such charts for positioning tasks. However, current channel-charting approaches
lag behind fingerprinting in their positioning accuracy and still require
reference samples for localization, regular data recording and labeling to keep
the models up to date. Hence, we propose a novel framework that does not
require reference positions. We only require velocity information, e.g., from
pedestrian dead reckoning or odometry, to model the
channel charts, and topological map information, e.g., a building floor plan,
to transform the channel charts into real coordinates. We evaluate our approach
on two different real-world datasets using 5G and distributed
single-input/multiple-output (SIMO) radio systems. Our experiments show
that even with noisy velocity estimates and coarse map information, we achieve
similar position accuracies.
Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
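A minimal sketch of the label-free idea stated above, assuming a simple MLP chart encoder and a loss that ties the distance between consecutive chart points to the displacement implied by velocity estimates (speed times time step); encoder size and loss weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChartEncoder(nn.Module):
    def __init__(self, csi_dim=128, chart_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(csi_dim, 256), nn.ReLU(),
                                 nn.Linear(256, chart_dim))
    def forward(self, csi):
        return self.net(csi)

def velocity_consistency_loss(chart_pts, speeds, dt=0.1):
    # chart_pts: (T, 2) chart positions of one trajectory, speeds: (T-1,) in m/s.
    step = torch.norm(chart_pts[1:] - chart_pts[:-1], dim=1)
    return ((step - speeds * dt) ** 2).mean()

encoder = ChartEncoder()
csi_seq = torch.randn(50, 128)   # one recorded radio trajectory (channel features)
speeds = torch.rand(49)          # e.g. from pedestrian dead reckoning or odometry
loss = velocity_consistency_loss(encoder(csi_seq), speeds)
loss.backward()
```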
Quantum Natural Policy Gradients: Towards Sample-Efficient Reinforcement Learning
Reinforcement learning is a growing field in AI with a lot of potential.
Intelligent behavior is learned automatically through trial and error in
interaction with the environment. However, this learning process is often
costly. Using variational quantum circuits as function approximators can reduce
this cost. To this end, we propose the quantum natural policy
gradient (QNPG) algorithm -- a second-order gradient-based routine that takes
advantage of an efficient approximation of the quantum Fisher information
matrix. We experimentally demonstrate that QNPG outperforms first-order
gradient-based training in contextual-bandit environments in terms of convergence speed and
stability and thereby reduces the sample complexity. Furthermore, we provide
evidence for the practical feasibility of our approach by training on a
12-qubit hardware device.
Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. 7 pages, 5 figures, 1 table
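The second-order update at the core of a natural policy gradient can be illustrated numerically: precondition the vanilla gradient with the (regularized) inverse of a Fisher information matrix. In QNPG that matrix is an efficient approximation of the quantum Fisher information of the variational circuit; the diagonal matrix below is only a placeholder.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.1, damping=1e-3):
    # Regularize for numerical stability, then solve F @ delta = grad.
    f_reg = fisher + damping * np.eye(len(theta))
    delta = np.linalg.solve(f_reg, grad)
    return theta + lr * delta          # gradient ascent on the expected return

theta = np.zeros(4)                       # variational circuit parameters
grad = np.array([0.2, -0.1, 0.05, 0.3])   # policy-gradient estimate
fisher = np.diag([1.0, 0.5, 0.25, 2.0])   # assumed diagonal QFI approximation
theta = natural_gradient_step(theta, grad, fisher)
```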
Quantum Policy Gradient Algorithm with Optimized Action Decoding
Quantum machine learning implemented by variational quantum circuits (VQCs)
is considered a promising concept for the noisy intermediate-scale quantum
computing era. Focusing on applications in quantum reinforcement learning, we
propose a specific action decoding procedure for a quantum policy gradient
approach. We introduce a novel quality measure that enables us to optimize the
classical post-processing required for action selection, inspired by local and
global quantum measurements. The resulting algorithm demonstrates a significant
performance improvement in several benchmark environments. With this technique,
we successfully execute a full training routine on a 5-qubit hardware device.
Our method introduces only negligible classical overhead and has the potential
to improve VQC-based algorithms beyond the field of quantum reinforcement
learning.
Comment: Accepted to the 40th International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii, USA. 22 pages, 10 figures, 3 tables
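The classical post-processing step referred to above can be pictured as follows: a decoding function partitions measured bitstrings into actions, and the action distribution is estimated from measurement counts. The parity-based decoding shown here is one simple "global" choice for illustration, not necessarily the optimized decoding proposed in the paper.

```python
from collections import Counter

def parity_decoding(bitstring: str) -> int:
    # Map an n-qubit outcome to one of two actions via its parity; every qubit
    # influences the chosen action, so this is a global decoding.
    return sum(int(b) for b in bitstring) % 2

def action_distribution(counts: Counter, n_actions: int = 2):
    total = sum(counts.values())
    probs = [0.0] * n_actions
    for bitstring, c in counts.items():
        probs[parity_decoding(bitstring)] += c / total
    return probs

# Example: counts as returned by a (simulated) 5-qubit measurement.
counts = Counter({"00000": 480, "10001": 300, "11111": 220})
print(action_distribution(counts))  # [0.78, 0.22]
```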
How to Learn from Risk: Explicit Risk-Utility Reinforcement Learning for Efficient and Safe Driving Strategies
Autonomous driving has the potential to revolutionize mobility and is hence
an active area of research. In practice, the behavior of autonomous vehicles
must be acceptable, i.e., efficient, safe, and interpretable. While vanilla
reinforcement learning (RL) finds performant behavioral strategies, these are
often unsafe and uninterpretable. Safe RL approaches introduce safety, but they
mostly remain uninterpretable because the learned behavior is jointly optimized
for safety and performance without modeling the two separately. Interpretable
machine learning is rarely applied to RL. This paper proposes SafeDQN, which
makes the behavior of autonomous vehicles safe and interpretable while remaining
efficient. SafeDQN offers an
understandable, semantic trade-off between the expected risk and the utility of
actions while being algorithmically transparent. We show that SafeDQN finds
interpretable and safe driving policies for a variety of scenarios and
demonstrate how state-of-the-art saliency techniques can help to assess both
risk and utility.
Comment: 8 pages, 5 figures
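A hypothetical sketch of the explicit risk-utility trade-off described above: one head estimates expected utility, a second head estimates expected risk, and the action is chosen from a transparent, tunable combination of the two. Architecture sizes and the linear trade-off are illustrative assumptions, not the SafeDQN architecture.

```python
import torch
import torch.nn as nn

class RiskUtilityQNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.utility_head = nn.Linear(hidden, n_actions)  # Q_utility(s, a)
        self.risk_head = nn.Linear(hidden, n_actions)     # Q_risk(s, a)

    def forward(self, obs):
        h = self.backbone(obs)
        return self.utility_head(h), self.risk_head(h)

def select_action(net, obs, risk_weight=0.5):
    # Semantic trade-off: maximize utility minus weighted risk.
    utility, risk = net(obs)
    return torch.argmax(utility - risk_weight * risk, dim=-1)

net = RiskUtilityQNet(obs_dim=8, n_actions=4)
action = select_action(net, torch.randn(1, 8))
```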
Cutting multi-control quantum gates with ZX calculus
Circuit cutting, the decomposition of a quantum circuit into independent
partitions, has become a promising avenue towards experiments with larger
quantum circuits in the noisy-intermediate scale quantum (NISQ) era. While
previous work focused on cutting qubit wires or two-qubit gates, in this work
we introduce a method for cutting multi-controlled Z gates. We construct a
decomposition and prove an upper bound on the associated sampling overhead in
terms of the number of cuts in the circuit. This bound is independent of the
number of control qubits but can be further reduced for the special case of
CCZ gates. Furthermore, we
evaluate our proposal on IBM hardware and experimentally show noise resilience
due to the strong reduction of CNOT gates in the cut circuits.
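As a generic illustration of quasiprobability-based circuit cutting (not the specific multi-controlled-Z decomposition from the paper): a cut gate is replaced by a weighted sum of simpler operations, sub-circuit variants are sampled proportionally to the magnitude of their coefficients, and the results are recombined with signs; the multiplicative sampling overhead per cut scales with the squared one-norm of the coefficients. The coefficient values below are placeholders.

```python
import numpy as np

def sampling_overhead(coefficients, n_cuts):
    kappa = np.sum(np.abs(coefficients))   # one-norm of the decomposition
    return kappa ** (2 * n_cuts)           # multiplicative overhead for n_cuts cuts

def sample_variant(coefficients, rng):
    c = np.asarray(coefficients, dtype=float)
    probs = np.abs(c) / np.abs(c).sum()
    idx = rng.choice(len(c), p=probs)
    # Return the sampled variant index and its signed reweighting factor.
    return idx, np.sign(c[idx]) * np.abs(c).sum()

rng = np.random.default_rng(0)
coeffs = [0.5, 0.5, -0.5, 0.5]               # placeholder decomposition coefficients
print(sampling_overhead(coeffs, n_cuts=2))   # 16.0 for this toy example
idx, weight = sample_variant(coeffs, rng)
```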