DeepKey: Towards End-to-End Physical Key Replication From a Single Photograph
This paper describes DeepKey, an end-to-end deep neural architecture capable
of taking a digital RGB image of an 'everyday' scene containing a pin tumbler
key (e.g. lying on a table or carpet) and fully automatically inferring a
printable 3D key model. We report on the key detection performance and describe
how candidates can be transformed into physical prints. We show an example
opening a real-world lock. Our system is described in detail, providing a
breakdown of all components including key detection, pose normalisation,
bitting segmentation and 3D model inference. We provide an in-depth evaluation
and conclude by reflecting on limitations, applications, potential security
risks and societal impact. We contribute the DeepKey Datasets of 5,300+ images covering a few test keys with bounding boxes, pose and unaligned mask data.
Comment: 14 pages, 12 figures
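To make the pipeline concrete, here is a rough Python sketch that wires dummy versions of the four named stages together; every function body, threshold and the pin count are hypothetical illustrations of mine, not the authors' implementation.

    # Illustrative DeepKey-style pipeline; all stage bodies are dummies.
    import numpy as np

    def detect_key(image):
        # Stage 1: locate the key; dummy whole-image bounding box.
        h, w = image.shape[:2]
        return (0, 0, w, h)

    def normalise_pose(image, box):
        # Stage 2: crop (and, in the real system, rotate) to a canonical pose.
        x0, y0, x1, y1 = box
        return image[y0:y1, x0:x1]

    def segment_bitting(crop):
        # Stage 3: per-pixel mask of the bitting region; dummy threshold.
        grey = crop.mean(axis=-1)
        return grey > grey.mean()

    def infer_bitting_depths(mask, n_pins=5):
        # Stage 4: reduce the mask to one depth code per pin position,
        # which would parameterise a printable 3D key model.
        cols = np.array_split(mask.sum(axis=0), n_pins)
        return [int(c.mean()) for c in cols]

    image = np.random.rand(64, 128, 3)
    box = detect_key(image)
    print(infer_bitting_depths(segment_bitting(normalise_pose(image, box))))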
Online Meta-Learning for Multi-Source and Semi-Supervised Domain Adaptation
Domain adaptation (DA) is the topical problem of adapting models from
labelled source datasets so that they perform well on target datasets where
only unlabelled or partially labelled data is available. Many methods have been
proposed to address this problem through different ways to minimise the domain
shift between source and target datasets. In this paper we take an orthogonal
perspective and propose a framework to further enhance performance by
meta-learning the initial conditions of existing DA algorithms. This is
challenging compared to the more widely considered setting of few-shot
meta-learning, due to the length of the computation graph involved. Therefore
we propose an online shortest-path meta-learning framework that is both
computationally tractable and practically effective for improving DA
performance. We present variants for both multi-source unsupervised domain
adaptation (MSDA), and semi-supervised domain adaptation (SSDA). Importantly,
our approach is agnostic to the base adaptation algorithm, and can be applied
to improve many techniques. Experimentally, we demonstrate improvements on
classic (DANN) and recent (MCD and MME) techniques for MSDA and SSDA, and
ultimately achieve state-of-the-art results on several DA benchmarks, including the largest-scale DomainNet.
Comment: ECCV 2020 CR version
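As a minimal sketch of meta-learning the initial conditions of an adaptation algorithm, the toy below backpropagates through a deliberately short inner adaptation path, in the spirit of the online shortest-path idea; the single linear model and squared-error stand-ins are my assumptions, not the paper's DA objectives.

    # Meta-learn an initialisation theta0 through a short adaptation path.
    import torch

    theta0 = torch.zeros(2, requires_grad=True)      # meta-learned init
    meta_opt = torch.optim.SGD([theta0], lr=0.1)

    def adapt_loss(theta, x, y):                     # stand-in for a DA loss
        return ((x @ theta - y) ** 2).mean()

    for step in range(100):
        xs, ys = torch.randn(8, 2), torch.randn(8)   # "source" batch
        xt, yt = torch.randn(8, 2), torch.randn(8)   # "target-like" batch
        theta = theta0
        for _ in range(3):                           # short inner path only
            g, = torch.autograd.grad(adapt_loss(theta, xs, ys), theta,
                                     create_graph=True)
            theta = theta - 0.05 * g
        meta_opt.zero_grad()
        adapt_loss(theta, xt, yt).backward()         # meta-loss at path end
        meta_opt.step()

Keeping the inner path short is what keeps the computation graph, and hence the meta-gradient cost, tractable.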
Hierarchical Temporal Representation in Linear Reservoir Computing
Recent studies on deep Reservoir Computing (RC) have highlighted the role of layering in deep recurrent neural networks (RNNs). In this paper, the use of linear recurrent units allows us to provide further evidence of the intrinsic hierarchical temporal representation in deep RNNs, through frequency analysis applied to the state signals. The potential of our approach is assessed on
the class of Multiple Superimposed Oscillator tasks. Furthermore, our
investigation provides useful insights to open a discussion on the main aspects
that characterize the deep learning framework in the temporal domain.
Comment: This is a pre-print of the paper submitted to the 27th Italian Workshop on Neural Networks, WIRN 2017
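A minimal sketch of the frequency analysis, under my own choices of sizes and scalings: drive a stack of linear reservoir layers with a superimposed-oscillator input and inspect each layer's state spectrum with the FFT.

    # Deep linear reservoir driven by superimposed oscillators.
    import numpy as np

    rng = np.random.default_rng(0)
    T, n, n_layers = 2000, 50, 3
    t = np.arange(T)
    signal = np.sin(0.2 * t) + np.sin(0.311 * t)     # two oscillators

    for layer in range(n_layers):
        W = rng.normal(size=(n, n))
        W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # stable recurrence
        w_in = 0.1 * rng.normal(size=n)
        x = np.zeros((T, n))
        for k in range(1, T):
            x[k] = W @ x[k - 1] + w_in * signal[k]   # linear state update
        signal = x.mean(axis=1)                      # drive the next layer
        spectrum = np.abs(np.fft.rfft(x, axis=0)).mean(axis=1)
        print(f"layer {layer}: dominant bin {spectrum[1:].argmax() + 1}")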
Learning activation functions from data using cubic spline interpolation
Neural networks require careful design in order to perform properly on a
given task. In particular, selecting a good activation function (possibly in a
data-dependent fashion) is a crucial step, which remains an open problem in the
research community. Despite a large number of investigations, most current
implementations simply select one fixed function from a small set of
candidates, which is not adapted during training, and is shared among all
neurons throughout the different layers. However, neither of these two assumptions can be considered optimal in practice. In this paper, we present a
principled way to achieve data-dependent adaptation of the activation functions,
which is performed independently for each neuron. This is achieved by
leveraging past and present advances in cubic spline interpolation,
allowing for local adaptation of the functions around their regions of use. The
resulting algorithm is relatively cheap to implement, and overfitting is
counterbalanced by the inclusion of a novel damping criterion, which penalizes
unwanted oscillations from a predefined shape. Experimental results validate
the proposal over two well-known benchmarks.
Comment: Submitted to the 27th Italian Workshop on Neural Networks (WIRN 2017)
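To make the mechanism concrete, this sketch parameterises an activation as a cubic spline over fixed knots, with a damping term penalising deviation from a predefined tanh shape; the knot range, knot count and penalty weight are illustrative choices of mine, not the paper's settings.

    # Per-neuron spline activation with a shape-damping penalty.
    import numpy as np
    from scipy.interpolate import CubicSpline

    knots = np.linspace(-2, 2, 11)        # fixed knot locations
    y = np.tanh(knots).copy()             # learnable per-neuron values

    def activation(s):
        return CubicSpline(knots, y)(s)   # locally adapted nonlinearity

    def damping_penalty(lam=0.1):
        # Penalise oscillation away from the predefined reference shape.
        return lam * np.sum((y - np.tanh(knots)) ** 2)

    s = np.linspace(-3, 3, 7)
    print(activation(s), damping_penalty())

In training, the y-values would be updated by gradient descent alongside the network weights, with the damping term added to the task loss.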
Gravitational Dressing of N=2 Sigma-Models Beyond Leading Order
We study the beta-function of the N=2 sigma-model coupled to N=2 induced
supergravity. We compute corrections to first order in the semiclassical limit, beyond one loop in the matter fields. As compared to the corresponding bosonic, metric sigma-model calculation, we find new types of contributions arising from the dilaton coupling, which is automatically accounted for once the K\"ahler potential is coupled to N=2 supergravity.
Comment: latex, 16 pages, 8 figures
Learning to Learn with Variational Information Bottleneck for Domain Generalization
Domain generalization models learn to generalize to previously unseen
domains, but suffer from prediction uncertainty and domain shift. In this
paper, we address both problems. We introduce a probabilistic meta-learning
model for domain generalization, in which classifier parameters shared across
domains are modeled as distributions. This enables better handling of
prediction uncertainty on unseen domains. To deal with domain shift, we learn
domain-invariant representations by the proposed principle of meta variational
information bottleneck, which we call MetaVIB. MetaVIB is derived from novel
variational bounds of mutual information, by leveraging the meta-learning
setting of domain generalization. Through episodic training, MetaVIB learns to
gradually narrow domain gaps to establish domain-invariant representations,
while simultaneously maximizing prediction accuracy. We conduct experiments on
three benchmarks for cross-domain visual recognition. Comprehensive ablation
studies validate the benefits of MetaVIB for domain generalization. The
comparison results demonstrate our method outperforms previous approaches
consistently.
Comment: 15 pages, 4 figures, ECCV 2020
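For orientation, the sketch below shows the generic variational information bottleneck objective (cross-entropy plus a beta-weighted KL to a standard normal prior) that MetaVIB builds on; this is not the paper's meta-learned bound, and all layer sizes and the beta value are arbitrary.

    # Generic VIB loss: task loss + beta * KL(q(z|x) || N(0, I)).
    import torch
    import torch.nn.functional as F

    enc = torch.nn.Linear(10, 2 * 4)                 # outputs mu, log-var
    clf = torch.nn.Linear(4, 3)
    beta = 1e-2

    x, labels = torch.randn(16, 10), torch.randint(0, 3, (16,))
    mu, logvar = enc(x).chunk(2, dim=1)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
    kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(dim=1).mean()
    loss = F.cross_entropy(clf(z), labels) + beta * kl
    loss.backward()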
Towards a Universal Theory of Artificial Intelligence based on Algorithmic Probability and Sequential Decision Theory
Decision theory formally solves the problem of rational agents in uncertain
worlds if the true environmental probability distribution is known.
Solomonoff's theory of universal induction formally solves the problem of
sequence prediction for an unknown distribution. We unify both theories and give strong arguments that the resulting universal AIXI model behaves optimally in any
computable environment. The major drawback of the AIXI model is that it is
uncomputable. To overcome this problem, we construct a modified algorithm
AIXI^tl, which is still superior to any other time t and space l bounded agent.
The computation time of AIXI^tl is of the order t x 2^l.
Comment: 8 two-column pages, latex2e, 1 figure, submitted to ijcai
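The t x 2^l scaling can be made tangible with a toy enumeration of my own construction, not Hutter's actual algorithm: there are 2^l candidate programs of length l, and each is run for at most t steps.

    # Toy budgeted enumeration illustrating order t * 2^l total work.
    from itertools import product

    def run(program, budget):
        # Dummy interpreter: "executes" one bit per step within the budget.
        return sum(program[:budget])

    t, l = 10, 8
    scores = {bits: run(bits, t) for bits in product((0, 1), repeat=l)}
    best = max(scores, key=scores.get)
    print(f"evaluated {len(scores)} programs for <= {t} steps each; best: {best}")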
Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers
This paper improves state-of-the-art visual object trackers that use online
adaptation. Our core contribution is an offline meta-learning-based method to
adjust the initial deep networks used in online adaptation-based tracking. The
meta-learning is driven by the goal of obtaining deep networks that can quickly be adapted to robustly model a particular target in future frames. Ideally the
resulting models focus on features that are useful for future frames, and avoid
overfitting to background clutter, small parts of the target, or noise. By
enforcing a small number of update iterations during meta-learning, the
resulting networks train significantly faster. We demonstrate this approach on
top of two high-performance tracking approaches: the tracking-by-detection-based MDNet and the correlation-based CREST. Experimental results on standard
benchmarks, OTB2015 and VOT2016, show that our meta-learned versions of both
trackers improve speed, accuracy, and robustness.
Comment: Code: https://github.com/silverbottlep/meta_tracker
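A hedged sketch of the offline meta-training idea: learn both an initialisation and per-parameter step sizes so that a couple of online updates on first-frame data already generalise to a future frame; the one-layer "tracker" and random batches are placeholders of mine, not MDNet or CREST.

    # Meta-learn tracker initialisation and per-parameter step sizes.
    import torch

    theta0 = torch.zeros(4, requires_grad=True)
    alpha = torch.full((4,), 0.05, requires_grad=True)  # learned step sizes
    meta_opt = torch.optim.Adam([theta0, alpha], lr=1e-2)

    def track_loss(theta, x, y):
        return ((x @ theta - y) ** 2).mean()

    for episode in range(200):
        x1, y1 = torch.randn(8, 4), torch.randn(8)   # "first frame" data
        x2, y2 = torch.randn(8, 4), torch.randn(8)   # "future frame" data
        theta = theta0
        for _ in range(2):                           # few update iterations
            g, = torch.autograd.grad(track_loss(theta, x1, y1), theta,
                                     create_graph=True)
            theta = theta - alpha * g
        meta_opt.zero_grad()
        track_loss(theta, x2, y2).backward()         # generalise forward
        meta_opt.step()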
Evaluating Two-Stream CNN for Video Classification
Videos contain very rich semantic information. Traditional hand-crafted
features are known to be inadequate in analyzing complex video semantics.
Inspired by the huge success of deep learning methods in analyzing image, audio and text data, significant efforts have recently been devoted to the
design of deep nets for video analytics. Among the many practical needs,
classifying videos (or video clips) based on their major semantic categories
(e.g., "skiing") is useful in many applications. In this paper, we conduct an
in-depth study to investigate important implementation options that may affect
the performance of deep nets on video classification. Our evaluations are
conducted on top of a recent two-stream convolutional neural network (CNN)
pipeline, which uses both static frames and motion optical flows, and has
demonstrated competitive performance against the state-of-the-art methods. In
order to gain insights and to arrive at a practical guideline, many important
options are studied, including network architectures, model fusion, learning
parameters and the final prediction methods. Based on the evaluations, very
competitive results are attained on two popular video classification
benchmarks. We hope that the discussions and conclusions from this work can
help researchers in related fields to quickly set up a good basis for further
investigations along this very promising direction.
Comment: ACM ICMR'15
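The two-stream pipeline the evaluation builds on can be sketched as follows: one network sees a static RGB frame, another a stack of optical-flow fields, and their softmax scores are fused by averaging; the tiny stand-in networks, input shapes and 101-class output are my assumptions, not the evaluated architectures.

    # Late fusion of a spatial (RGB) and a temporal (flow) stream.
    import torch

    def tiny_cnn(in_channels, n_classes=101):
        return torch.nn.Sequential(
            torch.nn.Conv2d(in_channels, 8, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1),
            torch.nn.Flatten(),
            torch.nn.Linear(8, n_classes),
        )

    spatial = tiny_cnn(3)        # one RGB frame
    temporal = tiny_cnn(20)      # 10 stacked flow fields (x and y)

    frame = torch.randn(1, 3, 224, 224)
    flow = torch.randn(1, 20, 224, 224)
    scores = 0.5 * spatial(frame).softmax(-1) + 0.5 * temporal(flow).softmax(-1)
    print(scores.argmax(-1))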
Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition
Good old on-line back-propagation for plain multi-layer perceptrons yields a
very low 0.35% error rate on the famous MNIST handwritten digits benchmark. All
we need to achieve this best result so far are many hidden layers, many neurons
per layer, numerous deformed training images, and graphics cards to greatly
speed up learning.
Comment: 14 pages, 2 figures, 4 listings
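A minimal sketch of the recipe as stated: a plain, deep MLP trained by back-propagation on continually deformed images; random tensors stand in for MNIST, and a cheap circular shift stands in for the paper's elastic and affine deformations.

    # Plain deep MLP plus on-the-fly training-image deformation.
    import torch

    mlp = torch.nn.Sequential(               # many layers, many units
        torch.nn.Flatten(),
        torch.nn.Linear(28 * 28, 1000), torch.nn.Tanh(),
        torch.nn.Linear(1000, 1000), torch.nn.Tanh(),
        torch.nn.Linear(1000, 10),
    )
    opt = torch.optim.SGD(mlp.parameters(), lr=0.1)

    def deform(batch):
        # Stand-in for elastic deformation: random horizontal shift.
        return torch.roll(batch, int(torch.randint(-2, 3, (1,))), dims=-1)

    for step in range(5):
        x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
        opt.zero_grad()
        torch.nn.functional.cross_entropy(mlp(deform(x)), y).backward()
        opt.step()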