Recurrent Pixel Embedding for Instance Grouping
We introduce a differentiable, end-to-end trainable framework for solving
pixel-level grouping problems such as instance segmentation, consisting of two
novel components. First, we regress pixels into a hyper-spherical embedding
space so that pixels from the same group have high cosine similarity while
those from different groups have similarity below a specified margin. We
analyze the choice of embedding dimension and margin, relating them to
theoretical results on the problem of distributing points uniformly on the
sphere. Second, to group instances, we utilize a variant of mean-shift
clustering, implemented as a recurrent neural network parameterized by kernel
bandwidth. This recurrent grouping module is differentiable, has convergent
dynamics, and admits a probabilistic interpretation. Backpropagating the
group-weighted loss through this module allows learning to focus only on
correcting embedding errors that will not be resolved by subsequent clustering.
Our framework, while conceptually simple and theoretically grounded, is also
practically effective and computationally efficient. We demonstrate substantial
improvements over state-of-the-art instance segmentation methods for object
proposal generation, and show the benefits of the grouping loss on
classification tasks such as boundary detection and semantic segmentation.
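The first component can be sketched as a pairwise loss on unit-sphere embeddings. This is a minimal illustration of the idea, not the authors' exact formulation; the hinge form and the margin value are assumptions:

```python
import numpy as np

def cosine_margin_loss(embeddings, labels, margin=0.5):
    """Pairwise grouping loss on hyper-spherical embeddings (a sketch, not
    the paper's exact loss): same-group pairs are pulled toward cosine
    similarity 1, while different-group pairs are penalized only when their
    similarity exceeds the specified margin."""
    # Project embeddings onto the unit hypersphere.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = e @ e.T                              # pairwise cosine similarities
    same = labels[:, None] == labels[None, :]  # same-group indicator
    pos_loss = (1.0 - sim)[same].mean()        # pull same-group pairs together
    neg_loss = np.maximum(0.0, sim - margin)[~same].mean()  # hinge across groups
    return pos_loss + neg_loss
```

In practice such a loss would be computed over sampled pixel pairs per image, with the recurrent mean-shift grouping applied downstream.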
Classification and Geometry of General Perceptual Manifolds
Perceptual manifolds arise when a neural population responds to an ensemble
of sensory signals associated with different physical features (e.g.,
orientation, pose, scale, location, and intensity) of the same perceptual
object. Object recognition and discrimination require classifying the
manifolds in a manner that is insensitive to variability within a manifold. How
neuronal systems give rise to invariant object classification and recognition
is a fundamental problem in brain theory as well as in machine learning. Here
we study the ability of a readout network to classify objects from their
perceptual manifold representations. We develop a statistical mechanical theory
for the linear classification of manifolds with arbitrary geometry revealing a
remarkable relation to the mathematics of conic decomposition. Novel
geometrical measures of manifold radius and manifold dimension are introduced
which can explain the classification capacity for manifolds of various
geometries. The general theory is demonstrated on a number of representative
manifolds, including L2 ellipsoids prototypical of strictly convex manifolds,
L1 balls representing polytopes consisting of finite sample points, and
orientation manifolds which arise from neurons tuned to respond to a continuous
angle variable, such as object orientation. The effects of label sparsity on
the classification capacity of manifolds are elucidated, revealing a scaling
relation between label sparsity and manifold radius. Theoretical predictions
are corroborated by numerical simulations using recently developed algorithms
to compute maximum margin solutions for manifold dichotomies. Our theory and
its extensions provide a powerful and rich framework for applying statistical
mechanics of linear classification to data arising from neuronal responses to
object stimuli, as well as to artificial deep networks trained for object
recognition tasks.
Comment: 24 pages, 12 figures, Supplementary Material
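The numerical experiments described above amount to testing whether a labeled set of manifolds is linearly separable. The sketch below uses a plain perceptron as a stand-in for the authors' max-margin algorithm; the point-cloud manifold representation and the iteration budget are assumptions:

```python
import numpy as np

def manifold_dichotomy_separable(manifolds, labels, epochs=500):
    """Check whether a dichotomy of point-cloud manifolds is linearly
    separable. `manifolds` is a list of (m_i, N) arrays; every sample point
    inherits its manifold's +/-1 label. A perceptron finds a separating
    hyperplane whenever one exists (given enough epochs)."""
    X = np.vstack(manifolds)
    y = np.concatenate([np.full(len(m), l) for m, l in zip(manifolds, labels)])
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(X, y):
            if t * (w @ x) <= 0:  # misclassified (or on the boundary)
                w += t * x        # perceptron update
                mistakes += 1
        if mistakes == 0:
            return True           # zero errors: the dichotomy is realized
    return False
```

Sweeping the number of manifolds P at fixed dimension N and averaging over random labelings estimates the classification capacity the theory predicts.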
Unsupervised machine learning for detection of phase transitions in off-lattice systems I. Foundations
We demonstrate the utility of an unsupervised machine learning tool for the
detection of phase transitions in off-lattice systems. We focus on the
application of principal component analysis (PCA) to detect the freezing
transitions of two-dimensional hard-disk and three-dimensional hard-sphere
systems as well as liquid-gas phase separation in a patchy colloid model. As we
demonstrate, PCA autonomously discovers order-parameter-like quantities that
report on phase transitions, mitigating the need for a priori construction or
identification of a suitable order parameter--thus streamlining the routine
analysis of phase behavior. In a companion paper, we further develop the method
established here to explore the detection of phase transitions in various model
systems controlled by compositional demixing, liquid crystalline ordering, and
non-equilibrium active forces.
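The core analysis can be sketched in a few lines of numpy: project per-configuration descriptors onto the leading principal component and read the score as an order-parameter-like quantity. The feature construction is left abstract here; in an off-lattice system it might be, e.g., sorted interparticle distances:

```python
import numpy as np

def pca_order_parameter(features):
    """Project configurations onto the leading principal component, whose
    score can behave like an order parameter across a phase transition (a
    minimal sketch of the PCA analysis described in the abstract).
    `features` is an (n_configs, n_features) array of descriptors."""
    centered = features - features.mean(axis=0)
    # Leading right singular vector = first principal axis.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]  # PC1 score for each configuration
```

Plotting this score against, e.g., density or temperature reveals the transition without an a priori order parameter.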
Adaptive Seeding for Gaussian Mixture Models
We present new initialization methods for the expectation-maximization
algorithm for multivariate Gaussian mixture models. Our methods are adaptations
of the well-known k-means++ initialization and the Gonzalez algorithm.
We thereby aim to close the gap between simple random (e.g., uniform)
initializations and complex methods that crucially depend on the right choice
of hyperparameters. Our extensive experiments on artificial as well as
real-world data sets indicate the usefulness of our methods compared to common
techniques, e.g., applying the original k-means++ or the Gonzalez algorithm
directly.
Comment: This is a preprint of a paper accepted for publication in the
Proceedings of the 20th Pacific-Asia Conference on Knowledge Discovery and
Data Mining (PAKDD) 2016. The final publication is available at
link.springer.com (http://link.springer.com/chapter/10.1007/978-3-319-31750-2_24).
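The k-means++ half of the idea can be sketched as D²-weighted seeding of the mixture means. This is a generic illustration of k-means++ seeding, not the paper's adapted variant:

```python
import numpy as np

def kmeanspp_means(X, k, rng=None):
    """k-means++-style seeding for GMM initial means: each new mean is drawn
    from the data with probability proportional to the squared distance to
    the nearest mean chosen so far, spreading the seeds across clusters."""
    rng = np.random.default_rng(rng)
    means = [X[rng.integers(len(X))]]  # first mean: uniform draw
    for _ in range(k - 1):
        # Squared distance of every point to its nearest chosen mean.
        d2 = np.min([((X - m) ** 2).sum(axis=1) for m in means], axis=0)
        probs = d2 / d2.sum()          # D^2 weighting
        means.append(X[rng.choice(len(X), p=probs)])
    return np.array(means)
```

The returned means would then seed EM, e.g. as the initial component means of a Gaussian mixture fit.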
On-the-fly adaptivity for nonlinear twoscale simulations using artificial neural networks and reduced order modeling
A multi-fidelity surrogate model for highly nonlinear multiscale problems is
proposed. It is based on the introduction of two different surrogate models and
an adaptive on-the-fly switching. The two concurrent surrogates are built
incrementally starting from a moderate set of evaluations of the full order
model. To this end, a reduced order model (ROM) is generated. Using a hybrid
ROM-preconditioned FE solver, additional effective stress-strain data is
simulated while the number of samples is kept to a moderate level by using a
dedicated and physics-guided sampling technique. Machine learning (ML) is
subsequently used to build the second surrogate by means of artificial neural
networks (ANN). Different ANN architectures are explored and the features used
as inputs of the ANN are fine-tuned to improve the overall quality of
the ML model. Additional ANN surrogates for the stress errors are generated.
To this end, conservative design guidelines for error surrogates are presented by
adapting the loss functions of the ANN training in pure regression or pure
classification settings. The error surrogates can be used as quality indicators
in order to adaptively select the appropriate -- i.e. efficient yet accurate --
surrogate. Two strategies for the on-the-fly switching are investigated and a
practicable and robust algorithm is proposed that eliminates relevant technical
difficulties attributed to model switching. The provided algorithms and ANN
design guidelines can easily be adopted for different problem settings and,
thereby, they enable generalization of the used machine learning techniques for
a wide range of applications. The resulting hybrid surrogate is employed in
challenging multilevel FE simulations for a three-phase composite with
pseudo-plastic micro-constituents. Numerical examples highlight the performance
of the proposed approach.
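The switching logic can be sketched abstractly. All interfaces below (the surrogate callables, the learned error indicator, the tolerance) are hypothetical placeholders rather than the paper's implementation:

```python
def select_surrogate(x, ann_surrogate, rom_surrogate, ann_error_model, tol):
    """On-the-fly switching sketch: query the cheap ANN surrogate first, but
    fall back to the more expensive ROM whenever the ANN's learned error
    indicator exceeds a conservative tolerance. Returns the prediction and
    which surrogate produced it."""
    predicted_error = ann_error_model(x)  # learned quality indicator
    if predicted_error <= tol:
        return ann_surrogate(x), "ANN"    # efficient path
    return rom_surrogate(x), "ROM"        # accurate fallback
```

A conservative (overestimating) error model makes the switch err toward the ROM, matching the design guidelines described above.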
Large-Margin Determinantal Point Processes
Determinantal point processes (DPPs) offer a powerful approach to modeling
diversity in many applications where the goal is to select a diverse subset. We
study the problem of learning the parameters (the kernel matrix) of a DPP from
labeled training data. We make two contributions. First, we show how to
reparameterize a DPP's kernel matrix with multiple kernel functions, thus
enhancing modeling flexibility. Second, we propose a novel parameter estimation
technique based on the principle of large margin separation. In contrast to the
state-of-the-art method of maximum likelihood estimation, our large-margin loss
function explicitly models errors in selecting the target subsets, and it can
be customized to trade off different types of errors (precision vs. recall).
Extensive empirical studies validate our contributions, including applications
on challenging document and video summarization, where flexibility in modeling
the kernel matrix and balancing different errors is indispensable.
Comment: 15 pages.
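The underlying model assigns each subset a probability proportional to a kernel determinant. Below is a minimal sketch of this standard L-ensemble form, not the paper's large-margin training procedure:

```python
import numpy as np

def dpp_log_prob(L, subset):
    """Log-probability that an L-ensemble DPP selects exactly `subset`:
    log det(L_S) - log det(L + I), where L_S is the principal submatrix of
    the positive semidefinite kernel L indexed by the subset. Larger
    off-diagonal similarity lowers the probability of co-selection,
    which is what models diversity."""
    n = L.shape[0]
    S = np.asarray(subset, dtype=int)
    L_S = L[np.ix_(S, S)]
    _, logdet_S = np.linalg.slogdet(L_S)          # det of empty matrix is 1
    _, logdet_norm = np.linalg.slogdet(L + np.eye(n))
    return logdet_S - logdet_norm
```

Learning the DPP means fitting L from labeled subsets; the paper replaces the usual maximum-likelihood objective with a large-margin one over these probabilities.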
Machine learning cosmological structure formation
We train a machine learning algorithm to learn cosmological structure
formation from N-body simulations. The algorithm infers the relationship
between the initial conditions and the final dark matter haloes, without the
need to introduce approximate halo collapse models. We gain insights into the
physics driving halo formation by evaluating the predictive performance of the
algorithm when provided with different types of information about the local
environment around dark matter particles. The algorithm learns to predict
whether or not dark matter particles will end up in haloes of a given mass
range, based on spherical overdensities. We show that the resulting predictions
match those of spherical collapse approximations such as extended
Press-Schechter theory. Additional information on the shape of the local
gravitational potential is not able to improve halo collapse predictions; the
linear density field contains sufficient information for the algorithm to also
reproduce ellipsoidal collapse predictions based on the Sheth-Tormen model. We
investigate the algorithm's performance in terms of halo mass and radial
position and perform blind analyses on independent initial conditions
realisations to demonstrate the generality of our results.
Comment: 10 pages, 7 figures. Minor changes to match the version published in
MNRAS. Accepted on 22/06/201
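The spherical-overdensity inputs mentioned above can be sketched as follows; the unit periodic-free box, the radius grid, and the simple counting density estimate are illustrative assumptions, not the paper's pipeline:

```python
import numpy as np

def spherical_overdensities(positions, center, radii):
    """Overdensity features around one dark matter particle: for each
    smoothing radius, the particle-count density inside the sphere relative
    to the mean density of the box (assumed to be a unit box here). These
    are the kind of local-environment features a classifier could use to
    predict whether the particle ends up in a halo of a given mass range."""
    d = np.linalg.norm(positions - center, axis=1)
    mean_density = len(positions)  # N particles per unit volume (unit box)
    feats = []
    for r in radii:
        volume = 4.0 / 3.0 * np.pi * r ** 3
        feats.append((d < r).sum() / volume / mean_density)
    return np.array(feats)  # density contrast (1 + delta) at each radius
```

Feeding such features at several radii to a binary classifier mirrors the information content of extended Press-Schechter-style collapse criteria.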