Combining Multiple Clusterings via Crowd Agreement Estimation and Multi-Granularity Link Analysis
The clustering ensemble technique aims to combine multiple clusterings into a
probably better and more robust clustering, and it has received increasing
attention in recent years. Existing clustering ensemble approaches have two
main limitations. Firstly, many approaches lack the
ability to weight the base clusterings without access to the original data and
can be significantly affected by low-quality or even ill clusterings.
Secondly, they generally focus on the instance level or cluster level in the
ensemble system and fail to integrate multi-granularity cues into a unified
model. To address these two limitations, this paper proposes to solve the
clustering ensemble problem via crowd agreement estimation and
multi-granularity link analysis. We present the normalized crowd agreement
index (NCAI) to evaluate the quality of base clusterings in an unsupervised
manner and thus weight the base clusterings in accordance with their clustering
validity. To explore the relationship between clusters, the source-aware
connected triple (SACT) similarity is introduced with regard to their common
neighbors and the source reliability. Based on NCAI and multi-granularity
information collected among base clusterings, clusters, and data instances, we
further propose two novel consensus functions, termed weighted evidence
accumulation clustering (WEAC) and graph partitioning with multi-granularity
link analysis (GP-MGLA) respectively. The experiments are conducted on eight
real-world datasets. The experimental results demonstrate the effectiveness and
robustness of the proposed methods.
Comment: The MATLAB source code of this work is available at:
https://www.researchgate.net/publication/28197031
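As a rough illustration of the idea behind weighting base clusterings in evidence accumulation, the sketch below builds a weighted co-association matrix in plain Python. It is not the authors' WEAC implementation: the function name `weighted_co_association`, the toy labels, and the weights are all hypothetical, and the weights are placeholder values rather than NCAI scores, whose formula the abstract does not give.

```python
def weighted_co_association(base_clusterings, weights):
    """Return an n x n matrix whose (i, j) entry is the weighted fraction
    of base clusterings that place instances i and j in the same cluster."""
    n = len(base_clusterings[0])
    total = float(sum(weights))
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(base_clusterings, weights):
        for i in range(n):
            for j in range(n):
                # A base clustering votes for the pair (i, j) only when it
                # assigns both instances the same label.
                if labels[i] == labels[j]:
                    matrix[i][j] += w / total
    return matrix


# Three toy base clusterings over four instances (hypothetical data).
base_clusterings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
# Placeholder reliability weights; assumed here, not computed by NCAI.
weights = [0.5, 0.2, 0.3]

co = weighted_co_association(base_clusterings, weights)
for row in co:
    print([round(value, 2) for value in row])
```

A consensus clustering could then be obtained by running, for example, hierarchical agglomerative clustering on this matrix, with more reliable base clusterings contributing more to each pairwise similarity.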
Black-Box Data-efficient Policy Search for Robotics
The most data-efficient algorithms for reinforcement learning (RL) in
robotics are based on uncertain dynamical models: after each episode, they
first learn a dynamical model of the robot, then they use an optimization
algorithm to find a policy that maximizes the expected return given the model
and its uncertainties. It is often believed that this optimization can be
tractable only if analytical, gradient-based algorithms are used; however,
these algorithms require using specific families of reward functions and
policies, which greatly limits the flexibility of the overall approach. In this
paper, we introduce a novel model-based RL algorithm, called Black-DROPS
(Black-box Data-efficient RObot Policy Search) that: (1) does not impose any
constraint on the reward function or the policy (they are treated as
black-boxes), (2) is as data-efficient as the state-of-the-art algorithm for
data-efficient RL in robotics, and (3) is as fast as (or faster than) analytical
approaches when several cores are available. The key idea is to replace the
gradient-based optimization algorithm with a parallel, black-box algorithm that
takes into account the model uncertainties. We demonstrate the performance of
our new algorithm on two standard control benchmark problems (in simulation)
and a low-cost robotic manipulator (with a real robot).
Comment: Accepted at the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS) 2017; Code at
http://github.com/resibots/blackdrops; Video at http://youtu.be/kTEyYiIFGP
Proposal Flow: Semantic Correspondences from Object Proposals
Finding image correspondences remains a challenging problem in the presence
of intra-class variations and large changes in scene layout. Semantic flow
methods are designed to handle images depicting different instances of the same
object or scene category. We introduce a novel approach to semantic flow,
dubbed proposal flow, that establishes reliable correspondences using object
proposals. Unlike prevailing semantic flow approaches that operate on pixels or
regularly sampled local regions, proposal flow benefits from the
characteristics of modern object proposals, which exhibit high repeatability at
multiple scales, and can take advantage of both local and geometric consistency
constraints among proposals. We also show that the corresponding sparse
proposal flow can effectively be transformed into a conventional dense flow
field. We introduce two new challenging datasets that can be used to evaluate
both general semantic flow techniques and region-based approaches such as
proposal flow. We use these benchmarks to compare different matching
algorithms, object proposals, and region features within proposal flow, to the
state of the art in semantic flow. This comparison, along with experiments on
standard datasets, demonstrates that proposal flow significantly outperforms
existing semantic flow methods in various settings.
Comment: arXiv admin note: text overlap with arXiv:1511.0506
Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning
Computational study of molecules and materials from first principles is a cornerstone of physics, chemistry and materials science, but is limited by the cost of accurate and precise simulations. In settings involving many simulations, machine learning can reduce these costs, sometimes by orders of magnitude, by interpolating between reference simulations. This requires representations that describe any molecule or material and support interpolation. We review, discuss and benchmark state-of-the-art representations and relations between them, including smooth overlap of atomic positions, many-body tensor representation, and symmetry functions. For this, we use a unified mathematical framework based on many-body functions, group averaging and tensor products, and compare energy predictions for organic molecules, binary alloys and Al-Ga-In sesquioxides in numerical experiments controlled for data distribution, regression method and hyper-parameter optimization.