14 research outputs found
An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering
The minimum sum-of-squares clustering (MSSC), or k-means type clustering, is
traditionally considered an unsupervised learning task. In recent years, the
use of background knowledge to improve the cluster quality and promote
interpretability of the clustering process has become a hot research topic at
the intersection of mathematical optimization and machine learning research.
The problem of taking advantage of background information in data clustering is
called semi-supervised or constrained clustering. In this paper, we present a
branch-and-cut algorithm for semi-supervised MSSC, where background knowledge
is incorporated as pairwise must-link and cannot-link constraints. For the
lower bound procedure, we solve the semidefinite programming relaxation of the
MSSC discrete optimization model, and we use a cutting-plane procedure for
strengthening the bound. For the upper bound, we use integer programming
tools to adapt the k-means algorithm to the constrained case.
For the first time, the proposed global optimization
algorithm efficiently manages to solve real-world instances up to 800 data
points with different combinations of must-link and cannot-link constraints and
with a generic number of features. This problem size is about four times larger
than the one of the instances solved by state-of-the-art exact algorithms.
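As a rough illustration of the problem described in this abstract, the sketch below evaluates the MSSC objective for a given cluster assignment and checks whether that assignment respects pairwise must-link and cannot-link constraints. It is not the paper's branch-and-cut algorithm; the function names `mssc_objective` and `is_feasible` and the toy data are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def mssc_objective(points: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    dim = len(points[0])
    sums: List[List[float]] = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    # Accumulate per-cluster coordinate sums and sizes.
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    # Centroid of each non-empty cluster (empty clusters are never indexed below).
    centroids = [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(k)
    ]
    total = 0.0
    for p, c in zip(points, labels):
        total += sum((p[d] - centroids[c][d]) ** 2 for d in range(dim))
    return total


def is_feasible(labels: Sequence[int],
                must_link: Sequence[Pair],
                cannot_link: Sequence[Pair]) -> bool:
    """True if every must-link pair shares a cluster and every cannot-link pair does not."""
    return (all(labels[i] == labels[j] for i, j in must_link)
            and all(labels[i] != labels[j] for i, j in cannot_link))


if __name__ == "__main__":
    # Hypothetical toy instance: four 2-D points, two clusters.
    points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    labels = [0, 0, 1, 1]
    print(mssc_objective(points, labels, k=2))
    print(is_feasible(labels, must_link=[(0, 1)], cannot_link=[(1, 2)]))
```

An exact method would search over assignments for which `is_feasible` holds and certify that none has a smaller objective value; this sketch only scores a single candidate.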
A Bibliographic View on Constrained Clustering
A keyword search on constrained clustering on Web-of-Science returned just
under 3,000 documents. We ran automatic analyses of those, and compiled our own
bibliography of 183 papers which we analysed in more detail based on their
topic and experimental study, if any. This paper presents general trends of the
area and its sub-topics by Pareto analysis, using citation count and year of
publication. We list available software and analyse the experimental sections
of our reference collection. We found a notable lack of large comparison
experiments. Among the topics we reviewed, application studies were most
abundant recently, alongside deep learning, active learning and ensemble
learning. Comment: 18 pages, 11 figures, 177 references
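One plausible reading of a Pareto analysis over citation count and publication year is to keep the papers that no other paper beats on both criteria at once. The sketch below assumes that reading; the function `pareto_front` and the sample records are hypothetical and are not taken from the paper's bibliography.

```python
from typing import List, Tuple

Paper = Tuple[str, int, int]  # (label, year, citation count)


def pareto_front(papers: List[Paper]) -> List[Paper]:
    """Return papers not dominated by any other paper.

    Assumption: paper X dominates paper Y when X is at least as recent and
    at least as cited as Y, and strictly better on one of the two criteria.
    """
    front = []
    for label, year, cites in papers:
        dominated = any(
            (y2 >= year and c2 >= cites) and (y2 > year or c2 > cites)
            for _, y2, c2 in papers
        )
        if not dominated:
            front.append((label, year, cites))
    return front


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = [("A", 2001, 900), ("B", 2010, 300), ("C", 2008, 250), ("D", 2020, 40)]
    for paper in pareto_front(sample):
        print(paper)
```

In this toy input, paper C drops out because paper B is both more recent and more cited.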
Coach informed biomechanical analysis of the golf swing
Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations
The study of low-dimensional, noisy manifolds embedded in a higher-dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold evolves through time, joint spatio-temporal modelling is needed to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernova.
Exploring missing heritability in neurodevelopmental disorders: Learning from regulatory elements
In this thesis, I aimed to solve part of the missing heritability in neurodevelopmental disorders using computational approaches. Alongside the investigation of a novel epilepsy syndrome and work aiming to elucidate the regulation of the gene involved, I investigated and prioritized genomic sequences that have implications in gene regulation during the developmental stages of the human brain. The goal was to create an atlas of high-confidence non-coding regulatory elements that future studies can assess for genetic variants in genetically unexplained individuals suffering from neurodevelopmental disorders of suspected genetic origin.
Advances in knowledge discovery and data mining Part II
19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II