14 research outputs found

    An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering

    Full text link
    The minimum sum-of-squares clustering (MSSC), or k-means type clustering, is traditionally considered an unsupervised learning task. In recent years, the use of background knowledge to improve the cluster quality and promote interpretability of the clustering process has become a hot research topic at the intersection of mathematical optimization and machine learning research. The problem of taking advantage of background information in data clustering is called semi-supervised or constrained clustering. In this paper, we present a branch-and-cut algorithm for semi-supervised MSSC, where background knowledge is incorporated as pairwise must-link and cannot-link constraints. For the lower bound procedure, we solve the semidefinite programming relaxation of the MSSC discrete optimization model, and we use a cutting-plane procedure for strengthening the bound. For the upper bound, instead, by using integer programming tools, we use an adaptation of the k-means algorithm to the constrained case. For the first time, the proposed global optimization algorithm efficiently manages to solve real-world instances up to 800 data points with different combinations of must-link and cannot-link constraints and with a generic number of features. This problem size is about four times larger than the one of the instances solved by state-of-the-art exact algorithms

    A Bibliographic View on Constrained Clustering

    Full text link
    A keyword search on constrained clustering on Web-of-Science returned just under 3,000 documents. We ran automatic analyses of those, and compiled our own bibliography of 183 papers which we analysed in more detail based on their topic and experimental study, if any. This paper presents general trends of the area and its sub-topics by Pareto analysis, using citation count and year of publication. We list available software and analyse the experimental sections of our reference collection. We found a notable lack of large comparison experiments. Among the topics we reviewed, applications studies were most abundant recently, alongside deep learning, active learning and ensemble learning.Comment: 18 pages, 11 figures, 177 reference

    Coach informed biomechanical analysis of the golf swing

    Get PDF
    Coach informed biomechanical analysis of the golf swin

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    Get PDF
    The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed, in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at fixed time, to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernov

    Exploring missing heritability in neurodevelopmental disorders:Learning from regulatory elements

    Get PDF

    Exploring missing heritability in neurodevelopmental disorders:Learning from regulatory elements

    Get PDF
    In this thesis, I aimed to solve part of the missing heritability in neurodevelopmental disorders, using computational approaches. Next to the investigations of a novel epilepsy syndrome and investigations aiming to elucidate the regulation of the gene involved, I investigated and prioritized genomic sequences that have implications in gene regulation during the developmental stages of human brain, with the goal to create an atlas of high confidence non-coding regulatory elements that future studies can assess for genetic variants in genetically unexplained individuals suffering from neurodevelopmental disorders that are of suspected genetic origin

    Advances in knowledge discovery and data mining Part II

    Get PDF
    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II</p
    corecore