Greedy MAXCUT Algorithms and their Information Content
MAXCUT defines a classical NP-hard problem for graph partitioning and it
serves as a typical case of the symmetric non-monotone Unconstrained Submodular
Maximization (USM) problem. Applications of MAXCUT are abundant in machine
learning, computer vision and statistical physics. Greedy algorithms to
approximately solve MAXCUT rely on greedy vertex labelling or on an edge
contraction strategy. These algorithms have been studied through their
worst-case approximation ratios, but very little is known about their
robustness to noise contamination of the input data in the average case.
Adapting the framework of Approximation Set Coding, we present a
method to exactly measure the cardinality of the algorithmic approximation sets
of five greedy MAXCUT algorithms. Their information contents are explored for
graph instances generated by two different noise models: the edge reversal
model and the Gaussian edge weights model. The results provide insights into the
robustness of different greedy heuristics and techniques for MAXCUT, which can
be used for algorithm design of general USM problems.
Comment: This is a longer version of the paper published in the 2015 IEEE
Information Theory Workshop (ITW)
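To illustrate what "greedy vertex labelling" can mean for MAXCUT, here is a minimal sketch. It is a generic textbook-style variant, not necessarily one of the five algorithms studied in the paper. The function names `greedy_maxcut` and `cut_value`, the fixed vertex order 0..n-1, and the dictionary edge format are assumptions made for this illustration only.

```python
def greedy_maxcut(n, weights):
    """Greedy vertex labelling (illustrative sketch).

    n       -- number of vertices, labelled 0..n-1 (assumed format)
    weights -- dict mapping an edge (u, v) to its weight (assumed format)
    Returns a dict vertex -> side (0 or 1).
    """
    # Build a symmetric adjacency map from the edge dictionary.
    adj = {v: {} for v in range(n)}
    for (u, v), w in weights.items():
        adj[u][v] = w
        adj[v][u] = w

    side = {}
    for v in range(n):
        # Total weight from v to already-labelled neighbours on each side.
        # Unlabelled neighbours give side.get(u) == None and are ignored.
        w0 = sum(w for u, w in adj[v].items() if side.get(u) == 0)
        w1 = sum(w for u, w in adj[v].items() if side.get(u) == 1)
        # Put v opposite the heavier side, so the heavier edges are cut.
        side[v] = 1 if w0 >= w1 else 0
    return side


def cut_value(weights, side):
    """Sum of the weights of edges whose endpoints lie on different sides."""
    return sum(w for (u, v), w in weights.items() if side[u] != side[v])


# Example: a 4-cycle with unit weights.
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (0, 3): 1.0}
labels = greedy_maxcut(4, edges)
value = cut_value(edges, labels)
```

With non-negative weights, each vertex cuts at least half of the weight to its already-labelled neighbours, so this variant always reaches at least half of the total edge weight. The result depends on the order in which vertices are processed, which is one reason robustness to input noise is a natural question for such heuristics.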
Data-driven modelling of biological multi-scale processes
Biological processes involve a variety of spatial and temporal scales. A
holistic understanding of many biological processes therefore requires
multi-scale models which capture the relevant properties on all these scales.
In this manuscript we review mathematical modelling approaches used to describe
the individual spatial scales and how they are integrated into holistic models.
We discuss the relation between spatial and temporal scales and its implications
for multi-scale modelling. Based upon this overview of
state-of-the-art modelling approaches, we formulate key challenges in
mathematical and computational modelling of biological multi-scale and
multi-physics processes. In particular, we consider the availability of
analysis tools for multi-scale models and model-based multi-scale data
integration. We provide a compact review of methods for model-based data
integration and model-based hypothesis testing. Furthermore, novel approaches
and recent trends are discussed, including computation time reduction using
reduced order and surrogate models, which contribute to the solution of
inference problems. We conclude the manuscript by providing a few ideas for the
development of tailored multi-scale inference methods.
Comment: This manuscript will appear in the Journal of Coupled Systems and
Multiscale Dynamics (American Scientific Publishers)
Learning representations from dendrograms
We propose unsupervised representation learning and feature extraction from dendrograms. The commonly used Minimax distance measures correspond to building a dendrogram with the single linkage criterion, together with specific forms of a level function and a distance function defined over it. We therefore extend this method to arbitrary dendrograms. We develop a generalized framework wherein different distance measures and representations can be inferred from different types of dendrograms, level functions and distance functions. Via an appropriate embedding, we compute a vector-based representation of the inferred distances, so that many numerical machine learning algorithms can employ such distances. Then, to address the model selection problem, we study the aggregation of different dendrogram-based distances in solution space and in representation space, respectively, in the spirit of deep representations. In the first approach, for example for the clustering problem, we build a graph with positive and negative edge weights according to the consistency of the clustering labels of different objects among different solutions, in the context of ensemble methods. We then use an efficient variant of correlation clustering to produce the final clusters. In the second approach, we investigate the sequential combination of different distances and features, in the spirit of multi-layered architectures, to obtain the final features. Finally, we demonstrate the effectiveness of our approach via several numerical studies.
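The Minimax distance mentioned in this abstract can be made concrete with a small sketch. The Minimax distance between two objects is the smallest achievable value of the largest edge on any path connecting them, which coincides with the height at which the two objects are merged in a single linkage dendrogram. The code below is an illustration only and is not the authors' implementation. The function name `minimax_distances` and the Floyd-Warshall-style update are choices made for this example.

```python
def minimax_distances(D):
    """All-pairs Minimax distances from a symmetric distance matrix.

    D -- square list of lists with D[i][i] == 0 (assumed input format)
    Uses a Floyd-Warshall-style recursion in which the cost of a path is
    the maximum edge on it, and the best path minimises that maximum.
    """
    n = len(D)
    M = [[float(x) for x in row] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Going from i to j through k costs the larger of the two legs.
                via_k = max(M[i][k], M[k][j])
                if via_k < M[i][j]:
                    M[i][j] = via_k
    return M


# Example: four points on a line; the last one is far from the others.
points = [0, 1, 2, 10]
D = [[abs(a - b) for b in points] for a in points]
M = minimax_distances(D)
```

Here the direct distance between the first and last point is 10, but the Minimax distance is 8, the largest gap that must be crossed when hopping through the intermediate points. This cubic-time loop is meant for clarity; a minimum spanning tree based computation is a common, faster alternative.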
Proceedings of the Fifth Workshop on Information Theoretic Methods in Science and Engineering
These are the online proceedings of the Fifth Workshop on Information Theoretic Methods in Science and Engineering (WITMSE), which was held in the Trippenhuis, Amsterdam, in August 2012.