
    Integrative analysis of large-scale biological data sets

    We present two novel web applications for microarray and gene/protein set analysis, ArrayMining.net and TopoGSA. These bioinformatics tools use integrative analysis methods, including ensemble and consensus machine learning techniques as well as modular combinations of different analysis types, to extract new biological insights from experimental transcriptomics and proteomics data. They enable researchers to combine related algorithms and datasets to increase the robustness and accuracy of statistical analyses and to exploit synergies between different computational methods, ranging from statistical learning to optimization and topological network analysis.

    Parameter Tuning Using Gaussian Processes

    Most machine learning algorithms require their parameter values to be set before they can be applied to a problem. Appropriate parameter settings bring good performance, while inappropriate settings generally result in poor models. Hence, it is necessary to find the “best” parameter values for a particular algorithm before building the model. The “best” model not only reflects the underlying function and fits the existing points well, but also performs well when making predictions for new, previously unseen points. A number of methods have been proposed to optimize parameter values. The basic idea behind all of them is a trial-and-error process; the work presented in this thesis employs Gaussian process (GP) regression to optimize the parameter values of a given machine learning algorithm. We consider only two-parameter learning algorithms, with all candidate parameter values specified on a 2-dimensional grid. To avoid brute-force search, Gaussian Process Optimization (GPO) uses “expected improvement” to pick promising points rather than validating every point of the grid in turn. The point with the highest expected improvement is evaluated using cross-validation, and the resulting data point is added to the training set of the Gaussian process model. This process is repeated until a stopping criterion is met. The final model is then built with the learning algorithm, using the best parameter values identified in this process. To test the effectiveness of this optimization method on regression and classification problems, we use it to optimize the parameters of several well-known machine learning algorithms, such as decision tree learning, support vector machines and boosting with trees.
Through the analysis of experimental results obtained on datasets from the UCI repository, we find that GPO yields performance competitive with a brute-force approach, while having a distinct advantage in training time and number of cross-validation runs. Overall, GPO is a promising method for optimizing parameter values in machine learning.
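    The expected-improvement loop described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the thesis code: the toy `score` function (standing in for a cross-validated score surface), the RBF kernel length scale, and the evaluation budget are all assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expected_improvement(gp, candidates, y_best):
    # EI for maximizing the modelled score surface
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - y_best) / sigma
    return (mu - y_best) * norm.cdf(z) + sigma * norm.pdf(z)

def gpo(score_fn, grid, n_init=5, n_iter=15, seed=0):
    # Evaluate a few random grid points, then repeatedly fit a GP to the
    # observations and evaluate the point with the highest expected improvement.
    rng = np.random.default_rng(seed)
    X = grid[rng.choice(len(grid), size=n_init, replace=False)]
    y = np.array([score_fn(p) for p in X])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True).fit(X, y)
        best = int(np.argmax(expected_improvement(gp, grid, y.max())))
        X = np.vstack([X, grid[best]])
        y = np.append(y, score_fn(grid[best]))
    return X[int(np.argmax(y))], float(y.max())

# Toy stand-in for "cross-validated score as a function of two parameters",
# peaking at (0.3, 0.7); a real run would call cross-validation here instead.
peak = np.array([0.3, 0.7])
score = lambda p: -np.sum((p - peak) ** 2)
g = np.linspace(0.0, 1.0, 10)
grid = np.array([[a, b] for a in g for b in g])
best_params, best_score = gpo(score, grid)
```

    With 20 evaluations instead of 100, the loop locates a grid point close to the peak, which is the advantage over brute-force search that the abstract reports.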

    Efficient Optimization of Dominant Set Clustering with Frank-Wolfe Algorithms

    We study Frank-Wolfe algorithms (standard, pairwise, and away-steps) for efficient optimization of Dominant Set Clustering. We present a unified and computationally efficient framework that accommodates the different variants of the Frank-Wolfe method, and we investigate its effectiveness in several experimental studies. In addition, we provide explicit convergence rates for the algorithms in terms of the so-called Frank-Wolfe gap. The theoretical analysis is specialized to Dominant Set Clustering and consistently covers the different variants.
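    As a concrete illustration of the standard variant, here is a minimal sketch (not the authors' framework) of Frank-Wolfe applied to the quadratic program behind dominant-set clustering: maximize x^T A x over the probability simplex, stopping on the Frank-Wolfe gap mentioned in the abstract. The toy affinity matrix and iteration budget are assumptions.

```python
import numpy as np

def frank_wolfe_dominant_set(A, n_iter=1000, tol=1e-8):
    # Standard Frank-Wolfe for max_x x^T A x over the simplex,
    # the quadratic program underlying dominant-set clustering.
    n = A.shape[0]
    x = np.full(n, 1.0 / n)                  # start at the barycentre of the simplex
    for _ in range(n_iter):
        grad = 2.0 * A @ x                   # gradient of x^T A x
        s = np.zeros(n)
        s[np.argmax(grad)] = 1.0             # linear maximization oracle: best simplex vertex
        d = s - x
        gap = grad @ d                       # Frank-Wolfe gap: certifies approximate optimality
        if gap < tol:
            break
        q = d @ A @ d                        # curvature of the objective along d
        gamma = 1.0 if q >= 0 else min(1.0, gap / (-2.0 * q))  # exact line search
        x = x + gamma * d
    return x

# Toy affinity matrix: nodes {0,1,2} form a triangle, nodes {3,4} share one edge.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
x = frank_wolfe_dominant_set(A)
```

    The iterate concentrates its mass on the triangle {0, 1, 2}, the dominant set of this toy graph, with objective value approaching the Motzkin-Straus optimum 2/3.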

    Riemannian optimization and multidisciplinary design optimization

    Riemannian Optimization (RO) generalizes standard optimization methods from Euclidean spaces to Riemannian manifolds. Multidisciplinary Design Optimization (MDO) problems exist on Riemannian manifolds, and with the differential geometry framework we have previously developed, RO techniques can now be applied to MDO. Here, we provide background theory and a literature review for RO, and give the formulae necessary to implement the Steepest Descent Method (SDM), Newton’s Method (NM), and the Conjugate Gradient Method (CGM), in Riemannian form, on MDO problems. We then compare the performance of the Riemannian and Euclidean SDM, NM, and CGM algorithms on several test problems (including a satellite design problem from the MDO literature), using a calculated step size, line search, and geodesic search in our comparisons. With the framework’s induced metric, the RO algorithms are generally not as effective as their Euclidean counterparts, and line search is consistently better than geodesic search. In our post-experimental analysis, we also show how the optimization trajectories of the Riemannian SDM and CGM relate to design coupling, thereby providing some explanation for the observed optimization behaviour. This work is only a first step in applying RO to MDO, however, and the use of quasi-Newton methods and different metrics should be explored in future research.
    This is the author accepted manuscript. It is currently under an indefinite embargo pending publication by Springer.
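    To make the Riemannian machinery concrete, here is a minimal sketch of Riemannian steepest descent, using the unit sphere and a Rayleigh quotient as a stand-in manifold and objective rather than the paper's MDO problems: the Euclidean gradient is projected onto the tangent space, and a normalization retraction pulls each step back onto the manifold. The matrix, step size, and iteration count are assumptions.

```python
import numpy as np

def riemannian_sd_sphere(M, x0, step=0.1, n_iter=500):
    # Riemannian steepest descent on the unit sphere S^{n-1},
    # minimizing the Rayleigh quotient f(x) = x^T M x.
    x = x0 / np.linalg.norm(x0)
    for _ in range(n_iter):
        egrad = 2.0 * M @ x                # Euclidean gradient of f
        rgrad = egrad - (x @ egrad) * x    # project onto the tangent space at x
        x = x - step * rgrad               # move along the descent direction
        x = x / np.linalg.norm(x)          # retraction: back onto the sphere
    return x

# The minimizer is the eigenvector of the smallest eigenvalue (here 1.0).
M = np.diag([1.0, 2.0, 3.0])
x = riemannian_sd_sphere(M, np.ones(3))
```

    A geodesic update would replace the retraction with the exponential map along the great circle in the direction of -rgrad; the paper's comparison of line search against geodesic search contrasts exactly these two choices.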