5 research outputs found

    Codominant scoring of AFLP in association panels

    A study on the codominant scoring of AFLP markers in association panels, without prior knowledge of genotype probabilities, is described. Bands are scored codominantly by fitting normal mixture models to band intensities, illustrating and optimizing existing methodology that employs the EM algorithm. We study features that improve the performance of the algorithm, and the unmixing in general, such as parameter initialization, restrictions on parameters, data transformation, and outlier removal. Parameter restrictions include equal component variances, equal or nearly equal distances between component means, and mixing probabilities according to Hardy–Weinberg Equilibrium. Histogram visualization of band intensities with superimposed normal densities, along with optional classification scores and other grouping information, further assists in the codominant scoring. We find empirical evidence favoring the square-root transformation of band intensity, as was found in segregating populations. Our approach provides posterior genotype probabilities for marker loci. These probabilities can form the basis for association mapping and are more useful than the standard scoring categories A, H, B, C, D. They can also be used to calculate predictors for additive and dominance effects. Diagnostics for the data quality of AFLP markers are described: a preference for the three-component mixture model, good separation between component means, and a lack of singletons in the component with the highest mean. Software has been developed in R, containing the models for normal mixtures with the facilitating features, and visualizations. The methods are applied to an association panel in tomato, comprising 1,175 polymorphic markers on 94 tomato hybrids, as part of a larger study within the Dutch Centre for BioSystems Genomics.
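
    To make the methodology concrete, the sketch below fits a three-component normal mixture to square-root-transformed band intensities with EM under the equal-variance restriction, and returns posterior genotype probabilities. It is a minimal illustration written for this summary, not the authors' published R software; the function name and the simulated intensities are assumptions.

    score_aflp <- function(intensity, max_iter = 200, tol = 1e-8) {
      y <- sqrt(intensity)                 # square-root transform of band intensity
      k <- 3                               # genotype classes: aa, Aa, AA
      mu  <- quantile(y, c(0.1, 0.5, 0.9), names = FALSE)  # initial component means
      s2  <- var(y)                        # one variance: equal-variance restriction
      pi_ <- rep(1 / k, k)                 # free mixing probabilities; the full method
                                           # can restrict these to (p^2, 2pq, q^2) under HWE
      ll_old <- -Inf
      for (it in seq_len(max_iter)) {
        # E-step: posterior genotype probabilities per band
        dens <- sapply(1:k, function(j) pi_[j] * dnorm(y, mu[j], sqrt(s2)))
        post <- dens / rowSums(dens)
        # M-step under the equal-variance restriction
        nj  <- colSums(post)
        pi_ <- nj / length(y)
        mu  <- colSums(post * y) / nj
        s2  <- sum(post * outer(y, mu, "-")^2) / length(y)
        ll  <- sum(log(rowSums(dens)))
        if (abs(ll - ll_old) < tol) break  # stop when the log-likelihood stabilizes
        ll_old <- ll
      }
      list(means = mu, var = s2, mixing = pi_, posterior = post)
    }

    # Hypothetical band intensities from three genotype groups
    set.seed(1)
    intensity <- c(rnorm(60, 9, 2), rnorm(25, 36, 6), rnorm(15, 100, 12))
    fit <- score_aflp(pmax(intensity, 0.01))
    round(head(fit$posterior), 3)          # per-band posterior genotype probabilities

    The posterior matrix, rather than hard A/H/B calls, is what feeds into association mapping; expected genotype scores computed from it yield the predictors for additive and dominance effects mentioned above.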

    Global Optimization of Finite Mixture Models

    The Expectation-Maximization (EM) algorithm is a popular and convenient tool for the estimation of Gaussian mixture models and their natural extension, model-based clustering. However, while the algorithm is convenient to implement and numerically very stable, it only produces solutions that are locally optimal. Thus, EM may not achieve the globally optimal solution in Gaussian mixture analysis problems, which can have a large number of local optima. This dissertation introduces several new algorithms designed to produce globally optimal solutions for Gaussian mixture models. The building blocks for these algorithms are methods from the operations research literature, namely the Cross-Entropy (CE) method and Model Reference Adaptive Search (MRAS). The new algorithms we propose must efficiently simulate positive definite covariance matrices of the Gaussian mixture components. We propose several new solutions to this problem. One solution is to blend the updating procedure of CE and MRAS with the principles of Expectation-Maximization updating for the covariance matrices, leading to two new algorithms, CE-EM and MRAS-EM. We also propose two additional algorithms, CE-CD and MRAS-CD, which rely on the Cholesky decomposition to construct the random covariance matrices. Numerical experiments illustrate the effectiveness of the proposed algorithms in finding global optima where the classical EM fails to do so. We find that although a single run of the new algorithms may be slower than EM, they have the potential of producing significantly better global solutions to the model-based clustering problem. We also show that the global optimum matters in the sense that it significantly improves the clustering task. Furthermore, we provide a theoretical proof of global convergence to the optimal solution of the likelihood function of Gaussian mixtures for one of the algorithms, namely MRAS-CD. This offers support that the algorithm is not merely an ad hoc heuristic, but is systematically designed to produce global solutions to Gaussian mixture models. Finally, we investigate the fitness landscape of Gaussian mixture models and give evidence for why this is a difficult global optimization problem. We discuss different metrics that can be used to evaluate the difficulty of global optimization problems, and then apply them to the context of Gaussian mixture models.
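
    The covariance-simulation step that distinguishes CE-CD and MRAS-CD can be sketched in a few lines of R. The idea is to parameterize each candidate covariance matrix by a Cholesky factor, so that unconstrained random entries always map back to a valid positive definite matrix. The code below is an illustrative assumption about that construction, not an excerpt from the dissertation.

    # Sample a random positive definite covariance matrix via its Cholesky factor:
    # for an upper-triangular U with positive diagonal, t(U) %*% U is positive
    # definite by construction, so no rejection or repair step is needed.
    random_covariance <- function(d, scale = 1) {
      U <- matrix(0, d, d)
      U[upper.tri(U)] <- rnorm(d * (d - 1) / 2, sd = scale)  # free off-diagonal entries
      diag(U) <- abs(rnorm(d, sd = scale)) + 1e-6            # strictly positive diagonal
      t(U) %*% U
    }

    set.seed(42)
    Sigma <- random_covariance(3)
    all(eigen(Sigma, symmetric = TRUE)$values > 0)  # TRUE: a valid covariance matrix

    In a CE- or MRAS-style search, it is the entries of U, rather than Sigma itself, that are perturbed and re-sampled from the adaptively updated sampling distribution.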

    Bioinformatics

    This book is divided into different research areas relevant to bioinformatics, such as biological networks, next-generation sequencing, high-performance computing, molecular modeling, structural bioinformatics, and intelligent data analysis. Each book section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research work presented here.

    New global optimization algorithms for model-based clustering

    The Expectation-Maximization (EM) algorithm is a very popular optimization tool for mixture problems, and in particular for model-based clustering problems. However, while the algorithm is convenient to implement and numerically very stable, it only produces local solutions. Thus, it may not achieve the globally optimal solution in problems that have a large number of local optima. This paper introduces several new algorithms designed to produce global solutions in model-based clustering. The building blocks for these algorithms are methods from the operations research literature, namely the Cross-Entropy (CE) method and Model Reference Adaptive Search (MRAS). One problem with applying these methods directly is the efficient simulation of positive definite covariance matrices. We propose several new solutions to this problem. One solution is to apply the principles of Expectation-Maximization updating, which leads to two new algorithms, CE-EM and MRAS-EM. We also propose two additional algorithms, CE-CD and MRAS-CD, which rely on the Cholesky decomposition. We conduct numerical experiments of varying complexity to evaluate the effectiveness of the proposed algorithms in comparison to classical EM. We find that although a single run of the new algorithms is slower than a single run of EM, all have the potential to produce significantly better solutions. We also find that although repeated application of EM may achieve similar results, our algorithms provide automated, data-driven decision rules that may significantly reduce the burden of searching for the global optimum.
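
    The multi-start EM baseline referred to above is easy to reproduce in miniature. The sketch below (illustrative code written for this summary, not the paper's implementation) runs a common-variance univariate EM from many random initializations; the spread of final log-likelihoods across starts is exactly the multimodality that CE-EM, MRAS-EM, CE-CD, and MRAS-CD are designed to escape.

    # Univariate EM returning the final log-likelihood; a common component
    # variance keeps the sketch short and the likelihood bounded.
    em_loglik <- function(y, mu, max_iter = 100) {
      k <- length(mu); n <- length(y)
      s2 <- var(y); pi_ <- rep(1 / k, k)
      for (it in seq_len(max_iter)) {
        dens <- sapply(1:k, function(j) pi_[j] * dnorm(y, mu[j], sqrt(s2)))
        post <- dens / rowSums(dens)          # E-step: responsibilities
        nj   <- colSums(post)                 # M-step: weights, means, variance
        pi_  <- nj / n
        mu   <- colSums(post * y) / nj
        s2   <- sum(post * outer(y, mu, "-")^2) / n
      }
      sum(log(rowSums(sapply(1:k, function(j) pi_[j] * dnorm(y, mu[j], sqrt(s2))))))
    }

    set.seed(7)
    y <- c(rnorm(100, 0), rnorm(100, 3), rnorm(100, 9))    # three well-separated clusters
    lls <- replicate(20, em_loglik(y, mu = sample(y, 3)))  # 20 random restarts
    range(lls)   # spread across starts reveals multiple local optima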