The partitioned LASSO-patternsearch algorithm with application to gene expression data
In systems biology, the task of reverse engineering gene pathways from data has been limited not just by the curse of dimensionality (the interaction space is huge) but also by systematic error in the data. The gene expression barcode reduces spurious association driven by batch effects and probe effects. The binary nature of the resulting expression calls lends itself perfectly to modern regularization approaches that thrive in high-dimensional settings. The Partitioned LASSO-Patternsearch algorithm is proposed to identify patterns of multiple dichotomous risk factors for outcomes of interest in genomic studies. A partitioning scheme is used to identify promising patterns by solving many LASSO-Patternsearch subproblems in parallel. All variables that survive this stage proceed to an aggregation stage where the most significant patterns are identified by solving a reduced LASSO-Patternsearch problem in just these variables. This approach was applied to genetic data sets with expression levels dichotomized by the gene expression barcode. Most of the genes and second-order interactions thus selected are known to be related to the outcomes. We demonstrate with simulations and data analyses that the proposed method not only selects variables and patterns more accurately, but also provides smaller models with better prediction accuracy, in comparison to several alternative methodologies. https://doi.org/10.1186/1471-2105-13-9
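The two-stage idea in this abstract (partition, then aggregate) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes contiguous variable blocks, a fixed penalty `lam`, patterns limited to main effects and pairwise products, and a plain proximal-gradient solver for L1-penalized logistic regression. All function names and the toy data are hypothetical.

```python
import itertools
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def l1_logistic(columns, y, lam, steps=200, lr=0.2):
    """L1-penalized logistic regression via proximal gradient descent.

    columns: list of 0/1 feature columns; the intercept is not penalized.
    """
    n = len(y)
    coef = [0.0] * len(columns)
    intercept = 0.0
    for _ in range(steps):
        resid = []
        for i in range(n):
            z = intercept + sum(c * col[i] for c, col in zip(coef, columns))
            resid.append(1.0 / (1.0 + math.exp(-z)) - y[i])
        intercept -= lr * sum(resid) / n
        for j, col in enumerate(columns):
            grad = sum(r * x for r, x in zip(resid, col)) / n
            coef[j] = soft_threshold(coef[j] - lr * grad, lr * lam)
    return intercept, coef


def build_patterns(variables, X):
    """Main effects and second-order interactions of binary variables."""
    patterns = [(j,) for j in variables]
    patterns += list(itertools.combinations(variables, 2))
    # For 0/1 data, the minimum over a pattern equals the product.
    columns = [[min(row[j] for j in pat) for row in X] for pat in patterns]
    return patterns, columns


def select_patterns(variables, X, y, lam):
    """Return the patterns that receive a nonzero coefficient."""
    patterns, columns = build_patterns(variables, X)
    _, coef = l1_logistic(columns, y, lam)
    return [pat for pat, c in zip(patterns, coef) if c != 0.0]


def partitioned_search(X, y, block_size, lam):
    """Stage 1: one subproblem per block. Stage 2: refit on survivors."""
    p = len(X[0])
    blocks = [list(range(s, min(s + block_size, p)))
              for s in range(0, p, block_size)]
    survivors = set()
    for block in blocks:  # subproblems are independent, so they could run in parallel
        for pat in select_patterns(block, X, y, lam):
            survivors.update(pat)
    survivors = sorted(survivors)
    final = select_patterns(survivors, X, y, lam)
    return survivors, final


# Toy data (assumption): 8 binary variables, outcome driven entirely by variable 0.
rng = random.Random(0)
X = [[1 if rng.random() < 0.5 else 0 for _ in range(8)] for _ in range(120)]
y = [row[0] for row in X]
survivors, final = partitioned_search(X, y, block_size=4, lam=0.05)
```

In this sketch the aggregation stage simply reruns the same penalized fit on patterns built from the surviving variables. How the penalty is tuned and how partitions are drawn are left open here, since the abstract does not specify them.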
Measuring memetic algorithm performance on image fingerprints dataset
Personal identification has become one of the most important issues in our society, with applications in access control, crime and forensic identification, banking, and computer systems. The fingerprint is the most widely used biometric feature because of its uniqueness, universality, and stability. It is widely used as a security feature for forensic recognition, building access, and automatic teller machine (ATM) authentication or payment. Fingerprint recognition can be grouped into two forms: verification and identification. Verification compares fingerprint data one to one. Identification matches an input fingerprint against the data saved in a database. In this paper, we measure the performance of the memetic algorithm in processing an image fingerprint dataset. Before running the algorithm, we divide the fingerprints into four groups according to their characteristics and prepare 15 data specimens. We then run four partial tests and, finally, measure the overall computation time.
Multivariate Bernoulli distribution
In this paper, we consider the multivariate Bernoulli distribution as a model
to estimate the structure of graphs with binary nodes. This distribution is
discussed in the framework of the exponential family, and its statistical
properties regarding independence of the nodes are demonstrated. Importantly,
the model can estimate not only the main effects and pairwise interactions
among the nodes but also is capable of modeling higher order interactions,
allowing for the existence of complex clique effects. We compare the
multivariate Bernoulli model with existing graphical inference models - the
Ising model and the multivariate Gaussian model, where only the pairwise
interactions are considered. On the other hand, the multivariate Bernoulli
distribution has an interesting property in that independence and
uncorrelatedness of the component random variables are equivalent. Both the
marginal and conditional distributions of a subset of variables in the
multivariate Bernoulli distribution still follow the multivariate Bernoulli
distribution. Furthermore, the multivariate Bernoulli logistic model is
developed under generalized linear model theory by utilizing the canonical link
function in order to include covariate information on the nodes, edges and
cliques. We also consider variable selection techniques such as LASSO in the
logistic model to impose sparsity structure on the graph. Finally, we discuss
extending the smoothing spline ANOVA approach to the multivariate Bernoulli
logistic model to enable estimation of non-linear effects of the predictor
variables. Comment: Published at http://dx.doi.org/10.3150/12-BEJSP10 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
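As a small worked case of the exponential-family view described in this abstract, the bivariate Bernoulli distribution can be written out explicitly. The notation below ($p_{ab}$, $f_1$, $f_2$, $f_{12}$) is chosen here for illustration and assumes all four cell probabilities are strictly positive.

```latex
% Bivariate case: p_{ab} = P(Y_1 = a, Y_2 = b), with p_{00} + p_{10} + p_{01} + p_{11} = 1
P(Y_1 = y_1, Y_2 = y_2)
  = p_{00}^{(1-y_1)(1-y_2)} \, p_{10}^{y_1(1-y_2)} \, p_{01}^{(1-y_1)y_2} \, p_{11}^{y_1 y_2}

% Taking logs and collecting terms in y_1, y_2, and y_1 y_2:
\log P(Y_1 = y_1, Y_2 = y_2)
  = \log p_{00} + y_1 f_1 + y_2 f_2 + y_1 y_2 f_{12},
\qquad
f_1 = \log\frac{p_{10}}{p_{00}}, \quad
f_2 = \log\frac{p_{01}}{p_{00}}, \quad
f_{12} = \log\frac{p_{11}\,p_{00}}{p_{10}\,p_{01}}

% Covariance of the two components:
\operatorname{Cov}(Y_1, Y_2)
  = p_{11} - (p_{11} + p_{10})(p_{11} + p_{01})
  = p_{11}\,p_{00} - p_{10}\,p_{01}
```

Here $f_{12} = 0$ exactly when $p_{11} p_{00} = p_{10} p_{01}$, which is both the condition for the joint probability to factor into its marginals and the condition for zero covariance. This illustrates, for two nodes, the equivalence of independence and uncorrelatedness that the abstract mentions.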