APPLE: Approximate Path for Penalized Likelihood Estimators

Author: Feng Yang
Yu Yi
Publication date: 01/01/2013

In high-dimensional data analysis, penalized likelihood estimators have been shown to provide superior results in both variable selection and parameter estimation. A new algorithm, APPLE, is proposed for calculating the Approximate Path for Penalized Likelihood Estimators. Both convex penalties (such as LASSO) and nonconvex penalties (such as SCAD and MCP) are considered. APPLE efficiently computes the solution path for the penalized likelihood estimator using a hybrid of the modified predictor-corrector method and the coordinate-descent algorithm. APPLE is compared with several well-known packages via simulation and analysis of two gene expression data sets. Comment: 24 pages, 9 figures.
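The coordinate-descent component mentioned in this abstract can be illustrated with a minimal sketch. This is not the APPLE algorithm: the predictor-corrector step is omitted, and the sketch assumes a least-squares loss with a LASSO penalty, objective (1/(2n))·||y − Xβ||² + λ·||β||₁. The function names `soft_threshold`, `lasso_cd`, and `lasso_path` are hypothetical and chosen for illustration only.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, beta=None, n_sweeps=200, tol=1e-10):
    """Coordinate descent for (1/(2n)) * ||y - X beta||^2 + lam * ||beta||_1.

    X is a list of rows, y a list of responses. This is an illustrative
    stand-in, not the method of the paper.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p if beta is None else list(beta)
    # Residual r = y - X beta, kept up to date after every coordinate update.
    r = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    # (1/n) * x_j^T x_j for each column j.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # rho = (1/n) * x_j^T (r + x_j * beta_j): fit to the partial residual.
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * delta
                beta[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def lasso_path(X, y, lambdas):
    """Solve for decreasing lambda values, warm-starting from the previous solution."""
    beta, path = None, []
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_cd(X, y, lam, beta)
        path.append((lam, beta))
    return path
```

Warm starts along a decreasing sequence of λ values are a common way to trace an approximate solution path; how APPLE itself moves between path points is described in the paper, not here.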

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository

Explore Bristol Research

A Robust Hybrid Approach Based on Estimation of Distribution Algorithm and Support Vector Machine for Hunting Candidate Disease Genes

Author: Chang Liu
Fang Wang
Fangfang Zhang
Hongmei Chen
Li Li
Lihua Bai
Luying Peng
Yihan Chen
Publication venue: Hindawi Limited
Publication date: 01/01/2013

Microarray data are high-dimensional, with a high noise ratio and a relatively small sample size, which makes it a challenge to use them to identify candidate disease genes. Here, we present a hybrid method that combines an estimation of distribution algorithm with a support vector machine for selection of key feature genes. We benchmarked the method on microarray data for both diffuse B cell lymphoma and colon cancer to demonstrate its performance in identifying key features from high-dimensional gene expression profiles. The method was compared with a probabilistic model based on a genetic algorithm and with another hybrid method based on both a genetic algorithm and a support vector machine. The results show that the proposed method provides a new computational strategy for hunting candidate disease genes from disease gene expression profiles. The selected candidate disease genes may help to improve the diagnosis and treatment of diseases.

Crossref

Directory of Open Access Journals

An evolutionary approach for balancing effectiveness and representation level in gene selection

Author: CANNAS LAURA MARIA
DESSI NICOLETTA
PES BARBARA
Publication venue: IGI Global
Publication date: 01/01/2015

As data mining develops and expands to new application areas, feature selection also reveals various aspects to be considered. This paper underlines two aspects that seem to categorize the large body of available feature selection algorithms: the effectiveness and the representation level. The effectiveness deals with selecting the minimum set of variables that maximizes the accuracy of a classifier, and the representation level concerns discovering how relevant the variables are for the domain of interest. To balance these aspects, the paper proposes an evolutionary framework for feature selection that expresses a hybrid method organized in layers, each of which exploits a specific model of search strategy. Extensive experiments on gene selection from DNA-microarray datasets are presented and discussed. Results indicate that the framework compares well with different hybrid methods proposed in the literature, as it is capable of finding well-suited subsets of informative features while improving classification accuracy.

Crossref

Archivio istituzionale della ricerca - Università di Cagliari

Filter-Wrapper Methods For Gene Selection In Cancer Classification

Author: Alomari Osama Ahmad Suleiman
Publication date: 01/09/2018

In microarray gene expression studies, finding the smallest subset of informative genes from microarray datasets for clinical diagnosis and accurate cancer classification is one of the most difficult challenges in machine learning. Many researchers have devoted their efforts to addressing this problem by using a filter method, a wrapper method, or a combination of both approaches. A hybrid method combines filter and wrapper methods. It benefits from the speed of the filter approach and the accuracy of the wrapper approach. Several hybrid filter-wrapper methods have been proposed to select informative genes. However, hybrid methods inherit a number of limitations from the filter and wrapper approaches. The gene subset produced by filter approaches lacks predictiveness and robustness. The wrapper approach encounters problems of complex interactions among genes and stagnation in local optima. To address these drawbacks, this study investigates filter and wrapper methods to develop effective hybrid methods for gene selection. This study proposes new hybrid filter-wrapper methods based on Maximum Relevancy Minimum Redundancy (MRMR) as a filter approach and an adapted bat-inspired algorithm (BA) as a wrapper approach. First, MRMR hybridisation and BA adaptation are investigated to resolve the gene selection problem. The proposed method is called MRMR-BA.
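The filter stage named in this abstract can be sketched as a greedy relevance-minus-redundancy selection. The sketch below is an assumption-laden illustration rather than the thesis's MRMR-BA method: it covers only the filter step (the bat-inspired wrapper is omitted), it assumes discretised expression values, and it uses the difference form of the criterion (relevance minus mean redundancy). The names `mutual_information` and `mrmr` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())


def mrmr(features, labels, k):
    """Greedily pick k features maximising relevance minus mean redundancy.

    features: dict mapping a gene name to its list of discrete values per sample.
    labels:   list of class labels per sample.
    """
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected, remaining = [], list(features)

    def score(f):
        if not selected:
            return relevance[f]
        redundancy = sum(
            mutual_information(features[f], features[s]) for s in selected
        ) / len(selected)
        return relevance[f] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)  # ties go to the earliest gene
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: g2 duplicates g1, while g3 carries complementary information.
genes = {"g1": [0, 0, 1, 1], "g2": [0, 0, 1, 1], "g3": [0, 1, 0, 1]}
classes = [0, 1, 2, 3]
print(mrmr(genes, classes, 2))
```

In a hybrid scheme of the kind the abstract describes, a reduced gene list like this one would then be handed to a wrapper search that evaluates candidate subsets with a classifier.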

Repository@USM

Very Important Pool (VIP) genes – an application for microarray-based molecular signatures

Author: A Ben-Dor
A Bhattacharjee
A Butte
A Rosenwald
AK Jain
AL Bluma
B Liu
C Ambroise
C Ding
C Lai
D Singh
DG Beer
DJ Lockhart
EJ Yeoh
EK Tang
GJ Gordon
H Hackl
HH Zhang
Hong Fang
Huixiao Hong
IM Gana Dresen
InfoMetrix
J Dopazoa
J Gould
J Quackenbush
JG Zhang
JJ Chen
KE Lee
L Brehelin
L Breiman
L Ein-Dor
L Li
L Shi
L Wang
Leming Shi
LF Wessels
LJ van 't Veer
M Dettling
M Schena
MA Shipp
R Diaz-Uriarte
R Simon
Roger Perkins
S Dudoit
S Michiels
S Mukherjee
S Wold
SE Jarvis
SJ Raudys
SL Pomeroy
U Alon
U Lutz
VN Vapnik
W Jiang
Weida Tong
WJ Fu
X Chen
Y Peng
Y Wang
Z Su
Zhenqiang Su
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Advances in DNA microarray technology portend that molecular signatures derived from microarrays will eventually be used in clinical environments and personalized medicine. Derivation of biomarkers is a large step beyond hypothesis generation and imposes considerably more stringency for accuracy in identifying informative gene subsets to differentiate phenotypes. The inherent nature of microarray data, with few samples and replicates relative to the large number of genes, requires identifying informative genes prior to classifier construction. However, improving the ability to identify differentiating genes remains a challenge in bioinformatics.
Results: A new hybrid gene selection approach was investigated and tested with nine publicly available microarray datasets. The new method identifies a Very Important Pool (VIP) of genes from the broad patterns of gene expression data. The method uses a bagging sampling principle, where the re-sampled arrays are used to identify the most informative genes. Frequency of selection is used in a repetitive process to identify the VIP genes. The putative informative genes are selected using two methods, t-statistic and discriminatory analysis. In the t-statistic, the informative genes are identified based on p-values. In the discriminatory analysis, disjoint Principal Component Analyses (PCAs) are conducted for each class of samples, and genes with high discrimination power (DP) are identified. The VIP gene selection approach was compared with the p-value ranking approach. The genes identified by the VIP method but not by the p-value ranking approach are also related to the disease investigated. More importantly, these genes are part of the pathways derived from the common genes shared by both the VIP and p-ranking methods. Moreover, the binary classifiers built from these genes are statistically equivalent to those built from the top 50 p-value ranked genes in distinguishing different types of samples.
Conclusion: The VIP gene selection approach could identify additional subsets of informative genes that would not always be selected by the p-value ranking method. These genes are likely to be additional true positives since they are part of pathways identified by the p-value ranking method and are expected to be related to the relevant biology. Therefore, these additional genes derived from the VIP method potentially provide valuable biological insights.
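The bagging-and-frequency idea described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: genes are ranked by the absolute Welch t-statistic rather than by p-values, the PCA-based discriminatory analysis is omitted, resampling is done within each class, and every class needs at least two samples. The names `welch_t` and `vip_genes`, and the defaults `top_k`, `n_rounds`, and `min_freq`, are hypothetical.

```python
import random
from collections import Counter
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for two samples; 0.0 when the standard error is zero."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def vip_genes(expr, labels, top_k=1, n_rounds=50, min_freq=0.8, seed=0):
    """Return genes ranked in the top_k in at least min_freq of bootstrap rounds.

    expr:   dict mapping a gene name to its expression values per sample.
    labels: list of 0/1 class labels per sample (at least two samples per class).
    """
    rng = random.Random(seed)
    idx0 = [i for i, c in enumerate(labels) if c == 0]
    idx1 = [i for i, c in enumerate(labels) if c == 1]
    counts = Counter()
    for _ in range(n_rounds):
        # Bootstrap the arrays within each class so both classes stay represented.
        s0 = [rng.choice(idx0) for _ in idx0]
        s1 = [rng.choice(idx1) for _ in idx1]
        scores = {
            g: abs(welch_t([v[i] for i in s0], [v[i] for i in s1]))
            for g, v in expr.items()
        }
        for g in sorted(scores, key=scores.get, reverse=True)[:top_k]:
            counts[g] += 1
    # Keep the genes whose selection frequency reaches the threshold.
    return [g for g, c in counts.items() if c / n_rounds >= min_freq]
```

Counting how often a gene survives resampling favours genes whose ranking is stable across perturbed versions of the data, which is the motivation the abstract gives for using selection frequency.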

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Development and evaluation of machine learning algorithms for biomedical applications

Author: Turki Turki Talal
Publication venue: Digital Commons @ NJIT
Publication date: 01/04/2017

Gene network inference and drug response prediction are two important problems in computational biomedicine. The former helps scientists better understand the functional elements and regulatory circuits of cells. The latter helps a physician gain a full understanding of which treatments are effective for patients. Both problems have been widely studied, though current solutions are far from perfect. More research is needed to improve the accuracy of existing approaches. This dissertation develops machine learning and data mining algorithms and applies them to these two biomedical problems. Specifically, to tackle the gene network inference problem, the dissertation proposes (i) new techniques for selecting topological features suitable for link prediction in gene networks; (ii) a graph sparsification method for network sampling; (iii) combined supervised and unsupervised methods to infer gene networks; and (iv) sampling and boosting techniques for reverse engineering gene networks. For the drug sensitivity prediction problem, the dissertation presents (i) an instance selection technique and hybrid method for drug sensitivity prediction; (ii) a link prediction approach to drug sensitivity prediction; (iii) a noise-filtering method for drug sensitivity prediction; and (iv) transfer learning approaches for enhancing the performance of drug sensitivity prediction. Substantial experiments are conducted to evaluate the effectiveness and efficiency of the proposed algorithms. Experimental results demonstrate the feasibility of the algorithms and their superiority over the existing approaches.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Harnessing Saccharomyces cerevisiae Genetics for Cell Engineering

Author: Wingler Laura Michele
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2011

Cell engineering holds the promise of creating designer microorganisms that can address some of society's most pressing needs, ranging from the production of biofuels and drugs to the detection of disease states or environmental contaminants. Realizing these goals will require the extensive reengineering of cells, which will be a formidable task due both to our incomplete understanding of the cell at the systems level and to the technical difficulty of manipulating the genome on a large scale. In Chapter 1, we begin by discussing the potential of directed evolution approaches to overcome the challenges of cell engineering. We then cover the methodologies that are emerging to adapt the mutagenesis and selection steps of directed evolution for in vivo, multi-component systems. Yeast hybrid assays provide versatile systems for coupling a function of interest to a high-throughput growth selection for directed evolution. In Chapter 2, we develop an experimental framework to characterize and optimize the performance of yeast two- and three-hybrid growth selections. Using the LEU2 reporter gene as a model selectable marker, we show that quantitative characterization of these assay systems allows us to identify key junctures for optimization. In Chapter 3, we apply the same systematic characterization to the yeast three-hybrid counter selection, beginning with our previously reported URA3 reporter. We further develop a screening approach to identify effective new yeast three-hybrid counter selection reporters. Installing customized multi-gene pathways in the cell is arguably the first step of any cell engineering endeavor. Chapter 4 describes the design, construction, and initial validation of Reiterative Recombination, a robust in vivo DNA assembly method relying on homing endonuclease-stimulated homologous recombination. 
Reiterative Recombination elongates constructs of interest in a stepwise manner by employing pairs of alternating, orthogonal endonucleases and selectable markers. We anticipate that Reiterative Recombination will be a valuable tool for a variety of cell engineering endeavors because it is both highly efficient and technically straightforward. As an initial application, we illustrate Reiterative Recombination's utility in the area of metabolic engineering in Chapter 5. Specifically, we demonstrate that we can build functional biosynthetic pathways and generate large libraries of pathways in vivo. The facility of pathway construction by Reiterative Recombination should expedite strain optimization for metabolic engineering.

Columbia University Academic Commons

Differential introgression and the maintenance of species boundaries in an advanced generation avian hybrid zone

Author: Kovach Adrienne I.
Olsen Brian J.
Shriver W. Gregory
Walsh Jennifer
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 22/03/2016

Background: Evolutionary processes, including selection and differential fitness, shape the introgression of genetic material across a hybrid zone, resulting in the exchange of some genes but not others. Differential introgression of molecular or phenotypic markers can thus provide insight into factors contributing to reproductive isolation. We characterized patterns of genetic variation across a hybrid zone between two tidal marsh birds, Saltmarsh (Ammodramus caudacutus) and Nelson’s (A. nelsoni) sparrows (n = 286), and compared patterns of introgression among multiple genetic markers and phenotypic traits. Results: Geographic and genomic cline analyses revealed variable patterns of introgression among marker types. Most markers exhibited gradual clines and indicated that introgression exceeds the spatial extent of the previously documented hybrid zone. We found steeper clines, indicating strong selection, for loci associated with traits related to tidal marsh adaptations. These included a marker linked to a gene region associated with metabolic functions, including an osmotic regulatory pathway, and a marker related to melanin-based pigmentation, supporting an adaptive role of darker plumage (salt marsh melanism) in tidal marshes. Narrow clines at mitochondrial and sex-linked markers also offer support for Haldane’s rule. We detected patterns of asymmetrical introgression toward A. caudacutus, which may be driven by differences in mating strategy or differences in population density between the two species. Conclusions: Our findings offer insight into the dynamics of a hybrid zone traversing a unique environmental gradient and provide evidence for a role of ecological divergence in the maintenance of pure species boundaries despite ongoing gene flow.

UNH Scholars' Repository