Detecting reliable gene interactions by a hierarchy of Bayesian network classifiers

Author: Alizadeh
Alon
Baker
Bayly
Blanco
Bontempi
Bouchard
Bouckaert
Braga-Neto
Causton
Duda
Efron
Francí
Friedman
Friedman
Friedman
Friedman
Fujita
Fukao
García
García
Garey
Golub
Greenbaum
Hall
Hall
Hartemink
Heckerman
Iñaki Inza
Kerber
Larrañaga
Lee
Li
Liang
Lin
Matusiak
Michiels
Minsky
Monti
Murayama
Pedro Larrañaga
Peña
Peña
Pe’er
Pe’er
Polyak
Rapaport
Rubén Armañanzas
Saeys
Sahami
Sakakura
Schwartz
Shmulevich
Simon
Stamatos
Statnikov
Swift
Takahashi
Wang
Wang
Wang
Yang
Zhang
Zhou
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

The main purpose of a gene interaction network is to map the relationships among genes that are out of sight when a genomic study is tackled. DNA microarrays allow the expression of thousands of genes to be measured at the same time. These data constitute the numeric seed for the induction of the gene networks. In this paper, we propose a new approach to build gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling. The interactions induced by the Bayesian classifiers are based both on the expression levels and on the phenotype information of the supervised variable. Feature selection and bootstrap resampling add reliability and robustness to the overall process, removing the false positive findings. The consensus among all the induced models produces a hierarchy of dependences and, thus, of variables. Biologists can define the depth level of the model hierarchy so the set of interactions and genes involved can vary from a sparse to a dense set. Experimental results show that these networks perform well on classification tasks. The biological validation matches previous biological findings and opens new hypotheses for future studies.
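The consensus-over-bootstrap idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the Bayesian classifier induction and the feature selection step are replaced here by a simple stand-in (thresholding absolute Pearson correlation), and the names `bootstrap_edge_frequencies`, `consensus_edges`, `threshold` and `min_freq` are hypothetical. Only the resampling and the frequency-based consensus are meant to mirror the description above.

```python
import random
from collections import Counter
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def bootstrap_edge_frequencies(data, n_boot=200, threshold=0.6, seed=0):
    """Return {(gene_a, gene_b): fraction of bootstrap resamples with an edge}.

    `data` maps a gene name to its list of expression values (equal lengths).
    ASSUMPTION: an "edge" is |correlation| >= threshold, a stand-in for the
    dependences a Bayesian network classifier would induce.
    """
    rng = random.Random(seed)
    names = sorted(data)
    n_samples = len(data[names[0]])
    counts = Counter()
    for _ in range(n_boot):
        # Resample the samples (columns of the microarray) with replacement.
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        sample = {g: [data[g][i] for i in idx] for g in names}
        for a, b in combinations(names, 2):
            if abs(pearson(sample[a], sample[b])) >= threshold:
                counts[(a, b)] += 1
    return {edge: c / n_boot for edge, c in counts.items()}


def consensus_edges(freqs, min_freq):
    """Keep edges found in at least `min_freq` of the resamples."""
    return sorted(edge for edge, f in freqs.items() if f >= min_freq)
```

Lowering `min_freq` in this sketch moves the result from a sparse to a denser edge set, which is loosely analogous to choosing a depth level in the hierarchy the abstract describes.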

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Ensemble transcript interaction networks: A case study on Alzheimer's disease

Author: Abeel
Arai
Armañanzas
Armañanzas
Baker
Concha Bielza
Friedman
Froyen
Grupe
Harper
Ho
Hock
Kang
Kerber
Larrañaga
Liang
Miller
Nakashima
Noorbakhsh
Otaegui
Pandav
Pedro Larrañaga
Peña
Pe’er
Price
Rapaport
Rubén Armañanzas
Saeys
Sahami
Small
Sáenz
Tzang
Ueberham
Villoslada
Wall
Wu
Yang
Zhu
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

Systems biology techniques are a topic of recent interest within the neurological field. Computational intelligence (CI) addresses this holistic perspective by means of consensus or ensemble techniques ultimately capable of uncovering new and relevant findings. In this paper, we propose the application of a CI approach based on ensemble Bayesian network classifiers and multivariate feature subset selection to induce probabilistic dependences that could match or unveil biological relationships. The research focuses on the analysis of high-throughput Alzheimer's disease (AD) transcript profiling. The analysis is conducted from two perspectives. First, we compare the expression profiles of hippocampus subregion entorhinal cortex (EC) samples of AD patients and controls. Second, we use the ensemble approach to study four types of samples: EC and dentate gyrus (DG) samples from both patients and controls. Results disclose transcript interaction networks with remarkable structures and genes not directly related to AD by previous studies. The ensemble is able to identify a variety of transcripts that play key roles in other neurological pathologies. Classical statistical assessment by means of non-parametric tests confirms the relevance of the majority of the transcripts. The ensemble approach pinpoints key metabolic mechanisms that could lead to new findings in the pathogenesis and development of AD.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Choosing negative examples for the prediction of protein-protein interactions

Author: Ben-Hur Asa
Noble William Stafford
Publication venue: BioMed Central
Publication date: 01/01/2006

The protein-protein interaction networks of even well-studied model organisms are sketchy at best, highlighting the continued need for computational methods to help direct experimentalists in the search for novel interactions. This need has prompted the development of a number of methods for predicting protein-protein interactions based on various sources of data and methodologies. The common method for choosing negative examples for training a predictor of protein-protein interactions is based on annotations of cellular localization, and the observation that pairs of proteins that have different localization patterns are unlikely to interact. While this method leads to high-quality sets of non-interacting proteins, we find that this choice can lead to biased estimates of prediction accuracy, because the constraints placed on the distribution of the negative examples make the task easier. The effects of this bias are demonstrated in the context of both sequence-based and non-sequence-based features used for predicting protein-protein interactions.

Crossref

Springer - Publisher Connector

PubMed Central

A review of estimation of distribution algorithms in bioinformatics

Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
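To make the EDA loop described in this abstract concrete, here is a minimal sketch of one of the simplest variants, a univariate marginal distribution algorithm (UMDA), applied to the toy OneMax problem (maximize the number of ones in a bit string). The choice of UMDA, the OneMax objective, the function name `umda_onemax` and all parameter values are assumptions made for illustration; they are not taken from the review.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=50, seed=0):
    """Run a univariate EDA on OneMax; return (best individual, final model)."""
    rng = random.Random(seed)
    # Probabilistic model: one independent Bernoulli parameter per bit.
    probs = [0.5] * n_bits
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # 2. Select the promising solutions (most ones first).
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # 3. Re-estimate the model from the selected solutions,
        #    clamping each marginal so no bit value becomes impossible.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(max(freq, 0.05), 0.95)
    return best, probs
```

Unlike a genetic algorithm, this sketch has no crossover or mutation operators: new candidates come only from sampling the learnt model. More complex EDA variants replace the independent marginals with models that capture dependences between variables.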

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Directory of Open Access Journals

Ghent University Academic Bibliography

PubMed Central

Archivo Digital para la Docencia y la Investigación

Academica-e

Archivo Digital UPM

Identification of a biomarker panel for colorectal cancer diagnosis

Background: Malignancies arising in the large bowel cause the second largest number of deaths from cancer in the Western world. Despite the progress made during the last decades, colorectal cancer remains one of the most frequent and deadly neoplasias in Western countries. Methods: A genomic study of human colorectal cancer has been carried out on a total of 31 tumoral samples, corresponding to different stages of the disease, and 33 non-tumoral samples. The study was carried out by hybridisation of the tumour samples against a reference pool of non-tumoral samples using Agilent Human 1A 60-mer oligo microarrays. The results obtained were validated by qRT-PCR. In the subsequent bioinformatics analysis, gene networks were built by means of Bayesian classifiers, variable selection and bootstrap resampling. The consensus among all the induced models produced a hierarchy of dependences and, thus, of variables. Results: After an exhaustive pre-processing stage to ensure data quality (missing value imputation, probe quality, data smoothing and intraclass variability filtering), the final dataset comprised a total of 8,104 probes. Next, a supervised classification approach and data analysis were carried out to obtain the most relevant genes. Two of them are directly involved in cancer progression and in particular in colorectal cancer. Finally, a supervised classifier was induced to classify new unseen samples. Conclusions: We have developed a tentative model for the diagnosis of colorectal cancer based on a biomarker panel. Our results indicate that the gene profile described herein can discriminate between non-cancerous and cancerous samples with 94.45% accuracy using different supervised classifiers (AUC values between 0.955 and 0.997).

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivo Digital para la Docencia y la Investigación

Archivo Digital UPM

Differential Micro RNA Expression in PBMC from Multiple Sclerosis Patients

Author: Adolfo López de Munain
Ana Asensio
Borja Calvo
David Otaegui
Iñaki Inza
Janet Kelso
Javier Olaskoaga
Jose A. Lozano
Maider Muñoz-Culla
Puya Khankhanian
Ruben Armañanzas
Sergio E. Baranzini
Tamara Castillo-Triviño
Publication venue: Public Library of Science
Publication date: 01/01/2009

Differences in gene expression patterns have been documented not only in multiple sclerosis patients versus healthy controls but also in the relapse of the disease. Recently a new gene expression modulator has been identified: the microRNA or miRNA. The aim of this work is to analyze the possible role of miRNAs in multiple sclerosis, focusing on the relapse stage. We have analyzed the expression patterns of 364 miRNAs in PBMC obtained from multiple sclerosis patients in relapse status, in remission status and healthy controls. The expression patterns of the miRNAs with significantly different expression were validated in an independent set of samples. In order to determine the effect of the miRNAs, the expression of some of their predicted target genes was studied by qPCR. Gene interaction networks were constructed in order to obtain a co-expression and multivariate view of the experimental data. The data analysis and later validation reveal that two miRNAs (hsa-miR-18b and hsa-miR-599) may be relevant at the time of relapse and that another miRNA (hsa-miR-96) may be involved in remission. The genes targeted by hsa-miR-96 are involved in immunological pathways such as interleukin signaling and in other pathways such as Wnt signaling. This work highlights the importance of miRNA expression in the molecular mechanisms implicated in the disease. Moreover, the proposed involvement of these small molecules in multiple sclerosis opens up a new therapeutic approach to explore and highlights some candidate biomarker targets in MS.

Public Library of Science (PLOS)

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

PubMed Central

Archivo Digital para la Docencia y la Investigación

Archivo Digital UPM

Discovering meaning from biological sequences: focus on predicting misannotated proteins, binding patterns, and G4-quadruplex secondary

Author: Andorf Carson Michael
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2013

Proteins are the principal catalytic agents, structural elements, signal transmitters, transporters, and molecular machines in cells. Experimental determination of protein function is expensive in time and resources compared to computational methods. Hence, assigning protein function, predicting protein binding patterns, and understanding protein regulation are important problems in functional genomics and key challenges in bioinformatics. This dissertation comprises three studies. In the first two papers, we apply machine-learning methods to (1) identify misannotated sequences and (2) predict the binding patterns of proteins. The third paper is (3) a genome-wide analysis of G4-quadruplex sequences in the maize genome. The first two papers are based on two-stage classification methods. The first stage uses machine-learning approaches that combine composition-based and sequence-based features. We use either decision trees (HDTree) or support vector machines (SVM) as second-stage classifiers and show that classification performance matches or outperforms more computationally expensive approaches. For study (1), our method identified potential misannotated sequences within a well-characterized set of proteins in a popular bioinformatics database. We identified misannotated proteins and show that these proteins have contradictory AmiGO and UniProt annotations. For study (2), we developed a three-phase approach: Phase I classifies whether a protein binds with another protein. Phase II determines whether a protein-binding protein is a hub. Phase III classifies hub proteins based on the number of binding sites and the number of concurrent binding partners. For study (3), we carried out a computational genome-wide screen to identify non-telomeric G4-quadruplex (G4Q) elements in maize to explore their potential role in gene regulation for flowering plants. Analysis of G4Q-containing genes uncovered a striking tendency for their enrichment in genes of networks and pathways associated with electron transport, sugar degradation, and hypoxia responsiveness. The maize G4Q elements may play a previously unrecognized role in coordinating global regulation of gene expression in response to hypoxia to control carbohydrate metabolism for anaerobic metabolism. We demonstrated that our three studies have the ability to predict and provide new insights into classifying misannotated proteins, understanding protein binding patterns, and identifying a potentially new model for gene regulation.

Digital Repository @ Iowa State University (ISU)

Prediction of lung tumor types based on protein attributes by machine learning algorithms

Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

In Silico Gene Regulatory Network of the Maurer’s Cleft Pathway in Plasmodium falciparum

Author: Isewon Itunuoluwa
Oyelade O. J.
Brors B.
Adebiyi Ezekiel
Publication date: 01/01/2015

The Maurer’s clefts (MCs) are very important for the survival of Plasmodium falciparum within an infected cell, as they are induced by the parasite itself in the erythrocyte for protein trafficking. The MCs form an interesting part of the parasite’s biology as they shed more light on how the parasite remodels the erythrocyte, leading to host pathogenesis and death. Here, we predicted and analyzed the genetic regulatory network of genes identified as belonging to the MCs using a regularized graphical Gaussian model. Our network shows four major activators, their corresponding target genes, and predicted binding sites. One of these master activators is the serine repeat antigen 5 (SERA5), predominantly expressed among the SERA multigene family of P. falciparum, which is one of the blood-stage malaria vaccine candidates. Our results provide more details about functional interactions and the regulation of the genes in the MCs’ pathway of P. falciparum.

Almae Matris Studiorum Campus

Covenant University Repository

Crossref

Directory of Open Access Journals