Search CORE

97,569 research outputs found

Gene expression programming approach to event selection in high energy physics

Author: Teodorescu L
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2006

Gene Expression Programming is a new evolutionary algorithm that overcomes many limitations of the more established Genetic Algorithms and Genetic Programming. Its first application to high energy physics data analysis is presented. The algorithm was successfully used for event selection on samples with both low and high background levels. It allowed automatic identification of selection rules that can be interpreted as cuts applied on the input variables. The signal/background classification accuracy was over 90% in all cases.

Crossref

Brunel University Research Archive

Spectral analysis of gene expression profiles using gene networks

Author: Barillot Emmanuel
Dutreix Marie
Rapaport Franck
Vert Jean-Philippe
Zinovyev Andrei
Publication date: 26/03/2006

Microarrays have become extremely useful for analysing genetic phenomena, but establishing a relation between microarray analysis results (typically a list of genes) and their biological significance is often difficult. Currently, the standard approach is to map a posteriori the results onto gene networks to elucidate the functions perturbed at the level of pathways. However, integrating a priori knowledge of the gene networks could help in the statistical analysis of gene expression data and in their biological interpretation. Here we propose a method to integrate a priori the knowledge of a gene network in the analysis of gene expression data. The approach is based on the spectral decomposition of gene expression profiles with respect to the eigenfunctions of the graph, resulting in an attenuation of the high-frequency components of the expression profiles with respect to the topology of the graph. We show how to derive unsupervised and supervised classification algorithms of expression profiles, resulting in classifiers with biological relevance. We applied the method to the analysis of a set of expression profiles from irradiated and non-irradiated yeast strains. It performed at least as well as the usual classification but provides much more biologically relevant results and allows a direct biological interpretation
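The spectral filtering described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' exact formulation: it assumes the graph is represented by its adjacency matrix $A$ and degree matrix $D$, that "eigenfunctions of the graph" means the eigenvectors $e_i$ of the graph Laplacian, and that the attenuation uses an exponential filter $\phi$ with a hypothetical smoothing parameter $\beta > 0$.

```latex
% Graph Laplacian and its eigendecomposition (eigenvalues act as "frequencies")
L = D - A, \qquad L e_i = \lambda_i e_i, \qquad 0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n

% Spectral (Fourier-like) coefficients of an expression profile f over the n genes
\hat{f}_i = e_i^{\top} f, \qquad f = \sum_{i=1}^{n} \hat{f}_i \, e_i

% Attenuating high-frequency components: phi decreases with lambda (assumed form)
\tilde{f} = \sum_{i=1}^{n} \phi(\lambda_i) \, \hat{f}_i \, e_i, \qquad \phi(\lambda) = e^{-\beta \lambda}
```

Components with small $\lambda_i$ vary slowly across connected genes and are kept, while components with large $\lambda_i$ differ sharply between neighbouring genes and are damped, so the filtered profile $\tilde{f}$ is smooth with respect to the network topology.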

arXiv.org e-Print Archive

HAL-MINES ParisTech

A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data

Author: Baur Brittany
Bozdag Serdar
Publication venue: e-Publications@Marquette
Publication date: 01/01/2016

DNA methylation is an important epigenetic event that affects gene expression during development and in various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
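Sequential forward selection wrapped around a K-Nearest Neighbors classifier, as named in this abstract, can be illustrated with a small sketch. It is a toy version under stated assumptions, not the authors' implementation: the function names, the leave-one-out scoring, and the squared Euclidean distance are all choices made here for illustration, and the rows of `X` stand in for samples with one column per probe.

```python
def knn_loo_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of a k-NN classifier using only `features`."""
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample.
        dists = sorted(
            (sum((X[i][f] - X[j][f]) ** 2 for f in features), j)
            for j in range(len(X))
            if j != i
        )
        votes = [y[j] for _, j in dists[:k]]
        # Majority vote; sorting the labels makes tie-breaking deterministic.
        predicted = max(sorted(set(votes)), key=votes.count)
        if predicted == y[i]:
            correct += 1
    return correct / len(X)


def sequential_forward_selection(X, y, n_select, score=knn_loo_accuracy):
    """Greedily add the feature that most improves the score at each step."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): column 0 separates the classes, column 1 does not.
X = [[0.1, 5.0, 0.9], [0.2, 1.0, 0.1], [0.15, 3.0, 0.5],
     [0.9, 5.1, 0.8], [0.8, 1.1, 0.2], [0.95, 3.1, 0.45]]
y = [0, 0, 0, 1, 1, 1]
chosen = sequential_forward_selection(X, y, n_select=1)
```

The greedy loop evaluates each candidate probe together with those already chosen, so a probe is judged by what it adds to the current subset rather than in isolation.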

epublications@Marquette

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Classification of microarray data using gene networks

Author: A Sivachenko
Andrei Zinovyev
B Mohar
B Schölkopf
BE Boser
D Cavalieri
D Hanisch
D Hosack
Emmanuel Barillot
Franck Rapaport
FRK Chung
G Mercier
I Gat-Viks
I Jolliffe
J Rahnenfuhrer
J van Helden
JC Liao
Jean-Philippe Vert
JM Stuart
JP Vert
KR Curtis
Marie Dutreix
O Babur
O Radulescu
P Kharchenko
P Shannon
PD Karp
R Kelley
R Thomas
SJ Galbraith
T Breslin
T Hastie
TGO Consortium
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Microarrays have become extremely useful for analysing genetic phenomena, but establishing a relation between microarray analysis results (typically a list of genes) and their biological significance is often difficult. Currently, the standard approach is to map a posteriori the results onto gene networks in order to elucidate the functions perturbed at the level of pathways. However, integrating a priori knowledge of the gene networks could help in the statistical analysis of gene expression data and in their biological interpretation. RESULTS: We propose a method to integrate a priori the knowledge of a gene network in the analysis of gene expression data. The approach is based on the spectral decomposition of gene expression profiles with respect to the eigenfunctions of the graph, resulting in an attenuation of the high-frequency components of the expression profiles with respect to the topology of the graph. We show how to derive unsupervised and supervised classification algorithms of expression profiles, resulting in classifiers with biological relevance. We illustrate the method with the analysis of a set of expression profiles from irradiated and non-irradiated yeast strains. CONCLUSION: Including a priori knowledge of a gene network for the analysis of gene expression data leads to good classification performance and improved interpretability of the results

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL-MINES ParisTech

Using Network Component Analysis to Dissect Regulatory Networks Mediated by Transcription Factors in Yeast

Author: A Boulesteix
A Ghazalpour
A Subramanian
B Efron
BE Stranger
C Sabatti
C Ye
Chun Ye
CT Harbison
D GuhaThakurta
D Kulp
Edmund J. Crampin
EJ Chesler
Eleazar Eskin
EN Smith
F Gao
G Rustici
G Yvert
GA Churchill
James C. Liao
JC Liao
JD Storey
JJ Keurentjes
K Tedford
L Bardwell
L Bystrykh
L Chen
L Tian
L Tran
M Ronen
MA Zapala
MM Vleugel
N Bing
P Shannon
R Li
RB Brem
S Lee
Simon J. Galbraith
SJ Galbraith
TI Lee
VG Cheung
VK Mootha
W Sun
Publication venue: Public Library of Science
Publication date: 01/03/2009

Understanding the relationship between genetic variation and gene expression is a central question in genetics. With the availability of data from high-throughput technologies such as ChIP-Chip, expression, and genotyping arrays, we can begin to not only identify associations but to understand how genetic variations perturb the underlying transcription regulatory networks to induce differential gene expression. In this study, we describe a simple model of transcription regulation where the expression of a gene is completely characterized by two properties: the concentrations and promoter affinities of active transcription factors. We devise a method that extends Network Component Analysis (NCA) to determine how genetic variations in the form of single nucleotide polymorphisms (SNPs) perturb these two properties. Applying our method to a segregating population of Saccharomyces cerevisiae, we found statistically significant examples of trans-acting SNPs located in regulatory hotspots that perturb transcription factor concentrations and affinities for target promoters to cause global differential expression and cis-acting genetic variations that perturb the promoter affinities of transcription factors on a single gene to cause local differential expression. Although many genetic variations linked to gene expressions have been identified, it is not clear how they perturb the underlying regulatory networks that govern gene expression. Our work begins to fill this void by showing that many genetic variations affect the concentrations of active transcription factors in a cell and their affinities for target promoters. Understanding the effects of these perturbations can help us to paint a more complete picture of the complex landscape of transcription regulation. The software package implementing the algorithms discussed in this work is available as a MATLAB package upon request

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

GenClust: A genetic algorithm for clustering gene expression data

Author: Di Gesú Vito
Giancarlo Raffaele
Lo Bosco Giosué
Raimondi Alessandra
Scaturro Davide
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. RESULTS: GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, compact and easy to update; (b) it can be used naturally in conjunction with data driven internal validation methods. We have experimented with the FOM methodology, specifically conceived for validating clusters of gene expression data. The validity of GenClust has been assessed experimentally on real data sets, both with the use of validation measures and in comparison with other algorithms, i.e., Average Link, Cast, Click and K-means. CONCLUSION: Experiments show that none of the algorithms we have used is markedly superior to the others across data sets and validation measures; i.e., in many cases the observed differences between the worst and best performing algorithm may be statistically insignificant and they could be considered equivalent. However, there are cases in which an algorithm may be better than others and therefore worthwhile. In particular, experiments for GenClust show that, although simple in its data representation, it converges very rapidly to a local optimum and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms. In addition, it is well suited for use in conjunction with data driven internal validation measures and, in particular, the FOM methodology

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Palermo

Computational, Integrative, and Comparative Methods for the Elucidation of Genetic Coexpression Networks

Author: Baldwin Nicole E.
Chesler Elissa J.
Kirov Stefan
Langston Michael A.
Snoddy Jay R.
Williams Robert W.
Zhang Bing
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2004

Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively coregulated genes and their annotation using gene ontology analysis and cis-regulatory element discovery. The causal basis for coregulation is detected through the use of quantitative trait locus mapping

CiteSeerX

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

PubMed Central

Multi-membership gene regulation in pathway based microarray analysis

Author: A Goesmann
AB Khodursky
Annette M Payne
AP Gasch
D Cavalieri
D Greenbaum
E Panteris
FA Kolpakov
FR Blattner
G Russo
I Rojas
JH Holland
JL DeRisi
KD Dahlquist
L Stryer
M Kanehisa
M Quadroni
M Schena
P Grosu
P Shannon
PC Champe
PD Karp
R Hamming
RK Brouwer
S Kirkpatrick
S Pavlidis
S Swift
SJ Russell
Stelios P Pavlidis
Stephen M Swift
T Toyoda
Z Michalewicz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

This article is available through the Brunel Open Access Publishing Fund. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be facilitated for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results: We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted Rand indexes and Hamming distance. All algorithms produce highly consistent genes to pathways allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions: We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes. The work was sponsored by the studentship scheme of the School of Information Systems, Computing and Mathematics, Brunel Universit
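The Hamming-distance comparison mentioned in this abstract can be illustrated with a short sketch. The encoding is an assumption made here, not taken from the article: each search run is represented as a flat list with one entry per (gene, pathway) slot, and the helper names `hamming_distance` and `agreement` are hypothetical.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length allocations differ."""
    if len(a) != len(b):
        raise ValueError("allocations must have the same length")
    return sum(x != y for x, y in zip(a, b))


def agreement(a, b):
    """Fraction of positions on which two allocations agree (1.0 = identical)."""
    return 1 - hamming_distance(a, b) / len(a)


# Hypothetical allocations from two search runs (1 = gene assigned to pathway).
run_hill_climbing = [1, 0, 1, 1]
run_annealing = [1, 1, 1, 0]
distance = hamming_distance(run_hill_climbing, run_annealing)
```

A small distance between runs of different search algorithms suggests that they converge on similar gene-to-pathway allocations, which is the kind of consistency the abstract reports.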

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Brunel University Research Archive