13,486 research outputs found
Maximization of negative correlations in time-course gene expression data for enhancing understanding of molecular pathways
Positive correlation can be instantiated in diverse ways, as a shifting, scaling or geometric pattern, and it has been extensively explored for time-course gene expression data and pathway analysis. Recently, biological studies have shown a growing interest in negative correlations, such as opposite expression patterns, complementary patterns and self-negative regulation of transcription factors (TFs). These biological ideas and preliminary observations motivate us to formulate and investigate the problem of maximizing negative correlations. The objective is to discover all maximal negative correlations of statistical and biological significance from time-course gene expression data for enhancing our understanding of molecular pathways. Given a gene expression matrix, a maximal negative correlation is defined as an activation–inhibition two-way expression pattern (AIE pattern). We propose a parameter-free algorithm to enumerate the complete set of AIE patterns from a data set. This algorithm can identify significant negative correlations that cannot be identified by traditional clustering/biclustering methods. To demonstrate the biological usefulness of AIE patterns in the analysis of molecular pathways, we conducted in-depth case studies of AIE patterns identified from yeast cell cycle data sets. In particular, in the analysis of the lysine biosynthesis pathway, new regulation modules and pathway components were inferred from a significant negative correlation which is likely caused by co-regulation of the TFs at a higher layer of the biological network. We conjecture that maximal negative correlations between genes are a common characteristic of molecular pathways, which can provide insights into cell stress response studies, drug response evaluation, etc.
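As a rough illustration of what a negative correlation between two time-course profiles looks like, the sketch below ranks gene pairs by Pearson correlation. It is not the AIE enumeration algorithm described in the abstract, which is not specified here; the function name, the threshold of -0.8 and the toy expression values are assumptions made purely for illustration.

```python
import numpy as np

def most_negative_pairs(expr, gene_names, threshold=-0.8):
    """Return gene pairs whose Pearson correlation across time points
    is at or below `threshold`, most negative first.

    `expr` has one row per gene and one column per time point.
    The threshold is an arbitrary illustrative choice.
    """
    corr = np.corrcoef(expr)  # pairwise Pearson correlation between rows
    pairs = []
    n = len(gene_names)
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] <= threshold:
                pairs.append((gene_names[i], gene_names[j], float(corr[i, j])))
    return sorted(pairs, key=lambda p: p[2])

# Toy data, invented for illustration only.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],  # g1 rises over time
    [4.0, 3.0, 2.0, 1.0],  # g2 falls: mirror image of g1
    [1.0, 2.0, 3.0, 4.5],  # g3 rises, similar to g1
])
names = ["g1", "g2", "g3"]
pairs = most_negative_pairs(expr, names)
for a, b, r in pairs:
    print(f"{a} vs {b}: r = {r:.3f}")
```

Here g1 and g2 form the strongest negatively correlated pair, while g1 and g3 are positively correlated and are not reported.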
Mining co-regulated gene profiles for the detection of functional associations in gene expression data
Motivation: Association pattern discovery (APD) methods have been successfully applied to gene expression data. They find groups of co-regulated genes in which the genes are either up- or down-regulated throughout the identified conditions. These methods, however, fail to identify similarly expressed genes whose expressions change between up- and down-regulation from one condition to another. In order to discover these hidden patterns, we propose the concept of mining co-regulated gene profiles. Co-regulated gene profiles contain two gene sets such that genes within the same set behave identically (up or down) while genes from different sets display contrary behavior. To reduce and group the large number of similar resulting patterns, we propose a new similarity measure that can be applied together with hierarchical clustering methods. Results: We tested our proposed method on two well-known yeast microarray data sets. Our implementation mined the data effectively and discovered patterns of co-regulated genes that are hidden to traditional APD methods. The high content of biologically relevant information in these patterns is demonstrated by the significant enrichment of co-regulated genes with similar functions. Our experimental results show that the Mining Attribute Profile (MAP) method is an efficient tool for the analysis of gene expression data and competitive with bi-clustering techniques. Supplementary information: Supplementary data and an executable demo program of the MAP implementation are freely available at http://www.fgcz.ch/publications/ma
Techniques for clustering gene expression data
Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications which recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.
Biclustering on expression data: A review
Biclustering has become a popular technique for the study of gene expression data, especially for discovering functionally related gene sets under different subsets of experimental conditions. Most biclustering approaches use a measure or cost function that determines the quality of biclusters. In such cases, the development of both a suitable heuristic and a good measure for guiding the search is essential for discovering interesting biclusters in an expression matrix. Nevertheless, not all existing biclustering approaches base their search on evaluation measures for biclusters. There exists a diverse set of biclustering tools that follow different strategies and algorithmic concepts which guide the search towards meaningful results. In this paper we present an extensive survey of biclustering approaches, classifying them into two categories according to whether or not they use evaluation metrics within the search method: biclustering algorithms based on evaluation measures and non-metric-based biclustering algorithms. In both cases, they have been classified according to the type of meta-heuristics on which they are based. Ministerio de Economía y Competitividad TIN2011-2895
Measuring the Quality of Shifting and Scaling Patterns in Biclusters
The most widespread biclustering algorithms use the Mean Squared Residue (MSR) as a measure for assessing the quality of biclusters. MSR correctly identifies shifting patterns, but fails to discover biclusters presenting scaling patterns. Virtual Error (VE) is a measure which improves on MSR in this respect, since it recognizes biclusters containing either shifting patterns or scaling patterns as quality biclusters. However, VE presents some drawbacks when the biclusters present both kinds of patterns simultaneously. In this paper, we propose an improvement of VE that can be integrated in any heuristic to discover biclusters with shifting and scaling patterns simultaneously. Ministerio de Ciencia y Tecnología TIN2007-68084-C02-0
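The contrast between shifting and scaling patterns can be checked numerically. The sketch below computes the Mean Squared Residue in its usual Cheng and Church form (the mean of the squared residues a_ij - rowmean_i - colmean_j + overallmean) on two toy biclusters. The function name and data are invented for illustration, and Virtual Error is deliberately not implemented because its definition is not given in the abstract.

```python
import numpy as np

def mean_squared_residue(bicluster):
    """Mean Squared Residue of a bicluster (rows = genes, columns = conditions)."""
    b = np.asarray(bicluster, dtype=float)
    row_means = b.mean(axis=1, keepdims=True)
    col_means = b.mean(axis=0, keepdims=True)
    residue = b - row_means - col_means + b.mean()
    return float((residue ** 2).mean())

# Toy profile, invented for illustration only.
base = np.array([1.0, 2.0, 4.0, 3.0])

# Shifting pattern: every gene is the base profile plus a constant.
shifting = np.array([base + s for s in (0.0, 1.0, 5.0)])

# Scaling pattern: every gene is the base profile times a constant.
scaling = np.array([base * s for s in (1.0, 2.0, 3.0)])

msr_shift = mean_squared_residue(shifting)
msr_scale = mean_squared_residue(scaling)
print("MSR, shifting pattern:", msr_shift)
print("MSR, scaling pattern: ", msr_scale)
```

The shifting bicluster scores zero (up to floating-point error), whereas the scaling bicluster scores well above zero even though its rows are perfectly proportional, which is the limitation of MSR that the abstract refers to.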
TriGen: A genetic algorithm to mine triclusters in temporal gene expression data
Analyzing microarray data represents a computational challenge due to the characteristics of these data. Clustering
techniques are widely applied to create groups of genes that exhibit a similar behavior under the conditions tested.
Biclustering emerges as an improvement of classical clustering since it relaxes the constraints for grouping genes to
be evaluated only under a subset of the conditions and not under all of them. However, this technique is not
appropriate for the analysis of longitudinal experiments in which the genes are evaluated under certain conditions at
several time points. We present the TriGen algorithm, a genetic algorithm that finds triclusters of gene expression that
take into account the experimental conditions and the time points simultaneously. We have used TriGen to mine
datasets related to synthetic data, yeast (Saccharomyces cerevisiae) cell cycle and human inflammation and host
response to injury experiments. TriGen has proved to be capable of extracting groups of genes with similar patterns in
subsets of conditions and times, and these groups have been shown to be related in terms of their functional annotations
extracted from the Gene Ontology. Ministerio de Ciencia y Tecnología TIN2011-28956-C00; Ministerio de Ciencia y Tecnología TIN2009-13950; Junta de Andalucía TIC-752
Analysis of the transcriptional program governing meiosis and gametogenesis in yeast and mammals
During meiosis a competent diploid cell replicates its DNA once and then undergoes two consecutive divisions followed by haploid gamete differentiation. Important aspects of meiotic development that distinguish it from mitotic growth include a highly increased rate of recombination, formation of the synaptonemal complex that aligns the homologous chromosomes, as well as separation of the homologues and sister chromatids during meiosis I and II without an intervening S-phase. Budding yeast is an excellent model organism to study meiosis and gametogenesis and accordingly, to date it belongs to the best studied eukaryotic systems in this context. Knowledge coming from these studies has provided important insights into meiotic development in higher eukaryotes. This was possible because sporulation in yeast and spermatogenesis in higher eukaryotes are analogous developmental pathways that involve conserved genes. For budding yeast a huge amount of data from numerous genome-scale studies on gene expression and deletion phenotypes of meiotic development and sporulation are available. In contrast, mammalian gametogenesis has not been studied on a large-scale until recently. It was unclear if an expression profiling study using germ cells and testicular somatic control cells that underwent lengthy purification procedures would yield interpretable results. We have therefore carried out a pioneering expression profiling study of male germ cells from Rattus norvegicus using Affymetrix U34A and B GeneChips. This work resulted in the first comprehensive large-scale expression profiling analysis of mammalian male germ cells undergoing mitotic growth, meiosis and gametogenesis. We have identified 1268 differentially expressed genes in germ cells at different developmental stages, which were organized into four distinct expression clusters that reflect somatic, mitotic, meiotic and post-meiotic cell types.
This included 293 yet uncharacterized transcripts whose expression pattern suggests that they are involved in spermatogenesis and fertility. A group of 121 transcripts were only expressed in meiotic (spermatocytes) and postmeiotic germ cells (round spermatids) but not in dividing germ cells (spermatogonia),
Sertoli
cells or two somatic control tissues (brain
and skeletal muscle). Functional analysis reveals
that most of the known genes in this
group fulfill essential functions during meiosis,
spermiogenesis (the process of sperm maturation)
and fertility. Therefore it is highly possible
that some of the ~30 uncharacterized transcripts
in this group also contribute to these
processes. A web-accessible database (called
reXbase, which was later on integrated into
GermOnline) has been developed for our expression
profiling study of mammalian male
meiosis, which summarizes annotation information
and shows a graphical display of expression
profiles of every gene covered in our
study.
In the budding yeast Saccharomyces cerevisiae
entry into meiosis and subsequent progression
through sporulation and gametogenesis
are driven by a highly regulated transcriptional
program activated by signal pathways
responding to nutritional and cell-type cues.
Abf1p, which is a general transcription factor,
has previously been demonstrated to participate
in the induction of numerous mitotic as
well as early and middle meiotic genes. In
the current study we have addressed the question
how Abf1p transcriptionally coordinates
mitotic growth and meiotic development on a
genome-wide level. Because ABF1 is an essential
gene we used the temperature-sensitive
allele abf1-1. A phenotypical analysis of mutant
cells revealed that ABF1 plays an important
role in cell separation during mitosis,
meiotic development, and spore formation. In
order to identify genes whose expression depends
on Abf1p in growing and sporulating
cells we have performed expression profiling
experiments using Affymetrix S98 GeneChips
comparing wild-type and abf1-1 mutant cells
at both permissive and restrictive temperature.
We have identified 504 genes whose normal expression
depends on functional ABF1. By combining
the expression profiling data with data
from genome-wide DNA binding assays (ChIP-chip)
and in silico predictions of potential
Abf1p-binding sites in the yeast genome, we
were able to define direct target genes: genes
whose expression decreases in the absence
of functional ABF1 and whose promoters are
bound by Abf1p and/or contain a predicted
binding site.
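The criterion for a direct target stated above combines three lines of evidence and can be written as a set expression. This is only a sketch of that logic: the function name and the gene identifiers are hypothetical placeholders, not data from the study.

```python
def direct_targets(decreased, bound, predicted_site):
    """Genes whose expression decreases without functional ABF1 AND whose
    promoters are bound by Abf1p and/or contain a predicted binding site."""
    return decreased & (bound | predicted_site)

# Hypothetical placeholder identifiers, for illustration only.
decreased = {"geneA", "geneB", "geneC"}   # expression drops in the mutant
bound = {"geneA", "geneD"}                # promoter bound in binding assays
predicted_site = {"geneB"}                # promoter has a predicted site

targets = direct_targets(decreased, bound, predicted_site)
print(sorted(targets))
```

In this toy input geneC is excluded because it has no binding evidence, and geneD is excluded because its expression does not decrease.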
Among 352 such bona fide direct target genes
we found many involved in ribosome biogenesis,
translation, vegetative growth and meiotic
development, which could therefore account for
the observed growth and sporulation defects of
abf1-1 mutant cells. Furthermore, the fact that
two members of the septin family (CDC3 and
CDC10) were found to be direct target genes
suggests a novel role for Abf1p in cytokinesis.
This was further substantiated by the observation
that chitin localization and septin ring
formation are perturbed in abf1-1 mutant cells
…