21 research outputs found
Genevestigator V3: A Reference Expression Database for the Meta-Analysis of Transcriptomes
The Web-based software tool Genevestigator provides powerful tools for biologists to explore gene
expression across a wide variety of biological contexts. Its first releases, however, were limited by the scaling
ability of the system architecture, multiorganism data storage and analysis capability, and availability of
computationally intensive analysis methods. Genevestigator V3 is a novel meta-analysis system resulting
from new algorithmic and software development using a client/server architecture, large-scale manual
curation and quality control of microarray data for several organisms, and curation of pathway data for mouse
and Arabidopsis. In addition to improved querying features, Genevestigator V3 provides new tools to analyze
the expression of genes in many different contexts, to identify biomarker genes, to cluster genes into
expression modules, and to model expression responses in the context of metabolic and regulatory networks.
Being a reference expression database with user-friendly tools, Genevestigator V3 facilitates discovery
research and hypothesis validation.
A Multilevel Gamma-Clustering Layout Algorithm for Visualization of Biological Networks
Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three successive steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) the resulting tree, which shows the network structure, is drawn using a variation of a force-directed algorithm. The algorithm has the potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation, which makes it possible to keep an overview of complex graphs with dense subgraphs.
Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana
We present a novel graphical Gaussian modeling approach for reverse engineering of genetic regulatory networks with many genes and few observations. When applying our approach to infer a gene network for isoprenoid biosynthesis in Arabidopsis thaliana, we detect modules of closely connected genes and candidate genes for possible cross-talk between the isoprenoid pathways. Genes of downstream pathways also fit well into the network. We evaluate our approach in a simulation study and on the yeast galactose network.
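The idea behind graphical Gaussian models can be illustrated with a first-order partial correlation: two genes that are correlated only because a third gene drives both lose their association once that third gene is conditioned on. The sketch below is a minimal illustration in plain Python, not the procedure used in the paper (the abstract does not specify it); the gene names and expression values are hypothetical toy data constructed so that the effect is exact.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical toy data: gene_z drives both gene_x and gene_y,
# each with its own independent deviation from gene_z.
gene_z = [1, 2, 3, 4, 5, 6]
gene_x = [2, 1, 3, 4, 4, 7]
gene_y = [2, 3, 1, 2, 6, 7]

marginal = pearson(gene_x, gene_y)                  # clearly positive
conditional = partial_corr(gene_x, gene_y, gene_z)  # essentially zero
print(f"marginal r = {marginal:.3f}, partial r given gene_z = {conditional:.3f}")
```

In a network built from such conditional associations, no edge would be drawn between gene_x and gene_y, whereas a plain correlation network would connect them.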
RefGenes: identification of reliable and condition specific reference genes for RT-qPCR data normalization
Background
RT-qPCR is a sensitive and increasingly used method for gene expression quantification. To normalize RT-qPCR measurements between samples, most laboratories use endogenous reference genes as internal controls. There is increasing evidence, however, that the expression of commonly used reference genes can vary significantly in certain contexts.
Results
Using the Genevestigator database of normalized and well-annotated microarray experiments, we describe the expression stability characteristics of the transcriptomes of several organisms. The results show that a) no genes are universally stable, b) most commonly used reference genes yield very high transcript abundances as compared to the entire transcriptome, and c) for each biological context a subset of stable genes exists that has smaller variance than commonly used reference genes or genes that were selected for their stability across all conditions.
Conclusion
We therefore propose the normalization of RT-qPCR data using reference genes that are specifically chosen for the conditions under study. RefGenes is a community tool developed for that purpose. Validation RT-qPCR experiments across several organisms showed that the candidates proposed by RefGenes generally outperformed commonly used reference genes. RefGenes is available within Genevestigator at http://www.genevestigator.com
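The principle of condition-specific reference gene selection can be sketched as follows: restrict the expression matrix to the samples of the condition under study and rank genes by their variability within that subset. This is a minimal illustration under stated assumptions, not the RefGenes implementation; the function name, the use of the sample standard deviation as the stability score, and the toy data are all hypothetical.

```python
from statistics import stdev


def rank_stable_genes(expression, sample_conditions, condition, top_n=3):
    """Return the top_n genes with the lowest standard deviation
    across the samples annotated with the given condition."""
    cols = [i for i, c in enumerate(sample_conditions) if c == condition]
    if len(cols) < 2:
        raise ValueError("need at least two samples for the chosen condition")
    scores = []
    for gene, values in expression.items():
        subset = [values[i] for i in cols]
        scores.append((stdev(subset), gene))
    scores.sort()  # lowest variability first
    return [gene for _, gene in scores[:top_n]]


# Hypothetical log2 expression values; one column per sample.
expression = {
    "geneA": [10.0, 10.1, 9.9, 14.0, 6.0],
    "geneB": [8.0, 9.5, 6.5, 8.0, 8.1],
    "geneC": [12.0, 12.0, 12.1, 3.0, 20.0],
}
sample_conditions = ["root", "root", "root", "leaf", "leaf"]

print(rank_stable_genes(expression, sample_conditions, "root", top_n=2))
print(rank_stable_genes(expression, sample_conditions, "leaf", top_n=2))
```

In this toy data the most stable gene differs between the two conditions, which mirrors the observation that no gene is universally stable.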
Integrative genome-wide expression profiling identifies three distinct molecular subgroups of renal cell carcinoma with different patient outcome
Background
Renal cell carcinoma (RCC) is characterized by a number of diverse molecular aberrations that differ among individuals. Recent approaches to molecularly classify RCC were based on clinical, pathological as well as on single molecular parameters. As a consequence, gene expression patterns reflecting the sum of genetic aberrations in individual tumors may not have been recognized. In an attempt to uncover such molecular features in RCC, we used a novel, unbiased and integrative approach.
Methods
We integrated gene expression data from 97 primary RCCs of different pathologic parameters, 15 RCC metastases as well as 34 cancer cell lines for two-way nonsupervised hierarchical clustering using gene groups suggested by the PANTHER Classification System. We depicted the genomic landscape of the resulting tumor groups by means of single nucleotide polymorphism (SNP) technology. Finally, the results were analyzed immunohistochemically using a tissue microarray (TMA) composed of 254 RCCs.
Results
We found robust, genome-wide expression signatures, which split RCC into three distinct molecular subgroups. These groups remained stable even if randomly selected gene sets were clustered. Notably, the pattern obtained from RCC cell lines was clearly distinguishable from that of primary tumors. SNP array analysis demonstrated differing frequencies of chromosomal copy number alterations among RCC subgroups. TMA analysis with group-specific markers showed a prognostic significance of the different groups.
Conclusion
We propose the existence of characteristic and histologically independent genome-wide expression outputs in RCC with potential biological and clinical relevance.
Web-based analysis of the mouse transcriptome using Genevestigator
Background
Gene function analysis often requires a complex and laborious sequence of laboratory and computer-based experiments. Choosing an effective experimental design generally results from hypotheses derived from prior knowledge or experimentation. Knowledge obtained from meta-analyzing compendia of expression data with annotation libraries can provide significant clues in understanding gene and network function, resulting in better hypotheses that can be tested in the laboratory.
Description
Genevestigator is a microarray database and analysis system allowing context-driven queries. Simple but powerful tools allow biologists with little computational background to retrieve information about when, where and how genes are expressed. We manually curated and quality-controlled 3110 mouse Affymetrix arrays from public repositories. Data queries can be run against an annotation library comprising 160 anatomy categories, 12 developmental stage groups, 80 stimuli, and 182 genetic backgrounds or modifications. The quality of results obtained through Genevestigator is illustrated by a number of biological scenarios that are substantiated by other types of experimentation in the literature.
Conclusion
The Genevestigator-Mouse database effectively provides biologically meaningful results and can be accessed at https://www.genevestigator.ethz.ch.
Controlled vocabularies for plant anatomical parts optimized for use in data analysis tools and for cross-species studies
Background
It is generally accepted that controlled vocabularies are necessary to systematically integrate data from various sources. During the last decade, several plant ontologies have been developed, some of which are community specific or were developed for a particular purpose. In most cases, the practical application of these ontologies has been limited to systematically storing experimental data. Due to technical constraints, complex data structures and term redundancies, it has been difficult to apply them directly into analysis tools.
Results
Here, we describe a simplified and cross-species compatible set of controlled vocabularies for plant anatomy, focussing mainly on monocotyledonous and dicotyledonous crop and model plants. Their content was designed primarily for direct use in graphical visualization tools. Specifically, we created annotation vocabularies that can be understood by non-specialists, are minimally redundant and simply structured, and have low tree depth, and we tested them in practice within Genevestigator.
Conclusions
The application of the proposed ontologies enabled the aggregation of data from hundreds of experiments to visualize gene expression across tissue types. It also facilitated the comparison of expression across species. The described controlled vocabularies are maintained by a dedicated curation team and are available upon request.
ISSN:1746-481