48,508 research outputs found

    An Analysis System for Integrating High-Throughput Transcript Abundance Data with Metabolic Pathways in Green Algae

    As the most important non-vascular plants, algae have many research applications, including high species diversity, biofuel sources, adsorption of heavy metals and, following processing, health supplements. With the increasing availability of next-generation sequencing (NGS) data for algae genomes and transcriptomes, an integrated resource for retrieving gene expression data and metabolic pathways is essential for functional analysis and systems biology in algae. However, gene expression profiles and biological pathways are displayed separately in current resources, making it impossible to search current databases directly to identify the cellular response mechanisms. Therefore, this work develops a novel AlgaePath database to retrieve gene expression profiles efficiently under various conditions in numerous metabolic pathways. AlgaePath, a web-based database, integrates gene information, biological pathways, and NGS datasets in Chlamydomonas reinhardtii and Neodesmus sp. UTEX 2219-4. Users can identify gene expression profiles and pathway information by using five query pages (i.e. Gene Search, Pathway Search, Differentially Expressed Genes (DEGs) Search, Gene Group Analysis, and Co-Expression Analysis). The gene expression data of 45 and 4 samples can be obtained directly on pathway maps in C. reinhardtii and Neodesmus sp. UTEX 2219-4, respectively. Genes that are differentially expressed between two conditions can be identified in Folds Search. Furthermore, the Gene Group Analysis of AlgaePath includes pathway enrichment analysis, and can easily compare the gene expression profiles of functionally related genes in a map. Finally, Co-Expression Analysis provides co-expressed transcripts of a target gene. The analysis results provide a valuable reference for designing further experiments and elucidating critical mechanisms from high-throughput data. More than an effective interface to clarify the transcript response mechanisms in different metabolic pathways under various conditions, AlgaePath is also a data mining system to identify critical mechanisms based on high-throughput sequencing.
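The abstract above describes finding genes that are differentially expressed between two conditions by fold change. A minimal sketch of that idea follows. It is not AlgaePath's actual implementation: the function name, the pseudocount, and the log2 threshold are all illustrative assumptions.

```python
from math import log2


def differentially_expressed(cond_a, cond_b, min_abs_log2_fc=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes passing a fold-change cutoff.

    cond_a and cond_b map gene IDs to expression values (e.g. FPKM).
    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    hits = {}
    # Only genes measured in both conditions can be compared.
    for gene in cond_a.keys() & cond_b.keys():
        lfc = log2((cond_b[gene] + pseudocount) / (cond_a[gene] + pseudocount))
        if abs(lfc) >= min_abs_log2_fc:
            hits[gene] = lfc
    return hits


# Hypothetical expression values for three made-up gene IDs.
control = {"g1": 1.0, "g2": 10.0, "g3": 7.0}
treated = {"g1": 7.0, "g2": 10.0, "g3": 1.0}
degs = differentially_expressed(control, treated)
```

Here `g1` is up-regulated and `g3` is down-regulated fourfold, while `g2` is unchanged and is filtered out. A real analysis would normally add a statistical test rather than rely on fold change alone.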

    GEOGLE: context mining tool for the correlation between gene expression and the phenotypic distinction

    Background: In the post-genomic era, the development of high-throughput gene expression detection technology provides huge amounts of experimental data, which challenges the traditional pipelines for data processing and analysis in scientific research. Results: In our work, we integrated gene expression information from Gene Expression Omnibus (GEO), biomedical ontology from Medical Subject Headings (MeSH) and signaling pathway knowledge from sigPathway entries to develop a context mining tool for gene expression analysis, GEOGLE. GEOGLE offers a rapid and convenient way of searching for relevant experimental datasets, pathways and biological terms according to multiple types of queries, including biomedical vocabularies, GDS IDs, gene IDs, pathway names and signature lists. Moreover, GEOGLE summarizes the signature genes from a subset of GDSes and estimates the correlation between gene expression and the phenotypic distinction with an integrated p value. Conclusion: This approach of searching expression data globally may expand the traditional way of collecting heterogeneous gene expression experiment data. GEOGLE is a novel tool that gives researchers a quantitative way to understand the correlation between gene expression and phenotypic distinction through meta-analysis of gene expression datasets from different experiments, as well as the biological meaning behind it. The web site and user guide of GEOGLE are available at: http://omics.biosino.org:14000/kweb/workflow.jsp?id=00020

    Gene expression meta-analysis supports existence of molecular apocrine breast cancer with a role for androgen receptor and implies interactions with ErbB family

    Background: Pathway discovery from gene expression data can provide important insight into the relationship between signaling networks and cancer biology. Oncogenic signaling pathways are commonly inferred by comparison with signatures derived from cell lines. We use the Molecular Apocrine subtype of breast cancer to demonstrate our ability to infer pathways directly from patients' gene expression data with pattern analysis algorithms. Methods: We combine data from two studies that propose the existence of the Molecular Apocrine phenotype. We use quantile normalization and XPN to minimize institutional bias in the data. We use hierarchical clustering, principal components analysis, and comparison of gene signatures derived from Significance Analysis of Microarrays to establish the existence of the Molecular Apocrine subtype and the equivalence of its molecular phenotype across both institutions. Statistical significance was computed using the Fasano & Franceschini test for separation of principal components and the hypergeometric probability formula for significance of overlap in gene signatures. We perform pathway analysis using LeFEminer and Backward Chaining Rule Induction to identify a signaling network that differentiates the subset. We identify a larger cohort of samples in the public domain, and use Gene Shaving and Robust Bayesian Network Analysis to detect pathways that interact with the defining signal. Results: We demonstrate that the two separately introduced ER- breast cancer subsets represent the same tumor type, called Molecular Apocrine breast cancer. LeFEminer and Backward Chaining Rule Induction support a role for AR signaling as a pathway that differentiates this subset from others. Gene Shaving and Robust Bayesian Network Analysis detect interactions between the AR pathway, EGFR trafficking signals, and ErbB2. Conclusion: We propose criteria for meta-analysis that are able to demonstrate statistical significance in establishing molecular equivalence of subsets across institutions. Data mining strategies used here provide an alternative method to comparison with cell lines for discovering seminal pathways and interactions between signaling networks. Analysis of Molecular Apocrine breast cancer implies that therapies targeting AR might be hampered if interactions with ErbB family members are not addressed.
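The abstract above mentions using the hypergeometric probability formula to assess the significance of overlap between gene signatures. A minimal sketch of such a calculation follows. It is the standard upper-tail hypergeometric test, not necessarily the authors' exact procedure, and the function and parameter names are illustrative assumptions.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """Probability of at least `overlap` shared genes arising by chance.

    universe: number of genes measured in both studies
    size_a, size_b: sizes of the two gene signatures
    overlap: number of genes the two signatures share
    """
    max_shared = min(size_a, size_b)
    total = comb(universe, size_b)  # all ways to draw signature B
    # Sum the hypergeometric probability mass from `overlap` up to the maximum.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_shared + 1)
    )
    return tail / total


# Hypothetical numbers: 10 genes measured, two 5-gene signatures, all 5 shared.
p = overlap_p_value(universe=10, size_a=5, size_b=5, overlap=5)
```

A small p-value indicates that the two signatures share more genes than random draws from the same gene universe would be expected to.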

    MAPT and PAICE: Tools for time series and single time point transcriptomics visualization and knowledge discovery

    With the advent of next-generation sequencing, -omics fields such as transcriptomics have seen data throughput increase by orders of magnitude. For analyzing and visually representing these huge datasets, an intuitive and computationally tractable approach is to map quantified transcript expression onto biochemical pathways while employing data mining and visualization principles to accelerate knowledge discovery. We present two cross-platform tools, MAPT (Mapping and Analysis of Pathways through Time) and PAICE (Pathway Analysis and Integrated Coloring of Experiments), which form an easy-to-use analysis suite to facilitate time series and single time point transcriptomics analysis. In unison, MAPT and PAICE serve as a visual workbench for transcriptomics knowledge discovery, data mining and functional annotation. PAICE and MAPT are two distinct yet inextricably linked tools. The former is specifically designed to map EC accessions onto KEGG pathways while handling multiple gene copies, detection-call analysis, and un/annotated EC accessions lacking quantifiable expression. The latter integrates PAICE datasets to drive visualization, annotation, and data mining.

    A transcriptomic and molecular approach uncovering ASCL2 as a novel tumourigenic gene in breast cancer

    Breast cancer is highly heterogeneous and is considered a collection of molecularly distinct tumour subtypes. Substantial efforts have been made to explore the gene expression profiles underlying the subtypes, and to elucidate possible markers associated with clinical outcomes. However, research in this area has been met with significant challenges and, despite ongoing advancements in diagnostics and targeted therapeutics, incidence and mortality continue to rise. Thus, there is a need for greater molecular characterisation of breast tumours, to further understand the mechanistic roles of genes within their respective signalling pathways. With the advent of high-throughput technologies in transcriptomics, as well as the use of open databases and bioinformatics analysis tools, it is now possible to examine thousands of genes in parallel, generating an unprecedented amount of information. This provides a means for researchers to identify novel genes and targets from large volumes of gene expression data. However, the task of extracting clinically relevant results is a prominent challenge. Therefore, the aim of this study was to use a streamlined in silico pipeline, integrated with in vitro methods, to identify and functionally investigate a novel genetic marker demonstrating a key role in breast carcinogenesis. Gene expression profiles from breast cancer cell lines were obtained from public databases (Array Express and Gene Expression Omnibus). Data was filtered and subjected to an extreme variation analysis to generate a list of differentially expressed genes. Subsequently, multiple pathway analysis tools were used to identify a novel candidate gene for further investigation. Achaete-scute complex homolog 2 (ASCL2) is a transcription factor and Wnt-target gene, recognised as a regulator of stem cell identity and embryogenesis. Gene expression was validated in vitro by Reverse Transcription Quantitative Polymerase Chain Reaction (RT-qPCR), and to assess the tumourigenic potential of ASCL2, siRNA knockdown was performed; assays were employed to measure proliferation, wound-healing and apoptosis. Data mining of patient tumours obtained from the METABRIC study was also undertaken to ascertain the potential of ASCL2 as a prognostic indicator. This work utilised a systematic pipeline used by the wider scientific community for the identification of candidate genes from transcriptomic data. Differential expression of ASCL2 was observed across multiple breast cancer cell lines, with the largest expression seen in MCF7 cells. Although evidence did not support the usage of ASCL2 as a prognostic indicator in patient tumours, data integrated from multiple lines of investigation suggested that this gene may influence the migratory capacity of breast tumour cells, whilst exercising its tumourigenic function via the Wnt signalling pathway in breast cancer. Thus, this potential novel role of ASCL2 in breast tumourigenesis highlights a prominent area for further exploration.

    From data towards knowledge: Revealing the architecture of signaling systems by unifying knowledge mining and data mining of systematic perturbation data

    Genetic and pharmacological perturbation experiments, such as deleting a gene and monitoring gene expression responses, are powerful tools for studying cellular signal transduction pathways. However, it remains a challenge to automatically derive knowledge of a cellular signaling system at a conceptual level from systematic perturbation-response data. In this study, we explored a framework that unifies knowledge mining and data mining approaches towards this goal. The framework consists of the following automated processes: 1) applying an ontology-driven knowledge mining approach to identify functional modules among the genes responding to a perturbation in order to reveal potential signals affected by the perturbation; 2) applying a graph-based data mining approach to search for perturbations that affect a common signal with respect to a functional module; and 3) revealing the architecture of a signaling system by organizing signaling units into a hierarchy based on their relationships. Applying this framework to a compendium of yeast perturbation-response data, we have successfully recovered many well-known signal transduction pathways; in addition, our analysis has led to many hypotheses regarding the yeast signal transduction system; finally, our analysis automatically organized perturbed genes as a graph reflecting the architecture of the yeast signaling system. Importantly, this framework transformed molecular findings from a gene level to a conceptual level, which can readily be translated into computable knowledge in the form of rules regarding the yeast signaling system, such as "if genes involved in MAPK signaling are perturbed, genes involved in pheromone responses will be differentially expressed".