
    Application of regulatory sequence analysis and metabolic network analysis to the interpretation of gene expression data

    We present two complementary approaches for the interpretation of clusters of co-regulated genes, such as those obtained from DNA chips and related methods. Starting from a cluster of genes with similar expression profiles, two basic questions can be asked: 1. Which mechanism is responsible for the coordinated transcriptional response of the genes? This question is approached by extracting motifs that are shared between the upstream sequences of these genes. The extracted motifs are putative cis-acting regulatory elements. 2. What is the physiological significance, for the cell, of expressing these genes together? One way to answer this question is to search for potential metabolic pathways that could be catalyzed by the products of the genes. This can be done by selecting the genes from the cluster that code for enzymes and trying to assemble the catalyzed reactions into metabolic pathways. We present tools to answer these two questions, and we illustrate their use with selected examples in the yeast Saccharomyces cerevisiae. The tools are available on the web (http://ucmb.ulb.ac.be/bioinformatics/rsa-tools/; http://www.ebi.ac.uk/research/pfbp/; http://www.soi.city.ac.uk/~msch/).

    The Genopolis Microarray Database

    Background: Gene expression databases are key resources for microarray data management and analysis, and the importance of proper annotation of their content is well understood. Public repositories exist, as do microarray database systems that can be implemented by single laboratories. However, there is not yet a tool that can easily support a collaborative environment in which users with different access rights to the data can interact to define common, highly coherent content. The aim of the Genopolis database is to provide a resource that allows different groups performing microarray experiments on a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic cell and macrophage functions and host-parasite interactions. Results: The Genopolis Database system allows the community to build an object-based, MIAME-compliant annotation of their experiments and to store images and raw and processed data from the Affymetrix GeneChip® platform. It supports dynamic definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows precise control of the visibility of the database content to different subgroups in the community and facilitates export of its content to public repositories. It provides an interactive user interface for data analysis, which allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as by functional characterization of the genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users. Conclusion: The Genopolis Database supports a community in building a common coherent knowledge base and analysing it. This fills a gap between a local database and a public repository, where the development of a common coherent annotation is important. In its current implementation, it provides a uniform, coherently annotated dataset on dendritic cell and macrophage differentiation.

    Does BCR/ABL1 positive Acute Myeloid Leukaemia Exist?

    The BCR/ABL1 fusion gene, usually carried by the Philadelphia chromosome (Ph) resulting from t(9;22)(q34;q11) or variants, is pathognomonic for chronic myeloid leukaemia (CML). It is also occasionally found in acute lymphoblastic leukaemia (ALL), mostly in adults, and rarely in de novo acute myeloid leukaemia (AML). Array Comparative Genomic Hybridization (aCGH) was used to study six Ph(+)AML, three bi-lineage and four Ph(+)ALL cases, searching for specific genomic profiles. Surprisingly, loss of the IKZF1 and/or CDKN2A genes, the hallmark of Ph(+)ALL, was a recurrent finding in Ph(+)AML and accompanied cryptic deletions within the immunoglobulin and T cell receptor genes. The latter two losses have been shown to be part of 'hot spot' genome imbalances associated with the BCR/ABL1 positive pre-B lymphoid phenotype in CML and Ph(+)ALL. We applied Significance Analysis of Microarrays (SAM) to data from the 'hot spot' regions in the Ph(+)AML and a further 40 BCR/ABL1(+) samples, looking for differentiating features. After exclusion of the most dominant markers, SAM identified aberrations unique to de novo Ph(+)AML that involved relevant genes. While the biological and clinical significance of this specific genome signature remains to be uncovered, the unique loss within the immunoglobulin genes provides a simple test to enable the differentiation of clinically similar de novo Ph(+)AML and myeloid blast crisis of CML. © 2013 John Wiley & Sons Ltd and Crown

    An optimized TOPS+ comparison method for enhanced TOPS models

    This article has been made available through the Brunel Open Access Publishing Fund. Background: Although methods based on highly abstract descriptions of protein structures, such as VAST and TOPS, can perform very fast protein structure comparison, the results can lack a high degree of biological significance. Previously we have discussed the basic mechanisms of our novel method for structure comparison based on our TOPS+ model (Topological descriptions of Protein Structures Enhanced with Ligand Information). In this paper we show how these results can be significantly improved using parameter optimization, and we call the resulting optimised method the advanced TOPS+ comparison method (advTOPS+). Results: We have developed a TOPS+ string model as an improvement to the TOPS [1-3] graph model by considering loops as secondary structure elements (SSEs) in addition to helices and strands, representing ligands as first-class objects, describing interactions between SSEs, and between SSEs and ligands, by incoming and outgoing arcs, and annotating SSEs with the interaction direction and type. Benchmarking results of an all-against-all pairwise comparison using a large dataset of 2,620 non-redundant structures from the PDB40 dataset [4] demonstrate the biological significance, in terms of SCOP classification at the superfamily level, of our TOPS+ comparison method. Conclusions: Our advanced TOPS+ comparison shows better performance on the PDB40 dataset [4] than our basic TOPS+ method, giving 90 percent accuracy for SCOP alpha+beta, a 6 percent increase in accuracy compared to the TOPS and basic TOPS+ methods. It also outperforms the TOPS, basic TOPS+ and SSAP comparison methods on the Chew-Kedem dataset [5], achieving 98 percent accuracy. Software Availability: The TOPS+ comparison server is available at http://balabio.dcs.gla.ac.uk/mallika/WebTOPS/.

    Knowledge sharing and collaboration in translational research, and the DC-THERA Directory

    Biomedical research relies increasingly on large collections of data sets and knowledge whose generation, representation and analysis often require large collaborative and interdisciplinary efforts. This dimension of ‘big data’ research calls for the development of computational tools to manage such vast amounts of data, as well as tools that can improve communication and access to information from collaborating researchers and from the wider community. Whenever research projects have a defined temporal scope, an additional issue of data management arises, namely how the knowledge generated within the project can be made available beyond its boundaries and lifetime. DC-THERA is a European ‘Network of Excellence’ (NoE) that spawned a very large collaborative and interdisciplinary research community, focusing on the development of novel immunotherapies derived from fundamental research in dendritic cell immunobiology. In this article we introduce the DC-THERA Directory, an information system designed to support knowledge management for this research community and beyond. We present how the use of metadata and Semantic Web technologies can effectively help to organize the knowledge generated by modern collaborative research, how these technologies can enable effective data management solutions during and beyond the project lifecycle, and how resources such as the DC-THERA Directory fit into the larger context of e-science.

    4DXpress: a database for cross-species expression pattern comparisons

    In the major animal model species like mouse, fish or fly, detailed spatial information on gene expression over time can be acquired through whole mount in situ hybridization experiments. In these species, expression patterns of many genes have been studied and data has been integrated into dedicated model organism databases like ZFIN for zebrafish, MEPD for medaka, BDGP for Drosophila or GXD for mouse. However, a central repository that allows users to query and compare gene expression patterns across different species has not yet been established. Therefore, we have integrated expression patterns for zebrafish, Drosophila, medaka and mouse into a central public repository called 4DXpress (expression database in four dimensions). Users can query anatomy ontology-based expression annotations across species and quickly jump from one gene to the orthologues in other species. Genes are linked to public microarray data in ArrayExpress. We have mapped developmental stages between the species to be able to compare developmental time phases. We store the largest collection of gene expression patterns available to date in an individual resource, reflecting 16 505 annotated genes. 4DXpress will be an invaluable tool for developmental as well as for computational biologists interested in gene regulation and evolution. 4DXpress is available at http://ani.embl.de/4DXpress

    High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses

    Background: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories such as the Gene Expression Omnibus (GEO). As a result, there has been a surge of interest in using these microarray data for meta-analytic approaches, whether to increase sample size for a more powerful analysis of a specific disease (e.g. lung cancer) or to re-examine experiments for purposes other than those of the original study that generated them. For the average biomedical researcher, there are a number of practical barriers to conducting such meta-analyses, such as manually aggregating, filtering and formatting the data. Methods that automatically process large repositories of microarray data into a standardized, directly comparable format will enable easier and more reliable access to microarray data for meta-analyses. Methods: We present a straightforward method, simple but robust against potential outliers, for automatic quality control and pre-processing of tens of thousands of single-channel microarray data files. GEO GDS files are quality checked by comparing parametric distributions and are quantile normalized to enable direct comparison of expression levels in subsequent meta-analyses. Results: 13,000 human 1-color experiments were processed to create a single gene expression matrix from which subsets can be extracted to conduct meta-analyses. Interestingly, we found that when conducting a global meta-analysis of gene-gene co-expression patterns across all 13,000 experiments to predict gene function, normalization offered minimal improvement over using the raw data. Conclusions: Normalization of microarray data appears to be of minimal importance for analyses based on co-expression patterns when the sample size is on the order of thousands of microarray datasets. Smaller subsets, however, are more prone to aberrations and artefacts, and effective means of automating normalization procedures not only empower meta-analytic approaches but also aid reproducibility by providing a standard way of approaching the problem. Data availability: a matrix containing normalized expression of 20,813 genes across 13,000 experiments is available for download at . Source code for GDS file pre-processing is available from the authors upon request.
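
The quantile normalization step mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline (their source code is available only on request); the function name `quantile_normalize` and the toy values are assumptions made for illustration. Each sample is forced onto a shared reference distribution, taken here as the mean of the sorted values across samples.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of expression values per array).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must contain the same number of probes")

    # Reference distribution: mean across samples of the k-th smallest value.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]

    normalized = []
    for s in samples:
        # Probe indices in ascending order of value (ties keep input order).
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, probe in enumerate(order):
            # Replace each value by the reference value of the same rank.
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data: three arrays, four probes each.
arrays = [[5, 2, 3, 4], [4, 1, 4, 2], [3, 4, 6, 8]]
normalized_arrays = quantile_normalize(arrays)
```

After normalization every array has the same set of values, so expression levels become directly comparable across arrays. Tied values are ranked in input order here; production implementations usually average the reference values over ties.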

    Serum microRNA array analysis identifies miR-140-3p, miR-33b-3p and miR-671-3p as potential osteoarthritis biomarkers involved in metabolic processes.

    Background: MicroRNAs (miRNAs) in circulation have emerged as promising biomarkers. In this study, we aimed to identify a circulating miRNA signature for osteoarthritis (OA) patients and, in combination with bioinformatics analysis, to evaluate the utility of selected differentially expressed serum miRNAs as potential OA biomarkers. Methods: Serum samples collected from 12 primary OA patients and 12 healthy individuals were screened using the Agilent Human miRNA Microarray platform interrogating 2549 miRNAs. Receiver Operating Characteristic (ROC) curves were constructed to evaluate the diagnostic performance of the deregulated miRNAs. Expression levels of selected miRNAs were validated by quantitative real-time PCR (qRT-PCR) in all serum samples and in articular cartilage samples from OA patients (n = 12) and healthy individuals (n = 7). Bioinformatics analysis was used to investigate the pathways and target genes involved for the above miRNAs. Results: We identified 279 differentially expressed miRNAs in the serum of OA patients compared to controls. Two hundred and five miRNAs (73.5%) were upregulated and 74 (26.5%) downregulated. ROC analysis revealed that 77 miRNAs had area under the curve (AUC) > 0.8 and p < 0.05. Bioinformatics analysis of the 77 miRNAs revealed that their target genes were involved in multiple signaling pathways associated with OA, including the FoxO, mTOR, Wnt, PI3K/Akt and TGF-β signaling pathways, ECM-receptor interaction, and fatty acid biosynthesis. qRT-PCR validation of seven miRNAs selected from the 77 revealed 3 significantly downregulated miRNAs (hsa-miR-33b-3p, hsa-miR-671-3p, and hsa-miR-140-3p) in the serum of OA patients, which were predicted in silico to be enriched in pathways involved in metabolic processes. Target-gene analysis of hsa-miR-140-3p, hsa-miR-33b-3p, and hsa-miR-671-3p revealed that InsR and IGFR1 were common targets of all three miRNAs, highlighting their involvement in the regulation of metabolic processes that contribute to OA pathology. Hsa-miR-140-3p and hsa-miR-671-3p expression levels were consistently downregulated in articular cartilage of OA patients compared to healthy individuals. Conclusions: A serum miRNA signature was established for the first time using high density resolution miR-arrays in OA patients. We identified a three-miRNA signature, hsa-miR-140-3p, hsa-miR-671-3p, and hsa-miR-33b-3p, in the serum of OA patients, predicted to regulate metabolic processes, which could serve as a potential biomarker for the evaluation of OA risk and progression.
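
As a rough illustration of the ROC criterion used in this abstract (AUC > 0.8), the area under the curve for a single miRNA can be computed as the probability that a randomly chosen patient value exceeds a randomly chosen control value. The sketch below assumes this rank-based definition; the function name `roc_auc` and the example values are hypothetical and do not come from the study.

```python
def roc_auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (case, control) pairs in which the case value is
    higher, counting ties as half. Hypothetical helper for illustration only.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Made-up expression values for one miRNA in patients and controls.
patients = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.0]
auc = roc_auc(patients, controls)

# For a downregulated marker the raw AUC falls below 0.5, so the
# discriminative power is the larger of auc and 1 - auc.
discrimination = max(auc, 1.0 - auc)
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size (12 patients and 12 controls); larger data sets would normally use a rank-sum formulation.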

    NCBI GEO: archive for high-throughput functional genomic data

    The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. Additionally, GEO hosts other categories of high-throughput functional genomic data, including those that examine genome copy number variations, chromatin structure, methylation status and transcription factor binding. These data are generated by the research community using high-throughput technologies like microarrays and, more recently, next-generation sequencing. The database has a flexible infrastructure that can capture fully annotated raw and processed data, enabling compliance with major community-derived scientific reporting standards such as ‘Minimum Information About a Microarray Experiment’ (MIAME). In addition to serving as a centralized data storage hub, GEO offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives. This article summarizes the GEO repository structure, content and operating procedures, as well as recently introduced data mining features. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/

    Towards agreement on best practice for publishing raw clinical trial data

    Many research-funding agencies now require open access to the results of research they have funded, and some also require that researchers make available the raw data generated from that research. Similarly, the journal Trials aims to address inadequate reporting in randomised controlled trials, and in order to fulfil this objective, the journal is working with the scientific and publishing communities to try to establish best practice for publishing raw data from clinical trials in peer-reviewed biomedical journals. Common issues encountered when considering raw data for publication include patient privacy (unless explicit consent for publication is obtained) and ownership, but agreed-upon policies for tackling these concerns do not appear to be addressed in the guidance or mandates currently established. Potential next steps for journal editors and publishers, ethics committees, research-funding agencies, and researchers are proposed, and alternatives to journal publication, such as restricted access repositories, are outlined.