16 research outputs found

    MACAT—microarray chromosome analysis tool

    By linking differential gene expression to the chromosomal localization of genes, one can investigate microarray data for characteristic expression patterns that involve sizeable parts of specific chromosomes. We have implemented a statistical approach for identifying significantly differentially expressed chromosome regions. We demonstrate the applicability of the approach on a publicly available data set on acute lymphocytic leukemia.

    Statistical Test of Expression Pattern (STEPath): a new strategy to integrate gene expression data with genomic information in individual and meta-analysis studies

    Background: In recent decades, microarray technology has spread, leading to a dramatic increase in publicly available datasets. The first statistical tools developed focused on the identification of significantly differentially expressed genes. Later, researchers moved toward the systematic integration of gene expression profiles with additional biological information, such as chromosomal location, ontological annotations or sequence features. Analysing gene expression in relation to the physical location of genes on chromosomes allows the identification of transcriptionally imbalanced regions, while gene set analysis focuses on detecting coordinated changes in transcriptional levels among sets of biologically related genes. In this field, meta-analysis makes it possible to compare different studies addressing the same biological question and so to fully exploit public gene expression datasets. Results: We describe STEPath, a method that starts from gene expression profiles and integrates the analysis of imbalanced regions as an a priori step before performing gene set analysis. Applying STEPath to individual studies produced gene set scores weighted by chromosomal activation. As a final step, we propose a way to compare these scores across different studies (meta-analysis) on related biological issues. One complication with meta-analysis is batch effects, which occur because molecular measurements are affected by laboratory conditions, reagent lots and personnel differences. Major problems arise when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. We evaluated the power of combining chromosome mapping and gene set enrichment analysis by analysing a leukaemia dataset (an example of an individual study) and a dataset of skeletal muscle diseases (the meta-analysis approach). In leukaemia, we identified the Hox gene set, a gene set closely related to the pathology that other gene set analysis algorithms do not identify, while the meta-analysis approach on muscular disease discriminates between related pathologies and correlates similar ones from different studies. Conclusions: STEPath is a new method that integrates gene expression profiles, genomic co-expressed regions and information about the biological function of genes. Using the STEPath-computed gene set scores overcomes batch effects in meta-analysis approaches, allowing the direct comparison of different pathologies and different studies at the level of gene set activation.

    Positional gene enrichment analysis of gene sets for high-resolution identification of overrepresented chromosomal regions

    The search for feature enrichment is a widely used method to characterize a set of genes. While several tools have been designed for nominal features such as Gene Ontology annotations or KEGG Pathways, very little has been proposed to tackle numerical features such as the chromosomal positions of genes. For instance, microarray studies typically generate lists of genes that are differentially expressed in the sample subgroups under investigation, and when studying diseases caused by genome alterations, it is of great interest to delineate the chromosomal regions that are significantly enriched in these lists. In this article, we present a positional gene enrichment analysis method (PGE) for the identification of chromosomal regions that are significantly enriched in a given set of genes. The strength of our method relies on an original query optimization approach that makes it possible to consider virtually all possible chromosomal regions for enrichment, and on the multiple testing correction, which discriminates truly enriched regions from those that can occur by chance. We have developed a Web tool implementing this method applied to the human genome (http://www.esat.kuleuven.be/~bioiuser/pge). We validated PGE on published lists of differentially expressed genes. These analyses showed significant overrepresentation of known aberrant chromosomal regions.
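
As an illustration of the idea of positional enrichment, the following is a minimal sketch and not the PGE implementation. It assumes that enrichment of a region is scored with an upper-tail hypergeometric probability, and it replaces the query optimization mentioned in the abstract with a plain brute-force scan over every contiguous run of genes on one chromosome. The function names and parameters are hypothetical, and the returned p-values are raw (no multiple testing correction is applied).

```python
from math import comb


def region_enrichment_pvalue(n_genome, n_query, n_region, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_genome  : total number of annotated genes (assumed background)
    n_query   : number of genes in the query list
    n_region  : number of genes located in the candidate region
    n_overlap : number of query genes found in the region
    """
    upper = min(n_query, n_region)
    total = comb(n_genome, n_region)
    tail = sum(
        comb(n_query, i) * comb(n_genome - n_query, n_region - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


def best_region(flags, n_genome, n_query):
    """Brute-force scan of all contiguous gene runs on one chromosome.

    flags[i] is True when the i-th gene, in chromosomal order, belongs to
    the query list. Returns (raw p-value, start index, end index) of the
    run with the smallest p-value.
    """
    best = (1.0, 0, 0)
    for start in range(len(flags)):
        hits = 0
        for end in range(start, len(flags)):
            hits += flags[end]
            p = region_enrichment_pvalue(n_genome, n_query, end - start + 1, hits)
            if p < best[0]:
                best = (p, start, end)
    return best


# Toy chromosome of 14 genes in which genes 5..8 are all in the query list.
flags = [False] * 5 + [True] * 4 + [False] * 5
p_value, start, end = best_region(flags, n_genome=100, n_query=4)
print(start, end, p_value)
```

The brute-force scan is quadratic in the number of genes per chromosome, which is acceptable for a toy example; a real tool would also need to correct the reported p-values for the number of regions tested.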

    WoPPER: Web server for Position Related data analysis of gene Expression in Prokaryotes

    The structural and conformational organization of chromosomes is crucial for gene expression regulation in both eukaryotes and prokaryotes. To date, gene expression data generated using either microarray or RNA-sequencing are available for many bacterial genomes. However, differential gene expression is usually investigated with methods that consider each gene independently and therefore do not take into account the physical localization of genes along a bacterial chromosome. Here, we present WoPPER, a web tool integrating gene expression and genomic annotations to identify differentially expressed chromosomal regions in bacteria. RNA-sequencing or microarray-based gene expression data are provided as input, along with gene annotations. The user can select genomic annotations from an internal database including 2780 bacterial strains, or provide custom genomic annotations. The analysis outputs lists of positionally related genes showing a coordinated trend of differential expression. Graphical representations, including a circular plot of the analyzed chromosome, allow intuitive browsing of the results. The analysis procedure is based on our previously published R-package PREDA. The release of this tool is timely and relevant for the scientific community, as WoPPER will fill an existing gap in prokaryotic gene expression data analysis and visualization tools. WoPPER is open to all users and can be reached at the following URL: https://WoPPER.ba.itb.cnr.it

    The Annotation, Mapping, Expression and Network (AMEN) suite of tools for molecular systems biology

    Background: High-throughput genome biological experiments yield large and multifaceted datasets that require flexible and user-friendly analysis tools to facilitate their interpretation by life scientists. Many solutions currently exist, but they are often limited to specific steps in the complex process of data management and analysis and some require extensive informatics skills to be installed and run efficiently. Results: We developed the Annotation, Mapping, Expression and Network (AMEN) software as a stand-alone, unified suite of tools that enables biological and medical researchers with basic bioinformatics training to manage and explore genome annotation, chromosomal mapping, protein-protein interaction, expression profiling and proteomics data. The current version provides modules for (i) uploading and pre-processing data from microarray expression profiling experiments, (ii) detecting groups of significantly co-expressed genes, and (iii) searching for enrichment of functional annotations within those groups. Moreover, the user interface is designed to simultaneously visualize several types of data such as protein-protein interaction networks in conjunction with expression profiles and cellular co-localization patterns. We have successfully applied the program to interpret expression profiling data from budding yeast, rodents and human. Conclusion: AMEN is an innovative solution for molecular systems biological data analysis freely available under the GNU license. The program is available via a website at the Sourceforge portal which includes a user guide with concrete examples, links to external databases and helpful comments to implement additional functionalities. We emphasize that AMEN will continue to be developed and maintained by our laboratory because it has proven to be extremely useful for our genome biological research program.

    A locally adaptive statistical procedure (LAP) to identify differentially expressed chromosomal regions

    Motivation: The systematic integration of expression profiles and other types of gene information, such as chromosomal localization, ontological annotations and sequence characteristics, still represents a challenge in the gene expression arena. In particular, the analysis of transcriptional data in the context of the physical location of genes in a genome appears promising for detecting chromosomal regions with the transcriptional imbalances that often characterize cancer. Results: A computational tool named locally adaptive statistical procedure (LAP), which incorporates transcriptional data and structural information for the identification of differentially expressed chromosomal regions, is described. LAP accounts for variations in the distance between genes and in gene density by smoothing standard statistics on gene position before testing the significance of their differential levels of gene expression. The procedure smooths parameters and computes p-values locally to account for the complex structure of the genome and to estimate the differential expression of chromosomal regions more precisely. Applying LAP to three independent sets of raw expression data identified differentially expressed regions that are directly involved in known chromosomal aberrations characteristic of tumors. Availability: Functions in R for implementing the LAP method are available.
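
The general pattern described here (smooth a per-gene statistic along chromosomal position, then assess significance by permutation) can be sketched as follows. This is a simplified illustration under stated assumptions, not the LAP method itself: it swaps in a fixed-bandwidth Gaussian kernel smoother where LAP adapts the smoothing locally, and it permutes statistics across all gene positions to build an empirical null. All names, the bandwidth and the toy data are hypothetical.

```python
import math
import random


def kernel_smooth(positions, stats, bandwidth):
    """Gaussian-kernel weighted average of stats at each gene position.

    Weighting by genomic distance (rather than by gene index) lets
    unevenly spaced genes contribute according to how close they are.
    """
    smoothed = []
    for x0 in positions:
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in positions]
        total = sum(weights)
        smoothed.append(sum(w * s for w, s in zip(weights, stats)) / total)
    return smoothed


def permutation_pvalues(positions, stats, bandwidth, n_perm=200, seed=0):
    """Empirical upper-tail p-value per gene position.

    Statistics are shuffled over positions, re-smoothed, and compared
    with the observed smoothed value at the same position.
    """
    rng = random.Random(seed)
    observed = kernel_smooth(positions, stats, bandwidth)
    exceed = [0] * len(stats)
    shuffled = list(stats)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null = kernel_smooth(positions, shuffled, bandwidth)
        for i, (obs, nul) in enumerate(zip(observed, null)):
            if nul >= obs:
                exceed[i] += 1
    # Add-one correction keeps p-values strictly positive.
    return [(e + 1) / (n_perm + 1) for e in exceed]


# Toy chromosome: 30 genes, with genes 10..14 carrying a high statistic.
positions = list(range(30))
stats = [3.0 if 10 <= i <= 14 else 0.0 for i in range(30)]
pvals = permutation_pvalues(positions, stats, bandwidth=1.5)
print(pvals[12], pvals[0])
```

In this toy example the gene in the middle of the high-statistic block receives a much smaller p-value than a gene far from it; the resulting p-values would still need multiple testing correction before regions are called.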

    A computational procedure to identify significant overlap of differentially expressed and genomic imbalanced regions in cancer datasets

    The integration of high-throughput genomic data represents an opportunity for deciphering the interplay between structural and functional organization of genomes and for discovering novel biomarkers. However, the development of integrative approaches to complement gene expression (GE) data with other types of gene information, such as copy number (CN) and chromosomal localization, still represents a computational challenge in the genomic arena. This work presents a computational procedure that directly integrates CN and GE profiles at genome-wide level. When applied to DNA/RNA paired data, this approach leads to the identification of Significant Overlaps of Differentially Expressed and Genomic Imbalanced Regions (SODEGIR). This goal is accomplished in three steps. The first step extends to CN a method for detecting regional imbalances in GE. The second step integrates CN and GE data and identifies chromosomal regions with concordantly altered genomic and transcriptional status in a tumor sample. The last step extends the single-sample analysis to an entire dataset of tumor specimens. When applied to study chromosomal aberrations in a collection of astrocytoma and renal carcinoma samples, the procedure proved to be effective in identifying discrete chromosomal regions of coordinated CN alterations and changes in transcriptional levels.

    Genomic expression during human myelopoiesis

    Background: Human myelopoiesis is an exciting biological model for cellular differentiation since it represents a plastic process where multipotent stem cells gradually limit their differentiation potential, generating different precursor cells which finally evolve into distinct terminally differentiated cells. This study aimed at investigating the genomic expression during myeloid differentiation through a computational approach that integrates gene expression profiles with functional information and genome organization. Results: Gene expression data from 24 experiments for 8 different cell types of the human myelopoietic lineage were used to generate an integrated myelopoiesis dataset of 9,425 genes, each reliably associated with a unique genomic position and chromosomal coordinate. Lists of genes constitutively expressed or silent during myelopoiesis and of genes differentially expressed in the commitment phase of myelopoiesis were first identified using a classical data analysis procedure. Then, the genomic distribution of myelopoiesis genes was investigated by integrating transcriptional and functional characteristics of genes. This approach identified specific chromosomal regions that are significantly highly or weakly expressed, as well as clusters of differentially expressed genes and of transcripts related to specific functional modules. Conclusion: The analysis of genomic expression during human myelopoiesis using an integrative computational approach revealed important relationships between genomic position, biological function and expression patterns, and highlighted chromatin domains, including genes with coordinated expression and lineage-specific functions.

    Deciphering the genetic heterogeneity in Acute Myeloid Leukemia
