
    BgeeDB, an R package for retrieval of curated expression datasets and for gene list expression localization enrichment tests.

    BgeeDB is a collection of functions to import into R re-annotated, quality-controlled and re-processed expression data available in the Bgee database. This includes data from thousands of wild-type healthy samples of multiple animal species, generated with different gene expression technologies (RNA-seq, Affymetrix microarrays, expressed sequence tags, and in situ hybridizations). BgeeDB facilitates downstream analyses, such as gene expression analyses with other Bioconductor packages. Moreover, BgeeDB includes a new gene set enrichment test for preferred localization of expression of genes in anatomical structures ("TopAnat"). Along with the classical Gene Ontology enrichment test, this test provides a complementary way to interpret gene lists. Availability: https://www.bioconductor.org/packages/BgeeDB/

    Large-scale integration of microarray data: Investigating the pathologies of cancer and infectious diseases

    DNA microarrays provide a high-throughput technique for genome-wide profiling of genes at the transcript level. With large amounts of microarray data deposited on various types and aspects of malignancies, microarray technology has revolutionized the study of cancer. Such experiments aid in the discovery of novel biomarkers and provide insight into disease diagnosis, prognosis and response to treatment. Nonetheless, microarray data contain non-biological obscuring variations and systematic biases, which can distort the extraction of true aberrations in gene expression. Moreover, the number of samples generated by a single experiment is typically smaller than is statistically required to support the large number of genes studied. As a result, biomarker gene lists produced from independent datasets show little overlap. Therefore, to understand the pathophysiology of cancers and the influence they exert on the cellular processes they override, methods for combining data from different sources are necessary. Meta-analysis techniques have been utilized to address this issue by conducting an individual statistical analysis on each of the acquired datasets, then incorporating the results to generate a final gene list based on aggregated p-values or ranks. However, many of the publicly accessible cancer microarray datasets are unbalanced or asymmetric and therefore lack data from healthy samples. Consequently, critical and considerable amounts of data are overlooked. An integrative approach that combines data prior to analysis can incorporate asymmetric data. For this reason, a merged approach built on a previously validated technique, significance analysis of microarrays (SAM), is proposed. The merged SAM technique reproduced the known cancer literature with higher coverage than meta-analysis in the five independent cancer tissues considered. The same methodology was extended to a database of approximately 6000 healthy and cancer samples arising from thirteen tissues. The integrative approach has allowed for the identification of key genes common to the invasive paths of multiple cancers and can aid in drug discovery. Moreover, this integrative microarray approach was applied to viral data from HIV-1, hepatitis C and influenza to investigate the effect of these infections on iron-binding proteins. Iron is crucial for proteins involved in metabolism, DNA synthesis and immunity, accentuating such proteins as direct or indirect viral targets.
    Ph.D., Biomedical Engineering -- Drexel University, 201
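
The meta-analysis step mentioned in the abstract above (analyze each dataset separately, then aggregate per-gene p-values into one ranked gene list) can be illustrated with a minimal sketch. The abstract does not say which aggregation rule was used, so Fisher's method is an assumption here, and the function names and gene labels are hypothetical.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values for one gene using Fisher's method.

    Assumption: Fisher's method stands in for whichever aggregation
    rule the thesis actually used; the abstract does not name one.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p); we keep X / 2.
    half_stat = -sum(math.log(p) for p in p_values)

    # X follows a chi-square distribution with 2k degrees of freedom.
    # For even degrees of freedom the upper tail has a closed form:
    #   P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def rank_genes(per_dataset_pvalues):
    """Return (gene, combined p-value) pairs, most significant first.

    per_dataset_pvalues maps a gene label to the list of p-values
    obtained for that gene in each independently analyzed dataset.
    """
    combined = {
        gene: fisher_combined_pvalue(ps)
        for gene, ps in per_dataset_pvalues.items()
    }
    return sorted(combined.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical gene labels and p-values, for illustration only.
    example = {
        "GENE_A": [0.01, 0.02, 0.03],
        "GENE_B": [0.50, 0.60, 0.40],
    }
    for gene, p in rank_genes(example):
        print(gene, p)
```

Each dataset must yield its own p-value before anything can be combined, which requires both disease and healthy samples within that dataset. This is why, as the abstract notes, unbalanced datasets are left out of a meta-analysis but can still contribute when data are merged before analysis.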