2,824 research outputs found

    Making Informed Choices about Microarray Data Analysis

    This article describes the typical stages in the analysis of microarray data for non-specialist researchers in systems biology and medicine. Particular attention is paid to significant data analysis issues that are commonly encountered among practitioners, some of which need wider airing. The issues addressed include experimental design, quality assessment, normalization, and summarization of multiple-probe data. This article is based on the ISMB 2008 tutorial on microarray data analysis. An expanded version of the material in this article and the slides from the tutorial can be found at http://www.people.vcu.edu/~mreimers/OGMDA/index.html

    Probabilistic analysis of the human transcriptome with side information

    Understanding functional organization of genetic information is a major challenge in modern biology. Following the initial publication of the human genome sequence in 2001, advances in high-throughput measurement technologies and efficient sharing of research material through community databases have opened up new views to the study of living organisms and the structure of life. In this thesis, novel computational strategies have been developed to investigate a key functional layer of genetic information, the human transcriptome, which regulates the function of living cells through protein synthesis. The key contributions of the thesis are general exploratory tools for high-throughput data analysis that have provided new insights into cell-biological networks, cancer mechanisms and other aspects of genome function. A central challenge in functional genomics is that high-dimensional genomic observations are associated with high levels of complex and largely unknown sources of variation. By combining statistical evidence across multiple measurement sources and the wealth of background information in genomic data repositories, it has been possible to resolve some of the uncertainties associated with individual observations and to identify functional mechanisms that could not be detected based on individual measurement sources. Statistical learning and probabilistic models provide a natural framework for such modeling tasks. Open source implementations of the key methodological contributions have been released to facilitate further adoption of the developed methods by the research community. Comment: Doctoral thesis. 103 pages, 11 figures

    Using the R Package crlmm for Genotyping and Copy Number Estimation

    Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for batch effects and provides allele-specific estimates of copy number. This paper illustrates a workflow for the estimation of allele-specific copy number and integration of the marker-level estimates with complementary Bioconductor software for inferring regions of copy number gain or loss. All analyses are performed in the statistical environment R.

    The potential for liquid biopsies in the precision medical treatment of breast cancer.

    Currently the clinical management of breast cancer relies on relatively few prognostic/predictive clinical markers (estrogen receptor, progesterone receptor, HER2), based on primary tumor biology. Circulating biomarkers, such as circulating tumor DNA (ctDNA) or circulating tumor cells (CTCs), may enhance our treatment options by focusing on the very cells that are the direct precursors of distant metastatic disease and that are probably inherently different from the primary tumor in their biology. To shift the current clinical paradigm, assessing tumor biology in real time by molecularly profiling CTCs or ctDNA may serve to discover therapeutic targets, detect minimal residual disease and predict response to treatment. This review serves to elucidate the detection, characterization, and clinical application of CTCs and ctDNA with the goal of precision treatment of breast cancer.

    Next station in microarray data analysis: GEPAS

    The Gene Expression Profile Analysis Suite (GEPAS) has been running for more than four years. During this time it has evolved to keep pace with the new interests and trends in the still changing world of microarray data analysis. GEPAS has been designed to provide an intuitive yet powerful web-based interface that offers diverse analysis options, from the early step of preprocessing (normalization of Affymetrix and two-colour microarray experiments and other preprocessing options) to the final step of the functional annotation of the experiment (using Gene Ontology, pathways, PubMed abstracts etc.), and includes different possibilities for clustering, gene selection, class prediction and array-comparative genomic hybridization management. GEPAS is extensively used by researchers in many countries and its records indicate an average usage rate of 400 experiments per day. The web-based pipeline for microarray gene expression data, GEPAS, is available at

    A VISUALIZATION TOOL FOR CROSS-EXPERIMENT GENE EXPRESSION ANALYSIS OF C. ELEGANS

    Forty-six genomic gene expression studies of the free-living soil nematode C. elegans have been published. To facilitate exploratory analysis of those studies, we constructed a database containing all the published C. elegans expression datasets. A Perl CGI program, called Microarray Analysis Display (MAdisplay), allows gene expression clustergrams of any combination of entered genes and datasets to be viewed (http://elegans.uky.edu/gl/madisplay). Perl programs were used to preprocess the raw data from different sources into a common format and to transform the data to display the expression changes relative to each experiment's controls. Three hundred lists of genes from figures and tables were extracted from the publications and made available in the GeneLists database, which also contains Gene Ontology and KEGG gene lists. We used these tools to examine in a systematic fashion the mean expression of gene lists in the set of microarray and SAGE experiments. Seventy-nine percent of publication-derived gene lists show a strong expression change (p-value < 0.001) in more than one experiment, with the median being fourteen out of the 127 experiments that are derived from the forty-six publications. This indicates that groups of genes identified in one publication typically show an expression effect in many other experiments.

    Involvement of genes and non-coding RNAs in cancer: profiling using microarrays

    MicroRNAs (miRNAs) are small noncoding RNAs (ncRNAs, RNAs that do not code for proteins) that regulate the expression of target genes. MiRNAs can act as tumor suppressor genes or oncogenes in human cancers. Moreover, a large fraction of genomic ultraconserved regions (UCRs) encode a particular set of ncRNAs whose expression is altered in human cancers. Bioinformatics studies are emerging as important tools to identify associations between miRNAs/ncRNAs and CAGRs (Cancer Associated Genomic Regions). ncRNA profiling, the use of highly parallel devices such as microarrays for expression analysis, public resources such as mapping, expression, and functional databases, and prediction algorithms have allowed the identification of specific signatures associated with diagnosis, prognosis and response to treatment of human tumors.

    Copy number and gene expression differences between African American and Caucasian American prostate cancer

    Background: The goal of our study was to investigate the molecular underpinnings associated with the relatively aggressive clinical behavior of prostate cancer (PCa) in African American (AA) compared to Caucasian American (CA) patients using a genome-wide approach. Methods: AA and CA patients treated with radical prostatectomy (RP) were frequency matched for age at RP, Gleason grade, and tumor stage. Array-CGH (BAC SpectralChip2600) was used to identify genomic regions with significantly different DNA copy number between the groups. Gene expression profiling of the same set of tumors was also evaluated using Affymetrix HG-U133 Plus 2.0 arrays. Concordance between copy number alteration and gene expression was examined. A second aCGH analysis was performed in a larger validation cohort using an oligo-based platform (Agilent 244K). Results: BAC-based array identified 27 chromosomal regions with significantly different copy number changes between the AA and CA tumors in the first cohort (Fisher's exact test, P < 0.05). Copy number alterations in these 27 regions were also significantly associated with gene expression changes. aCGH performed in a larger, independent cohort of AA and CA tumors validated 4 of the 27 (15%) most significantly altered regions from the initial analysis (3q26, 5p15-p14, 14q32, and 16p11). Functional annotation of overlapping genes within the 4 validated regions of AA/CA DNA copy number changes revealed significant enrichment of genes related to immune response. Conclusions: Our data reveal molecular alterations at the level of gene expression and DNA copy number that are specific to African American and Caucasian prostate cancer and may be related to underlying differences in immune response.
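
The abstract above reports a per-region comparison of copy number changes between two groups using Fisher's exact test. As a rough illustration of that kind of test (not the authors' pipeline), the sketch below computes a two-sided Fisher's exact p-value for a single 2x2 table in plain Python. The function name `fisher_exact_two_sided` and the counts in the usage example are hypothetical; the abstract does not give per-region counts.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Illustrative layout (an assumption, not taken from the abstract):
    rows are the two patient groups, columns are "region altered" and
    "region not altered".
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x altered tumors in the first group,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table at least as unlikely as the
    # observed one; the small tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts for one region: 8 of 10 tumors altered in one group
    # versus 1 of 6 in the other.
    p_value = fisher_exact_two_sided(8, 2, 1, 5)
    print(f"two-sided p = {p_value:.4f}")
```

In a genome-wide setting such a test would be run once per region, so some form of multiple-testing control would normally accompany the per-region threshold; the abstract does not say which, if any, was used.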