79 research outputs found

    Effects of filtering by Present call on analysis of microarray experiments

    BACKGROUND: Affymetrix GeneChips® are widely used for expression profiling of tens of thousands of genes. The large number of comparisons can lead to false positives. Various methods have been used to reduce false positives, but they have rarely been compared or quantitatively evaluated. Here we describe and evaluate a simple method that uses the detection (Present/Absent) call generated by the Affymetrix Microarray Suite version 5 software (MAS5) to remove data that is not reliably detected before further analysis, and compare this with filtering by expression level. We explore the effects of various thresholds for removing data in experiments of different size (from 3 to 10 arrays per treatment), as well as their relative power to detect significant differences in expression. RESULTS: Our approach sets a threshold for the fraction of arrays called Present in at least one treatment group. This method removes a large percentage of probe sets called Absent before carrying out the comparisons, while retaining most of the probe sets called Present. It preferentially retains the more significant probe sets (p ≤ 0.001) and those probe sets that are turned on or off, and improves the false discovery rate. Permutations to estimate false positives indicate that probe sets removed by the filter contribute a disproportionate number of false positives. Filtering by fraction Present is effective when applied to data generated either by the MAS5 algorithm or by other probe-level algorithms, for example RMA (robust multichip average). Experiment size greatly affects the ability to reproducibly detect significant differences, and also impacts the effect of filtering; smaller experiments (3–5 samples per treatment group) benefit from more restrictive filtering (≥50% Present). CONCLUSION: Use of a threshold fraction of Present detection calls (derived by MAS5) provided a simple method that effectively eliminated from analysis probe sets that are unlikely to be reliable while preserving the most significant probe sets and those turned on or off; it thereby increased the ratio of true positives to false positives
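
The filtering rule described in this abstract (keep a probe set if the fraction of arrays called Present reaches a threshold in at least one treatment group) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the probe identifiers, the toy detection calls, and the use of the letters `P`/`M`/`A` for Present/Marginal/Absent are all assumptions made for the example.

```python
from typing import Dict, List


def fraction_present(calls: List[str]) -> float:
    """Fraction of arrays whose detection call is 'P' (Present)."""
    return sum(c == "P" for c in calls) / len(calls)


def filter_by_present(
    calls_by_probe: Dict[str, List[str]],
    groups: Dict[str, List[int]],
    threshold: float = 0.5,
) -> List[str]:
    """Return probe sets whose fraction Present is >= threshold
    in at least one treatment group.

    calls_by_probe: probe id -> one detection call per array (hypothetical layout)
    groups: group name -> indices of the arrays belonging to that group
    """
    kept = []
    for probe, calls in calls_by_probe.items():
        for indices in groups.values():
            if fraction_present([calls[i] for i in indices]) >= threshold:
                kept.append(probe)
                break  # one qualifying group is enough
    return kept


# Toy data: three arrays per treatment group (values are invented).
calls_by_probe = {
    "probe_on": ["A", "A", "A", "P", "P", "P"],       # turned on by treatment
    "probe_absent": ["A", "A", "M", "A", "P", "A"],   # mostly undetected
    "probe_present": ["P", "P", "P", "P", "P", "A"],  # reliably detected
}
groups = {"control": [0, 1, 2], "treated": [3, 4, 5]}

kept = filter_by_present(calls_by_probe, groups, threshold=0.5)
print(kept)
```

Evaluating the threshold per group rather than across all arrays is what lets a probe set that is Absent in one treatment and Present in the other (turned on or off) survive the filter.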

    Mapping of trans-acting regulatory factors from microarray data

    To explore the mapping of factors regulating gene expression, we have carried out linkage studies using expression data from individual transcripts (from Affymetrix microarrays; Genetic Analysis Workshop 15 Problem 1) and composite data on correlated groups of transcripts. Quality measures for the arrays were used to remove outliers, and arrays with sex mismatches were also removed. Data likely to represent noise were removed by setting a minimum threshold of present calls among the non-redundant set of 190 arrays. SOLAR was used for genetic analysis, with MAS5 signal as the measure of expression. Probe sets with larger CVs generated more linkages (LOD > 2.0). While trans linkages predominated, linkages with the largest LOD scores (>4) were mostly cis. Hierarchical clustering was used to generate correlated groups of genes. We tested four composite measures of expression for the clusters. The average signal, average normalized signal, and the first principal component of the data behaved similarly; in 8/19 clusters tested, the composite measures linked to a region to which some individual probe sets within the cluster also linked. The second principal component only produced one linkage with LOD > 2. One cluster based upon chromosomal location, containing histone genes, linked to two trans regions. This work demonstrates that composite measures for genes with correlated expression can be used to identify loci that affect multiple co-expressed genes
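
Three of the composite measures named in this abstract (average signal, average normalized signal, and first principal component) can be sketched for one cluster of correlated probe sets as below. This is an illustration under stated assumptions, not the authors' pipeline: "normalized" is taken to mean a per-probe-set z-score, the principal component is computed on the probe-set correlation matrix by power iteration (a stand-in for a library eigendecomposition), and the signal values are invented.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = probe sets, columns = arrays


def zscore(row: List[float]) -> List[float]:
    """Standardize one probe set across arrays (sample standard deviation)."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((x - mean) ** 2 for x in row) / (len(row) - 1))
    return [(x - mean) / sd for x in row]


def average_signal(cluster: Matrix) -> List[float]:
    """Mean signal over the cluster's probe sets, one value per array."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]


def average_normalized_signal(cluster: Matrix) -> List[float]:
    """Mean of the z-scored probe sets, one value per array."""
    return average_signal([zscore(row) for row in cluster])


def first_principal_component(cluster: Matrix, iterations: int = 200) -> List[float]:
    """Per-array scores on the first principal component of the z-scored cluster.

    Power iteration from a uniform start vector; this assumes the probe sets
    are positively correlated, as expected within a co-expression cluster.
    """
    z = [zscore(row) for row in cluster]
    p, n = len(z), len(z[0])
    # Correlation matrix between probe sets (p x p).
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
            for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each array onto the loading vector.
    return [sum(v[i] * z[i][k] for i in range(p)) for k in range(n)]


# Invented signals: 3 correlated probe sets measured on 5 arrays.
cluster = [
    [100.0, 120.0, 140.0, 160.0, 180.0],
    [50.0, 62.0, 69.0, 81.0, 90.0],
    [200.0, 210.0, 250.0, 260.0, 300.0],
]
avg = average_signal(cluster)
avg_norm = average_normalized_signal(cluster)
pc1 = first_principal_component(cluster)
print(avg, avg_norm, pc1)
```

Each function returns one value per array, so any of the three vectors could be passed to a linkage program as a single quantitative trait in place of an individual probe set's signal.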

    Probe set algorithms: is there a rational best bet?

    Affymetrix microarrays have become a standard experimental platform for studies of mRNA expression profiling. Their success is due, in part, to the multiple oligonucleotide features (probes) against each transcript (probe set). This multiple testing allows for more robust background assessments and gene expression measures, and has permitted the development of many computational methods to translate image data into a single normalized "signal" for mRNA transcript abundance. There are now many probe set algorithms that have been developed, with a gradual movement away from chip-by-chip methods (MAS5), to project-based model-fitting methods (dCHIP, RMA, others). Data interpretation is often profoundly changed by choice of algorithm, with disoriented biologists questioning what the "accurate" interpretation of their experiment is. Here, we summarize the debate concerning probe set algorithms. We provide examples of how changes in mismatch weight, normalizations, and construction of expression ratios each dramatically change data interpretation. All interpretations can be considered as computationally appropriate, but with varying biological credibility. We also illustrate the performance of two new hybrid algorithms (PLIER, GC-RMA) relative to more traditional algorithms (dCHIP, MAS5, Probe Profiler PCA, RMA) using an interactive power analysis tool. PLIER appears superior to other algorithms in avoiding false positives with poorly performing probe sets. Based on our interpretation of the literature, and examples presented here, we suggest that the variability in performance of probe set algorithms is more dependent upon assumptions regarding "background", than on calculations of "signal". We argue that "background" is an enormously complex variable that can only be vaguely quantified, and thus the "best" probe set algorithm will vary from project to project

    Identification of transcription factor and microRNA binding sites in responsible to fetal alcohol syndrome

    This is a first report, using our MotifModeler informatics program, to simultaneously identify transcription factor (TF) and microRNA (miRNA) binding sites from gene expression microarray data. Based on the assumption that gene expression is controlled by the combinatorial effects of transcription factors binding in the 5'-upstream regulatory region and miRNAs binding in the 3'-untranslated region (3'-UTR), we developed a model for (1) predicting the most influential cis-acting elements under a given biological condition, and (2) estimating the effects of those elements on gene expression levels. The regulatory regions, TFs, and miRNAs that mediate differential gene expression in fetal alcohol syndrome were unknown, so microarray data from an alcohol exposure paradigm were used. The model predicted strong inhibitory effects of 5' cis-acting elements and stimulatory effects of the 3'-UTR under alcohol treatment. The predictive model yielded a key hypothesis, proposed here for the first time: a novel role of miRNAs in the gene expression changes associated with abnormal mouse embryo development after alcohol exposure. This suggests that disturbance of miRNA functions may contribute to the alcohol-induced developmental deficiencies

    Identification of Reference Genes across Physiological States for qRT-PCR through Microarray Meta-Analysis

    The accuracy of quantitative real-time PCR (qRT-PCR) is highly dependent on reliable reference gene(s). Some housekeeping genes that are commonly used for normalization are widely recognized as inappropriate in many experimental conditions. This study aimed to identify reference genes for clinical studies through microarray meta-analysis of human clinical samples. After uniform data preprocessing and data quality control, 4,804 Affymetrix HU-133A arrays run on clinical samples were classified into four physiological states with 13 organ/tissue types. We identified a list of reference genes for each organ/tissue type which exhibited stable expression across physiological states. Furthermore, 102 genes identified as reference gene candidates in multiple organ/tissue types were selected for further analysis. These genes have been frequently identified as housekeeping genes in previous studies, and approximately 71% of them fall into the Gene Expression (GO:0010467) category in Gene Ontology. Based on microarray meta-analysis of human clinical sample arrays, we identified sets of reference gene candidates for various organ/tissue types and then examined the functions of these genes. Additionally, we found that many of the reference genes are functionally related to transcription, RNA processing and translation. According to our results, researchers could select single or multiple reference gene(s) for normalization of qRT-PCR in clinical studies

    A meta-analysis of kidney microarray datasets: investigation of cytokine gene detection and correlation with rt-PCR and detection thresholds

    BACKGROUND: Microarrays provide a means to simultaneously examine the gene expression of the entire transcriptome in a single sample. Many studies have highlighted the need for novel software and statistical approaches to assess the measured gene expression. Less attention has been directed toward whether genes considered undetectable by microarray can be detected by other strategies or whether these genes can provide accurate gene expression determinations. In the kidney this is a concern for genes such as cytokines which dramatically influence the immune response but are often considered low abundance genes produced by a small number of cells. RESULTS: Using both publicly available and our own microarray datasets we analyzed the detection p-value and detection call values for 81 human kidney samples run on the U133A or U133Plus2.0 Affymetrix microarrays (Affymetrix, Santa Clara, CA). For the cytokine genes, the frequency of detection in each sample group (normal, transplant and renal cell carcinoma) was examined and revealed that a majority of cytokine related genes are not detectable in human kidney by microarray. Using a subset of 29 Mayo transplant samples, a group of seven transplant-related cytokines and eight non-cytokine genes were evaluated by real-time PCR (rt-PCR). For these 15 genes we compared the impact of decreasing microarray detection frequency with the changes in gene expression observed by both microarray and rt-PCR. We found that as microarray detection frequency decreased the correlation between microarray and rt-PCR data also decreased. CONCLUSION: We conclude that, when analyzing microarray data from human kidney samples, genes generally expressed at low abundance (i.e. cytokines) should be evaluated with more sensitive approaches such as rt-PCR. In addition, our data suggest that the use of detection frequency cutoffs for inclusion or exclusion of microarray data may be appropriate when comparing microarray and rt-PCR gene expression data and p-value calculations

    Sex-Related Differences in Gene Expression in Human Skeletal Muscle

    There is sexual dimorphism of skeletal muscle, the most obvious feature being the larger muscle mass of men. The molecular basis for this difference has not been clearly defined. To identify genes that might contribute to the relatively greater muscularity of men, we compared skeletal muscle gene expression profiles of 15 normal men and 15 normal women by using comprehensive oligonucleotide microarrays. Although there were sex-related differences in expression of several hundred genes, very few of the differentially expressed genes have functions that are obvious candidates for explaining the larger muscle mass of men. The men tended to have higher expression of genes encoding mitochondrial proteins, ribosomal proteins, and a few translation initiation factors. The women had >2-fold greater expression than the men (P<0.0001) of two genes that encode proteins in growth factor pathways known to be important in regulating muscle mass: growth factor receptor-bound 10 (GRB10) and activin A receptor IIB (ACVR2B). GRB10 encodes a protein that inhibits insulin-like growth factor-1 (IGF-1) signaling. ACVR2B encodes a myostatin receptor. Quantitative RT-PCR confirmed higher expression of GRB10 and ACVR2B genes in these women. In an independent microarray study of 10 men and 9 women with facioscapulohumeral dystrophy, women had higher expression of GRB10 (2.7-fold, P<0.001) and ACVR2B (1.7-fold, P<0.03). If these sex-related differences in mRNA expression lead to reduced IGF-1 activity and increased myostatin activity, they could contribute to the sex difference in muscle size

    Pathway analysis of mRNA expression data using a hierarchical structural model

    Thesis (Master's) -- Seoul National University Graduate School: College of Natural Sciences, Interdisciplinary Program in Bioinformatics, August 2019. Taesung Park. Although there have been several analyses for identifying cancer-associated pathways, based on gene expression data, most of these are based on single pathway analyses, and thus do not consider correlations between pathways. In this paper, we propose a hierarchical structural component model of pathway analysis for gene expression data (HisCoM-PAGE), which accounts for the hierarchical structure of genes and pathways, as well as the correlations among pathways. Specifically, HisCoM-PAGE focuses on the survival phenotype and identifies its associated pathways. Moreover, its application to a real biological data analysis of pancreatic cancer data demonstrated that HisCoM-PAGE could successfully identify pathways associated with pancreatic cancer prognosis. Simulation studies comparing the performance of HisCoM-PAGE with other competing methods such as Gene Set Enrichment Analysis (GSEA), Global Test, and Wald-type Test showed HisCoM-PAGE to have the highest power to detect causal pathways. Contents: 1 Introduction; 2 Materials; 3 Methodology; 4 Results; 5 Discussions; Bibliography; Abstract in Korean