Effects of filtering by Present call on analysis of microarray experiments
BACKGROUND: Affymetrix GeneChips® are widely used for expression profiling of tens of thousands of genes. The large number of comparisons can lead to false positives. Various methods have been used to reduce false positives, but they have rarely been compared or quantitatively evaluated. Here we describe and evaluate a simple method that uses the detection (Present/Absent) call generated by the Affymetrix microarray suite version 5 software (MAS5) to remove data that is not reliably detected before further analysis, and compare this with filtering by expression level. We explore the effects of various thresholds for removing data in experiments of different size (from 3 to 10 arrays per treatment), as well as their relative power to detect significant differences in expression. RESULTS: Our approach sets a threshold for the fraction of arrays called Present in at least one treatment group. This method removes a large percentage of probe sets called Absent before carrying out the comparisons, while retaining most of the probe sets called Present. It preferentially retains the more significant probe sets (p ≤ 0.001) and those probe sets that are turned on or off, and improves the false discovery rate. Permutations to estimate false positives indicate that probe sets removed by the filter contribute a disproportionate number of false positives. Filtering by fraction Present is effective when applied to data generated either by the MAS5 algorithm or by other probe-level algorithms, for example RMA (robust multichip average). Experiment size greatly affects the ability to reproducibly detect significant differences, and also impacts the effect of filtering; smaller experiments (3–5 samples per treatment group) benefit from more restrictive filtering (≥50% Present).
CONCLUSION: Use of a threshold fraction of Present detection calls (derived by MAS5) provided a simple method that effectively eliminated from analysis probe sets that are unlikely to be reliable while preserving the most significant probe sets and those turned on or off; it thereby increased the ratio of true positives to false positives
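The filtering rule described in this abstract (keep a probe set if the fraction of arrays called Present reaches a threshold in at least one treatment group) can be illustrated with a minimal Python sketch. The function name, the data layout, and the choice to count Marginal ("M") calls as not Present are assumptions made for illustration; they are not taken from the paper.

```python
def fraction_present_filter(calls, groups, threshold=0.5):
    """Return the probe sets that pass a fraction-Present filter.

    calls:     dict mapping probe set ID -> list of detection calls,
               one per array ("P" = Present, "M" = Marginal, "A" = Absent).
    groups:    dict mapping treatment group name -> list of array indices.
    threshold: minimum fraction of Present calls required in at least
               one treatment group (0.5 is only an example value).

    Assumption: only "P" counts as Present; "M" is treated like "A".
    """
    kept = []
    for probe_set, probe_calls in calls.items():
        for array_indices in groups.values():
            n_present = sum(1 for i in array_indices if probe_calls[i] == "P")
            if n_present / len(array_indices) >= threshold:
                kept.append(probe_set)
                break  # one qualifying group is enough
    return kept


# Hypothetical example: six arrays, three per treatment group.
example_calls = {
    "ps1": ["P", "P", "A", "A", "A", "A"],  # 2/3 Present in control
    "ps2": ["A", "A", "P", "A", "A", "A"],  # at most 1/3 Present
    "ps3": ["A", "A", "A", "P", "P", "P"],  # turned on in treated
}
example_groups = {"control": [0, 1, 2], "treated": [3, 4, 5]}
kept_probe_sets = fraction_present_filter(example_calls, example_groups)
```

Because the test is applied per group rather than across all arrays, a probe set that is Absent in one treatment and Present in the other (such as the hypothetical "ps3") is retained, which matches the abstract's point about preserving probe sets that are turned on or off.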
Mapping of trans-acting regulatory factors from microarray data
To explore the mapping of factors regulating gene expression, we have carried out linkage studies using expression data from individual transcripts (from Affymetrix microarrays; Genetic Analysis Workshop 15 Problem 1) and composite data on correlated groups of transcripts. Quality measures for the arrays were used to remove outliers, and arrays with sex mismatches were also removed. Data likely to represent noise were removed by setting a minimum threshold of present calls among the non-redundant set of 190 arrays. SOLAR was used for genetic analysis, with MAS5 signal as the measure of expression. Probe sets with larger CVs generated more linkages (LOD > 2.0). While trans linkages predominated, linkages with the largest LOD scores (>4) were mostly cis. Hierarchical clustering was used to generate correlated groups of genes. We tested four composite measures of expression for the clusters. The average signal, average normalized signal, and the first principal component of the data behaved similarly; in 8/19 clusters tested, the composite measures linked to a region to which some individual probe sets within the cluster also linked. The second principal component only produced one linkage with LOD > 2. One cluster based upon chromosomal location, containing histone genes, linked to two trans regions. This work demonstrates that composite measures for genes with correlated expression can be used to identify loci that affect multiple co-expressed genes
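As a rough illustration of three of the composite measures named in this abstract (average signal, average normalized signal, and first principal component), the following Python sketch computes one value per array for a cluster of probe sets. The abstract does not say how the signal was normalized or how the principal component was computed, so the per-probe-set z-score and the power-iteration method used here are assumptions, as are all function and variable names.

```python
import math
import statistics


def composite_measures(signal, iterations=200):
    """Compute per-array composite measures for one cluster.

    signal: list of rows, one row per probe set, each row holding the
            expression value of that probe set on every array.
    Assumes every probe set varies across arrays (non-zero variance).
    Returns (average, average_normalized, first_pc), each a list with
    one value per array.
    """
    n_arrays = len(signal[0])
    means = [statistics.fmean(row) for row in signal]
    sds = [statistics.stdev(row) for row in signal]
    centered = [[x - m for x in row] for row, m in zip(signal, means)]

    # Average signal across the probe sets in the cluster.
    average = [statistics.fmean(col) for col in zip(*signal)]

    # Average of z-scored signal (assumed normalization).
    normalized = [[c / sd for c in row] for row, sd in zip(centered, sds)]
    average_normalized = [statistics.fmean(col) for col in zip(*normalized)]

    # First principal component scores via power iteration on the
    # centered data; the sign of a principal component is arbitrary.
    weights = [1.0 / math.sqrt(len(signal))] * len(signal)
    for _ in range(iterations):
        scores = [sum(w * row[j] for w, row in zip(weights, centered))
                  for j in range(n_arrays)]
        updated = [sum(s * x for s, x in zip(scores, row)) for row in centered]
        norm = math.sqrt(sum(v * v for v in updated))
        weights = [v / norm for v in updated]
    first_pc = [sum(w * row[j] for w, row in zip(weights, centered))
                for j in range(n_arrays)]
    return average, average_normalized, first_pc


# Hypothetical cluster: three perfectly correlated probe sets, four arrays.
cluster = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.0],
           [1.5, 2.5, 3.5, 4.5]]
average, average_normalized, first_pc = composite_measures(cluster)
```

Each returned list could then serve as a single quantitative trait in a linkage analysis in place of an individual probe set's signal.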
Expand your research: Next-Gen Sequencing, Genotyping, Gene Expression, and Epigenetics at the Center for Medical Genomics at IUSM
Poster abstract: The Center for Medical Genomics (CMG) provides Indiana researchers with next-generation sequencing, SNP genotyping, gene expression and epigenetics. We provide expertise in experimental design, carry out the procedures, and assist with analyses and interpretation. These state-of-the-art technologies have enabled a large number of grants to be funded over the years, and have resulted in a very large number of publications. Our next-generation sequencing technology includes SOLiD 5500xl, Ion Proton and Ion Torrent PGM (Personal Genome Machine). This set of instruments covers a wide range of next-generation sequencing capabilities from small bacterial genomes to the whole human genome, transcriptome (total RNA), small RNA, targeted DNA fragments, exome, ChIP-seq, and methyl-seq, with high sequencing accuracy. We have generated sequencing data for 74 projects over the past two to three years. Our SNP genotyping facility, using the Sequenom MassArray platform, specializes in targeted genotyping of 20-30 SNPs per assay and is an excellent choice for candidate gene studies and for following up results from GWAS and next-generation sequencing. It has been a central part of several large, multi-site collaborative genetic studies, including Genetics of Alcoholism (COGA), bipolar disorder, osteoporosis and hypertension, as well as many smaller projects; it is most efficient for sets of approximately 370 samples. We have produced more than 20 million targeted SNP genotypes to date. This platform is also capable of measuring allele-specific gene expression and targeted quantitative DNA methylation for epigenetic studies. For many projects, microarrays offer a good alternative to next-generation sequencing for measuring gene expression. We use Affymetrix GeneChip microarrays, which can measure expression of nearly all genes in humans (and all exons within them), rats, mice and most model organisms, as well as expression of miRNAs.
We can also use RNA extracted from FFPE samples. We have carried out more than 6,700 GeneChip hybridizations to date in support of many different projects. The CMG partners with the Center for Computational Biology and Bioinformatics for data analysis. We are recognized as a Core Facility of the Indiana CTSI and are available to faculty not only from IU and IUPUI, but also from Purdue and Notre Dame Universities
Estrogen receptor-dependent attenuation of hypoxia-induced changes in the lung genome of pulmonary hypertension rats
17β-estradiol (E2) exerts complex and context-dependent effects in pulmonary hypertension. In hypoxia-induced pulmonary hypertension (HPH), E2 attenuates lung vascular remodeling through estrogen receptor (ER)-dependent effects; however, ER target genes in the hypoxic lung remain unknown. In order to identify the genome regulated by the E2-ER axis in the hypoxic lung, we performed a microarray analysis in lungs from HPH rats treated with E2 (75 mcg/kg/day) ± ER-antagonist ICI182,780 (3 mg/kg/day). Untreated HPH rats and normoxic rats served as controls. Using a false discovery rate of 10%, we identified a significantly differentially regulated genome in E2-treated versus untreated hypoxia rats. Genes most upregulated by E2 encoded matrix metalloproteinase 8, S100 calcium binding protein A8, and IgA Fc receptor; genes most downregulated by E2 encoded olfactory receptor 63, secreted frizzled-related protein 2, and thrombospondin 2. Several genes affected by E2 changed in the opposite direction after ICI182,780 co-treatment, indicating an ER-regulated genome in HPH lungs. The bone morphogenetic protein antagonist Grem1 (gremlin 1) was upregulated by hypoxia, but found to be among the most downregulated genes after E2 treatment. Gremlin 1 protein was reduced in E2-treated versus untreated hypoxic animals, and ER-blockade abolished the inhibitory effect of E2 on Grem1 mRNA and protein. In conclusion, E2 ER-dependently regulates several genes involved in proliferative and inflammatory processes during hypoxia. Gremlin 1 is a novel target of the E2-ER axis in HPH. Understanding the mechanisms of E2 gene regulation in HPH may allow for selectively harnessing beneficial transcriptional activities of E2 for therapeutic purposes
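The abstract above applies a false discovery rate of 10% but does not name the procedure used. As one common possibility, the Python sketch below shows the Benjamini-Hochberg step-up procedure; this choice of method, the function name, and the example p-values are assumptions for illustration only and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Flag p-values that are significant at the given false discovery rate.

    Benjamini-Hochberg step-up procedure: sort the m p-values, find the
    largest rank k such that p_(k) <= fdr * k / m, and call the k smallest
    p-values significant. Returns a list of booleans in the input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= fdr * rank / m:
            cutoff_rank = rank  # keep the largest rank that qualifies
    significant = [False] * m
    for index in order[:cutoff_rank]:
        significant[index] = True
    return significant


# Hypothetical per-gene p-values from a treated-versus-untreated comparison.
example_p_values = [0.001, 0.04, 0.03, 0.5, 0.2]
flags = benjamini_hochberg(example_p_values, fdr=0.10)
```

With five tests, the per-rank thresholds are 0.02, 0.04, 0.06, 0.08 and 0.10, so the three smallest p-values in this made-up example pass and the two largest do not.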
Reproducibility of oligonucleotide arrays using small samples
BACKGROUND: Low RNA yields from small tissue samples can limit the use of oligonucleotide microarrays (Affymetrix GeneChips®). Methods using less cRNA for hybridization or amplifying the cRNA have been reported to reduce the number of transcripts detected, but the effect on realistic experiments designed to detect biological differences has not been analyzed. We systematically explore the effects of using different starting amounts of RNA on the ability to detect differential gene expression. RESULTS: The standard Affymetrix protocol can be used starting with only 2 micrograms of total RNA, with results equivalent to the recommended 10 micrograms. Biological variability is much greater than the technical variability introduced by this change. A simple amplification protocol described here can be used for samples as small as 0.1 micrograms of total RNA. This amplification protocol allows detection of a substantial fraction of the significant differences found using the standard protocol, despite an increase in variability and the 5' truncation of the transcripts, which prevents detection of a subset of genes. CONCLUSIONS: Biological differences in a typical experiment are much greater than differences resulting from technical manipulations in labeling and hybridization. The standard protocol works well with 2 micrograms of RNA, and with minor modifications could allow the use of samples as small as 1 microgram. For smaller amounts of starting material, down to 0.1 micrograms of RNA, differential gene expression can still be detected using the single-cycle amplification protocol. Comparisons of groups of four arrays detect many more significant differences than comparisons of three arrays