Single nucleotide polymorphisms affect both cis- and trans-eQTLs
Single nucleotide polymorphisms (SNPs) between microarray probes and RNA targets can affect the performance of expression arrays by weakening hybridization. In this paper, we examined the effect of SNPs on Affymetrix GeneChip probe set summaries and on expression quantitative trait loci (eQTL) mapping results in two eQTL datasets, one from mouse and one from human. We showed that removing SNP-containing probes significantly changed the probe set summaries, and that the more SNP-containing probes we removed, the greater the change. Comparing the eQTL mapping results obtained with and without SNP-containing probes showed that less than 70% of the significant eQTL peaks were concordant, regardless of the significance threshold. These results indicate that SNPs affect both probe set summaries and eQTLs (both cis and trans); SNP-containing probes should therefore be filtered out to improve the performance of eQTL mapping.
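The filtering step and the concordance comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the summarization algorithm or define concordance, so the median of log2 intensities stands in for the probe set summary and concordance is taken as shared peaks divided by all peaks found in either analysis. The function names and the example probe identifiers are hypothetical.

```python
from math import log2
from statistics import median


def summarize_probe_set(intensities, snp_probes=frozenset()):
    """Summarize one probe set as the median log2 intensity.

    `intensities` maps probe id -> raw intensity. Probes listed in
    `snp_probes` are dropped before summarizing. The median is an
    assumed stand-in, not the summary method used in the paper.
    """
    kept = [log2(v) for probe, v in intensities.items() if probe not in snp_probes]
    if not kept:
        return None  # every probe in the set overlapped a SNP
    return median(kept)


def peak_concordance(peaks_all_probes, peaks_filtered):
    """Fraction of significant eQTL peaks shared by the two analyses.

    Assumed definition: |intersection| / |union| of the two peak sets.
    """
    union = peaks_all_probes | peaks_filtered
    if not union:
        return 1.0
    return len(peaks_all_probes & peaks_filtered) / len(union)


# Hypothetical probe set in which probe "p3" overlaps a SNP.
probe_set = {"p1": 1024, "p2": 512, "p3": 64}
with_snp_probes = summarize_probe_set(probe_set)
without_snp_probes = summarize_probe_set(probe_set, snp_probes={"p3"})
```

In this toy probe set the weakly hybridizing probe pulls the summary down, so dropping it raises the summary value. That is the kind of shift the abstract reports at the probe set level.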
COGA phenotypes and linkages on chromosome 2
An initial linkage analysis of the alcoholism phenotype as defined by DSM-III-R criteria and alcoholism defined by DSM-IV criteria showed many, sometimes striking, inconsistencies. These inconsistencies are greatly reduced by making the definition of alcoholism more specific. We defined new phenotypes combining the alcoholism definitions and the latent variables, defining an individual as affected if that individual is alcoholic under one of the definitions (either DSM-III-R or DSM-IV) and also indicated having a symptom defined by one of the latent variables. This was done for each of the two alcoholism definitions and five latent variables, selected from a canonical discriminant analysis indicating that they formed significant groupings using the electrophysiological variables. We found that linkage analyses utilizing these latent variables were much more robust and consistent than the linkage results based on the DSM-III-R or DSM-IV criteria for the definition of alcoholism. We also performed linkage analyses on two phenotypes derived from first principal components, one derived from the electrophysiological variables and the other from the latent variables. A region on chromosome 2 at 250 cM was found to be linked to both of these derived phenotypes. Further examination of the SNPs in this region identified several haplotypes strongly associated with these derived phenotypes.
Randomization Tests for Small Samples: An Application for Genetic Expression Data
An advantage of randomization tests for small samples is that an exact P-value can be computed under an additive model. A disadvantage with very small sample sizes is that the resulting discrete distribution for P-values can make it mathematically impossible for a P-value to attain a particular degree of significance. We investigate a distribution of P-values that arises when several thousand randomization tests are conducted simultaneously using small samples, a situation that arises with microarray gene expression data. We show that the distribution yields valuable information regarding groups of genes that are differentially expressed between two groups: a treatment group and a control group. This distribution helps to categorize genes with varying degrees of overlap of genetic expression values between the two groups, and it helps to quantify the degree of overlap by using the P-value from a randomization test. Moreover, a statistical test is available that compares the actual distribution of P-values with an expected distribution if there are no genes that are differentially expressed. We demonstrate the method and illustrate the results by using a microarray data set involving a cell line for rheumatoid arthritis. A small simulation study evaluates the effect that correlated gene expression levels could have on results from the analysis.
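The discreteness problem described above can be made concrete with a minimal sketch of an exact two-group randomization test. The test statistic (absolute difference in group means) and the function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import combinations
from math import comb


def exact_randomization_pvalue(treatment, control):
    """Two-sided exact P-value from enumerating every group assignment.

    Assumed statistic: absolute difference in group means.
    """
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    n_c = len(pooled) - n_t
    total = sum(pooled)

    def mean_diff(indices):
        s = sum(pooled[i] for i in indices)
        return s / n_t - (total - s) / n_c

    observed = abs(mean_diff(range(n_t)))
    extreme = 0
    n_splits = 0
    # Enumerate every way to relabel n_t of the pooled values as "treatment".
    for idx in combinations(range(len(pooled)), n_t):
        n_splits += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


def smallest_one_sided_pvalue(n_treatment, n_control):
    """Lower bound on a one-sided P-value: one split out of all possible splits."""
    return 1 / comb(n_treatment + n_control, n_treatment)


# Complete separation with three samples per group.
p_separated = exact_randomization_pvalue([5, 6, 7], [1, 2, 3])
```

With three samples per group there are only 20 possible assignments, so even completely separated groups give a two-sided P-value of 0.1 and a one-sided P-value no smaller than 0.05. No gene can reach a stricter threshold, which is the limitation the abstract describes.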
Transcriptional reprogramming of gene expression in bovine somatic cell chromatin transfer embryos
BACKGROUND: Successful reprogramming of a somatic genome to produce a healthy clone by somatic cell nuclear transfer (SCNT) is a rare event, and the mechanisms involved in this process are poorly defined. When serial or successive rounds of cloning are performed, blastocyst and full-term development rates decline even further with increasing rounds of cloning. Identifying the "cumulative errors" could reveal the epigenetic reprogramming blocks in animal cloning. RESULTS: Bovine clones from up to four generations of successive cloning were produced by chromatin transfer (CT). Using Affymetrix bovine microarrays, we determined that the transcriptomes of blastocysts derived from the first and the fourth rounds of cloning (CT1 and CT4, respectively) have undergone extensive reprogramming and were more similar to blastocysts derived from in vitro fertilization (IVF) than to the donor cells used for the first and the fourth rounds of chromatin transfer (DC1 and DC4, respectively). However, a set of transcripts in the cloned embryos showed a misregulated pattern when compared to IVF embryos. Among the genes consistently upregulated in both CT groups compared to the IVF embryos were genes involved in regulation of cytoskeleton and cell shape. Among the genes consistently upregulated in IVF embryos compared to both CT groups were genes involved in chromatin remodelling and stress coping. CONCLUSION: The present study provides a data set that could contribute to our understanding of epigenetic errors in somatic cell chromatin transfer. Identifying "cumulative errors" after serial cloning could reveal some of the epigenetic reprogramming blocks, shedding light on the reprogramming process, which is important for both basic and applied research.
A proposed metric for assessing the measurement quality of individual microarrays
BACKGROUND: High-density microarray technology is increasingly applied to study gene expression levels on a large scale. Microarray experiments rely on several critical steps that may introduce error and uncertainty in analyses. These steps include mRNA sample extraction, amplification and labeling, hybridization, and scanning. In some cases this may be manifested as systematic spatial variation on the surface of the microarray, in which expression measurements within an individual array vary as a function of geographic position on the array surface. RESULTS: We hypothesized that an index of the degree to which gene expression measurements depend on their physical locations on an array could summarize the physical reliability of the microarray. We introduced a novel way to formulate this index using a statistical analysis tool. Our approach regressed gene expression intensity measurements on a polynomial response surface of the microarray's Cartesian coordinates. We demonstrated this method using a fixed model and presented results from real and simulated datasets. CONCLUSION: We demonstrated the potential of such a quantitative metric for assessing the reliability of individual arrays. Moreover, we showed that this procedure can be incorporated into laboratory practice as a means to set quality control specifications and as a tool to determine whether an array has sufficient quality to be retained in terms of spatial correlation of gene expression measurements.
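One way to picture the regression described above is the following sketch, which fits a polynomial surface in the array's row and column coordinates and reports the fraction of intensity variance the surface explains. The abstract does not give the polynomial degree or the exact form of the index, so the degree-2 default, the use of R-squared as the index, and the function name are all assumptions made for illustration. The sketch relies on NumPy.

```python
import numpy as np


def spatial_r_squared(intensity, degree=2):
    """R-squared of a polynomial surface fitted to a 2-D intensity grid.

    Values near 0 suggest little spatial structure; values near 1 suggest
    that position on the array explains most of the variation. Assumes the
    grid is not constant (otherwise the total sum of squares is zero).
    """
    rows, cols = intensity.shape
    y_idx, x_idx = np.mgrid[0:rows, 0:cols]
    # Scale coordinates to [-1, 1] to keep the design matrix well conditioned.
    x = 2 * x_idx.ravel() / max(cols - 1, 1) - 1
    y = 2 * y_idx.ravel() / max(rows - 1, 1) - 1
    z = intensity.ravel().astype(float)

    # All monomials x^i * y^j with i + j <= degree (includes the intercept).
    terms = [x**i * y**j for i in range(degree + 1) for j in range(degree + 1 - i)]
    design = np.column_stack(terms)

    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((z - fitted) ** 2)
    ss_tot = np.sum((z - z.mean()) ** 2)
    return 1 - ss_res / ss_tot


# Simulated arrays: pure noise versus noise plus a left-to-right gradient.
rng = np.random.default_rng(0)
noise_only = rng.normal(size=(50, 50))
with_gradient = noise_only + np.linspace(0, 10, 50)[np.newaxis, :]
r2_noise = spatial_r_squared(noise_only)
r2_gradient = spatial_r_squared(with_gradient)
```

On the simulated data the noise-only array yields an index close to zero, while the array with a gradient yields a much larger one. A laboratory could compare such an index against a quality control cutoff of its own choosing.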
The PowerAtlas: a power and sample size atlas for microarray experimental design and research
BACKGROUND: Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips needed to address the multiple hypotheses with acceptable accuracy? Statistical methods exist for calculating power based upon a single hypothesis, using estimates of the variability in data from pilot studies. There is, however, a need for methods to estimate power and/or required sample sizes in situations where multiple hypotheses are being tested, such as in microarray experiments. In addition, investigators frequently do not have pilot data to estimate the sample sizes required for microarray studies. RESULTS: To address this challenge, we have developed a Microarray PowerAtlas [1]. The atlas enables estimation of statistical power by allowing investigators to appropriately plan studies by building upon previous studies that have similar experimental characteristics. Currently, there are sample sizes and power estimates based on 632 experiments from Gene Expression Omnibus (GEO). The PowerAtlas also permits investigators to upload their own pilot data and derive power and sample size estimates from these data. This resource will be updated regularly with new datasets from GEO and other databases such as The Nottingham Arabidopsis Stock Center (NASC). CONCLUSION: This resource provides a valuable tool for investigators who are planning efficient microarray studies and estimating required sample sizes.
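For context on the single-hypothesis calculation mentioned above, here is a minimal sketch of the textbook normal-approximation sample size for comparing two group means, given a pilot estimate of the standard deviation. It is not the PowerAtlas method, which the abstract does not describe. The optional `n_tests` argument applies a simple Bonferroni adjustment, swapped in here only to show how testing many genes raises the required sample size. All names and default values are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect, sd, alpha=0.05, power=0.8, n_tests=1):
    """Approximate samples per group for a two-sided two-sample comparison.

    effect  : smallest mean difference worth detecting (same units as sd)
    sd      : pilot estimate of the within-group standard deviation
    n_tests : number of hypotheses; alpha is divided by it (Bonferroni)
    """
    adjusted_alpha = alpha / n_tests
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - adjusted_alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


single_gene = samples_per_group(effect=1.0, sd=1.0)
many_genes = samples_per_group(effect=1.0, sd=1.0, n_tests=10000)
```

For a single hypothesis with an effect of one standard deviation, this gives 16 samples per group. Adjusting for thousands of genes pushes the requirement well above that, which illustrates why power calculations designed for a single hypothesis fall short for microarray studies.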