Using GWAS Data to Identify Copy Number Variants Contributing to Common Complex Diseases
Copy number variants (CNVs) account for more polymorphic base pairs in the
human genome than do single nucleotide polymorphisms (SNPs). CNVs encompass
genes as well as noncoding DNA, making these polymorphisms good candidates for
functional variation. Consequently, most modern genome-wide association studies
test CNVs along with SNPs, after inferring copy number status from the data
generated by high-throughput genotyping platforms. Here we give an overview of
CNV genomics in humans, highlighting patterns that inform methods for
identifying CNVs. We describe how genotyping signals are used to identify CNVs
and provide an overview of existing statistical models and methods used to
infer location and carrier status from such data, especially the most commonly
used methods exploring hybridization intensity. We compare the power of such
methods with the alternative method of using tag SNPs to identify CNV carriers.
As such methods are only powerful when applied to common CNVs, we describe two
alternative approaches that can be informative for identifying rare CNVs
contributing to disease risk. We focus particularly on methods identifying de
novo CNVs and show that such methods can be more powerful than case-control
designs. Finally, we present some recommendations for identifying CNVs
contributing to common complex disorders.
Comment: Published at http://dx.doi.org/10.1214/09-STS304 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Generalized Species Sampling Priors with Latent Beta reinforcements
Many popular Bayesian nonparametric priors can be characterized in terms of
exchangeable species sampling sequences. However, in some applications,
exchangeability may not be appropriate. We introduce a novel and
probabilistically coherent family of non-exchangeable species sampling
sequences characterized by a tractable predictive probability function with
weights driven by a sequence of independent Beta random variables. We compare
their theoretical clustering properties with those of the Dirichlet Process and
the two-parameter Poisson-Dirichlet process. Unlike existing work, the
proposed construction provides a complete characterization of the joint
process. We then propose the use of such a process as a prior distribution in
a hierarchical Bayes modeling framework, and we describe a Markov Chain Monte
Carlo sampler for posterior inference. We evaluate the performance of the prior
and the robustness of the resulting inference in a simulation study, providing
a comparison with popular Dirichlet Process mixtures and Hidden Markov
Models. Finally, we develop an application to the detection of chromosomal
aberrations in breast cancer by leveraging array CGH data.
Comment: For correspondence purposes, Edoardo M. Airoldi's email is
[email protected]; Federico Bassetti's email is
[email protected]; Michele Guindani's email is
[email protected]; Fabrizio Leisen's email is
[email protected]. To appear in the Journal of the American
Statistical Association.
Copy number variants and selective sweeps in natural populations of the house mouse (Mus musculus domesticus)
Copy-number variants (CNVs) may play an important role in early adaptations, potentially facilitating rapid divergence of populations. We describe an approach to study this question by investigating CNVs present in natural populations of mice in the early stages of divergence and their involvement in selective sweeps. We analyzed individuals from two recently diverged natural populations of the house mouse (Mus musculus domesticus) from Germany and France using custom, high-density comparative genomic hybridization (CGH) arrays that covered almost 164 Mb and 2444 genes. Of those genes, 1861 had previously been identified as differentially expressed between these populations, while the expression of the remaining genes was invariant. In total, we identified 1868 CNVs across all 10 samples, ranging from 200 bp to 600 kb in size and affecting 424 genic regions. Roughly two thirds of all CNVs found were deletions. We found no enrichment of CNVs among the genes differentially expressed between the populations compared to the invariant ones, nor any meaningful correlation between CNVs and gene expression changes. Among the CNV genes, cellular component gene ontology categories of the synapse were overrepresented relative to all 2444 genes tested. To investigate the potential adaptive significance of the CNV regions, we selected six that showed large differences in CNV frequency between the two populations and analyzed variation in at least two microsatellites surrounding each locus in a sample of 46 unrelated animals from the same populations collected in field trappings. We identified two loci with large differences in microsatellite heterozygosity (Sfi1 and Glo1/Dnahc8 regions) and one locus with low variation across the populations (Cmah), suggesting that these genomic regions might have recently undergone selective sweeps.
Interestingly, the Glo1 CNV has previously been implicated in anxiety-like behavior in mice, suggesting a differential evolution of a behavioral trait.
Genomic and Expression Analysis of the 12p11-p12 Amplicon Using EST Arrays Identifies Two Novel Amplified and Overexpressed Genes
We performed parallel array comparative genomic hybridization and array expression analysis of the 12p11-p12 amplicon in human testicular seminomas and an ovarian carcinoma cell line using an expressed sequence tags (ESTs) array spotted with 8254 ESTs. The data were normalized using robust statistical modeling and the significance inferred from the local SD. We identified two ESTs within the chromosomal amplicon that were amplified and overexpressed in >75–100% of analyzed tumors with the 12p11-p12 amplicon. These sequences, belonging to coding regions of two novel genes designated here as GCT1 and GCT2, were broadly expressed in a panel of human tissues, including testis and ovary. GCT1 and GCT2 were overexpressed in 92 and 71%, respectively, of a panel of seminomas tested. Combined array comparative genomic hybridization and array expression analysis is a valid approach for gene discovery in large chromosomal amplicons.
Multiple putative oncogenes at the chromosome 20q amplicon contribute to colorectal adenoma to carcinoma progression
Objective: This study aimed to identify the oncogenes at 20q involved in colorectal adenoma to carcinoma progression by measuring the effect of 20q gain on mRNA expression of genes in this amplicon.
Methods: Segmentation of DNA copy number changes on 20q was performed by array CGH (comparative genomic hybridisation) in 34 non-progressed colorectal adenomas, 41 progressed adenomas (i.e., adenomas that present a focus of cancer) and 33 adenocarcinomas. In addition, a robust analysis of altered expression of genes in these segments was performed by microarray analysis in 37 adenomas and 31 adenocarcinomas. Protein expression was evaluated by immunohistochemistry on tissue microarrays.
Results: The genes C20orf24, AURKA, RNPC1, TH1L, ADRM1, C20orf20 and TCFL5, mapping at 20q, were significantly overexpressed in carcinomas compared with adenomas as a consequence of copy number gain of 20q.
Conclusion: This approach revealed C20orf24, AURKA, RNPC1, TH1L, ADRM1, C20orf20 and TCFL5 genes to be important in chromosomal instability-related adenoma to carcinoma progression. These genes therefore may serve as highly specific biomarkers for colorectal cancer with potential clinical applications.
Gene amplifications associated with the development of hormone-resistant prostate cancer
Purpose: Hormone resistance remains a significant clinical problem in prostate cancer with few therapeutic options. Research into mechanisms of hormone resistance is essential.
Experimental Design: We analyzed 38 paired (prehormone/posthormone resistance) prostate cancer samples using the Vysis GenoSensor. Archival microdissected tumor DNA was extracted, amplified, labeled, and hybridized to AmpliOnc I DNA microarrays containing 57 oncogenes.
Results: Genetic instability increased during progression from hormone-sensitive to hormone-resistant cancer (P = 0.008). Amplification frequencies of 15 genes (TERC, MYBL3, HRAS, PI3KCA, JUNB, LAMC2, RAF1, MYC, GARP, SAS, FGFR1, PGY1, MYCL1, MYB, FGR) increased by greater than 10% during hormone escape. Receptor tyrosine kinases were amplified in 73% of cases; this was unrelated to development of hormone resistance. However, downstream receptor tyrosine kinase signaling pathways showed increased amplification rates in resistant tumors for the mitogen-activated protein kinase (FGR/Src-2, HRAS, and RAF1; P = 0.005) and phosphatidylinositol 3'-kinase pathways (FGR/Src-2, PI3K, and Akt; P = 0.046). Transcription factors regulated by these pathways were also more frequently amplified after escape (MYC family: 21% before versus 63% after, P = 0.027; MYB family: 26% before versus 53% after, P = 0.18).
Conclusions: Development of clinical hormone escape is linked to phosphatidylinositol 3'-kinase and mitogen-activated protein kinase pathways. These pathways may function independently of the androgen receptor or via androgen receptor activation by phosphorylation, providing novel therapeutic targets.