Missing Heritability in the Tails of Quantitative Traits? A Simulation Study on the Impact of Slightly Altered True Genetic Models
Objective: Genome-wide association studies have identified robust associations between single nucleotide polymorphisms and complex traits. As the proportion of phenotypic variance explained is still limited for most of the traits, larger and larger meta-analyses are being conducted to detect additional associations. Here we investigate the impact of the study design and the underlying assumption about the true genetic effect in a bimodal mixture situation on the power to detect associations. Methods: We performed simulations of quantitative phenotypes analysed by standard linear regression and dichotomized case-control data sets from the extremes of the quantitative trait analysed by standard logistic regression. Results: Using linear regression, markers with an effect in the extremes of the traits were almost undetectable, whereas analysing extremes by case-control design had superior power even for much smaller sample sizes. Two real data examples are provided to support our theoretical findings and to explore our mixture and parameter assumption. Conclusions: Our findings support the idea to re-analyse the available meta-analysis data sets to detect new loci in the extremes. Moreover, our investigation offers an explanation for discrepant findings when analysing quantitative traits in the general population and in the extremes. Copyright (C) 2011 S. Karger AG, Base
Likelihood Ratio Test process for Quantitative Trait Locus detection
We consider the likelihood ratio test (LRT) process related to the test of the absence of a QTL (a QTL denotes a quantitative trait locus, i.e. a gene with quantitative effect on a trait) on the interval [0,T] representing a chromosome. The observation is the trait and the composition of the genome at some locations called "markers". We give the asymptotic distribution of this LRT process under the null hypothesis that there is no QTL on [0,T] and under local alternatives with a QTL at t* on [0,T]. We show that the LRT is asymptotically the square of some Gaussian process. We give a description of this process as a "non-linear interpolated and normalized process". We propose a simple method to calculate the maximum of the LRT process using only statistics on markers and their ratio. This gives a new method to calculate thresholds for QTL detection.
Safe and complete contig assembly via omnitigs
Contig assembly is the first stage that most assemblers solve when
reconstructing a genome from a set of reads. Its output consists of contigs --
a set of strings that are promised to appear in any genome that could have
generated the reads. From the introduction of contigs 20 years ago, assemblers
have tried to obtain longer and longer contigs, but the following question was
never solved: given a genome graph (e.g. a de Bruijn, or a string graph),
what are all the strings that can be safely reported from it as contigs? In
this paper we finally answer this question, and also give a polynomial time
algorithm to find them. Our experiments show that these strings, which we call
omnitigs, are 66% to 82% longer on average than the popular unitigs, and 29% of
dbSNP locations have more neighbors in omnitigs than in unitigs. Comment: Full version of the paper in the proceedings of RECOMB 201
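The unitigs used as the baseline in this abstract are commonly defined as the maximal non-branching paths of the assembly graph. A minimal sketch of that baseline follows, assuming a node-centric de Bruijn graph built from error-free k-mers of a single strand (reverse complements are ignored). The function name `unitigs` is hypothetical, and this is not the omnitig algorithm of the paper.

```python
from collections import defaultdict


def unitigs(kmers):
    """Return the maximal non-branching paths of a node-centric de Bruijn graph.

    Assumption: nodes are (k-1)-mers, each distinct k-mer is one edge,
    and only the forward strand is considered.
    """
    out_edges = defaultdict(list)   # (k-1)-mer -> successor (k-1)-mers
    in_degree = defaultdict(int)
    nodes = set()
    for kmer in sorted(set(kmers)):
        prefix, suffix = kmer[:-1], kmer[1:]
        out_edges[prefix].append(suffix)
        in_degree[suffix] += 1
        nodes.update((prefix, suffix))

    def is_simple(node):
        # Exactly one way in and one way out: the path cannot branch here.
        return in_degree[node] == 1 and len(out_edges[node]) == 1

    paths, visited = [], set()

    # Paths that start at a branching node, a source, or a sink.
    for start in sorted(nodes):
        if is_simple(start):
            continue
        for nxt in out_edges[start]:
            path = start + nxt[-1]
            while is_simple(nxt):
                visited.add(nxt)
                nxt = out_edges[nxt][0]
                path += nxt[-1]
            paths.append(path)

    # Isolated cycles made only of simple nodes.
    for start in sorted(nodes):
        if is_simple(start) and start not in visited:
            visited.add(start)
            path, nxt = start, out_edges[start][0]
            while nxt != start:
                visited.add(nxt)
                path += nxt[-1]
                nxt = out_edges[nxt][0]
            path += nxt[-1]  # close the cycle back at the start node
            paths.append(path)

    return sorted(paths)
```

For example, the 3-mers `ACG`, `CGT`, `GTT`, `GTA` branch at node `GT`, so the sketch reports `ACGT` up to the branch and then one short path per outgoing edge.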
Enrichment analysis of Alu elements with different spatial chromatin proximity in the human genome
Transposable elements (TEs) have no longer been regarded purely as "junk DNA" for some time, given the continual discoveries of their multifunctional roles in eukaryote genomes. As one of the most important and abundant TEs that are still active in the human genome, Alu, a SINE family, has demonstrated its indispensable regulatory functions at the sequence level, but its spatial roles are still unclear. Technologies based on 3C (chromosome conformation capture) have revealed the mysterious three-dimensional structure of chromatin and make it possible to study distal chromatin interactions in the genome. To find the role TEs play in distal regulation in the human genome, we compiled newly released Hi-C data, TE annotation, histone marker annotations, and genome-wide methylation data to perform a correlation analysis, and found that the density of Alu elements showed a strong positive correlation with the level of chromatin interactions (hESC: r = 0.9, P < 2.2 × 10−16; IMR90 fibroblasts: r = 0.94, P < 2.2 × 10−16) and also has a significant positive correlation with some remote functional DNA elements like enhancers and promoters (Enhancer: hESC: r = 0.997, P = 2.3 × 10−4; IMR90: r = 0.934, P = 2 × 10−2; Promoter: hESC: r = 0.995, P = 3.8 × 10−4; IMR90: r = 0.996, P = 3.2 × 10−4). Further investigation involving GC content and methylation status showed that the GC content of Alu-covered sequences shared a similar pattern with that of the overall sequence, suggesting that Alu elements also function as a provider of GC nucleotides and CpG sites. In all, our results suggest that Alu elements may act as an alternative parameter to evaluate Hi-C data, which is confirmed by the correlation analysis of Alu elements and histone markers. Moreover, the GC-rich Alu sequence can bring high GC content and methylation flexibility to the regions with more distal chromatin contact, regulating the transcription of tissue-specific genes.
Uncovering regulatory pathways that affect hematopoietic stem cell function using 'genetical genomics'
We combined large-scale mRNA expression analysis and gene mapping to identify genes and loci that control hematopoietic stem cell (HSC) function. We measured mRNA expression levels in purified HSCs isolated from a panel of densely genotyped recombinant inbred mouse strains. We mapped quantitative trait loci (QTLs) associated with variation in expression of thousands of transcripts. By comparing the physical transcript position with the location of the controlling QTL, we identified polymorphic cis-acting stem cell genes. We also identified multiple trans-acting control loci that modify expression of large numbers of genes. These groups of coregulated transcripts identify pathways that specify variation in stem cells. We illustrate this concept with the identification of candidate genes involved with HSC turnover. We compared expression QTLs in HSCs and brain from the same mice and identified both shared and tissue-specific QTLs. Our data are accessible through WebQTL, a web-based interface that allows custom genetic linkage analysis and identification of coregulated transcripts.
Viral population estimation using pyrosequencing
The diversity of virus populations within single infected hosts presents a
major difficulty for the natural immune response as well as for vaccine design
and antiviral drug therapy. Recently developed pyrophosphate based sequencing
technologies (pyrosequencing) can be used for quantifying this diversity by
ultra-deep sequencing of virus samples. We present computational methods for
the analysis of such sequence data and apply these techniques to pyrosequencing
data obtained from HIV populations within patients harboring drug resistant
virus strains. Our main result is the estimation of the population structure of
the sample from the pyrosequencing reads. This inference is based on a
statistical approach to error correction, followed by a combinatorial algorithm
for constructing a minimal set of haplotypes that explain the data. Using this
set of explaining haplotypes, we apply a statistical model to infer the
frequencies of the haplotypes in the population via an EM algorithm. We
demonstrate that pyrosequencing reads allow for effective population
reconstruction by extensive simulations and by comparison to 165 sequences
obtained directly from clonal sequencing of four independent, diverse HIV
populations. Thus, pyrosequencing can be used for cost-effective estimation of
the structure of virus populations, promising new insights into viral
evolutionary dynamics and disease control strategies. Comment: 23 pages, 13 figure
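The final step described above, inferring haplotype frequencies via an EM algorithm, can be illustrated with a minimal sketch. It assumes a deliberately simplified model that is not the authors' statistical model: reads are already error-corrected, and each read is summarized only by the set of candidate haplotypes it is compatible with. The function name `em_haplotype_frequencies` and its parameters are hypothetical.

```python
def em_haplotype_frequencies(compatible, n_haplotypes, n_iter=200):
    """Estimate haplotype frequencies from read-to-haplotype compatibility.

    compatible: one list per read, holding the indices of the haplotypes
                that read could have come from (assumed error-free).
    """
    freqs = [1.0 / n_haplotypes] * n_haplotypes  # uniform starting point
    for _ in range(n_iter):
        # E-step: split each read across its compatible haplotypes in
        # proportion to the current frequency estimates.
        counts = [0.0] * n_haplotypes
        for haps in compatible:
            total = sum(freqs[h] for h in haps)
            if total == 0.0:
                continue  # read explained by no haplotype with support
            for h in haps:
                counts[h] += freqs[h] / total
        # M-step: the new frequencies are the normalized expected counts.
        assigned = sum(counts)
        if assigned == 0.0:
            raise ValueError("no read is compatible with any haplotype")
        freqs = [c / assigned for c in counts]
    return freqs
```

With six reads unique to haplotype 0, two unique to haplotype 1, and two ambiguous reads, the ambiguous reads carry no information about the split, so the estimate converges to the ratio of the unique reads (0.75 and 0.25).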
Parameter Estimation and Quantitative Parametric Linkage Analysis with GENEHUNTER-QMOD
Objective: We present a parametric method for linkage analysis of quantitative phenotypes. The method provides a test for linkage as well as an estimate of different phenotype parameters. We have implemented our new method in the program GENEHUNTER-QMOD and evaluated its properties by performing simulations. Methods: The phenotype is modeled as a normally distributed variable, with a separate distribution for each genotype. Parameter estimates are obtained by maximizing the LOD score over the normal distribution parameters with a gradient-based optimization called PGRAD method. Results: The PGRAD method has lower power to detect linkage than the variance components analysis (VCA) in case of a normal distribution and small pedigrees. However, it outperforms the VCA and Haseman-Elston regression for extended pedigrees, nonrandomly ascertained data and non-normally distributed phenotypes. Here, the higher power even goes along with conservativeness, while the VCA has an inflated type I error. Parameter estimation tends to underestimate residual variances but performs better for expectation values of the phenotype distributions. Conclusion: With GENEHUNTER-QMOD, a powerful new tool is provided to explicitly model quantitative phenotypes in the context of linkage analysis. It is freely available at http://www.helmholtz-muenchen.de/genepi/downloads. Copyright (C) 2012 S. Karger AG, Base
Characteristics of transposable element exonization within human and mouse
Insertion of transposed elements within mammalian genes is thought to be an
important contributor to mammalian evolution and speciation. Insertion of
transposed elements into introns can lead to their activation as alternatively
spliced cassette exons, an event called exonization. Elucidation of the
evolutionary constraints that have shaped fixation of transposed elements
within human and mouse protein coding genes and subsequent exonization is
important for understanding of how the exonization process has affected
transcriptome and proteome complexities. Here we show that exonization of
transposed elements is biased towards the beginning of the coding sequence in
both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs)
revealed that exonization of transposed elements can be population-specific,
implying that exonizations may enhance divergence and lead to speciation. SNP
density analysis revealed differences between Alu and other transposed
elements. Finally, we identified cases of primate-specific Alu elements that
depend on RNA editing for their exonization. These results shed light on TE
fixation and the exonization process within human and mouse genes. Comment: 11 pages, 4 figure
Linkage disequilibrium in young genetically isolated Dutch population
The design and feasibility of genetic studies of complex diseases are critically dependent on the extent and distribution of linkage disequilibrium (LD) across the genome and between different populations. We have examined genomewide and region-specific LD in a young genetically isolated population identified in the Netherlands by genotyping approximately 800 Short Tandem Repeat markers distributed genomewide across 58 individuals. Several regions were an
