Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity
Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate CNV-analysis approaches based on paired-end and breakpoint junction analysis, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low coverage. The inferred copy-number genotypes were highly concordant with our qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95–99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ∼15% and ∼20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies.
CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing
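The abstract above describes inferring integer copy-number genotypes from depth-of-coverage. The following is a minimal sketch of that general idea only, not CopySeq's statistical framework: it assumes read depth scales linearly with copy number and simply rounds the scaled ratio. The function name `copy_number_genotype` and its parameters are hypothetical, chosen for illustration.

```python
def copy_number_genotype(observed_depth, diploid_depth, ploidy=2, max_copies=12):
    """Estimate an integer copy number for one locus in one sample.

    observed_depth: mean read depth across the locus.
    diploid_depth:  mean read depth expected for a two-copy region
                    in the same sample (illustrative assumption).
    """
    if diploid_depth <= 0:
        raise ValueError("diploid_depth must be positive")
    if observed_depth < 0:
        raise ValueError("observed_depth must be non-negative")
    # Simplifying assumption: depth is proportional to copy number.
    raw_estimate = ploidy * observed_depth / diploid_depth
    # Round to the nearest integer and clamp to a plausible range.
    nearest = int(raw_estimate + 0.5)
    return max(0, min(max_copies, nearest))


if __name__ == "__main__":
    # Hypothetical depths for a sample whose two-copy regions average 4x.
    for depth in (0.1, 2.1, 4.0, 6.2):
        print(depth, copy_number_genotype(depth, diploid_depth=4.0))
```

A genotype of 0 or 1 here would correspond to a homozygous or heterozygous deletion, the distinction the abstract highlights. A real caller would also need to model coverage noise, GC bias, and mappability, none of which this sketch attempts.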
Extensive Copy-Number Variation of Young Genes across Stickleback Populations
MM received funding from the Max Planck innovation funds for this project. PGDF was supported by a Marie Curie European Reintegration Grant (proposal nr 270891). CE was supported by German Science Foundation grants (DFG, EI 841/4-1 and EI 841/6-1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript
On the power and the systematic biases of the detection of chromosomal inversions by paired-end genome sequencing
One of the most widely used techniques to study structural variation at the genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is well suited to detecting a wide range of inversions, even with low-coverage data. However, % of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far more than increases in coverage depth or read length. Finally, we review the performance of three algorithms to detect inversions (SVDetect, GRIAL, and VariationHunter), identify common pitfalls, and reveal important differences in their breakpoint precision. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects
Drug-resistant genotypes and multi-clonality in Plasmodium falciparum analysed by direct genome sequencing from peripheral blood of malaria patients.
Naturally acquired blood-stage infections of the malaria parasite Plasmodium falciparum typically harbour multiple haploid clones. The apparent number of clones observed in any single infection depends on the diversity of the polymorphic markers used for the analysis, and the relative abundance of rare clones, which frequently fail to be detected among PCR products derived from numerically dominant clones. However, minority clones are of clinical interest as they may harbour genes conferring drug resistance, leading to enhanced survival after treatment and the possibility of subsequent therapeutic failure. We deployed new generation sequencing to derive genome data for five non-propagated parasite isolates taken directly from 4 different patients treated for clinical malaria in a UK hospital. Analysis of depth of coverage and length of sequence intervals between paired reads identified both previously described and novel gene deletions and amplifications. Full-length sequence data was extracted for 6 loci considered to be under selection by antimalarial drugs, and both known and previously unknown amino acid substitutions were identified. Full mitochondrial genomes were extracted from the sequencing data for each isolate, and these are compared against a panel of polymorphic sites derived from published or unpublished but publicly available data. Finally, genome-wide analysis of clone multiplicity was performed, and the number of infecting parasite clones estimated for each isolate. Each patient harboured at least 3 clones of P. falciparum by this analysis, consistent with results obtained with conventional PCR analysis of polymorphic merozoite antigen loci. We conclude that genome sequencing of peripheral blood P. falciparum taken directly from malaria patients provides high quality data useful for drug resistance studies, genomic structural analyses and population genetics, and also robustly represents clonal multiplicity
Consistency-based detection of potential tumor-specific deletions in matched normal/tumor genomes
Wittler R, Chauve C. Consistency-based detection of potential tumor-specific deletions in matched normal/tumor genomes. BMC Bioinformatics. 2011;12(Suppl. 9):S21
Search for Doubly-Charged Higgs Boson Production at HERA
A search for the single production of doubly-charged Higgs bosons H^{\pm \pm}
in ep collisions is presented. The signal is searched for via the Higgs decays
into a high mass pair of same charge leptons, one of them being an electron.
The analysis uses up to 118 pb^{-1} of ep data collected by the H1 experiment
at HERA. No evidence for doubly-charged Higgs production is observed and mass
dependent upper limits are derived on the Yukawa couplings h_{el} of the Higgs
boson to an electron-lepton pair. Assuming that the doubly-charged Higgs only
decays into an electron and a muon via a coupling of electromagnetic strength
h_{e \mu} = \sqrt{4 \pi \alpha_{em}} = 0.3, a lower limit of 141 GeV on the
H^{\pm\pm} mass is obtained at the 95% confidence level. For a doubly-charged
Higgs decaying only into an electron and a tau and a coupling h_{e\tau} = 0.3,
masses below 112 GeV are ruled out. Comment: 15 pages, 3 figures, 1 table
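The abstract above quotes a coupling of electromagnetic strength, h_{e \mu} = \sqrt{4 \pi \alpha_{em}} = 0.3. As a quick arithmetic check, the sketch below evaluates that expression. The value used for the fine-structure constant (about 1/137.036, its low-energy value) is an assumption, since the abstract does not say which value was used.

```python
import math

# Assumed low-energy value of the fine-structure constant (not from the abstract).
ALPHA_EM = 1 / 137.035999


def electromagnetic_strength_coupling(alpha_em=ALPHA_EM):
    """Return sqrt(4 * pi * alpha_em), the coupling quoted in the abstract."""
    return math.sqrt(4 * math.pi * alpha_em)


if __name__ == "__main__":
    print(round(electromagnetic_strength_coupling(), 3))
```

Under that assumption the result agrees with the quoted 0.3 to one significant figure.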
Identification of polymorphic inversions from genotypes
Background: Polymorphic inversions are a source of genetic variability with a direct impact on recombination frequencies. Given the difficulty of their experimental study, computational methods have been developed to infer their existence in a large number of individuals using genome-wide data of nucleotide variation. Methods based on haplotype tagging of known inversions attempt to classify individuals as having a normal or inverted allele. Other methods that measure differences in linkage disequilibrium attempt to identify regions with inversions but are unable to classify subjects accurately, an essential requirement for association studies. Results: We present a novel method to both identify polymorphic inversions from genome-wide genotype data and classify individuals as containing a normal or inverted allele. Our method, a generalization of a published method for haplotype data [1], utilizes linkage between groups of SNPs to partition a set of individuals into normal and inverted subpopulations. We employ a sliding window scan to identify regions likely to have an inversion, and accumulation of evidence from neighboring SNPs is used to accurately determine the inversion status of each subject. Further, our approach detects inversions directly from genotype data, thus increasing its usability in current genome-wide association studies (GWAS). Conclusions: We demonstrate the accuracy of our method to detect inversions and classify individuals on principled-simulated genotypes, produced by the evolution of an inversion event within a coalescent model [2]. We applied our method to real genotype data from HapMap Phase III to characterize the inversion status of two known inversions within the regions 17q21 and 8p23 across 1184 individuals. Finally, we scan the full genomes of the European Origin (CEU) and Yoruba (YRI) HapMap samples.
We find population-based evidence for 9 out of 15 well-established autosomal inversions, and for 52 regions previously predicted by independent experimental methods in ten (9+1) individuals [3,4]. We provide efficient implementations of both genotype and haplotype methods as a unified R package, inveRsion
Multi-Jet Event Rates in Deep Inelastic Scattering and Determination of the Strong Coupling Constant
Jet event rates in deep inelastic ep scattering at HERA are investigated
applying the modified JADE jet algorithm. The analysis uses data taken with the
H1 detector in 1994 and 1995. The data are corrected for detector and
hadronization effects and then compared with perturbative QCD predictions using
next-to-leading order calculations. The strong coupling constant alpha_S(M_Z^2)
is determined evaluating the jet event rates. Values of alpha_S(Q^2) are
extracted in four different bins of the negative squared momentum
transfer Q^2 in the range from 40 GeV^2 to 4000 GeV^2. A combined fit of the
renormalization group equation to these several alpha_S(Q^2) values results in
alpha_S(M_Z^2) = 0.117 +- 0.003 (stat) ^(+0.009)_(-0.013) (syst) + 0.006 (jet algorithm). Comment: 17 pages, 4 figures, 3 tables, this version to appear in Eur. Phys.
J.; it replaces first posted hep-ex/9807019 which had incorrect figure 4
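The abstract above relates alpha_S values measured at several Q^2 to alpha_S(M_Z^2) through the renormalization group equation. The sketch below illustrates that relation at leading order only; the analysis itself used next-to-leading order calculations, so this is a simplification and not the paper's procedure. The Z mass value, the fixed choice of five quark flavours, and the function name `alpha_s_one_loop` are assumptions made for illustration.

```python
import math

M_Z = 91.1876  # Z boson mass in GeV (assumed value, not from the abstract)


def alpha_s_one_loop(q2, alpha_s_mz, n_flavours=5):
    """Leading-order running of the strong coupling from M_Z^2 to q2.

    q2:         squared momentum transfer in GeV^2.
    alpha_s_mz: value of alpha_S at the scale M_Z^2.
    n_flavours: number of active quark flavours, held fixed here
                (a simplification; thresholds are ignored).
    """
    if q2 <= 0:
        raise ValueError("q2 must be positive")
    b0 = (33 - 2 * n_flavours) / (12 * math.pi)
    return alpha_s_mz / (1 + b0 * alpha_s_mz * math.log(q2 / M_Z**2))


if __name__ == "__main__":
    # Evaluate at the edges of the Q^2 range quoted in the abstract.
    for q2 in (40.0, 4000.0, M_Z**2):
        print(q2, alpha_s_one_loop(q2, alpha_s_mz=0.117))
```

The coupling grows as Q^2 decreases, which is why measurements in separate Q^2 bins can be combined into a single value at the reference scale M_Z^2.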
Multiplicity Structure of the Hadronic Final State in Diffractive Deep-Inelastic Scattering at HERA
The multiplicity structure of the hadronic system X produced in
deep-inelastic processes at HERA of the type ep -> eXY, where Y is a hadronic
system with mass M_Y< 1.6 GeV and where the squared momentum transfer at the pY
vertex, t, is limited to |t|<1 GeV^2, is studied as a function of the invariant
mass M_X of the system X. Results are presented on multiplicity distributions
and multiplicity moments, rapidity spectra and forward-backward correlations in
the centre-of-mass system of X. The data are compared to results in e+e-
annihilation, fixed-target lepton-nucleon collisions, hadro-produced
diffractive final states and to non-diffractive hadron-hadron collisions. The
comparison suggests a production mechanism of virtual photon dissociation which
involves a mixture of partonic states and a significant gluon content. The data
are well described by a model, based on a QCD-Regge analysis of the diffractive
structure function, which assumes a large hard gluonic component of the
colourless exchange at low Q^2. A model with soft colour interactions is also
successful. Comment: 22 pages, 4 figures, submitted to Eur. Phys. J., error in first
submission - omitted bibliography
Differential (2+1) Jet Event Rates and Determination of alpha_s in Deep Inelastic Scattering at HERA
Events with a (2+1) jet topology in deep-inelastic scattering at HERA are
studied in the kinematic range 200 < Q^2< 10,000 GeV^2. The rate of (2+1) jet
events has been determined with the modified JADE jet algorithm as a function
of the jet resolution parameter and is compared with the predictions of Monte
Carlo models. In addition, the event rate is corrected for both hadronization
and detector effects and is compared with next-to-leading order QCD
calculations. A value of the strong coupling constant of alpha_s(M_Z^2)=
0.118+- 0.002 (stat.)^(+0.007)_(-0.008) (syst.)^(+0.007)_(-0.006) (theory) is
extracted. The systematic error includes uncertainties in the calorimeter
energy calibration, in the description of the data by current Monte Carlo
models, and in the knowledge of the parton densities. The theoretical error is
dominated by the renormalization scale ambiguity. Comment: 25 pages, 6 figures, 3 tables, submitted to Eur. Phys.