
13 research outputs found

A Macaque's-Eye View of Human Insertions and Deletions: Differences in Mechanisms

Author: Erika M Kvikstad
Francesca Chiaromonte
International Human Genome Sequencing Consortium
Kateryna D Makova
Rhesus Macaque Genome Sequencing and Analysis Consortium
Svitlana Tyekucheva
The Chimpanzee Sequencing and Analysis Consortium
Wen-Hsiung Li
Publication venue: Public Library of Science
Publication date: 01/01/2007

Insertions and deletions (indels) cause numerous genetic diseases and lead to pronounced evolutionary differences among genomes. The macaque sequences provide an opportunity to gain insights into the mechanisms generating these mutations on a genome-wide scale by establishing the polarity of indels occurring in the human lineage since its divergence from the chimpanzee. Here we apply novel regression techniques and multiscale analyses to demonstrate an extensive regional indel rate variation stemming from local fluctuations in divergence, GC content, male and female recombination rates, proximity to telomeres, and other genomic factors. We find that both replication and, surprisingly, recombination are significantly associated with the occurrence of small indels. Intriguingly, the relative inputs of replication versus recombination differ between insertions and deletions, thus the two types of mutations are likely guided in part by distinct mechanisms. Namely, insertions are more strongly associated with factors linked to recombination, while deletions are mostly associated with replication-related features. Indel as a term misleadingly groups the two types of mutations together by their effect on a sequence alignment. However, here we establish that the correct identification of a small gap as an insertion or a deletion (by use of an outgroup) is crucial to determining its mechanism of origin. In addition to providing novel insights into insertion and deletion mutagenesis, these results will assist in gap penalty modeling and eventually lead to more reliable genomic alignments

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Archivio della ricerca della Scuola Superiore Sant'Anna


The complete costs of genome sequencing: a microcosting study in cancer and rare diseases from a single center in the United Kingdom

Author: Antoniou Pavlos
Buchanan James
Camps Carme
Dreau Helene
Fermont Jilles M.
Harris Steve
Knight Samantha J. L.
Kvikstad Erika M.
Pagnamenta Alistair T.
Pentony Melissa M.
Popitsch Niko
Schuh Anna
Schwarze Katharina
Taylor Jenny C.
Taylor John M.
Tilley Mark W.
Wordsworth Sarah
Publication venue: Genetics in Medicine
Publication date: 01/01/2020

Abstract: Purpose: The translation of genome sequencing into routine health care has been slow, partly because of concerns about affordability. The aspirational cost of sequencing a genome is $1000, but there is little evidence to support this estimate. We estimate the cost of using genome sequencing in routine clinical care in patients with cancer or rare diseases. Methods: We performed a microcosting study of Illumina-based genome sequencing in a UK National Health Service laboratory processing 399 samples/year. Cost data were collected for all steps in the sequencing pathway, including bioinformatics analysis and reporting of results. Sensitivity analysis identified key cost drivers. Results: Genome sequencing costs £6841 per cancer case (comprising matched tumor and germline samples) and £7050 per rare disease case (three samples). The consumables used during sequencing are the most expensive component of testing (68–72% of the total cost). Equipment costs are higher for rare disease cases, whereas consumable and staff costs are slightly higher for cancer cases. Conclusion: The cost of genome sequencing is underestimated if only sequencing costs are considered, and likely surpasses $1000/genome in a single laboratory. This aspirational sequencing cost will likely only be achieved if consumable costs are considerably reduced and sequencing is performed at scale.

Apollo (Cambridge)

Clinically actionable mutation profiles in patients with cancer identified by whole-genome sequencing

Author: Ahmed Ahmed
Antoniou Pavlos
Athanasou Nick
Church David
Colling Richard
Dreau Helene
Flanagan Adrienne M.
Hamblin Angela
Harris Adrian
Hassan Bass
Knight Samantha J. L.
Kvikstad Erika M.
Mizani Tuba
Orosz Zsolt
Parton Marina
Pentony Melissa M.
Popitsch Niko
Protheroe Andrew
Ridout Kate
Schuh Anna
Shah Ketan A.
Taylor Jenny C.
Tomlinson Ian
Vavoulis Dimitris
Winter Stuart
Publication venue: Cold Spring Harbor Laboratory
Publication date: 01/01/2018

Next-generation sequencing (NGS) efforts have established catalogs of mutations relevant to cancer development. However, the clinical utility of this information remains largely unexplored. Here, we present the results of the first eight patients recruited into a clinical whole-genome sequencing (WGS) program in the United Kingdom. We performed PCR-free WGS of fresh frozen tumors and germline DNA at 75× and 30×, respectively, using the HiSeq2500 HTv4. Subtracted tumor VCFs and paired germlines were subjected to comprehensive analysis of coding and noncoding regions, integration of germline with somatically acquired variants, and global mutation signatures and pathway analyses. Results were classified into tiers and presented to a multidisciplinary tumor board. WGS results helped to clarify an uncertain histopathological diagnosis in one case, led to informed or supported prognosis in two cases, leading to de-escalation of therapy in one, and indicated potential treatments in all eight. Overall 26 different tier 1 potentially clinically actionable findings were identified using WGS compared with six SNVs/indels using routine targeted NGS. These initial results demonstrate the potential of WGS to inform future diagnosis, prognosis, and treatment choice in cancer and justify the systematic evaluation of the clinical utility of WGS in larger cohorts of patients with cancer

Crossref

UCL Discovery

Edinburgh Research Explorer

Oxford University Research Archive

Structural and non-coding variants increase the diagnostic yield of clinical whole genome sequencing for rare diseases

Author: Allroggen Holger
Ansorge Olaf
Babbs Christian
Banka Siddharth
Baños-Piñero Benito
Beeson David
Ben-Ami Tal
Bennett David L.
Bento Celeste
Blair Edward
Brasch-Andersen Charlotte
Bull Katherine R.
Calpena Eduardo
Camps Carme
Cario Holger
Cilliers Deirdre
Conti Valerio
Dacal Beatriz Diez
Davies E. Graham
Dhalla Fatima
Dong Yin
Dreau Helene
Dunford James E.
Ferla Matteo
Giacopuzzi Edoardo
Guerrini Renzo
Harris Adrian L.
Hartley Jane
Hashim Mona
Hashimoto Akiko
Hollander Georg
Hughes Jim R.
Javaid Kassim
Kaisaki Pamela J.
Kane Maureen
Kelly Deirdre
Kelly Dominic
Kesim Yesim
Kini Usha
Knight Samantha J. L.
Kreins Alexandra Y.
Kvikstad Erika M.
Lange Lukas
Langman Craig B.
Lester Tracy
Lines Kate E.
Lord Simon R.
Lu Xin
Lunter Gerton
Mansour Sahar
Manzur Adnan
Maroofian Reza
Marsden Brian
Mason Joanne
McGowan Simon J.
Mei Davide
Mlcochova Hana
Murakami Yoshiko
Németh Andrea H.
Okoli Steven
Ormondroyd Elizabeth
Ousager Lilian Bomme
Pagnamenta Alistair T.
Palace Jacqueline
Patel Smita Y.
Pentony Melissa M.
Popitsch Niko
Pugh Chris
Rad Aboulfazl
Ragoussis Vassilis
Ramesh Archana
Riva Simone G.
Roberts Irene
Roy Noémi
Salminen Outi
Sanders Edward
Schilling Kyleen D.
Schuh Anna H.
Schwessinger Ron
Scott Caroline
Sen Arjune
Smith Conrad
Stevenson Mark
Taylor Jenny C.
Taylor John M.
Thakker Rajesh V.
Twigg Stephen R. F.
Uhlig Holm H.
van Wijk Richard
Vavoulis Dimitrios V.
Vona Barbara
Wall Steven
Wang Jing
Watkins Hugh
Wilkie Andrew O. M.
Yu Jing
Zak Jaroslav
Publication date: 09/11/2023

BACKGROUND: Whole genome sequencing is increasingly being used for the diagnosis of patients with rare diseases. However, the diagnostic yields of many studies, particularly those conducted in a healthcare setting, are often disappointingly low, at 25-30%. This is in part because although entire genomes are sequenced, analysis is often confined to in silico gene panels or coding regions of the genome. METHODS: We undertook WGS on a cohort of 122 unrelated rare disease patients and their relatives (300 genomes) who had been pre-screened by gene panels or arrays. Patients were recruited from a broad spectrum of clinical specialties. We applied a bioinformatics pipeline that would allow comprehensive analysis of all variant types. We combined established bioinformatics tools for phenotypic and genomic analysis with our novel algorithms (SVRare, ALTSPLICE and GREEN-DB) to detect and annotate structural, splice site and non-coding variants. RESULTS: Our diagnostic yield was 43/122 cases (35%), although 47/122 cases (39%) were considered solved when novel candidate genes with supporting functional data were taken into account. Structural, splice site and deep intronic variants contributed to 20/47 (43%) of our solved cases. Five genes that are novel, or were novel at the time of discovery, were identified, whilst a further three genes are putative novel disease genes with evidence of causality. We identified variants of uncertain significance in a further fourteen candidate genes. The phenotypic spectrum associated with RMND1 was expanded to include polymicrogyria. Two patients with secondary findings in FBN1 and KCNQ1 were confirmed to have previously unidentified Marfan and long QT syndromes, respectively, and were referred for further clinical interventions. Clinical diagnoses were changed in six patients and treatment adjustments made for eight individuals, which for five patients was considered life-saving. CONCLUSIONS: Genome sequencing is increasingly being considered as a first-line genetic test in routine clinical settings and can make a substantial contribution to rapidly identifying a causal aetiology for many patients, shortening their diagnostic odyssey. We have demonstrated that structural, splice site and intronic variants make a significant contribution to diagnostic yield and that comprehensive analysis of the entire genome is essential to maximise the value of clinical genome sequencing.

University of Birmingham Research Portal

The University of Manchester - Institutional Repository

Strong heterogeneity in mutation rate causes misleading hallmarks of natural selection on indel mutations in the human genome

Author: Erika M. Kvikstad
Laurent Duret
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2013

Elucidating the mechanisms of mutation accumulation and fixation is critical to understand the nature of genetic variation and its contribution to genome evolution. Of particular interest is the effect of insertions and deletions (indels) on the evolution of genome landscapes. Recent population-scaled sequencing efforts provide unprecedented data for analyzing the relative impact of selection versus nonadaptive forces operating on indels. Here, we combined McDonald–Kreitman tests with the analysis of derived allele frequency spectra to investigate the dynamics of allele fixation of short (1–50 bp) indels in the human genome. Our analyses revealed apparently higher fixation probabilities for insertions than deletions. However, this fixation bias is not consistent with either selection or biased gene conversion and varies with local mutation rate, being particularly pronounced at indel hotspots. Furthermore, we identified an unprecedented number of loci with evidence for multiple indel events in the primate phylogeny. Even in nonrepetitive sequence contexts (a priori not prone to indel mutations), such loci are 60-fold more frequent than expected according to a model of uniform indel mutation rate. This provides evidence of as yet unidentified cryptic indel hotspots. We propose that indel homoplasy, at known and cryptic hotspots, produces systematic errors in determination of ancestral alleles via parsimony and advise caution interpreting classic selection tests given the strong heterogeneity in indel rates across the genome. These results will have great impact on studies seeking to infer evolutionary forces operating on indels observed in closely related species, because such mutations are traditionally presumed homoplasy-free

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL Descartes

Oxford University Research Archive

The (r)evolution of SINE versus LINE distributions in primate genomes: Sex chromosomes are important

Author: Kvikstad Erika M.
Makova Kateryna D.
Publication venue: Cold Spring Harbor Laboratory Press

The densities of transposable elements (TEs) in the human genome display substantial variation both within individual chromosomes and among chromosome types (autosomes and the two sex chromosomes). Finding an explanation for this variability has been challenging, especially in light of genome landscapes unique to the sex chromosomes. Here, using a multiple regression framework, we investigate primate Alu and L1 densities shaped by regional genome features and location on a particular chromosome type. As a result of our analysis, first, we build statistical models explaining up to 79% and 44% of variation in Alu and L1 element density, respectively. Second, we analyze sex chromosome versus autosome TE densities corrected for regional genomic effects. We discover that sex-chromosome bias in Alu and L1 distributions not only persists after accounting for these effects, but even presents differences in patterns, confirming preferential Alu integration in the male germline, yet likely integration of L1s in both male and female germlines or in early embryogenesis. Additionally, our models reveal that local base composition (measured by GC content and density of L1 target sites) and natural selection (inferred via density of most conserved elements) are significant to predicting densities of L1s. Interestingly, measurements of local double-stranded breaks (a 13-mer associated with genome instability) strongly correlate with densities of Alu elements; little evidence was found for the role of recombination-driven deletion in driving TE distributions over evolutionary time. Thus, Alu and L1 densities have been influenced by the combination of distinct local genome landscapes and the unique evolutionary dynamics of sex chromosomes

Crossref

PubMed Central

Ride the wavelet: A multiscale analysis of genomic contexts flanking small insertions and deletions

Author: Chiaromonte Francesca
Kvikstad Erika M.
Makova Kateryna D.
Publication venue: Cold Spring Harbor Laboratory Press

Recent studies have revealed that insertions and deletions (indels) are more different in their formation than previously assumed. What remains enigmatic is how the local DNA sequence context contributes to these differences. To investigate the relative impact of various molecular mechanisms to indel formation, we analyzed sequence contexts of indels in the non protein- or RNA-coding, nonrepetitive (NCNR) portion of the human genome. We considered small (≤30-bp) indels occurring in the human lineage since its divergence from chimpanzee and used wavelet techniques to study, simultaneously for multiple scales, the spatial patterns of short sequence motifs associated with indel mutagenesis. In particular, we focused on motifs associated with DNA polymerase activity, topoisomerase cleavage, double-strand breaks (DSBs), and their repair. We came to the following conclusions. First, many motifs are characterized by unique enrichment profiles in the vicinity of indels vs. indel-free portions of the genome, verifying the importance of sequence context in indel mutagenesis. Second, only limited similarity in motif frequency profiles is evident flanking insertions vs. deletions, confirming differences in their mutagenesis. Third, substantial similarity in frequency profiles exists between pairs of individual motifs flanking insertions (and separately deletions), suggesting “cooperation” among motifs, and thus molecular mechanisms, during indel formation. Fourth, the wavelet analyses demonstrate that all these patterns are highly dependent on scale (the size of an interval considered). Finally, our results depict a model of indel mutagenesis comprising both replication and recombination (via repair of paused replication forks and site-specific recombination)

Crossref

PubMed Central

A high throughput screen for active human transposable elements

Author: Erika M. Kvikstad
Gerton Lunter
Jenny C. Taylor
Paolo Piazza
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Abstract: Background: Transposable elements (TEs) are mobile genetic sequences that randomly propagate within their host’s genome. This mobility has the potential to affect gene transcription and cause disease. However, TEs are technically challenging to identify, which complicates efforts to assess the impact of TE insertions on disease. Here we present a targeted sequencing protocol and computational pipeline to identify polymorphic and novel TE insertions using next-generation sequencing: TE-NGS. The method simultaneously targets the three subfamilies that are responsible for the majority of recent TE activity (L1HS, AluYa5/8, and AluYb8/9), thereby obviating the need for multiple experiments and reducing the amount of input material required. Results: Here we describe the laboratory protocol and detection algorithm, and a benchmark experiment for the reference genome NA12878. We demonstrate a substantial enrichment for on-target fragments, and high sensitivity and precision to both reference and NA12878-specific insertions. We report 17 previously unreported loci for this individual which are supported by orthogonal long-read evidence, and we identify 1470 polymorphic and novel TEs in 12 additional samples that were previously undocumented in databases of insertion polymorphisms. Conclusions: We anticipate that future applications of TE-NGS alongside exome sequencing of patients with sporadic disease will reduce the number of unresolved cases, and improve estimates of the contribution of TEs to human genetic disease.

Directory of Open Access Journals

Oxford University Research Archive

Strong Heterogeneity in Mutation Rate Causes Misleading Hallmarks of Natural Selection on Indel Mutations in the Human Genome

Author: Erika M. Kvikstad
Laurent Duret
Publication venue: Oxford University Press (OUP)

Crossref