179 research outputs found

Bottleneck and selection in the germline and maternal age influence transmission of mitochondrial DNA in human pedigrees.

Author: Anthony Kate
Arbeithuber Barbara
Makova Kateryna D
Nekrutenko Anton
Nielsen Rasmus
Paul Ian M
Su Marcia Shu-Wei
Wilton Peter R
Zaidi Arslan A
Publication venue: eScholarship, University of California
Publication date: 01/12/2019

Heteroplasmy, the presence of multiple mitochondrial DNA (mtDNA) haplotypes in an individual, can lead to numerous mitochondrial diseases. The presentation of such diseases depends on the frequency of the heteroplasmic variant in tissues, which, in turn, depends on the dynamics of mtDNA transmissions during germline and somatic development. Thus, understanding and predicting these dynamics between generations and within individuals is medically relevant. Here, we study patterns of heteroplasmy in 2 tissues from each of 345 humans in 96 multigenerational families, each with at least 2 siblings (a total of 249 mother-child transmissions). This experimental design has allowed us to estimate the timing of mtDNA mutations, drift, and selection with unprecedented precision. Our results are remarkably concordant between 2 complementary population-genetic approaches. We find evidence for a severe germline bottleneck (7-10 mtDNA segregating units) that occurs independently in different oocyte lineages from the same mother, while somatic bottlenecks are less severe. We demonstrate that divergence between mother and offspring increases with the mother's age at childbirth, likely due to continued drift of heteroplasmy frequencies in oocytes under meiotic arrest. We show that this period is also accompanied by mutation accumulation leading to more de novo mutations in children born to older mothers. We show that heteroplasmic variants at intermediate frequencies can segregate for many generations in the human population, despite the strong germline bottleneck. We show that selection acts during germline development to keep the frequency of putatively deleterious variants from rising. Our findings have important applications for clinical genetics and genetic counseling.

eScholarship - University of California

Manipulation of FASTQ data with Galaxy

Author: A. Gordon
A. Nekrutenko
Blankenberg
D. Blankenberg
G. Von Kuster
J. Taylor
N. Coraor
Publication venue: Oxford University Press
Publication date: 01/07/2010

Summary: Here, we describe a tool suite that functions on all of the commonly known FASTQ format variants and provides a pipeline for manipulating next-generation sequencing data taken from a sequencing machine all the way through the quality filtering steps.
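
As a rough illustration of what quality filtering across FASTQ variants involves, the sketch below parses four-line FASTQ records, decodes quality strings with a configurable ASCII offset (33 for the Sanger variant, 64 for older Illumina variants), and keeps reads whose mean quality meets a threshold. It is not the Galaxy tool suite's code: the names `FastqRecord`, `parse_fastq`, `phred_scores`, and `mean_quality_filter` are hypothetical, and the sketch assumes unwrapped records.

```python
from io import StringIO
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def parse_fastq(handle) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with exactly four lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored in this sketch
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed record: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> list:
    """Decode an ASCII quality string into integer scores for a given offset."""
    return [ord(ch) - offset for ch in quality]


def mean_quality_filter(records, min_mean: float, offset: int = 33):
    """Keep only the records whose mean quality score is at least min_mean."""
    for rec in records:
        scores = phred_scores(rec.quality, offset)
        if scores and sum(scores) / len(scores) >= min_mean:
            yield rec


# Two made-up reads: 'I' decodes to 40 and '!' to 0 under offset 33.
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n"
kept = [r.name for r in mean_quality_filter(parse_fastq(StringIO(EXAMPLE)), min_mean=20)]
```

Passing the offset explicitly keeps the same filter usable for either encoding; detecting the variant automatically is left out of this sketch.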

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach

Author: Blankenberg D
Dickins B
Makova KD
Nekrutenko A
Paul IM
Rebolledo-Jaramillo B
Stoler N
Su MS-W
Publication venue: Informa BioSciences
Publication date: 01/03/2014

Polymorphism discovery is a routine application of next-generation sequencing technology where multiple samples are sent to a service provider for library preparation, subsequent sequencing, and bioinformatic analyses. The decreasing cost and advances in multiplexing approaches have made it possible to analyze hundreds of samples at a reasonable cost. However, because of the manual steps involved in the initial processing of samples and handling of sequencing equipment, cross-contamination remains a significant challenge. It is especially problematic in cases where polymorphism frequencies do not adhere to diploid expectation, for example, heterogeneous tumor samples, organellar genomes, as well as during bacterial and viral sequencing. In these instances, low levels of contamination may be readily mistaken for polymorphisms, leading to false results. Here we describe practical steps designed to reliably detect contamination and uncover its origin, and also provide new, Galaxy-based, readily accessible computational tools and workflows for quality control. All results described in this report can be reproduced interactively on the web as described at http://usegalaxy.org/contamination

Crossref

Nottingham Trent Institutional Repository (IRep)

PubMed Central

Integrating diverse databases into an unified analysis framework: a Galaxy approach

Author: A. Nekrutenko
Blankenberg
Bock
D. Blankenberg
G. Von Kuster
Giardine
Hawkins
J. Taylor
Karolchik
Lyne
N. Coraor
Publication venue: Oxford University Press

Recent technological advances have led to the ability to generate large amounts of data for model and non-model organisms. Whereas, in the past, there have been a relatively small number of central repositories that serve genomic data, an increasing number of distinct specialized data repositories and resources have been established. Here, we describe a generic approach that provides for the integration of a diverse spectrum of data resources into a unified analysis framework, Galaxy (http://usegalaxy.org). This approach allows the simplified coupling of external data resources with the data analysis tools available to Galaxy users, while leveraging the native data mining facilities of the external data resources.

Crossref

PubMed Central

Maternal age effect and severe germ-line bottleneck in the inheritance of human mitochondrial DNA

Author: A. Nekrutenko
Abu-Amero
Ashley
B. Dickins
B. Rebolledo-Jaramillo
Barritt
Bartmann
Cao
Chinnery
D. Blankenberg
del Castillo
Denver
Du
Eichenlaub-Ritter
F. Chiaromonte
Galtier
Goto
Guo
Haag-Liautard
Helgadottir
Hindson
Howell
I. M. Paul
J. A. McElhoe
Jenuth
K. D. Makova
Kong
Larsson
Li
Lutz
Lynch
M. M. Holland
M. S.-W. Su
Ma
Marchington
Millar
Monnot
N. Stoler
Olivo
Parsons
Prasad
R. Nielsen
Rollins
Seifer
Shu
Sigurðardóttir
Simone
T. S. Korneliussen
Tang
Wai
Wallace
Wortmann
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2014

The manifestation of mitochondrial DNA (mtDNA) diseases depends on the frequency of heteroplasmy (the presence of several alleles in an individual), yet its transmission across generations cannot be readily predicted owing to a lack of data on the size of the mtDNA bottleneck during oogenesis. For deleterious heteroplasmies, a severe bottleneck may abruptly transform a benign (low) frequency in a mother into a disease-causing (high) frequency in her child. Here we present a high-resolution study of heteroplasmy transmission conducted on blood and buccal mtDNA of 39 healthy mother–child pairs of European ancestry (a total of 156 samples, each sequenced at ∼20,000× per site). On average, each individual carried one heteroplasmy, and one in eight individuals carried a disease-associated heteroplasmy, with minor allele frequency ≥1%. We observed frequent drastic heteroplasmy frequency shifts between generations and estimated the effective size of the germ-line mtDNA bottleneck at only ∼30–35 (interquartile range from 9 to 141). Accounting for heteroplasmies, we estimated the mtDNA germ-line mutation rate at 1.3 × 10⁻⁸ (interquartile range from 4.2 × 10⁻⁹ to 4.1 × 10⁻⁸) mutations per site per year, an order of magnitude higher than for nuclear DNA. Notably, we found a positive association between the number of heteroplasmies in a child and maternal age at fertilization, likely attributable to oocyte aging. This study also took advantage of droplet digital PCR (ddPCR) to validate heteroplasmies and confirm a de novo mutation. Our results can be used to predict the transmission of disease-causing mtDNA variants and illuminate evolutionary dynamics of the mitochondrial genome.
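
To give some intuition for why a small bottleneck produces drastic frequency shifts, the sketch below uses a toy model that is an assumption here, not the estimator used in the study: the child's heteroplasmy frequency is a single binomial draw of `n_units` segregating units from a mother with variant frequency `p`. Under that assumption the variance of the child's frequency is p(1 − p)/N, and a variant is lost outright with probability (1 − p)^N. All function names are hypothetical.

```python
import random


def expected_shift_variance(p: float, n_units: int) -> float:
    """Variance of the child's frequency under one binomial sampling step."""
    return p * (1.0 - p) / n_units


def loss_probability(p: float, n_units: int) -> float:
    """Probability that none of the sampled units carries the variant."""
    return (1.0 - p) ** n_units


def simulate_child_frequencies(p: float, n_units: int, n_children: int, seed: int = 0):
    """Draw child frequencies by sampling n_units units per transmission."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    freqs = []
    for _ in range(n_children):
        carriers = sum(rng.random() < p for _ in range(n_units))
        freqs.append(carriers / n_units)
    return freqs
```

Comparing `expected_shift_variance(0.1, 30)` with `expected_shift_variance(0.1, 1000)` shows how strongly the spread of outcomes depends on the bottleneck size in this simplified picture.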

Crossref

Nottingham Trent Institutional Repository (IRep)

Copenhagen University Research Information System

PubMed Central

eScholarship - University of California

The Effect of Transposable Element Insertions on Gene Expression Evolution in Rodents

Author: A Nekrutenko
A Smit
Adam Eyre-Walker
AI Su
AO Urrutia
B McClintock
B-Y Liao
C Feschotte
CB Lowe
David Enard
G Bejerano
I. King Jordan
IK Jordan
J Brosius
JC Silva
JR Walker
JS Han
L Marino-Ramirez
LA Pennacchio
LN van de Lagemaat
M Kamal
P Khaitovich
P Medstrand
PD Keightley
RA Irizarry
RC Gentleman
RJ Britten
RM Kuhn
TJ Hubbard
TS Mikkelsen
V Pereira
Vini Pereira
W Enard
W Makalowski
X Xie
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2009

Background: Many genomes contain a substantial number of transposable elements (TEs), a few of which are known to be involved in regulating gene expression. However, recent observations suggest that TEs may have played a very important role in the evolution of gene expression because many conserved non-genic sequences, some of which are known to be involved in gene regulation, resemble TEs. Results: Here we investigate whether new TE insertions affect gene expression profiles by testing whether gene expression divergence between mouse and rat is correlated with the numbers of new transposable elements inserted near genes. We show that expression divergence is significantly correlated with the number of new LTR and SINE elements, but not with the numbers of LINEs. We also show that expression divergence is not significantly correlated with the numbers of ancestral TEs in most cases, which suggests that the correlations between expression divergence and the numbers of new TEs are causal in nature. We quantify the effect and estimate that TE insertion has accounted for ~20% (95% confidence interval: 12% to 26%) of all expression profile divergence in rodents. Conclusions: We conclude that TE insertions may have had a major impact on the evolution of gene expression levels in rodents.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Sussex Research Online

Fast and Space-Efficient Location of Heavy or Dense Segments in Run-Length Encoded Sequences

Author: A. Nekrutenko
F. Larsen
M. Gardiner-Garden
R.C. Hardison
S. Hannenhalli
X. Huang
Y. Lin Ling
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2003

This paper considers several variations of an optimization problem with potential applications in such areas as biomolecular sequence analysis and image processing. Given a sequence of items, each with a weight and a length, the goal is to find a subsequence of consecutive items of optimal value, where value is either total weight or total weight divided by total length. There may also be a specified lower and/or upper bound on the acceptable length of subsequences. This paper shows that all the variations of the problem are solvable in linear time and space even with non-uniform item lengths and divisible items, implying that run-length encoded sequences can be handled in time and space linear in the number of runs. Furthermore, some problem variations can be solved in constant space. Also, these time and space bounds suffice for certain problem variations in which we call for reporting of many “good” subsequences
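
The simplest variation described above (maximize total weight, no length bounds, items taken whole) can be sketched with Kadane's maximum-sum scan applied to `(weight, length)` pairs. This is a swapped-in textbook technique for illustration, not the paper's algorithm, and it does not cover the density objective, the length bounds, or divisible items. The function name `heaviest_segment` is hypothetical.

```python
def heaviest_segment(items):
    """Return (total_weight, total_length, start, end) for the heaviest run
    of consecutive items; end is exclusive. items is a non-empty sequence of
    (weight, length) pairs, for example the runs of a run-length encoding."""
    if not items:
        raise ValueError("items must be non-empty")
    best = None
    cur_weight, cur_length, cur_start = 0, 0, 0
    for i, (weight, length) in enumerate(items):
        if cur_weight <= 0:
            # A non-positive prefix never helps, so restart the segment here.
            cur_weight, cur_length, cur_start = weight, length, i
        else:
            cur_weight += weight
            cur_length += length
        if best is None or cur_weight > best[0]:
            best = (cur_weight, cur_length, cur_start, i + 1)
    return best


# Made-up example: six runs given as (weight, length).
runs = [(2, 3), (-5, 1), (4, 2), (-1, 4), (3, 1), (-6, 2)]
result = heaviest_segment(runs)
```

The scan visits each run once and keeps a constant amount of state, so its cost depends on the number of runs rather than on the decoded sequence length.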

Crossref

Loyola eCommons

Characteristics of transposable element exonization within human and mouse

Author: A Athanasiadis
A Corvelo
A Gerber
A Goren
A Levy
A Magen
A Nekrutenko
A Resch
AFA Smit
Agnes Hotz-Wagenblatt
B Giardine
B Mersch
BR Graveley
Britta Mersch
C Liu
D Karolchik
D Labuda
DD Kim
E Kim
ES Lander
EY Levanon
G Ast
G Lev-Maor
Gil Ast
H Xie
Ilya Ruvinsky
J Hull
J Jurka
JO Kriegs
JO Yang
JP Nemes
K Nakabayashi
KP Kister
L Lin
M Amit
M Blow
M Krull
M Moller-Krull
M Roy
M Sironi
MA Batzer
MD Koob
N Gal-Mark
N Sela
NH Gehring
Noa Sela
O Ram
P Deininger
PL Deininger
R Cordaux
R Sorek
RA Gibbs
RE Mills
RH Waterston
RM Kuhn
RT Hillman
S He
S Schwartz
SK Ng
SS Singer
ST Sherry
T Kwan
VV Kapitonov
W Makalowski
WJ Kent
WL Chen
WS Lo
XH Zhang
Y Xing
YF Chang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/06/2010

Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidation of the evolutionary constraints that have shaped fixation of transposed elements within human and mouse protein coding genes and subsequent exonization is important for understanding how the exonization process has affected transcriptome and proteome complexities. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes.

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The effects of multiple features of alternatively spliced exons on the K(A)/K(S) ratio test

Author: A Nekrutenko
B Modrek
C Lee
DL Philipps
E Quevillon
EV Kriventseva
F Wen
FC Chen
Feng-Chi Chen
G Yeo
GW Yeo
J Wang
K Iida
L Cartegni
LC Filip
LD Hurst
M Karnaugh
MS Cline
NJ Mulder
R Sorek
S Stamm
SM Berget
TA Thanaraj
Trees-Juen Chuang
U Ohler
WG Fairbrother
XH Zhang
Y Xing
Z Wang
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The evolution of alternatively spliced exons (ASEs) is of primary interest because these exons are suggested to be a major source of functional diversity of proteins. Many exon features have been suggested to affect the evolution of ASEs. However, previous studies have relied on the K(A)/K(S) ratio test without taking into consideration information sufficiency (i.e., exon length > 75 bp, cross-species divergence > 5%) of the studied exons, leading to potentially biased interpretations. Furthermore, which exon feature dominates the results of the K(A)/K(S) ratio test and whether multiple exon features have additive effects have remained unexplored. RESULTS: In this study, we collect two different datasets for analysis: the ASE dataset (which includes lineage-specific ASEs and conserved ASEs) and the ACE dataset (which includes only conserved ASEs). We first show that information sufficiency can significantly affect the interpretation of the relationship between exon features and the K(A)/K(S) ratio test results. After discarding exons with insufficient information, we use a Boolean method to analyze the relationship between test results and four exon features (namely length, protein domain overlapping, inclusion level, and exonic splicing enhancer (ESE) frequency) for the ASE dataset. We demonstrate that length and protein domain overlapping are dominant factors, and they have similar impacts on test results of ASEs. In addition, despite the weak impacts of inclusion level and ESE motif frequency when considered individually, the combination of these two factors still has minor additive effects on test results. However, the ACE dataset shows a slightly different result in that inclusion level has a marginally significant effect on test results. Lineage-specific ASEs may have contributed to the difference. Overall, in both ASEs and ACEs, protein domain overlapping is the most dominant exon feature while ESE frequency is the weakest one in affecting test results. CONCLUSION: The proposed method can easily find additive effects of individual or multiple factors on the K(A)/K(S) ratio test results of exons. Therefore, the system can analyze complex conditions in evolution where multiple features are involved. More factors can also be added into the system to extend the scope of evolutionary analysis of exons. In addition, our method may be useful when orthologous exons cannot be found for the K(A)/K(S) ratio test.
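
The information-sufficiency criterion quoted in the abstract (exon length > 75 bp and cross-species divergence > 5%) amounts to a simple pre-filter applied before the ratio test. The sketch below shows one way such a filter might look; the `Exon` class, its field names, and the example values are hypothetical and are not taken from the study's data or code.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 75     # from the abstract: exon length must exceed 75 bp
MIN_DIVERGENCE = 0.05  # from the abstract: divergence must exceed 5%


@dataclass(frozen=True)
class Exon:
    exon_id: str
    length_bp: int
    divergence: float  # cross-species divergence expressed as a fraction


def has_sufficient_information(exon: Exon) -> bool:
    """True when the exon passes both thresholds (strict inequalities)."""
    return exon.length_bp > MIN_LENGTH_BP and exon.divergence > MIN_DIVERGENCE


# Made-up exons used only to exercise the filter.
exons = [
    Exon("exon_a", 120, 0.08),  # passes both thresholds
    Exon("exon_b", 60, 0.10),   # too short
    Exon("exon_c", 200, 0.02),  # too little divergence
]
retained = [e.exon_id for e in exons if has_sufficient_information(e)]
```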

Crossref

Springer - Publisher Connector

National Health Research Institutes

Directory of Open Access Journals

PubMed Central

Enrichment analysis of Alu elements with different spatial chromatin proximity in the human genome

Author: A Antonaki
A Huda
A Nekrutenko
A Smallwood
AF Smit
AM Deaton
C Esnault
C Feschotte
CB Lowe
CT Ong
D Grover
D Schmidt
D Xie
E Berezikov
E Lieberman-Aiden
E Wit de
E Yaffe
EP Nora
ES Lander
F Cui
G Bourque
G Kunarso
G Li
GA Maston
GJ Faulkner
GN Gallus
H Santos-Rosa
H Xie
HH Kazazian Jr
IK Jordan
J Banerji
J Dekker
J Dostie
J Jurka
J Ule
JA Yoder
JE Hambor
JF Brookfield
JM Chen
JR Dixon
JR Korenberg
K Ahn
K Kaer
KC Wang
L Lin
L Teng
M Hackenberg
M Simonis
M Weber
MA Batzer
MG Kidwell
MH Kagey
MJ Fullwood
MM Suzuki
ND Heintzman
NR Smalheiser
P Jin
P Medstrand
P Polak
R Cordaux
R Eskeland
R Lister
R Schneider
R Sorek
RD Hawkins
S Shen
S Winkler
SD Gillies
SL Oei
T Pastor
T Wicker
V Kapitonov
VJ Lynch
WD Gifford
Y Lu
Y Quentin
Y Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Transposable elements (TEs) have for some time no longer been regarded as mere “junk DNA”, owing to the continual discovery of their multifunctional roles in eukaryotic genomes. As one of the most important and abundant TEs that are still active in the human genome, Alu, a SINE family, has demonstrated indispensable regulatory functions at the sequence level, but its spatial roles are still unclear. Technologies based on 3C (chromosome conformation capture) have revealed the three-dimensional structure of chromatin and made it possible to study distal chromatin interactions in the genome. To find the role TEs play in distal regulation in the human genome, we compiled newly released Hi-C data, TE annotation, histone marker annotations, and genome-wide methylation data for correlation analysis, and found that the density of Alu elements showed a strong positive correlation with the level of chromatin interactions (hESC: r = 0.9, P < 2.2 × 10⁻¹⁶; IMR90 fibroblasts: r = 0.94, P < 2.2 × 10⁻¹⁶) and also a significant positive correlation with some remote functional DNA elements such as enhancers and promoters (Enhancer: hESC: r = 0.997, P = 2.3 × 10⁻⁴; IMR90: r = 0.934, P = 2 × 10⁻²; Promoter: hESC: r = 0.995, P = 3.8 × 10⁻⁴; IMR90: r = 0.996, P = 3.2 × 10⁻⁴). Further investigation involving GC content and methylation status showed that the GC content of Alu-covered sequences shared a similar pattern with that of the overall sequence, suggesting that Alu elements also function as providers of GC nucleotides and CpG sites. In all, our results suggest that Alu elements may act as an alternative parameter for evaluating Hi-C data, which is confirmed by the correlation analysis of Alu elements and histone markers. Moreover, the GC-rich Alu sequence can bring high GC content and methylation flexibility to regions with more distal chromatin contacts, regulating the transcription of tissue-specific genes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

University of Bedfordshire Repository