66 research outputs found

Recommended from our members

Interplay between DNA sequence and negative superhelicity drives R-loop structures.

Author: Benham Craig J
Chedin Frederic
Hartono Stella R
Malig Maika
Stolz Robert
Sulthana Shaheen
Publication venue: eScholarship, University of California
Publication date: 01/03/2019

R-loops are abundant three-stranded nucleic-acid structures that form in cis during transcription. Experimental evidence suggests that R-loop formation is affected by DNA sequence and topology. However, the exact manner by which these factors interact to determine R-loop susceptibility is unclear. To investigate this, we developed a statistical mechanical equilibrium model of R-loop formation in superhelical DNA. In this model, the energy involved in forming an R-loop includes four terms: junctional and base-pairing energies, and energies associated with superhelicity and with the torsional winding of the displaced DNA single strand around the RNA:DNA hybrid. This model shows that the significant energy barrier imposed by the formation of junctions can be overcome in two ways. First, base-pairing energy can favor RNA:DNA over DNA:DNA duplexes in favorable sequences. Second, R-loops, by absorbing negative superhelicity, partially or fully relax the rest of the DNA domain, thereby returning it to a lower energy state. In vitro transcription assays confirmed that R-loops cause plasmid relaxation and that negative superhelicity is required for R-loops to form, even in a favorable region. Single-molecule R-loop footprinting following in vitro transcription showed a strong agreement between theoretical predictions and experimental mapping of stable R-loop positions and further revealed the impact of DNA topology on the R-loop distribution landscape. Our results clarify the interplay between base sequence and DNA superhelicity in controlling R-loop stability. They also reveal R-loops as powerful and reversible topology sinks that cells may use to nonenzymatically relieve superhelical stress during transcription.
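
The equilibrium idea in this abstract can be illustrated with a generic Boltzmann weighting: each candidate state gets a probability proportional to exp(-G/RT). The sketch below is a minimal illustration only, not the authors' model. The function name, the example energies, and the choice of 310 K are assumptions, and the four energy terms named in the abstract are collapsed into one hypothetical total free energy per state.

```python
import math

# Gas constant times temperature in kcal/mol, assuming ~310 K
# (0.0019872 kcal/(mol*K) * 310 K). The temperature is an assumption.
RT = 0.0019872 * 310.0


def boltzmann_probabilities(free_energies, rt=RT):
    """Return equilibrium probabilities for states with the given
    total free energies (kcal/mol), using Boltzmann weights."""
    if not free_energies:
        raise ValueError("at least one state is required")
    g_min = min(free_energies)  # shift energies for numerical stability
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies]
    partition = sum(weights)  # partition function of the shifted energies
    return [w / partition for w in weights]


# Hypothetical states: index 0 is the R-loop-free reference state (G = 0),
# the others are candidate R-loops with made-up total free energies.
example_energies = [0.0, 2.0, -1.0]
example_probs = boltzmann_probabilities(example_energies)
```

In this toy setup the state with the lowest total free energy is the most populated, which mirrors the abstract's point that an R-loop becomes favorable once base-pairing and superhelical relaxation outweigh the junction cost.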

eScholarship - University of California

LINE-1 Retrotransposition Activity in Human Genomes

Author: Badge Richard M.
Beck Christine R.
Collier Pamela
Eichler Evan E.
Kidd Jeffrey M.
Macfarlane Catriona
Malig Maika
Moran John V.
Publication venue: Elsevier Inc.
Publication date: 25/06/2010

Summary: Highly active (i.e., “hot”) long interspersed element-1 (LINE-1 or L1) sequences comprise the bulk of retrotransposition activity in the human genome; however, the abundance of hot L1s in the human population remains largely unexplored. Here, we used a fosmid-based, paired-end DNA sequencing strategy to identify 68 full-length L1s that are differentially present among individuals but are absent from the human genome reference sequence. The majority of these L1s were highly active in a cultured cell retrotransposition assay. Genotyping 26 elements revealed that two L1s are only found in Africa and that two more are absent from the H952 subset of the Human Genome Diversity Panel. Therefore, these results suggest that hot L1s are more abundant in the human population than previously appreciated, and that ongoing L1 retrotransposition continues to be a major source of interindividual genetic variation.

Elsevier - Publisher Connector

PubMed Central

Complete Haplotype Sequence of the Human Immunoglobulin Heavy-Chain Variable, Diversity, and Joining Genes and Characterization of Allelic and Copy-Number Variation

Author: Breden Felix
Eichler Evan E.
Graves Tina A.
Holt Robert A.
Huddleston John
Joy Jeffrey B.
Malig Maika
Schein Jacqueline
Scott Jamie K.
Steinberg Karyn M.
Warren Rene L.
Watson Corey T.
Willsey A. Jeremy
Wilson Richard K.
Publication venue: The American Society of Human Genetics. Published by Elsevier Inc.
Publication date: 04/04/2013

The immunoglobulin heavy-chain locus (IGH) encodes variable (IGHV), diversity (IGHD), joining (IGHJ), and constant (IGHC) genes and is responsible for antibody heavy-chain biosynthesis, which is vital to the adaptive immune response. Programmed V-(D)-J somatic rearrangement and the complex duplicated nature of the locus have impeded attempts to reconcile its genomic organization based on traditional B-lymphocyte derived genetic material. As a result, sequence descriptions of germline variation within IGHV are lacking, haplotype inference using traditional linkage disequilibrium methods has been difficult, and the human genome reference assembly is missing several expressed IGHV genes. By using a hydatidiform mole BAC clone resource, we present the most complete haplotype of IGHV, IGHD, and IGHJ gene regions derived from a single chromosome, representing an alternate assembly of ∼1 Mbp of high-quality finished sequence. From this we add 101 kbp of previously uncharacterized sequence, including functional IGHV genes, and characterize four large germline copy-number variants (CNVs). In addition to this germline reference, we identify and characterize eight CNV-containing haplotypes from a panel of nine diploid genomes of diverse ethnic origin, discovering previously unmapped IGHV genes and an additional 121 kbp of insertion sequence. We genotype four of these CNVs by using PCR in 425 individuals from nine human populations. We find that all four are highly polymorphic and show considerable evidence of stratification (Fst = 0.3–0.5), with the greatest differences observed between African and Asian populations. These CNVs exhibit weak linkage disequilibrium with SNPs from two commercial arrays in most of the populations tested.
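
For readers unfamiliar with the Fst statistic quoted above, here is a minimal sketch of Wright's heterozygosity-based definition, Fst = (H_T - H_S) / H_T. It assumes a biallelic marker (for example, presence or absence of an insertion) and equally weighted populations. The function name and the example frequencies are hypothetical, and this is not necessarily the estimator the authors used.

```python
def fst_biallelic(allele_freqs):
    """Wright's Fst for one biallelic marker, given the frequency of one
    allele in each population (populations weighted equally)."""
    n = len(allele_freqs)
    if n == 0:
        raise ValueError("at least one population is required")
    p_bar = sum(allele_freqs) / n
    # Expected heterozygosity of the pooled population
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # marker is fixed everywhere, no differentiation to measure
    # Mean expected heterozygosity within populations
    h_within = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / n
    return (h_total - h_within) / h_total


# Hypothetical frequencies of an insertion allele in two populations
example_fst = fst_biallelic([0.2, 0.8])
```

Identical frequencies in every population give a value of 0, and alternately fixed alleles give 1, so values between 0.3 and 0.5 indicate substantial differentiation between populations.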

Elsevier - Publisher Connector

PubMed Central

An evolutionary driver of interspersed segmental duplications in primates

Author: Anaclerio Fabio
Baker Carl
Cantsilieris Stuart
Catacchio Claudia Rita
Conlon Ronald A.
Dougherty Max L.
Eichler Evan E.
Girirajan Santhosh
Hsieh PingHsun
Huddleston John
Jiang Weihong
Johnson Matthew E.
Lamb Bruce T.
Malig Maika
Mao Yafei
Munson Katherine M.
Sorensen Melanie
Sulovari Arvis
Sunkin Susan M.
Underwood Jason G.
Ventura Mario
Welch AnneMarie E.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Background: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human–ape gene families, nuclear pore interacting protein (NPIP). Results: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis. Conclusions: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.

IUPUIScholarWorks

Archivio istituzionale della ricerca - Università di Bari

Whole-Genome Sequencing of Individuals from a Founder Population Identifies Candidate Genes for Asthma

Author: Abney Mark
Brigino-Buenaventura Emerita
Campbell Catarina D.
Chong Jessica X.
Du Gaixin
Eng Celeste
Herman Catherine
Hormozdiari Fereydoun
Hu Donglei
Ko Arthur
Krumm Niklas
Lee Choli
Malig Maika
Mohajeri Kiana
O'Roak Brian J.
Ober Carole
Patterson Kristen M.
Rodriguez-Cintron William
Rodriguez-Santana Jose
Roth Lindsey A.
Torgerson Dara G.
Vives Laura
Publication date: 01/02/2024

Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS.

Knowledge UChicago

An integrated map of structural variation in 2,504 human genomes

Author: Abyzov Alexej
Alkan Can
Antaki Danny
Auton Adam
Bae Taejeong
Casale Francesco Paolo
Cerveira Eliza
Chaisson Mark J.P.
Chen Jieming
Chen Ken
Chines Peter
Chong Zechen
Dayama Gargi
Fritz Markus Hsi-Yang
Gardner Eugene J.
Garrison Erik
Handsaker Robert E.
Hormozdiari Fereydoun
Huddleston John
Jun Goo
Kashin Seva
Konkel Miriam K.
Lam Hugo Y.K.
Malhotra Ankit
Malig Maika
Meiers Sascha
Mu Xinmeng Jasmine
Rausch Tobias
Shi Xinghua
Stütz Adrian M.
Sudmant Peter H.
Walter Klaudia
Ye Kai
Zhang Yan
Publication venue: LSU Digital Commons
Publication date: 30/09/2015

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

Louisiana State University

Diversity of human copy number variation and multicopy genes

Author: Abecasis G. R.
Alkan Can
Altshuler D. L.
Antonacci Francesca
Bentley D. R.
Bruhn Laurakay
Chakravarti A.
Clark A. G.
Collins F. S.
De La Vega F. M.
Donnelly P.
Durbin R. M.
Egholm M.
Eichler Evan E.
Flicek P.
Gabriel S. B.
Gibbs R. A.
Kitzman Jacob O.
Knoppers B. M.
Lander E. S.
Lehrach H.
Malig Maika
Mardis E. R.
McVean G. A.
Nickerson D. A.
Peltonen L.
Sampas Nick
Schafer A. J.
Shendure Jay
Sherry S. T.
Sudmant Peter H.
Tsalenko Anya
Wang J.
Wilson R. K.
Publication venue: LSU Digital Commons
Publication date: 29/10/2010

Copy number variants affect both disease and normal phenotypic variation, but those lying within heavily duplicated, highly identical sequence have been difficult to assay. By analyzing short-read mapping depth for 159 human genomes, we demonstrated accurate estimation of absolute copy number for duplications as small as 1.9 kilobase pairs, ranging from 0 to 48 copies. We identified 4.1 million singly unique nucleotide positions informative in distinguishing specific copies and used them to genotype the copy and content of specific paralogs within highly duplicated gene families. These data identify human-specific expansions in genes associated with brain development, reveal extensive population genetic diversity, and detect signatures consistent with gene conversion in the human species. Our approach makes ∼1000 genes accessible to genetic studies of disease association.
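
The read-depth principle behind this abstract can be sketched in a few lines: copy number is taken as proportional to mapping depth, calibrated against a region of known copy number. This is a simplified illustration under stated assumptions, not the authors' pipeline. The function name and the example depths are hypothetical, and a real analysis would also need corrections such as GC-bias normalization.

```python
def estimate_copy_number(region_depth, control_depth, control_copies=2):
    """Estimate absolute copy number of a region from its mean read depth,
    scaled against a control region with a known copy number
    (2 for a diploid single-copy locus)."""
    if control_depth <= 0:
        raise ValueError("control depth must be positive")
    if region_depth < 0:
        raise ValueError("region depth cannot be negative")
    # Depth scales linearly with copy number under this simple model
    return round(control_copies * region_depth / control_depth)


# Hypothetical depths: a diploid control at 30x and a duplicated region at 60x
example_copies = estimate_copy_number(60.0, 30.0)
```

A region with no mapped reads is called as 0 copies under this model, and depth 24 times that of the diploid control corresponds to 48 copies.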

Louisiana State University