59 research outputs found

    breakpointR: an R/Bioconductor package to localize strand state changes in Strand-seq data

    MOTIVATION: Strand-seq is a specialized single-cell DNA sequencing technique centered around the directionality of single-stranded DNA. Computational tools for Strand-seq analyses must capture the strand-specific information embedded in these data. RESULTS: Here we introduce breakpointR, an R/Bioconductor package specifically tailored to process and interpret single-cell strand-specific sequencing data obtained from Strand-seq. We developed breakpointR to detect local changes in strand directionality of aligned Strand-seq data, to enable fine-mapping of sister chromatid exchanges, germline inversion and to support global haplotype assembly. Given the broad spectrum of Strand-seq applications we expect breakpointR to be an important addition to currently available tools and extend the accessibility of this novel sequencing technique. AVAILABILITY: R/Bioconductor package https://bioconductor.org/packages/breakpointR

    Dense and accurate whole-chromosome haplotyping of individual genomes

    The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate chromosome-length haplotypes at reasonable costs. Here we introduce an integrative phasing strategy that combines global, but sparse haplotypes obtained from strand-specific single-cell sequencing (Strand-seq) with dense, yet local, haplotype information available through long-read or linked-read sequencing. We provide comprehensive guidance on the required sequencing depths and reliably assign more than 95% of alleles (NA12878) to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different technologies represents an attractive solution to chart the genetic variation of diploid genomes

    Novel Cephalosporins Selectively Active on Nonreplicating Mycobacterium tuberculosis

    We report two series of novel cephalosporins that are bactericidal to Mycobacterium tuberculosis alone of the pathogens tested, which only kill M. tuberculosis when its replication is halted by conditions resembling those believed to pertain in the host, and whose bactericidal activity is not dependent upon or enhanced by clavulanate, a β-lactamase inhibitor. The two classes of cephalosporins bear an ester or alternatively an oxadiazole isostere at C-2 of the cephalosporin ring system, a position that is almost exclusively a carboxylic acid in clinically used agents in the class. Representatives of the series kill M. tuberculosis within macrophages without toxicity to the macrophages or other mammalian cells

    Functional analysis of structural variants in single cells using Strand-seq

    Somatic structural variants (SVs) are widespread in cancer, but their impact on disease evolution is understudied due to a lack of methods to directly characterize their functional consequences. We present a computational method, scNOVA, which uses Strand-seq to perform haplotype-aware integration of SV discovery and molecular phenotyping in single cells by using nucleosome occupancy to infer gene expression as a readout. Application to leukemias and cell lines identifies local effects of copy-balanced rearrangements on gene deregulation, and consequences of SVs on aberrant signaling pathways in subclones. We discovered distinct SV subclones with dysregulated Wnt signaling in a chronic lymphocytic leukemia patient. We further uncovered the consequences of subclonal chromothripsis in T cell acute lymphoblastic leukemia, which revealed c-Myb activation, enrichment of a primitive cell state and informed successful targeting of the subclone in cell culture, using a Notch inhibitor. By directly linking SVs to their functional effects, scNOVA enables systematic single-cell multiomic studies of structural variation in heterogeneous cell populations

    A high-quality bonobo genome refines the analysis of hominid evolution

    The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome

    Computational pan-genomics: status, promises and challenges

    Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example for a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains

    A Molecular Signature of Proteinuria in Glomerulonephritis

    Proteinuria is the most important predictor of outcome in glomerulonephritis and experimental data suggest that the tubular cell response to proteinuria is an important determinant of progressive fibrosis in the kidney. However, it is unclear whether proteinuria is a marker of disease severity or has a direct effect on tubular cells in the kidneys of patients with glomerulonephritis. Accordingly we studied an in vitro model of proteinuria, and identified 231 “albumin-regulated genes” differentially expressed by primary human kidney tubular epithelial cells exposed to albumin. We translated these findings to human disease by studying mRNA levels of these genes in the tubulo-interstitial compartment of kidney biopsies from patients with IgA nephropathy using microarrays. Biopsies from patients with IgAN (n = 25) could be distinguished from those of control subjects (n = 6) based solely upon the expression of these 231 “albumin-regulated genes.” The expression of an 11-transcript subset related to the degree of proteinuria, and this 11-mRNA subset was also sufficient to distinguish biopsies of subjects with IgAN from control biopsies. We tested if these findings could be extrapolated to other proteinuric diseases beyond IgAN and found that all forms of primary glomerulonephritis (n = 33) can be distinguished from controls (n = 21) based solely on the expression levels of these 11 genes derived from our in vitro proteinuria model. Pathway analysis suggests common regulatory elements shared by these 11 transcripts. In conclusion, we have identified an albumin-regulated 11-gene signature shared between all forms of primary glomerulonephritis. Our findings support the hypothesis that albuminuria may directly promote injury in the tubulo-interstitial compartment of the kidney in patients with glomerulonephritis

    Recurrent inversion toggling and great ape genome evolution
