23 research outputs found

    Human centromeres: from initial assemblies to structural and evolutionary analysis

    No full text
    Recent advances in long-read sequencing technologies allowed generation of the first complete assembly of a human genome. They revealed previously inaccessible sequences of human centromeres and allowed analysis of their structure and evolution. We introduce centroFlye, the first algorithm for automated assembly of centromeres from error-prone long reads. We then describe the TandemTools and VerityMap algorithms for quality assessment of the newly assembled regions. Afterwards, we present the StringDecomposer, CentromereArchitect, and HORmon algorithms for structural and evolutionary analysis of human centromeres. We introduce LJA, the first de Bruijn-based genome assembler for accurate long reads. Finally, we describe TandemAligner, the first parameter-free sequence alignment algorithm, which introduces sequence-dependent scoring that automatically changes for any pair of compared sequences.

    DataSheet_1_A scalable model for simulating multi-round antibody evolution and benchmarking of clonal tree reconstruction methods.pdf

    No full text
    Affinity maturation (AM) of B cells through somatic hypermutations (SHMs) enables the immune system to evolve to recognize diverse pathogens. The accumulation of SHMs leads to the formation of clonal lineages of antibody-secreting B cells that have evolved from a common naïve B cell. Advances in high-throughput sequencing have enabled deep scans of B cell receptor repertoires, paving the way for reconstructing clonal trees. However, it is not clear whether clonal trees, which capture microevolutionary time scales, can be reconstructed with adequate accuracy using traditional phylogenetic reconstruction methods. In fact, several clonal tree reconstruction methods have been developed to fix supposed shortcomings of phylogenetic methods. Nevertheless, no consensus has been reached regarding the relative accuracy of these methods, partially because evaluation is challenging. Benchmarking the performance of existing methods and developing better methods would both benefit from realistic models of clonal lineage evolution specifically designed for emulating B cell evolution. In this paper, we propose a model for simulating B cell clonal lineage evolution and use this model to benchmark several existing clonal tree reconstruction methods. Our model, designed to be extensible, has several features: by evolving the clonal tree and sequences simultaneously, it allows modeling selective pressure due to changes in binding affinity; it enables scalable simulations of large numbers of cells; it enables several rounds of infection by an evolving pathogen; and it models the building of memory. In addition, we suggest a set of metrics for comparing clonal trees and measuring their properties. Our results show that while maximum likelihood phylogenetic reconstruction methods can fail to capture key features of clonal tree expansion if applied naively, a simple post-processing of their results, where short branches are contracted, leads to inferences that are better than alternative methods.
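
The abstract above mentions a post-processing step in which short branches are contracted. The listing does not say how the authors implement it, so the following is only a minimal sketch of one common way to contract short internal branches in a rooted tree. The `Node` class, the `contract_short_branches` function, and the threshold value are hypothetical names and choices made for illustration, not the paper's code. Leaves are left in place here, which is also an assumption.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; `length` is the branch leading to this node."""
    name: str
    length: float = 0.0
    children: list[Node] = field(default_factory=list)


def contract_short_branches(node: Node, threshold: float) -> Node:
    """Collapse internal branches shorter than `threshold` (assumed cutoff).

    A contracted child is removed and its children are attached directly
    to the parent, which can create multifurcations. Leaves are kept even
    when their branch is short (an illustrative choice).
    """
    new_children = []
    for child in node.children:
        # Process the subtree first so chains of short branches collapse.
        contract_short_branches(child, threshold)
        if child.children and child.length < threshold:
            new_children.extend(child.children)  # promote grandchildren
        else:
            new_children.append(child)
    node.children = new_children
    return node


# Toy example: internal node "A" sits on a very short branch.
tree = Node("root", 0.0, [
    Node("A", 0.001, [Node("x", 0.1), Node("y", 0.2)]),
    Node("z", 0.0005),
])
contract_short_branches(tree, threshold=0.01)
print([c.name for c in tree.children])  # ['x', 'y', 'z']
```

In this sketch the length of a contracted branch is simply dropped. A variant could add it to the promoted children to preserve root-to-leaf distances.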

    Simulated barcoded Rep-seq datasets (IGH, barcode length: 15 nt)

    No full text
    Simulated barcoded Rep-seq libraries with various amplification error rates for the repertoire from doi.org/10.5281/zenodo.823351. Barcode errors, barcode collisions, and chimeric reads are introduced into the datasets. Barcodes are encoded in the headers.