Human centromeres: from initial assemblies to structural and evolutionary analysis

Abstract

Recent advances in long-read sequencing technologies have enabled the first complete assembly of a human genome. This assembly revealed the previously inaccessible sequences of human centromeres and made it possible to analyze their structure and evolution. We introduce centroFlye, the first algorithm for automated assembly of centromeres from error-prone long reads. We then describe the TandemTools and VerityMap algorithms for quality assessment of the newly assembled regions. Afterwards, we present the StringDecomposer, CentromereArchitect, and HORmon algorithms for structural and evolutionary analysis of human centromeres. We introduce LJA, the first de Bruijn graph-based genome assembler for accurate long reads. Finally, we describe TandemAligner, the first parameter-free sequence alignment algorithm, which uses sequence-dependent scoring that adapts automatically to each pair of compared sequences.

    Last time updated on 18/01/2023