Fast Multiple Sequence Alignment

Abstract

Sequencing technologies are continuously improving and provide access to an increasing amount of data. This gives rise to many opportunities to gain new insights into the sequenced organisms. However, handling such volumes of data is a challenge of its own, and established alignment-based methods for sequence comparison are reaching their limits. The alternative, alignment-free methods, scale better to large datasets and are especially useful for phylogeny reconstruction. But their effectiveness comes at the cost of losing information about the underlying alignment structure (Vinga, 2014). In this thesis, the approach of anchor alignments is implemented. Anchor alignments have already shown good results in phylonium (Klötzl and Haubold, 2019) for alignment-free distance estimation. The goal of this thesis is to make the underlying alignment accessible and to evaluate the results against alignment-based methods. The resulting program, par, is faster than classical alignment-based approaches. The alignments are accurate on very closely related genomes that are currently collected during pangenomic outbreaks. However, as the sequences become more divergent, the accuracy starts to drop quickly.
