Visualization Tools for Comparative Genomics applied to Convergent Evolution in Ash Trees

Abstract

Assembly and analysis of whole genomes is now a routine part of genetic research, but effective tools for the visualization of whole genomes and their alignments are few. Here we present two approaches that allow such visualizations to be done in an efficient and user-friendly manner. These allow researchers to spot problems and patterns in their data and to present them effectively. First, FluentDNA was developed to tackle single whole genome visualization and assembly tasks by representing nucleotides as colored pixels in a zooming interface. This enables users to identify features without relying on algorithmic annotation. FluentDNA also supports visualizing pairwise alignments of well-assembled whole genomes from chromosome to nucleotide resolution. Second, Pantograph was developed to tackle the problem of visualizing variation among large numbers of whole genome sequences. It uses a graph genome approach, which addresses many of the technical challenges of whole genome multiple sequence alignments by representing aligned sequences as nodes that can be shared by many individuals. Pantograph is capable of scaling to thousands of individuals and is applied to SARS and A. thaliana pangenomes. Alongside the development of these new genomics tools, comparative genomic research was undertaken on worldwide species of ash trees. I assembled 13 ash genomes, used FluentDNA to quality check the results, and discovered contaminants and a mitochondrial integration. I annotated protein coding genes in 28 ash assemblies and aligned their gene families. Using phylogenetic analysis, I identified gene duplications that likely occurred in an ancient whole genome duplication shared by all ash species. I examined the fate of these duplicated genes, showing that losses are concentrated in a subset of gene families more often than predicted by a null model simulation.
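The comparison against a null model simulation can be illustrated with a minimal sketch. It is not the procedure used in this work, whose details the abstract does not give. It assumes a simple permutation-style null in which each species keeps its observed number of losses but the lost gene families are drawn uniformly at random. The function names, the `min_species` threshold, and the toy data are hypothetical.

```python
import random


def shared_loss_count(loss_sets, min_species):
    """Count gene families lost in at least `min_species` species."""
    counts = {}
    for lost in loss_sets:
        for family in lost:
            counts[family] = counts.get(family, 0) + 1
    return sum(1 for c in counts.values() if c >= min_species)


def null_p_value(observed_loss_sets, n_families, min_species=2,
                 n_reps=1000, seed=0):
    """Empirical p-value for the observed number of shared losses.

    Assumed null model (illustrative only): each species loses the same
    number of families as observed, chosen uniformly at random.
    """
    rng = random.Random(seed)
    observed = shared_loss_count(observed_loss_sets, min_species)
    families = range(n_families)
    at_least_as_extreme = 0
    for _ in range(n_reps):
        simulated = [set(rng.sample(families, len(lost)))
                     for lost in observed_loss_sets]
        if shared_loss_count(simulated, min_species) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_reps + 1)


# Toy data: three species that all lost the same ten families
# out of a hypothetical 1000.
toy_losses = [set(range(10)) for _ in range(3)]
observed, p_value = null_p_value(toy_losses, n_families=1000, n_reps=200)
```

A small p-value here would indicate that losses fall in the same families more often than this uniform null predicts. A realistic analysis would likely need a null that accounts for family size and phylogenetic relatedness.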
I conclude that convergent evolution has occurred in the loss and retention of duplicated genes in different ash species.

BBSRC BB/S004661/
