13 research outputs found

Whole Genome Phylogenetic Tree Reconstruction Using Colored de Bruijn Graphs

Author: Bodily Paul M.
Bybee Seth M.
Clement Mark J.
Crandall Keith A.
Fujimoto M. Stanley
Lyman Cole A.
Snell Quinn
Suvorov Anton
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2017

We present kleuren, a novel assembly-free method to reconstruct phylogenetic trees using the Colored de Bruijn Graph. kleuren works by constructing the Colored de Bruijn Graph and then traversing it, finding bubble structures in the graph that provide phylogenetic signal. The bubbles are then aligned and concatenated to form a supermatrix, from which a phylogenetic tree is inferred. We introduce the algorithms that kleuren uses to accomplish this task, and show its performance on reconstructing the phylogenetic tree of 12 Drosophila species. kleuren reconstructed the established phylogenetic tree accurately, and is a viable tool for phylogenetic tree reconstruction using whole genome sequences. Software package available at: https://github.com/Colelyman/kleuren

Comment: 6 pages, 3 figures, accepted at BIBE 2017. Minor modifications to the text due to reviewer feedback and fixed typo

arXiv.org e-Print Archive

Crossref

George Washington University: Health Sciences Research Commons (HSRC)

Learning the Language of Genes: Representing Global Codon Bias with Deep Language Models

Author: Bodily Paul M.
Clement Mark J.
Fujimoto M. Stanley
Jacobsen J. Andrew
Lyman Cole A.
Publication venue: DigitalCommons@USU
Publication date: 08/05/2017

Codon bias, the usage patterns of synonymous codons for encoding a protein sequence as nucleotides, is a biological phenomenon that is not well understood. Current methods that measure and model the codon bias of an organism exist for usage in codon optimization. In synthetic biology, codon optimization is a task that involves selecting the appropriate codons to reverse translate a protein sequence into a nucleotide sequence to maximize expression in a vector. These features include codon adaptation index (CAI) [1], individual codon usage (ICU), hidden stop codons (HSC) [2] and codon context (CC) [3]. While explicitly modeling these features has helped us to engineer high synthesis yield proteins, it is unclear what other biological features should be taken into account during codon selection for protein synthesis maximization. In this article, we present a method for modeling global codon bias through deep language models that is more robust than current methods by providing more contextual information and long-range dependencies to be considered during codon selection

DigitalCommons@USU

Dimensions of distance: international flight connections, historical determinism, and economic relations in Africa

Author: Anton Suvorov (129941)
Camilla Sharkey (3243678)
Haley Wightman (3243666)
M. Fujimoto (3243669)
Mark Clement (3243663)
Nicholas Jensen (3243675)
Paul Bodily (3243672)
Seth Bybee (3243681)
T. Ogden (3243684)
Publication venue: Emerald
Publication date: 17/10/2016

Purpose: The paper examines how distance manifests in terms of air passenger transport links between countries and focuses on the 48 countries of sub-Saharan Africa (SSA). It asks to what extent existing flight connections reflect economic relations between countries and, if so, whether they represent past, current or future relations. It also asks whether the impact of distance is similar for all countries and at different stages of development. Design/methodology/approach: Passenger flight connection data was extracted to generate map images and flight frequencies in order to observe inter-relationships between different locations and to observe emerging patterns. The paper uses ESRI's ArcGIS software to visualise these data as maps. Findings: SSA is poorly connected both intra- and inter-continentally. Cultural and historical ties dominate and elements of historical determinism appear within flight connections in SSA reflecting the biases associated with colonialism. Larger economies in SSA are less dependent on these past ties and their flight connections reveal a greater level of diversity and interests. SSA has generally been slow to develop flight routings to the new emerging markets. Originality/value: The paper's contribution lies not only in examining these flight patterns for an under-researched region but also in aiding future work on SSA and its integration into the global economy and international business networks. It argues that whilst distance matters, how it matters varies

Dryad Digital Repository (Duke University)

Sussex Research Online

FigShare

Inferring Structural Constraints in Musical Sequences via Multiple Self-Alignment

Author: Bodily Paul Mark
Ventura Dan
Publication venue: eScholarship, University of California
Publication date: 01/01/2021

A critical aspect of the way humans recognize and understand meaning in sequential data is the ability to identify abstract structural repetitions. We present a novel approach to discovering structural repetitions within sequences that uses a multiple Smith-Waterman self-alignment. We illustrate our approach in the context of finding different forms of structural repetition in music composition. Feature-specific alignment scoring functions enable structure finding in primitive features such as rhythm, melody, and lyrics. These can be compounded to create scoring functions that find higher-level structure including verse-chorus structure. We demonstrate our approach by finding harmonic, pitch, rhythmic, and lyrical structure in symbolic music and compounding these viewpoints to identify the abstract structure of verse-chorus segmentation

eScholarship - University of California

Genome polymorphism detection through relaxed de Bruijn graph construction

Author: Bodily Paul
Bybee Seth
Clement Mark
Crandall Keith
Fujimoto M. Stanley
Lyman Cole
Snell Quinn
Suvorov Anton
Publication venue: Health Sciences Research Commons
Publication date: 01/07/2017

Comparing genomes to identify polymorphisms is a difficult task, especially beyond single nucleotide polymorphisms. Polymorphism detection is important in disease association studies as well as in phylogenetic tree reconstruction. We present a method for identifying polymorphisms in genomes by using a modified version of de Bruijn graphs, data structures widely used in genome assembly from Next-Generation Sequencing. Using our method, we are able to identify polymorphisms that exist within a genome as well as see graph structures that form in the de Bruijn graph for particular types of polymorphisms (translocations, etc.)

George Washington University: Health Sciences Research Commons (HSRC)

Deep ancestral introgression shapes evolutionary history of dragonflies and damselflies

Author: Bodily Paul
Bybee Seth
Clement Mark
Crandall Keith
Fujimoto M Stanley
Schrider Daniel
Scornavacca Celine
Suvorov Anton
Whiting Michael
Publication venue: 'Oxford University Press (OUP)'
Publication date: 29/07/2021

Introgression is an important biological process affecting at least 10% of the extant species in the animal kingdom. Introgression significantly impacts inference of phylogenetic species relationships where a strictly binary tree model cannot adequately explain reticulate net-like species relationships. Here we use phylogenomic approaches to understand patterns of introgression along the evolutionary history of a unique, non-model insect system: dragonflies and damselflies (Odonata). We demonstrate that introgression is a pervasive evolutionary force across various taxonomic levels within Odonata. In particular, we show that the morphologically “intermediate” species of Anisozygoptera (one of the three primary suborders within Odonata besides Zygoptera and Anisoptera), which retain phenotypic characteristics of the other two suborders, experienced high levels of introgression likely coming from zygopteran genomes. Additionally, we find evidence for multiple cases of deep inter-superfamilial ancestral introgression

HAL Descartes

HAL-IRD

HAL-CIRAD

George Washington University: Health Sciences Research Commons (HSRC)

Hal-Diderot

ScaffoldScaffolder: solving contig orientation via bidirected to directed graph reduction

Author: Achterberg
Dan Ventura
Edmonds
Khot
M. Stanley Fujimoto
Makhorin
Mark J. Clement
Medvedev
Paul M. Bodily
Quinn Snell
Publication venue: 'Oxford University Press (OUP)'

Crossref

Data from: Opsins have evolved under the permanent heterozygote model: insights from phylotranscriptomics of Odonata

Author: Bodily Paul
Bybee Seth M.
Clement Mark J.
Fujimoto M. Stanley
Jensen Nicholas O.
Ogden T. Heath
Sharkey Camilla R.
Suvorov Anton
Wightman Haley M. Cahill
Publication date: 17/10/2016

Gene duplication plays a central role in adaptation to novel environments by providing new genetic material for functional divergence and evolution of biological complexity. Several evolutionary models have been proposed for gene duplication to explain how new gene copies are preserved by natural selection, but these models have rarely been tested using empirical data. Opsin proteins, when combined with a chromophore, form a photopigment that is responsible for the absorption of light, the first step in the phototransduction cascade. Adaptive gene duplications have occurred many times within the animal opsins’ gene family, leading to novel wavelength sensitivities. Consequently, opsins are an attractive choice for the study of gene duplication evolutionary models. Odonata (dragonflies and damselflies) have the largest opsin repertoire of any insect currently known. Additionally, there is tremendous variation in opsin copy number between species, particularly in the long-wavelength-sensitive (LWS) class. Using comprehensive phylotranscriptomic and statistical approaches, we tested various evolutionary models of gene duplication. Our results suggest that both the blue-sensitive (BS) and LWS opsin classes were subjected to strong positive selection that greatly weakens after multiple duplication events, a pattern that is consistent with the permanent heterozygote model. Due to the immense interspecific variation and duplicability potential of opsin genes among odonates, they represent a unique model system to test hypotheses regarding opsin gene duplication and diversification at the molecular level

ZENODO

Dryad Digital Repository (Duke University)

Electronic Archiving System

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Bayesian "diffusion" model

Author: Anton Suvorov (129941)
Camilla Sharkey (3243678)
Haley Wightman (3243666)
M. Fujimoto (3243669)
Mark Clement (3243663)
Nicholas Jensen (3243675)
Paul Bodily (3243672)
Seth Bybee (3243681)
T. Ogden (3243684)
Publication date: 17/10/2016

Naive Bayesian model of selection "diffusion" written in the R programming language

Dryad Digital Repository (Duke University)

FigShare