13 research outputs found

GENOMIC SIGNATURES OF ADAPTIVE EVOLUTION

Author: Weber Jessica
Publication venue: UNM Digital Repository
Publication date: 27/07/2018

Comparative genomics has revolutionized virtually all fields of biology, including the study of evolution. In this dissertation, I used next-generation sequencing to explore the evolutionary histories and adaptive evolution of a diverse set of taxa. Comparisons ranged across time scales, from population-level genetic diversity studies to questions spanning the deepest branches of the metazoan lineage. Whole genome sequencing of 50 unrelated Korean individuals revealed that Koreans have a distinct genetic history from the Chinese and Japanese populations. Our Korean-specific variome database was used to identify novel disease-causing variants in the Korean population, highlighting the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and genetic variations. Using multi-species comparative genomics of mammals, I identified signatures of high-altitude adaptation in the endangered long-tailed goral (Naemorhedus caudatus) in the mountains of Korea, and in a separate study in three species of closely related montane guinea pigs in the Andes of South America. Phylogenomic analyses were used to confirm that the source species of the domestic guinea pig (Cavia porcellus) was the high-altitude species Cavia tschudii, not the lowland Cavia aperea. Finally, the first jellyfish (Nemopilema nomurai) and shark (Rhincodon typus) genomes were assembled and used to identify genetic features unique to those lineages. Large-scale genomic comparisons of over 80 metazoans revealed correlations between a number of physiological and genetic traits. Taken together, this dissertation shows the power of comparative genomics to address fundamental biological questions across evolutionary time and diverse non-model systems.

Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce

Author: Johnson Kyle L
Kodimala Vikram K.
Leung Ming-Ying
Taufer Michela
Yehdego Daniel T.
Zhang Boyu
Publication venue: ScholarWorks@UTEP
Publication date: 01/01/2013

Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes, including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structure of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and optimized methods. The steps of searching for inversions, chunking, and prediction can each be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and to identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are computationally possible. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structures.
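The inversion definition in the abstract above (a stretch of nucleotides followed closely by its inverse complementary sequence) can be illustrated with a minimal Python sketch. It is an illustration only and not the paper's centered or optimized cutting method: the function names, the brute-force scan, and the parameters `stem_len` and `max_gap` (standing in for the stem lengths and gap sizes the abstract mentions) are assumptions made for this example, and only canonical A-U and G-C pairs are considered.

```python
# Hypothetical helper names; a brute-force scan, not the paper's algorithm.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the inverse complementary sequence of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_inversions(seq, stem_len, max_gap):
    """Return (start, gap) pairs for each stem of length stem_len that is
    followed, after `gap` nucleotides (0 <= gap <= max_gap), by its
    reverse complement."""
    hits = []
    n = len(seq)
    for start in range(n - 2 * stem_len + 1):
        stem = seq[start:start + stem_len]
        if any(base not in COMPLEMENT for base in stem):
            continue  # skip stems containing ambiguous symbols such as N
        target = reverse_complement(stem)
        for gap in range(max_gap + 1):
            right = start + stem_len + gap
            if right + stem_len > n:
                break  # the right arm would run past the end of the sequence
            if seq[right:right + stem_len] == target:
                hits.append((start, gap))
    return hits


if __name__ == "__main__":
    # GGG ... CCC with a 4-nucleotide loop forms one inversion.
    print(find_inversions("GGGAAAACCC", stem_len=3, max_gap=4))
```

Because each starting position is examined independently, a scan of this kind could be split across workers, which is consistent with the abstract's remark that the inversion search can run in parallel.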

Using MapReduce Streaming for Distributed Life Simulation on the Cloud

Author: Radenski Atanas
Publication venue: Chapman University Digital Commons
Publication date: 01/01/2013

Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
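The general idea of strip partitioning for a lattice model can be sketched in Python as follows. This is a minimal in-process illustration under stated assumptions, not the author's algorithm: it handles only the discrete version of Conway's life, treats cells outside the grid as dead, and simulates the map, shuffle, and reduce phases with plain function calls instead of Hadoop streaming over standard input and output. The names `map_strip`, `reduce_strip`, and `life_step` are hypothetical.

```python
from collections import defaultdict


def map_strip(strip_id, rows, num_strips):
    """Map phase: emit the strip itself plus its edge rows, which the
    neighbouring strips need as halo rows."""
    yield strip_id, ("body", rows)
    if strip_id > 0:
        # This strip's first row lies directly below the previous strip.
        yield strip_id - 1, ("below", rows[0])
    if strip_id < num_strips - 1:
        # This strip's last row lies directly above the next strip.
        yield strip_id + 1, ("above", rows[-1])


def reduce_strip(values):
    """Reduce phase: compute the next generation of one strip from its
    body and whatever halo rows arrived for it."""
    parts = dict(values)
    body = parts["body"]
    width = len(body[0])
    blank = [0] * width  # assumption: cells outside the grid are dead
    padded = [parts.get("above", blank)] + body + [parts.get("below", blank)]
    new_rows = []
    for r in range(1, len(padded) - 1):
        new_row = []
        for c in range(width):
            neighbours = sum(
                padded[rr][cc]
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
                if (rr, cc) != (r, c) and 0 <= cc < width
            )
            alive = padded[r][c]
            survives = neighbours == 3 or (alive and neighbours == 2)
            new_row.append(1 if survives else 0)
        new_rows.append(new_row)
    return new_rows


def life_step(grid, strip_height):
    """Run one generation: partition into strips, map, shuffle, reduce."""
    strips = [grid[i:i + strip_height]
              for i in range(0, len(grid), strip_height)]
    shuffled = defaultdict(list)  # stands in for the framework's shuffle
    for strip_id, rows in enumerate(strips):
        for key, value in map_strip(strip_id, rows, len(strips)):
            shuffled[key].append(value)
    result = []
    for strip_id in range(len(strips)):
        result.extend(reduce_strip(shuffled[strip_id]))
    return result
```

In this sketch each strip exchanges at most two halo rows per generation, so the volume of intermediate key-value data grows with the number of strips and not with the number of cells. The sketch does not attempt to reproduce the measured results reported in the abstract.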