6,055 research outputs found

    A quantitative method for measuring and visualizing species' relatedness in a two-dimensional Euclidean space.

    Representing DNA sequences graphically and evaluating, as well as displaying, species’ relationships have been considered to be an important aspect of molecular biology research. A novel approach is proposed in this thesis that combines three methods: a) Chaos Game Representation (CGR), to portray quantitative characteristics of a DNA sequence as a black-and-white image, b) Structural Similarity (SSIM) index, an image comparison method, to compute pair-wise distances between these images, and c) Multidimensional Scaling (MDS), to visually display each sequence as a point in a two-dimensional Euclidean space. The proposed method produces a visual representation called Genome Distance Map (GDM) when applied to a collection of genomic DNA sequences. In a resulting Genome Distance Map, the sequences can be visualized as points in a common two-dimensional Euclidean space, wherein the geometric distance between any two points approximates the difference between their respective DNA sequence compositions. In addition, the proposed Genome Distance Map provides a compelling visualization of species’ relatedness in comparison to phylogenetic trees. Moreover, the proposed method is sensitive and robust in detecting insertions, deletions, and substitutions of nucleotides in a genome.
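The first of the three steps above, Chaos Game Representation, can be sketched in a few lines. This is a minimal illustration and not the thesis's implementation: the corner assignment (A, C, G, T on the unit square), the function names `cgr_points` and `cgr_image`, and the image size are assumptions. The SSIM and MDS steps are not shown.

```python
# Assumed corner layout on the unit square (a common convention, not taken from the thesis).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos-game trajectory of a DNA sequence.

    Starting from the centre of the square, each base moves the current
    point halfway towards that base's corner. Symbols other than A, C, G, T
    are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_image(seq, size=8):
    """Rasterise the trajectory into a size x size black-and-white grid (1 = visited)."""
    image = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        image[row][col] = 1
    return image
```

Two such images could then be compared with an image-similarity index and the resulting distance matrix embedded in two dimensions, as the abstract describes.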

    Oligonucleotide Frequencies of Barcoding Loci Can Discriminate Species across Kingdoms

    Background: DNA barcoding refers to the use of short DNA sequences for rapid identification of species. Genetic distance or character attributes of a particular barcode locus discriminate the species. We report an efficient approach to analyze short sequence data for discrimination between species. Methodology and Principal Findings: A new approach, Oligonucleotide Frequency Range (OFR) of barcode loci for species discrimination, is proposed. The OFR of the loci that discriminate between species was characteristic of a species, i.e., the maxima and minima within a species did not overlap with those of other species. We compared the species resolution ability of different barcode loci using p-distance, Euclidean distance of oligonucleotide frequencies, a nucleotide-character based approach and the OFR method. The species resolution by OFR was either higher than or comparable to that of the other methods. A short fragment of 126 bp of the internal transcribed spacer region in the ribosomal RNA gene was sufficient to discriminate a majority of the species using OFR. Conclusions/Significance: The oligonucleotide frequency range of a barcode locus can discriminate between species. The ability to discriminate species using very short DNA fragments may have wider applications in forensic and conservation studies.
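The non-overlap criterion described above can be illustrated with a short sketch. It is a loose reading of the abstract, not the paper's procedure: the oligonucleotide length `k`, the use of relative (rather than absolute) frequencies, and the helper names `kmer_frequencies`, `frequency_range`, and `ranges_overlap` are all assumptions.

```python
from collections import Counter


def kmer_frequencies(seq, k=2):
    """Relative frequency of each overlapping k-mer in a sequence."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


def frequency_range(seqs, kmer):
    """(min, max) frequency of one k-mer across the sequences of a single species."""
    k = len(kmer)
    values = [kmer_frequencies(s, k).get(kmer, 0.0) for s in seqs]
    return min(values), max(values)


def ranges_overlap(a, b):
    """True if two closed (min, max) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Under this reading, an oligonucleotide would separate two species when `ranges_overlap` returns `False` for their two frequency ranges.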

    Molecular Evolution of Hominoid Primates: Phylogeny and Regulation

    The complete mtDNA of one eastern gorilla was sequenced to provide the most accurate date for the mitochondrial divergence of gorillas. The most recent common ancestor of eastern lowland and western lowland gorillas existed about 1.9 million years ago, slightly more recent than that of chimpanzee and bonobo. This study also shows that eastern and western gorillas exhibit species-level genetic divergence. Hominoid mating systems differ tremendously. The level of sperm competition varies according to the mating system, which presumably imposes unique selective pressures on the seminal proteins of each species. Cartilage acidic protein 1 (CRTAC1) was identified in our lab as the protein with the largest difference in abundance between human and chimpanzee, being 142-fold more abundant in chimpanzee. The coding region of CRTAC1 is extremely conserved, with a signature of strong purifying selection. Paradoxically, the CRTAC1 'promoter' from human drives significantly greater transcription than that from chimpanzee, with or without androgen stimulation. Analyzing H3K27Ac data, a ~2.2 kb region was identified as a possible additional cis-regulatory element. The cis-regulatory region behaved like a silencer and aided in strong transcriptional repression in humans. Although its underlying basis remains elusive, it can be speculated that the differential expression of CRTAC1 between human and chimpanzee seminal plasma results from tissue-specific over- or under-expression of this gene. The unique gains and losses of miRNAs within hominoids have remained understudied. The overall goal of this project was to identify the uniquely gained and lost miRNAs and their targets within hominoids. I found 14 miRNAs uniquely gained in humans. Most of the uniquely gained and lost miRNAs were found to be brain specific. The targets of uniquely gained miRNAs in human are also associated with brain-associated functions. Older miRNAs were found to be more conserved than the newer miRNAs gained <15 Mya.

    The Development of Three Long Universal Nuclear Protein-Coding Locus Markers and Their Application to Osteichthyan Phylogenetics with Nested PCR

    BACKGROUND: Universal nuclear protein-coding locus (NPCL) markers that are applicable across diverse taxa and show good phylogenetic discrimination have broad applications in molecular phylogenetic studies. For example, RAG1, a representative NPCL marker, has been successfully used to make phylogenetic inferences within all major osteichthyan groups. However, such markers with broad working range and high phylogenetic performance are still scarce. It is necessary to develop more universal NPCL markers comparable to RAG1 for osteichthyan phylogenetics. METHODOLOGY/PRINCIPAL FINDINGS: We developed three long universal NPCL markers (>1.6 kb each) based on single-copy nuclear genes (KIAA1239, SACS and TTN) that possess large exons and exhibit the appropriate evolutionary rates. We then compared their phylogenetic utilities with that of the reference marker RAG1 in 47 jawed vertebrate species. In comparison with RAG1, each of the three long universal markers yielded similar topologies and branch supports, all in congruence with the currently accepted osteichthyan phylogeny. To compare their phylogenetic performance visually, we also estimated the phylogenetic informativeness (PI) profile for each of the four long universal NPCL markers. The PI curves indicated that SACS performed best over the whole timescale, while RAG1, KIAA1239 and TTN exhibited similar phylogenetic performances. In addition, we compared the success of nested PCR and standard PCR when amplifying NPCL marker fragments. The amplification success rate and efficiency of the nested PCR were overwhelmingly higher than those of standard PCR. CONCLUSIONS/SIGNIFICANCE: Our work clearly demonstrates the superiority of nested PCR over the conventional PCR in phylogenetic studies and develops three long universal NPCL markers (KIAA1239, SACS and TTN) with the nested PCR strategy. The three markers exhibit high phylogenetic utilities in osteichthyan phylogenetics and can be widely used as pilot genes for phylogenetic questions of osteichthyans at different taxonomic levels.

    Transcription, signaling receptor activity, oxidative phosphorylation, and fatty acid metabolism mediate the presence of closely related species in distinct intertidal and cold-seep habitats

    Bathyal cold seeps are isolated extreme deep-sea environments characterized by low species diversity while biomass can be high. The Hakon Mosby mud volcano (Barents Sea, 1,280 m) is a rather stable chemosynthesis-driven habitat characterized by prominent surface bacterial mats with high sulfide concentrations and low oxygen levels. Here, the nematode Halomonhystera hermesi thrives in high abundances (11,000 individuals per 10 cm²). Halomonhystera hermesi is a member of the intertidal Halomonhystera disjuncta species complex that includes five cryptic species (GD1-5). GD1-5's common habitat is characterized by strong environmental fluctuations. Here, we compared the transcriptomes of H. hermesi and GD1, H. hermesi's closest relative. Genes encoding proteins involved in oxidative phosphorylation are more strongly expressed in H. hermesi than in GD1, and many genes were only observed in H. hermesi while being completely absent in GD1. Both observations could in part be attributed to high sulfide concentrations and low oxygen levels. Additionally, fatty acid elongation was also prominent in H. hermesi, confirming the importance of highly unsaturated fatty acids in this species. Significantly higher amounts of transcription factors and genes involved in signaling receptor activity were observed in GD1 (many of which were completely absent in H. hermesi), allowing fast signaling and transcriptional reprogramming, which can mediate survival in dynamic intertidal environments. GC content was approximately 8% higher in H. hermesi coding unigenes, resulting in differential codon usage between both species and a higher proportion of amino acids with GC-rich codons in H. hermesi. In general, our results showed that most pathways were active in both environments and that only three genes are under natural selection. This indicates that plasticity should also be taken into consideration in the evolutionary history of Halomonhystera species. Such plasticity, as well as possible preadaptation to low oxygen and high sulfide levels, might have played an important role in the establishment of a cold-seep Halomonhystera population.

    Investigating Evolutionary History Using Phylogenomics

    Reconstructing the Tree of Life is one of the principal aims of evolutionary biology. The development of molecular phylogenetics has complemented palaeontology, biogeography, and archaeology in elucidating biological history. The development of molecular-clock analyses allowed evolutionary timescales to be estimated using nucleotide sequences and other products of the evolutionary process. Until recently, the twin challenges of molecular dating were in obtaining sufficient data and developing robust methods. The former concern is now less important as high-throughput sequencing technology allows entire genomes to be sampled. Genome-scale data enhance statistical power, but accompanying this wealth of data is a new suite of analytical challenges. One of these key challenges is analysing these data in synthesis with the palaeontological record without statistical overparameterisation. There are also aspects of the evolutionary process, such as among-lineage rate variation, that can affect the precision and accuracy of current methods. In this thesis, I first use the richest nucleotide sequence data set of insects available to estimate an authoritative insect evolutionary timescale that dates the origins and diversification of every major insect order. I then focus on molecular-clock methods by testing their performance in inferring evolutionary rates from time-structured data, common in the study of ancient DNA. I find that among-lineage rate variation and phylo-temporal clustering affect rate estimates. I also study data partitioning, a common technique used to optimise the analysis of multilocus data where independent parameters are applied across different subsets of the data. New data from the genomic revolution give biologists new opportunities to re-examine enduring questions about the evolutionary process. Here, I use phylogenetic tools to show that evolution leaves figurative fingerprints on genomes over millions of years.

    Characterization of a Nonclassical Class I MHC Gene in a Reptile, the Galápagos Marine Iguana (Amblyrhynchus cristatus)

    Squamates are a diverse order of vertebrates, representing more than 7,000 species. Yet, descriptions of full-length major histocompatibility complex (MHC) genes in this group are nearly absent from the literature, while the number of MHC studies continues to rise in other vertebrate taxa. The lack of basic information about MHC organization in squamates inhibits investigation into the relationship between MHC polymorphism and disease, and leaves a large taxonomic gap in our understanding of amniote MHC evolution. Here, we use both cDNA and genomic sequence data to characterize a class I MHC gene (Amcr-UA) from the Galápagos marine iguana, a member of the squamate subfamily Iguaninae. Amcr-UA appears to be functional since it is expressed in the blood and contains many of the conserved peptide-binding residues that are found in classical class I genes of other vertebrates. In addition, comparison of Amcr-UA to homologous sequences from other iguanine species shows that the antigen-binding portion of this gene is under purifying selection, rather than balancing selection, and therefore may have a conserved function. A striking feature of Amcr-UA is that both the cDNA and genomic sequences lack the transmembrane and cytoplasmic domains that are necessary to anchor the class I receptor molecule into the cell membrane, suggesting that the product of this gene is secreted and consequently not involved in classical class I antigen presentation. The truncated and conserved character of Amcr-UA leads us to define it as a nonclassical gene that is related to the few available squamate class I sequences. However, phylogenetic analysis placed Amcr-UA in a basal position relative to other published classical MHC genes from squamates, suggesting that this gene diverged near the beginning of squamate diversification.

    Complete Mitochondrial Genome Sequence of Three Tetrahymena Species Reveals Mutation Hot Spots and Accelerated Nonsynonymous Substitutions in Ymf Genes

    The ciliate Tetrahymena, a model organism, contains a divergent mitochondrial (Mt) genome with unusual properties, where half of its 44 genes still remain without a definitive function. These genes can be categorized into two major groups: KPC (known protein coding) and Ymf (genes without an identified function). To gain insights into the mechanisms underlying gene divergence and molecular evolution of Tetrahymena (T.) Mt genomes, we sequenced the Mt genomes of three species: T. paravorax, T. pigmentosa, and T. malaccensis. These genomes were aligned and the analyses were carried out using several programs that calculate distance, nucleotide substitution (dn/ds), and their rate ratios (ω) on individual codon sites and via a sliding window approach. Comparative genomic analysis indicated a conserved putative transcription control sequence, a GC box, in a region where presumably transcription and replication initiate. We also found distinct features in the Mt genome of T. paravorax despite similar genome organization among these ∼47 kb long linear genomes. Another significant finding was the presence of one or more highly variable regions in Ymf genes where the majority of substitutions were concentrated. These regions were mutation hotspots where elevated distances and dn/ds ratios were primarily due to an increase in the number of nonsynonymous substitutions, suggesting relaxed selective constraint. However, in a few Ymf genes, accelerated rates of nonsynonymous substitutions may be due to positive selection. Similarly, at the protein level the majority of amino acid replacements occurred in these regions. Ymf genes comprise half of the genes in Tetrahymena Mt genomes, so understanding why they have not been assigned definitive functions is an important aspect of molecular evolution. Importantly, nucleotide substitution types and rates suggest possible reasons for not being able to find homologues for Ymf genes. Additionally, comparative genomic analysis of complete Mt genomes is essential in identifying biologically significant motifs such as control regions.
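The sliding-window idea mentioned above can be sketched as follows. Note the substitution: computing dn/ds requires a codon model, so this sketch uses the much simpler proportion of differing sites (p-distance) per window instead. The function name `sliding_p_distance`, the window and step sizes, and the gap handling are assumptions, not details from the study.

```python
def sliding_p_distance(seq_a, seq_b, window=30, step=10):
    """Proportion of differing sites in successive windows of a pairwise alignment.

    Both sequences must already be aligned (equal length, '-' for gaps).
    Gap-containing columns are ignored; a window with no comparable
    columns yields None. Returns a list of (window_start, p_distance).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
        if not pairs:
            results.append((start, None))
            continue
        diffs = sum(x != y for x, y in pairs)
        results.append((start, diffs / len(pairs)))
    return results
```

Windows whose value stands well above the rest of a gene would be candidates for the kind of highly variable region the abstract reports, although the study itself bases that conclusion on dn/ds rather than raw distance.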

    The molecular basis of high duty-cycle echolocation in bats, and its role in the divergence of populations and species

    PhD thesis. How populations diverge and form new species in the face of gene flow is a key question in evolutionary biology. Recent research suggests this may be possible where the same traits affect the ecological niche and are involved in assortative mating, and that a small number of genes could be involved in driving speciation in these cases. Echolocation call frequency in bats has roles in ecology and social communication. Bats using high duty-cycle (HDC) echolocation have hearing tuned to specific frequencies, with frequency shifts impacting ecological niche and mate recognition, meaning this is a good candidate trait to drive speciation. HDC echolocation has evolved independently in two highly divergent groups of bats, providing a unique opportunity to study the molecular basis of a trait potentially driving speciation. I have combined selection testing of specific loci with genomewide divergence scans to test hypotheses concerning the evolution of HDC echolocation. Members of the yangochiropteran genus Pteronotus use low duty-cycle echolocation, except for the subgenus Phyllodia. Selection tests on coding sequence data revealed loci associated with hearing under positive selection in Phyllodia and in Pteronotus, including eleven shared with a yinpterochiropteran HDC echolocator, Rhinolophus sinicus. Three size and acoustic morphs of Rhinolophus philippinensis exist in sympatry on Buton Island. Phylogenetic reconstructions revealed population structure between the morphs, though with conflicting topologies based on mitochondrial and nuclear data. Species delimitation identified at least two separate taxa. Genomewide scans of divergence indicated low background FST between the morphs, punctuated with highly diverged islands featuring an overrepresentation of genes associated with body size and hearing. This thesis represents the first genome-wide investigation of HDC echolocation, highlighting candidate genes related to this trait. It additionally describes a rarely observed mammalian ecological speciation, providing support for the claim that species designated R. philippinensis represent a complex across their range.