365 research outputs found

The Interrelationships of Placental Mammals and the Limits of Phylogenetic Inference

Author: Asher Robert J
Donoghue Philip C.J.
dos Reis Mario
King Benjamin L
Mirarab Siavash
Moran Raymond J. J
O'Connell Mary J
O'Reilly Joseph E
Parker Sean
Peterson Kevin J
Pisani Davide
Tarver James E
Warnow Tandy
Publication venue: Dartmouth Digital Commons
Publication date: 23/12/2015

Placental mammals comprise three principal clades: Afrotheria (e.g., elephants and tenrecs), Xenarthra (e.g., armadillos and sloths), and Boreoeutheria (all other placental mammals), the relationships among which are the subject of controversy and a touchstone for debate on the limits of phylogenetic inference. Previous analyses have found support for all three hypotheses, leading some to conclude that this phylogenetic problem might be impossible to resolve due to the compounded effects of incomplete lineage sorting (ILS) and a rapid radiation. Here we show, using a genome-scale nucleotide data set, microRNAs, and the reanalysis of the three largest previously published amino acid data sets, that the root of Placentalia lies between Atlantogenata and Boreoeutheria. Although we found evidence for ILS in early placental evolution, we are able to reject previous conclusions that the placental root is a hard polytomy that cannot be resolved. Reanalyses of previous data sets recover Atlantogenata + Boreoeutheria and show that contradictory results are a consequence of poorly fitting evolutionary models; instead, when the evolutionary process is better modeled, all data sets converge on Atlantogenata. Our Bayesian molecular clock analysis estimates that marsupials diverged from placentals 157-170 Ma, crown Placentalia diverged 86-100 Ma, and crown Atlantogenata diverged 84-97 Ma. Our results are compatible with placental diversification being driven by dispersal rather than vicariance mechanisms, postdating early phases in the protracted opening of the Atlantic Ocean.

Crossref

PubMed Central

UCL Discovery

Dartmouth Digital Commons (Dartmouth College)

Queen Mary Research Online

White Rose Research Online

Explore Bristol Research

Estimating phylogenetic trees from genome-scale data

Author: Davis Charles
Edwards Scott V.
Liu Liang
Wu Shaoyuan
Xi Zhenxiang
Publication venue: Wiley
Publication date: 15/01/2015

As researchers collect increasingly large molecular data sets to reconstruct the Tree of Life, the heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. A class of phylogenetic methods known as "species tree methods" has been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting or deep coalescence that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Although such methods are gaining in popularity, they are being adopted with caution in some quarters, in part because of an increasing number of examples of strong phylogenetic conflict between concatenation or supermatrix methods and species tree methods. Here we review theory and empirical examples that help clarify these conflicts. Thinking of concatenation as a special case of the more general model provided by the multispecies coalescent can help explain a number of differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences, base compositional heterogeneity and long branch attraction. We show that approaches such as binning, designed to augment the signal in species tree analyses, can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods that incorporate biological realism are a key to phylogenetic analysis of whole genome data. Comment: 39 pages, 3 figures
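To illustrate how incomplete lineage sorting produces "a diversity of gene trees from a single underlying species tree", the sketch below evaluates the standard three-taxon result of the multispecies coalescent: for a species tree ((A,B),C) whose internal branch has length t in coalescent units, the gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The function name and the example branch lengths are illustrative assumptions and are not taken from the paper.

```python
import math

def gene_tree_probs(t):
    """Rooted gene-tree topology probabilities for species tree ((A,B),C).

    t is the internal branch length in coalescent units (assumed >= 0).
    """
    discordant = math.exp(-t) / 3.0  # probability of each discordant topology
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,  # matches the species tree
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }

# Shorter internal branches (rapid radiations) give more discordance.
for t in (0.1, 0.5, 1.0, 2.0, 5.0):
    probs = gene_tree_probs(t)
    print(f"t = {t:>4}: P(concordant) = {probs['((A,B),C)']:.3f}")
```

As t approaches zero the three topologies become equally likely, which is the situation in which a rapid radiation is hard to resolve from gene trees alone.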

arXiv.org e-Print Archive

CiteSeerX

The molecular phylogeny of placental mammals and its application to uncovering signatures of molecular adaptation.

Author: Morgan Claire C.
Publication venue: Dublin City University. School of Biotechnology
Publication date: 01/03/2013

Considerable conflict remains in the literature as to the position of the root of placental mammals, and the placement of several intra-ordinal groups. Debate continues over the use of DNA or amino acid datasets and over the use of Supertree or Supermatrix approaches. Known phenomena exist within mammal data that complicate the reconstruction of phylogeny. These include (but are not limited to) variation in longevity, body size, metabolic rates, and germ-line generation time that result in variation in mutation rates and composition biases. Previous attempts to resolve the placental mammal phylogeny have used homogeneous evolutionary models that cannot capture and adequately describe these features across the species sampled. In this thesis I explore the properties of different datasets and data types and their suitability to the resolution of the mammal phylogeny at different depths: (i) the position of the root of the placental mammals, and (ii) the intraordinal placements within the Laurasiatheria. The datasets tested were (i) mitochondrial and nuclear data types, (ii) previously published datasets for mammals, and (iii) datasets I assembled specifically for analyses at different phylogenetic depths. I propose heterogeneous models and apply them to these datasets to resolve the position of the root of the placental mammal phylogeny. Reconstruction of a robust mammal phylogeny provides us with an essential framework for understanding the molecular underpinnings of adaptation to environment. Since they emerged ~100 million years ago, the placental mammals have come to display huge variation in life traits such as longevity, body size and DNA repair efficiency. With this robust phylogeny, I set out to determine the level of adaptive and non-adaptive processes acting on a set of mammal genes that are linked with longevity and cancer.
The results of these analyses yield important insights into data and model suitability, and provide strong evidence for a single hypothesis for the rooting of placental mammals. These results also show that Laurasiatheria intra-ordinal placements are not fully resolved and additional sampling from this diverse clade is required. Using this resolved phylogeny, specific molecular adaptations and non-adaptive mechanisms were identified in the Mammalia for a set of telomere-associated genes.

Irish Universities

DCU Online Research Access Service

Rare coral under the genomic microscope: timing and relationships among Hawaiian Montipora

Author: Belderok Roy
Castilho Rita
Cunha Regina Lopes da
Forsman Zac H
Knapp Ingrid S S
Toonen Robert J
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2019

Background: Evolutionary patterns of scleractinian (stony) corals are difficult to infer given the existence of few diagnostic characters and pervasive phenotypic plasticity. A previous study of Hawaiian Montipora (Scleractinia: Acroporidae) based on five partial mitochondrial and two nuclear genes revealed the existence of a species complex, grouping one of the rarest known species (M. dilatata, which is listed as Endangered by the International Union for Conservation of Nature - IUCN) with widespread corals of very different colony growth forms (M. flabellata and M. cf. turgescens). These previous results could result from a lack of resolution due to a limited number of markers or compositional heterogeneity, or could reflect biological processes such as incomplete lineage sorting (ILS) or introgression. Results: All 13 mitochondrial protein-coding genes from 55 scleractinians (14 lineages from this study) were used to evaluate whether a recent origin of the M. dilatata species complex or rate heterogeneity could be compromising phylogenetic inference. Rate heterogeneity detected in the mitochondrial data set seems to have no significant impact on the phylogenies but clearly affects age estimates. Dating analyses give different estimates for the speciation of the M. dilatata species complex depending on whether compositional heterogeneity is taken into account (0.8 [0.05–2.6] Myr) or rate homogeneity is assumed (0.4 [0.14–0.75] Myr). Genomic data also provided evidence of introgression among all analysed samples of the complex. RADseq data indicated that M. capitata colour morphs may have a genetic basis. Conclusions: Despite the volume of data (over 60,000 SNPs), phylogenetic relationships within the M. dilatata species complex remain unresolved, most likely due to a recent origin and ongoing introgression. Species delimitation with genomic data is not concordant with the current taxonomy, which does not reflect the true diversity of this group.
Nominal species within the complex are either undergoing a speciation process or represent ecomorphs exhibiting phenotypic polymorphisms.

Directory of Open Access Journals

Sapientia

Developing and applying supertree methods in Phylogenomics and Macroevolution

Author: Akanni Wasiu
Publication venue
Publication date: 01/04/2014

Supertrees can be used to combine partially overlapping trees and generate more inclusive phylogenies. It has been proposed that a Maximum Likelihood (ML) supertree method (SM) could be developed using an exponential probability distribution to model errors in the input trees (given a proposed supertree). When the tree-to-tree distances used in the ML computation are symmetric differences, the ML SM has been shown to be equivalent to a Majority-Rule consensus SM, and hence, exactly like the latter, it has the desirable property of being a median tree (with reference to the set of input trees). The ability to estimate the likelihood of supertrees allows Bayesian (MCMC) approaches to be implemented, which have the advantage of allowing the support for the clades in a supertree to be properly estimated. I present here the L.U.St software package; it contains the first implementation of an ML SM and allows, for the first time, statistical tests on supertrees. I also characterize the first implementation of the Bayesian (MCMC) SM. Both the ML and the Bayesian (MCMC) SMs have been tested for biases and found to be immune to them. The Bayesian (MCMC) SM is applied to the reanalysis of a variety of datasets (i.e. the datasets for the Metazoa and the Carnivora), and I have also recovered the first Bayesian supertree-based phylogeny of the Eubacteria and the Archaebacteria. These new SMs are discussed with reference to other, well-known SMs such as Matrix Representation with Parsimony. Both the ML and Bayesian SMs offer multiple attractive advantages over current alternatives.
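The idea of scoring a proposed supertree with an exponential error model over symmetric-difference distances can be sketched as follows. This is a minimal illustration and not the L.U.St implementation: trees are written as nested tuples of taxon names, the supertree is pruned to each input tree's taxa before comparison, the log-likelihood is left unnormalized (the normalizing constant is dropped), and the rate parameter `beta` and all function names are assumptions made for the example.

```python
def leaves(tree):
    """Return the set of taxon names in a nested-tuple rooted tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree, keep):
    """Nontrivial clades of `tree` after pruning it to the taxa in `keep`."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node]) & keep
        members = frozenset().union(*(walk(child) for child in node))
        # Ignore single leaves, empty sets, and the full taxon set (the root).
        if 1 < len(members) < len(keep):
            found.add(members)
        return members

    walk(tree)
    return found

def symmetric_difference(input_tree, supertree):
    """Count clades found in exactly one of the two trees (shared taxa only)."""
    taxa = leaves(input_tree)
    return len(clades(input_tree, taxa) ^ clades(supertree, taxa))

def log_likelihood(supertree, input_trees, beta=1.0):
    """Unnormalized log-likelihood under P(tree | supertree) ~ exp(-beta * d)."""
    return -beta * sum(symmetric_difference(t, supertree) for t in input_trees)

# Three partially overlapping input trees and two candidate supertrees.
input_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("D", "E")),
    (("A", "C"), ("D", "E")),
]
candidate_1 = ((("A", "B"), "C"), ("D", "E"))
candidate_2 = ((("A", "C"), "B"), ("D", "E"))

for name, tree in (("candidate_1", candidate_1), ("candidate_2", candidate_2)):
    print(name, log_likelihood(tree, input_trees))
```

Under this scoring the candidate with the smaller total distance to the input trees is preferred; a Bayesian (MCMC) variant would sample supertrees in proportion to such a score rather than only maximizing it.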

MURAL - Maynooth University Research Archive Library

Irish Universities

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Estimating phylogenetic trees from genome-scale data

Author: Davis Charles Cavender
Edwards Scott V.
Liu Liang
Wu Shaoyuan
Xi Zhenxiang
Publication venue: Wiley
Publication date: 17/02/2017

The heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. Phylogenetic methods known as “species tree” methods have been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Here we review theory and empirical examples that help clarify conflicts between species tree and concatenation methods, and misconceptions in the literature about the performance of species tree methods. Considering concatenation as a special case of the multispecies coalescent model helps explain differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences and long-branch attraction. We show that approaches, such as binning, designed to augment the signal in species tree analyses can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods incorporating biological realism are a key to phylogenetic analysis of whole-genome data. Organismic and Evolutionary Biology

Harvard University - DASH

A study on genetic marker selection and phylogenetic tree error assessment using a bioinformatics program

Author: 이정환
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2021

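Thesis (Master's) -- Seoul National University Graduate School: College of Natural Sciences, Interdisciplinary Program in Bioinformatics, 2021.8. 손현석. The enormous volume of biological sequence data that is continually being produced offers an opportunity to infer the evolutionary history of organisms and the phylogenetic relationships among them. Phylogenetic tree construction is now a step carried out in nearly every area of biological research. Phyloinformatics has developed around technical and methodological research such as tree-building algorithms and evolutionary models. Current phylogenetic analysis aims to infer a tree close to the true one by building trees from sequence data, that is, from genetic markers. However, as the size of the data, including genetic markers, grows exponentially and questions about the accuracy of the resulting analyses receive more attention, many studies have been conducted to assess the accuracy and reliability of phylogenetic trees. From the standpoint of molecular systematics, the assessment of tree accuracy can be approached along two lines: one addresses how well a phylogenetic algorithm works under particular conditions, such as the evolutionary conditions and the amount of molecular data, and the other focuses on how far a particular tree can be trusted. From the standpoint of dataset quality, it is also important, after performing a phylogenetic analysis, to assess the suitability of the dataset used in order to obtain a reliable tree. This is because, in recent phylogenetic analyses that routinely handle large-scale data, the likelihood of stochastic error has fallen while the likelihood of systematic error has risen, so assessing the sources of systematic error in the dataset after the analysis has become a very important step in evaluating and improving tree accuracy. In this study, we therefore developed a program called APSE (Assessment Program for Systematic Error, tentative) to improve tree reliability from the data-quality perspective. APSE computes taxon-specific relative composition frequency variability (RCFV) and symmetric skew values to obtain information on the compositional bias of nucleotide sequences, from which the genetic heterogeneity and mutational bias of the data under study can be estimated. It can also calculate bias information that may cause systematic error through the frequencies of various nucleotide groups, saturation (multiple substitutions caused by mutation), and shared missing data. In addition, to evaluate the system's performance, we applied APSE to specific examples (Terebelliformia, Daphniid, Glires) in which different genetic markers yield conflicting trees, and confirmed that the assessment of systematic error in marker datasets, and the inference of the accuracy of trees built from the markers selected accordingly, can be carried out properly. We therefore expect that APSE will serve to assess the accuracy between a user's data and trees, so that trees generated with a focus on data quality lead to more accurate results, and that, when a misleading tree is produced depending on the genetic marker, it will be able to analyze the sources of systematic error thoroughly and determine the effect on the tree of the data affected by that error.
The steadily increasing volume of biological data bearing on phylogenetic relationships provides unparalleled opportunities in bioinformatics.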
Phylogenetics, which draws on large numbers of datasets to trace evolutionary history and assign the placement of taxa in a phylogeny, establishes the tree of life. Constructing a phylogeny through phylogenetic analysis is carried out in most branches of biology, and emphasizing evolutionary history elucidates the phylogenetic background that is a prerequisite for interpreting a specific biological system, making it a biologically indispensable process. With the advent of computing and sequencing techniques, phyloinformatics has rapidly advanced at the technical and methodological levels, along with phylogenetic reconstruction algorithms and evolutionary models. Unlike the classic approach using morphological data, modern phylogenetic analysis reconstructs a phylogeny from genetic information by inferring a phylogenetic tree from molecular data. Phylogeneticists have therefore naturally dealt with questions concerning the accuracy of phylogenetic estimation and carried out studies on the reliability of phylogenies. In terms of molecular systematics, concerns about assessing phylogenetic accuracy, given specific evolutionary conditions and the amount of molecular data used, can now be divided into two types: how a phylogenetic method works, and how reliable it is under certain circumstances. Moreover, in terms of data quality, the suitability of a nuclear marker must be assessed before phylogenetic inference is performed if the phylogeny is to be trusted. Recently, the probability of stochastic errors in phylogenetic estimation with large-scale datasets has decreased, while the probability of systematic errors has increased. Thus, before phylogenetic reconstruction is carried out, assessing the sources of systematic error is indispensable for improving and estimating phylogenetic accuracy.
The Assessment Program for Systematic Error (APSE) developed in this study will play a key role in assessing the fit between user datasets and phylogenies to improve the results of phylogenetic reconstruction in systematics, and will make it possible to analyze the effect on a phylogeny of data bearing systematic errors after misleading phylogenetic results have been produced. This study uses APSE to infer phylogenetic accuracy and assess systematic errors in unresolved examples in which different gene markers yield contradictory topologies for the same group. Furthermore, by selectively grouping the properties of the known systematic biases reported by APSE, it moves toward proposing a new protocol that can identify the best gene marker among candidate markers for a specific taxon.
Contents: I. Introduction (1.1 Background of research; 1.2 Necessity of research; 1.3 Research objectives). II. Materials and Methods (2.1 Datasets definition and data collection; 2.2 Data processing and bioinformatics software used; 2.3 Phylogenetic reconstruction and accuracy assessment; 2.4 Software development environment and allowable data; 2.5 Assessment of the systematic errors). III. Results (3.1 Phylogenetic analysis results for incongruence between gene markers; 3.2 Data-quality analysis using systematic errors). IV. Discussion (4.1 Significance and implications of study; 4.2 Application to bioinformatics research; 4.3 Improvement and achievement). V. Conclusion and Summary (5.1 Conclusion; 5.2 Summary). Bibliography. Abstract (Korean).
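The abstract does not give APSE's formulas, so the following is only a sketch of the kind of compositional-bias statistics it names, using commonly cited definitions as an assumption: taxon-specific RCFV as the sum over bases of |frequency in the taxon - mean frequency across taxa| divided by the number of taxa, and skew as (X - Y) / (X + Y) for a base pair such as G and C. The function names and the toy alignment are hypothetical and are not taken from APSE.

```python
from collections import Counter

BASES = "ACGT"

def base_frequencies(seq):
    """Relative frequency of each base, ignoring gaps and ambiguity codes."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {b: counts[b] / total for b in BASES}

def taxon_rcfv(alignment):
    """Taxon-specific RCFV; the values sum to the RCFV of the whole dataset."""
    freqs = {taxon: base_frequencies(seq) for taxon, seq in alignment.items()}
    n = len(freqs)
    mean = {b: sum(f[b] for f in freqs.values()) / n for b in BASES}
    return {
        taxon: sum(abs(f[b] - mean[b]) for b in BASES) / n
        for taxon, f in freqs.items()
    }

def skew(seq, x, y):
    """Skew between two bases, e.g. GC skew = (G - C) / (G + C)."""
    counts = Counter(seq.upper())
    denominator = counts[x] + counts[y]
    if denominator == 0:
        raise ValueError(f"sequence contains neither {x} nor {y}")
    return (counts[x] - counts[y]) / denominator

# Toy alignment (hypothetical data): taxon t3 is compositionally deviant.
alignment = {"t1": "AACCGGTT", "t2": "AACCGGTT", "t3": "GGGGCCCC"}
per_taxon = taxon_rcfv(alignment)
for taxon, value in per_taxon.items():
    print(f"{taxon}: RCFV contribution = {value:.4f}")
print(f"dataset RCFV = {sum(per_taxon.values()):.4f}")
print(f"GC skew of GGGC = {skew('GGGC', 'G', 'C'):.2f}")
```

A taxon whose contribution is much larger than the others is a candidate source of compositional heterogeneity, which is one of the systematic-error signals the abstract describes.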

SNU Open Repository and Archive

Suprafamilial relationships among Rodentia and the phylogenetic effect of removing fast-evolving nucleotides in mitochondrial, exon and intron fragments

Author: Arnal Véronique
Forty Ellen
Matthee Conrad A
Montgelard Claudine
Publication venue: BioMed Central
Publication date: 01/01/2008

The number of rodent clades identified above the family level is contentious, and to date, no consensus has been reached on the basal evolutionary relationships among all rodent families. Rodent suprafamilial phylogenetic relationships are investigated in the present study using approximately 7600 nucleotide characters derived from two mitochondrial genes (Cytochrome b and 12S rRNA), two nuclear exons (IRBP and vWF) and four nuclear introns (MGF, PRKC, SPTBN, THY). Because increasing the number of nucleotides does not necessarily increase phylogenetic signal (especially if the data are saturated), we assess the potential impact of saturation for each dataset by removing the fastest-evolving positions that have been recognized as sources of inconsistencies in phylogenetics.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

Stellenbosch University SUNScholar Repository

Investigating Evolutionary History Using Phylogenomics

Author: Tong Kwei Jun
Publication venue: Faculty of Science, School of Life and Environmental Sciences
Publication date: 14/03/2018

Reconstructing the Tree of Life is one of the principal aims of evolutionary biology. The development of molecular phylogenetics to elucidate evolutionary history has complemented palaeontology, biogeography, and archaeology in elucidating biological history. The development of molecular-clock analyses allowed evolutionary timescales to be estimated using nucleotide sequences and other products of the evolutionary process. Until recently, the twin challenges of molecular dating were in obtaining sufficient data and developing robust methods. The former concern is now less important as high-throughput sequencing technology allows entire genomes to be sampled. Genome-scale data enhance statistical power, but accompanying this wealth of data is a new suite of analytical challenges. One of these key challenges is analysing these data in synthesis with the palaeontological record without statistical overparameterisation. There are also aspects of the evolutionary process, such as among-lineage rate variation, that can affect the precision and accuracy of current methods. In this thesis, I first use the richest nucleotide sequence data set of insects available to estimate an authoritative insect evolutionary timescale that dates the origins and diversification of every major insect order. I then focus on molecular-clock methods by testing their performance in inferring evolutionary rates from time-structured data, common in the study of ancient DNA. I find that among-lineage rate variation and phylo-temporal clustering affect rate estimates. I also study data partitioning, a common technique used to optimise the analysis of multilocus data where independent parameters are applied across different subsets of the data. New data from the genomic revolution give biologists new opportunities to re-examine enduring questions about the evolutionary process. Here, I use phylogenetic tools to show that evolution leaves figurative fingerprints on genomes over millions of years.

Sydney eScholarship