
    A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

    Protein-coding genes in eukaryotes are interrupted by introns, but intron densities differ widely between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. Excellent agreement between the MCMC and ML inferences is demonstrated, whereas Dollo parsimony introduces a noticeable bias in the estimates, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches, including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.

    Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019

    Background: In an era of shifting global agendas and expanded emphasis on non-communicable diseases and injuries along with communicable diseases, sound evidence on trends by cause at the national level is essential. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) provides a systematic scientific assessment of published, publicly available, and contributed data on incidence, prevalence, and mortality for a mutually exclusive and collectively exhaustive list of diseases and injuries. Methods: GBD estimates incidence, prevalence, mortality, years of life lost (YLLs), years lived with disability (YLDs), and disability-adjusted life-years (DALYs) due to 369 diseases and injuries, for two sexes, and for 204 countries and territories. Input data were extracted from censuses, household surveys, civil registration and vital statistics, disease registries, health service use, air pollution monitors, satellite imaging, disease notifications, and other sources. Cause-specific death rates and cause fractions were calculated using the Cause of Death Ensemble model and spatiotemporal Gaussian process regression. Cause-specific deaths were adjusted to match the total all-cause deaths calculated as part of the GBD population, fertility, and mortality estimates. Deaths were multiplied by standard life expectancy at each age to calculate YLLs. A Bayesian meta-regression modelling tool, DisMod-MR 2.1, was used to ensure consistency between incidence, prevalence, remission, excess mortality, and cause-specific mortality for most causes. Prevalence estimates were multiplied by disability weights for mutually exclusive sequelae of diseases and injuries to calculate YLDs. We considered results in the context of the Socio-demographic Index (SDI), a composite indicator of income per capita, years of schooling, and fertility rate in females younger than 25 years. 
Uncertainty intervals (UIs) were generated for every metric using the 25th and 975th ordered 1000 draw values of the posterior distribution. Findings: Global health has steadily improved over the past 30 years as measured by age-standardised DALY rates. After taking into account population growth and ageing, the absolute number of DALYs has remained stable. Since 2010, the pace of decline in global age-standardised DALY rates has accelerated in age groups younger than 50 years compared with the 1990–2010 time period, with the greatest annualised rate of decline occurring in the 0–9-year age group. Six infectious diseases were among the top ten causes of DALYs in children younger than 10 years in 2019: lower respiratory infections (ranked second), diarrhoeal diseases (third), malaria (fifth), meningitis (sixth), whooping cough (ninth), and sexually transmitted infections (which, in this age group, is fully accounted for by congenital syphilis; ranked tenth). In adolescents aged 10–24 years, three injury causes were among the top causes of DALYs: road injuries (ranked first), self-harm (third), and interpersonal violence (fifth). Five of the causes that were in the top ten for ages 10–24 years were also in the top ten in the 25–49-year age group: road injuries (ranked first), HIV/AIDS (second), low back pain (fourth), headache disorders (fifth), and depressive disorders (sixth). In 2019, ischaemic heart disease and stroke were the top-ranked causes of DALYs in both the 50–74-year and 75-years-and-older age groups. Since 1990, there has been a marked shift towards a greater proportion of burden due to YLDs from non-communicable diseases and injuries. In 2019, there were 11 countries where non-communicable disease and injury YLDs constituted more than half of all disease burden. 
Decreases in age-standardised DALY rates have accelerated over the past decade in countries at the lower end of the SDI range, while improvements have started to stagnate or even reverse in countries with higher SDI. Interpretation: As disability becomes an increasingly large component of disease burden and a larger component of health expenditure, greater research and development investment is needed to identify new, more effective intervention strategies. With a rapidly ageing global population, the demands on health services to deal with disabling outcomes, which increase with age, will require policy makers to anticipate these changes. The mix of universal and more geographically specific influences on health reinforces the need for regular reporting on population health in detail and by underlying cause to help decision makers to identify success stories of disease control to emulate, as well as opportunities to improve. Funding: Bill & Melinda Gates Foundation. © 2020 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.

    Global, regional, and national age-sex-specific mortality for 282 causes of death in 195 countries and territories, 1980-2017: a systematic analysis for the Global Burden of Disease Study 2017.

    BACKGROUND: Global development goals increasingly rely on country-specific estimates for benchmarking a nation's progress. To meet this need, the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2016 estimated global, regional, national, and, for selected locations, subnational cause-specific mortality beginning in the year 1980. Here we report an update to that study, making use of newly available data and improved methods. GBD 2017 provides a comprehensive assessment of cause-specific mortality for 282 causes in 195 countries and territories from 1980 to 2017. METHODS: The causes of death database is composed of vital registration (VR), verbal autopsy (VA), registry, survey, police, and surveillance data. GBD 2017 added ten VA studies, 127 country-years of VR data, 502 cancer-registry country-years, and an additional surveillance country-year. Expansions of the GBD cause of death hierarchy resulted in 18 additional causes estimated for GBD 2017. Newly available data led to subnational estimates for five additional countries: Ethiopia, Iran, New Zealand, Norway, and Russia. Deaths assigned International Classification of Diseases (ICD) codes for non-specific, implausible, or intermediate causes of death were reassigned to underlying causes by redistribution algorithms that were incorporated into uncertainty estimation. We used statistical modelling tools developed for GBD, including the Cause of Death Ensemble model (CODEm), to generate cause fractions and cause-specific death rates for each location, year, age, and sex. Instead of using UN estimates as in previous versions, GBD 2017 independently estimated population size and fertility rate for all locations. Years of life lost (YLLs) were then calculated as the sum of each death multiplied by the standard life expectancy at each age. All rates reported here are age-standardised.

    Islands in the oil: Quantifying salt marsh shoreline erosion after the Deepwater Horizon oiling

    Qualitative inferences and sparse bay-wide measurements suggest that shoreline erosion increased after the 2010 BP Deepwater Horizon (DWH) disaster, but quantifying the impacts has been elusive at the landscape scale. We quantified the shoreline erosion of 46 islands before and after the DWH oil spill to determine how much shoreline was lost, whether the losses were temporary, and whether recovery/restoration occurred. The erosion rates at the oiled islands increased to 275% in the first six months after the oiling, were 200% of that of the unoiled islands for the first 2.5 years after the oiling, and twelve times the average land loss in the deltaic plain of 0.4% y⁻¹ from 1988 to 2011. These results support the hypothesis that oiling compromised the belowground biomass of the emergent vegetation. The islands are, in effect, sentinels of marsh stability already in decline before the oil spill. © 2016 The Authors. Published by Elsevier Ltd.

    Phylogenetic Evidence for Horizontal Transfer of mutS Alleles among Naturally Occurring Escherichia coli Strains

    mutS mutators accelerate the bacterial mutation rate 100- to 1,000-fold and relax the barriers that normally restrict homeologous recombination. These mutators thus afford the opportunity for horizontal exchange of DNA between disparate strains. While much is known regarding the mutS phenotype, the evolutionary structure of the mutS(+) gene in Escherichia coli remains unclear. The physical proximity of mutS to an adjacent polymorphic region of the chromosome suggests that this gene itself may be subject to horizontal transfer and recombination events. To test this notion, a phylogenetic approach was employed that compared gene phylogeny to strain phylogeny, making it possible to identify E. coli strains in which mutS alleles have recombined. Comparison of mutS phylogeny against predicted E. coli “whole-chromosome” phylogenies (derived from multilocus enzyme electrophoresis and mdh sequences) revealed striking levels of phylogenetic discordance among mutS alleles and their respective strains. We interpret these incongruences as signatures of horizontal exchange among mutS alleles. Examination of additional sites surrounding mutS also revealed incongruous distributions compared to E. coli strain phylogeny. This suggests that other regional sequences are equally subject to horizontal transfer, supporting the hypothesis that the 61.5-min mutS-rpoS region is a recombinational hot spot within the E. coli chromosome. Furthermore, these data are consistent with a mechanism for stabilizing adaptive changes promoted by mutS mutators through rescue of defective mutS alleles with wild-type sequences