11 research outputs found
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
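As a rough illustration of one of the baseline families named in the abstract above (Term Frequency-Inverse Document Frequency), the following minimal sketch ranks candidate titles against a seed title by cosine similarity of TF-IDF vectors. It is not the consortium's implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names (`tfidf_vectors`, `cosine`, `rank_candidates`) are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real systems
    # would also handle punctuation, stemming and stop words.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Assumption: smoothed IDF, so terms present in every document keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(seed, candidates):
    """Return candidate indices ordered from most to least similar to the seed."""
    vectors = tfidf_vectors([seed] + candidates)
    scored = [(cosine(vectors[0], v), i) for i, v in enumerate(vectors[1:])]
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

With a seed such as "hopanoid cyclase genes in bacteria", a candidate that shares the terms "cyclase", "genes" and "bacteria" would be ranked ahead of one that shares only "in". A benchmark of the kind described above would then compare such rankings against the expert relevance annotations.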
Novel hopanoid cyclases from the environment
Hopanoids are ubiquitous isoprenoid lipids found in modern biota, in recent sediments and in low-maturity sedimentary rocks. Because these lipids primarily are derived from bacteria, they are used as proxies to help decipher geobiological communities. To date, much of the information about sources of hopanoids has come from surveys of culture collections, an approach that does not address the vast fraction of prokaryotic communities that remains uncharacterized. Here we investigated the phylogeny of hopanoid producers using culture-independent methods. We obtained 79 new sequences of squalene-hopene cyclase genes (sqhC) from marine and lacustrine bacterioplankton and analysed them along with all 31 sqhC fragments available from existing metagenomics libraries. The environmental sqhCs average only 60% translated amino acid identity to their closest relatives in public databases. The data imply that the sources of these important geologic biomarkers remain largely unknown. In particular, genes affiliated with known cyanobacterial sequences were not detected in the contemporary environments analysed here, yet the geologic record contains abundant hopanoids apparently of cyanobacterial origin. The data also suggest that hopanoid biosynthesis is uncommon: < 10% of bacterial species may be capable of producing hopanoids. A better understanding of the contemporary distribution of hopanoid biosynthesis may reveal fundamental insight about the function of these compounds, the organisms in which they are found, and the environmental signals preserved in the sedimentary record
Radiocarbon-based ages and growth rates of bamboo corals from the Gulf of Alaska
Deep-sea coral communities have long been recognized by fishermen as areas that support large populations of commercial fish. As a consequence, many deep-sea coral communities are threatened by bottom trawling. Successful management and conservation of this widespread deep-sea habitat requires knowledge of the age and growth rates of deep-sea corals. These organisms also contain important archives of intermediate and deep-water variability, and are thus of interest in the context of decadal to century-scale climate dynamics. Here, we present Δ14C data that suggest that bamboo corals from the Gulf of Alaska are long-lived (75-126 years) and that they acquire skeletal carbon from two distinct sources. Independent verification of our growth rate estimates and coral ages is obtained by counting seasonal Sr/Ca cycles and probable lunar cycle growth bands.
Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010
Background: Measuring disease and injury burden in populations requires a composite metric that captures both premature mortality and the prevalence and severity of ill-health. The 1990 Global Burden of Disease study proposed disability-adjusted life years (DALYs) to measure disease burden. No comprehensive update of disease burden worldwide incorporating a systematic reassessment of disease and injury-specific epidemiology has been done since the 1990 study. We aimed to calculate disease burden worldwide and for 21 regions for 1990, 2005, and 2010 with methods to enable meaningful comparisons over time. Methods: We calculated DALYs as the sum of years of life lost (YLLs) and years lived with disability (YLDs). DALYs were calculated for 291 causes, 20 age groups, both sexes, and for 187 countries, and aggregated to regional and global estimates of disease burden for three points in time with strictly comparable definitions and methods. YLLs were calculated from age-sex-country-time-specific estimates of mortality by cause, with death by standardised lost life expectancy at each age. YLDs were calculated as prevalence of 1160 disabling sequelae, by age, sex, and cause, and weighted by new disability weights for each health state. Neither YLLs nor YLDs were age-weighted or discounted. Uncertainty around cause-specific DALYs was calculated incorporating uncertainty in levels of all-cause mortality, cause-specific mortality, prevalence, and disability weights. Findings: Global DALYs remained stable from 1990 (2.503 billion) to 2010 (2.490 billion). Crude DALYs per 1000 decreased by 23% (472 per 1000 to 361 per 1000). An important shift has occurred in DALY composition with the contribution of deaths and disability among children (younger than 5 years of age) declining from 41% of global DALYs in 1990 to 25% in 2010.
YLLs typically account for about half of disease burden in more developed regions (high-income Asia Pacific, western Europe, high-income North America, and Australasia), rising to over 80% of DALYs in sub-Saharan Africa. In 1990, 47% of DALYs worldwide were from communicable, maternal, neonatal, and nutritional disorders, 43% from non-communicable diseases, and 10% from injuries. By 2010, this had shifted to 35%, 54%, and 11%, respectively. Ischaemic heart disease was the leading cause of DALYs worldwide in 2010 (up from fourth rank in 1990, increasing by 29%), followed by lower respiratory infections (top rank in 1990; 44% decline in DALYs), stroke (fifth in 1990; 19% increase), diarrhoeal diseases (second in 1990; 51% decrease), and HIV/AIDS (33rd in 1990; 351% increase). Major depressive disorder increased from 15th to 11th rank (37% increase) and road injury from 12th to 10th rank (34% increase). Substantial heterogeneity exists in rankings of leading causes of disease burden among regions. Interpretation: Global disease burden has continued to shift away from communicable to non-communicable diseases and from premature death to years lived with disability. In sub-Saharan Africa, however, many communicable, maternal, neonatal, and nutritional disorders remain the dominant causes of disease burden. The rising burden from mental and behavioural disorders, musculoskeletal disorders, and diabetes will impose new challenges on health systems. Regional heterogeneity highlights the importance of understanding local burden of disease and setting goals and targets for the post-2015 agenda taking such patterns into account. Because of improved definitions, methods, and data, these results for 1990 and 2010 supersede all previously published Global Burden of Disease results.
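The arithmetic described in the abstract above (DALYs as the sum of YLLs and YLDs, YLDs as prevalence weighted by disability weights, and crude rates per 1000) can be sketched as follows. This is a simplified illustration only, not the study's method: it ignores age, sex and country stratification, comorbidity adjustment and uncertainty, and the function names and any inputs other than the figures quoted in the abstract are hypothetical.

```python
def ylds_from_sequelae(sequelae):
    """Sum prevalence x disability weight over sequelae.

    `sequelae` is a list of (prevalent_cases, disability_weight) pairs;
    weights are assumed to lie between 0 (full health) and 1.
    """
    return sum(cases * weight for cases, weight in sequelae)


def dalys(ylls, ylds):
    """DALYs as the sum of years of life lost and years lived with disability."""
    return ylls + ylds


def crude_rate_per_1000(total_dalys, population):
    """Crude DALY rate per 1000 people."""
    return 1000.0 * total_dalys / population


def percent_change(old, new):
    """Signed percentage change from `old` to `new`."""
    return 100.0 * (new - old) / old


# Figures quoted in the abstract: crude DALYs per 1000 in 1990 and 2010.
change = percent_change(472, 361)
```

Here `change` evaluates to about -23.5, which is in line with the roughly 23% decrease reported; the published figure presumably derives from unrounded rates.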
Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010
BACKGROUND: Non-fatal health outcomes from diseases and injuries are a crucial consideration in the promotion and monitoring of individual and population health. The Global Burden of Disease (GBD) studies done in 1990 and 2000 have been the only studies to quantify non-fatal health outcomes across an exhaustive set of disorders at the global and regional level. Neither effort quantified uncertainty in prevalence or years lived with disability (YLDs). METHODS: Of the 291 diseases and injuries in the GBD cause list, 289 cause disability. For 1160 sequelae of the 289 diseases and injuries, we undertook a systematic analysis of prevalence, incidence, remission, duration, and excess mortality. Sources included published studies, case notification, population-based cancer registries, other disease registries, antenatal clinic serosurveillance, hospital discharge data, ambulatory care data, household surveys, other surveys, and cohort studies. For most sequelae, we used a Bayesian meta-regression method, DisMod-MR, designed to address key limitations in descriptive epidemiological data, including missing data, inconsistency, and large methodological variation between data sources. For some disorders, we used natural history models, geospatial models, back-calculation models (models calculating incidence from population mortality rates and case fatality), or registration completeness models (models adjusting for incomplete registration with health-system access and other covariates). Disability weights for 220 unique health states were used to capture the severity of health loss. YLDs by cause at age, sex, country, and year levels were adjusted for comorbidity with simulation methods. We included uncertainty estimates at all stages of the analysis. FINDINGS: Global prevalence for all ages combined in 2010 across the 1160 sequelae ranged from fewer than one case per 1 million people to 350,000 cases per 1 million people.
Prevalence and severity of health loss were weakly correlated (correlation coefficient -0·37). In 2010, there were 777 million YLDs from all causes, up from 583 million in 1990. The main contributors to global YLDs were mental and behavioural disorders, musculoskeletal disorders, and diabetes or endocrine diseases. The leading specific causes of YLDs were much the same in 2010 as they were in 1990: low back pain, major depressive disorder, iron-deficiency anaemia, neck pain, chronic obstructive pulmonary disease, anxiety disorders, migraine, diabetes, and falls. Age-specific prevalence of YLDs increased with age in all regions and has decreased slightly from 1990 to 2010. Regional patterns of the leading causes of YLDs were more similar compared with years of life lost due to premature mortality. Neglected tropical diseases, HIV/AIDS, tuberculosis, malaria, and anaemia were important causes of YLDs in sub-Saharan Africa. INTERPRETATION: Rates of YLDs per 100,000 people have remained largely constant over time but rise steadily with age. Population growth and ageing have increased YLD numbers and crude rates over the past two decades. Prevalences of the most common causes of YLDs, such as mental and behavioural disorders and musculoskeletal disorders, have not decreased. Health systems will need to address the needs of the rising numbers of individuals with a range of disorders that largely cause disability but not mortality. Quantification of the burden of non-fatal health outcomes will be crucial to understand how well health systems are responding to these challenges. Effective and affordable strategies to deal with this rising burden are an urgent priority for health systems in most parts of the world. FUNDING: Bill & Melinda Gates Foundation.
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical science. © The Author(s) 2019. Published by Oxford University Press