73 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
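
One of the baselines named in this abstract, Term Frequency-Inverse Document Frequency, can be illustrated with a minimal sketch. The code below is not the consortium's implementation: the tokenizer, the smoothed IDF formula, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_candidates`) are all assumptions chosen for illustration. It ranks candidate texts by cosine similarity to a seed text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Simplistic tokenizer (an assumption): lowercase, split on non-alphanumerics.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict: term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return vectors


def cosine(u, v):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(seed, candidates):
    # Return candidate indices ordered from most to least similar to the seed.
    vecs = tfidf_vectors([seed] + candidates)
    scores = [(cosine(vecs[0], v), i) for i, v in enumerate(vecs[1:])]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]
```

A call such as `rank_candidates(seed_abstract, candidate_abstracts)` returns candidate positions in descending order of similarity. Production systems typically add stemming, stop-word removal, and a tuned weighting scheme such as BM25.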

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Worldwide trends in underweight and obesity from 1990 to 2022: a pooled analysis of 3663 population-representative studies with 222 million children, adolescents, and adults

    Background: Underweight and obesity are associated with adverse health outcomes throughout the life course. We estimated the individual and combined prevalence of underweight or thinness and obesity, and their changes, from 1990 to 2022 for adults and school-aged children and adolescents in 200 countries and territories. Methods: We used data from 3663 population-based studies with 222 million participants that measured height and weight in representative samples of the general population. We used a Bayesian hierarchical model to estimate trends in the prevalence of different BMI categories, separately for adults (age ≥20 years) and school-aged children and adolescents (age 5–19 years), from 1990 to 2022 for 200 countries and territories. For adults, we report the individual and combined prevalence of underweight (BMI <18·5 kg/m²) and obesity (BMI ≥30 kg/m²). For school-aged children and adolescents, we report thinness (BMI <2 SD below the median of the WHO growth reference) and obesity (BMI >2 SD above the median). Findings: From 1990 to 2022, the combined prevalence of underweight and obesity in adults decreased in 11 countries (6%) for women and 17 (9%) for men with a posterior probability of at least 0·80 that the observed changes were true decreases. The combined prevalence increased in 162 countries (81%) for women and 140 countries (70%) for men with a posterior probability of at least 0·80. In 2022, the combined prevalence of underweight and obesity was highest in island nations in the Caribbean and Polynesia and Micronesia, and countries in the Middle East and north Africa. Obesity prevalence was higher than underweight with posterior probability of at least 0·80 in 177 countries (89%) for women and 145 (73%) for men in 2022, whereas the converse was true in 16 countries (8%) for women, and 39 (20%) for men.
From 1990 to 2022, the combined prevalence of thinness and obesity decreased among girls in five countries (3%) and among boys in 15 countries (8%) with a posterior probability of at least 0·80, and increased among girls in 140 countries (70%) and boys in 137 countries (69%) with a posterior probability of at least 0·80. The countries with highest combined prevalence of thinness and obesity in school-aged children and adolescents in 2022 were in Polynesia and Micronesia and the Caribbean for both sexes, and Chile and Qatar for boys. Combined prevalence was also high in some countries in south Asia, such as India and Pakistan, where thinness remained prevalent despite having declined. In 2022, obesity in school-aged children and adolescents was more prevalent than thinness with a posterior probability of at least 0·80 among girls in 133 countries (67%) and boys in 125 countries (63%), whereas the converse was true in 35 countries (18%) and 42 countries (21%), respectively. In almost all countries for both adults and school-aged children and adolescents, the increases in double burden were driven by increases in obesity, and decreases in double burden by declining underweight or thinness. Interpretation: The combined burden of underweight and obesity has increased in most countries, driven by an increase in obesity, while underweight and thinness remain prevalent in south Asia and parts of Africa. A healthy nutrition transition that enhances access to nutritious foods is needed to address the remaining burden of underweight while curbing and reversing the increase in obesity.

    Common variants in SOX-2 and congenital cataract genes contribute to age-related nuclear cataract

    Nuclear cataract is the most common type of age-related cataract and a leading cause of blindness worldwide. Age-related nuclear cataract is heritable (h² = 0.48), but little is known about specific genetic factors underlying this condition. Here we report findings from the largest multi-ethnic meta-analysis of genome-wide association studies to date (discovery cohort N = 14,151 and replication N = 5299) of the International Cataract Genetics Consortium. We confirmed the known genetic association of CRYAA (rs7278468, P = 2.8 × 10⁻¹⁶) with nuclear cataract and identified five new loci associated with this disease: SOX2-OT (rs9842371, P = 1.7 × 1

    Multiancestry Genome-Wide Association Study of Lipid Levels Incorporating Gene-Alcohol Interactions

    A person's lipid profile is influenced by genetic variants and alcohol consumption, but the contribution of interactions between these exposures has not been studied. We therefore incorporated gene-alcohol interactions into a multiancestry genome-wide association study of levels of high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and triglycerides. We included 45 studies in stage 1 (genome-wide discovery) and 66 studies in stage 2 (focused follow-up), for a total of 394,584 individuals from 5 ancestry groups. Analyses covered the period July 2014-November 2017. Genetic main effects and interaction effects were jointly assessed by means of a 2-degrees-of-freedom (df) test, and a 1-df test was used to assess the interaction effects alone. Variants at 495 loci were at least suggestively associated (P < 1 × 10⁻⁶) with lipid levels in stage 1 and were evaluated in stage 2, followed by combined analyses of stage 1 and stage 2. In the combined analysis of stages 1 and 2, a total of 147 independent loci were associated with lipid levels at P < 5 × 10⁻⁸ using 2-df tests, of which 18 were novel. No genome-wide-significant associations were found when testing the interaction effect alone. The novel loci included several genes (proprotein convertase subtilisin/kexin type 5 (PCSK5), vascular endothelial growth factor B (VEGFB), and apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1 (APOBEC1) complementation factor (A1CF)) that have a putative role in lipid metabolism on the basis of existing evidence from cellular and experimental models.
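
The 2-df joint test and 1-df interaction test mentioned in this abstract can be sketched as Wald tests. This is an illustration under assumptions, not the study's pipeline: it assumes a per-variant regression has already produced a genetic main-effect estimate, an interaction estimate, and their variances and covariance, and the function names `wald_2df` and `wald_1df` are hypothetical. The p-values use the closed-form chi-square tail probabilities for 2 and 1 degrees of freedom.

```python
import math


def wald_2df(b_g, b_gx, var_g, var_gx, cov):
    """Joint Wald test of the main effect b_g and the interaction effect b_gx.

    Computes b^T * inverse(Sigma) * b for the 2x2 covariance matrix
    Sigma = [[var_g, cov], [cov, var_gx]] and returns (statistic, p_value).
    """
    det = var_g * var_gx - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    stat = (var_gx * b_g ** 2 - 2 * cov * b_g * b_gx + var_g * b_gx ** 2) / det
    # Chi-square upper tail with 2 df has the closed form exp(-x / 2).
    return stat, math.exp(-stat / 2)


def wald_1df(b_gx, var_gx):
    """Wald test of the interaction effect alone; returns (statistic, p_value)."""
    if var_gx <= 0:
        raise ValueError("variance must be positive")
    stat = b_gx ** 2 / var_gx
    # Chi-square upper tail with 1 df equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))
```

Under these assumptions, a variant would count as genome-wide significant in the joint test when the p-value returned by `wald_2df` falls below 5 × 10⁻⁸, the threshold quoted in the abstract.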

    A multi-ancestry genome-wide study incorporating gene-smoking interactions identifies multiple new loci for pulse pressure and mean arterial pressure

    Elevated blood pressure (BP), a leading cause of global morbidity and mortality, is influenced by both genetic and lifestyle factors. Cigarette smoking is one such lifestyle factor. Across five ancestries, we performed a genome-wide gene-smoking interaction study of mean arterial pressure (MAP) and pulse pressure (PP) in 129 913 individuals in stage 1 and follow-up analysis in 480 178 additional individuals in stage 2. We report here 136 loci significantly associated with MAP and/or PP. Of these, 61 were previously published through main-effect analysis of BP traits, 37 were recently reported by us for systolic BP and/or diastolic BP through gene-smoking interaction analysis and 38 were newly identified (P < 5 × 10⁻⁸, false discovery rate < 0.05). We also identified nine new signals near known loci. Of the 136 loci, 8 showed significant interaction with smoking status. They include CSMD1, previously reported for insulin resistance and BP in spontaneously hypertensive rats. Many of the 38 new loci show biologic plausibility for a role in BP regulation. SLC26A7 encodes a chloride/bicarbonate exchanger expressed in the renal outer medullary collecting duct. AVPR1A is widely expressed, including in vascular smooth muscle cells, kidney, myocardium and brain. FHAD1 is a long non-coding RNA overexpressed in heart failure. TMEM51 was associated with contractile function in cardiomyocytes. CASP9 plays a central role in cardiomyocyte apoptosis. Thirty novel loci were identified only in African ancestry. Our findings highlight the value of multi-ancestry investigations, particularly in studies of interaction with lifestyle factors, where genomic and lifestyle differences may contribute to novel findings.

    Heterogeneous contributions of change in population distribution of body mass index to change in obesity and underweight NCD Risk Factor Collaboration (NCD-RisC)

    From 1985 to 2016, the prevalence of underweight decreased, and that of obesity and severe obesity increased, in most regions, with significant variation in the magnitude of these changes across regions. We investigated how much change in mean body mass index (BMI) explains changes in the prevalence of underweight, obesity, and severe obesity in different regions using data from 2896 population-based studies with 187 million participants. Changes in the prevalence of underweight and total obesity, and to a lesser extent severe obesity, are largely driven by shifts in the distribution of BMI, with smaller contributions from changes in the shape of the distribution. In East and Southeast Asia and sub-Saharan Africa, the underweight tail of the BMI distribution was left behind as the distribution shifted. There is a need for policies that address all forms of malnutrition by making healthy foods accessible and affordable, while restricting unhealthy foods through fiscal and regulatory restrictions.