31 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
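    The abstract above names Okapi BM25 and TF-IDF as baselines and notes that the methods tend to return distinct sets of recommendations. As a rough illustration only (not the consortium's implementation), the following sketch scores a small corpus against a seed text with both methods and measures how much their top-k sets overlap. The whitespace tokenizer, the parameters k1=1.5 and b=0.75, the smoothed IDF variants, and all function names are assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def document_frequencies(tokenized_docs):
    # Count in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized_docs:
        df.update(set(tokens))
    return df


def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Okapi BM25 with a smoothed, always-positive IDF; k1 and b are assumed defaults.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    df = document_frequencies(tokenized)
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def tfidf_cosine_scores(query, docs):
    # Cosine similarity between TF-IDF vectors of the query and each document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    df = document_frequencies(tokenized)

    def vectorize(tokens):
        tf = Counter(tokens)
        return {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}

    def norm(vec):
        return math.sqrt(sum(w * w for w in vec.values()))

    q_vec = vectorize(tokenize(query))
    scores = []
    for tokens in tokenized:
        d_vec = vectorize(tokens)
        dot = sum(w * d_vec.get(t, 0.0) for t, w in q_vec.items())
        denom = norm(q_vec) * norm(d_vec)
        scores.append(dot / denom if denom else 0.0)
    return scores


def top_k(scores, k):
    # Indices of the k highest-scoring documents.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    # Overlap between two recommendation sets (1.0 means identical sets).
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

    Calling `jaccard(top_k(bm25_scores(seed, docs), k), top_k(tfidf_cosine_scores(seed, docs), k))` gives a single overlap figure; a value well below 1.0 on a real corpus would be consistent with the abstract's observation that the methods produce distinct recommendation sets.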

    Differences in height explain gender differences in the response to the oral glucose tolerance test - the AusDiab study

    Aim: To determine the extent of gender-related differences in the prevalence of glucose intolerance for the Australian population and whether body size may explain such differences. Methods: Cross-sectional data were collected from a national cohort of 11 247 Australians aged ≥ 25 years. Glucose tolerance status was assessed according to both fasting plasma glucose (FPG) and 2-h plasma glucose (2hPG) levels following a 75-g oral glucose tolerance test (OGTT). Anthropometric and glycated haemoglobin measurements were also made. Results: Undiagnosed diabetes and non-diabetic glucose abnormalities were more prevalent among men than women when based only on the FPG results (diabetes: men 2.2%, women 1.6%, P = 0.02; impaired fasting glycaemia: men 12.3%, women 6.6%, P < 0.001). In contrast, 16.0% of women and 13.0% of men had a 2hPG abnormality (either diabetes or impaired glucose tolerance, P = 0.14). Women had a mean FPG 0.3 mmol/l lower than men (P < 0.001), but 2hPG 0.3 mmol/l higher (P = 0.002) and FPG-2hPG increment 0.5 mmol/l greater (P < 0.001). The gender difference in mean 2hPG and FPG-2hPG increment disappeared following adjustment for height. For both genders, those in the shortest height quartile had 2hPG levels 0.5 mmol/l higher than the tallest quartile, but height showed almost no relationship with the FPG. Conclusions: Men and women had different glycaemic profiles; women had higher mean 2hPG levels, despite lower fasting levels. It appeared that the higher 2hPG levels for women related to lesser height and may be a consequence of using a fixed glucose load in the OGTT, irrespective of body size.