15 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.

    Correlations between physical and chemical defences in plants: tradeoffs, syndromes, or just many different ways to skin a herbivorous cat?

    • Most plant species have a range of traits that deter herbivores. However, understanding of how different defences are related to one another is surprisingly weak. Many authors argue that defence traits trade off against one another, while others argue that they form coordinated defence syndromes. • We collected a dataset of unprecedented taxonomic and geographic scope (261 species spanning 80 families, from 75 sites across the globe) to investigate relationships among four chemical and six physical defences. • Five of the 45 pairwise correlations between defence traits were significant and three of these were tradeoffs. The relationship between species’ overall chemical and physical defence levels was marginally nonsignificant (P = 0.08), and remained nonsignificant after accounting for phylogeny, growth form and abundance. Neither categorical principal component analysis (PCA) nor hierarchical cluster analysis supported the idea that species displayed defence syndromes. • Our results do not support arguments for tradeoffs or for coordinated defence syndromes. Rather, plants display a range of combinations of defence traits. We suggest this lack of consistent defence syndromes may be adaptive, resulting from selective pressure to deploy a different combination of defences to coexisting species.

    'Hepitopes': A Database of HLA Class I Epitopes in Hepatitis B Virus

    This is a spreadsheet containing the results of a systematic literature review to identify and curate all known and putative HLA Class I epitopes in hepatitis B virus (HBV). We performed our literature review in January 2016, searching Medline and Embase via the OVID search interface made available by the University of Oxford. No date restrictions were imposed; Medline was searched from 1946 to 2016 and Embase from 1974 to 2016. The relevant subject headings for Epitopes, HLA Antigens, CD8 Antigens and Hepatitis B from the thesauri (MeSH and EMTREE) were exploded and searched. In addition, the terms epitope* (to pick up singular and plural), Hepatitis B and HBV were searched in the title and abstract fields. We also identified additional references by searching the bibliographies of relevant articles. As well as details of each citation, we recorded the HBV protein and sequence-numbered location of each epitope (based on a published reference strain; Liu WC, et al. Aligning to the sample-specific reference sequence to optimize the accuracy of next-generation sequencing analysis for hepatitis B virus. Hepatol Int 2016;10:147-157). Each citation was reviewed by a primary reviewer and then again by a second expert to ensure the records are as accurate as possible. Over time, we aim to update and refine the dataset such that it becomes a growing resource for virologists, immunologists and those in the field of vaccine design. A live interactive version can be viewed at http://www.expmedndm.ox.ac.uk/hepitopes. The visualisation code can be viewed at https://github.com/ox-it/hepitopes. As the database is updated and refined, we will submit updated versions to Figshare. In parallel, we have uploaded a static version to Oxford Research Archive (ORA) at the point of initial publication; DOI: 10.5287/bodleian:zr0VAr78q