23 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performance. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.

    Spectral and atmospheric characterization of 51 Eridani b using VLT/SPHERE

    51 Eridani b is an exoplanet around a young (20 Myr) nearby (29.4 pc) F0-type star, recently discovered by direct imaging. Being only 0.5" away from its host star it is well suited for spectroscopic analysis using integral field spectrographs. We aim to refine the atmospheric properties of this planet and to further constrain the architecture of the system by searching for additional companions. Using the SPHERE instrument at the VLT we extend the spectral coverage of the planet to the complete Y- to H-band range and provide photometry in the K12-bands (2.11, 2.25 micron). The object is compared to other cool and peculiar dwarfs. Furthermore, the posterior probability distributions of cloudy and clear atmospheric models are explored using MCMC. We verified our methods by determining atmospheric parameters for the two benchmark brown dwarfs Gl 570D and HD 3651B. For probing the innermost region for additional companions, archival VLT-NACO (L') SAM data are used. We present the first spectrophotometric measurements in the Y- and K-bands for the planet and revise its J-band flux to values 40% fainter than previous measurements. Cloudy models with uniform cloud coverage provide a good match to the data. We derive the temperature, radius, surface gravity, metallicity and cloud sedimentation parameter f_sed. We find that the atmosphere is highly super-solar (Fe/H~1.0) with an extended, thick cloud cover of small particles. The model radius and surface gravity suggest planetary masses of about 9 M_jup. The evolutionary model only provides a lower mass limit of >2 M_jup (for pure hot-start). The cold-start model cannot explain the planet's luminosity. The SPHERE and NACO/SAM detection limits probe the 51 Eri system at Solar System scales and exclude brown-dwarf companions more massive than 20 M_jup beyond separations of ~2.5 au and giant planets more massive than 2 M_jup beyond 9 au. Comment: 29 pages, 31 figures, accepted for publication in A&