15 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.
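For readers unfamiliar with the Term Frequency-Inverse Document Frequency baseline named in this abstract, the general idea can be sketched as follows. This is a minimal illustration only, not the consortium's implementation: the whitespace tokenizer, the smoothed IDF variant, and the function names `tfidf_vectors`, `cosine` and `rank_candidates` are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed IDF, one of several common variants.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(seed, candidates):
    """Return candidate indices ordered from most to least similar to the seed."""
    vectors = tfidf_vectors([seed] + candidates)
    seed_vec = vectors[0]
    scores = [(cosine(seed_vec, vec), i) for i, vec in enumerate(vectors[1:])]
    # Sort by descending score; ties are broken by original position.
    return [i for _, i in sorted(scores, key=lambda pair: (-pair[0], pair[1]))]
```

Calling `rank_candidates(seed_text, candidate_texts)` with a seed title or abstract and a list of candidate texts returns the candidate positions in ranked order. A benchmark like the one described would then compare such rankings against the expert relevance annotations.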

    Quantitative vegetation reconstruction from pollen analysis and historical inventory data around a Danish small forest hollow

    Questions: Can the model performance of the landscape reconstruction algorithm (LRA) for small forest hollows be validated through comparison to inventory-based vegetation reconstructions from the last 150 yr? Does the application of LRA and the comparison to historical data enhance interpretation of the pollen record?
    Location: Denmark. The Gribskov-Ostrup small forest hollow (56°N, 12°20′E, 44 m a.s.l.) in the forest of Gribskov, eastern Denmark.
    Methods: Pollen analysis was carried out on a small forest hollow, and LRA used to derive pollen-based quantitative estimates of past vegetation. Historical forest inventory data and maps were used to reconstruct the vegetation within three different circles around the hollow (20, 50 and 200 m ring widths) for five time periods during the last 150 yr. The results of the two approaches were compared in order to evaluate model performance, and the LRA-based reconstruction used to describe how the model changes interpretation of vegetation development during the last ca. 6500 yr compared to the use of pollen percentages alone.
    Results: Distance-weighted inventory-based reconstructions within 200 m of the hollow's edge provide the best match with the LRA-modelled vegetation. Precise validation of the model is not possible due to insufficient historical data, but the comparison indicates that the LRA reconstruction for Gribskov tends to (1) underestimate tree cover and overestimate open areas, (2) give a too high representation of on-site pollen types, (3) give an underestimation of Fagus and (4) a small overestimation of Quercus and Corylus. Despite these uncertainties, application of the LRA model shows a higher degree of openness than would be apparent from the uncorrected pollen diagram, and makes it possible to attempt to distinguish changes at the local scale from regional vegetation changes, thus giving a clearer picture of the vegetation changes at the site.
    Conclusions: We demonstrate that the estimates of the LRA model applied to pollen data from small forest hollows can be compared with small-scale historical data to evaluate model performance.

    A comparison of charcoal measurements for reconstruction of Mediterranean paleo-fire frequency in the mountains of Corsica

    Fire-history reconstructions inferred from sedimentary charcoal records are based on measuring sieved charcoal fragment area, estimating fragment volume, or counting fragments. Similar fire histories are reconstructed from these three approaches for boreal lake sediment cores, using locally defined thresholds. Here, we test the same approach for a montane Mediterranean lake in which taphonomical processes might differ from boreal lakes through fragmentation of charcoal particles. The Mediterranean charcoal series are characterized by highly variable charcoal accumulation rates. Results there indicate that the three proxies do not provide comparable fire histories. The differences are attributable to charcoal fragmentation. This could be linked to fire type (crown or surface fires) or taphonomical processes, including charcoal transportation in the catchment area or in the sediment. The lack of correlation between the concentration of charcoal and of mineral matter suggests that fragmentation is not linked to erosion. Reconstructions based on charcoal area are more robust and stable than those based on fragment counts. Area-based reconstructions should therefore be used instead of the particle-counting method when fragmentation may influence the fragment abundance.

    Long-term forest dynamics at Gribskov, eastern Denmark with early-Holocene evidence for thermophilous broadleaved tree species

    We report on a full-Holocene pollen, charcoal and macrofossil record from a small forest hollow in Gribskov, eastern Denmark. The Fagus sylvatica pollen record suggests the establishment of a small Fagus population at Gribskov in the early Holocene together with early establishment of other thermophilous broadleaved trees, including Quercus sp., Tilia sp. and Ulmus sp. The macrofossils contribute to the vegetation reconstruction with evidence for local presence of species with low pollen productivity or easily degraded pollen types such as Populus. The charcoal record shows frequent burning during two periods of the early Holocene and from c. 3000 cal. BP to present. The early-Holocene part of the record indicates a highly disturbed forest ecosystem with frequent fires and abundant macrofossils, particularly of Betula sp. and Populus sp. The sediment stratigraphy and age-depth relationships give no clear indication of post-depositional disturbance, although a possible short-lived hiatus occurs around 6500 cal. BP. The early pollen record from thermophilous trees could indicate that there may have been some downwash following sediment desiccation through wood peat layers deposited between c. 6500 and 10,000 cal. BP, but the overall biostratigraphy is consistent with other Danish small hollow records.