9 research outputs found

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed

    Pb Mineral Precipitation in Solutions of Sulfate, Carbonate and Phosphate: Measured and Modeled Pb Solubility and Pb2+ Activity

    Lead (Pb) solubility is commonly limited by dissolution–precipitation reactions of secondary mineral phases in contaminated soils and water. In the research described here, Pb solubility and free Pb2+ ion activities were measured following the precipitation of Pb minerals from aqueous solutions containing sulfate or carbonate in a 1:5 mole ratio in the absence and presence of phosphate over the pH range 4.0–9.0. Using X-ray diffraction and Fourier-transform infrared spectroscopic analysis, we identified anglesite formed in sulfate-containing solutions at low pH. At higher pH, Pb carbonate and carbonate-sulfate minerals, hydrocerussite and leadhillite, were formed in preference to anglesite. Precipitates formed in the Pb-carbonate systems over the pH range of 6 to 9 were composed of cerussite and hydrocerussite, with the latter favored only at the highest pH investigated. The addition of phosphate into the Pb-sulfate and Pb-carbonate systems resulted in the precipitation of Pb3(PO4)2 and structurally related pyromorphite minerals and prevented Pb sulfate and carbonate mineral formation. Phosphate increased the efficiency of Pb removal from solution and decreased free Pb2+ ion activity, causing over 99.9% of Pb to be precipitated. Free Pb2+ ion activities measured using the ion-selective electrode revealed lower values than predicted from thermodynamic constants, indicating that the precipitated minerals may have lower KSP values than generally reported in thermodynamic databases. Conversely, dissolved Pb was frequently greater than predicted based on a speciation model using accepted thermodynamic constants for Pb ion-pair formation in solution. The tendency of the thermodynamic models to underestimate Pb solubility while overestimating free Pb2+ activity in these systems, at least in the higher pH range, indicates that soluble Pb ion-pair formation constants and KSP values need correction in the models.

    Trace element associations with Fe- and Mn-oxides in soil nodules: Comparison of selective dissolution with electron probe microanalysis

    Selective dissolution methods have been widely used to gain insight into trace element association with solid phases. Modern instrumental techniques offer many tools to test the validity of selective dissolution methods and should be systematically used to this end. The association of trace elements with Fe- and Mn-oxides in soil nodules has been studied here by electron probe microanalysis. The results were compared with findings from an earlier study on selective dissolution of the same nodules by hydroxylamine hydrochloride, acidified hydrogen peroxide, and Na-citrate-bicarbonate-dithionite. Electron probe microanalysis results were consistent with previous findings using selective dissolution and showed that P, As and Cr were mainly present in Fe-oxides, while Co was mainly associated with Mn-oxide phases. These results support the applicability of the studied selective dissolution methods for fractionation of trace elements in soils and sediments containing appreciable amounts of Fe- and Mn-oxide phases.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical science. © The Author(s) 2019. Published by Oxford University Press