
    Abbreviation definition identification based on automatic precision estimates

    Background: The rapid growth of biomedical literature presents challenges for automatic text processing, one of which is abbreviation identification. Unrecognized abbreviations in text hinder indexing algorithms and adversely affect information retrieval and extraction. Automatic abbreviation definition identification can help resolve these issues. However, abbreviations and their definitions identified by an automatic process are of uncertain validity. Due to the size of databases such as MEDLINE, only a small fraction of abbreviation-definition pairs can be examined manually. An automatic way to estimate the accuracy of abbreviation-definition pairs extracted from text is needed. In this paper we propose an abbreviation definition identification algorithm that employs a variety of strategies to identify the most probable abbreviation definition. In addition, our algorithm produces an accuracy estimate, pseudo-precision, for each strategy without using a human-judged gold standard. The pseudo-precisions determine the order in which the algorithm applies the strategies in seeking to identify the definition of an abbreviation.
    Results: On the Medstract corpus our algorithm achieved 97% precision and 85% recall, which is higher than previously reported results. We also annotated 1250 randomly selected MEDLINE records as a gold standard. On this set we achieved 96.5% precision and 83.2% recall. This compares favourably with the well-known Schwartz and Hearst algorithm.
    Conclusion: We developed an algorithm for abbreviation identification that uses a variety of strategies to identify the most probable definition for an abbreviation and also produces an estimated accuracy of the result. This process is purely automatic.
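The ordering idea in this abstract, trying strategies from highest to lowest pseudo-precision and accepting the first definition found, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two strategies (`first_letters`, `letters_in_order`), the `identify` helper, and the pseudo-precision values 0.99 and 0.90 are hypothetical placeholders, not the strategies or estimates from the paper.

```python
from typing import Callable, List, Optional, Tuple

# A strategy maps (abbreviation, sentence) to a candidate definition or None.
Strategy = Callable[[str, str], Optional[str]]


def _words_before(abbrev: str, sentence: str) -> Optional[List[str]]:
    """Return the words preceding '(ABBREV)', or None if it is absent."""
    marker = f"({abbrev})"
    if not abbrev or marker not in sentence:
        return None
    return sentence.split(marker)[0].split()


def first_letters(abbrev: str, sentence: str) -> Optional[str]:
    """Hypothetical strict strategy: one word per letter, matched on initials."""
    words = _words_before(abbrev, sentence)
    if words is None:
        return None
    candidate = words[-len(abbrev):]
    if len(candidate) != len(abbrev):
        return None
    if all(w[0].lower() == c.lower() for w, c in zip(candidate, abbrev)):
        return " ".join(candidate)
    return None


def letters_in_order(abbrev: str, sentence: str) -> Optional[str]:
    """Hypothetical looser strategy: letters appear in order anywhere in the words."""
    words = _words_before(abbrev, sentence)
    if words is None:
        return None
    window = words[-2 * len(abbrev):]
    # Try the shortest candidate first, growing leftwards one word at a time.
    for start in range(len(window) - 1, -1, -1):
        candidate = window[start:]
        if candidate[0][0].lower() != abbrev[0].lower():
            continue
        text = iter(" ".join(candidate).lower())
        # 'ch in text' consumes the iterator, so this is a subsequence test.
        if all(ch in text for ch in abbrev.lower()):
            return " ".join(candidate)
    return None


# (name, strategy, pseudo-precision); the numbers are invented placeholders.
STRATEGIES: List[Tuple[str, Strategy, float]] = [
    ("letters_in_order", letters_in_order, 0.90),
    ("first_letters", first_letters, 0.99),
]


def identify(abbrev, sentence, strategies=STRATEGIES):
    """Apply strategies in descending pseudo-precision; return the first hit."""
    for name, fn, pp in sorted(strategies, key=lambda s: s[2], reverse=True):
        definition = fn(abbrev, sentence)
        if definition is not None:
            return definition, name, pp
    return None


if __name__ == "__main__":
    print(identify("MRI", "patients underwent magnetic resonance imaging (MRI) scans"))
    print(identify("DNA", "we extracted deoxyribonucleic acid (DNA)"))
```

Returning the pseudo-precision alongside the definition mirrors the abstract's point that each result carries an accuracy estimate. How the estimates are derived without a gold standard is not described in the abstract and is not modeled here.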

    The subtropical nutrient spiral

    Author Posting. © American Geophysical Union, 2003. This article is posted here by permission of American Geophysical Union for personal use, not for redistribution. The definitive version was published in Global Biogeochemical Cycles 17 (2003): 1110, doi:10.1029/2003GB002085.
    We present an extended series of observations and a more comprehensive analysis of a tracer-based measure of new production in the Sargasso Sea near Bermuda using the ³He flux gauge technique. The estimated annually averaged nitrate flux of 0.84 ± 0.26 mol m⁻² yr⁻¹ constitutes only that nitrate physically transported to the euphotic zone, not nitrogen from biological sources (e.g., nitrogen fixation or zooplankton migration). We show that the flux estimate is quantitatively consistent with other observations, including decade timescale evolution of the ³H + ³He inventory in the main thermocline and export production estimates. However, we argue that the flux cannot be supplied in the long term by local diapycnal or isopycnal processes. These considerations lead us to propose a three-dimensional pathway whereby nutrients remineralized within the main thermocline are returned to the seasonally accessible layers within the subtropical gyre. We describe this mechanism, which we call “the nutrient spiral,” as a sequence of steps where (1) nutrient-rich thermocline waters are entrained into the Gulf Stream, (2) enhanced diapycnal mixing moves nutrients upward onto lighter densities, (3) detrainment and enhanced isopycnal mixing inject these waters into the seasonally accessible layer of the gyre recirculation region, and (4) the nutrients become available to biota via eddy heaving and wintertime convection. The spiral is closed when nutrients are utilized, exported, and then remineralized within the thermocline. We present evidence regarding the characteristics of the spiral and discuss some implications of its operation within the biogeochemical cycle of the subtropical ocean.
    This work was supported by grants from the National Science Foundation (OCE-0221247) and NSF/ONR NOPP (N000140210370).