
    Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus

    The free-text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult for existing natural language analysis tools to process, since they are highly telegraphic (omitting many words) and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free-text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning.

    Bayesian Inference in Processing Experimental Data: Principles and Basic Applications

    This report introduces general ideas and some basic methods of Bayesian probability theory applied to physics measurements. Our aim is to make the reader familiar, through examples rather than rigorous formalism, with concepts such as: model comparison (including the automatic Ockham's Razor filter provided by the Bayesian approach); parametric inference; quantification of the uncertainty about the value of physical quantities, also taking into account systematic effects; the role of marginalization; posterior characterization; predictive distributions; hierarchical modelling and hyperparameters; Gaussian approximation of the posterior and recovery of conventional methods, especially maximum likelihood and chi-square fits under well-defined conditions; conjugate priors, transformation invariance, and maximum-entropy-motivated priors; Monte Carlo estimates of expectation, including a short introduction to Markov Chain Monte Carlo methods. Comment: 40 pages, 2 figures, invited paper for Reports on Progress in Physics

    Evaluating the Current Status of American Shad Stocks in Three Virginia Rivers

    Directed commercial fisheries for American shad Alosa sapidissima in the primary Virginia tributaries of the Chesapeake Bay have been under moratorium since 1994. Monitoring of adult American shad within these rivers has been ongoing since 1998 through a cooperative program involving commercial fishers. The monitoring program is designed to mimic traditional commercial fishing practices so that stock status can be inferred by comparing contemporary catch-per-unit-effort levels with those derived from historic logbooks. In this paper, we present analyses of the available monitoring and historic catch rate data along with updated stock status information for American shad in the James, York, and Rappahannock rivers. Two analytical methods were used to derive annual indices of relative abundance; both methods yielded very similar patterns for each river system. Comparisons of contemporary and historic indices of relative abundance suggest that American shad in the James and York rivers continue to persist at low levels of abundance. Measures of stock abundance in the Rappahannock River have been higher than the logbook reference value for much of the monitoring period. However, current moratoria and restoration strategies, which include hatchery releases of fry, the removal of obstructions blocking spawning and nursery habitat, and reductions in bycatch from other fisheries, should continue into the foreseeable future.

    Search for the standard model Higgs boson in the H to ZZ to 2l 2nu channel in pp collisions at sqrt(s) = 7 TeV

    A search for the standard model Higgs boson in the H to ZZ to 2l 2nu decay channel, where l = e or mu, in pp collisions at a center-of-mass energy of 7 TeV is presented. The data were collected at the LHC with the CMS detector and correspond to an integrated luminosity of 4.6 inverse femtobarns. No significant excess is observed above the background expectation, and upper limits are set on the Higgs boson production cross section. The presence of the standard model Higgs boson with a mass in the 270-440 GeV range is excluded at 95% confidence level. Comment: Submitted to JHEP

    Search for New Physics with Jets and Missing Transverse Momentum in pp collisions at sqrt(s) = 7 TeV

    A search for new physics is presented based on an event signature of at least three jets accompanied by large missing transverse momentum, using a data sample corresponding to an integrated luminosity of 36 inverse picobarns collected in proton-proton collisions at sqrt(s) = 7 TeV with the CMS detector at the LHC. No excess of events is observed above the expected standard model backgrounds, which are all estimated from the data. Exclusion limits are presented for the constrained minimal supersymmetric extension of the standard model. Cross section limits are also presented using simplified models with new particles decaying to an undetected particle and one or two jets.