Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus
The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult for existing natural language analysis tools to process, since they are highly telegraphic (omitting many words) and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning.
Bayesian Inference in Processing Experimental Data: Principles and Basic Applications
This report introduces general ideas and some basic methods of the Bayesian
probability theory applied to physics measurements. Our aim is to make the
reader familiar, through examples rather than rigorous formalism, with concepts
such as: model comparison (including the automatic Ockham's Razor filter
provided by the Bayesian approach); parametric inference; quantification of the
uncertainty about the value of physical quantities, also taking into account
systematic effects; role of marginalization; posterior characterization;
predictive distributions; hierarchical modelling and hyperparameters; Gaussian
approximation of the posterior and recovery of conventional methods, especially
maximum likelihood and chi-square fits under well defined conditions; conjugate
priors, transformation invariance and maximum entropy motivated priors; Monte
Carlo estimates of expectation, including a short introduction to Markov Chain
Monte Carlo methods. Comment: 40 pages, 2 figures, invited paper for Reports on Progress in Physics
Evaluating the Current Status of American Shad Stocks in Three Virginia Rivers
Directed commercial fisheries for American shad Alosa sapidissima in the primary Virginia tributaries of the Chesapeake Bay have been under moratorium since 1994. Monitoring of adult American shad within these rivers has been ongoing since 1998 through a cooperative program involving commercial fishers. The monitoring program is designed to mimic traditional commercial fishing practices so that stock status can be inferred by comparing contemporary catch-per-unit-effort levels with those derived from historic logbooks. In this paper, we present analyses of the available monitoring and historic catch rate data along with updated stock status information for American shad in the James, York, and Rappahannock rivers. Two analytical methods were used to derive annual indices of relative abundance; both methods yielded very similar patterns for each river system. Comparisons of contemporary and historic indices of relative abundance suggest that American shad in the James and York rivers continue to persist at low levels of abundance. Measures of stock abundance in the Rappahannock River have been higher than the logbook reference value for much of the monitoring period. However, current moratoria and restoration strategies, which include hatchery releases of fry, the removal of obstructions blocking spawning and nursery habitat, and reductions in bycatch from other fisheries, should continue into the foreseeable future.
Search for the standard model Higgs boson in the H to ZZ to 2l 2nu channel in pp collisions at sqrt(s) = 7 TeV
A search for the standard model Higgs boson in the H to ZZ to 2l 2nu decay
channel, where l = e or mu, in pp collisions at a center-of-mass energy of 7
TeV is presented. The data were collected at the LHC, with the CMS detector,
and correspond to an integrated luminosity of 4.6 inverse femtobarns. No
significant excess is observed above the background expectation, and upper
limits are set on the Higgs boson production cross section. The presence of the
standard model Higgs boson with a mass in the 270-440 GeV range is excluded at
95% confidence level. Comment: Submitted to JHEP
Search for New Physics with Jets and Missing Transverse Momentum in pp collisions at sqrt(s) = 7 TeV
A search for new physics is presented based on an event signature of at least
three jets accompanied by large missing transverse momentum, using a data
sample corresponding to an integrated luminosity of 36 inverse picobarns
collected in proton-proton collisions at sqrt(s) = 7 TeV with the CMS detector
at the LHC. No excess of events is observed above the expected standard model
backgrounds, which are all estimated from the data. Exclusion limits are
presented for the constrained minimal supersymmetric extension of the standard
model. Cross section limits are also presented using simplified models with new
particles decaying to an undetected particle and one or two jets.
- …