The QCD phase diagram and statistics friendly distributions
The preliminary STAR data for proton cumulants in central collisions at $\sqrt{s_{NN}} = 7.7$ GeV are consistent with a two-component proton multiplicity distribution. We show that this two-component distribution is statistics friendly, in that factorial cumulants of surprisingly high orders may be extracted with a relatively small number of events. As a consequence, the two-component model can be tested and verified right now with the presently available STAR data from the first phase of the RHIC beam energy scan.
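To make the "statistics friendly" claim concrete, here is a minimal sketch that estimates factorial cumulants from simulated events drawn from a hypothetical two-component (two-Poisson) multiplicity distribution. The mixture parameters are invented for illustration and are not the values extracted from the STAR data.

```python
import numpy as np
from scipy.special import comb

rng = np.random.default_rng(0)

# Hypothetical two-component model: a small admixture (weight alpha) of a
# second Poisson component on top of the main one. Illustrative parameters,
# not the fitted STAR values.
alpha, mu_a, mu_b = 0.003, 40.0, 25.0
n_events = 200_000
second = rng.random(n_events) < alpha
N = np.where(second, rng.poisson(mu_b, n_events), rng.poisson(mu_a, n_events))

def factorial_moments(samples, kmax):
    """F_k = <N (N-1) ... (N-k+1)>, estimated from event samples."""
    F = np.empty(kmax + 1)
    F[0] = 1.0
    prod = np.ones_like(samples, dtype=float)
    for k in range(1, kmax + 1):
        prod *= samples - (k - 1)
        F[k] = prod.mean()
    return F

def factorial_cumulants(F):
    """Standard moment->cumulant recursion, applied to factorial moments."""
    kmax = len(F) - 1
    C = np.zeros(kmax + 1)
    for n in range(1, kmax + 1):
        C[n] = F[n] - sum(comb(n - 1, k - 1) * C[k] * F[n - k]
                          for k in range(1, n))
    return C

F = factorial_moments(N, 6)
print(factorial_cumulants(F))  # C_1..C_6; C_2..C_6 would vanish for a pure Poisson
```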
An operational definition of quark and gluon jets
While "quark" and "gluon" jets are often treated as separate, well-defined
objects in both theoretical and experimental contexts, no precise, practical,
and hadron-level definition of jet flavor presently exists. To remedy this
issue, we develop and advocate for a data-driven, operational definition of
quark and gluon jets that is readily applicable at colliders. Rather than
specifying a per-jet flavor label, we aggregately define quark and gluon jets
at the distribution level in terms of measured hadronic cross sections.
Intuitively, quark and gluon jets emerge as the two maximally separable
categories within two jet samples in data. Benefiting from recent work on
data-driven classifiers and topic modeling for jets, we show that the practical
tools needed to implement our definition already exist for experimental
applications. As an informative example, we demonstrate the power of our
operational definition using Z+jet and dijet samples, illustrating that pure
quark and gluon distributions and fractions can be successfully extracted in a
fully well-defined manner.Comment: 38 pages, 10 figures, 1 table; v2: updated to match JHEP versio
Unsupervised Keyword Extraction from Polish Legal Texts
In this work, we present an application of the recently proposed unsupervised
keyword extraction algorithm RAKE to a corpus of Polish legal texts from the
field of public procurement. RAKE is essentially a language and domain
independent method. Its only language-specific input is a stoplist containing a
set of non-content words. The performance of the method heavily depends on the
choice of such a stoplist, which should be domain-adapted. Therefore, we
complement the RAKE algorithm with an automatic approach to selecting
non-content words, based on the statistical properties of the term distribution.
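For orientation, here is a bare-bones sketch of the RAKE scoring scheme: candidate phrases are maximal runs of non-stopwords, each word is scored by degree/frequency over the phrase co-occurrence counts, and a phrase scores the sum of its word scores. The tiny stoplist and sample sentence are invented; the paper's actual contribution, a statistically derived domain stoplist, is not reproduced here.

```python
import re
from collections import defaultdict

# Toy generic stoplist; the paper's point is that a domain-adapted stoplist,
# selected from term statistics, performs much better than a generic one.
STOPWORDS = {"the", "of", "a", "an", "and", "or", "to", "in", "is", "are",
             "for", "on", "with", "shall", "be", "by"}

def rake(text, stopwords=STOPWORDS):
    """Bare-bones RAKE: candidates are maximal runs of non-stopwords;
    word score = degree/frequency; phrase score = sum of word scores."""
    words = re.findall(r"[a-z0-9-]+", text.lower())
    candidates, current = [], []
    for w in words:
        if w in stopwords:
            if current:
                candidates.append(current)
            current = []
        else:
            current.append(w)
    if current:
        candidates.append(current)

    freq, degree = defaultdict(int), defaultdict(int)
    for phrase in candidates:
        for w in phrase:
            freq[w] += 1
            degree[w] += len(phrase)  # co-occurrence degree, incl. the word itself
    score = {w: degree[w] / freq[w] for w in freq}
    phrase_scores = {" ".join(p): sum(score[w] for w in p) for p in candidates}
    return sorted(phrase_scores.items(), key=lambda kv: -kv[1])

print(rake("the contracting authority shall publish a contract notice "
           "in the official journal and the tender documentation")[:3])
```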
Extracting Event Dynamics from Event-by-Event Analysis
The problem of eliminating the statistical fluctuations and extracting the
event dynamics from event-by-event analysis is discussed. New moments, one for
continuous distributions and one for anomalous distributions, are proposed;
they are experimentally measurable and eliminate the Poissonian-type
statistical fluctuations, recovering the corresponding dynamical moments. In
this way, the dynamical distribution of the event-averaged transverse momentum
$\bar{p_t}$ can be extracted, and the anomalous scaling of the dynamical
distribution, if it exists, can be recovered through event-by-event analysis
of experimental data.
Comment: 15 pages, 2 eps figures, Phys. Rev. C accepted
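The abstract's notation for the proposed moments did not survive extraction, but the mechanism they rely on, namely that factorial moments of a Poisson-sampled count reproduce the ordinary moments of the underlying dynamical variable, is easy to verify numerically. The sketch below uses an assumed Gamma distribution for the dynamical event-wise rate; it illustrates the identity, not the paper's specific moment definitions.

```python
import numpy as np

rng = np.random.default_rng(1)

# The dynamical event-wise rate s (here: an assumed Gamma distribution) is
# smeared by Poisson sampling: per event the detector sees n ~ Poisson(s).
n_events = 500_000
s = rng.gamma(shape=4.0, scale=2.5, size=n_events)  # dynamical fluctuations
n = rng.poisson(s)                                  # + statistical noise

# Ordinary moments of n are biased by the Poisson noise, but the factorial
# moments <n(n-1)...(n-q+1)> reproduce <s^q>, which is the mechanism that
# lets experimentally measurable moments eliminate the statistical part.
for q in (1, 2, 3):
    fact = np.ones(n_events)
    for k in range(q):
        fact *= n - k
    print(q, fact.mean(), (s ** q).mean())  # the two columns agree
```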
Measuring the galaxy power spectrum and scale-scale correlations with multiresolution-decomposed covariance -- I. method
We present a method of measuring the galaxy power spectrum based on the
multiresolution analysis of the discrete wavelet transformation (DWT). Since
the DWT representation strongly suppresses the off-diagonal components of the
covariance for self-similar clustering, the DWT covariance for popular models
of the cold dark matter cosmogony is generally diagonal, or diagonal in the
scale index, over the range of scales in which second-order scale-scale
correlations are weak. In this range, the DWT covariance gives a lossless
estimation of the power spectrum, which is equal to the corresponding Fourier
power spectrum banded with logarithmic scaling. In the scale range in which
the scale-scale correlation is significant, the accuracy of a power spectrum
detection depends on the scale-scale or band-band correlations. That is, for a
precise measurement of the power spectrum, a measurement of the scale-scale or
band-band correlations is needed. We show that the DWT covariance can be
employed to measure both the band-power spectrum and the second-order
scale-scale correlation. We also present the DWT algorithms for binning and
Poisson sampling of real observational data. We show that the aliasing effect
that appears in the usual binning schemes can be exactly eliminated by DWT
binning. Since a Poisson process possesses a diagonal covariance in the DWT
representation, the effects of Poisson sampling and selection on the detection
of the power spectrum and the second-order scale-scale correlation are
suppressed to a minimum. Moreover, the effect of the non-Gaussian features of
the Poisson sampling can be calculated within this framework.
Comment: AAS Latex file, 44 pages, accepted for publication in ApJ
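A minimal sketch of the banded (octave) DWT power estimate on a toy field, using PyWavelets with an arbitrary Daubechies-4 wavelet; the paper's construction, including its treatment of binning, Poisson sampling, and selection effects, is more general than this illustration.

```python
import numpy as np
import pywt

rng = np.random.default_rng(2)

# Toy 1-D 'galaxy density' field with a power-law spectrum, generated in
# Fourier space as an illustrative stand-in for a real survey sample.
n = 2 ** 14
k = np.fft.rfftfreq(n)
amp = np.zeros_like(k)
amp[1:] = k[1:] ** (-1.5 / 2.0)  # amplitude ~ sqrt(P), with P(k) ~ k^-1.5
phases = rng.standard_normal(len(k)) + 1j * rng.standard_normal(len(k))
delta = np.fft.irfft(amp * phases, n)

# DWT band-power estimate: mean squared wavelet coefficient per scale band.
coeffs = pywt.wavedec(delta, "db4")  # [cA_J, cD_J, ..., cD_1]
for j, d in enumerate(coeffs[1:], start=1):  # detail bands, coarsest first
    print(f"band {j}: P_j = {np.mean(d ** 2):.4e}  ({len(d)} modes)")
```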
Limits and Confidence Intervals in the Presence of Nuisance Parameters
We study the frequentist properties of confidence intervals computed by the
method known to statisticians as the Profile Likelihood. It is seen that the
coverage of these intervals is surprisingly good over a wide range of possible
parameter values for important classes of problems, in particular whenever
there are additional nuisance parameters with statistical or systematic errors.
Programs are available for calculating these intervals.
Comment: 6 figures
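The construction is straightforward to reproduce for a toy problem. The sketch below (not the authors' program) computes an approximate 68% profile-likelihood interval for a Poisson signal rate with a Gaussian-constrained nuisance background; all numbers are invented.

```python
import numpy as np
from scipy.stats import poisson, norm
from scipy.optimize import minimize_scalar, brentq

# Hypothetical setup: n ~ Poisson(s + b), with the nuisance background b
# constrained by a Gaussian subsidiary measurement b0 +/- sigma_b.
n_obs, b0, sigma_b = 12, 5.0, 1.0

def nll(s, b):
    """Joint negative log-likelihood of the main and subsidiary measurements."""
    return -poisson.logpmf(n_obs, s + b) - norm.logpdf(b, b0, sigma_b)

def profiled_nll(s):
    """Minimize the NLL over the nuisance parameter at fixed signal s."""
    res = minimize_scalar(lambda b: nll(s, b), bounds=(1e-6, 50.0),
                          method="bounded")
    return res.fun

# Locate the global minimum on a grid, then solve -2 Delta lnL = 1
# on either side (approximately a 68% CL interval).
s_grid = np.linspace(0.0, 30.0, 301)
nll_prof = np.array([profiled_nll(s) for s in s_grid])
s_hat = s_grid[np.argmin(nll_prof)]
nll_min = nll_prof.min()

lo = brentq(lambda s: 2 * (profiled_nll(s) - nll_min) - 1.0, 1e-6, s_hat)
hi = brentq(lambda s: 2 * (profiled_nll(s) - nll_min) - 1.0, s_hat, 30.0)
print(f"s_hat = {s_hat:.2f}, 68% interval ~ [{lo:.2f}, {hi:.2f}]")
```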
Impact of variance components on reliability of absolute quantification using digital PCR
Background: Digital polymerase chain reaction (dPCR) is an increasingly popular technology for detecting and quantifying target nucleic acids. Its advertised strength is high-precision absolute quantification without needing reference curves. The standard data-analytic approach follows a seemingly straightforward theoretical framework but ignores sources of variation in the data-generating process. These stem from both technical and biological factors, where we distinguish features that are 1) hard-wired in the equipment, 2) user-dependent and 3) provided by manufacturers but may be adapted by the user. The impact of the corresponding variance components on the accuracy and precision of target concentration estimators presented in the literature is studied through simulation.
Results: We reveal how system-specific technical factors influence accuracy as well as precision of concentration estimates. We find that a well-chosen sample dilution level and modifiable settings such as the fluorescence cut-off for target copy detection have a substantial impact on reliability and can be adapted to the sample analysed in ways that matter. User-dependent technical variation, including pipette inaccuracy and specific sources of sample heterogeneity, leads to a steep increase in uncertainty of estimated concentrations. Users can discover this through replicate experiments and derived variance estimation. Finally, the detection performance can be improved by optimizing the fluorescence intensity cut point as suboptimal thresholds reduce the accuracy of concentration estimates considerably.
Conclusions: Like any other technology, dPCR is subject to variation induced by natural perturbations, systematic settings, and user-dependent protocols. The corresponding uncertainty may be controlled with an adapted experimental design. Our findings point to modifiable key sources of uncertainty that form an important starting point for the development of guidelines on dPCR design and data analysis with correct precision bounds. Besides clever choices of sample dilution levels, experiment-specific tuning of machine settings can greatly improve results. Well-chosen, data-driven fluorescence intensity thresholds in particular result in major improvements in target presence detection. We call on manufacturers to provide sufficiently detailed output data that allows users to maximize the potential of the method in their setting and obtain high precision and accuracy for their experiments.
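For reference, the standard Poisson-based estimator that the paper's variance analysis starts from can be written in a few lines. The sketch below reports only the binomial subsampling error; the paper's point is precisely that the neglected variance components (pipetting, partition-volume spread, threshold choice) can dominate this bound. The partition counts and volume are illustrative.

```python
import numpy as np

def dpcr_concentration(k_pos, n_part, v_part):
    """Poisson-based dPCR estimate: lambda = -ln(1 - k/n) / v, with a
    delta-method standard error. Ignores the additional variance components
    (pipetting, partition-volume spread, threshold choice) that the paper
    shows can dominate in practice."""
    p = k_pos / n_part
    lam = -np.log1p(-p) / v_part                     # copies per volume unit
    se = np.sqrt(p / ((1.0 - p) * n_part)) / v_part  # subsampling error only
    return lam, se

# Illustrative numbers: 20,000 partitions of 0.85 nL each, 7,000 positives.
lam, se = dpcr_concentration(7_000, 20_000, 0.85e-3)  # volume in uL
print(f"{lam:.1f} +/- {se:.1f} copies/uL")
```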