
    The QCD phase diagram and statistics friendly distributions

    The preliminary STAR data for proton cumulants in central collisions at √s = 7.7 GeV are well described by a two-component proton multiplicity distribution. We show that this two-component distribution is statistics friendly in that factorial cumulants of surprisingly high orders may be extracted with a relatively small number of events. As a consequence, the two-component model can be tested and verified right now with the presently available STAR data from the first phase of the RHIC beam energy scan.
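The claim that high-order factorial cumulants are accessible from modest statistics can be made concrete with a short numerical sketch. The moments-to-cumulants recursion below is standard; the two-component mixture parameters are illustrative assumptions (the abstract does not specify them), and a pure Poisson source serves as the null case whose higher factorial cumulants vanish.

```python
import math

def factorial_moments(pmf, kmax):
    """F_k = E[N(N-1)...(N-k+1)] for a pmf given as {n: P(N=n)}."""
    return [1.0] + [
        sum(p * math.prod(range(n - k + 1, n + 1)) for n, p in pmf.items())
        for k in range(1, kmax + 1)
    ]

def factorial_cumulants(pmf, kmax):
    # standard moments-to-cumulants recursion, applied to factorial moments
    F = factorial_moments(pmf, kmax)
    C = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        C[k] = F[k] - sum(math.comb(k - 1, i - 1) * C[i] * F[k - i]
                          for i in range(1, k))
    return C[1:]

def poisson(mu, nmax=80):
    return {n: math.exp(-mu) * mu ** n / math.factorial(n)
            for n in range(nmax + 1)}

# hypothetical two-component model: mixture of two Poisson sources
alpha, mu1, mu2 = 0.5, 2.0, 10.0
p1, p2 = poisson(mu1), poisson(mu2)
mix = {n: alpha * p1[n] + (1 - alpha) * p2[n] for n in p1}
```

For the pure Poisson case all factorial cumulants beyond the first are zero, while the mixture develops a nonzero second factorial cumulant, 0.5·(μ1² + μ2²) − (0.5·μ1 + 0.5·μ2)² = 16 here, which is the kind of signal the two-component test exploits.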

    An operational definition of quark and gluon jets

    While "quark" and "gluon" jets are often treated as separate, well-defined objects in both theoretical and experimental contexts, no precise, practical, and hadron-level definition of jet flavor presently exists. To remedy this issue, we develop and advocate for a data-driven, operational definition of quark and gluon jets that is readily applicable at colliders. Rather than specifying a per-jet flavor label, we aggregately define quark and gluon jets at the distribution level in terms of measured hadronic cross sections. Intuitively, quark and gluon jets emerge as the two maximally separable categories within two jet samples in data. Benefiting from recent work on data-driven classifiers and topic modeling for jets, we show that the practical tools needed to implement our definition already exist for experimental applications. As an informative example, we demonstrate the power of our operational definition using Z+jet and dijet samples, illustrating that pure quark and gluon distributions and fractions can be successfully extracted in a fully well-defined manner. Comment: 38 pages, 10 figures, 1 table; v2: updated to match JHEP version
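The distribution-level demixing step behind this approach (the "jet topics" idea the abstract builds on) fits in a few lines. The histograms below are invented toy distributions, not data from the paper, and the formulas assume mutual irreducibility — each pure distribution has a region where the other vanishes — and that sample 1 is the quark-enriched one.

```python
def kappa(p, q):
    # reducibility factor kappa(p|q) = min_x p(x)/q(x), over bins with q(x) > 0
    return min(pi / qi for pi, qi in zip(p, q) if qi > 0)

def extract_fractions(m1, m2):
    # under mutual irreducibility of the pure distributions, and with m1 the
    # enriched sample: kappa(m1|m2) = (1 - f1)/(1 - f2), kappa(m2|m1) = f2/f1
    k12, k21 = kappa(m1, m2), kappa(m2, m1)
    f1 = (1 - k12) / (1 - k12 * k21)
    return f1, k21 * f1

# toy "quark"/"gluon" observable histograms (hypothetical, for illustration)
quark = [0.4, 0.3, 0.2, 0.1, 0.0, 0.0]
gluon = [0.0, 0.0, 0.1, 0.2, 0.3, 0.4]
mix = lambda f: [f * q + (1 - f) * g for q, g in zip(quark, gluon)]
m1, m2 = mix(0.8), mix(0.3)   # Z+jet-like (quark-enriched) vs dijet-like
```

Solving the two kappa relations recovers the input fractions 0.8 and 0.3 exactly in this toy case; with real histograms, statistical noise in the minimum-ratio estimate is the dominant complication.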

    Unsupervised Keyword Extraction from Polish Legal Texts

    In this work, we present an application of the recently proposed unsupervised keyword extraction algorithm RAKE to a corpus of Polish legal texts from the field of public procurement. RAKE is essentially a language- and domain-independent method. Its only language-specific input is a stoplist containing a set of non-content words. The performance of the method depends heavily on the choice of such a stoplist, which should be domain-adapted. Therefore, we complement the RAKE algorithm with an automatic approach to selecting non-content words, which is based on the statistical properties of term distributions.
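For readers unfamiliar with RAKE, a minimal sketch of its core scoring step follows: candidate phrases are maximal runs of tokens delimited by stopwords or punctuation, each word is scored by degree/frequency over the phrase co-occurrence graph, and a phrase scores the sum of its word scores. The tokenisation and example stoplist here are simplifications, not the paper's setup.

```python
import re
from collections import defaultdict

def rake(text, stopwords):
    """Minimal RAKE sketch: phrases are runs of non-stopword tokens."""
    tokens = re.findall(r"[a-z']+|[.!?,;:]", text.lower())
    phrases, current = [], []
    for tok in tokens:
        if tok in stopwords or not tok[0].isalpha():   # delimiter found
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        phrases.append(current)
    # word score = degree / frequency; phrase score = sum of word scores
    freq, degree = defaultdict(int), defaultdict(int)
    for ph in phrases:
        for w in ph:
            freq[w] += 1
            degree[w] += len(ph)
    score = {" ".join(ph): sum(degree[w] / freq[w] for w in ph)
             for ph in phrases}
    return sorted(score.items(), key=lambda kv: -kv[1])
```

The degree/frequency score favors words that appear in longer candidate phrases, which is why multi-word terms tend to rank first; the paper's contribution is to derive the stoplist that drives the phrase splitting automatically instead of taking it as given.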

    Extracting Event Dynamics from Event-by-Event Analysis

    The problem of eliminating the statistical fluctuations and extracting the event dynamics from event-by-event analysis is discussed. New moments G_p (for continuous distributions) and G_{q,p} (for anomalous distributions) are proposed, which are experimentally measurable and can eliminate the Poissonian-type statistical fluctuations to recover the dynamical moments C_p and C_{q,p}. In this way, the dynamical distribution of the event-averaged transverse momentum \bar{p}_T can be extracted, and the anomalous scaling of the dynamical distribution, if it exists, can be recovered through event-by-event analysis of experimental data. Comment: 15 pages, 2 eps figures, Phys. Rev. C accepted
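The mechanism by which such moments cancel Poissonian noise can be checked numerically: if the event-wise intensity λ fluctuates dynamically and counts are Poisson-sampled around it, the factorial moment E[n(n−1)···(n−p+1)] equals E[λ^p], so the statistical fluctuations drop out and only the dynamics remain. The two-point intensity mixture below is an illustrative assumption, not the G_p moments of the paper.

```python
import math

def mixture_pmf(lams, weights, nmax=80):
    # exact count distribution when the Poisson intensity itself fluctuates
    return [sum(w * math.exp(-l) * l ** n / math.factorial(n)
                for l, w in zip(lams, weights)) for n in range(nmax + 1)]

def factorial_moment(pmf, p):
    # F_p = E[n(n-1)...(n-p+1)]; for Poisson sampling this equals E[lambda^p]
    return sum(prob * math.prod(range(n - p + 1, n + 1))
               for n, prob in enumerate(pmf))
```

With λ drawn equally from {2, 8}, the second factorial moment reproduces E[λ²] = 34 and the third E[λ³] = 260, even though the raw moments of n are inflated by the Poisson sampling noise.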

    Measuring the galaxy power spectrum and scale-scale correlations with multiresolution-decomposed covariance -- I. method

    We present a method of measuring the galaxy power spectrum based on the multiresolution analysis of the discrete wavelet transformation (DWT). Since the DWT representation strongly suppresses the off-diagonal components of the covariance for self-similar clustering, the DWT covariance for popular models of the cold dark matter cosmogony is generally diagonal, or j (scale)-diagonal, in the scale range in which the second-order scale-scale correlations are weak. In this range, the DWT covariance gives a lossless estimation of the power spectrum, which is equal to the corresponding Fourier power spectrum banded with a logarithmic scaling. In the scale range in which the scale-scale correlation is significant, the accuracy of a power spectrum detection depends on the scale-scale or band-band correlations. That is, for a precision measurement of the power spectrum, a measurement of the scale-scale or band-band correlations is needed. We show that the DWT covariance can be employed to measure both the band-power spectrum and the second-order scale-scale correlation. We also present the DWT algorithm for binning and Poisson sampling of real observational data. We show that the aliasing effect that appears in the usual binning schemes can be exactly eliminated by the DWT binning. Since a Poisson process possesses a diagonal covariance in the DWT representation, the Poisson sampling and selection effects on the detection of the power spectrum and the second-order scale-scale correlation are suppressed to a minimum. Moreover, the effect of the non-Gaussian features of the Poisson sampling can be calculated within this framework. Comment: AAS Latex file, 44 pages, accepted for publication in ApJ
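The band-power idea can be illustrated with the simplest wavelet. The Haar transform below is a minimal stand-in for the DWT analysis of the paper, with "band power" taken as the mean squared detail coefficient at each scale j; the paper's actual estimator, binning scheme, and Poisson-sampling corrections go well beyond this sketch.

```python
import math

def haar_band_power(signal):
    """One Haar DWT sweep per scale; returns the mean squared detail
    coefficient in each band, finest scale first (length must be 2^J)."""
    x, powers = list(signal), []
    while len(x) > 1:
        s = [(x[2 * i] + x[2 * i + 1]) / math.sqrt(2)
             for i in range(len(x) // 2)]          # smooth (approximation)
        d = [(x[2 * i] - x[2 * i + 1]) / math.sqrt(2)
             for i in range(len(x) // 2)]          # detail at this scale
        powers.append(sum(c * c for c in d) / len(d))
        x = s
    return powers
```

An alternating signal puts all of its power in the finest band and none in the coarser ones, the wavelet analogue of a single Fourier mode at the Nyquist frequency; a clustered density field would instead spread power across bands with the logarithmic scaling described in the abstract.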

    Limits and Confidence Intervals in the Presence of Nuisance Parameters

    We study the frequentist properties of confidence intervals computed by the method known to statisticians as the Profile Likelihood. It is seen that the coverage of these intervals is surprisingly good over a wide range of possible parameter values for important classes of problems, in particular whenever there are additional nuisance parameters with statistical or systematic errors. Programs are available for calculating these intervals. Comment: 6 figures
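A minimal sketch of the profile-likelihood construction for a common case — a Poisson count with a Gaussian-constrained background — might look as follows. The grid-scan minimisation, the scan ranges, and the 68% threshold Δ(−ln L) = 1/2 are illustrative choices of this sketch, not the programs the authors distribute.

```python
import math

def nll(s, b, n, b0, sigma_b):
    # -ln L: Poisson(n | s + b) with a Gaussian constraint b ~ N(b0, sigma_b)
    # (constant terms dropped, since only differences in -ln L matter)
    mu = s + b
    return mu - n * math.log(mu) + 0.5 * ((b - b0) / sigma_b) ** 2

def profile_nll(s, n, b0, sigma_b):
    # "profile" the nuisance parameter: minimise over b at each fixed s
    grid = [b0 + sigma_b * k / 100.0 for k in range(-400, 401)]
    return min(nll(s, b, n, b0, sigma_b) for b in grid if s + b > 0)

def profile_interval(n, b0, sigma_b, delta=0.5):
    # approximate 68% CL interval: where -ln L rises by 1/2 above its minimum
    svals = [k / 50.0 for k in range(1, 1001)]        # scan s in (0, 20]
    prof = [profile_nll(s, n, b0, sigma_b) for s in svals]
    lo = min(prof)
    inside = [s for s, v in zip(svals, prof) if v - lo <= delta]
    return min(inside), max(inside)
```

For n = 10 observed events over an expected background of 3.0 ± 0.5, the interval brackets the naive estimate s = 7 with a width slightly larger than √n, the extra width being the nuisance parameter's contribution the abstract is about.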

    Impact of variance components on reliability of absolute quantification using digital PCR

    Background: Digital polymerase chain reaction (dPCR) is an increasingly popular technology for detecting and quantifying target nucleic acids. Its advertised strength is high-precision absolute quantification without needing reference curves. The standard data analytic approach follows a seemingly straightforward theoretical framework but ignores sources of variation in the data generating process. These stem from both technical and biological factors, where we distinguish features that are 1) hard-wired in the equipment, 2) user-dependent, and 3) provided by manufacturers but adaptable by the user. The impact of the corresponding variance components on the accuracy and precision of target concentration estimators presented in the literature is studied through simulation. Results: We reveal how system-specific technical factors influence both accuracy and precision of concentration estimates. We find that a well-chosen sample dilution level and modifiable settings such as the fluorescence cut-off for target copy detection have a substantial impact on reliability and can be adapted to the sample analysed in ways that matter. User-dependent technical variation, including pipette inaccuracy and specific sources of sample heterogeneity, leads to a steep increase in the uncertainty of estimated concentrations. Users can discover this through replicate experiments and derived variance estimation. Finally, the detection performance can be improved by optimizing the fluorescence intensity cut point, as suboptimal thresholds reduce the accuracy of concentration estimates considerably. Conclusions: Like any other technology, dPCR is subject to variation induced by natural perturbations, systematic settings, as well as user-dependent protocols. The corresponding uncertainty may be controlled with an adapted experimental design.
Our findings point to modifiable key sources of uncertainty that form an important starting point for the development of guidelines on dPCR design and data analysis with correct precision bounds. Besides clever choices of sample dilution levels, experiment-specific tuning of machine settings can greatly improve results. Well-chosen data-driven fluorescence intensity thresholds in particular result in major improvements in target presence detection. We call on manufacturers to provide sufficiently detailed output data that allows users to maximize the potential of the method in their setting and obtain high precision and accuracy for their experiments.
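The "seemingly straightforward theoretical framework" for dPCR is a Poisson inversion of the negative-partition fraction; sketching it shows where the binomial sampling floor on precision comes from, before any of the technical variance components discussed above are added. The partition volume used in the example is a hypothetical value, not a manufacturer specification.

```python
import math

def dpcr_concentration(k_pos, n_part, vol_ul):
    """Standard dPCR Poisson inversion: the fraction of negative partitions
    is exp(-lam), where lam is the mean number of copies per partition."""
    p = k_pos / n_part
    lam = -math.log(1.0 - p)                  # copies per partition
    # delta-method standard error on lam from binomial variation of p alone;
    # pipetting and volume variability add variance on top of this floor
    se_lam = math.sqrt(p / (n_part * (1.0 - p)))
    return lam / vol_ul, se_lam / vol_ul      # copies/uL and its SE
```

With half of 20,000 partitions positive and an assumed 0.00085 µL partition volume, lam = ln 2 ≈ 0.693 copies per partition. The reported SE is only the binomial floor; the abstract's point is that user-dependent variance components can dominate it, which replicate experiments expose.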