Data-Discriminants of Likelihood Equations
Maximum likelihood estimation (MLE) is a fundamental computational problem in
statistics. The problem is to maximize the likelihood function with respect to
given data on a statistical model. An algebraic approach to this problem is to
solve a very structured parameterized polynomial system called likelihood
equations. For general choices of data, the number of complex solutions to the
likelihood equations is finite and called the ML-degree of the model. The only
solutions to the likelihood equations that are statistically meaningful are the
real/positive solutions. However, the number of real/positive solutions is not
characterized by the ML-degree. We use discriminants to classify data according
to the number of real/positive solutions of the likelihood equations. We call
these discriminants data-discriminants (DD). We develop a probabilistic
algorithm for computing DDs. Experimental results show that, for the benchmarks
we have tried, the probabilistic algorithm is more efficient than the standard
elimination algorithm. Based on the computational results, we discuss the real
root classification problem for the 3 by 3 symmetric matrix model.
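The likelihood equations can be made concrete on a toy model. The following is a minimal sketch (our own example, not one from the paper) using the Hardy-Weinberg curve p(t) = (t^2, 2t(1-t), (1-t)^2): forming the likelihood equation dℓ/dt = 0 for observed counts u and solving yields the finite solution set whose cardinality, for generic data, is the ML-degree — here 1, so real-root classification is trivial; the paper's models have higher ML-degree.

```python
import sympy as sp

# Toy model (our example, not from the paper): the Hardy-Weinberg curve
# p(t) = (t^2, 2 t (1 - t), (1 - t)^2) with observed counts u.
t = sp.symbols('t')
u = (30, 20, 10)
p = (t**2, 2*t*(1 - t), (1 - t)**2)

# Log-likelihood and the likelihood equation d(ell)/dt = 0.
ell = sum(ui * sp.log(pi) for ui, pi in zip(u, p))
solutions = sp.solve(sp.Eq(sp.diff(ell, t), 0), t)

# For generic data this model has exactly one complex solution
# (ML-degree 1); the MLE is t = (2*u0 + u1) / (2*(u0 + u1 + u2)).
print(solutions)  # [2/3]
```

For models of higher ML-degree the analogous solve returns several complex solutions, and which of them are real/positive depends on u — exactly the classification the data-discriminant encodes.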
Classification without labels: Learning from mixed samples in high energy physics
Modern machine learning techniques can be used to construct powerful models
for difficult collider physics problems. In many applications, however, these
models are trained on imperfect simulations due to a lack of truth-level
information in the data, which risks the model learning artifacts of the
simulation. In this paper, we introduce the paradigm of classification without
labels (CWoLa) in which a classifier is trained to distinguish statistical
mixtures of classes, which are common in collider physics. Crucially, neither
individual labels nor class proportions are required, yet we prove that the
optimal classifier in the CWoLa paradigm is also the optimal classifier in the
traditional fully-supervised case where all label information is available.
After demonstrating the power of this method in an analytical toy example, we
consider a realistic benchmark for collider physics: distinguishing quark-
versus gluon-initiated jets using mixed quark/gluon training samples. More
generally, CWoLa can be applied to any classification problem where labels or
class proportions are unknown or simulations are unreliable, but statistical
mixtures of the classes are available.
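The CWoLa setup is easy to sketch numerically (a hypothetical one-dimensional toy, not the paper's jet study): draw two mixed samples with different, unknown signal fractions, train an off-the-shelf classifier to separate mixture A from mixture B, and check that its score also separates the underlying signal and background classes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def draw_mixture(n, f_signal):
    """Draw n events: signal ~ N(+1, 1), background ~ N(-1, 1)."""
    is_signal = rng.random(n) < f_signal
    x = np.where(is_signal, rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n))
    return x.reshape(-1, 1), is_signal

# Two mixed samples with different (and, to the classifier, unknown)
# signal fractions -- the only supervision is the mixture label.
xa, truth_a = draw_mixture(5000, 0.7)
xb, truth_b = draw_mixture(5000, 0.3)

X = np.vstack([xa, xb])
mix_label = np.concatenate([np.ones(5000), np.zeros(5000)])
clf = LogisticRegression().fit(X, mix_label)

# The mixture-trained score also separates true signal from background.
scores = clf.decision_function(X)
truth = np.concatenate([truth_a, truth_b])
auc = roc_auc_score(truth, scores)
```

Because the likelihood ratio of the two mixtures is monotone in the signal/background likelihood ratio, the mixture-trained classifier is (asymptotically) optimal for the true classes as well, which is the theorem the paper proves.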
Likelihood Geometry
We study the critical points of monomial functions over an algebraic subset
of the probability simplex. The number of critical points on the Zariski
closure is a topological invariant of that embedded projective variety, known
as its maximum likelihood degree. We present an introduction to this theory and
its statistical motivations. Many favorite objects from combinatorial algebraic
geometry are featured: toric varieties, A-discriminants, hyperplane
arrangements, Grassmannians, and determinantal varieties. Several new results
are included, especially on the likelihood correspondence and its bidegree.
These notes were written for the second author's lectures at the CIME-CIRM
summer course on Combinatorial Algebraic Geometry at Levico Terme in June 2013.
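The simplest instance of this critical-point count is the probability simplex itself, whose maximum likelihood degree is 1. As a minimal sketch (our own illustration, not an example from the notes), the critical points of the monomial p1^u1 p2^u2 p3^u3 on the hyperplane p1 + p2 + p3 = 1 can be found by Lagrange multipliers:

```python
import sympy as sp

# Critical points of p1^u1 * p2^u2 * p3^u3 (equivalently, of its logarithm)
# on the hyperplane p1 + p2 + p3 = 1: the full simplex, the simplest
# variety, with maximum likelihood degree 1.
p1, p2, p3, lam = sp.symbols('p1 p2 p3 lam')
u1, u2, u3 = 30, 20, 10

# Lagrange conditions u_i / p_i = lambda, plus the linear constraint.
eqs = [u1/p1 - lam, u2/p2 - lam, u3/p3 - lam, p1 + p2 + p3 - 1]
crit = sp.solve(eqs, [p1, p2, p3, lam], dict=True)
print(crit)  # one critical point: p_i = u_i / (u1 + u2 + u3)
```

For smaller subvarieties of the simplex (toric, determinantal, and so on) the same Lagrange system has more solutions, and the generic count is the ML-degree the notes study.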
Rank discriminants for predicting phenotypes from RNA expression
Statistical methods for analyzing large-scale biomolecular data are
commonplace in computational biology. A notable example is phenotype prediction
from gene expression data, for instance, detecting human cancers,
differentiating subtypes and predicting clinical outcomes. Still, clinical
applications remain scarce. One reason is that the complexity of the decision
rules that emerge from standard statistical learning impedes biological
understanding, in particular, any mechanistic interpretation. Here we explore
decision rules for binary classification utilizing only the ordering of
expression among several genes; the basic building blocks are then two-gene
expression comparisons. The simplest example, just one comparison, is the TSP
classifier, which has appeared in a variety of cancer-related discovery
studies. Decision rules based on multiple comparisons can better accommodate
class heterogeneity, and thereby increase accuracy, and might provide a link
with biological mechanism. We consider a general framework ("rank-in-context")
for designing discriminant functions, including a data-driven selection of the
number and identity of the genes in the support ("context"). We then specialize
to two examples: voting among several pairs and comparing the median expression
in two groups of genes. Comprehensive experiments assess accuracy relative to
other, more complex, methods, and reinforce earlier observations that simple
classifiers are competitive. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/14-AOAS738
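The single-comparison TSP rule is easy to sketch on synthetic data (a hypothetical toy, not the paper's experiments): score every gene pair by how differently the event "expression of gene i falls below gene j" occurs in the two classes, keep the top-scoring pair, and classify new samples by that one comparison.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Synthetic expression: genes 0 and 1 swap their ordering between classes;
# genes 2-5 are uninformative noise. (Toy data, not from the paper.)
n = 200
X0 = rng.normal(0.0, 1.0, (n, 6)); X0[:, 0] += 2.0   # class 0: gene0 > gene1
X1 = rng.normal(0.0, 1.0, (n, 6)); X1[:, 1] += 2.0   # class 1: gene1 > gene0

def top_scoring_pair(X0, X1):
    """Pair (i, j) maximizing |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|."""
    def score(i, j):
        return abs((X0[:, i] < X0[:, j]).mean() - (X1[:, i] < X1[:, j]).mean())
    return max(itertools.permutations(range(X0.shape[1]), 2),
               key=lambda ij: score(*ij))

i, j = top_scoring_pair(X0, X1)

# Classify by the single comparison, oriented so that x_i < x_j is the
# majority pattern in class 1.
flip = (X1[:, i] < X1[:, j]).mean() > 0.5
def predict(X):
    return (X[:, i] < X[:, j]).astype(int) if flip else (X[:, i] >= X[:, j]).astype(int)

acc = np.concatenate([predict(X0) == 0, predict(X1) == 1]).mean()
```

The rule depends only on the relative order of two measurements, which is what makes it invariant to monotone normalization of the data and simple enough to invite a mechanistic reading; the paper's rank-in-context framework generalizes it to votes over several pairs and to group-median comparisons.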
Search for the Standard Model Higgs boson in e+e− interactions at √s = 189 GeV
A search for the Standard Model Higgs boson is carried out on 176.4 pb⁻¹
of data collected by the L3 detector at a center-of-mass energy of 189 GeV. The
data are consistent with the expectations of Standard Model processes and no
evidence of a Higgs signal is observed. Combining the results of this search
with those at lower center-of-mass energies, a lower limit on the mass of the
Standard Model Higgs boson of 95.3 GeV is set at the 95% confidence level.