On capacity of optical communications over a lossy bosonic channel with a receiver employing the most general coherent electro-optic feedback control
We study the problem of designing optical receivers to discriminate between
multiple coherent states using coherent-processing receivers---i.e., receivers
that use arbitrary coherent feedback control and quantum-noise-limited direct
detection---a strategy shown by Dolinar to achieve the minimum error
probability in discriminating any two coherent states. We first derive and re-interpret
Dolinar's binary-hypothesis minimum-probability-of-error receiver as the one
that optimizes the information efficiency at each time instant, based on
recursive Bayesian updates within the receiver. Using this viewpoint, we
propose a natural generalization of Dolinar's receiver design to discriminate
coherent states each of which could now be a codeword, i.e., a sequence of
coherent states each drawn from a modulation alphabet. We analyze the
channel capacity of the pure-loss optical channel with a general
coherent-processing receiver in the low-photon number regime and compare it
with the capacity achievable with direct detection and the Holevo limit
(achieving the latter would require a quantum joint-detection receiver). We
show compelling evidence that despite the optimal performance of Dolinar's
receiver for the binary coherent-state hypothesis test (either in error
probability or mutual information), the asymptotic communication rate
achievable by such a coherent-processing receiver is only as good as direct
detection. This suggests that in the infinitely-long codeword limit, all
potential benefits of coherent processing at the receiver can be obtained by
designing a good code and direct detection, with no feedback within the
receiver.Comment: 17 pages, 5 figure
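As an illustration of the recursive-Bayesian-update viewpoint described above, the sketch below updates the posterior over two hypotheses slot by slot, assuming quantum-noise-limited direct detection yields Poisson-distributed photon counts. This is a simplified classical sketch, not the full Dolinar feedback receiver; all function names and the codeword interface are illustrative assumptions.

```python
import math

def posterior_h0(prior0, n, mu0, mu1):
    """Posterior probability of hypothesis H0 after observing n photons,
    when the mean photon numbers under H0/H1 are mu0 and mu1 (Poisson model)."""
    l0 = math.exp(-mu0) * mu0**n / math.factorial(n)
    l1 = math.exp(-mu1) * mu1**n / math.factorial(n)
    return prior0 * l0 / (prior0 * l0 + (1 - prior0) * l1)

def discriminate(counts, mu0s, mu1s, prior0=0.5):
    """Recursive Bayesian update over a codeword: each slot's photon count
    refines the posterior, which becomes the next slot's prior."""
    p = prior0
    for n, mu0, mu1 in zip(counts, mu0s, mu1s):
        p = posterior_h0(p, n, mu0, mu1)
    return p  # decide H0 if p > 0.5
```

Each update is exactly Bayes' rule with Poisson likelihoods, which is the "information efficiency at each time instant" viewpoint in its simplest classical form.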
Continuous testing for Poisson process intensities: A new perspective on scanning statistics
We propose a novel continuous testing framework to test the intensities of
Poisson processes. This framework allows a rigorous definition of the complete
testing procedure, from an infinite number of hypotheses to joint error rates.
Our work extends traditional procedures based on scanning windows, by
controlling the family-wise error rate and the false discovery rate in a
non-asymptotic manner and in a continuous way. The decision rule is based on a
p-value process that can be estimated by a Monte Carlo procedure. We also
propose new test statistics based on kernels. Our method is applied in
neuroscience and genomics through the standard test of homogeneity and the
two-sample test.
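A minimal sketch of Monte Carlo estimation of a scanning p-value under a homogeneous Poisson null: simulate the null process, recompute the scan statistic, and count exceedances. The window shape, the anchoring of windows at observed points, and all names are simplifying assumptions for illustration, not the paper's exact continuous procedure.

```python
import random

def scan_stat(points, w):
    """Largest number of points falling in any window [p, p + w) anchored at a point."""
    if not points:
        return 0
    return max(sum(1 for q in points if p <= q < p + w) for p in points)

def poisson_process(rate, rng):
    """Homogeneous Poisson process on [0, 1) via exponential inter-arrival gaps."""
    t, pts = rng.expovariate(rate), []
    while t < 1.0:
        pts.append(t)
        t += rng.expovariate(rate)
    return pts

def mc_pvalue(points, rate, w=0.1, n_sim=500, seed=0):
    """Monte Carlo estimate of P(scan >= observed) under the homogeneous null."""
    rng = random.Random(seed)
    obs = scan_stat(points, w)
    exceed = sum(scan_stat(poisson_process(rate, rng), w) >= obs
                 for _ in range(n_sim))
    return (1 + exceed) / (1 + n_sim)  # add-one correction keeps the p-value valid
```

A tight cluster of points yields a large scan statistic and hence a small Monte Carlo p-value relative to the homogeneous null.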
Toward Optimal Feature Selection in Naive Bayes for Text Categorization
Automated feature selection is important for text categorization to reduce
the feature size and to speed up the learning process of classifiers. In this
paper, we present a novel and efficient feature selection framework based on
information theory, which aims to rank the features by their
discriminative capacity for classification. We first revisit two information
measures: Kullback-Leibler divergence and Jeffreys divergence for binary
hypothesis testing, and analyze their asymptotic properties relating to type I
and type II errors of a Bayesian classifier. We then introduce a new divergence
measure, called Jeffreys-Multi-Hypothesis (JMH) divergence, to measure
multi-distribution divergence for multi-class classification. Based on the
JMH-divergence, we develop two efficient feature selection methods for text
categorization.
The promising results of extensive experiments demonstrate the effectiveness of
the proposed approaches.
Comment: Submitted to the IEEE Trans. Knowledge and Data Engineering. 14 pages, 5 figures
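The two divergence measures revisited above can be sketched directly; the ranking helper and its input format are illustrative assumptions, not the paper's exact selection algorithm.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); a small epsilon guards empirical zero probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def jeffreys_divergence(p, q):
    """Symmetrized KL: J(p, q) = KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def rank_features(cond_dists):
    """Rank features by the Jeffreys divergence between their class-conditional
    distributions. cond_dists maps feature name -> (dist_class_0, dist_class_1)."""
    return sorted(cond_dists,
                  key=lambda f: jeffreys_divergence(*cond_dists[f]),
                  reverse=True)
```

Features whose class-conditional distributions differ more sharply score higher and are kept; this is the basic ranking principle the framework builds on.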
A flexible regression model for count data
Poisson regression is a popular tool for modeling count data and is applied
in a vast array of applications from the social to the physical sciences and
beyond. Real data, however, are often over- or under-dispersed and, thus, not
conducive to Poisson regression. We propose a regression model based on the
Conway--Maxwell-Poisson (COM-Poisson) distribution to address this problem. The
COM-Poisson regression generalizes the well-known Poisson and logistic
regression models, and is suitable for fitting count data with a wide range of
dispersion levels. With a GLM approach that takes advantage of exponential
family properties, we discuss model estimation, inference, diagnostics, and
interpretation, and present a test for determining the need for a COM-Poisson
regression over a standard Poisson regression. We compare the COM-Poisson to
several alternatives and illustrate its advantages and usefulness using three
data sets with varying dispersion.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS306 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
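The COM-Poisson pmf underlying the regression model above can be sketched as follows. The form P(Y = y) = lam^y / (y!)^nu / Z(lam, nu) is standard; the truncation length and parameter names here are our own choices for illustration.

```python
import math

def com_poisson_pmf(y, lam, nu, max_terms=150):
    """COM-Poisson pmf. nu = 1 recovers the Poisson distribution; nu < 1
    gives overdispersion and nu > 1 underdispersion. The normalizing
    constant Z is truncated at max_terms, adequate for moderate lam."""
    # Build Z via ratios of consecutive terms to avoid factorial overflow.
    z, term = 0.0, 1.0
    for j in range(max_terms):
        z += term
        term *= lam / (j + 1) ** nu
    # Numerator lam**y / (y!)**nu, computed the same incremental way.
    num = 1.0
    for j in range(y):
        num *= lam / (j + 1) ** nu
    return num / z
```

Setting nu = 1 collapses Z to e^lam, so the pmf matches the ordinary Poisson, which is the nesting property the regression test in the paper exploits.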
Online Updating of Statistical Inference in the Big Data Setting
We present statistical methods for big data arising from online analytical
processing, where large amounts of data arrive in streams and require fast
analysis without storage/access to the historical data. In particular, we
develop iterative estimating algorithms and statistical inferences for linear
models and estimating equations that update as new data arrive. These
algorithms are computationally efficient, minimally storage-intensive, and
allow for possible rank deficiencies in the subset design matrices due to
rare-event covariates. Within the linear model setting, the proposed
online-updating framework leads to predictive residual tests that can be used
to assess the goodness-of-fit of the hypothesized model. We also propose a new
online-updating estimator under the estimating equation setting. Theoretical
properties of the goodness-of-fit tests and proposed estimators are examined in
detail. In simulation studies and real data applications, our estimator
compares favorably with competing approaches under the estimating equation
setting.
Comment: Submitted to Technometrics
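The online-updating idea for linear models can be sketched by accumulating the sufficient statistics X'X and X'y batch by batch, so historical raw data never needs to be stored or revisited. This is a minimal illustration only; the paper's framework also covers estimating equations and predictive residual tests, and the class name here is our own.

```python
import numpy as np

class OnlineLeastSquares:
    """Streaming least squares: keep only X'X and X'y across batches."""

    def __init__(self, n_features):
        self.xtx = np.zeros((n_features, n_features))
        self.xty = np.zeros(n_features)
        self.n = 0

    def update(self, X, y):
        """Fold one incoming batch into the running sufficient statistics."""
        X, y = np.asarray(X, float), np.asarray(y, float)
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.n += len(y)

    def coef(self):
        # pinv tolerates rank-deficient per-batch design matrices,
        # e.g. batches where a rare-event covariate never appears.
        return np.linalg.pinv(self.xtx) @ self.xty
```

Because X'X and X'y are additive across batches, the final coefficients are identical to those from a single fit on the pooled data.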
Statistics for the Luria-Delbr\"uck distribution
The Luria-Delbr\"uck distribution is a classical model of mutations in cell
kinetics. It is obtained as a limit when the probability of mutation tends to
zero and the number of divisions to infinity. It can be interpreted as a
compound Poisson distribution (for the number of mutations) of exponential
mixtures (for the developing time of mutant clones) of geometric distributions
(for the number of cells produced by a mutant clone in a given time). The
probabilistic interpretation, and a rigorous proof of convergence in the
general case, are deduced from classical results on Bellman-Harris branching
processes. The two parameters of the Luria-Delbr\"uck distribution are the
expected number of mutations, which is the parameter of interest, and the
relative fitness of normal cells compared to mutants, which is the heavy tail
exponent. Both can be simultaneously estimated by the maximum likelihood method.
However, the computation becomes numerically unstable as soon as the maximal
value of the sample is large, which occurs frequently due to the heavy tail
property. Based on the empirical generating function, robust estimators are
proposed and their asymptotic variance is given. They are comparable in
precision to maximum likelihood estimators, with a much broader range of
calculability, better numerical stability, and negligible computing time.
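The compound construction described above (a Poisson number of mutations, exponential developing times, geometric clone sizes) can be sketched as a sampler. The parameterization below, with the fitness rho acting as the heavy-tail exponent, is one standard reading of that description rather than the paper's exact formulation; for rho = 1 the clone-size pmf reduces to the classical 1/(k(k+1)).

```python
import math
import random

def clone_size(rho, rng):
    """Clone size: geometric on {1, 2, ...} whose success probability
    e^{-T} comes from an Exp(rho) developing time T, giving a heavy
    tail with exponent rho."""
    q = rng.random() ** (1.0 / rho)   # e^{-T} with T ~ Exp(rho)
    q = max(q, 1e-15)                 # guard against log(1 - q) == 0 below
    u = max(rng.random(), 1e-300)     # guard against log(0)
    # Geometric sampling by inversion: P(K > k) = (1 - q)**k.
    return int(math.log(u) / math.log(1.0 - q)) + 1 if q < 1.0 else 1

def luria_delbruck(m, rho, rng):
    """One draw: compound sum of clone sizes over Poisson(m) mutations."""
    n_mut, t = 0, rng.expovariate(1.0)
    while t < m:                      # Poisson(m) via unit-rate arrivals on [0, m)
        n_mut += 1
        t += rng.expovariate(1.0)
    return sum(clone_size(rho, rng) for _ in range(n_mut))
```

Samples drawn this way exhibit the heavy tail that makes maximum likelihood numerically unstable for large observations, which motivates the generating-function estimators proposed in the abstract.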