One-Class Semi-Supervised Learning: Detecting Linearly Separable Class by its Mean
In this paper, we presented a novel semi-supervised one-class classification
algorithm which assumes that the class is linearly separable from other elements.
We proved theoretically that a class is linearly separable if and only if it is
maximal by probability within the sets with the same mean. Furthermore, we
presented an algorithm for identifying such a linearly separable class using
linear programming. We described three application cases: an
assumption of linear separability, a Gaussian distribution, and
linear separability in the transformed space of kernel functions. Finally, we
demonstrated the proposed algorithm on the USPS dataset and
analyzed the relationship between the performance of the algorithm and the size of
the initially labeled sample.
Discrete Rényi Classifiers
Consider the binary classification problem of predicting a target variable
from a discrete feature vector . When the probability
distribution is known, the optimal classifier, leading to the
minimum misclassification rate, is given by the Maximum A-posteriori
Probability decision rule. However, estimating the complete joint distribution
is computationally and statistically impossible for large
values of . An alternative approach is to first estimate some low order
marginals of and then design the classifier based on the
estimated low order marginals. This approach is also helpful when the complete
training data instances are not available due to privacy concerns. In this
work, we consider the problem of finding the optimum classifier based on some
estimated low order marginals of . We prove that for a given set of
marginals, the minimum Hirschfeld-Gebelein-Rényi (HGR) correlation principle
introduced in [1] leads to a randomized classification rule which is shown to
have a misclassification rate no larger than twice the misclassification rate
of the optimal classifier. Then, under a separability condition, we show that
the proposed algorithm is equivalent to a randomized linear regression
approach. In addition, this method naturally results in a robust feature
selection method selecting a subset of features having the maximum worst case
HGR correlation with the target variable. Our theoretical upper-bound is
similar to the recent Discrete Chebyshev Classifier (DCC) approach [2], while
the proposed algorithm has significant computational advantages since it only
requires solving a least-squares optimization problem. Finally, we numerically
compare our proposed algorithm with the DCC classifier and show that the
proposed algorithm achieves a lower misclassification rate on various
datasets.
simode: R Package for statistical inference of ordinary differential equations using separable integral-matching
In this paper we describe simode: Separable Integral Matching for Ordinary
Differential Equations. The statistical methodologies applied in the package
focus on several minimization procedures of an integral-matching criterion
function, taking advantage of the mathematical structure of the differential
equations like separability of parameters from equations. Application of
integral based methods to parameter estimation of ordinary differential
equations was shown to yield more accurate and stable results compared to
derivative-based ones. Linear features such as separability were shown to ease
optimization and inference. We demonstrate the functionalities of the package
using various systems of ordinary differential equations.
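The integral-matching idea in the abstract above can be sketched for the simplest separable case. This is a minimal illustration in Python, not the simode package's R interface: the function names are hypothetical, the model x'(t) = -theta * x(t) is chosen for illustration, the data are noise-free, and x(0) is taken as the first observation rather than estimated.

```python
import math

def cumulative_trapezoid(ts, xs):
    """Running integral of xs over ts by the trapezoid rule, starting at 0."""
    out = [0.0]
    for i in range(1, len(ts)):
        out.append(out[-1] + 0.5 * (xs[i] + xs[i - 1]) * (ts[i] - ts[i - 1]))
    return out

def estimate_decay_rate(ts, xs):
    """Least-squares estimate of theta in x'(t) = -theta * x(t).

    Integrating the ODE gives x(t) - x(0) = -theta * I(t), where I(t) is the
    running integral of x. Because theta enters linearly (it is separable
    from the state), the minimizer has a closed form and no derivative of
    the data is ever needed.
    """
    integrals = cumulative_trapezoid(ts, xs)
    num = sum((x - xs[0]) * big_i for x, big_i in zip(xs, integrals))
    den = sum(big_i * big_i for big_i in integrals)
    return -num / den

# Synthetic, noise-free observations of x(t) = exp(-0.5 t) on [0, 4].
ts = [0.01 * k for k in range(401)]
xs = [math.exp(-0.5 * t) for t in ts]
theta_hat = estimate_decay_rate(ts, xs)
```

With a fine grid the estimate lands very close to the true rate of 0.5; with noisy data the integral smooths the observations, which is the usual argument for preferring integral-based over derivative-based criteria.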
An improved semidefinite programming hierarchy for testing entanglement
We present a stronger version of the Doherty-Parrilo-Spedalieri (DPS)
hierarchy of approximations for the set of separable states. Unlike DPS, our
hierarchy converges exactly at a finite number of rounds for any fixed input
dimension. This yields an algorithm for separability testing which is singly
exponential in dimension and polylogarithmic in accuracy. Our analysis makes
use of tools from algebraic geometry, but our algorithm is elementary and
differs from DPS only by one simple additional collection of constraints.
Comment: 22 pages. v2: published version, adds numerical results. Matlab code
available at https://github.com/isobovine/dpsplus
Wideband Massive MIMO Channel Estimation via Sequential Atomic Norm Minimization
The recently introduced atomic norm minimization (ANM) framework for
parameter estimation is a promising candidate towards low overhead channel
estimation in wireless communications. However, previous works on ANM-based
channel estimation evaluated performance on channels with artificially imposed
channel path separability, which cannot be guaranteed in practice. In addition,
direct application of the ANM framework for massive MIMO channel estimation is
computationally infeasible due to the large dimensions. In this paper, a
low-complexity ANM-based channel estimator for wideband massive MIMO is
proposed, consisting of two sequential steps, the first estimating the channel
over the spatial and the second over the frequency dimension. Its mean squared
error performance is analytically characterized in terms of tight lower bounds.
It is shown that the proposed algorithm achieves excellent performance that is
close to the best that can be achieved by any unbiased channel estimator in the
regime of low to moderate number of channel paths, without any restrictions on
their separability.
Comment: extended version of paper submitted to globalSIP 201
Necessary and Sufficient Conditions and a Provably Efficient Algorithm for Separable Topic Discovery
We develop necessary and sufficient conditions and a novel provably
consistent and efficient algorithm for discovering topics (latent factors) from
observations (documents) that are realized from a probabilistic mixture of
shared latent factors that have certain properties. Our focus is on the class
of topic models in which each shared latent factor contains a novel word that
is unique to that factor, a property that has come to be known as separability.
Our algorithm is based on the key insight that the novel words correspond to
the extreme points of the convex hull formed by the row-vectors of a suitably
normalized word co-occurrence matrix. We leverage this geometric insight to
establish polynomial computation and sample complexity bounds based on a few
isotropic random projections of the rows of the normalized word co-occurrence
matrix. Our proposed random-projections-based algorithm is naturally amenable
to an efficient distributed implementation and is attractive for modern
web-scale distributed data mining applications.
Comment: Typo corrected; Revised argument in Lemma 3 and
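The geometric insight in the abstract above (novel words sit at the extreme points of a convex hull, and random projections can expose them) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' algorithm: the function name, the number of projections, and the hand-built rows are all hypothetical, and the rows stand in for an already normalized co-occurrence matrix.

```python
import random

def extreme_rows_by_projection(rows, num_projections=200, seed=0):
    """Return indices of rows that maximize at least one random direction.

    A linear function over a convex hull attains its maximum at an extreme
    point, so every index returned here is a vertex of the hull of `rows`
    (ties have probability zero for Gaussian directions).
    """
    rng = random.Random(seed)
    dim = len(rows[0])
    found = set()
    for _ in range(num_projections):
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        scores = [sum(a * b for a, b in zip(row, direction)) for row in rows]
        found.add(max(range(len(rows)), key=scores.__getitem__))
    return found

# Toy data: rows 0-2 are simplex vertices (stand-ins for novel words),
# rows 3-4 are strict convex mixtures of them (stand-ins for ordinary words).
vertices = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mixtures = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
rows = vertices + mixtures
novel = extreme_rows_by_projection(rows)
```

Each projection is independent of the others, which is why a scheme of this kind distributes easily: workers can score disjoint sets of directions and merge the resulting index sets.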
Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications
Nonnegative matrix factorization (NMF) has become a workhorse for signal and
data analytics, triggered by its model parsimony and interpretability. Perhaps
a bit surprisingly, the understanding of its model identifiability---the major
reason behind the interpretability in many applications such as topic mining
and hyperspectral imaging---had been rather limited until recent years.
Beginning from the 2010s, the identifiability research of NMF has progressed
considerably: Many interesting and important results have been discovered by
the signal processing (SP) and machine learning (ML) communities. NMF
identifiability has a great impact on many aspects in practice, such as
ill-posed formulation avoidance and performance-guaranteed algorithm design. On
the other hand, there is no tutorial paper that introduces NMF from an
identifiability viewpoint. In this paper, we aim at filling this gap by
offering a comprehensive and deep tutorial on model identifiability of NMF as
well as the connections to algorithms and applications. This tutorial will help
researchers and graduate students grasp the essence and insights of NMF,
thereby avoiding typical `pitfalls' that are often times due to unidentifiable
NMF formulations. This paper will also help practitioners pick/design suitable
factorization tools for their own problems.
Comment: accepted version, IEEE Signal Processing Magazine; supplementary
materials added. Some minor revisions implemente
On the Upper Limit of Separability
We propose an approach to rapidly find the upper limit of separability
between datasets that is directly applicable to HEP classification problems.
The most common HEP classification task is to use values (variables) for an
object (event) to estimate the probability that it is signal vs. background.
Most techniques first use known samples to identify differences in how signal
and background events are distributed throughout the -dimensional variable
space, then use those differences to classify events of unknown type.
Qualitatively, the greater the differences, the more effectively one can
classify events of unknown type. We will show that the Mutual Information (MI)
between the -dimensional signal-background mixed distribution and the
answers for the known events, tells us the upper-limit of separation for that
set of variables. We will then compare that value to the Jensen-Shannon
Divergence between the output distributions from a classifier to test whether
it has extracted all possible information from the input variables. We will
also discuss speed improvements to a standard method for calculating MI.
Our approach will allow one to: a) quickly measure the maximum possible
effectiveness of a large number of potential discriminating variables
independent of any specific classification algorithm, b) identify potential
discriminating variables that are redundant, and c) determine whether a
classification algorithm has achieved the maximum possible separation. We test
these claims first on simple distributions and then on Monte Carlo samples
generated for Supersymmetry and Higgs searches. In all cases, we were able to
a) predict the separation that a classification algorithm would reach, b)
identify variables that carried no additional discriminating power, and c)
identify whether an algorithm had reached the optimum separation. Our code is
publicly available.
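The comparison described in the abstract above, Mutual Information of the inputs against the Jensen-Shannon Divergence of a classifier's outputs, can be sketched on a discrete toy problem. This is not the authors' code: the function names and the four-bin distributions are made up for illustration, and the sketch assumes equal signal and background priors, in which case the MI between variable and label equals the JSD between the two class-conditional distributions.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))

def mutual_information_equal_priors(p_sig, p_bkg):
    """I(X; Y) in bits, with Y uniform on {signal, background}.

    Computed directly from the joint distribution p(x, y) = 0.5 * p(x | y).
    """
    mi = 0.0
    for ps, pb in zip(p_sig, p_bkg):
        px = 0.5 * (ps + pb)
        for p_x_given_y in (ps, pb):
            pxy = 0.5 * p_x_given_y
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px * 0.5))
    return mi

def coarsen(p, groups):
    """Distribution of a deterministic classifier output that merges bins."""
    return [sum(p[i] for i in g) for g in groups]

# Hypothetical binned variable: signal and background populate 4 bins.
p_sig = [0.5, 0.3, 0.15, 0.05]
p_bkg = [0.05, 0.15, 0.3, 0.5]
upper_limit = mutual_information_equal_priors(p_sig, p_bkg)

# Hypothetical classifier that only reports "low bins" vs "high bins".
groups = [(0, 1), (2, 3)]
achieved = jensen_shannon(coarsen(p_sig, groups), coarsen(p_bkg, groups))
```

By the data processing inequality, `achieved` can never exceed `upper_limit`; a gap between the two indicates that the classifier output has discarded information present in the input variable.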
Relaxations of separability in multipartite systems: Semidefinite programs, witnesses and volumes
While entanglement is believed to be an important ingredient in understanding
quantum many-body physics, the complexity of its characterization scales very
unfavorably with the size of the system. Finding super-sets of the set of
separable states that admit a simpler description has proven to be a fruitful
approach in the bipartite setting. In this paper we discuss a systematic way of
characterizing multiparticle entanglement via various relaxations. We
furthermore describe an operational witness construction arising from such
relaxations that is capable of detecting every entangled state. Finally, we
also derive an analytic upper-bound on the volume of biseparable states and
show that the volume of the states with a positive partial transpose for any
split rapidly outgrows this volume. This proves that simple semi-definite
relaxations in the multiparticle case cannot be an equally good approximation
for any scenario.
Comment: 26 pages. In v2: proposed SDP implemented, analytical example
included, typos corrected, references added (published version
Sum-rate Maximization in Sub-28 GHz Millimeter-Wave MIMO Interfering Networks
MIMO systems in the lower part of the millimeter-wave spectrum band (i.e.,
below 28 GHz) do not exhibit as much directivity and selectivity as their
counterparts in higher bands of the spectrum (i.e., above 60 GHz), and thus
still suffer from the detrimental effect of interference on the system
sum-rate. As such systems exhibit large numbers of antennas and short coherence
times for the channel, traditional methods of distributed coordination are
ill-suited, and the resulting communication overhead would offset the gains of
coordination. In this work, we propose algorithms for tackling the sum-rate
maximization problem, that are designed to address the above limitations. We
derive a lower bound on the sum-rate, a so-called DLT bound (i.e., a difference
of log and trace), shed light on its tightness, and highlight its decoupled
nature at both the transmitters and receivers. Moreover, we derive the solution
to each of the subproblems, which we dub non-homogeneous waterfilling (a
variation on the MIMO waterfilling solution), and underline an inherent
desirable feature: its ability to turn off streams exhibiting low SINR, which
contributes to greatly speeding up the convergence of the proposed algorithm. We
then show the convergence of the resulting algorithm, max-DLT, to a stationary
point of the DLT bound. Finally, we rely on extensive simulations of various
network configurations, to establish the fast-converging nature of our proposed
schemes, and thus their suitability for addressing the short coherence
interval, as well as the increased system dimensions, arising when managing
interference in lower bands of the millimeter wave spectrum. Moreover, our
results also suggest that interference management still brings about
significant performance gains, especially in dense deployments.
Comment: 16 page
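For orientation, the classical waterfilling solution that the abstract above says its non-homogeneous variant builds on can be sketched as below. This is the textbook single-user allocation, not the paper's non-homogeneous waterfilling or its max-DLT algorithm; the function name and the example gains are hypothetical, and the sketch assumes strictly positive gains and a positive power budget.

```python
def waterfill(gains, total_power):
    """Classical waterfilling over parallel channels.

    Maximizes sum(log(1 + g_i * p_i)) subject to sum(p_i) = total_power and
    p_i >= 0. The optimum is p_i = max(0, level - 1 / g_i), where the water
    level is set so that the budget is met.
    """
    # Try the strongest channels first; drop the weakest until all
    # remaining powers are positive.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    active = len(order)
    while active > 0:
        inverse_sum = sum(1.0 / gains[i] for i in order[:active])
        level = (total_power + inverse_sum) / active
        if level - 1.0 / gains[order[active - 1]] > 0:
            break
        active -= 1
    powers = [0.0] * len(gains)
    for i in order[:active]:
        powers[i] = level - 1.0 / gains[i]
    return powers

# Three streams with effective gains 2.0, 1.0 and 0.1, unit power budget.
powers = waterfill([2.0, 1.0, 0.1], total_power=1.0)
# -> approximately [0.75, 0.25, 0.0]: the weakest stream is switched off
```

The third stream receiving zero power is the classical counterpart of the stream turn-off behavior that the abstract highlights for low-SINR streams.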