3 research outputs found
A Note on Reverse Pinsker Inequalities
A simple method is shown to provide optimal variational bounds on
f-divergences with possible constraints on relative information extrema.
Known results are refined or proved to be optimal as particular cases.
Comment: To appear in the IEEE Transactions on Information Theory
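For context (background recalled from standard references, not taken from the abstract): the classical, forward Pinsker inequality that the title references lower-bounds relative entropy by squared total variation,

```latex
% Pinsker's inequality (in nats): relative entropy dominates squared total variation
D(P \,\|\, Q) \;\ge\; 2\,\delta(P, Q)^2,
\qquad \delta(P, Q) \;=\; \sup_{A} \bigl| P(A) - Q(A) \bigr| .
```

A reverse Pinsker inequality runs the other way, upper-bounding D(P‖Q) by a function of δ(P, Q). No such bound can hold unconditionally, since D(P‖Q) may be infinite while δ(P, Q) ≤ 1 always; this is why constraints on the relative information log(dP/dQ), such as the extremum constraints the abstract mentions, are needed.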
Beyond H-Divergence: Domain Adaptation Theory With Jensen-Shannon Divergence
We reveal the incoherence between the widely-adopted empirical domain
adversarial training and its generally-assumed theoretical counterpart based on
H-divergence. Concretely, we find that H-divergence is
not equivalent to Jensen-Shannon divergence, the optimization objective in
domain adversarial training. To this end, we establish a new theoretical
framework by directly proving the upper and lower target risk bounds based on
joint distributional Jensen-Shannon divergence. We further derive
bi-directional upper bounds for marginal and conditional shifts. Our framework
exhibits inherent flexibility for different transfer learning problems, and
is usable in various scenarios where H-divergence-based theory
fails to adapt. From an algorithmic perspective, our theory enables a generic
guideline unifying principles of semantic conditional matching, feature
marginal matching, and label marginal shift correction. We employ algorithms
for each principle and empirically validate the benefits of our framework on
real datasets.
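Since the paper's central observation hinges on what the Jensen-Shannon divergence actually computes, a minimal sketch of its textbook definition may help: JSD(P, Q) = ½ KL(P‖M) + ½ KL(Q‖M), where M is the midpoint mixture. This is the standard formula only, not code from the paper:

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded in [0, 1] bits."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # midpoint mixture
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Identical distributions score 0; disjoint supports reach the 1-bit maximum.
print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Unlike KL, the mixture M is always absolutely continuous with respect to P and Q, so the value is finite even for disjoint supports; that boundedness is part of what makes it a different object from an H-divergence.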
A Non-Parametric Subspace Analysis Approach with Application to Anomaly Detection Ensembles
Identifying anomalies in multi-dimensional datasets is an important task in
many real-world applications. A special case arises when anomalies are occluded
in a small set of attributes, typically referred to as a subspace, and not
necessarily over the entire data space. In this paper, we propose a new
subspace analysis approach named Agglomerative Attribute Grouping (AAG) that
aims to address this challenge by searching for subspaces that are comprised of
highly correlated attributes. Such correlations among attributes represent a
systematic interaction among the attributes that can better reflect the
behavior of normal observations and hence can be used to improve the
identification of two particularly interesting types of abnormal data samples:
anomalies that are occluded in relatively small subsets of the attributes and
anomalies that represent a new data class. AAG relies on a novel
multi-attribute measure, which is derived from information theory measures of
partitions, for evaluating the "information distance" between groups of data
attributes. To determine the set of subspaces to use, AAG applies a variation
of the well-known agglomerative clustering algorithm with the proposed
multi-attribute measure as the underlying distance function. Finally, the set
of subspaces is used in an ensemble for anomaly detection. Extensive evaluation
demonstrates that, in the vast majority of cases, the proposed AAG method (i)
outperforms classical and state-of-the-art subspace analysis methods when used
in anomaly detection ensembles, and (ii) generates fewer subspaces with fewer
attributes each (on average), thus resulting in a faster training
time for the anomaly detection ensemble. Furthermore, in contrast to existing
methods, the proposed AAG method does not require any parameter tuning.
Comment: 41 pages, 9 figures
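The pipeline the abstract describes (an information-theoretic distance between attribute groups, fed to agglomerative clustering) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: it uses the variation of information as the "information distance" between attribute groups, and a plain greedy merge with a stopping threshold; the function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of a sequence."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())

def variation_of_information(xs, ys):
    """Information distance VI(X, Y) = 2*H(X, Y) - H(X) - H(Y).

    Equals H(X|Y) + H(Y|X); a metric that is 0 iff X and Y determine each other.
    """
    joint = entropy(list(zip(xs, ys)))
    return 2 * joint - entropy(xs) - entropy(ys)

def agglomerate(columns, threshold):
    """Greedy agglomerative attribute grouping (illustrative sketch).

    Repeatedly merges the pair of attribute groups with the smallest
    information distance, stopping once every remaining pair is farther
    apart than `threshold`. Each group is returned as a list of column indices.
    """
    groups = [[i] for i in range(len(columns))]

    def group_values(group):
        # A group of attributes is viewed as one compound attribute:
        # each row becomes the tuple of its values on the group's columns.
        return list(zip(*(columns[i] for i in group)))

    while len(groups) > 1:
        pairs = [(variation_of_information(group_values(a), group_values(b)), ia, ib)
                 for ia, a in enumerate(groups)
                 for ib, b in enumerate(groups) if ia < ib]
        dist, ia, ib = min(pairs)
        if dist > threshold:
            break
        groups[ia] = groups[ia] + groups[ib]
        del groups[ib]
    return groups

# Two perfectly correlated columns merge; the independent one stays apart.
cols = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
print(agglomerate(cols, 0.5))  # [[0, 1], [2]]
```

Each resulting group would then define one subspace for a detector in the ensemble; the real AAG method's multi-attribute measure over partitions and its parameter-free stopping rule are more refined than this two-group distance and fixed threshold.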