612 research outputs found
One-Class Support Measure Machines for Group Anomaly Detection
We propose one-class support measure machines (OCSMMs) for group anomaly
detection which aims at recognizing anomalous aggregate behaviors of data
points. The OCSMMs generalize well-known one-class support vector machines
(OCSVMs) to a space of probability measures. By formulating the problem as
quantile estimation on distributions, we can establish an interesting
connection to the OCSVMs and variable kernel density estimators (VKDEs) over
the input space on which the distributions are defined, bridging the gap
between large-margin methods and kernel density estimators. In particular, we
show that various types of VKDEs can be considered as solutions to a class of
regularization problems studied in this paper. Experiments on Sloan Digital Sky
Survey dataset and High Energy Particle Physics dataset demonstrate the
benefits of the proposed framework in real-world applications.Comment: Conference on Uncertainty in Artificial Intelligence (UAI2013
Nonparametric Anomaly Detection and Secure Communication
Two major security challenges in information systems are detection of anomalous data patterns that reflect malicious intrusions into data storage systems and protection of data from malicious eavesdropping during data transmissions. The first problem typically involves design of statistical tests to identify data variations, and the second problem generally involves design of communication schemes to transmit data securely in the presence of malicious eavesdroppers. The main theme of this thesis is to exploit information theoretic and statistical tools to address the above two security issues in order to provide information theoretically provable security, i.e., anomaly detection with vanishing probability of error and guaranteed secure communication with vanishing leakage rate at eavesdroppers.
First, the anomaly detection problem is investigated, in which typical and anomalous patterns (i.e., distributions that generate data) are unknown \emph{a priori}. Two types of problems are investigated. The first problem considers detection of the existence of anomalous geometric structures over networks, and the second problem considers the detection of a set of anomalous data streams out of a large number of data streams. In both problems, anomalous data are assumed to be generated by a distribution , which is different from a distribution generating typical samples. For both problems, kernel-based tests are proposed, which are based on maximum mean discrepancy (MMD) that measures the distance between mean embeddings of distributions into a reproducing kernel Hilbert space. These tests are nonparametric without exploiting the information about and and are universally applicable to arbitrary and . Furthermore, these tests are shown to be statistically consistent under certain conditions on the parameters of the problems. These conditions are further shown to be necessary or nearly necessary, which implies that the MMD-based tests are order level optimal or nearly order level optimal. Numerical results are provided to demonstrate the performance of the proposed tests.
The secure communication problem is then investigated, for which the focus is on degraded broadcast channels. In such channels, one transmitter sends messages to multiple receivers, the channel quality of which can be ordered. Two specific models are studied. In the first model, layered decoding and layered secrecy are required, i.e., each receiver decodes one more message than the receiver with one level worse channel quality, and this message should be kept secure from all receivers with worse channel qualities. In the second model, secrecy only outside a bounded range is required, i.e., each message is required to be kept secure from the receiver with two-level worse channel quality. Communication schemes for both models are designed and the corresponding achievable rate regions (i.e., inner bounds on the capacity region) are characterized. Furthermore, outer bounds on the capacity region are developed, which match the inner bounds, and hence the secrecy capacity regions are established for both models
- …