612 research outputs found

    One-Class Support Measure Machines for Group Anomaly Detection

    Full text link
    We propose one-class support measure machines (OCSMMs) for group anomaly detection which aims at recognizing anomalous aggregate behaviors of data points. The OCSMMs generalize well-known one-class support vector machines (OCSVMs) to a space of probability measures. By formulating the problem as quantile estimation on distributions, we can establish an interesting connection to the OCSVMs and variable kernel density estimators (VKDEs) over the input space on which the distributions are defined, bridging the gap between large-margin methods and kernel density estimators. In particular, we show that various types of VKDEs can be considered as solutions to a class of regularization problems studied in this paper. Experiments on Sloan Digital Sky Survey dataset and High Energy Particle Physics dataset demonstrate the benefits of the proposed framework in real-world applications.Comment: Conference on Uncertainty in Artificial Intelligence (UAI2013

    Nonparametric Anomaly Detection and Secure Communication

    Get PDF
    Two major security challenges in information systems are detection of anomalous data patterns that reflect malicious intrusions into data storage systems and protection of data from malicious eavesdropping during data transmissions. The first problem typically involves design of statistical tests to identify data variations, and the second problem generally involves design of communication schemes to transmit data securely in the presence of malicious eavesdroppers. The main theme of this thesis is to exploit information theoretic and statistical tools to address the above two security issues in order to provide information theoretically provable security, i.e., anomaly detection with vanishing probability of error and guaranteed secure communication with vanishing leakage rate at eavesdroppers. First, the anomaly detection problem is investigated, in which typical and anomalous patterns (i.e., distributions that generate data) are unknown \emph{a priori}. Two types of problems are investigated. The first problem considers detection of the existence of anomalous geometric structures over networks, and the second problem considers the detection of a set of anomalous data streams out of a large number of data streams. In both problems, anomalous data are assumed to be generated by a distribution qq, which is different from a distribution pp generating typical samples. For both problems, kernel-based tests are proposed, which are based on maximum mean discrepancy (MMD) that measures the distance between mean embeddings of distributions into a reproducing kernel Hilbert space. These tests are nonparametric without exploiting the information about pp and qq and are universally applicable to arbitrary pp and qq. Furthermore, these tests are shown to be statistically consistent under certain conditions on the parameters of the problems. These conditions are further shown to be necessary or nearly necessary, which implies that the MMD-based tests are order level optimal or nearly order level optimal. Numerical results are provided to demonstrate the performance of the proposed tests. The secure communication problem is then investigated, for which the focus is on degraded broadcast channels. In such channels, one transmitter sends messages to multiple receivers, the channel quality of which can be ordered. Two specific models are studied. In the first model, layered decoding and layered secrecy are required, i.e., each receiver decodes one more message than the receiver with one level worse channel quality, and this message should be kept secure from all receivers with worse channel qualities. In the second model, secrecy only outside a bounded range is required, i.e., each message is required to be kept secure from the receiver with two-level worse channel quality. Communication schemes for both models are designed and the corresponding achievable rate regions (i.e., inner bounds on the capacity region) are characterized. Furthermore, outer bounds on the capacity region are developed, which match the inner bounds, and hence the secrecy capacity regions are established for both models
    corecore