
    Practical Attacks Against Graph-based Clustering

    Graph modeling allows numerous security problems to be tackled in a general way; however, little work has been done to understand how well graph-based approaches withstand adversarial attacks. We design and evaluate two novel graph attacks against a state-of-the-art network-level, graph-based detection system. Our work highlights areas in adversarial machine learning that have not yet been addressed, specifically: graph-based clustering techniques, and a global feature space in which defenders must account for realistic attackers without perfect knowledge in order to be practical. Even though less informed attackers can evade graph clustering at low cost, we show that some practical defenses are possible. Comment: ACM CCS 201

    Label noise detection under the Noise at Random model with ensemble filters

    Label noise detection has been widely studied in Machine Learning because of its importance in improving training data quality. Satisfactory noise detection has been achieved by adopting ensembles of classifiers. In this approach, an instance is flagged as mislabeled if a high proportion of members in the pool misclassify it. Previous authors have empirically evaluated this approach; nevertheless, they mostly assumed that label noise is generated completely at random in a dataset. This is a strong assumption, since other types of label noise are feasible in practice and can influence noise detection results. This work investigates the performance of ensemble noise detection under two different noise models: the Noisy at Random (NAR) model, in which the probability of label noise depends on the instance class, and the Noisy Completely at Random model, in which the probability of label noise is entirely independent. In this setting, we investigate the effect of class distribution on noise detection performance, since it changes the total noise level observed in a dataset under the NAR assumption. Further, an evaluation of the ensemble vote threshold is conducted to contrast with the most common approaches in the literature. In many of the experiments, choosing one noise generation model over the other leads to different results when considering aspects such as class imbalance and the noise level ratio among different classes. Comment: Accepted for publication in IOS Press Intelligent Data Analysis. This paper will appear in Volume 26(5) of the IDA journal. The publication date for this issue is September 202
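The voting rule and the class-dependent noise model described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `flag_mislabeled` and `inject_label_noise`, the strict `>` comparison against the threshold, and the uniform choice of the replacement label are all assumptions made for illustration.

```python
import random


def flag_mislabeled(member_predictions, labels, threshold=0.5):
    """Return indices of instances misclassified by more than `threshold`
    of the ensemble members (hypothetical helper, strict inequality assumed).

    member_predictions: one list of predicted labels per ensemble member.
    labels: the (possibly noisy) labels observed in the dataset.
    """
    n_members = len(member_predictions)
    flagged = []
    for i, label in enumerate(labels):
        wrong = sum(1 for preds in member_predictions if preds[i] != label)
        if wrong / n_members > threshold:
            flagged.append(i)
    return flagged


def inject_label_noise(labels, flip_prob_by_class, classes, rng):
    """Flip each label with a probability that depends on its class.

    Giving every class the same probability mimics noise completely at
    random; giving classes different probabilities mimics the NAR setting.
    The replacement label is drawn uniformly from the other classes
    (an assumption of this sketch).
    """
    noisy = []
    for y in labels:
        if rng.random() < flip_prob_by_class[y]:
            noisy.append(rng.choice([c for c in classes if c != y]))
        else:
            noisy.append(y)
    return noisy


# Three ensemble members, three instances.
votes = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
observed = [0, 1, 1]
suspects = flag_mislabeled(votes, observed, threshold=0.5)

# Class-dependent noise: class 1 is always flipped, class 0 never.
noisy = inject_label_noise([0, 1, 0, 1], {0: 0.0, 1: 1.0}, [0, 1], random.Random(0))
```

Raising `threshold` toward 1.0 approaches a consensus filter, which flags fewer instances; lowering it flags more at the risk of discarding correctly labeled data.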

    Invariances and Data Augmentation for Supervised Music Transcription

    This paper explores a variety of models for frame-based music transcription, with an emphasis on the methods needed to reach state-of-the-art performance on human recordings. The translation-invariant network discussed in this paper, which combines a traditional filterbank with a convolutional neural network, was the top-performing model in the 2017 MIREX Multiple Fundamental Frequency Estimation evaluation. This class of models shares parameters in the log-frequency domain, which exploits the frequency invariance of music to reduce the number of model parameters and avoid overfitting to the training data. All models in this paper were trained with supervision by labeled data from the MusicNet dataset, augmented by random label-preserving pitch-shift transformations. Comment: 6 page

    Data assimilation using bayesian filters and B-spline geological models

    This paper proposes a new approach to data assimilation, also known as history matching, of oilfield production data by adjusting the location and sharpness of patterns of geological facies. Traditionally, this problem has been addressed using gradient-based approaches with a level set parameterization of the geology. Gradient-based methods are robust, but computationally demanding for real-world reservoir problems and insufficient for uncertainty assessment in reservoir management. Recently, the ensemble filter approach has been used to tackle this problem because of its high efficiency from the standpoint of implementation, computational cost, and performance. Incorporating a level set parameterization in this approach could further deal with the lack of differentiability with respect to facies type, but its practical implementation is based on some assumptions that are not easily satisfied in real problems. In this work, we propose to describe the geometry of the permeability field using B-spline curves. This transforms history matching of the discrete facies type into the estimation of continuous B-spline control points. As the filtering scheme, we use the ensemble square-root filter (EnSRF). The efficacy of the EnSRF with the B-spline parameterization is investigated through three numerical experiments, in which the reservoir contains a curved channel, a disconnected channel, or a two-dimensional closed feature. It is found that applying the proposed method to the problem of adjusting facies edges to match production data is relatively straightforward and provides statistical estimates of the distribution of geological facies and of the state of the reservoir.
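To make the parameterization idea concrete, the sketch below evaluates one segment of a uniform cubic B-spline from four 2-D control points. The abstract does not state the spline degree or knot layout, so the uniform cubic basis and the helper name `cubic_bspline_point` are assumptions; the point is only that a facies boundary traced this way varies continuously with its control points, which is what makes them suitable as filter state variables.

```python
def cubic_bspline_point(p0, p1, p2, p3, t):
    """Evaluate a uniform cubic B-spline segment at t in [0, 1].

    p0..p3 are (x, y) control points. The four basis weights are
    non-negative on [0, 1] and always sum to 1.
    """
    w0 = (1 - t) ** 3 / 6.0
    w1 = (3 * t**3 - 6 * t**2 + 4) / 6.0
    w2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0
    w3 = t**3 / 6.0
    x = w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1]
    return (x, y)


# Hypothetical control points for a channel edge; values are illustrative only.
control = [(0.0, 0.0), (6.0, 0.0), (12.0, 0.0), (18.0, 0.0)]
start = cubic_bspline_point(*control, 0.0)
end = cubic_bspline_point(*control, 1.0)
```

Because the curve is a smooth function of the control points, a small update to any control point moves the boundary by a small amount, in contrast to a discrete facies indicator that changes in jumps.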