
    Training Neural Networks with Stochastic Hessian-Free Optimization

    Hessian-free (HF) optimization has been successfully used for training deep autoencoders and recurrent networks. HF uses the conjugate gradient algorithm to construct update directions through curvature-vector products that can be computed in roughly the same time as gradients. In this paper we exploit this property and study stochastic HF with gradient and curvature mini-batches independent of the dataset size. We modify Martens' HF for these settings and integrate dropout, a method for preventing co-adaptation of feature detectors, to guard against overfitting. Stochastic Hessian-free optimization gives an intermediary between SGD and HF that achieves competitive performance on both classification and deep autoencoder experiments. Comment: 11 pages, ICLR 201
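
The core primitive here, a curvature-vector product fed into conjugate gradient, can be sketched on a toy quadratic loss. This is an illustration only, not the paper's implementation: the finite-difference approximation stands in for Pearlmutter-style Hessian-vector products, and all names are made up.

```python
import numpy as np

# Toy quadratic loss f(w) = 0.5 w^T A w - b^T w, so grad(w) = A w - b and Hessian = A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])

def grad(w):
    return A @ w - b

def hvp(w, v, eps=1e-5):
    """Hessian-vector product via finite differences of the gradient:
    H v ~ (grad(w + eps*v) - grad(w)) / eps -- same cost order as one gradient."""
    return (grad(w + eps * v) - grad(w)) / eps

def conjugate_gradient(w, g, iters=10, tol=1e-10):
    """Solve H d = -g for an update direction d using only Hessian-vector products."""
    d = np.zeros_like(g)
    r = -g - hvp(w, d)          # initial residual
    p = r.copy()
    for _ in range(iters):
        Hp = hvp(w, p)
        alpha = (r @ r) / (p @ Hp)
        d += alpha * p
        r_new = r - alpha * Hp
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + (r_new @ r_new) / (r @ r) * p
        r = r_new
    return d

w = np.zeros(2)
d = conjugate_gradient(w, grad(w))
# For a quadratic, a single Newton step w + d lands at the minimum A^{-1} b.
```

In the stochastic setting the paper studies, `grad` and `hvp` would be evaluated on independent gradient and curvature mini-batches rather than the full dataset.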

    Exemplar-Centered Supervised Shallow Parametric Data Embedding

    Metric learning methods for dimensionality reduction in combination with k-Nearest Neighbors (kNN) have been extensively deployed in many classification, data embedding, and information retrieval applications. However, most of these approaches involve pairwise training data comparisons, and thus have quadratic computational complexity with respect to the size of the training set, preventing them from scaling to fairly big datasets. Moreover, during testing, comparing test data against all the training data points is also expensive in terms of both computational cost and resources required. Furthermore, previous metrics are either too constrained or too expressive to be well learned. To effectively solve these issues, we present an exemplar-centered supervised shallow parametric data embedding model, using a Maximally Collapsing Metric Learning (MCML) objective. Our strategy learns a shallow high-order parametric embedding function and compares training/test data only with learned or precomputed exemplars, resulting in a cost function with linear computational complexity for both training and testing. We also empirically demonstrate, using several benchmark datasets, that for classification in two-dimensional embedding space, our approach not only gains speedup of kNN by hundreds of times, but also outperforms state-of-the-art supervised embedding approaches. Comment: accepted to IJCAI201
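
The exemplar-centered idea of comparing test points only against a few exemplars, rather than all training points, can be illustrated with class-mean exemplars in a toy 2-D embedding. This is a simplification: the paper learns the exemplars and the embedding jointly under an MCML objective, which the sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D "embedded" data: two well-separated Gaussian classes.
X0 = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(50, 2))
X1 = rng.normal(loc=[+2.0, 0.0], scale=0.5, size=(50, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# One precomputed exemplar per class (here simply the class mean).
exemplars = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    """Classify by nearest exemplar: cost is O(#exemplars), not O(#training points)."""
    dists = np.linalg.norm(exemplars - x, axis=1)
    return int(np.argmin(dists))

acc = np.mean([predict(x) == t for x, t in zip(X, y)])
```

With a fixed, small exemplar set, both training and test-time comparisons become linear in the dataset size, which is the complexity argument the abstract makes.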

    Spectrogram Based Window Selection for the Detection of Voltage Variation

    This paper presents the application of the spectrogram with K-nearest neighbors (KNN) and Support Vector Machine (SVM) classifiers for window selection and voltage variation classification. Voltage variation signals such as voltage sag, swell and interruption are simulated in Matlab and analyzed with the spectrogram using different window sizes: 256, 512 and 1024. The variations analyzed by the spectrogram are displayed as time-frequency representation (TFR) and voltage per-unit (PU) graphs. Parameters calculated from the TFR are used as inputs to the KNN and SVM classifiers. The signals are then corrupted with noise (0 SNR and 20 SNR) and used in classification. The test data contain voltage variation signals obtained from the mathematical models simulated in Matlab as well as the noise-corrupted signals. The classification accuracy of each window under each classifier is compared, along with the TFR and voltage PU graphs, to select the best window for analyzing voltage variation signals in the spectrogram. The results show that the 1024-sample window is the most suitable.
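
As a rough illustration of the window-size trade-off studied here, the following sketch computes magnitude spectrograms of a simulated voltage sag with two of the compared window lengths. The sampling rate and sag parameters are assumed for the example, not taken from the paper.

```python
import numpy as np

fs = 3200                             # assumed sampling rate: 64 samples per 50 Hz cycle
t = np.arange(0, 0.5, 1 / fs)
v = np.sin(2 * np.pi * 50 * t)
v[(t >= 0.2) & (t < 0.35)] *= 0.4     # simulated voltage sag: amplitude drops to 0.4 pu

def spectrogram(x, nwin, hop=None):
    """Magnitude STFT with a Hann window; nwin plays the role of the
    256/512/1024 window sizes compared in the paper."""
    hop = hop or nwin // 2
    win = np.hanning(nwin)
    frames = [x[i:i + nwin] * win for i in range(0, len(x) - nwin + 1, hop)]
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1))  # shape: (time, freq)

# Larger windows give finer frequency resolution but coarser time resolution,
# which is the trade-off behind the window-selection question.
S256 = spectrogram(v, 256)
S1024 = spectrogram(v, 1024)
```

Features such as instantaneous RMS voltage or total harmonic distortion would then be extracted from the TFR and fed to the KNN and SVM classifiers.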

    Applying interval type-2 fuzzy rule based classifiers through a cluster-based class representation

    Fuzzy Rule-Based Classification Systems (FRBCSs) have the potential to provide so-called interpretable classifiers, i.e. classifiers which can be introspected, understood, validated and augmented by human experts, by relying on fuzzy-set based rules. This paper builds on prior work for interval type-2 fuzzy set based FRBCs where the fuzzy sets and rules of the classifier are generated using an initial clustering stage. By introducing Subtractive Clustering to identify multiple cluster prototypes, the proposed approach has the potential to deliver improved classification performance while maintaining good interpretability, i.e. without resulting in an excessive number of rules. The paper provides a detailed overview of the proposed FRBC framework, followed by a series of exploratory experiments on both linearly and non-linearly separable datasets, comparing results to existing rule-based and SVM approaches. Overall, initial results indicate that the approach achieves classification performance comparable to non-rule-based classifiers such as SVM, while often doing so with a very small number of rules.
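
A minimal sketch of the subtractive clustering stage used to obtain cluster prototypes might look as follows. It follows Chiu's general formulation; the radii and stopping ratio are illustrative defaults, not the paper's settings.

```python
import numpy as np

def subtractive_clustering(X, ra=1.0, rb=None, stop_ratio=0.15, max_centers=10):
    """Subtractive clustering: every data point is a candidate prototype, scored
    by how densely other points fall within radius ra; after a centre is chosen,
    its potential is subtracted from nearby points (radius rb) so the next
    centre lands in a different dense region."""
    rb = rb or 1.5 * ra
    alpha, beta = 4.0 / ra**2, 4.0 / rb**2
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    P = np.exp(-alpha * d2).sum(axis=1)                  # initial potentials
    first = P.max()
    centers = []
    while len(centers) < max_centers:
        c = int(np.argmax(P))
        if P[c] < stop_ratio * first:                    # remaining density too low
            break
        centers.append(c)
        P = P - P[c] * np.exp(-beta * d2[:, c])          # revise potentials
    return X[centers]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal([-3, 0], 0.3, (40, 2)), rng.normal([3, 0], 0.3, (40, 2))])
protos = subtractive_clustering(X, ra=2.0)
```

In the FRBC framework, each prototype would then seed an interval type-2 fuzzy set, and each class would be represented by one or more such prototypes.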

    Substitution of hazardous chemical substances using Deep Learning and t-SNE

    Manufacturing companies in the European Union are obliged to regularly analyze their recipes to find safer alternatives for hazardous substances. Unfortunately, available substance information is dispersed, heterogeneous and stored in databases of many private and public entities. In addition, the number of existing chemical substances already surpasses 85,000, with over 200 attributes describing substance characteristics, which makes it impossible for experts to collect and manually review this data. We tackle these issues by introducing a novel machine learning approach for alternative assessment. After developing a central database, we design an approach that performs nearest neighbor search in the latent space obtained by deep autoencoders. Furthermore, we implement a post-hoc explanation technique, t-SNE, to visualize the deep embeddings, which helps justify model outcomes. The application in a real-world project with a manufacturer shows that this approach can help process experts identify possible replacement candidates more quickly and fosters comprehensibility through visualization.
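
The nearest-neighbor search over autoencoder latent codes can be sketched as follows. Random vectors stand in for the embeddings a trained encoder would produce; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in latent codes: in the paper these would be the encoder outputs of a
# deep autoencoder trained on ~200 substance attributes.
n_substances, latent_dim = 200, 16
latent = rng.normal(size=(n_substances, latent_dim))

def nearest_substitutes(query_idx, k=5):
    """Rank candidate replacement substances by Euclidean distance in latent space."""
    d = np.linalg.norm(latent - latent[query_idx], axis=1)
    order = np.argsort(d)
    return order[order != query_idx][:k]   # drop the query substance itself

candidates = nearest_substitutes(0, k=5)
```

A 2-D t-SNE projection of the same latent codes would then let experts visually verify that the returned candidates cluster near the query substance.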

    Online Class-Incremental Continual Learning with Adversarial Shapley Value

    As image-based deep learning becomes pervasive on every device, from cell phones to smart watches, there is a growing need to develop methods that continually learn from data while minimizing memory footprint and power consumption. While memory replay techniques have shown exceptional promise for this task of continual learning, the best method for selecting which buffered images to replay is still an open question. In this paper, we specifically focus on the online class-incremental setting where a model needs to learn new classes continually from an online data stream. To this end, we contribute a novel Adversarial Shapley value scoring method that scores memory data samples according to their ability to preserve latent decision boundaries for previously observed classes (to maintain learning stability and avoid forgetting) while interfering with latent decision boundaries of current classes being learned (to encourage plasticity and optimal learning of new class boundaries). Overall, we observe that our proposed ASER method provides competitive or improved performance compared to state-of-the-art replay-based continual learning methods on a variety of datasets. Comment: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI-21)
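
ASER's scoring builds on the closed-form Shapley values of a K-nearest-neighbor classifier (Jia et al.). A minimal sketch of that underlying recursion, omitting the adversarial part of the scoring, might look like this:

```python
import numpy as np

def knn_shapley(X_train, y_train, x_test, y_test, K=3):
    """Closed-form Shapley values of training points for a K-NN classifier on a
    single test point, via the efficient recursion of Jia et al.: points are
    sorted by distance to the test point and values are filled in from the
    farthest point inward."""
    N = len(X_train)
    order = np.argsort(np.linalg.norm(X_train - x_test, axis=1))
    match = (y_train[order] == y_test).astype(float)   # 1 if label agrees with test
    s = np.zeros(N)
    s[N - 1] = match[N - 1] / N
    for j in range(N - 2, -1, -1):
        s[j] = s[j + 1] + (match[j] - match[j + 1]) / K * min(K, j + 1) / (j + 1)
    out = np.zeros(N)
    out[order] = s                                     # undo the distance sort
    return out

X = np.array([[0.0], [0.1], [5.0], [5.1]])
y = np.array([0, 0, 1, 1])
vals = knn_shapley(X, y, x_test=np.array([0.05]), y_test=0, K=2)
```

In ASER, scores of this kind are computed in the model's latent space: buffered samples are rewarded for supporting old-class boundaries and, adversarially, for interfering with the boundaries of the classes currently being learned.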