11,108 research outputs found

    One-Class Classification: Taxonomy of Study and Review of Techniques

    Full text link
    One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This unique situation constrains the learning of efficient classifiers by defining class boundary just with the knowledge of positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, algorithms used and the application domains applied. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure

    Wireless Data Acquisition for Edge Learning: Data-Importance Aware Retransmission

    Full text link
    By deploying machine-learning algorithms at the network edge, edge learning can leverage the enormous real-time data generated by billions of mobile devices to train AI models, which enable intelligent mobile applications. In this emerging research area, one key direction is to efficiently utilize radio resources for wireless data acquisition to minimize the latency of executing a learning task at an edge server. Along this direction, we consider the specific problem of retransmission decision in each communication round to ensure both reliability and quantity of those training data for accelerating model convergence. To solve the problem, a new retransmission protocol called data-importance aware automatic-repeat-request (importance ARQ) is proposed. Unlike the classic ARQ focusing merely on reliability, importance ARQ selectively retransmits a data sample based on its uncertainty which helps learning and can be measured using the model under training. Underpinning the proposed protocol is a derived elegant communication-learning relation between two corresponding metrics, i.e., signal-to-noise ratio (SNR) and data uncertainty. This relation facilitates the design of a simple threshold based policy for importance ARQ. The policy is first derived based on the classic classifier model of support vector machine (SVM), where the uncertainty of a data sample is measured by its distance to the decision boundary. The policy is then extended to the more complex model of convolutional neural networks (CNN) where data uncertainty is measured by entropy. Extensive experiments have been conducted for both the SVM and CNN using real datasets with balanced and imbalanced distributions. Experimental results demonstrate that importance ARQ effectively copes with channel fading and noise in wireless data acquisition to achieve faster model convergence than the conventional channel-aware ARQ.Comment: This is an updated version: 1) extension to general classifiers; 2) consideration of imbalanced classification in the experiments. Submitted to IEEE Journal for possible publicatio

    Multimodal Subspace Support Vector Data Description

    Get PDF
    In this paper, we propose a novel method for projecting data from multiple modalities to a new subspace optimized for one-class classification. The proposed method iteratively transforms the data from the original feature space of each modality to a new common feature space along with finding a joint compact description of data coming from all the modalities. For data in each modality, we define a separate transformation to map the data from the corresponding feature space to the new optimized subspace by exploiting the available information from the class of interest only. We also propose different regularization strategies for the proposed method and provide both linear and non-linear formulations. The proposed Multimodal Subspace Support Vector Data Description outperforms all the competing methods using data from a single modality or fusing data from all modalities in four out of five datasets.Comment: 26 pages manuscript (6 tables, 2 figures), 24 pages supplementary material (27 tables, 10 figures). The manuscript and supplementary material are combined as a single .pdf (50 pages) fil

    Splitting hybrid Make-To-Order and Make-To-Stock demand profiles

    Get PDF
    In this paper a demand time series is analysed to support Make-To-Stock (MTS) and Make-To-Order (MTO) production decisions. Using a purely MTS production strategy based on the given demand can lead to unnecessarily high inventory levels thus it is necessary to identify likely MTO episodes. This research proposes a novel outlier detection algorithm based on special density measures. We divide the time series' histogram into three clusters. One with frequent-low volume covers MTS items whilst a second accounts for high volumes which is dedicated to MTO items. The third cluster resides between the previous two with its elements being assigned to either the MTO or MTS class. The algorithm can be applied to a variety of time series such as stationary and non-stationary ones. We use empirical data from manufacturing to study the extent of inventory savings. The percentage of MTO items is reflected in the inventory savings which were shown to be an average of 18.1%.Comment: demand analysis; time series; outlier detection; production strategy; Make-To-Order(MTO); Make-To-Stock(MTS); 15 pages, 9 figure
    • …
    corecore