11,108 research outputs found
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This situation constrains the learning of efficient classifiers, because the
class boundary must be defined using only knowledge of the positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data, the
algorithms used and the application domains. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.
Comment: 24 pages + 11 pages of references, 8 figures
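The core idea of learning a class boundary from positive samples only can be illustrated with a minimal sketch. It is not a method taken from the survey: it assumes a simple centroid-and-radius description, and the names `fit_one_class`, `is_positive` and `quantile` are hypothetical.

```python
import math

def fit_one_class(positives, quantile=0.95):
    """Fit a centroid and a radius from positive-class samples only."""
    dim = len(positives[0])
    center = [sum(p[i] for p in positives) / len(positives) for i in range(dim)]
    dists = sorted(math.dist(p, center) for p in positives)
    # The radius is the distance at the chosen quantile of the training samples.
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return center, dists[idx]

def is_positive(x, center, radius):
    """Accept x as a member of the positive class if it lies inside the sphere."""
    return math.dist(x, center) <= radius

# No negative samples are needed at training time.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
center, radius = fit_one_class(train, quantile=1.0)
inside = is_positive((0.5, 0.5), center, radius)
outside = is_positive((5.0, 5.0), center, radius)
```

Anything outside the learned region is treated as an outlier or novelty, which is why the same setup appears under outlier and novelty detection.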
Wireless Data Acquisition for Edge Learning: Data-Importance Aware Retransmission
By deploying machine-learning algorithms at the network edge, edge learning
can leverage the enormous real-time data generated by billions of mobile
devices to train AI models, which enable intelligent mobile applications. In
this emerging research area, one key direction is to efficiently utilize radio
resources for wireless data acquisition to minimize the latency of executing a
learning task at an edge server. Along this direction, we consider the specific
problem of retransmission decision in each communication round to ensure both
reliability and quantity of those training data for accelerating model
convergence. To solve the problem, a new retransmission protocol called
data-importance aware automatic-repeat-request (importance ARQ) is proposed.
Unlike the classic ARQ focusing merely on reliability, importance ARQ
selectively retransmits a data sample based on its uncertainty, which helps
learning and can be measured using the model under training. Underpinning the
proposed protocol is a derived elegant communication-learning relation between
two corresponding metrics, i.e., signal-to-noise ratio (SNR) and data
uncertainty. This relation facilitates the design of a simple threshold-based
policy for importance ARQ. The policy is first derived based on the classic
classifier model of support vector machine (SVM), where the uncertainty of a
data sample is measured by its distance to the decision boundary. The policy is
then extended to the more complex model of convolutional neural networks (CNN)
where data uncertainty is measured by entropy. Extensive experiments have been
conducted for both the SVM and CNN using real datasets with balanced and
imbalanced distributions. Experimental results demonstrate that importance ARQ
effectively copes with channel fading and noise in wireless data acquisition to
achieve faster model convergence than the conventional channel-aware ARQ.
Comment: This is an updated version: 1) extension to general classifiers; 2)
consideration of imbalanced classification in the experiments. Submitted to
IEEE Journal for possible publication
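The two uncertainty measures named in this abstract, distance to the decision boundary for an SVM and entropy for a CNN, can be sketched as follows. The abstract does not state the derived relation between SNR and uncertainty, so the threshold mapping is left to the caller. The function names, the `threshold_for` parameter, and the linear mapping in the usage lines are hypothetical placeholders.

```python
import math

def margin_distance(w, b, x):
    """Distance from x to the hyperplane w.x + b = 0 (smaller = more uncertain)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def entropy(probs):
    """Shannon entropy in nats of a predicted class distribution (larger = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_retransmit(snr, uncertainty, threshold_for):
    """Threshold-based decision: retransmit while the SNR is below a threshold.

    threshold_for maps an uncertainty value to an SNR threshold. It is supplied
    by the caller because the actual relation is not given in the abstract.
    """
    return snr < threshold_for(uncertainty)

# Placeholder mapping for illustration only (assumption, not the paper's policy).
placeholder = lambda u: 4.0 + 2.0 * u
d = margin_distance([3.0, 4.0], 0.0, [1.0, 1.0])
h = entropy([0.5, 0.5])
retry_uncertain = should_retransmit(5.0, 1.0, placeholder)
retry_confident = should_retransmit(5.0, 0.1, placeholder)
```

Keeping the mapping as a parameter separates the uncertainty measure, which depends on the model, from the retransmission rule, which depends on the channel.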
Multimodal Subspace Support Vector Data Description
In this paper, we propose a novel method for projecting data from multiple
modalities to a new subspace optimized for one-class classification. The
proposed method iteratively transforms the data from the original feature space
of each modality to a new common feature space along with finding a joint
compact description of data coming from all the modalities. For data in each
modality, we define a separate transformation to map the data from the
corresponding feature space to the new optimized subspace by exploiting the
available information from the class of interest only. We also propose
different regularization strategies for the proposed method and provide both
linear and non-linear formulations. The proposed Multimodal Subspace Support
Vector Data Description outperforms all the competing methods using data from a
single modality or fusing data from all modalities in four out of five
datasets.
Comment: 26 pages manuscript (6 tables, 2 figures), 24 pages supplementary
material (27 tables, 10 figures). The manuscript and supplementary material
are combined as a single .pdf (50 pages) file
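The scoring side of such a model can be sketched under strong assumptions: one linear map per modality into a common subspace, and a shared center there. The iterative training and the regularization described in the abstract are not reproduced. The maps, the center, and the rule of taking the largest per-modality distance are all hypothetical.

```python
import math

def project(Q, x):
    """Apply a linear map Q (a list of rows) to the feature vector x."""
    return [sum(q * xi for q, xi in zip(row, x)) for row in Q]

def joint_score(sample, maps, center):
    """Largest distance to the shared center over all modality projections.

    Taking the maximum is an assumption made for this example only.
    """
    return max(math.dist(project(maps[m], x), center) for m, x in sample.items())

# Two modalities with different input dimensions mapped into one 2-D subspace.
maps = {
    "a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # 3-D -> 2-D
    "b": [[0.0, 1.0], [1.0, 0.0]],            # 2-D -> 2-D
}
center = [1.0, 2.0]
near = joint_score({"a": [1.0, 2.0, 9.0], "b": [2.0, 1.0]}, maps, center)
far = joint_score({"a": [4.0, 6.0, 0.0], "b": [2.0, 1.0]}, maps, center)
```

A sample would then be accepted as belonging to the class of interest when its score falls within a learned radius.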
Splitting hybrid Make-To-Order and Make-To-Stock demand profiles
In this paper a demand time series is analysed to support Make-To-Stock (MTS)
and Make-To-Order (MTO) production decisions. Using a purely MTS production
strategy based on the given demand can lead to unnecessarily high inventory
levels, so it is necessary to identify likely MTO episodes.
This research proposes a novel outlier detection algorithm based on special
density measures. We divide the time series' histogram into three clusters. One
cluster, with frequent low volumes, covers MTS items, whilst a second accounts
for high volumes and is dedicated to MTO items. The third cluster lies between
the previous two, and its elements are assigned to either the MTO or the MTS class.
The algorithm can be applied to a variety of time series such as stationary and
non-stationary ones.
We use empirical data from manufacturing to study the extent of inventory
savings. The percentage of MTO items is reflected in the inventory savings
which were shown to be an average of 18.1%.
Comment: demand analysis; time series; outlier detection; production strategy;
Make-To-Order (MTO); Make-To-Stock (MTS); 15 pages, 9 figures
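The three-cluster split of demand volumes can be sketched as below. The abstract says the clusters come from density measures but does not define them, so this example substitutes two fixed volume thresholds. The `low` and `high` parameters and the label names are hypothetical.

```python
def split_demand(demand, low, high):
    """Label each period as MTS, MTO, or AMBIGUOUS using two volume thresholds.

    Fixed thresholds stand in for the density-based clustering, which the
    abstract does not specify.
    """
    labels = []
    for volume in demand:
        if volume <= low:
            labels.append("MTS")        # frequent, low-volume demand
        elif volume >= high:
            labels.append("MTO")        # high-volume episodes
        else:
            labels.append("AMBIGUOUS")  # middle cluster, assigned later
    return labels

labels = split_demand([10, 12, 55, 200, 11], low=20, high=100)
```

Periods labelled `AMBIGUOUS` correspond to the middle cluster, whose elements still need to be assigned to one of the two production classes.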