Multi-task learning for intelligent data processing in granular computing context
Classification is a popular task in many application areas, such as decision making, rating, sentiment analysis and pattern recognition. In recent years, due to the vast and rapid increase in the size of data, classification has mainly been undertaken by means of supervised machine learning. In this context, a classification task involves data labelling, feature extraction, feature selection and learning of classifiers. In traditional machine learning, data is usually single-labelled by experts, i.e., each instance is assigned only one class label, since experts assume that different classes are mutually exclusive and each instance is clear-cut. However, this assumption does not always hold in real applications. For example, in the context of emotion detection, more than one emotion could be identified from the same person. On the other hand, feature selection has typically been done by evaluating feature subsets in terms of their relevance to all the classes. However, it is possible that a feature is relevant to only one class and irrelevant to all the others. Based on the above arguments on data labelling and feature selection, we propose in this paper a framework of multi-task learning. In particular, we consider traditional machine learning to be single-task learning, and argue the necessity of turning it into multi-task learning to allow an instance to belong to more than one class (i.e., multi-task classification) and to achieve class-specific feature selection (i.e., multi-task feature selection). Moreover, we report two experimental studies in terms of fuzzy multi-task classification and rule learning based multi-task feature selection. The results show empirically that it is necessary to undertake multi-task learning for both classification and feature selection.
Learning Interpretable Rules for Multi-label Classification
Multi-label classification (MLC) is a supervised learning problem in which,
contrary to standard multiclass classification, an instance can be associated
with several class labels simultaneously. In this chapter, we advocate a
rule-based approach to multi-label classification. Rule learning algorithms are
often employed when one is not only interested in accurate predictions, but
also requires an interpretable theory that can be understood, analyzed, and
qualitatively evaluated by domain experts. Ideally, by revealing patterns and
regularities contained in the data, a rule-based theory yields new insights in
the application domain. Recently, several authors have started to investigate
how rule-based models can be used for modeling multi-label data. Discussing
this task in detail, we highlight some of the problems that make rule learning
considerably more challenging for MLC than for conventional classification.
While mainly focusing on our own previous work, we also provide a short
overview of related work in this area.
Comment: Preprint version. To appear in: Explainable and Interpretable Models
in Computer Vision and Machine Learning. The Springer Series on Challenges in
Machine Learning. Springer (2018). See
http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further
information.
Learning Bayesian network classifiers for multidimensional supervised classification problems by means of a multiobjective approach
A classical supervised classification task tries to predict a single class variable based on a data set composed of a set of labeled examples. However, in many real domains more than one variable could be considered as a class variable, so a generalization of the single-class classification problem to the simultaneous prediction of a set of class variables should be developed. This problem is called multi-dimensional supervised classification.
In this paper, we deal with the problem of learning Bayesian network classifiers for multi-dimensional supervised classification problems. To do so, we have generalized the classical single-class Bayesian network classifier to the prediction of several class variables. In addition, we have defined new classification rules for probabilistic classifiers in multi-dimensional problems.
We present a learning approach following a multi-objective strategy which treats the accuracy of each class variable separately as a function to optimize. The solution of the learning approach is a Pareto set of non-dominated multi-dimensional Bayesian network classifiers and their accuracies for the different class variables, so a decision maker can easily choose by hand the classifier that best suits the particular problem and domain.
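To make the notion of a Pareto set of non-dominated classifiers concrete, here is a minimal Python sketch. It is not the authors' learning algorithm: it only filters a list of candidates by their per-class-variable accuracies, and the names `dominates` and `pareto_set` as well as the accuracy values are hypothetical, chosen for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as accurate as b on every class
    variable and strictly more accurate on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_set(accuracies: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, a in enumerate(accuracies)
        if not any(dominates(b, a) for j, b in enumerate(accuracies) if j != i)
    ]


# Hypothetical accuracies (class variable 1, class variable 2)
# for four candidate classifiers; not results from the paper.
candidates = [(0.90, 0.70), (0.85, 0.80), (0.80, 0.75), (0.90, 0.65)]
front = pareto_set(candidates)
print(front)
```

In this toy example the third candidate is dominated by the second and the fourth by the first, so only the first two remain for a decision maker to choose between.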
Multitask Learning for Network Traffic Classification
Traffic classification has various applications in today's Internet, from
resource allocation, billing and QoS purposes in ISPs to firewall and malware
detection in clients. Classical machine learning algorithms and deep learning
models have been widely used to solve the traffic classification task. However,
training such models requires a large amount of labeled data. Labeling data is
often the most difficult and time-consuming process in building a classifier.
To solve this challenge, we reformulate the traffic classification into a
multi-task learning framework where bandwidth requirement and duration of a
flow are predicted along with the traffic class. The motivation of this
approach is twofold: First, bandwidth requirement and duration are useful in
many applications, including routing, resource allocation, and QoS
provisioning. Second, these two values can be obtained from each flow easily
without the need for human labeling or capturing flows in a controlled and
isolated environment. We show that with a large amount of easily obtainable
data samples for bandwidth and duration prediction tasks, and only a few data
samples for the traffic classification task, one can achieve high accuracy. We
conduct two experiments with ISCX and QUIC public datasets and show the efficacy
of our approach.
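One way to picture how abundant bandwidth and duration targets can be combined with scarce traffic-class labels is a per-flow loss that always includes the two auxiliary terms and adds the classification term only when a label exists. The sketch below is an assumption for illustration, not the loss used in the paper: the function name `multitask_loss`, the squared-error and cross-entropy terms, and the weights `w_class` and `w_aux` are all hypothetical.

```python
import math
from typing import Optional, Sequence


def multitask_loss(
    class_probs: Sequence[float],
    class_label: Optional[int],
    bw_pred: float,
    bw_true: float,
    dur_pred: float,
    dur_true: float,
    w_class: float = 1.0,  # assumed weight of the classification task
    w_aux: float = 0.5,    # assumed weight of the auxiliary tasks
) -> float:
    """Combine the three task losses for a single flow.

    The bandwidth and duration terms (squared error) are always present,
    because those targets can be measured from the flow itself. The
    traffic-class term (cross-entropy) is added only for labelled flows.
    """
    aux = (bw_pred - bw_true) ** 2 + (dur_pred - dur_true) ** 2
    cls = 0.0 if class_label is None else -math.log(class_probs[class_label])
    return w_class * cls + w_aux * aux


# Unlabelled flow: only the auxiliary tasks contribute.
unlabelled = multitask_loss([0.5, 0.5], None, 1.0, 2.0, 3.0, 3.0)
# Labelled flow: the classification term is added on top.
labelled = multitask_loss([0.5, 0.5], 0, 1.0, 2.0, 3.0, 3.0)
```

Under this formulation every flow contributes a training signal through the auxiliary terms, while the comparatively few labelled flows also contribute through the traffic-class term.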