6,104 research outputs found
Learning Interpretable Rules for Multi-label Classification
Multi-label classification (MLC) is a supervised learning problem in which,
contrary to standard multiclass classification, an instance can be associated
with several class labels simultaneously. In this chapter, we advocate a
rule-based approach to multi-label classification. Rule learning algorithms are
often employed when one is not only interested in accurate predictions, but
also requires an interpretable theory that can be understood, analyzed, and
qualitatively evaluated by domain experts. Ideally, by revealing patterns and
regularities contained in the data, a rule-based theory yields new insights in
the application domain. Recently, several authors have started to investigate
how rule-based models can be used for modeling multi-label data. Discussing
this task in detail, we highlight some of the problems that make rule learning
considerably more challenging for MLC than for conventional classification.
While mainly focusing on our own previous work, we also provide a short
overview of related work in this area.Comment: Preprint version. To appear in: Explainable and Interpretable Models
in Computer Vision and Machine Learning. The Springer Series on Challenges in
Machine Learning. Springer (2018). See
http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further
informatio
Client-side mobile user profile for content management using data mining techniques
Mobile device can be used as a medium to send and receive the mobile internet content. However, there are several limitations using mobile internet. Content personalisation has been viewed as an important area when using mobile internet. In order for personalisation to be successful, understanding the user is important. In this paper, we explore the implementation of the user profile at client-side, which may be used whenever user connect to the mobile content provider. The client-side user profile can help to free the provider in performing analysis by using data mining technique at the mobile device. This research investigates the conceptual idea of using clustering and classification of user profile at the client-site mobile. In this paper, we applied K-means and compared several other classification algorithms like TwoStep, Kohenen and Anomaly to determine the boundaries of the important factors using information ranking separation
Survey on Classification Algorithms for Data Mining:(Comparison and Evaluation)
Data mining concept is growing fast in popularity, it is a technology that involving methods at the intersection of (Artificial intelligent, Machine learning, Statistics and database system), the main goal of data mining process is to extract information from a large data into form which could be understandable for further use. Some algorithms of data mining are used to give solutions to classification problems in database. In this paper a comparison among three classification’s algorithms will be studied, these are (K- Nearest Neighbor classifier, Decision tree and Bayesian network) algorithms. The paper will demonstrate the strength and accuracy of each algorithm for classification in term of performance efficiency and time complexity required. For model validation purpose, twenty-four-month data analysis is conducted on a mock-up basis. Keywords: Decision tree, Bayesian network, k- nearest neighbour classifier
The IBMAP approach for Markov networks structure learning
In this work we consider the problem of learning the structure of Markov
networks from data. We present an approach for tackling this problem called
IBMAP, together with an efficient instantiation of the approach: the IBMAP-HC
algorithm, designed for avoiding important limitations of existing
independence-based algorithms. These algorithms proceed by performing
statistical independence tests on data, trusting completely the outcome of each
test. In practice tests may be incorrect, resulting in potential cascading
errors and the consequent reduction in the quality of the structures learned.
IBMAP contemplates this uncertainty in the outcome of the tests through a
probabilistic maximum-a-posteriori approach. The approach is instantiated in
the IBMAP-HC algorithm, a structure selection strategy that performs a
polynomial heuristic local search in the space of possible structures. We
present an extensive empirical evaluation on synthetic and real data, showing
that our algorithm outperforms significantly the current independence-based
algorithms, in terms of data efficiency and quality of learned structures, with
equivalent computational complexities. We also show the performance of IBMAP-HC
in a real-world application of knowledge discovery: EDAs, which are
evolutionary algorithms that use structure learning on each generation for
modeling the distribution of populations. The experiments show that when
IBMAP-HC is used to learn the structure, EDAs improve the convergence to the
optimum
Feature-based time-series analysis
This work presents an introduction to feature-based time-series analysis. The
time series as a data type is first described, along with an overview of the
interdisciplinary time-series analysis literature. I then summarize the range
of feature-based representations for time series that have been developed to
aid interpretable insights into time-series structure. Particular emphasis is
given to emerging research that facilitates wide comparison of feature-based
representations that allow us to understand the properties of a time-series
dataset that make it suited to a particular feature-based representation or
analysis algorithm. The future of time-series analysis is likely to embrace
approaches that exploit machine learning methods to partially automate human
learning to aid understanding of the complex dynamical patterns in the time
series we measure from the world.Comment: 28 pages, 9 figure
Machine Learning Methods for Fuzzy Pattern Tree Induction
This thesis elaborates on a novel approach to fuzzy machine learning, that is, the combination of machine learning methods with mathematical tools for modeling and information processing based on fuzzy logic. More specifically, the thesis is devoted to so-called fuzzy pattern trees, a model class that has recently been introduced for representing dependencies between
input and output variables in supervised learning tasks, such as classification and regression. Due to its hierarchical, modular structure and the use of different types of (nonlinear) aggregation operators, a fuzzy pattern tree has the ability to represent such dependencies in a very exible and compact way, thereby offering a reasonable balance between accuracy and model transparency.
The focus of the thesis is on novel algorithms for pattern tree induction, i.e., for learning fuzzy pattern trees from observed data. In total, three new algorithms are introduced and compared to an existing method for the data-driven construction of pattern trees. While the first two algorithms are mainly geared toward an improvement of predictive accuracy, the last one focuses on eficiency aspects and seeks to make the learning process faster. The description and discussion of each algorithm is complemented with theoretical analyses and empirical studies in order to show the effectiveness of the proposed solutions
- …