Rough Set Based Rule Evaluations and Their Applications
Knowledge discovery is an important process in data analysis, data
mining and machine learning. Typically knowledge is presented in the
form of rules. However, knowledge discovery systems often generate a
huge number of rules. One of the challenges we face is how to
automatically discover interesting and meaningful knowledge from
such discovered rules. It is infeasible for human beings to select
important and interesting rules manually. How to provide a measure
to evaluate the qualities of rules in order to facilitate the
understanding of data mining results becomes our focus. In this
thesis, we present a series of rule evaluation techniques for the
purpose of facilitating the knowledge understanding process. These
evaluation techniques help not only to reduce the number of rules,
but also to extract higher quality rules. Empirical studies on both
artificial data sets and real world data sets demonstrate how such
techniques can contribute to practical systems such as ones for
medical diagnosis and web personalization.
In the first part of this thesis, we discuss several rule evaluation
techniques that are proposed towards rule postprocessing. We show
how properly defined rule templates can be used as a rule evaluation
approach. We propose two rough set based measures, a Rule Importance
Measure, and a Rules-As-Attributes Measure,
to rank the important and interesting rules. In the second part of
this thesis, we show how data preprocessing can help with rule
evaluation. Because well preprocessed data is essential for
important rule generation, we propose a new approach for processing
missing attribute values for enhancing the generated rules. In the
third part of this thesis, a rough set based rule evaluation system
is demonstrated to show the effectiveness of the measures proposed
in this thesis. Furthermore, a new user-centric web personalization
system is used as a case study to demonstrate how the proposed
evaluation measures can be used in an actual application.
A comparative study of the AHP and TOPSIS methods for implementing load shedding scheme in a pulp mill system
Advances in technology have encouraged the design and creation of useful
equipment and devices that can be used in a wide range of applications.
The pulp mill is a heavy industry that consumes a large amount of
electricity in production, so any equipment malfunction might
cause heavy losses to the company. In particular, the breakdown of one generator
would cause the other generators to be overloaded. In the meantime, subsequent
loads will be shed until the generators are able to supply the remaining
loads. Once the fault has been fixed, the load shedding scheme can be deactivated.
Thus, a load shedding scheme is the best way to handle such a condition. Selected loads
are shed under this scheme in order to protect the generators from being
damaged. Multi Criteria Decision Making (MCDM) can be applied to determine
the load shedding scheme in an electric power system. In this thesis, two methods,
the Analytic Hierarchy Process (AHP) and the Technique for Order Preference by
Similarity to Ideal Solution (TOPSIS), are introduced and applied. A series of
analyses is conducted and the results are reported. Of the two
methods, the results show that TOPSIS is the better
MCDM method for the load shedding scheme in the pulp mill
system, because it achieves the higher percentage
effectiveness of load shedding. The results of applying the AHP
and TOPSIS analyses to the pulp mill system are very promising.
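For readers unfamiliar with TOPSIS, the following is a minimal Python sketch of the standard textbook procedure (vector normalisation, weighting, distance to the ideal and anti-ideal solutions, closeness score). It is not the thesis's implementation: the function name `topsis`, the load names, the two criteria, the weights, and all numbers are hypothetical values chosen only for illustration.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a TOPSIS closeness score in [0, 1] for each alternative (row).

    matrix  : list of rows, one per alternative, one column per criterion
    weights : criterion weights (assumed to sum to 1)
    benefit : True if larger is better for that criterion, False if smaller is better
    Assumes no criterion column is all zeros and the alternatives are not all identical.
    """
    n_crit = len(weights)
    # Step 1: vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Step 2: build the ideal and anti-ideal points, criterion by criterion.
    cols = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Step 3: closeness = distance to anti-ideal / (sum of both distances).
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical candidate loads: [power relieved if shed (MW), operational criticality]
loads = ["load_A", "load_B", "load_C"]
matrix = [[10.0, 2.0], [5.0, 8.0], [7.0, 5.0]]
weights = [0.6, 0.4]
benefit = [True, False]  # more power relieved is better; lower criticality is better

scores = topsis(matrix, weights, benefit)
ranking = sorted(zip(loads, scores), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

In this toy data the load with the highest score would be shed first. AHP would instead derive the priorities from pairwise comparison matrices, which this sketch does not cover.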
Predictive User Modeling with Actionable Attributes
Different machine learning techniques have been proposed and used for
modeling individual and group user needs, interests and preferences. In
traditional predictive modeling, instances are described by observable
variables, called attributes. The goal is to learn a model for predicting the
target variable for unseen instances. For example, for marketing purposes a
company may consider profiling a new user based on her observed web browsing
behavior, referral keywords or other relevant information. In many real world
applications the values of some attributes are not only observable, but can be
actively decided by a decision maker. Furthermore, in some such applications
the decision maker is interested not only in generating accurate predictions, but
also in maximizing the probability of the desired outcome. For example, a direct
marketing manager can choose which type of special offer to send to a client
(actionable attribute), hoping that the right choice will result in a positive
response with a higher probability. We study how to learn to choose the value
of an actionable attribute in order to maximize the probability of a desired
outcome in predictive modeling. We emphasize that not all instances are equally
sensitive to changes in actions. An accurate choice of action is critical for
those instances that are on the borderline (e.g. users who do not have a
strong opinion one way or the other). We formulate three supervised learning
approaches for learning to select the value of an actionable attribute at an
instance level. We also introduce a focused training procedure which puts more
emphasis on the situations where varying the action is most likely to have
an effect. A proof-of-concept experimental validation on two real-world case
studies in the web analytics and e-learning domains highlights the potential of the
proposed approaches.
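The idea of choosing an actionable attribute to maximize the probability of a desired outcome can be illustrated with a very simple baseline. The sketch below is an assumption made for illustration, not one of the three approaches formulated in the paper: it estimates the response rate for each (segment, action) pair from past records with Laplace smoothing and picks the action with the highest estimate. The function names, segments, offers, and records are all hypothetical.

```python
from collections import defaultdict

def fit_response_rates(history):
    """Estimate P(positive response | segment, action) from past records.

    history: iterable of (segment, action, outcome) with outcome in {0, 1}.
    Uses Laplace smoothing, (positives + 1) / (total + 2), so that sparse
    pairs are not estimated as exactly 0 or 1.
    """
    counts = defaultdict(lambda: [0, 0])  # (segment, action) -> [positives, total]
    for segment, action, outcome in history:
        counts[(segment, action)][0] += outcome
        counts[(segment, action)][1] += 1
    return {key: (pos + 1) / (tot + 2) for key, (pos, tot) in counts.items()}

def choose_action(rates, segment, actions):
    """Pick the action with the highest estimated response rate.

    Pairs never seen in the history fall back to a neutral estimate of 0.5.
    """
    return max(actions, key=lambda a: rates.get((segment, a), 0.5))

# Hypothetical past campaign records: (user segment, offer sent, responded?)
history = [
    ("new", "discount", 1),
    ("new", "discount", 1),
    ("new", "free_shipping", 0),
    ("returning", "discount", 0),
    ("returning", "free_shipping", 1),
    ("returning", "free_shipping", 1),
]
actions = ["discount", "free_shipping"]

rates = fit_response_rates(history)
best_for_new = choose_action(rates, "new", actions)
best_for_returning = choose_action(rates, "returning", actions)
print(best_for_new, best_for_returning)
```

A baseline like this treats every instance in a segment alike. The abstract's point is that the choice matters most for borderline instances, which is what the focused training procedure is meant to emphasize.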
Customer profile classification using transactional data
Customer profiles are by definition made up of factual and transactional data. It is often the case that, for reasons such as the high cost of data acquisition and/or protection, only the transactional data are available for data mining operations. Transactional data, however, tend to be highly sparse and skewed because a large proportion of customers engage in very few transactions. This can bias the prediction accuracy of classifiers built on such data towards the larger proportion of customers with fewer transactions. This paper investigates an approach for accurately and confidently grouping and classifying customers in bins on the basis of the number of their transactions. The experiments we conducted on highly sparse and skewed real-world transactional data show that our proposed approach can be used to identify a critical point at which customer profiles can be more confidently distinguished.
Layered evaluation of interactive adaptive systems: framework and formative methods
Peer reviewed. Postprint