4 research outputs found

A Novel Rule Ordering Approach in Classification Association Rule Mining

Author: Frans Coenen
Qin Xin
Yanbo J. Wang
Publication venue: Springer-Verlag
Publication date: 01/01/2007

Abstract. A Classification Association Rule (CAR), a common type of mined knowledge in Data Mining, describes an implicative co-occurring relationship between a set of binary-valued data-attributes (items) and a pre-defined class, expressed in the form of an “antecedent ⇒ consequent-class” rule. Classification Association Rule Mining (CARM) is a recent Classification Rule Mining (CRM) approach that builds an Association Rule Mining (ARM) based classifier using CARs. Regardless of which particular methodology is used to build it, a classifier is usually presented as an ordered CAR list, based on an applied rule ordering strategy. Five existing rule ordering mechanisms can be identified: (1) Confidence-Support-size_of_Antecedent (CSA), (2) size_of_Antecedent-Confidence-Support (ACS), (3) Weighted Relative Accuracy (WRA), (4) Laplace Accuracy, and (5) χ² Testing. In this paper, we divide the above mechanisms into two groups: (i) those resembling the pure “support-confidence” framework, and (ii) those resembling additive score assignment. We consequently propose a hybrid rule ordering approach by combining one approach taken from (i) and another approach taken from (ii). The experimental results show that the proposed rule ordering approach performs well with respect to the accuracy of classification.
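As a rough illustration of the two "support-confidence" style orderings named in the abstract, the sketch below sorts a small rule list by CSA and by ACS. The `Rule` type, the function names, and the sample values are hypothetical, and the tie-break directions (shorter antecedents first under CSA, longer antecedents first under ACS) are an assumption about how these orderings are commonly defined, not something the abstract states. It does not reproduce the hybrid approach the paper proposes.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    """A classification association rule: antecedent => consequent class."""
    antecedent: FrozenSet[str]
    consequent: str
    confidence: float
    support: float


def order_csa(rules: List[Rule]) -> List[Rule]:
    # CSA: confidence (descending), then support (descending),
    # then antecedent size (ascending; assumed tie-break direction).
    return sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, len(r.antecedent)),
    )


def order_acs(rules: List[Rule]) -> List[Rule]:
    # ACS: antecedent size (descending; assumed, favours specific rules),
    # then confidence (descending), then support (descending).
    return sorted(
        rules,
        key=lambda r: (-len(r.antecedent), -r.confidence, -r.support),
    )


# Made-up example rules, for illustration only.
rules = [
    Rule(frozenset({"a"}), "x", 0.90, 0.20),
    Rule(frozenset({"a", "b"}), "y", 0.90, 0.20),
    Rule(frozenset({"c"}), "x", 0.95, 0.10),
]

csa = order_csa(rules)
acs = order_acs(rules)
```

Under these assumptions, CSA places the highest-confidence rule first, whereas ACS places the rule with the largest antecedent first.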

CiteSeerX

Language-independent pre-processing of large document bases for text classification

Author: Yanbo Justin Wang

Text classification is a well-known topic in the research of knowledge discovery in databases. Algorithms for text classification generally involve two stages. The first is concerned with identification of textual features (i.e. words and/or phrases) that may be relevant to the classification process. The second is concerned with classification rule mining and categorisation of "unseen" textual data. The first stage is the subject of this thesis and often involves an analysis of text that is both language-specific (and possibly domain-specific), and that may also be computationally costly, especially when dealing with large datasets. Existing approaches to this stage are not, therefore, generally applicable to all languages. In this thesis, we examine a number of alternative keyword selection methods and phrase generation strategies, coupled with two potential significant word list construction mechanisms and two final significant word selection mechanisms, to identify such words and/or phrases in a given textual dataset that are expected to serve to distinguish between classes, by simple, language-independent statistical properties. We present experimental results, using common (large) textual datasets presented in two distinct languages, to show that the proposed approaches can produce good performance with respect to both classification accuracy and processing efficiency. In other words, the study presented in this thesis demonstrates the possibility of efficiently solving the traditional text classification problem in a language-independent (also domain-independent) manner.

University of Liverpool Repository

Combining SOA and BPM Technologies for Cross-System Process Automation

Author: Herr Sebastian
Läufer Konstantin
Shafaee John
Thiruvathukal George K.
Wirtz Guido
Publication venue: Loyola eCommons
Publication date: 01/01/2008

This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed solution. This includes a general approach, consisting of four distinct steps, as well as specific action items that are to be performed for every step. The discussion also covers language and tool support and challenges arising from the transformation.

Loyola eCommons