On the use of the Rademacher complexity in mining sequential patterns
A sequential pattern is a sequence of sets of items. Mining sequential patterns from very large datasets is a fundamental problem in data mining. This thesis formally proves the first rigorous and efficiently computable bound on the Rademacher complexity of sequential patterns. This result is then applied to two key tasks: mining frequent sequential patterns from a given dataset using progressive sampling, and mining true frequent sequential patterns from an unknown generative process.
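To make the definition above concrete, the following is a minimal Python sketch of how the frequency of a sequential pattern in a dataset could be computed. It is not taken from the thesis: the names `contains`, `frequency`, and `fs`, the toy dataset, and the convention that a pattern occurs in a sequence when its itemsets are subsets of distinct itemsets of the sequence in the same order are all illustrative assumptions.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]
SeqPattern = Sequence[Itemset]  # a sequence of sets of items


def contains(sequence: SeqPattern, pattern: SeqPattern) -> bool:
    """Return True if `pattern` occurs in `sequence` as a subsequence.

    Each itemset of the pattern must be a subset of a distinct itemset
    of the sequence, and the matches must appear in the same order.
    Greedy left-to-right matching is sufficient for this check.
    """
    pos = 0
    for itemset in pattern:
        # Advance to the first remaining itemset that covers `itemset`.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match strictly later
    return True


def frequency(dataset: List[SeqPattern], pattern: SeqPattern) -> float:
    """Fraction of sequences in `dataset` that contain `pattern`."""
    if not dataset:
        return 0.0
    return sum(contains(seq, pattern) for seq in dataset) / len(dataset)


def fs(*items: str) -> Itemset:
    """Hypothetical helper that builds an itemset."""
    return frozenset(items)


# Toy dataset (assumed for illustration only).
dataset = [
    [fs("a"), fs("b", "c"), fs("d")],
    [fs("a", "b"), fs("c")],
    [fs("b"), fs("a")],
]
pattern = [fs("a"), fs("c")]
freq = frequency(dataset, pattern)  # 2 of the 3 toy sequences contain it
```

A pattern would be called frequent when this fraction reaches a user-chosen threshold. A sampling-based approach, as described in the abstract, would evaluate the same quantity on a sample of the dataset rather than on all of it.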
Classification with Single Constraint Progressive Mining of Sequential Patterns
Classification based on sequential pattern data has become an important research topic. One existing approach is Classify-By-Sequence (CBS), which classifies data based on sequential patterns obtained from AprioriLike sequential pattern mining. The resulting patterns, called Classifiable Sequential Patterns (CSP), are used as classifier rules or features for the classification task. However, the AprioriLike algorithm that CBS uses takes a long time to search for sequential patterns, and not all of the patterns it finds are important to the user. To obtain the right and meaningful features for classification, the user applies a constraint during sequential pattern mining. The constraint is also expected to reduce the number of sequential patterns that are short and less meaningful to the user. Therefore, we developed CBS_CLASS* with Single Constraint Progressive Mining of Sequential Patterns (Single Constraint PISA, or PISA*). CBS_CLASS* with PISA* was shown to classify data faster because it processes fewer sequential patterns while still conforming to the user's needs. The experimental results showed that, compared to CBS_CLASS, CBS_CLASS* reduced the classification execution time by 89.8% while maintaining classification accuracy.
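The effect of a single constraint on the pattern set can be sketched as follows. This is not the PISA* algorithm, whose details are not given in the abstract. It only illustrates, under the assumption of a minimum-length constraint, how a constraint can discard short patterns before they are used as classification features. The function name `filter_by_min_length`, the pattern encoding, and the support values are hypothetical.

```python
from typing import Dict, Tuple

# A pattern is a tuple of itemsets; each itemset is a tuple of items.
Pattern = Tuple[Tuple[str, ...], ...]


def filter_by_min_length(
    patterns: Dict[Pattern, float], min_len: int
) -> Dict[Pattern, float]:
    """Keep only patterns with at least `min_len` itemsets.

    `patterns` maps each mined pattern to its support. The length
    constraint is an assumed example of a user-supplied constraint.
    """
    return {p: s for p, s in patterns.items() if len(p) >= min_len}


# Hypothetical mined patterns with made-up support values.
mined = {
    (("a",),): 0.9,
    (("a",), ("b",)): 0.6,
    (("a",), ("b", "c"), ("d",)): 0.4,
}
kept = filter_by_min_length(mined, min_len=2)
```

With fewer patterns retained, a downstream classifier has fewer rules or features to evaluate, which is the kind of saving the abstract attributes to constrained mining.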
A Constraint Guided Progressive Sequential Mining Waterfall Model for CRM
CRM is recognized as central to the growth of any enterprise. It requires both customer satisfaction and the fulfillment of customer requirements, which can only be achieved by analyzing consumer behavior. Data mining has become an effective tool for this, since organizations often hold large databases of customer information. However, traditional data mining techniques have no mechanism to provide guidance for business understanding, model selection, or dynamic changes made to the databases. This article helps in understanding and maintaining the requirement for a continuous data mining process for CRM in a dynamic environment. A novel integrative model for the knowledge discovery process, Constraint Guided Progressive Sequential Mining Waterfall (CGPSMW), is proposed. The key performance factors required for the successful implementation of CRM include the management of marketing, sales, knowledge, and technology, among others. We study how sequential pattern mining performed on progressive databases instead of static databases, in conjunction with these CRM performance indicators, can yield efficient and useful patterns. This further helps in classifying the customers on which an enterprise should focus to achieve growth and benefit. An organization has a limited number of resources, which it can only use for valuable customers to reap the fruits of CRM. The steps of the proposed CGP-SMW model elaborate how to keep the focus on these customers in dynamic scenarios.
A Novel Progressive Multi-label Classifier for Class-incremental Data
In this paper, a progressive learning algorithm for multi-label
classification to learn new labels while retaining the knowledge of previous
labels is designed. New output neurons corresponding to new labels are added
and the neural network connections and parameters are automatically
restructured as if the label had been introduced from the beginning. This work
is the first of its kind in multi-label classification for class-incremental
learning. It is useful for real-world applications such as robotics where
streaming data are available and the number of labels is often unknown. Based
on the Extreme Learning Machine framework, a novel universal classifier with
plug and play capabilities for progressive multi-label classification is
developed. Experimental results on various benchmark synthetic and real
datasets validate the efficiency and effectiveness of our proposed algorithm.
Comment: 5 pages, 3 figures, 4 tables
Applications of concurrent access patterns in web usage mining
This paper builds on the original data mining and modelling research which has proposed the discovery of novel structural relation patterns, applying the approach in web usage mining. The focus of attention here is on concurrent access patterns (CAP), where an overarching framework illuminates the methodology for web access patterns post-processing. Data pre-processing, pattern discovery and pattern analysis all proceed in association with access patterns mining, CAP mining and CAP modelling. Pruning and selection of access patterns takes place as necessary, allowing further CAP mining and modelling to be pursued in the search for the most interesting concurrent access patterns. It is shown that higher level CAPs can be modelled in a way which brings greater structure to bear on the process of knowledge discovery. Experiments with real-world datasets highlight the applicability of the approach in web navigation.