
    ADVANCES IN KNOWLEDGE DISCOVERY IN DATABASES

    The Knowledge Discovery in Databases and Data Mining field develops methods and techniques for assigning useful meaning to data stored in databases. It draws on research from many fields, such as machine learning, pattern recognition, databases, statistics, artificial intelligence, knowledge acquisition for expert systems, data visualization and grids. While Data Mining denotes a set of specific algorithms for finding useful meaning in stored data, Knowledge Discovery in Databases denotes the overall process of finding knowledge and includes Data Mining as one step among others, such as selection, pre-processing, transformation and interpretation of mined data. This paper aims to point out the most important steps that have been made in the Knowledge Discovery in Databases field and to show how the overall discovery process can be improved in the future.

    Applications of concurrent access patterns in web usage mining

    This paper builds on the original data mining and modelling research which has proposed the discovery of novel structural relation patterns, applying the approach in web usage mining. The focus of attention here is on concurrent access patterns (CAP), where an overarching framework illuminates the methodology for web access patterns post-processing. Data pre-processing, pattern discovery and pattern analysis all proceed in association with access patterns mining, CAP mining and CAP modelling. Pruning and selection of access patterns takes place as necessary, allowing further CAP mining and modelling to be pursued in the search for the most interesting concurrent access patterns. It is shown that higher level CAPs can be modelled in a way which brings greater structure to bear on the process of knowledge discovery. Experiments with real-world datasets highlight the applicability of the approach in web navigation.

    Actionable knowledge discovery : methodologies and frameworks

    University of Technology, Sydney. Faculty of Engineering and Information Technology. Most data mining algorithms and tools stop at the mining and delivery of patterns satisfying expected technical interestingness. Many patterns are often mined, but business people either are not interested in them or do not know what follow-up actions to take to support their business decisions. This issue has seriously hindered the widespread use of advanced data mining techniques to improve enterprise operational quality and productivity. In this thesis, a formal and systematic view of actionable knowledge discovery (AKD for short) is proposed from the system and microeconomy perspectives. AKD is a closed-loop optimization problem-solving process that runs from problem definition and framework/model design to actionable pattern discovery, and delivers operationalizable business rules that can be seamlessly associated or integrated with business processes and systems. To support AKD, corresponding methodologies, frameworks and tools are proposed, with real-world case studies, to address critical challenges facing traditional KDD and to cater for crucially important factors surrounding real-life AKD. First, existing data mining methodologies, issues and challenges in actionable knowledge discovery are comprehensively surveyed and reviewed. Second, a practical data mining methodology, domain driven data mining, is addressed. Third, several frameworks are proposed to support domain driven actionable knowledge discovery. Fourth, case studies of domain-driven actionable pattern mining in stock markets and social security data are presented to demonstrate the usefulness and potential of the proposed domain driven actionable knowledge discovery. 
In summary, this thesis explores in detail how domain driven actionable knowledge discovery can be effectively and efficiently applied to the discovery and delivery of knowledge satisfying both technical and business concerns, as well as to support smart decision-making in the real world. The issues and techniques addressed in this thesis have the potential to promote research on critical KDD challenges, and to contribute to the paradigm shift from data-centered and technical significance-oriented hidden pattern mining to domain-driven and balanced actionable knowledge discovery. The proposed methodologies and frameworks are flexible, general and effective enough to be expanded and applied to mining real-life complex data for actionable knowledge.

    Visually Mining Interesting Patterns in Multivariate Datasets

    Data mining for patterns and knowledge discovery in multivariate datasets are very important processes and tasks to help analysts understand the dataset, describe the dataset, and predict unknown data values. However, conventional computer-supported data mining approaches often limit the user from getting involved in the mining process and performing interactions during the pattern discovery. Besides, without a visual representation of the extracted knowledge, analysts can have difficulty explaining and understanding the patterns. Therefore, instead of directly applying automatic data mining techniques, it is necessary to develop appropriate techniques and visualization systems that allow users to interactively perform knowledge discovery, visually examine the patterns, adjust the parameters, and discover more interesting patterns based on their requirements. In the dissertation, I will discuss different proposed visualization systems to assist analysts in mining patterns and discovering knowledge in multivariate datasets, including the design, implementation, and the evaluation. Three different types of patterns are proposed and discussed, including trends, clusters of subgroups, and local patterns. For trend discovery, the parameter space is visualized to allow the user to visually examine the space and find where good linear patterns exist. For cluster discovery, the user is able to interactively set the query range on a target attribute, and retrieve all the sub-regions that satisfy the user's requirements. The sub-regions that satisfy the same query and are near each other are grouped and aggregated to form clusters. For local pattern discovery, the patterns for the local sub-region with a focal point and its neighbors are computationally extracted and visually represented. To discover interesting local neighbors, the extracted local patterns are integrated and visually shown to the analysts. 
Evaluations of the three visualization systems using formal user studies are also performed and discussed.

    Flexible constrained sampling with guarantees for pattern mining

    Pattern sampling has been proposed as a potential solution to the infamous pattern explosion. Instead of enumerating all patterns that satisfy the constraints, individual patterns are sampled proportional to a given quality measure. Several sampling algorithms have been proposed, but each of them has its limitations when it comes to 1) flexibility in terms of quality measures and constraints that can be used, and/or 2) guarantees with respect to sampling accuracy. We therefore present Flexics, the first flexible pattern sampler that supports a broad class of quality measures and constraints, while providing strong guarantees regarding sampling accuracy. To achieve this, we leverage the perspective on pattern mining as a constraint satisfaction problem and build upon the latest advances in sampling solutions in SAT as well as existing pattern mining algorithms. Furthermore, the proposed algorithm is applicable to a variety of pattern languages, which allows us to introduce and tackle the novel task of sampling sets of patterns. We introduce and empirically evaluate two variants of Flexics: 1) a generic variant that addresses the well-known itemset sampling task and the novel pattern set sampling task as well as a wide range of expressive constraints within these tasks, and 2) a specialized variant that exploits existing frequent itemset techniques to achieve substantial speed-ups. Experiments show that Flexics is both accurate and efficient, making it a useful tool for pattern-based data exploration. Comment: Accepted for publication in Data Mining & Knowledge Discovery journal (ECML/PKDD 2017 journal track).
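
The target distribution described in the abstract above (patterns drawn with probability proportional to a quality measure, subject to constraints) can be illustrated with a naive sketch. This is not Flexics: it enumerates every pattern first, which is exactly what a real sampler is designed to avoid, and it only shows what "sampled proportional to quality" means. The function names, the toy transactions, and the choice of absolute frequency as both the constraint and the quality measure are assumptions made for illustration.

```python
import random
from itertools import combinations

def enumerate_patterns(transactions, min_count):
    """List every non-empty itemset occurring in at least `min_count` transactions.

    Exhaustive enumeration: only feasible for tiny toy data.
    """
    universe = sorted(set().union(*transactions))
    patterns = {}
    for size in range(1, len(universe) + 1):
        for candidate in combinations(universe, size):
            count = sum(set(candidate) <= t for t in transactions)
            if count >= min_count:  # constraint: minimum frequency
                patterns[candidate] = count  # quality measure: frequency
    return patterns

def sample_patterns(patterns, k, rng):
    """Draw k patterns with replacement, each with probability proportional to its quality."""
    names = list(patterns)
    qualities = [patterns[p] for p in names]
    return rng.choices(names, weights=qualities, k=k)

# Hypothetical toy data, not taken from the paper.
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
patterns = enumerate_patterns(transactions, min_count=2)
rng = random.Random(0)  # fixed seed so repeated runs give the same draws
draws = sample_patterns(patterns, k=1000, rng=rng)
```

With these toy transactions the qualities sum to 12, so the itemset `("a",)` with frequency 3 should make up roughly a quarter of the draws.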

    Survey On Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining

    In the data mining and knowledge discovery domain, frequent pattern mining plays an important role, but it does not consider the different weight values of the items. Association Rule Mining finds correlations between data items. Frequent itemsets are patterns, such as itemsets, substructures, or subsequences, that appear frequently in a data set. In this paper we present a survey of various frequent pattern mining and weighted itemset mining approaches. A number of articles on frequent and weighted infrequent itemset mining have been published. This paper focuses on a survey of various existing algorithms related to frequent and infrequent itemset mining, which creates a path for future research in the field of Association Rule Mining.
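
The distinction drawn in the abstract above, between plain frequency and a measure that takes item weights into account, can be sketched as follows. This is a brute-force illustration under stated assumptions, not one of the surveyed algorithms: the weighting scheme (support scaled by the mean item weight) is hypothetical and chosen only for simplicity, and the transactions and weights are made up.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force enumeration of itemsets with support >= min_support."""
    universe = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(universe, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result

def weighted_support(itemset, transactions, weights):
    """Hypothetical weighting: plain support scaled by the mean item weight."""
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * support(itemset, transactions)

# Hypothetical toy data for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
weights = {"bread": 1.0, "milk": 0.5, "butter": 2.0}
frequent = frequent_itemsets(transactions, min_support=0.5)
```

Under this assumed weighting, `("milk",)` has plain support 0.75 but weighted support 0.375, so an itemset that counts as frequent can fall below a threshold once weights are considered.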

    Soft data mining, computational theory of perceptions, and rough-fuzzy approach

    Data mining and knowledge discovery are described from a pattern recognition point of view, along with the relevance of soft computing. Key features of the computational theory of perceptions and its significance in pattern recognition and knowledge discovery problems are explained. The role of fuzzy-granulation (f-granulation) in machine and human intelligence, and its modeling through rough-fuzzy integration, are discussed. The merits of fuzzy granular computation, in terms of performance and computation time, for the task of case generation in large scale case-based reasoning systems are illustrated through an example.