3,564 research outputs found

    Cluster Optimization for Improved Web Usage Mining

    Get PDF
    Now days, World Wide Web (WWW) has become rich and most powerful source of information. Conversely, it has become tricky and critical task to retrieve actual information due to its continuous expansion in dimensions. Web Usage Mining is a step-wise technique of extracting useful access patterns of the user from web. Web personalization makes use of web usage mining techniques, for knowledge acquisition process done by analyzing the user navigational patterns. The web page personalization involves clustering of different web pages having similar navigation patterns for an individual. Since cluster size expands due to the frequent access, optimization or shrinking the size of clusters becomes a chief consideration. This paper proposes a tactic of cluster optimization based on concept of swarm intelligence techniques. Later on based on the recognition of user access patterns, clustering is implemented using neural fuzzy approach i.e. NEF Class algorithm and cluster optimization is implemented using Ant Nest Mate Approach

    Process-mining-enabled audit of information systems: Methodology and an application

    Get PDF
    Current methodologies for Information Systems (ISs) audits suffer from some limitations that could question the effectiveness of such procedures in detecting deviations, frauds, or abuses. Process Mining (PM), a set of business-process-related diagnostic and improvement techniques, can tackle these weaknesses, but literature lacks contributions that address this possibility concretely. Thus, by framing PM as an Expert System (ES) engine, this paper presents a five-step PM-based methodology for IS audits and validates it through a case in a freight export port process managed by a Port Community System (PCS), an open electronic platform enabling information exchange among port stakeholders. The validation pointed out some advantages (e.g. depth of analysis, easier automation, less invasiveness) of our PM-enabled methodology over extant ESs and tools for IS audit. The substantive test and the check on the PCS processing controls and output controls allowed to identify four major non-conformances likely implying both legal and operational risks, and two unforeseen process deviations that were not known by the port authority, but that could improve the flexibility of the process. These outcomes set the stage for an export process reengineering, and for revising the boundaries in the process flow of the PCS

    Discovering usage patterns from web server logs

    Get PDF
    As the amount of information available on the World Wide Web (WWW) increases rapidly, the number of sites that hold particular information also increases. In order to have some insights o the site usage, system administrator needs tools that can aid in his usage site’s analysis.To achieve this goal, the use of web mining too is necessary to discover the usage pattern of a particular site. For the purpose of this study, server logs from the educational portal were retrieved, pre-processed and analyzed. Information collected by the Web servers are kept in the server logs and used as the main source of data for analyzing users’ navigation patterns. Once the server logs have been preprocessed and sessions have been obtained, there are several kinds of access pattern mining that can be performed, depending on the needs of the analyst. In this study, data mining technique known as Generalized Association Rule was used in order to get some insights into website usage pattern. The findings from this study provide an overview of the usage pattern of particular educational portal. The study also demonstrates how Generalized Association Rule can be used in site usage analysis. Such a technique enables the discovery of hidden information within the web server logs using data mining technique

    Automatic Finding Trapezoidal Membership Functions in Mining Fuzzy Association Rules Based on Learning Automata

    Get PDF
    Association rule mining is an important data mining technique used for discovering relationships among all data items. Membership functions have a significant impact on the outcome of the mining association rules. An important challenge in fuzzy association rule mining is finding an appropriate membership functions, which is an optimization issue. In the most relevant studies of fuzzy association rule mining, only triangle membership functions are considered. This study, as the first attempt, used a team of continuous action-set learning automata (CALA) to find both the appropriate number and positions of trapezoidal membership functions (TMFs). The spreads and centers of the TMFs were taken into account as parameters for the research space and a new approach for the establishment of a CALA team to optimize these parameters was introduced. Additionally, to increase the convergence speed of the proposed approach and remove bad shapes of membership functions, a new heuristic approach has been proposed. Experiments on two real data sets showed that the proposed algorithm improves the efficiency of the extracted rules by finding optimized membership functions

    Finding usage patterns from generalized weblog data

    Get PDF
    Buried in the enormous, heterogeneous and distributed information, contained in the web server access logs, is knowledge with great potential value. As websites continue to grow in number and complexity, web usage mining systems face two significant challenges - scalability and accuracy. This thesis develops a web data generalization technique and incorporates it into the web usage mining framework in an attempt to exploit this information-rich source of data for effective and efficient pattern discovery. Given a concept hierarchy on the web pages, generalization replaces actual page-clicks with their general concepts. Existing methods do this by taking a level-based cut through the concept hierarchy. This adversely affects the quality of mined patterns since, depending on the depth of the chosen level, either significant pages of user interests get coalesced, or many insignificant concepts are retained. We present a usage driven concept ascension algorithm, which only preserves significant items, possibly at different levels in the hierarchy. Concept usage is estimated using a small stratified sample of the large weblog data. A usage threshold is then used to define the nodes to be pruned in the hierarchy for generalization. Our experiments on large real weblog data demonstrate improved performance in terms of quality and computation time of the pattern discovery process. Our algorithm yields an effective and scalable tool for web usage mining

    A Review of Rule Learning Based Intrusion Detection Systems and Their Prospects in Smart Grids

    Get PDF

    Data mining in soft computing framework: a survey

    Get PDF
    The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included
    • …
    corecore