10,085 research outputs found

    Effective pattern discovery for text mining

    Get PDF
    Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase) based approaches should perform better than the term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance

    Using Information Filtering in Web Data Mining Process

    Get PDF
    Web service-oriented Grid is becoming a standard for achieving loosely coupled distributed computing. Grid services could easily be specified with web-service based interfaces. In this paper we first envisage a realistic Grid market with players such as end-users, brokers and service providers participating co-operatively with an aim to meet requirements and earn profit. End-users wish to use functionality of Grid services by paying the minimum possible price or price confined within a specified budget, brokers aim to maximise profit whilst establishing a SLA (Service Level Agreement) and satisfying end-user needs and at the same time resisting the volatility of service execution time and availability. Service providers aim to develop price models based on end-user or broker demands that will maximise their profit. In this paper we focus on developing stochastic approaches to end-user workflow scheduling that provides QoS guarantees by establishing a SLA. We also develop a novel 2-stage stochastic programming technique that aims at establishing a SLA with end-users regarding satisfying their workflow QoS requirements. We develop a scheduling (workload allocation) technique based on linear programming that embeds the negotiated workflow QoS into the program and model Grid services as generalised queues. This technique is shown to outperform existing scheduling techniques that don't rely on real-time performance information

    Pattern Based Mining For Relevant Document Extraction

    Get PDF
    This paper presents efficient mining algorithm for discovering patterns from text collection and search for useful and interesting patterns. For extracting useful information we used pattern based model containing frequent sequential patterns and pruned the meaningless patterns. Here an innovative and effective technique is used for pattern discovery which includes SPM & FP growth algorithms for pattern mining and applies the processes of pattern deploying, pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information
    corecore