7 research outputs found

    Pattern Based Mining For Relevant Document Extraction

    This paper presents an efficient mining algorithm for discovering patterns in text collections and searching for useful and interesting patterns. To extract useful information, we use a pattern-based model containing frequent sequential patterns and prune the meaningless ones. An innovative and effective technique is used for pattern discovery, combining sequential pattern mining (SPM) and FP-growth algorithms and applying the processes of pattern deploying and pattern evolving to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information.
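    The abstract above describes mining frequent sequential patterns from text. As a minimal illustration only (a brute-force enumerator, not the paper's SPM/FP-growth algorithms), the following sketch counts order-preserving term subsequences across short documents and keeps those meeting a support threshold; the sample documents and the length cap of 3 are assumptions for the example.

    ```python
    from itertools import combinations
    from collections import Counter

    def frequent_sequential_patterns(sequences, min_support):
        """Brute-force miner: count every order-preserving (not necessarily
        contiguous) subsequence up to length 3, once per document."""
        counts = Counter()
        for seq in sequences:
            seen = set()
            for k in (1, 2, 3):  # cap pattern length for tractability
                for combo in combinations(seq, k):
                    seen.add(combo)
            counts.update(seen)  # each distinct pattern counted once per doc
        return {p, c for p, c in counts.items()} if False else \
               {p: c for p, c in counts.items() if c >= min_support}

    # Hypothetical toy documents, each a sequence of terms:
    docs = [
        ["data", "mining", "pattern"],
        ["data", "pattern", "mining"],
        ["text", "data", "mining"],
    ]
    patterns = frequent_sequential_patterns(docs, min_support=2)
    ```

    Real SPM implementations use prefix-projection or FP-tree structures to avoid enumerating every candidate; the brute-force version is only practical for short sequences.
    
    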

    Automatic pattern-taxonomy extraction for web mining

    In this paper, we propose a model for discovering frequent sequential patterns (phrases) which can be used as profile descriptors of documents. It is indubitable that we can obtain numerous phrases using data mining algorithms. However, it is difficult to use these phrases effectively for answering what users want. Therefore, we present a pattern taxonomy extraction model which performs the task of extracting descriptive frequent sequential patterns by pruning the meaningless ones. The model is then extended and tested by applying it to an information filtering system. The results of the experiment show that pattern-based methods outperform keyword-based methods. The results also indicate that removal of meaningless patterns not only reduces the cost of computation but also improves the effectiveness of the system.
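    One common way to prune redundant patterns of the kind described above is to keep only "closed" patterns, i.e. drop any pattern whose support equals that of a longer pattern containing it. This is an illustrative sketch of that idea, not the paper's actual taxonomy-pruning rule; the sample pattern set is assumed.

    ```python
    def prune_non_closed(patterns):
        """Keep only closed patterns: discard a pattern if some strictly
        longer pattern contains it as a subsequence with equal support."""
        def is_subseq(short, long_):
            it = iter(long_)
            return all(tok in it for tok in short)

        pruned = {}
        for p, sup in patterns.items():
            redundant = any(
                len(q) > len(p) and sup == s and is_subseq(p, q)
                for q, s in patterns.items()
            )
            if not redundant:
                pruned[p] = sup
        return pruned

    # Hypothetical mined patterns with their document supports:
    pats = {("data",): 3, ("mining",): 3, ("data", "mining"): 3, ("pattern",): 2}
    closed = prune_non_closed(pats)
    ```

    Here ("data",) and ("mining",) are dropped because ("data", "mining") carries the same support, so the shorter patterns add no information.
    
    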

    STORE-AND-SEARCH: A MODEL FOR KNOWLEDGE DISCOVERY

    Abstract: The combination of two powerful technologies, the Semantic Web and Data Mining, will probably bring the internet, and even the intranet, closer to human reasoning than we ever thought possible. The internet is simply viewed as one huge, distributed database just waiting to be made sense of. Preliminary work in transforming this huge corpus of text, images, sound, and video is already available. There is still a long way to go until efficient algorithms are found for automatically converting traditional data into ontologies and concept hierarchies. In this paper we present two approaches to semantic web mining, each concerning a different aspect yet focusing on the same basic problem: making sense of already-existing data designed originally only for human readers. The first is an approach to recurring pattern mining, and the second is a store-and-search model for knowledge discovery. We present here only a small subset of the work undertaken in this exciting field of Semantic Web Mining, but we hope that it will provide a glimpse into the realm of possibilities that it opens.

    Effective pattern discovery for text mining

    Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopt term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based (or phrase-based) approaches should perform better than term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery, which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on the RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.
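    The abstract above mentions "pattern deploying", mapping discovered patterns back onto term weights so they can score documents. As an assumed sketch only (the paper's actual deploying formula is not reproduced here), one simple scheme distributes each pattern's support evenly over its terms and normalises; the input pattern set is hypothetical.

    ```python
    from collections import defaultdict

    def deploy_patterns(patterns):
        """Sketch of pattern deploying: fold pattern supports into
        per-term weights, splitting each pattern's support evenly
        across its terms, then normalising to sum to 1."""
        weights = defaultdict(float)
        for terms, support in patterns.items():
            for t in terms:
                weights[t] += support / len(terms)
        total = sum(weights.values())
        return {t: w / total for t, w in weights.items()}

    # Hypothetical discovered patterns with supports:
    pats = {("data", "mining"): 3, ("pattern",): 2}
    weights = deploy_patterns(pats)
    ```

    A document can then be scored by summing the weights of the terms it contains, which is far cheaper than matching whole patterns at filtering time.
    
    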

    The semantics of N-soft sets, their applications, and a coda about three-way decision

    This paper presents the first detailed analysis of the semantics of N-soft sets. The two benchmark semantics associated with soft sets are perfect fits for N-soft sets. We argue that N-soft sets allow for an utterly new interpretation in logical terms, whereby N-soft sets can be interpreted as a generalized form of incomplete soft sets. Applications include aggregation strategies for these settings. Finally, three-way decision models are designed with both a qualitative and a quantitative character. The first is based on the concepts of V-kernel, V-core, and V-support. The second uses an extended form of cardinality that is reminiscent of the idea of the scalar sigma-count as a proxy for the cardinality of a fuzzy set.
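    For readers unfamiliar with the structure, an N-soft set grades each (object, attribute) pair on an ordinal scale {0, ..., N-1}, and an ordinary soft set can be derived from it by fixing a grade threshold. The sketch below illustrates that standard threshold construction with assumed toy data; it does not reproduce the paper's semantic analysis or its three-way decision models.

    ```python
    # Toy N-soft set: N = 4 grades over objects U and attributes A.
    N = 4
    grades = {
        ("u1", "a1"): 3, ("u1", "a2"): 0,
        ("u2", "a1"): 1, ("u2", "a2"): 2,
        ("u3", "a1"): 2, ("u3", "a2"): 3,
    }

    def threshold_soft_set(grades, t):
        """Derive an ordinary soft set from an N-soft set: each attribute
        maps to the set of objects whose grade reaches threshold t."""
        soft = {}
        for (obj, attr), g in grades.items():
            if g >= t:
                soft.setdefault(attr, set()).add(obj)
        return soft

    approved = threshold_soft_set(grades, t=2)
    ```

    Varying the threshold t yields a family of soft sets, which is one reason N-soft sets are viewed as a refinement of the classical notion.
    
    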

    An Information Filtering Model on the Web and its Application in JobAgent

    Machine-learning techniques play an important role in information filtering. The main objective of machine learning here is to obtain users' profiles. To decrease the burden of on-line learning, it is important to find suitable structures to represent user information needs. This paper proposes a model for information filtering on the Web. In this model, the user information need is described at two levels: profiles at the category level, and Boolean queries at the document level. To efficiently estimate the relevance between the user information need and documents, the user information need is treated as a rough set on the space of documents. Rough set decision theory is used to classify new documents according to the user information need. As a result, the new documents are divided into three regions: the positive region, the boundary region, and the negative region. An experimental system, JobAgent, is also presented to verify this model, and it shows that the rough-set-based model can provide an efficient approach to the information overload problem.
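    The three-region split described above can be illustrated with a generic three-way decision rule: documents scoring above one threshold are accepted, below another rejected, and the rest deferred to the boundary region. The thresholds, the toy relevance measure, and the job-ad examples below are all assumptions for illustration, not the paper's actual JobAgent scoring.

    ```python
    def relevance(doc_terms, profile_terms):
        """Toy relevance estimate: fraction of profile terms present
        in the document (assumes a non-empty profile)."""
        profile = set(profile_terms)
        return len(set(doc_terms) & profile) / len(profile)

    def three_way_classify(score, alpha=0.7, beta=0.3):
        """Three-way decision: accept above alpha (positive region),
        reject below beta (negative region), defer otherwise
        (boundary region). Threshold values are illustrative."""
        if score >= alpha:
            return "positive"
        if score <= beta:
            return "negative"
        return "boundary"

    profile = {"java", "developer", "remote"}
    regions = {
        "ad1": three_way_classify(relevance(["senior", "java", "developer", "remote"], profile)),
        "ad2": three_way_classify(relevance(["sales", "manager"], profile)),
        "ad3": three_way_classify(relevance(["java", "intern"], profile)),
    }
    ```

    Only boundary-region documents need further (expensive) inspection, which is the practical payoff of the three-way split.
    
    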