89,814 research outputs found

    Re-mining item associations: methodology and a case study in apparel retailing

    Get PDF
    Association mining is the conventional data mining technique for analyzing market basket data and it reveals the positive and negative associations between items. While being an integral part of transaction data, pricing and time information have not been integrated into market basket analysis in earlier studies. This paper proposes a new approach to mine price, time and domain related attributes through re-mining of association mining results. The underlying factors behind positive and negative relationships can be characterized and described through this second data mining stage. The applicability of the methodology is demonstrated through the analysis of data coming from a large apparel retail chain, and its algorithmic complexity is analyzed in comparison to the existing techniques

    The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis

    Get PDF
    Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies

    Generating ordered list of Recommended Items: a Hybrid Recommender System of Microblog

    Full text link
    Precise recommendation of followers helps in improving the user experience and maintaining the prosperity of twitter and microblog platforms. In this paper, we design a hybrid recommender system of microblog as a solution of KDD Cup 2012, track 1 task, which requires predicting users a user might follow in Tencent Microblog. We describe the background of the problem and present the algorithm consisting of keyword analysis, user taxonomy, (potential)interests extraction and item recommendation. Experimental result shows the high performance of our algorithm. Some possible improvements are discussed, which leads to further study.Comment: 7 page

    Scope of negation detection in sentiment analysis

    Get PDF
    An important part of information-gathering behaviour has always been to find out what other people think and whether they have favourable (positive) or unfavourable (negative) opinions about the subject. This survey studies the role of negation in an opinion-oriented information-seeking system. We investigate the problem of determining the polarity of sentiments in movie reviews when negation words, such as not and hardly occur in the sentences. We examine how different negation scopes (window sizes) affect the classification accuracy. We used term frequencies to evaluate the discrimination capacity of our system with different window sizes. The results show that there is no significant difference in classification accuracy when different window sizes have been applied. However, negation detection helped to identify more opinion or sentiment carrying expressions. We conclude that traditional negation detection methods are inadequate for the task of sentiment analysis in this domain and that progress is to be made by exploiting information about how opinions are expressed implicitly

    Data Mining in Health-Care: Issues and a Research Agenda

    Get PDF
    While data mining has become a much-lauded tool in business and related fields, its role in the healthcare arena is still being explored. Currently, most applications of data mining in healthcare can be categorized into two areas: decision support for clinical practice, and policy planning/decision making. However, it is challenging to find empirical literature in this area since a substantial amount of existing work in data mining for health care is conceptual in nature. In this paper, we review the challenges that limit the progress made in this area and present considerations for the future of data mining in healthcare
    corecore