89,814 research outputs found
Re-mining item associations: methodology and a case study in apparel retailing
Association mining is the conventional data mining technique for analyzing market basket data and it reveals the positive and negative associations between items. While being an integral part of transaction data, pricing and time information have not been integrated into market basket analysis in earlier studies. This paper proposes a new approach to mine price, time and domain related attributes through re-mining of association mining results. The underlying factors behind positive and negative relationships can be characterized and described through this second data mining stage. The applicability of the methodology is demonstrated through the analysis of data coming from a large apparel retail chain, and its algorithmic complexity is analyzed in comparison to the existing techniques
The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis
Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies
Generating ordered list of Recommended Items: a Hybrid Recommender System of Microblog
Precise recommendation of followers helps in improving the user experience
and maintaining the prosperity of twitter and microblog platforms. In this
paper, we design a hybrid recommender system of microblog as a solution of KDD
Cup 2012, track 1 task, which requires predicting users a user might follow in
Tencent Microblog. We describe the background of the problem and present the
algorithm consisting of keyword analysis, user taxonomy, (potential)interests
extraction and item recommendation. Experimental result shows the high
performance of our algorithm. Some possible improvements are discussed, which
leads to further study.Comment: 7 page
Scope of negation detection in sentiment analysis
An important part of information-gathering behaviour has always been to find out what other people think and whether they have favourable (positive) or unfavourable (negative) opinions about the subject. This survey studies the role of negation in an opinion-oriented information-seeking system. We investigate the problem of determining the polarity of sentiments in movie reviews when negation words, such as not and hardly occur in the sentences. We examine how different negation scopes (window sizes) affect the classification accuracy. We used term frequencies to evaluate the discrimination capacity of our system with different window sizes. The results show that there is no significant difference in classification accuracy when different window sizes have been applied. However, negation detection helped to identify more opinion or sentiment carrying expressions. We conclude that traditional negation detection methods are inadequate for the task of sentiment analysis in this domain and that progress is to be made by exploiting information about how opinions are expressed implicitly
Data Mining in Health-Care: Issues and a Research Agenda
While data mining has become a much-lauded tool in business and related fields, its role in the healthcare arena is still being explored. Currently, most applications of data mining in healthcare can be categorized into two areas: decision support for clinical practice, and policy planning/decision making. However, it is challenging to find empirical literature in this area since a substantial amount of existing work in data mining for health care is conceptual in nature. In this paper, we review the challenges that limit the progress made in this area and present considerations for the future of data mining in healthcare
- …