
    The contribution of data mining to information science

    The information explosion is a serious challenge for current information institutions. Data mining, the search for valuable information in large volumes of data, is one way to meet this challenge. In the past several years, data mining has made a significant contribution to the field of information science. This paper examines the impact of data mining by reviewing existing applications, including personalized environments, electronic commerce, and search engines. For each of these three types of application, we discuss how data mining can enhance its functions. The reader can expect an overview of the state-of-the-art research associated with these applications. Furthermore, we identify the limitations of current work and raise several directions for future research.

    Web Mining Functions in an Academic Search Application

    This paper deals with Web mining and its different categories: content, structure, and usage mining. The application of Web mining in an academic search application is discussed. The paper concludes with open problems related to Web mining. The present work can be a useful input to Web users and Web administrators in a university environment. Keywords: Database, HITS, IR, NLP, Web mining

    Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce

    In this paper, we present our work towards comparing on-line and off-line evaluation metrics in the context of small e-commerce recommender systems. Recommending on small e-commerce enterprises is rather challenging due to the lower volume of interactions and low user loyalty, rarely extending beyond a single session. On the other hand, we usually have to deal with lower volumes of objects, which are easier to discover by users through various browsing/searching GUIs. The main goal of this paper is to determine the applicability of off-line evaluation metrics in learning the true usability of recommender systems (evaluated on-line in A/B testing). In total, 800 variants of recommending algorithms were evaluated off-line w.r.t. 18 metrics covering rating-based, ranking-based, novelty and diversity evaluation. The off-line results were afterwards compared with the on-line evaluation of 12 selected recommender variants and, based on the results, we tried to learn and utilize an off-line to on-line results prediction model. Off-line results showed a great variance in performance w.r.t. different metrics, with the Pareto front covering 68% of the approaches. Furthermore, we observed that on-line results are considerably affected by the novelty of users. On-line metrics correlate positively with ranking-based metrics (AUC, MRR, nDCG) for novice users, while too high values of diversity and novelty had a negative impact on the on-line results for them. For users with more visited items, however, diversity became more important, while the relevance of ranking-based metrics gradually decreased. Comment: Submitted to ACM Hypertext 2020 Conference
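
As an illustration of two of the ranking-based metrics named in the abstract above, here is a minimal sketch of MRR and nDCG with binary relevance. The function names and the toy item IDs are hypothetical and are not taken from the paper; the paper's exact metric definitions (for example, cut-off depth or graded relevance) may differ.

```python
import math

def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(sessions):
    """Average the reciprocal rank over (ranked_list, relevant_set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in sessions) / len(sessions)

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(ranked, relevant):
    """Binary-relevance nDCG: DCG of the list divided by the ideal DCG."""
    gains = [1.0 if item in relevant else 0.0 for item in ranked]
    # The ideal ordering places every relevant item at the top of the list.
    ideal = [1.0] * min(len(relevant), len(ranked))
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy usage: two sessions, each with one relevant (visited) item.
sessions = [(["a", "b", "c"], {"a"}), (["a", "b", "c"], {"b"})]
mrr = mean_reciprocal_rank(sessions)
```

Both metrics reward placing relevant items near the top of the list; MRR looks only at the first hit, whereas nDCG accumulates discounted credit for every hit.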

    Inferring Networks of Substitutable and Complementary Products

    In a modern recommender system, it is important to understand how products relate to each other. For example, while a user is looking for mobile phones, it might make sense to recommend other phones, but once they buy a phone, we might instead want to recommend batteries, cases, or chargers. These two types of recommendations are referred to as substitutes and complements: substitutes are products that can be purchased instead of each other, while complements are products that can be purchased in addition to each other. Here we develop a method to infer networks of substitutable and complementary products. We formulate this as a supervised link prediction task, where we learn the semantics of substitutes and complements from data associated with products. The primary source of data we use is the text of product reviews, though our method also makes use of features such as ratings, specifications, prices, and brands. Methodologically, we build topic models that are trained to automatically discover topics from text that are successful at predicting and explaining such relationships. Experimentally, we evaluate our system on the Amazon product catalog, a large dataset consisting of 9 million products, 237 million links, and 144 million reviews. Comment: 12 pages, 6 figures
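
The idea of supervised link prediction mentioned in the abstract above can be sketched as follows. This is only a toy stand-in, not the paper's model: it assumes each product already has a small, hand-written "topic vector" (hypothetical values), builds pair features from the element-wise product of two vectors, and fits a plain logistic regression to predict whether a link exists. All names and numbers are assumptions for illustration.

```python
import math

def pair_features(a, b):
    """Bias term plus the element-wise product of two topic vectors."""
    return [1.0] + [x * y for x, y in zip(a, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Probability that a link exists between the pair."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def train(examples, labels, epochs=500, lr=0.5):
    """Fit logistic regression with per-example gradient steps."""
    weights = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            error = y - predict(weights, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights

# Hypothetical 2-topic vectors: (phone-like topic, accessory-like topic).
phone_a, phone_b = [1.0, 0.0], [0.9, 0.1]
case, charger = [0.0, 1.0], [0.1, 0.9]

# Label 1 = the pair is linked as substitutes, 0 = it is not (toy labels).
pairs = [(phone_a, phone_b), (case, charger), (phone_a, case), (phone_b, charger)]
labels = [1, 1, 0, 0]

weights = train([pair_features(a, b) for a, b in pairs], labels)
p_linked = predict(weights, pair_features(phone_a, phone_b))
p_unlinked = predict(weights, pair_features(phone_a, case))
```

In this sketch, products that load on the same topic receive a high link probability. A second classifier trained on complement labels would be needed to separate the two link types described in the abstract.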