
    An Efficient and Scalable Recommender System for the Smart Web

    This paper appeared in the proceedings of the 11th International Conference on Innovations in Information Technology (IIT), Innovations 2015 (IEEE IIT 2015), Special Theme: Smart Cities, Big Data, Sustainable Development, held November 1-3, 2015, in Dubai, United Arab Emirates.

    This work describes the development of a web recommender system implementing both collaborative filtering and content-based filtering. It supports two working modes, sponsored and related, depending on whether websites are recommended based on a list of ongoing ad campaigns or on the user's preferences. Novel recommendation algorithms are proposed and implemented that rely entirely on set operations such as union and intersection to compute the set of recommendations provided to end users. The recommender system is deployed over a real-time big data architecture designed around the Apache Hadoop ecosystem, thus supporting horizontal scalability, and exposes recommendations as a service through a RESTful API. Performance measurements show the system delivering dozens of recommendations in a few milliseconds on a single-node cluster.

    This research is part of the Memento Data Analysis project, co-funded by the Spanish Ministry of Industry, Energy and Tourism under grants no. TSI-020601-2012-99 and TSI-020110-2009-137.
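
    The set-operation idea is simple to sketch. Below is a minimal illustration in Python, assuming a hypothetical data model in which each website is indexed by a set of category tags and a user profile is itself a set of preferred categories; the paper's actual algorithms and schema are not reproduced here.

```python
# Hypothetical set-based recommender sketch: score each candidate site by
# the size of the intersection between its category set and the user's
# preferred categories, then return the top-N sites with a nonzero score.

def recommend(user_categories: set[str],
              site_index: dict[str, set[str]],
              top_n: int = 10) -> list[str]:
    scored = sorted(
        ((len(user_categories & cats), site)  # intersection size = score
         for site, cats in site_index.items()),
        reverse=True,
    )
    return [site for score, site in scored[:top_n] if score > 0]

# Toy usage: one site shares two categories with the user, the other none.
index = {"news-site.example": {"news", "tech"},
         "shop.example": {"shopping"}}
print(recommend({"tech", "news"}, index))  # -> ['news-site.example']
```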

    Online Caching with no Regret: Optimistic Learning via Recommendations

    The design of effective online caching policies is an increasingly important problem for content distribution networks, online social networks and edge computing services, among other areas. This paper proposes a new algorithmic toolbox for tackling the problem through the lens of optimistic online learning. We build upon the Follow-the-Regularized-Leader (FTRL) framework, which is developed further here to include predictions for the file requests, and we design online caching algorithms for bipartite networks with fixed-size caches or elastic leased caches subject to time-average budget constraints. The predictions are provided by a content recommendation system that influences the users' viewing activity and hence can naturally reduce the caching network's uncertainty about future requests. We also extend the framework to learn and utilize the best request predictor in cases where many are available. We prove that the proposed optimistic learning caching policies can achieve sub-zero performance loss (regret) for perfect predictions, and maintain the sub-linear regret bound O(\sqrt{T}), which is the best achievable bound for policies that do not use predictions, even for arbitrarily bad predictions. The performance of the proposed algorithms is evaluated with detailed trace-driven numerical tests.

    Comment: arXiv admin note: substantial text overlap with arXiv:2202.1059
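
    As a rough illustration of the optimistic FTRL idea, the sketch below implements fractional caching for a single fixed-size cache (not the paper's bipartite setting) with a fixed Euclidean regularizer rather than the adaptive one the analysis would require; all names and parameter choices are assumptions.

```python
import numpy as np

def project_capped_simplex(y: np.ndarray, capacity: float) -> np.ndarray:
    """Euclidean projection onto {x : 0 <= x <= 1, sum(x) = capacity},
    via bisection on the shift tau in x = clip(y - tau, 0, 1)."""
    lo, hi = y.min() - 1.0, y.max()
    for _ in range(50):                       # bisection to high precision
        tau = (lo + hi) / 2.0
        if np.clip(y - tau, 0.0, 1.0).sum() > capacity:
            lo = tau
        else:
            hi = tau
    return np.clip(y - (lo + hi) / 2.0, 0.0, 1.0)

def optimistic_ftrl_caching(requests, predictions, n_files, capacity, sigma=1.0):
    """Optimistic FTRL sketch: the cache configuration follows the cumulative
    observed request gradients plus a prediction of the next request.
    predictions[t] is the oracle's guess for requests[t + 1]."""
    theta = np.zeros(n_files)                 # cumulative gradient sum
    x = project_capped_simplex(theta, capacity)
    total_utility = 0.0
    for r, r_hat in zip(requests, predictions):
        total_utility += x[r]                 # fractional hit for request r
        theta[r] += 1.0                       # observed gradient g_t = e_{r_t}
        y = theta.copy()
        y[r_hat] += 1.0                       # optimism: add predicted e_{r_{t+1}}
        x = project_capped_simplex(y / sigma, capacity)
    return total_utility
```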

    Optimistic No-regret Algorithms for Discrete Caching

    We take a systematic look at the problem of storing whole files in a cache with limited capacity in the context of optimistic learning, where the caching policy has access to a prediction oracle (provided by, e.g., a neural network). The successive file requests are assumed to be generated by an adversary, and no assumption is made on the accuracy of the oracle. In this setting, we provide a universal lower bound for prediction-assisted online caching and proceed to design a suite of policies with a range of performance-complexity trade-offs. All proposed policies offer sublinear regret bounds commensurate with the accuracy of the oracle. Our results substantially improve upon all recently proposed online caching policies, which, being unable to exploit the oracle predictions, offer only O(\sqrt{T}) regret. In this pursuit, we design, to the best of our knowledge, the first comprehensive optimistic Follow-the-Perturbed-Leader policy, which generalizes beyond the caching problem. We also study the problem of caching files with different sizes and the bipartite network caching problem. Finally, we evaluate the efficacy of the proposed policies through extensive numerical experiments using real-world traces.

    Comment: Accepted to ACM SIGMETRICS 202
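
    A toy sketch of what a single optimistic Follow-the-Perturbed-Leader caching decision could look like; the perturbation scaling and the way predictions enter the score are simplified assumptions rather than the paper's exact policy.

```python
import numpy as np

rng = np.random.default_rng(0)

def optimistic_ftpl_cache(counts, prediction, capacity, eta):
    """One FTPL decision: perturb the observed request counts plus the
    oracle's predicted (future) counts, then cache the top-k files."""
    score = counts + prediction + eta * rng.standard_normal(counts.size)
    return np.argpartition(score, -capacity)[-capacity:]  # cached file indices

# Toy run over 5 files with a cache of size 2 and a perfect one-step oracle.
n_files, capacity = 5, 2
counts = np.zeros(n_files)
hits = 0
requests = [0, 1, 0, 2, 0, 1, 0, 3]
for t, r in enumerate(requests, start=1):
    prediction = np.zeros(n_files)
    prediction[r] = 1.0                       # oracle correctly guesses r
    cached = optimistic_ftpl_cache(counts, prediction, capacity,
                                   eta=np.sqrt(t))
    hits += int(r in cached)
    counts[r] += 1.0
print(f"hits: {hits}/{len(requests)}")
```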

    In CARS We Trust: How Context-Aware Recommendations Affect Customers’ Trust and Other Business Performance Measures of Recommender Systems

    Most of the work on Context-Aware Recommender Systems (CARSes) has focused on demonstrating that contextual information leads to more accurate recommendations and on developing efficient recommendation algorithms that utilize this additional information. Little work has been done, however, on studying how much contextual information affects the purchasing behavior and trust of customers. In this paper, we study how including context in recommendations affects customers’ trust, sales and other crucial business-related performance measures. To do this, we performed a live controlled experiment with real customers of a commercial European online publisher, delivering content-based recommendations and context-aware recommendations to two groups of customers and to a control group. We measured the recommendations’ accuracy and diversification, how much customers spent on purchases during the experiment, the quantity and price of their purchases, and the customers’ level of trust. We demonstrate that accuracy and diversification have only a limited direct effect on customers’ purchasing behavior, but that they affect trust, which in turn drives purchasing behavior. We also show that CARSes can increase both recommendation accuracy and diversification compared to other recommendation engines. Including contextual information in recommendations thus not only increases accuracy, as demonstrated in previous studies, but is crucial for improving trust, which in turn can affect other business-related performance measures such as a company’s sales.

    Polytechnic of Bari, Italy; NYU Stern School of Business
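
    The abstract does not specify which context-aware engine was deployed; as background, the sketch below illustrates contextual pre-filtering, one standard CARS strategy, on entirely hypothetical data: ratings are first filtered to those matching the target context, and a plain item-average recommender is applied to what remains.

```python
from collections import defaultdict

def recommend_in_context(ratings, target_context, top_n=5):
    """ratings: iterable of (user, item, context, score) tuples."""
    sums, counts = defaultdict(float), defaultdict(int)
    for user, item, context, score in ratings:
        if context == target_context:         # contextual pre-filter
            sums[item] += score
            counts[item] += 1
    avg = {item: sums[item] / counts[item] for item in sums}
    return sorted(avg, key=avg.get, reverse=True)[:top_n]

# Hypothetical publisher data: ratings collected under different contexts.
ratings = [("u1", "article-a", "weekday", 5.0),
           ("u2", "article-a", "weekend", 2.0),
           ("u1", "article-b", "weekday", 3.0)]
print(recommend_in_context(ratings, "weekday"))  # -> ['article-a', 'article-b']
```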

    Personalized Large-Scale Classification of Public Tenders on Hadoop

    This project was completed as part of an innovation partnership between Fujitsu Canada and Université Laval. The needs and objectives of the project were centered on a business problem defined jointly with Fujitsu: classifying a corpus of electronic public tenders with a big data approach. The objective was to identify, with very high recall, the tenders relevant to the IT services business of Fujitsu Canada. A small-scale prototype based on the BNS (Bi-Normal Separation) feature-scoring algorithm was empirically shown to classify the public tender corpus with high recall (93%). The prototype was then re-implemented as a complete system on a full-scale Hadoop cluster, using Apache Pig for the data preparation pipeline and Apache Mahout for classification. Our experiments show that the large-scale system not only maintains high classification recall (91%), but readily takes advantage of the scalability gains made possible by Hadoop's distributed architecture.
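
    The BNS score itself is straightforward to compute. A minimal sketch using SciPy's inverse normal CDF follows; the clipping thresholds and the example counts are illustrative, not taken from the thesis.

```python
from scipy.stats import norm

def bns_score(tp: int, fp: int, pos: int, neg: int) -> float:
    """Bi-Normal Separation (Forman, 2003): |F^-1(tpr) - F^-1(fpr)|,
    where F^-1 is the inverse standard normal CDF and the rates are
    clipped away from 0 and 1 to avoid its infinite tails."""
    clip = lambda rate: min(max(rate, 0.0005), 0.9995)
    tpr = clip(tp / pos)  # share of relevant tenders containing the term
    fpr = clip(fp / neg)  # share of irrelevant tenders containing the term
    return abs(norm.ppf(tpr) - norm.ppf(fpr))

# A term in 80/100 relevant tenders but only 50/900 others scores high.
print(round(bns_score(80, 50, 100, 900), 2))  # ~2.43
```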