446 research outputs found

    An event distribution platform for recommending cultural activities


    Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization

    Protecting vast quantities of data poses a daunting challenge for the growing number of organizations that collect, stockpile, and monetize it. The ability to distinguish data that is actually needed from data collected "just in case" would help these organizations to limit the latter's exposure to attack. A natural approach might be to monitor data use and retain only the working-set of in-use data in accessible storage; unused data can be evicted to a highly protected store. However, many of today's big data applications rely on machine learning (ML) workloads that are periodically retrained by accessing, and thus exposing to attack, the entire data store. Training set minimization methods, such as count featurization, are often used to limit the data needed to train ML workloads to improve performance or scalability. We present Pyramid, a limited-exposure data management system that builds upon count featurization to enhance data protection. As such, Pyramid uniquely introduces both the idea and proof-of-concept for leveraging training set minimization methods to instill rigor and selectivity into big data management. We integrated Pyramid into Spark Velox, a framework for ML-based targeting and personalization. We evaluate it on three applications and show that Pyramid approaches state-of-the-art models while training on less than 1% of the raw data.
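Count featurization replaces raw categorical values with aggregate statistics; once the statistics exist, a model can train on them and the raw rows can be evicted to protected storage, which is the property Pyramid builds on. A minimal sketch of the idea (the function name and single-column interface are illustrative, not Pyramid's API):

```python
from collections import defaultdict

def count_featurize(column, labels):
    """Map each categorical value to aggregate statistics (count, label rate).

    After this pass, training can use only the statistics, so the raw
    per-row values no longer need to stay in accessible storage.
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for value, y in zip(column, labels):
        counts[value] += 1
        positives[value] += y
    # Each raw value is replaced by (how often it occurred, its positive rate).
    return [(counts[v], positives[v] / counts[v]) for v in column]

features = count_featurize(["a", "b", "a", "a"], [1, 0, 0, 1])
```

Here `"a"` maps to `(3, 2/3)` everywhere it occurs, so the featurized table carries no per-row identity.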

    Bootstrapped Personalized Popularity for Cold Start Recommender Systems

    Recommender Systems are severely hampered by the well-known Cold Start problem, identified by the lack of information on new items and users. This has led to research efforts focused on data imputation and augmentation models as predominantly data pre-processing strategies, yet their improvement of cold-user performance is largely indirect and often comes at the price of a reduction in accuracy for warmer users. To address these limitations, we propose Bootstrapped Personalized Popularity (B2P), a novel framework that improves performance for cold users (directly) and cold items (implicitly) via popularity models personalized with item metadata. B2P is scalable to very large datasets and directly addresses the Cold Start problem, so it can complement existing Cold Start strategies. Experiments on a real-world dataset from the BBC iPlayer and a public dataset demonstrate that B2P (1) significantly improves cold-user performance, (2) boosts warm-user performance for bootstrapped models by lowering their training sparsity, and (3) improves total recommendation accuracy at a competitive diversity level relative to existing high-performing Collaborative Filtering models. We demonstrate that B2P is a powerful and scalable framework for strongly cold datasets.
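The abstract does not spell out B2P's model, but the core idea of a popularity model personalized with item metadata can be sketched: weight each item's global popularity by the user's affinity for that item's metadata category. Everything below (names, genre-based weighting) is a hypothetical illustration, not B2P's actual method:

```python
from collections import Counter

def personalized_popularity(interactions, item_genre, user):
    """Score each item by global popularity, weighted by how often `user`
    has engaged with the item's genre (genre stands in for item metadata)."""
    popularity = Counter(item for _, item in interactions)
    genre_affinity = Counter(item_genre[i] for u, i in interactions if u == user)
    total = sum(genre_affinity.values()) or 1  # guard for fully cold users
    return {item: count * genre_affinity[item_genre[item]] / total
            for item, count in popularity.items()}

interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u2", "i3")]
item_genre = {"i1": "news", "i2": "drama", "i3": "news"}
scores = personalized_popularity(interactions, item_genre, "u1")
```

Because the score only needs metadata plus popularity counts, it stays defined for brand-new items of a known genre, which mirrors the cold-item behavior described above.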

    Content-boosted Matrix Factorization Techniques for Recommender Systems

    Many businesses are using recommender systems for marketing outreach. Recommendation algorithms can be either based on content or driven by collaborative filtering. We study different ways to incorporate content information directly into the matrix factorization approach of collaborative filtering. These content-boosted matrix factorization algorithms not only improve recommendation accuracy, but also provide useful insights about the contents and make recommendations more easily interpretable.
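One common way to incorporate content into matrix factorization is to tie each item's latent factors to its content features, so the learned weights are readable per feature. A minimal SGD sketch of that idea (illustrative only, not the paper's exact formulation; all names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def content_boosted_mf(R, F, k=4, lam=0.01, lr=0.01, epochs=300):
    """Factorize rating matrix R with item factors derived from content.

    R: (n_users, n_items) ratings, np.nan where unobserved
    F: (n_items, n_feats) content features (e.g. genre indicators)
    Each item's factor vector is F[i] @ W, so content features directly
    shape the embedding and W is interpretable feature by feature.
    """
    n_users, _ = R.shape
    U = rng.normal(0, 0.1, (n_users, k))
    W = rng.normal(0, 0.1, (F.shape[1], k))  # content-feature -> factor map
    observed = list(zip(*np.where(~np.isnan(R))))
    for _ in range(epochs):
        for u, i in observed:
            v_i = F[i] @ W                        # item factor from content
            err = R[u, i] - U[u] @ v_i            # prediction error
            U[u] += lr * (err * v_i - lam * U[u])
            W += lr * (err * np.outer(F[i], U[u]) - lam * W)
    return U, W

R = np.array([[5.0, np.nan],
              [np.nan, 2.0]])   # two users, two items, one rating each
F = np.eye(2)                   # trivial one-hot content features
U, W = content_boosted_mf(R, F)
pred = U @ (F @ W).T            # reconstructed rating matrix
```

With one-hot features this reduces to plain MF; richer `F` (genres, keywords) is where the content boost comes in.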

    Towards Serendipity for Content–Based Recommender Systems

    Recommender systems are intelligent applications built to predict the rating or preference that a user would give to an item. One of the fundamental recommendation methods is the content-based method, which predicts ratings by exploiting attributes of users and items, such as users’ profiles and the textual content of items. A current issue faced by recommender systems based on this method is that they tend to recommend items too similar to what users already know. This creates an over-specialisation problem, in which a self-referential loop leaves users in their own circle of findings, never exposed to new items. For these systems to be of significant use, it is important that recommended items be not only relevant but also interesting and serendipitous. A serendipitous recommendation lets users explore new items that they least expect. This has given rise to the issue of serendipity in recommender systems. However, serendipity is difficult to define because there is no consensus definition of the term in recommender systems research; most researchers define serendipity according to their own research purposes. The reviews show that the majority treat unexpectedness as the key aspect in defining serendipity. In this paper, we therefore aim to formally define the concept of serendipity in recommender systems based on the existing literature. We also review several approaches that apply serendipity in content-based recommendation methods. Techniques that use Linked Open Data (LOD) appear to be good candidates for finding relevant, unexpected and novel items in a large dataset.

    What does BERT know about books, movies and music? Probing BERT for Conversational Recommendation

    Heavily pre-trained transformer models such as BERT have recently been shown to be remarkably powerful at language modelling by achieving impressive results on numerous downstream tasks. It has also been shown that they are able to implicitly store factual knowledge in their parameters after pre-training. Understanding what the pre-training procedure of LMs actually learns is a crucial step for using and improving them for Conversational Recommender Systems (CRS). We first study how much off-the-shelf pre-trained BERT "knows" about recommendation items such as books, movies and music. In order to analyze the knowledge stored in BERT's parameters, we use different probes that require different types of knowledge to solve, namely content-based and collaborative-based. Content-based knowledge is knowledge that requires the model to match the titles of items with their content information, such as textual descriptions and genres. In contrast, collaborative-based knowledge requires the model to match items with similar ones, according to community interactions such as ratings. We resort to BERT's Masked Language Modelling head to probe its knowledge about the genre of items, with cloze-style prompts. In addition, we employ BERT's Next Sentence Prediction head and representations' similarity to compare relevant and non-relevant search and recommendation query-document inputs to explore whether BERT can, without any fine-tuning, rank relevant items first. Finally, we study how BERT performs in a conversational recommendation downstream task. Overall, our analyses and experiments show that: (i) BERT has knowledge stored in its parameters about the content of books, movies and music; (ii) it has more content-based knowledge than collaborative-based knowledge; and (iii) it fails on conversational recommendation when faced with adversarial data. Comment: Accepted for publication at RecSys'2
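The cloze-style probing setup can be sketched independently of the model: build a prompt with a masked slot and rank candidate genre tokens by the MLM head's fill probabilities. In the sketch below, `mlm_fill_probs` is a hypothetical stand-in for a real BERT MLM head (the template wording and toy probabilities are illustrative only):

```python
def rank_genres(mlm_fill_probs, prompt, candidates):
    """Rank candidate genre tokens by the probability the MLM assigns to
    each of them at the [MASK] position of `prompt`."""
    probs = mlm_fill_probs(prompt)
    return sorted(candidates, key=lambda g: probs.get(g, 0.0), reverse=True)

prompt = "The Shining is a [MASK] movie."
# Toy scorer standing in for BERT; a real probe would query the MLM head here.
toy_probs = lambda p: {"horror": 0.62, "comedy": 0.03, "romance": 0.01}
ranking = rank_genres(toy_probs, prompt, ["comedy", "horror", "romance"])
```

If the top-ranked candidate matches the item's true genre, the probe counts the model as "knowing" that content fact without any fine-tuning.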

    Automatic User Profile Construction for a Personalized News Recommender System Using Twitter

    Modern society has now grown accustomed to reading online or digital news. However, the huge corpus of information available online poses a challenge to users when trying to find relevant articles. A hybrid system, “Personalized News Recommender Using Twitter”, has been developed to recommend articles to a user based on the popularity of the articles and on the profile of the user. The hybrid system is a fusion of a collaborative recommender system developed using tweets from the “Twitter” public timeline and a content recommender system based on the user’s past interests summarized in their conceptual user profile. In previous work, a user’s profile was built manually by asking the user to explicitly rate his/her interest in each category by entering a score for it. This is not a reliable approach, as the user may not be able to accurately specify their interest in a category with a number. In this work, an automatic profile builder was developed that uses an implicit approach to build the user’s profile. The specificity of the user profile was also increased to incorporate fifteen categories versus seven in the previous system. We concluded with an experiment to study the impact of the automatic profile builder and the increased set of categories on the accuracy of the hybrid news recommender system.
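An implicit profile builder of this kind can be sketched simply: estimate per-category interest from the categories of articles the user actually read, rather than asking for explicit scores. The field names and the fixed-category projection below are illustrative assumptions, not the paper's implementation:

```python
from collections import Counter

def build_profile(read_articles, categories=None):
    """Infer a user's category-interest profile implicitly from reading history.

    read_articles: dicts with a "category" field (name is illustrative)
    categories: optional fixed category set (e.g. the fifteen used above),
                so every profile has the same dimensions
    """
    counts = Counter(a["category"] for a in read_articles)
    total = sum(counts.values())
    profile = {c: n / total for c, n in counts.items()}
    if categories:
        profile = {c: profile.get(c, 0.0) for c in categories}
    return profile

profile = build_profile([{"category": "sports"}, {"category": "sports"},
                         {"category": "tech"}])
```

The resulting normalized weights can then be matched against article categories on the content side of the hybrid recommender.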

    Influence of Social Circles on User Recommendations

    Recommender systems are powerful tools that filter and recommend content relevant to a user. One of the most popular techniques used in recommender systems is collaborative filtering, which has been successfully incorporated in many applications. However, these recommendation systems require a minimum number of users, items, and ratings in order to provide effective recommendations. This results in the infamous cold start problem, where the system is not able to produce effective recommendations for new users. In recent times, with the escalation in the popularity and usage of social networks, people tend to share their experiences in the form of reviews and ratings on social media. The components of social media, such as the influence of friends, users’ interests, and friends’ interests, create many opportunities to develop solutions for the sparsity and cold start problems in recommender systems. This research observes these patterns and analyzes the role of social trust in baseline social recommender algorithms: SocialMF, a matrix factorization-based model; SocialFD, a model that uses distance metric learning; and GraphRec, an attention-based deep learning model. Through extensive experimentation, this research compares the performance and results of these algorithms on the datasets they were originally tested on and on one new dataset, using evaluation metrics such as root mean squared error (RMSE) and mean absolute error (MAE). By modifying the social trust component of these datasets, this project focuses on investigating the impact of trust on the performance of these models. Experimental results of this research suggest that there is no conclusive evidence that trust propagation plays a major part in these models. Moreover, these models show slightly improved performance when supplied with modified trust data.
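The two reported metrics are standard and easy to state precisely; a minimal reference implementation over aligned rating lists:

```python
import math

def rmse(true, pred):
    # RMSE = sqrt(mean of squared errors); penalizes large errors more heavily
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

def mae(true, pred):
    # MAE = mean of absolute errors; weights all error magnitudes linearly
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)
```

Because RMSE squares each error, a model with a few large misses scores worse on RMSE than on MAE, which is why the two are usually reported together.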