Eliciting New Wikipedia Users' Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start
Every day, thousands of users sign up as new Wikipedia contributors. Once
joined, these users have to decide which articles to contribute to, which users
to seek out and learn from or collaborate with, etc. Any such task is a hard
and potentially frustrating one given the sheer size of Wikipedia. Supporting
newcomers in their first steps by recommending articles they would enjoy
editing or editors they would enjoy collaborating with is thus a promising
route toward converting them into long-term contributors. Standard recommender
systems, however, rely on users' histories of previous interactions with the
platform. As such, these systems cannot make high-quality recommendations to
newcomers without any previous interactions -- the so-called cold-start
problem. The present paper addresses the cold-start problem on Wikipedia by
developing a method for automatically building short questionnaires that, when
completed by a newly registered Wikipedia user, can be used for a variety of
purposes, including article recommendations that can help new editors get
started. Our questionnaires are constructed based on the text of Wikipedia
articles as well as the history of contributions by the already onboarded
Wikipedia editors. We assess the quality of our questionnaire-based
recommendations in an offline evaluation using historical data, as well as an
online evaluation with hundreds of real Wikipedia newcomers, concluding that
our method provides cohesive, human-readable questions that perform well
against several baselines. By addressing the cold-start problem, this work can
help with the sustainable growth and maintenance of Wikipedia's diverse editor
community.

Comment: Accepted at the 13th International AAAI Conference on Web and Social Media (ICWSM 2019).
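The general idea behind questionnaire-based cold-start recommendation can be sketched in a few lines. This is an illustrative toy, not the paper's actual method: the hand-made topic vectors and article names are assumptions, whereas the paper derives its questions from Wikipedia article text and editors' contribution histories. Here, a newcomer's questionnaire answers form an interest vector and articles are ranked by cosine similarity to it.

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two vectors; 0.0 if either is all-zero.
    dot = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / den if den else 0.0

def recommend(article_topics, answers, k=1):
    """Return the k articles whose topic vectors best match the answers."""
    ranked = sorted(article_topics,
                    key=lambda a: cosine(article_topics[a], answers),
                    reverse=True)
    return ranked[:k]

# Toy topic axes: [history, science, sports] (illustrative values only).
articles = {
    "World War II":   [0.9, 0.1, 0.0],
    "Photosynthesis": [0.0, 1.0, 0.0],
    "Olympic Games":  [0.3, 0.0, 0.9],
}
answers = [0.0, 1.0, 0.2]  # newcomer reports strong interest in science
print(recommend(articles, answers))  # -> ['Photosynthesis']
```

The key property this sketch shares with the paper's setting is that no interaction history is needed: the questionnaire answers substitute for the missing history that standard recommenders require.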
How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility
Recommendation systems are ubiquitous and impact many domains; they have the
potential to influence product consumption, individuals' perceptions of the
world, and life-altering decisions. These systems are often evaluated or
trained with data from users already exposed to algorithmic recommendations;
this creates a pernicious feedback loop. Using simulations, we demonstrate how
using data confounded in this way homogenizes user behavior without increasing
utility.
Recommender System Using Collaborative Filtering Algorithm
With the vast amount of data available today, institutions are looking for ever more accurate ways of using it. Companies like Amazon use their huge volumes of data to give recommendations to users. Based on similarities among items, a system can predict the rating a new item will receive. Recommender systems use user, item, and rating information to predict how other users will like a particular item.
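The "similarities among items" idea can be made concrete with a small sketch of item-based collaborative filtering. The user names and ratings below are made up for illustration: a user's rating for an unrated item is predicted as the similarity-weighted average of that user's ratings on other items, with cosine similarity computed over the item columns of a toy rating matrix (0 stands for "unrated").

```python
from math import sqrt

ratings = {                      # user -> {item: rating}
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5},     # bob has not rated C yet
    "eve": {"A": 1, "B": 1, "C": 5},
}

def item_vector(item):
    # The item's column in the user x item matrix, 0 where unrated.
    return [ratings[u].get(item, 0) for u in ratings]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / den if den else 0.0

def predict(user, item):
    """Similarity-weighted average of the user's known ratings."""
    num = den = 0.0
    for other, r in ratings[user].items():
        s = cosine(item_vector(item), item_vector(other))
        num += s * r
        den += abs(s)
    return num / den if den else 0.0

print(round(predict("bob", "C"), 2))
```

One caveat worth noting: raw cosine over nonnegative ratings always yields positive similarities, so the prediction is a weighted average of the user's existing ratings; mean-centering the ratings (adjusted cosine) is the usual refinement when dislikes should be able to pull a prediction down.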
Recommender systems are now pervasive and seek to profit from customers or to meet their needs successfully. To reach this goal, however, a system must parse large amounts of data, sometimes collected from different sources, and predict how a user will like a product or item. The computation power required is considerable. Companies also try to avoid flooding customer mailboxes with hundreds of products each morning; instead, they look for the one email or text that will make the customer look and act.
The motivation for this project comes from my eagerness to learn website design and gain a deep understanding of recommender systems. Applying machine learning dynamically is one of the goals I set for myself, and I wanted to go beyond that and verify my results. Thus, I had to use a large dataset to test the algorithms and compare each technique in terms of error rate. My experience applying collaborative filtering taught me that finding a solution is not enough; one should strive for a fast and robust one. In my case, testing the algorithm on a large dataset required me to refine its implementation many times to speed up the process.
In this project, I have designed a website that uses different techniques for recommendations: the user-based, item-based, and model-based approaches to collaborative filtering. Each technique has its own way of predicting a user's rating for a new item from existing users' data. To evaluate each method, I used MovieLens, an external dataset of users, items, and ratings, and calculated the error rate using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). Finally, each method has strengths and weaknesses that depend on the domain in which it is applied.
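The two evaluation metrics named above are straightforward to compute over pairs of predicted and actual ratings. The sample pairs below are made up for illustration, not MovieLens results:

```python
from math import sqrt

def mae(pairs):
    # Mean Absolute Error: average magnitude of the prediction errors.
    return sum(abs(p - a) for p, a in pairs) / len(pairs)

def rmse(pairs):
    # Root Mean Squared Error: like MAE, but penalizes large errors more.
    return sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

pairs = [(3.5, 4.0), (2.0, 2.0), (4.5, 3.0), (1.0, 2.0)]  # (predicted, actual)
print(mae(pairs))   # (0.5 + 0 + 1.5 + 1) / 4 = 0.75
print(rmse(pairs))
```

RMSE is always at least as large as MAE on the same pairs, and the gap between the two indicates how much of the error comes from a few large misses rather than many small ones.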