10 research outputs found

    Using Wikipedia to boost collaborative filtering techniques

    One important challenge in the field of recommender systems is the sparsity of available data. This problem limits the ability of recommender systems to provide accurate predictions of user ratings. We overcome this problem by using the publicly available user-generated information contained in Wikipedia. We identify similarities between items by mapping them to Wikipedia pages and finding similarities in the text and commonalities in the links and categories of each page. These similarities can be used in the recommendation process and improve ranking predictions. We find that this method is most effective in cases where ratings are extremely sparse or nonexistent. Preliminary experimental results on the MovieLens dataset are encouraging.

    Content-boosted Matrix Factorization Techniques for Recommender Systems

    Many businesses are using recommender systems for marketing outreach. Recommendation algorithms can be either based on content or driven by collaborative filtering. We study different ways to incorporate content information directly into the matrix factorization approach of collaborative filtering. These content-boosted matrix factorization algorithms not only improve recommendation accuracy, but also provide useful insights about the contents and make recommendations more easily interpretable.

    Enhancing Collaborative Filtering Using Implicit Relations in Data

    This work presents a Recommender System (RS) that relies on distributed recommendation techniques and implicit relations in data. To simplify the user experience, recommender systems pre-select and filter information in which users may be interested. Users express their interests in items by giving their opinion (explicit data) and by navigating through web pages (implicit data). The Matrix Factorization (MF) recommendation technique analyzes this feedback, but it does not take more heterogeneous data into account. To improve recommendations, the description of items can be used to increase the relations among data. Our proposal extends MF techniques by adding implicit relations in an independent layer. Using past preferences, we analyze in depth the implicit interest of users in the attributes of items. We then transform ratings and predictions into "semantic values", where the term semantic indicates the expansion in the meaning of ratings. The experimentation phase uses the MovieLens and IMDb databases. We compare our work against a simple Matrix Factorization technique. Results show accurate personalized recommendations. Last but not least, both recommendation analysis and semantic analysis can be parallelized, reducing processing time on large amounts of data.

    Social Media Analytics of Smoking Cessation Intervention: User Behavior Analysis, Classification, and Prediction

    Tobacco use causes a large number of diseases and deaths in the United States. Traditional intervention programs are based on face-to-face consulting, and social support is offered to help smoking quitters control stress and achieve better intervention outcomes. However, the scalability of these traditional intervention programs is limited by time and location. With the development of Web 2.0, many smoking cessation intervention programs have been developed online to reach a wider population. QuitNet is a popular website for smoking cessation that provides different services to help users quit smoking. It builds communities on different social media for people to discuss issues of smoking cessation and provide social support for each other. In this dissertation, we develop a comprehensive study to understand user behavior and discussion interactions in online smoking cessation communities. We compare user features and behaviors on different social media channels, analyze user interactions from the perspective of social support exchange, and apply data mining techniques to analyze discussion content and recommend threads for users. Health communities are developed on different types of social media. For example, QuitNet has Web forums on its own Web site while it also has a presence on Facebook. User participation may vary across social media platforms. Users may also behave differently depending on the functions and design of the social media platforms. So, as the first step in this dissertation, we carry out a preliminary study to compare smoking cessation communities on different social media channels. We analyze user characteristics and behaviors in QuitNet Forum and QuitNet Facebook with statistical analysis and social network analysis. It is found that most users of QuitNet Forum are early smoking quitters, and they participate in discussions more actively than users of QuitNet Facebook. 
However, users of QuitNet Facebook have a wider spectrum of quitting statuses and interaction behaviors. Second, we are interested in user behaviors and how users exchange social support in online communities. Social support is "an exchange of resources between two individuals perceived by the provider or the recipient to be intended to enhance the well-being of the recipient". As QuitNet Forum attracts many more active users than QuitNet Facebook, it provides a better platform for our research purpose. So, we focus on QuitNet Forum, developing a classification scheme through qualitative analysis to categorize discussion topics and types of social support on the forum. Patterns of user behaviors are defined and identified. Social networks are built to analyze user interactions of social support exchange. It is found that users at different quit stages exchange social support in different ways, and different types of social support flow between users at different quit stages. Discussion topics, user behaviors and patterns of social support exchanges are thoroughly analyzed. However, due to the huge amount of information on QuitNet Forum, it is difficult for users to find suitable topics to discuss or peers to interact with. It would be helpful if we could apply machine learning techniques to understand user-generated information in online health communities, and recommend discussion topics for users to participate in. We develop classifiers to categorize posts and comments on QuitNet Forum in terms of user intentions and social support types. User behaviors and patterns are used to help develop various feature sets. Then, we develop recommendation techniques to recommend threads for users to participate in. Based on traditional Collaborative Filtering and content-based approaches, we integrate classification results and user quit stages to develop recommendation systems. 
The experiments show that integrating classification results or user health statuses can achieve the best recommendation results with different percentages of unknown data. In this dissertation, we carry out wide-ranging studies of online smoking cessation communities, including comprehensive analytics and applications. The proposed frameworks and approaches could be applied to other health communities. In the future, we will apply more analytics and techniques to a larger data set, and develop user-end applications to serve and improve online health intervention programs and communities. Ph.D., Computer Science -- Drexel University, 201


    In academic research communities, a typical way to spread ideas or seek collaboration is through research talks, which might be presented at departmental colloquia or given at conferences. Given a large number of research talks, with some of them happening in parallel, it becomes increasingly hard to focus on those that are of most interest. To solve this problem, talk recommendation systems can help academics identify the most useful talks among many. This dissertation investigates methods to improve research talk recommendations, both for conference attendees and for faculty and students at a research university. More specifically, the focus of this thesis is the use of external information about user interests as a way to address the challenges of having limited data about target users. The thesis examines several kinds of external sources, such as user home pages, bibliographies, external bookmarks, and user profiles from external information systems, and explores the impact of this information on the quality of talk recommendation in a general situation and in a cold-start context. For this study, the dissertation uses data from two existing talk recommendation systems, CoMeT and Conference Navigator 3, and an academic paper search system, SciNet.

    Semantic Selection of Internet Sources through SWRL Enabled OWL Ontologies

    This research examines the problem of Information Overload (IO) and gives an overview of various attempts to resolve it. Furthermore, it argues that instead of fighting IO, it is advisable to start learning how to live with it. It is unlikely that in the modern information age, where users are both producers and consumers of information, the amount of data and information generated would decrease. Furthermore, when managing IO, users are confined to the algorithms and policies of commercial Search Engines and Recommender Systems (RSs), which create results that also add to IO. This research calls for a change in thinking: giving greater power to users when addressing the relevance and accuracy of internet searches, which helps with IO. However powerful search engines are, they do not process enough semantics at the moment when search queries are formulated. This research proposes a semantic selection of internet sources through SWRL enabled OWL ontologies. The research focuses on SWT and its Stack because they (a) secure the semantic interpretation of the environments where internet searches take place and (b) guarantee reasoning that results in the selection of suitable internet sources at a particular moment of internet searches. Therefore, it is important to model the behaviour of users through OWL concepts and reason upon them in order to address IO when searching the internet. Thus, user behaviour is itemized through user preferences, perceptions and expectations from internet searches. The proposed approach in this research is a Software Engineering (SE) solution which provides computations based on the semantics of the environment stored in the ontological model.