
    Discovering the Impact of Knowledge in Recommender Systems: A Comparative Study

    Recommender systems use user profiles and appropriate filtering techniques to help users find relevant information within a large volume of information. User profiles play an important role in the success of the recommendation process, since they model and represent the actual user needs. However, a comprehensive literature review of recommender systems has revealed no concrete study of the role and impact of knowledge in user profiling and filtering approaches. In this paper, we review the most prominent recommender systems in the literature and examine the impact of knowledge extracted from different sources. We find that semantic information from the user context has a substantial impact on the performance of knowledge-based recommender systems. Finally, we propose some new directions for improving knowledge-based profiles. Comment: 14 pages, 3 tables; International Journal of Computer Science & Engineering Survey (IJCSES) Vol.2, No.3, August 201

    From Micro to Macro: Uncovering and Predicting Information Cascading Process with Behavioral Dynamics

    Cascades are ubiquitous in various network environments. Predicting these cascades is highly nontrivial in several vital applications, such as viral marketing, epidemic prevention and traffic management. Most previous work focuses on predicting the final cascade size. As cascades are typical dynamic processes, it is also interesting and important to predict the cascade size at any time, or to predict the time when a cascade will reach a certain size (e.g. a threshold for outbreak). In this paper, we unify all these tasks into a fundamental problem: cascading process prediction. That is, given the early stage of a cascade, how can we predict its cumulative size at any later time? For such a challenging problem, it is essential to understand the micro mechanism that drives and generates the macro phenomena (i.e. cascading processes). Here we introduce behavioral dynamics as the micro mechanism, describing the dynamic process by which a node's neighbors get infected by a cascade after the node itself gets infected (i.e. one-hop subcascades). Through data-driven analysis, we identify the common principles and patterns underlying behavioral dynamics and propose a novel Networked Weibull Regression model for behavioral dynamics modeling. We then propose a novel method for predicting cascading processes by effectively aggregating behavioral dynamics, and propose a scalable solution to approximate the cascading process with a theoretical guarantee. We extensively evaluate the proposed method on a large-scale social network dataset. The results demonstrate that the proposed method can significantly outperform other state-of-the-art baselines in multiple tasks, including cascade size prediction, outbreak time prediction and cascading process prediction. Comment: 10 pages, 11 figure

    Why We Read Wikipedia

    Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table

    A novel Big Data analytics and intelligent technique to predict driver's intent

    The modern age offers great potential for automatically predicting the driver's intent, through the increasing miniaturization of computing technologies, rapid advancements in communication technologies and continuous connectivity of heterogeneous smart objects. Inside the cabin and engine of modern cars, dedicated computer systems need to be able to exploit the wealth of information generated by heterogeneous data sources with different contextual and conceptual representations. Processing and utilizing this diverse and voluminous data involves many challenges concerning the design of the computational technique used to perform this task. In this paper, we investigate the various data sources available in the car and the surrounding environment that can be used as inputs to predict the driver's intent and behavior. As part of investigating these potential data sources, we conducted experiments on e-calendars for a large number of employees and reviewed a number of available georeferencing systems. Through the results of a statistical analysis and by computing location recognition accuracy, we explored in detail the potential use of calendar location data to detect the driver's intentions. In order to exploit the numerous diverse data inputs available in modern vehicles, we investigate the suitability of different Computational Intelligence (CI) techniques and propose a novel fuzzy computational modelling methodology. Finally, we outline the impact of applying advanced CI and Big Data analytics techniques in modern vehicles on the driver and society in general, and discuss ethical and legal issues arising from the deployment of intelligent self-learning cars.

    Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations

    Post-hoc explanations of machine learning models are crucial for people to understand and act on algorithmic predictions. An intriguing class of explanations is through counterfactuals, hypothetical examples that show people how to obtain a different prediction. We posit that effective counterfactual explanations should satisfy two properties: feasibility of the counterfactual actions given user context and constraints, and diversity among the counterfactuals presented. To this end, we propose a framework for generating and evaluating a diverse set of counterfactual explanations based on determinantal point processes. To evaluate the actionability of counterfactuals, we provide metrics that enable comparison of counterfactual-based methods to other local explanation methods. We further address necessary tradeoffs and point to causal implications in optimizing for counterfactuals. Our experiments on four real-world datasets show that our framework can generate a set of counterfactuals that are diverse and well approximate local decision boundaries, outperforming prior approaches to generating diverse counterfactuals. We provide an implementation of the framework at https://github.com/microsoft/DiCE. Comment: 13 page
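The abstract above names determinantal point processes as the basis for diversity but gives no formula. As a rough illustration only, the sketch below scores a set of candidate counterfactuals by the determinant of a similarity kernel. The kernel 1 / (1 + distance), the L1 distance, and the function names are assumptions made for this example, not details confirmed by the abstract and not the API of the linked repository.

```python
def l1_distance(a, b):
    """Sum of absolute feature differences (an assumed distance choice)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]  # work on a copy
    n = len(m)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def dpp_diversity(points):
    """Hypothetical diversity score: det of K with K[i][j] = 1 / (1 + dist)."""
    kernel = [[1.0 / (1.0 + l1_distance(p, q)) for q in points] for p in points]
    return determinant(kernel)


close = dpp_diversity([[0.0], [1.0]])   # two similar counterfactuals
spread = dpp_diversity([[0.0], [3.0]])  # two dissimilar counterfactuals
```

Under this kernel, identical counterfactuals give a singular matrix and a score of zero, while candidates that are farther apart push the determinant toward one, so a larger score indicates a more diverse set.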

    Analysis and Forecasting of Trending Topics in Online Media Streams

    Among the vast information available on the web, social media streams capture what people currently pay attention to and how they feel about certain topics. Awareness of such trending topics plays a crucial role in multimedia systems such as trend-aware recommendation and automatic vocabulary selection for video concept detection systems. Correctly utilizing trending topics requires a better understanding of their various characteristics in different social media streams. To this end, we present the first comprehensive study across three major online and social media streams, Twitter, Google, and Wikipedia, covering thousands of trending topics during an observation period of an entire year. Our results indicate that, depending on one's requirements, one does not necessarily have to turn to Twitter for information about current events, and that some media streams strongly emphasize content of specific categories. As our second key contribution, we further present a novel approach for the challenging task of forecasting the life cycle of trending topics at the very moment they emerge. Our fully automated approach is based on a nearest neighbor forecasting technique exploiting our assumption that semantically similar topics exhibit similar behavior. We demonstrate on a large-scale dataset of Wikipedia page view statistics that forecasts by the proposed approach are about 9-48k views closer to the actual viewing statistics compared to baseline methods and achieve a mean average percentage error of 45-19% for time periods of up to 14 days. Comment: ACM Multimedia 201
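The nearest neighbor forecasting idea in the abstract above (semantically similar topics are assumed to behave similarly) can be sketched in a few lines. This is a minimal illustration, not the authors' method: the feature vectors, the cosine similarity, the choice of k, and the plain averaging of neighbor series are all assumptions made here. The error function implements the standard mean absolute percentage error, which is assumed to be what the abstract's "mean average percentage error" refers to.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def forecast(new_topic, history, k=2):
    """Average the daily view series of the k most similar past topics.

    history is a list of (feature_vector, daily_views) pairs; both the
    representation and the averaging rule are hypothetical.
    """
    ranked = sorted(history, key=lambda item: cosine(new_topic, item[0]),
                    reverse=True)[:k]
    horizon = min(len(series) for _, series in ranked)
    return [sum(series[day] for _, series in ranked) / len(ranked)
            for day in range(horizon)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)


# Made-up toy data: two past topics resemble the new one, a third does not.
history = [
    ([1.0, 0.0], [100, 50, 25]),
    ([0.9, 0.1], [80, 40, 20]),
    ([0.0, 1.0], [10, 10, 10]),
]
prediction = forecast([1.0, 0.0], history, k=2)
error = mape([100, 50], [90, 55])
```

With this toy data the forecast is the day-by-day mean of the two most similar topics' series, and the dissimilar third topic is ignored.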