22,920 research outputs found

    Why We Read Wikipedia

    Get PDF
    Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines.Comment: Published in WWW'17; v2 fixes caption of Table

    Population Density-based Hospital Recommendation with Mobile LBS Big Data

    Full text link
    The difficulty of getting medical treatment is one of major livelihood issues in China. Since patients lack prior knowledge about the spatial distribution and the capacity of hospitals, some hospitals have abnormally high or sporadic population densities. This paper presents a new model for estimating the spatiotemporal population density in each hospital based on location-based service (LBS) big data, which would be beneficial to guiding and dispersing outpatients. To improve the estimation accuracy, several approaches are proposed to denoise the LBS data and classify people by detecting their various behaviors. In addition, a long short-term memory (LSTM) based deep learning is presented to predict the trend of population density. By using Baidu large-scale LBS logs database, we apply the proposed model to 113 hospitals in Beijing, P. R. China, and constructed an online hospital recommendation system which can provide users with a hospital rank list basing the real-time population density information and the hospitals' basic information such as hospitals' levels and their distances. We also mine several interesting patterns from these LBS logs by using our proposed system

    A User-Centered Concept Mining System for Query and Document Understanding at Tencent

    Full text link
    Concepts embody the knowledge of the world and facilitate the cognitive processes of human beings. Mining concepts from web documents and constructing the corresponding taxonomy are core research problems in text understanding and support many downstream tasks such as query analysis, knowledge base construction, recommendation, and search. However, we argue that most prior studies extract formal and overly general concepts from Wikipedia or static web pages, which are not representing the user perspective. In this paper, we describe our experience of implementing and deploying ConcepT in Tencent QQ Browser. It discovers user-centered concepts at the right granularity conforming to user interests, by mining a large amount of user queries and interactive search click logs. The extracted concepts have the proper granularity, are consistent with user language styles and are dynamically updated. We further present our techniques to tag documents with user-centered concepts and to construct a topic-concept-instance taxonomy, which has helped to improve search as well as news feeds recommendation in Tencent QQ Browser. We performed extensive offline evaluation to demonstrate that our approach could extract concepts of higher quality compared to several other existing methods. Our system has been deployed in Tencent QQ Browser. Results from online A/B testing involving a large number of real users suggest that the Impression Efficiency of feeds users increased by 6.01% after incorporating the user-centered concepts into the recommendation framework of Tencent QQ Browser.Comment: Accepted by KDD 201

    Crime Scene Re-investigation: A Postmortem Analysis of Game Account Stealers' Behaviors

    Full text link
    As item trading becomes more popular, users can change their game items or money into real money more easily. At the same time, hackers turn their eyes on stealing other users game items or money because it is much easier to earn money than traditional gold-farming by running game bots. Game companies provide various security measures to block account- theft attempts, but many security measures on the user-side are disregarded by users because of lack of usability. In this study, we propose a server-side account theft detection system base on action sequence analysis to protect game users from malicious hackers. We tested this system in the real Massively Multiplayer Online Role Playing Game (MMORPG). By analyzing users full game play log, our system can find the particular action sequences of hackers with high accuracy. Also, we can trace where the victim accounts stolen money goes.Comment: 7 pages, 8 figures, In Proceedings of the 15th Annual Workshop on Network and Systems Support for Games (NetGames 2017
    • …
    corecore