Why We Read Wikipedia
Wikipedia is one of the most popular sites on the Web, with millions of users
relying on it to satisfy a broad range of information needs every day. Although
it is crucial to understand what exactly these needs are in order to be able to
meet them, little is currently known about why users visit Wikipedia. The goal
of this paper is to fill this gap by combining a survey of Wikipedia readers
with a log-based analysis of user activity. Based on an initial series of user
surveys, we build a taxonomy of Wikipedia use cases along several dimensions,
capturing users' motivations to visit Wikipedia, the depth of knowledge they
are seeking, and their knowledge of the topic of interest prior to visiting
Wikipedia. Then, we quantify the prevalence of these use cases via a
large-scale user survey conducted on live Wikipedia with almost 30,000
responses. Our analyses highlight the variety of factors driving users to
Wikipedia, such as current events, media coverage of a topic, personal
curiosity, work or school assignments, or boredom. Finally, we match survey
responses to the respondents' digital traces in Wikipedia's server logs,
enabling the discovery of behavioral patterns associated with specific use
cases. For instance, we observe long and fast-paced page sequences across
topics for users who are bored or exploring randomly, whereas those using
Wikipedia for work or school spend more time on individual articles focused on
topics such as science. Our findings advance our understanding of reader
motivations and behavior on Wikipedia and can have implications for developers
aiming to improve Wikipedia's user experience, editors striving to cater to
their readers' needs, third-party services (such as search engines) providing
access to Wikipedia content, and researchers aiming to build tools such as
recommendation engines.
Comment: Published in WWW'17; v2 fixes caption of Table
Population Density-based Hospital Recommendation with Mobile LBS Big Data
The difficulty of getting medical treatment is one of the major livelihood issues
in China. Since patients lack prior knowledge about the spatial distribution
and the capacity of hospitals, some hospitals have abnormally high or sporadic
population densities. This paper presents a new model for estimating the
spatiotemporal population density in each hospital based on location-based
service (LBS) big data, which would be beneficial to guiding and dispersing
outpatients. To improve the estimation accuracy, several approaches are
proposed to denoise the LBS data and classify people by detecting their various
behaviors. In addition, a long short-term memory (LSTM) based deep learning model is
presented to predict the trend of population density. Using Baidu's
large-scale LBS log database, we apply the proposed model to 113 hospitals in
Beijing, P. R. China, and construct an online hospital recommendation system
that provides users with a ranked list of hospitals based on real-time
population density information and basic hospital information such as
hospital level and distance. We also mine several interesting
patterns from these LBS logs using our proposed system.
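
The ranking step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the weighted sum below, the weights, the `Hospital` fields, and the convention that a higher level number means a better hospital are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Hospital:
    name: str
    density: float      # estimated current population density (hypothetical units)
    level: int          # hospital level; higher is assumed to be better
    distance_km: float  # distance from the user


def rank_hospitals(hospitals, w_density=0.5, w_level=0.3, w_distance=0.2):
    """Return hospitals sorted best-first by an assumed weighted score.

    Lower density and shorter distance raise the score; a higher level
    raises it. Each feature is scaled by its maximum over the candidates.
    """
    if not hospitals:
        return []
    # "or 1.0" guards against division by zero when every value is 0.
    max_density = max(h.density for h in hospitals) or 1.0
    max_level = max(h.level for h in hospitals) or 1
    max_distance = max(h.distance_km for h in hospitals) or 1.0

    def score(h):
        return (
            w_density * (1 - h.density / max_density)
            + w_level * (h.level / max_level)
            + w_distance * (1 - h.distance_km / max_distance)
        )

    return sorted(hospitals, key=score, reverse=True)


candidates = [
    Hospital("A", density=10, level=3, distance_km=2),
    Hospital("B", density=100, level=3, distance_km=2),
    Hospital("C", density=10, level=3, distance_km=20),
]
ranked_names = [h.name for h in rank_hospitals(candidates)]
```

With these made-up inputs, the uncrowded nearby hospital A ranks first, and the crowded hospital B ranks last because density carries the largest assumed weight.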
A User-Centered Concept Mining System for Query and Document Understanding at Tencent
Concepts embody the knowledge of the world and facilitate the cognitive
processes of human beings. Mining concepts from web documents and constructing
the corresponding taxonomy are core research problems in text understanding and
support many downstream tasks such as query analysis, knowledge base
construction, recommendation, and search. However, we argue that most prior
studies extract formal and overly general concepts from Wikipedia or static web
pages, which do not represent the user perspective. In this paper, we
describe our experience of implementing and deploying ConcepT in Tencent QQ
Browser. It discovers user-centered concepts at the right granularity
conforming to user interests, by mining a large amount of user queries and
interactive search click logs. The extracted concepts have the proper
granularity, are consistent with user language styles and are dynamically
updated. We further present our techniques to tag documents with user-centered
concepts and to construct a topic-concept-instance taxonomy, which has helped
to improve search as well as news feed recommendation in Tencent QQ Browser.
We performed extensive offline evaluation to demonstrate that our approach
could extract concepts of higher quality compared to several other existing
methods. Our system has been deployed in Tencent QQ Browser. Results from
online A/B testing involving a large number of real users suggest that the
Impression Efficiency of feeds users increased by 6.01% after incorporating the
user-centered concepts into the recommendation framework of Tencent QQ Browser.
Comment: Accepted by KDD 201
Crime Scene Re-investigation: A Postmortem Analysis of Game Account Stealers' Behaviors
As item trading becomes more popular, users can convert their game items or
money into real money more easily. At the same time, hackers have turned their
attention to stealing other users' game items or money because it is a much
easier way to earn money than traditional gold farming by running game bots.
Game companies provide various security measures to block account-theft
attempts, but many user-side security measures are disregarded by users because
of their poor usability. In this study, we propose a server-side account theft
detection system based on action sequence analysis to protect game users from
malicious hackers. We tested this system in a real Massively Multiplayer Online
Role Playing Game (MMORPG). By analyzing users' full gameplay logs, our system
can find the particular action sequences of hackers with high accuracy. We can
also trace where the money stolen from victim accounts goes.
Comment: 7 pages, 8 figures, In Proceedings of the 15th Annual Workshop on
Network and Systems Support for Games (NetGames 2017