Why We Read Wikipedia
Wikipedia is one of the most popular sites on the Web, with millions of users
relying on it to satisfy a broad range of information needs every day. Although
it is crucial to understand what exactly these needs are in order to be able to
meet them, little is currently known about why users visit Wikipedia. The goal
of this paper is to fill this gap by combining a survey of Wikipedia readers
with a log-based analysis of user activity. Based on an initial series of user
surveys, we build a taxonomy of Wikipedia use cases along several dimensions,
capturing users' motivations to visit Wikipedia, the depth of knowledge they
are seeking, and their knowledge of the topic of interest prior to visiting
Wikipedia. Then, we quantify the prevalence of these use cases via a
large-scale user survey conducted on live Wikipedia with almost 30,000
responses. Our analyses highlight the variety of factors driving users to
Wikipedia, such as current events, media coverage of a topic, personal
curiosity, work or school assignments, or boredom. Finally, we match survey
responses to the respondents' digital traces in Wikipedia's server logs,
enabling the discovery of behavioral patterns associated with specific use
cases. For instance, we observe long and fast-paced page sequences across
topics for users who are bored or exploring randomly, whereas those using
Wikipedia for work or school spend more time on individual articles focused on
topics such as science. Our findings advance our understanding of reader
motivations and behavior on Wikipedia and can have implications for developers
aiming to improve Wikipedia's user experience, editors striving to cater to
their readers' needs, third-party services (such as search engines) providing
access to Wikipedia content, and researchers aiming to build tools such as
recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table
Conscript Your Friends into Larger Anonymity Sets with JavaScript
We present the design and prototype implementation of ConScript, a framework
for using JavaScript to allow casual Web users to participate in an anonymous
communication system. When a Web user visits a cooperative Web site, the site
serves a JavaScript application that instructs the browser to create and submit
"dummy" messages into the anonymity system. Users who want to send non-dummy
messages through the anonymity system use a browser plug-in to replace these
dummy messages with real messages. Creating such conscripted anonymity sets can
increase the anonymity set size available to users of remailer, e-voting, and
verifiable shuffle-style anonymity systems. We outline ConScript's
architecture, we address a number of potential attacks against ConScript, and
we discuss the ethical issues related to deploying such a system. Our
implementation results demonstrate the practicality of ConScript: a workstation
running our ConScript prototype JavaScript client generates a dummy message for
a mix-net in 81 milliseconds and it generates a dummy message for a
DoS-resistant DC-net in 156 milliseconds. Comment: An abbreviated version of this paper will appear at the WPES 2013
worksho
When Backpressure Meets Predictive Scheduling
Motivated by the increasing popularity of learning and predicting human user
behavior in communication and computing systems, in this paper, we investigate
the fundamental benefit of predictive scheduling, i.e., predicting and
pre-serving arrivals, in controlled queueing systems. Based on a lookahead
window prediction model, we first establish a novel equivalence between the
predictive queueing system with a \emph{fully-efficient} scheduling scheme and
an equivalent queueing system without prediction. This connection allows us to
analytically demonstrate that predictive scheduling necessarily improves system
delay performance and can drive it to zero with increasing prediction power. We
then propose the \textsf{Predictive Backpressure (PBP)} algorithm for achieving
optimal utility performance in such predictive systems. \textsf{PBP}
efficiently incorporates prediction into stochastic system control and avoids
the great complication due to the exponential state space growth in the
prediction window size. We show that \textsf{PBP} can achieve a utility
performance that is within $O(\epsilon)$ of the optimal, for any $\epsilon > 0$,
while guaranteeing that the system delay distribution is a
\emph{shifted-to-the-left} version of that under the original Backpressure
algorithm. Hence, the average packet delay under \textsf{PBP} is strictly
better than that under Backpressure, and vanishes with increasing prediction
window size. This implies that the resulting utility-delay tradeoff with
predictive scheduling beats the known optimal tradeoff for systems without prediction.
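The effect of pre-serving arrivals inside a lookahead window can be illustrated with a toy simulation. The sketch below is not the PBP algorithm from the abstract: it assumes a single FIFO queue, one packet served per slot, and perfect prediction of every arrival within `window` slots. The function name `average_delay` and the arrival distribution are illustrative assumptions.

```python
import random
from collections import deque


def average_delay(arrivals, window):
    """Average packet delay in a single FIFO queue with a lookahead window.

    arrivals[t] is the number of packets arriving in slot t. The server
    serves at most one packet per slot and may pre-serve any packet that
    arrives within `window` slots of the current slot (assumed perfect
    prediction). A packet served before it arrives has delay 0.
    """
    total_packets = sum(arrivals)
    if total_packets == 0:
        return 0.0
    horizon = len(arrivals)
    visible = deque()          # arrival slots of packets the server can see
    next_slot_to_reveal = 0
    total_delay = 0
    served = 0
    t = 0
    while served < total_packets:
        # Reveal every packet whose arrival slot lies within the window.
        while next_slot_to_reveal < horizon and next_slot_to_reveal <= t + window:
            visible.extend([next_slot_to_reveal] * arrivals[next_slot_to_reveal])
            next_slot_to_reveal += 1
        if visible:
            arrival_slot = visible.popleft()
            total_delay += max(0, t - arrival_slot)
            served += 1
        t += 1
    return total_delay / served


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, hypothetical bursty arrivals
    arrivals = [rng.choice([0, 0, 0, 1, 3]) for _ in range(10_000)]
    for window in (0, 1, 5, 20):
        print(window, round(average_delay(arrivals, window), 3))
```

In this toy model the average delay does not increase as the window grows, which is consistent with the qualitative claim in the abstract that delay shrinks with increasing prediction power.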
DOBBS: Towards a Comprehensive Dataset to Study the Browsing Behavior of Online Users
The investigation of the browsing behavior of users provides useful
information to optimize web site design, web browser design, search engine
offerings, and online advertisement. This has been a topic of active research
since the Web started and a large body of work exists. However, new online
services as well as advances in Web and mobile technologies clearly changed the
meaning behind "browsing the Web" and require a fresh look at the problem and
research, specifically with respect to whether the models in use are still
appropriate. Platforms such as YouTube, Netflix or last.fm have started to
replace the traditional media channels (cinema, television, radio) and media
distribution formats (CD, DVD, Blu-ray). Social networks (e.g., Facebook) and
platforms for browser games attracted whole new, particularly less tech-savvy
audiences. Furthermore, advances in mobile technologies and devices made
browsing "on-the-move" the norm and changed user behavior, since mobile
browsing is often influenced by the user's location and context in
the physical world. Commonly used datasets, such as web server access logs or
search engine transaction logs, are inherently not capable of capturing the
browsing behavior of users in all these facets. DOBBS (DERI Online Behavior
Study) is an effort to create such a dataset in a non-intrusive, completely
anonymous and privacy-preserving way. To this end, DOBBS provides a browser
add-on that users can install, which keeps track of their browsing behavior
(e.g., how much time they spend on the Web, how long they stay on a website,
how often they visit a website, how they use their browser, etc.). In this
paper, we outline the motivation behind DOBBS, describe the add-on and captured
data in detail, and present some first results to highlight the strengths of
DOBBS.
Twitter session analytics: profiling users' short-term behavioral changes
[Proceeding of]: 8th International Conference (SocInfo 2016), Bellevue, WA, USA, November 11-14, 2016. Human behavior shows strong daily, weekly, and monthly patterns. In this work, we demonstrate online behavioral changes that occur on a much smaller time scale: minutes, rather than days or weeks. Specifically, we study how people distribute their effort over different tasks during periods of activity on the Twitter social platform. We demonstrate that later in a session on Twitter, people prefer to perform simpler tasks, such as replying and retweeting others' posts, rather than composing original messages, and they also tend to post shorter messages. We measure the strength of this effect empirically and statistically using mixed-effects models, and find that the first post of a session is up to 25% more likely to be a composed message, and 10-20% less likely to be a reply or retweet. Qualitatively, our results hold for different populations of Twitter users segmented by how active and well-connected they are. Although our work does not resolve the mechanisms responsible for these behavioral changes, our results offer insights for improving user experience and engagement on online social platforms.
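As a rough illustration of session-level analysis, the sketch below splits one user's timestamped actions into sessions and tabulates the share of each action type by position within a session. It is not the authors' pipeline: the 10-minute inactivity threshold, the function names, and the sample events are all assumptions, and the abstract's mixed-effects modeling is not reproduced here.

```python
from collections import Counter, defaultdict


def split_sessions(events, max_gap_seconds=600):
    """Group (timestamp, kind) events into sessions.

    A new session starts whenever the gap to the previous event exceeds
    max_gap_seconds (600 s is an assumed threshold, not from the paper).
    """
    sessions = []
    for timestamp, kind in sorted(events):
        if sessions and timestamp - sessions[-1][-1][0] <= max_gap_seconds:
            sessions[-1].append((timestamp, kind))
        else:
            sessions.append([(timestamp, kind)])
    return sessions


def type_share_by_position(sessions):
    """For each position in a session, return the share of each action type."""
    counts = defaultdict(Counter)
    for session in sessions:
        for position, (_, kind) in enumerate(session):
            counts[position][kind] += 1
    return {
        position: {kind: n / sum(counter.values()) for kind, n in counter.items()}
        for position, counter in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical events for one user: (seconds since start, action type).
    events = [
        (0, "tweet"), (60, "reply"), (120, "retweet"),
        (5000, "tweet"), (5100, "retweet"),
    ]
    sessions = split_sessions(events)
    print(len(sessions))
    print(type_share_by_position(sessions))
```

Comparing the shares at position 0 with those at later positions gives a simple descriptive view of whether composed messages concentrate at the start of a session.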
Leveraging Human Thinking Style for User Attribution in Digital Forensic Process
User attribution, the process of identifying a human in a digital medium, is a research area that has received significant attention in information security, but little research focus in digital forensics. This study explored the probability that a digital fingerprint based on human thinking style exists and can be used to identify an online user. To achieve this, the study used server-side web data collected from 43 respondents over 10 months, as well as a self-report thinking style measurement instrument. Cluster dichotomies from five thinking styles were extracted. Supervised machine-learning techniques were then applied to distinguish individuals on each dichotomy. The results showed that the thinking styles of individuals on different dichotomies could be reliably distinguished on the Internet using a meta classifier of logistic model tree with bagging. The study further modeled how the observed signature can be adopted for a digital forensic process, using a high-level universal modeling language modeling process, specifically the behavioral state-model and use-case modeling process. Beyond forensics, this result is relevant to human-centered graphical user interface design for recommender systems and to e-commerce services. It also finds application in online profiling processes, especially in e-learning systems.