64,974 research outputs found
Profiling user activities with minimal traffic traces
Understanding user behavior is essential to personalize and enrich a user's
online experience. While there are significant benefits to be accrued from the
pursuit of personalized services based on a fine-grained behavioral analysis,
care must be taken to address user privacy concerns. In this paper, we consider
the use of web traces with truncated URLs - each URL is trimmed to only contain
the web domain - for this purpose. While such truncation removes the
fine-grained sensitive information, it also strips the data of many features
that are crucial to the profiling of user activity. We show how to overcome the
severe handicap of lack of crucial features for the purpose of filtering out
the URLs representing a user activity from the noisy network traffic trace
(including advertisement, spam, analytics, webscripts) with high accuracy. This
activity profiling with truncated URLs enables the network operators to provide
personalized services while mitigating privacy concerns by storing and sharing
only truncated traffic traces.
In order to offset the accuracy loss due to truncation, our statistical
methodology leverages specialized features extracted from a group of
consecutive URLs that represent a micro user action like web click, chat reply,
etc., which we call bursts. These bursts, in turn, are detected by a novel
algorithm which is based on our observed characteristics of the inter-arrival
time of HTTP records. We present an extensive experimental evaluation on a real
dataset of mobile web traces, consisting of more than 130 million records,
representing the browsing activities of 10,000 users over a period of 30 days.
Our results show that the proposed methodology achieves around 90% accuracy in
segregating URLs representing user activities from non-representative URLs
IMPACT: Investigation of Mobile-user Patterns Across University Campuses using WLAN Trace Analysis
We conduct the most comprehensive study of WLAN traces to date. Measurements
collected from four major university campuses are analyzed with the aim of
developing fundamental understanding of realistic user behavior in wireless
networks. Both individual user and inter-node (group) behaviors are
investigated and two classes of metrics are devised to capture the underlying
structure of such behaviors.
For individual user behavior we observe distinct patterns in which most users
are 'on' for a small fraction of the time, the number of access points visited
is very small and the overall on-line user mobility is quite low. We clearly
identify categories of heavy and light users. In general, users exhibit high
degree of similarity over days and weeks.
For group behavior, we define metrics for encounter patterns and friendship.
Surprisingly, we find that a user, on average, encounters less than 6% of the
network user population within a month, and that encounter and friendship
relations are highly asymmetric. We establish that number of encounters follows
a biPareto distribution, while friendship indexes follow an exponential
distribution. We capture the encounter graph using a small world model, the
characteristics of which reach steady state after only one day.
We hope for our study to have a great impact on realistic modeling of network
usage and mobility patterns in wireless networks.Comment: 16 pages, 31 figure
Interpretable Machine Learning for Privacy-Preserving Pervasive Systems
Our everyday interactions with pervasive systems generate traces that capture
various aspects of human behavior and enable machine learning algorithms to
extract latent information about users. In this paper, we propose a machine
learning interpretability framework that enables users to understand how these
generated traces violate their privacy
Understanding the Detection of View Fraud in Video Content Portals
While substantial effort has been devoted to understand fraudulent activity
in traditional online advertising (search and banner), more recent forms such
as video ads have received little attention. The understanding and
identification of fraudulent activity (i.e., fake views) in video ads for
advertisers, is complicated as they rely exclusively on the detection
mechanisms deployed by video hosting portals. In this context, the development
of independent tools able to monitor and audit the fidelity of these systems
are missing today and needed by both industry and regulators.
In this paper we present a first set of tools to serve this purpose. Using
our tools, we evaluate the performance of the audit systems of five major
online video portals. Our results reveal that YouTube's detection system
significantly outperforms all the others. Despite this, a systematic evaluation
indicates that it may still be susceptible to simple attacks. Furthermore, we
find that YouTube penalizes its videos' public and monetized view counters
differently, the former being more aggressive. This means that views identified
as fake and discounted from the public view counter are still monetized. We
speculate that even though YouTube's policy puts in lots of effort to
compensate users after an attack is discovered, this practice places the burden
of the risk on the advertisers, who pay to get their ads displayed.Comment: To appear in WWW 2016, Montr\'eal, Qu\'ebec, Canada. Please cite the
conference version of this pape
- …