Case study: disclosure of indirect device fingerprinting in privacy policies
Recent developments in online tracking make it harder for
individuals to detect and block trackers. This is especially true for
device fingerprinting techniques that websites use to identify and track
individual devices. Direct trackers (those that directly ask the device
for identifying information) can often be blocked with browser
configurations or other simple techniques. However, some sites have shifted to
indirect tracking methods, which attempt to uniquely identify a device
by asking the browser to perform a seemingly unrelated task. One type
of indirect tracking, known as Canvas fingerprinting, causes the browser
to render a graphic, recording rendering statistics as a unique identifier.
Even experts find it challenging to discern some indirect fingerprinting
methods. In this work, we aim to observe how indirect device
fingerprinting methods are disclosed in privacy policies, and consider whether the
disclosures are sufficient to enable website visitors to block the
tracking methods. We compare these disclosures to the disclosure of direct
fingerprinting methods on the same websites.
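The idea behind Canvas fingerprinting can be sketched as a toy simulation. This is not the detector used in the study and not browser code: the function `render_graphic` and its `device_bias` parameter are hypothetical stand-ins for the small, device-specific differences that arise when a real browser draws to a canvas. The sketch only shows why hashing the rendered output yields an identifier that is stable for one device and differs between devices.

```python
import hashlib


def render_graphic(text: str, device_bias: int) -> bytes:
    """Hypothetical stand-in for a browser rendering text to a canvas.

    A real browser would return pixel data that varies subtly with the
    GPU, fonts, and drivers. Here `device_bias` (an assumption made for
    illustration) plays the role of that device-specific variation.
    """
    pixels = bytearray()
    for i, ch in enumerate(text):
        base = (ord(ch) * 31 + i * 7) % 256
        pixels.append((base + device_bias) % 256)
    return bytes(pixels)


def fingerprint(pixels: bytes) -> str:
    """Hash the rendered bytes into a short identifier."""
    return hashlib.sha256(pixels).hexdigest()[:16]


# The page asks for a seemingly unrelated task: draw this string.
task = "Fingerprint test 123"

device_a = fingerprint(render_graphic(task, device_bias=0))
device_a_again = fingerprint(render_graphic(task, device_bias=0))
device_b = fingerprint(render_graphic(task, device_bias=1))

print(device_a == device_a_again)  # same device, same identifier
print(device_a != device_b)        # different device, different identifier
```

Because the site never asks the device for identifying information directly, nothing in this exchange looks like a tracking request, which is what makes the method hard to detect and block.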
Our case study analyzes one indirect fingerprinting technique, Canvas
fingerprinting. We use an existing automated detector of this
fingerprinting technique to conservatively detect its use on Alexa Top 500 websites
that cater to United States consumers, and we examine the privacy
policies of the resulting 28 websites. Disclosures of indirect fingerprinting
vary in specificity. None described the specific methods with enough
granularity to know the website used Canvas fingerprinting. Conversely,
many sites did provide enough detail about usage of direct
fingerprinting methods to allow a website visitor to reliably detect and block those
techniques.
We conclude that indirect fingerprinting methods are often technically
difficult to detect, and are not identified with specificity in legal privacy
notices. This makes indirect fingerprinting more difficult to block, and
therefore risks disturbing the tentative armistice between individuals and
websites currently in place for direct fingerprinting. This paper illustrates
differences in fingerprinting approaches, and explains why technologists,
technology lawyers, and policymakers need to appreciate the challenges
of indirect fingerprinting.
Accepted manuscript
Evaluating advanced search interfaces using established information-seeking model
When users have poorly defined or complex goals, search interfaces offering only keyword searching facilities provide inadequate support to help them reach their information-seeking objectives. The emergence of interfaces with more advanced capabilities, such as faceted browsing and result clustering, can go some way toward addressing such problems. The evaluation of these interfaces, however, is challenging, since they generally offer diverse and versatile search environments that introduce an overwhelming number of independent variables to user studies; choosing the interface object as the only independent variable in a study would reveal very little about why one design outperforms another. Nonetheless, if we could effectively compare these interfaces, we would have a way to determine which was best for a given scenario and begin to learn why. In this article we present a formative framework for the evaluation of advanced search interfaces through the quantification of the strengths and weaknesses of the interfaces in supporting user tactics and varying user conditions. This framework combines established models of users, user needs, and user behaviours to achieve this. The framework is applied to evaluate three search interfaces and demonstrates the potential value of this approach to interactive IR evaluation.
Predicting User-Interactions on Reddit
In order to keep up with the demand of curating the deluge of crowd-sourced
content, social media platforms leverage user interaction feedback to make
decisions about which content to display, highlight, and hide. User
interactions such as likes, votes, clicks, and views are assumed to be a proxy
for a content item's quality, popularity, or newsworthiness. In this paper we ask:
how predictable are the interactions of a user on social media? To answer this
question we recorded the clicking, browsing, and voting behavior of 186 Reddit
users over a year. We present interesting descriptive statistics about their
combined 339,270 interactions, and we find that relatively simple models are
able to predict users' individual browse- or vote-interactions with reasonable
accuracy.
Comment: Presented at ASONAM 201
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather a large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users and this
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential of cross-fertilization, i.e., the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain, in other domains.
Comment: Knowledge-based System