
    Case study: disclosure of indirect device fingerprinting in privacy policies

    Recent developments in online tracking make it harder for individuals to detect and block trackers. This is especially true for device fingerprinting techniques that websites use to identify and track individual devices. Direct trackers (those that directly ask the device for identifying information) can often be blocked with browser configurations or other simple techniques. However, some sites have shifted to indirect tracking methods, which attempt to uniquely identify a device by asking the browser to perform a seemingly unrelated task. One type of indirect tracking, known as Canvas fingerprinting, causes the browser to render a graphic and records rendering statistics as a unique identifier. Even experts find it challenging to discern some indirect fingerprinting methods. In this work, we aim to observe how indirect device fingerprinting methods are disclosed in privacy policies, and consider whether the disclosures are sufficient to enable website visitors to block the tracking methods. We compare these disclosures to the disclosure of direct fingerprinting methods on the same websites. Our case study analyzes one indirect fingerprinting technique, Canvas fingerprinting. We use an existing automated detector of this fingerprinting technique to conservatively detect its use on Alexa Top 500 websites that cater to United States consumers, and we examine the privacy policies of the resulting 28 websites. Disclosures of indirect fingerprinting vary in specificity. None described the specific methods with enough granularity to know that the website used Canvas fingerprinting. Conversely, many sites did provide enough detail about their usage of direct fingerprinting methods to allow a website visitor to reliably detect and block those techniques. We conclude that indirect fingerprinting methods are often technically difficult to detect and are not identified with specificity in legal privacy notices.
This makes indirect fingerprinting more difficult to block, and therefore risks disturbing the tentative armistice between individuals and websites currently in place for direct fingerprinting. This paper illustrates differences in fingerprinting approaches, and explains why technologists, technology lawyers, and policymakers need to appreciate the challenges of indirect fingerprinting.

    Evaluating advanced search interfaces using established information-seeking models

    When users have poorly defined or complex goals, search interfaces offering only keyword-search facilities provide inadequate support to help them reach their information-seeking objectives. The emergence of interfaces with more advanced capabilities, such as faceted browsing and result clustering, can go some way toward addressing such problems. The evaluation of these interfaces, however, is challenging, since they generally offer diverse and versatile search environments that introduce overwhelming numbers of independent variables to user studies; choosing the interface as the only independent variable in a study would reveal very little about why one design outperforms another. Nonetheless, if we could effectively compare these interfaces, we would have a way to determine which was best for a given scenario and begin to learn why. In this article we present a formative framework for the evaluation of advanced search interfaces through the quantification of the strengths and weaknesses of the interfaces in supporting user tactics and varying user conditions. To achieve this, the framework combines established models of users, user needs, and user behaviours. The framework is applied to evaluate three search interfaces and demonstrates the potential value of this approach to interactive IR evaluation.

    Predicting User-Interactions on Reddit

    In order to keep up with the demand of curating the deluge of crowd-sourced content, social media platforms leverage user-interaction feedback to make decisions about which content to display, highlight, and hide. User interactions such as likes, votes, clicks, and views are assumed to be a proxy of a content's quality, popularity, or newsworthiness. In this paper we ask: how predictable are the interactions of a user on social media? To answer this question, we recorded the clicking, browsing, and voting behavior of 186 Reddit users over a year. We present interesting descriptive statistics about their combined 339,270 interactions, and we find that relatively simple models are able to predict users' individual browse or vote interactions with reasonable accuracy.
    Comment: Presented at ASONAM 201
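The abstract does not name its "relatively simple models". As a purely hypothetical illustration of the kind of baseline such models resemble, a per-user majority-class predictor (predict each user's next vote as their most frequent past vote) could look like:

```javascript
// Illustrative baseline only -- not the paper's actual model.
// Given a user's past interactions (e.g. 'up' / 'down' votes),
// predict the next one as the most frequent label seen so far.
function majorityBaseline(history) {
  const counts = {};
  for (const label of history) {
    counts[label] = (counts[label] || 0) + 1;
  }
  let best = null;
  for (const label of Object.keys(counts)) {
    if (best === null || counts[label] > counts[best]) {
      best = label;
    }
  }
  return best; // null when the user has no history yet
}
```

A baseline this simple is useful mainly as a floor: if a richer model cannot beat per-user majority prediction, the interaction stream carries little predictable signal beyond user habit.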

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for performing data analysis in Business and Competitive Intelligence systems, as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow the gathering of large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media, and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain in other domains.
    Comment: Knowledge-based System
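As a toy illustration (not drawn from the survey itself) of the wrapper-style extraction such systems perform, a pattern-based extractor that turns semi-structured HTML into records might look like; the `product`/`name`/`price` markup is an invented example:

```javascript
// Toy wrapper-style extractor: apply a fixed template pattern to
// semi-structured HTML and emit one record per match. Real systems
// use DOM trees, induced wrappers, or learned patterns rather than
// a hand-written regular expression.
function extractProducts(html) {
  const records = [];
  const pattern =
    /<li class="product">\s*<span class="name">([^<]+)<\/span>\s*<span class="price">([^<]+)<\/span>/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    records.push({ name: match[1], price: match[2] });
  }
  return records;
}
```

The fragility of such hand-written templates against layout changes is one reason the survey's cross-fertilization question matters: techniques that adapt to one domain's markup drift could be reused in another.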