Web Tracking: Mechanisms, Implications, and Defenses
This article surveys the existing literature on the methods currently used by web services to track users online, as well as their purposes, implications, and possible user defenses. A significant majority of the reviewed articles and web resources are from the years 2012-2014. Privacy seems to be the Achilles' heel of today's web. Web services make continuous efforts to obtain as much information as they can about the things we search for, the sites we visit, the people we contact, and the products we buy. Tracking is usually performed for commercial purposes. We present five main groups of methods used for user tracking, based on sessions, client storage, client cache, fingerprinting, or other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they typically employ a wide range of creative techniques. We also show how users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is used and its possible implications for users (price discrimination, assessing financial credibility, determining insurance coverage, government surveillance, and identity theft). For each of the tracking methods, we present possible defenses. Apart from describing the methods and tools used to protect personal data from tracking, we also present several tools developed for research purposes - their main goal is to discover how and by which entities users are being tracked on their desktop computers or smartphones, provide this information to the users, and visualize it in an accessible and easy-to-follow way. Finally, we present currently proposed future approaches to user tracking and show that they can potentially pose significant threats to users' privacy.

Comment: 29 pages, 212 references
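The fingerprinting category mentioned in this survey can be illustrated with a minimal sketch: a tracking script combines relatively stable browser attributes into a single hash-based identifier that survives cookie deletion. The attribute names and values below are illustrative assumptions, not taken from the survey:

```python
import hashlib

def fingerprint(attrs: dict) -> str:
    """Combine browser attributes into a stable identifier.

    `attrs` is a hypothetical attribute set (user agent, screen size,
    fonts, timezone, ...); real fingerprinters gather dozens of such
    signals via JavaScript before hashing them.
    """
    # Canonical ordering so the same attributes always hash the same way.
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

fp = fingerprint({
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",  # illustrative values
    "screen": "1920x1080",
    "timezone": "UTC+1",
    "fonts": "Arial,Calibri,Verdana",
})
```

Because the identifier is derived from the attributes rather than stored on the client, it illustrates why fingerprinting is harder to defend against than cookie-based tracking.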
Recommended from our members
Linking early geospatial documents, one place at a time: annotation of geographic documents with Recogito
Recogito is an open source tool for the semi-automatic annotation of place references in maps and texts. It was developed as part of the Pelagios 3 research project, which aims to build up a comprehensive directory of places referred to in early maps and geographic writing predating the year 1492. Pelagios 3 focuses specifically on sources from the Classical Latin, Greek and Byzantine periods; on Mappae Mundi and narrative texts from the European Medieval period; on Late Medieval Portolans; and on maps and texts from the early Islamic and early Chinese traditions. Since the start of the project in September 2013, the team has harvested more than 120,000 toponyms, manually verifying almost 60,000 of them. Furthermore, the team held two public annotation workshops supported through the Open Humanities Awards 2014. In these workshops, a mixed audience of students and academics of different backgrounds used Recogito to add several thousand contributions on each workshop day.
A number of benefits arise out of this work: on the one hand, the digital identification of places – and the names used for them – makes the documents' contents amenable to information retrieval technology, i.e. documents become more easily search- and discoverable to users than through conventional metadata-based search alone. On the other hand, the documents are opened up to new forms of re-use. For example, it becomes possible to “map” and compare the narrative of texts, and the contents of maps with modern day tools like Web maps and GIS; or to analyze and contrast documents’ geographic properties, toponymy and spatial relationships. Seen in a wider context, we argue that initiatives such as ours contribute to the growing ecosystem of the “Graph of Humanities Data” that is gathering pace in the Digital Humanities (linking data about people, places, events, canonical references, etc.), which has the potential to open up new avenues for computational and quantitative research in a variety of fields including History, Geography, Archaeology, Classics, Genealogy and Modern Languages
Measuring Online Social Bubbles
Social media have quickly become a prevalent channel to access information,
spread ideas, and influence opinions. However, it has been suggested that
social and algorithmic filtering may cause exposure to less diverse points of
view, and even foster polarization and misinformation. Here we explore and
validate this hypothesis quantitatively for the first time, at the collective
and individual levels, by mining three massive datasets of web traffic, search
logs, and Twitter posts. Our analysis shows that collectively, people access
information from a significantly narrower spectrum of sources through social
media and email, compared to search. The significance of this finding for
individual exposure is revealed by investigating the relationship between the
diversity of information sources experienced by users at the collective and
individual level. There is a strong correlation between collective and
individual diversity, supporting the notion that when we use social media we
find ourselves inside "social bubbles". Our results could lead to a deeper understanding of how technology biases our exposure to new information.
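The source-diversity comparison described in this abstract can be sketched quantitatively; the paper's exact metric is not reproduced here, so this uses Shannon entropy over a hypothetical click stream as one plausible diversity measure (the source names are made up):

```python
import math
from collections import Counter

def source_diversity(visits):
    """Shannon entropy (bits) of the distribution of information sources
    in a user's click stream; lower entropy = a narrower 'bubble'."""
    counts = Counter(visits)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical traffic: many sources reached via search,
# few sources reached via social media.
search_clicks = ["nyt", "bbc", "wiki", "cnn", "blog", "nyt", "bbc", "wiki"]
social_clicks = ["nyt", "nyt", "nyt", "nyt", "bbc", "nyt", "nyt", "nyt"]
```

On this toy data the search stream has higher entropy than the social stream, matching the direction of the paper's collective-level finding.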
Using contextual information to understand searching and browsing behavior
There is a great imbalance between the richness of information on the web and the succinctness and poverty of web users' search requests, making their queries only a partial description of the underlying complex information needs. Finding ways to better leverage contextual information and make search context-aware holds the promise of dramatically improving users' search experience. We conducted a series of studies to discover, model, and utilize contextual information in order to understand and improve users' searching and browsing behavior on the web. Our results capture important aspects of context under the realistic conditions of different online search services, aiming to ensure that our scientific insights and solutions transfer to the operational settings of real-world applications.
Finding not seeking: Law on the UK’s Social Science Information Gateway
Unpublished article on the Social Science Information Gateway by Steve Whittle, Information Systems Manager at the Institute of Advanced Legal Studies.
Where to Go on Your Next Trip? Optimizing Travel Destinations Based on User Preferences
Recommendation based on user preferences is a common task for e-commerce
websites. New recommendation algorithms are often evaluated by offline
comparison to baseline algorithms such as recommending random or the most
popular items. Here, we investigate how these algorithms themselves perform and
compare to the operational production system in large scale online experiments
in a real-world application. Specifically, we focus on recommending travel
destinations at Booking.com, a major online travel site, to users searching for
their preferred vacation activities. To build ranking models we use
multi-criteria rating data provided by previous users after their stay at a
destination. We implement three methods and compare them to the current
baseline in Booking.com: random, most popular, and Naive Bayes. Our general
conclusion is that, in an online A/B test with live users, our Naive-Bayes-based ranker significantly increased user engagement over the current online system.

Comment: 6 pages, 2 figures; SIGIR 2015, SIRIP Symposium on IR in Practice
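The abstract does not give the model details, but a Naive-Bayes-style ranker over multi-criteria rating data can be sketched as follows. The destinations, aspects, and scores below are toy assumptions, not Booking.com data:

```python
from collections import defaultdict

# Hypothetical multi-criteria ratings: (destination, aspect, score 1-5)
ratings = [
    ("Ibiza", "nightlife", 5), ("Ibiza", "beach", 4), ("Ibiza", "nightlife", 4),
    ("Kyoto", "culture", 5), ("Kyoto", "nightlife", 2), ("Kyoto", "culture", 4),
    ("Cancun", "beach", 5), ("Cancun", "beach", 4), ("Cancun", "nightlife", 3),
]

def rank_destinations(preferred_aspect, ratings):
    """Rank destinations by a Naive-Bayes-style score
    P(aspect | destination) * P(destination), with both factors
    estimated from aggregated rating mass."""
    aspect_score = defaultdict(float)  # rating mass for the chosen aspect
    total_score = defaultdict(float)   # total rating mass per destination
    for dest, aspect, score in ratings:
        total_score[dest] += score
        if aspect == preferred_aspect:
            aspect_score[dest] += score
    grand_total = sum(total_score.values())
    scores = {
        dest: (aspect_score[dest] / total_score[dest])  # ~ P(aspect | dest)
              * (total_score[dest] / grand_total)       # ~ P(dest) prior
        for dest in total_score
    }
    return sorted(scores, key=scores.get, reverse=True)
```

For example, a user searching for nightlife would see the destination with the most nightlife rating mass ranked first; this is only one plausible reading of the approach the abstract describes.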
Learning by volunteer computing, thinking and gaming: What and how are volunteers learning by participating in Virtual Citizen Science?
Citizen Science (CS) refers to a form of research collaboration that engages volunteers without formal scientific training in contributing to empirical scientific projects. Virtual Citizen Science (VCS) projects engage participants in online tasks. VCS has demonstrated its usefulness for research; however, little is known about its learning potential for volunteers. This paper reports on research exploring the learning outcomes and processes in VCS. In order to identify different kinds of learning, 32 exploratory interviews with volunteers were conducted across three different VCS projects. We found six main learning outcomes related to different participant activities in the projects. Volunteers learn along four dimensions that are directly related to the scope of the VCS project: they learn at the task/game level, acquire pattern-recognition skills, gain on-topic content knowledge, and improve their scientific literacy. Through the indirect opportunities offered by VCS projects, volunteers learn along two additional dimensions: off-topic knowledge and skills, and personal development. The activities through which volunteers learn can be categorized at two levels: a micro (task/game) level, i.e. direct participation in the task, and a macro level, i.e. use of project documentation, personal research on the Internet, and practicing specific roles in project communities. Both types are influenced by interactions with others in chats or forums. Most learning happens to be informal, unstructured, and social. Volunteers learn not only from others, by interacting with scientists and their peers, but also by working for others: they gain knowledge, new status, and skills by acting as active participants, moderators, editors, translators, community managers, etc. in a project community.
This research highlights these informal and social aspects of adult learning and science education, and stresses the importance of learning through the indirect opportunities provided by the project, the main one being the opportunity to participate and progress in a project community according to one's tastes and skills.
TRAP: using TaRgeted Ads to unveil Google personal Profiles
In the last decade, the advertisement market has spread significantly across the web and mobile-app ecosystems. Its effectiveness is due in part to the possibility of targeting advertisements at the specific interests of the actual user, rather than only at the content of the website hosting the advertisement. In this scenario, services that collect, and hence can provide, information about the browsing user, such as Facebook and Google, have become highly valuable. In this paper, we show how to maliciously exploit the Google Targeted Advertising system to infer personal information in Google user profiles. In particular, the attack we consider is external to Google and relies on combining data from Google AdWords with other data collected from a website in the Google Display Network. We validate the effectiveness of our proposed attack, also discussing possible application scenarios. The result of our research shows a significant practical privacy issue behind this type of targeted advertising service, and calls for further investigation and the design of more privacy-aware solutions, possibly without impeding the current business model involved in online advertisement.