Online advertising: analysis of privacy threats and protection approaches
Online advertising, the pillar of the “free” content on the Web, has revolutionized the marketing business in recent years by creating a myriad of new opportunities for advertisers to reach potential customers. The current advertising model builds upon an intricate infrastructure composed of a variety of intermediary entities and technologies whose main aim is to deliver personalized ads. For this purpose, a wealth of user data is collected, aggregated, processed and traded behind the scenes at an unprecedented rate. Despite the enormous value of online advertising, however, the intrusiveness and ubiquity of these practices prompt serious privacy concerns. This article surveys the online advertising infrastructure and its supporting technologies, and presents a thorough overview of the underlying privacy risks and the solutions that may mitigate them. We first analyze the threats and potential privacy attackers in this scenario of online advertising. In particular, we examine the main components of the advertising infrastructure in terms of tracking capabilities, data collection, aggregation level and privacy risk, and overview the tracking and data-sharing technologies employed by these components. Then, we conduct a comprehensive survey of the most relevant privacy mechanisms, and classify and compare them on the basis of their privacy guarantees and impact on the Web. Peer reviewed. Postprint (author's final draft).
ATOM: A Generalizable Technique for Inferring Tracker-Advertiser Data Sharing in the Online Behavioral Advertising Ecosystem
Data sharing between online trackers and advertisers is a key component in
online behavioral advertising. This sharing can be facilitated through a
variety of processes, including those not observable to the user's browser. The
unobservability of these processes limits the ability of researchers and
auditors seeking to verify compliance with regulations which require complete
disclosure of data sharing partners. Unfortunately, the applicability of
existing techniques to make inferences about unobservable data sharing
relationships is limited due to their dependence on protocol- or case-specific
artifacts of the online behavioral advertising ecosystem (e.g., they work only
when client-side header bidding is used for ad delivery or when advertisers
perform ad retargeting). As behavioral advertising technologies continue to
evolve rapidly, the availability of these artifacts and the effectiveness of
transparency solutions dependent on them remain ephemeral. In this paper, we
propose a generalizable technique, called ATOM, to infer data sharing
relationships between online trackers and advertisers. ATOM is different from
prior work in that it is universally applicable -- i.e., independent of ad
delivery protocols or availability of artifacts. ATOM leverages the insight
that by the very nature of behavioral advertising, ad creatives themselves can
be used to infer data sharing between trackers and advertisers -- after all,
the topics and brands showcased in an ad are dependent on the data available to
the advertiser. Therefore, by selectively blocking trackers and monitoring
changes in the characteristics of ads delivered by advertisers, ATOM is able to
identify data sharing relationships between trackers and advertisers. The
relationships discovered by our implementation of ATOM include those not found
using prior approaches and are validated by external sources.
Comment: Accepted at PETS'22. 16 pages, 3 tables, 2 figures.
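The core idea in the abstract above (block a tracker, then watch whether an advertiser's ads change) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of total variation distance as the change measure, the 0.3 threshold, and the topic labels are all assumptions made for illustration.

```python
from collections import Counter


def topic_distribution(ad_topics):
    """Normalize a list of ad topic labels into a probability distribution."""
    counts = Counter(ad_topics)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    topics = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in topics)


def infer_sharing(baseline_ads, blocked_ads, threshold=0.3):
    """Flag a likely tracker-advertiser relationship when blocking the
    tracker shifts the advertiser's ad topics by more than `threshold`
    (a hypothetical cut-off, not a value from the paper)."""
    p = topic_distribution(baseline_ads)
    q = topic_distribution(blocked_ads)
    return total_variation(p, q) > threshold


# Hypothetical topics of ads served by one advertiser.
baseline = ["travel", "travel", "shoes", "travel"]      # tracker allowed
blocked = ["insurance", "shoes", "insurance", "news"]   # tracker blocked

shift = total_variation(topic_distribution(baseline),
                        topic_distribution(blocked))
likely_sharing = infer_sharing(baseline, blocked)
```

With these made-up inputs the distance is 0.75, so the pair would be flagged; identical ad sets give a distance of 0 and no flag. A real audit would need many page loads per condition to separate tracker effects from ordinary ad rotation.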
adPerf: Characterizing the Performance of Third-party Ads
Monetizing websites and web apps through online advertising is widespread in
the web ecosystem. The online advertising ecosystem nowadays forces publishers
to integrate ads from third-party domains. On the one hand, this raises
several privacy and security concerns that have been actively studied in recent
years. On the other hand, given the ability of today's browsers to load dynamic
web pages with complex animations and JavaScript, online advertising has also
transformed and can have a significant impact on webpage performance. The
performance cost of online ads is critical since it eventually impacts user
satisfaction as well as their Internet bill and device energy consumption.
In this paper, we conduct an in-depth, first-of-a-kind performance
evaluation of web ads. Unlike prior efforts that rely primarily on adblockers,
we perform a fine-grained analysis on the web browser's page loading process to
demystify the performance cost of web ads. We aim to characterize the cost by
every component of an ad, so the publisher, ad syndicate, and advertiser can
improve the ad's performance with detailed guidance. For this purpose, we
develop an infrastructure, adPerf, for the Chrome browser that classifies page
loading workloads into ad-related and main-content at the granularity of
browser activities (such as JavaScript and Layout). Our evaluations show that
online advertising entails more than 15% of browser page loading workload and
approximately 88% of that is spent on JavaScript. We also track the sources and
delivery chain of web ads and analyze performance considering the origin of the
ad contents. We observe that two well-known third-party ad domains
account for 35% of the ad performance cost and that, surprisingly, top news
websites implicitly include unknown third-party ads, which in some cases account
for more than 37% of the ad performance cost.
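To make the reported proportions concrete, the sketch below aggregates per-activity timings into ad-related and main-content shares. It is only an illustration of the bookkeeping, not adPerf itself: the `workload_shares` function and the event tuples are hypothetical, and the millisecond values are invented so that the resulting shares mirror the 15% and 88% figures quoted in the abstract.

```python
from collections import defaultdict


def workload_shares(events):
    """Summarize page-load work.

    `events` is an iterable of (activity, is_ad, milliseconds) tuples,
    a hypothetical record format chosen for this sketch.
    """
    totals = defaultdict(float)
    for activity, is_ad, ms in events:
        totals[(is_ad, activity)] += ms

    grand_total = sum(totals.values())
    ad_total = sum(ms for (is_ad, _), ms in totals.items() if is_ad)
    ad_js = totals.get((True, "JavaScript"), 0.0)

    return {
        # Fraction of all page-load work attributed to ads.
        "ad_share": ad_total / grand_total if grand_total else 0.0,
        # Fraction of ad work spent in JavaScript.
        "js_share_of_ads": ad_js / ad_total if ad_total else 0.0,
        # Fraction of all page-load work that is ad JavaScript.
        "ad_js_share_of_total": ad_js / grand_total if grand_total else 0.0,
    }


# Invented timings, chosen only to reproduce the quoted proportions.
trace = [
    ("JavaScript", True, 132.0),
    ("Layout", True, 18.0),
    ("JavaScript", False, 500.0),
    ("Layout", False, 350.0),
]
shares = workload_shares(trace)
```

Under these assumptions, ad JavaScript alone accounts for about 0.15 × 0.88 ≈ 13% of the total page-load work.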
An Automated Approach to Auditing Disclosure of Third-Party Data Collection in Website Privacy Policies
A dominant regulatory model for web privacy is "notice and choice". In this
model, users are notified of data collection and provided with options to
control it. To examine the efficacy of this approach, this study presents the
first large-scale audit of disclosure of third-party data collection in website
privacy policies. Data flows on one million websites are analyzed and over
200,000 websites' privacy policies are audited to determine if users are
notified of the names of the companies which collect their data. Policies from
25 prominent third-party data collectors are also examined to provide deeper
insights into the totality of the policy environment. Policies are additionally
audited to determine if the choice expressed by the "Do Not Track" browser
setting is respected.
Third-party data collection is widespread, but fewer than 15% of attributed
data flows are disclosed. The third parties most likely to be disclosed are
those with consumer services users may be aware of; those without consumer
services are less likely to be mentioned. Policies are difficult to understand,
and the average time required to read both a given site's
policy and the associated third-party policies exceeds 84 minutes. Only 7% of
first-party site policies mention the Do Not Track signal, and the majority of
such mentions are to specify that the signal is ignored. Among third-party
policies examined, none offer unqualified support for the Do Not Track signal.
Findings indicate that current implementations of "notice and choice" fail to
provide notice or respect choice.
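The disclosure check described above can be sketched as a comparison between the companies observed in a site's data flows and the names that appear in its privacy policy. This is a simplified illustration, not the study's pipeline: `disclosure_rate`, the plain substring matching, and the company names are all assumptions.

```python
def disclosure_rate(policy_text, observed_companies):
    """Return the set of observed companies named in the policy and the
    fraction of observed companies that are disclosed.

    Matching is a case-insensitive substring test, which is a deliberate
    simplification: it ignores aliases and parent-company names.
    """
    text = policy_text.lower()
    disclosed = {name for name in observed_companies if name.lower() in text}
    rate = len(disclosed) / len(observed_companies) if observed_companies else 0.0
    return disclosed, rate


# Hypothetical policy excerpt and hypothetical third parties seen on the site.
policy = "We share usage data with ExampleAnalytics to measure traffic."
observed = ["ExampleAnalytics", "ExampleAds", "ExampleCDN"]

disclosed, rate = disclosure_rate(policy, observed)
```

Here one of three observed collectors is named, giving a disclosure rate of one third for this made-up site.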
Privacy in online advertising platforms
Online advertising is consistently considered the pillar of the "free" content on the Web, since it is commonly the funding source of websites. Furthermore, the option of delivering personalized ads has turned advertising into a really valuable service for users, who receive ads tailored to their interests. Given its success in getting paying customers, online advertising is fueling a billion-dollar business. The current advertising model builds upon an intricate infrastructure whose main aim is to deliver personalized ads. For this purpose, a wealth of user data is collected, aggregated, processed and traded at an unprecedented rate. However, the intrusiveness and ubiquity of these practices prompt serious privacy concerns.
In view of the inherent complexity behind the operation of ad platforms, privacy risks in the online advertising ecosystem could be studied from multiple perspectives. Naturally, most of the efforts unveiling these privacy issues concentrate on a specific entity, technology, behavior or context. However, such a segmented approach might underestimate the benefits of a wider vision of a systemic problem. Many privacy protection mechanisms have been proposed by industry and academia. The most popular ones resort to radical strategies that hinder the ad distribution process, thus seriously affecting the online advertising ecosystem. Others involve significantly changing the ecosystem, which unfortunately may not be suitable in these times. Consequently, to encourage the adoption of privacy protection in this context, it is fundamental to propose mechanisms that aim at balancing the trade-off between user privacy and the web business model.
First, this thesis deals with the need to have a wide perspective of the privacy risks for users within the online advertising ecosystem and the protection approaches available. We survey the online advertising infrastructure and its supporting technologies, and present a thorough overview of the underlying privacy risks and the solutions that may mitigate them. Through a systematic effort, we analyze the threats and potential privacy attackers in this scenario of online advertising. Then, we conduct a comprehensive survey of the most relevant privacy mechanisms, and classify and compare them on the basis of their privacy guarantees and impact on the Web. Subsequently, we study the privacy risks derived from real-time bidding, a key enabling technology of modern online advertising. We experimentally explore the potential abuse of the process of user data sharing, necessary to support the auction-based system in online advertising. Accordingly, we propose a system to regulate the distribution of user tracking data to potentially interested entities, depending on their previous behavior. This consists in reducing the number of advertising agencies receiving user data. Doing so may affect the ad platform's revenue, thus the proposed system is designed to maximize the revenue while the abuse by advertising agencies is prevented to a large degree. Experimentally, the results of evaluation suggest that this system is able to correct misbehaving entities, consequently enhancing user privacy. Finally, we analyze the impact of online advertising and tracking from the particular perspective of Iberoamerica. We study the third-party and ad tracking triggered within local websites in this heterogeneous region not previously studied. We found that user location in this context affects privacy, since the intensity of third-party traffic, including advertising-related flows of information, varies from country to country when local web traffic is simulated, although the total number of entities behind this traffic seems stable. The type of content served by websites is also a parameter affecting the level of third-party tracking: publishers associated with news or shopping categories generate more third-party traffic, and such intensity is exacerbated for top-world sites.
Online advertising plays an important role on the Internet, commonly funding the operation of websites that offer free content to users. Moreover, the personalization of ads has made online advertising a valuable service for users. Its success in attracting potential buyers sustains a business worth millions. The current ad model relies on a complex infrastructure that delivers personalized ads. To this end, a large amount of user data is collected, aggregated, processed and sold very rapidly. Unfortunately, these practices create privacy risks. Given the complexity of how ad platforms operate, privacy risks can be studied from several perspectives. Naturally, efforts to uncover these privacy problems concentrate on a specific entity, technology, behavior or context. But this approach underestimates the benefits of a broader perspective on a systemic problem. Many protection mechanisms have been proposed by industry and academia. The most popular apply radical strategies that obstruct ad distribution, seriously affecting the ad ecosystem. Others significantly modify the ecosystem, which is not feasible because of the conflict it would cause. Thus, to foster the adoption of privacy protection, it is essential to propose solutions that balance privacy needs with the business model of the web. Initially, the thesis offers a broad view of the privacy risks and protection mechanisms in the online ad ecosystem, based on a review of the underlying infrastructure and technologies. It systematically analyzes the threats and potential attackers. Next, the most relevant privacy mechanisms are reviewed exhaustively, and classified and compared according to the privacy guarantees they offer and their possible impact on the web. Then, the privacy risks derived from real-time bidding, a key technology of the modern online ad system, are studied. The risks of the user data distribution process, part of the auction-based system of online advertising, are investigated experimentally. A system is proposed that regulates the distribution of user data to third parties depending on their previous behavior. This consists in reducing the number of advertising agencies that receive user data. To mitigate the impact on the ad system's revenue, this reduction is carried out with the aim of maximizing revenue. Experimentally, the proposed system is found to correct malicious behavior, improving user privacy. Finally, the impact of tracking and online advertising is analyzed from the Iberoamerican perspective. We study third-party and ad-related tracking generated on local websites in this heterogeneous region. We find that user location in this context affects user privacy, since this tracking varies from country to country, although the total number of entities behind this traffic seems stable. The type of content also affects the level of tracking: news or shopping websites generate more traffic to third parties, and this intensity is exacerbated on the most popular sites.
Postprint (published version).
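The regulation idea summarized in this abstract (limit which bidders receive user data based on their past behavior, while protecting revenue) can be illustrated with a simple selection rule. The thesis abstract does not specify the mechanism, so everything here is an assumption: the `trust` score, the `min_trust` cut-off, the greedy top-k choice by expected bid, and the bidder names.

```python
def select_recipients(bidders, max_recipients, min_trust=0.5):
    """Choose which bidders receive user data for one auction.

    `bidders` is a list of dicts with hypothetical keys:
      'name'         - bidder identifier
      'expected_bid' - estimated bid value for this impression
      'trust'        - score in [0, 1] derived from past behavior
    Bidders below `min_trust` are excluded; among the rest, the
    `max_recipients` highest expected bids are kept to protect revenue.
    """
    trusted = [b for b in bidders if b["trust"] >= min_trust]
    ranked = sorted(trusted, key=lambda b: b["expected_bid"], reverse=True)
    return [b["name"] for b in ranked[:max_recipients]]


# Invented bidders for illustration only.
bidders = [
    {"name": "A", "expected_bid": 2.0, "trust": 0.9},
    {"name": "B", "expected_bid": 3.0, "trust": 0.2},  # high bid, poor record
    {"name": "C", "expected_bid": 1.5, "trust": 0.8},
    {"name": "D", "expected_bid": 1.0, "trust": 0.7},
]
recipients = select_recipients(bidders, max_recipients=2)
```

In this made-up case bidder B is excluded despite the highest expected bid, and user data goes only to A and C. The trade-off the abstract mentions shows up directly: lowering `min_trust` or raising `max_recipients` favors revenue, while the opposite favors privacy.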