    Online advertising: analysis of privacy threats and protection approaches

    Online advertising, the pillar of the “free” content on the Web, has revolutionized the marketing business in recent years by creating a myriad of new opportunities for advertisers to reach potential customers. The current advertising model builds upon an intricate infrastructure composed of a variety of intermediary entities and technologies whose main aim is to deliver personalized ads. For this purpose, a wealth of user data is collected, aggregated, processed and traded behind the scenes at an unprecedented rate. Despite the enormous value of online advertising, however, the intrusiveness and ubiquity of these practices prompt serious privacy concerns. This article surveys the online advertising infrastructure and its supporting technologies, and presents a thorough overview of the underlying privacy risks and the solutions that may mitigate them. We first analyze the threats and potential privacy attackers in this scenario of online advertising. In particular, we examine the main components of the advertising infrastructure in terms of tracking capabilities, data collection, aggregation level and privacy risk, and overview the tracking and data-sharing technologies employed by these components. Then, we conduct a comprehensive survey of the most relevant privacy mechanisms, and classify and compare them on the basis of their privacy guarantees and impact on the Web. Peer reviewed. Postprint (author's final draft).

    Keyword Targeting Optimization in Sponsored Search Advertising: Combining Selection and Matching

    In sponsored search advertising (SSA), advertisers need to select keywords and determine matching types for the selected keywords simultaneously, i.e., keyword targeting. An optimal keyword targeting strategy reaches the right population effectively. This paper addresses the keyword targeting problem, which is challenging because historical advertising performance indices are incomplete and SSA environments are highly uncertain. First, we construct a data distribution estimation model and apply a Markov Chain Monte Carlo method to infer unobserved indices (i.e., impressions and click-through rate) over three keyword matching types (i.e., broad, phrase and exact). Second, we formulate a stochastic keyword targeting model (BB-KSM) that combines keyword selection and keyword matching to maximize the expected profit under a chance constraint on the budget, and develop a branch-and-bound algorithm incorporating a stochastic simulation process for this model. Finally, based on a real-world dataset collected from field reports and logs of past SSA campaigns, computational experiments are conducted to evaluate the performance of our keyword targeting strategy. Experimental results show that (a) BB-KSM outperforms seven baselines in terms of profit; (b) BB-KSM shows its superiority as the budget increases, especially in situations with more keywords and keyword combinations; (c) the proposed data distribution estimation approach can effectively address the problem of incomplete performance indices over the three matching types and in turn significantly improves the performance of keyword targeting decisions. This research contributes to the SSA literature, and the results offer insights into keyword management for SSA advertisers. Comment: 38 pages, 4 figures, 5 tables
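
    The idea of maximizing expected profit under a chance constraint on the budget can be illustrated with a small sketch. It is not the paper's BB-KSM algorithm: exhaustive enumeration stands in for branch-and-bound, clicks are modeled as independent Bernoulli draws, and the `stats` layout, the field names and the confidence level `alpha` are all assumptions made for illustration.

```python
import random
from itertools import product

# "none" means the keyword is not selected at all.
MATCH_TYPES = ("none", "broad", "phrase", "exact")


def simulate(assignment, stats, rng):
    """Draw one random (profit, cost) outcome for a keyword -> match-type assignment."""
    profit = cost = 0.0
    for keyword, match in assignment.items():
        if match == "none":
            continue
        s = stats[keyword][match]
        # Assumption: each impression is clicked independently with probability ctr.
        clicks = sum(rng.random() < s["ctr"] for _ in range(s["impressions"]))
        cost += clicks * s["cpc"]
        profit += clicks * (s["value_per_click"] - s["cpc"])
    return profit, cost


def evaluate(assignment, stats, budget, n_sims=200, seed=0):
    """Estimate expected profit and P(cost <= budget) by stochastic simulation."""
    rng = random.Random(seed)  # same seed for every candidate, for comparability
    draws = [simulate(assignment, stats, rng) for _ in range(n_sims)]
    expected_profit = sum(p for p, _ in draws) / n_sims
    prob_within_budget = sum(c <= budget for _, c in draws) / n_sims
    return expected_profit, prob_within_budget


def best_assignment(stats, budget, alpha=0.9, n_sims=200):
    """Enumerate all assignments and keep the most profitable one that
    satisfies the chance constraint P(cost <= budget) >= alpha."""
    keywords = sorted(stats)
    best, best_profit = None, float("-inf")
    for combo in product(MATCH_TYPES, repeat=len(keywords)):
        assignment = dict(zip(keywords, combo))
        profit, prob = evaluate(assignment, stats, budget, n_sims)
        if prob >= alpha and profit > best_profit:
            best, best_profit = assignment, profit
    return best, best_profit
```

    Enumeration grows as four to the power of the number of keywords, which is why a pruning search such as the branch-and-bound method the abstract describes is needed beyond toy instances.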

    Audience Prospecting for Dynamic-Product-Ads in Native Advertising

    With yearly revenue exceeding one billion USD, the Yahoo Gemini native advertising marketplace serves more than two billion impressions daily to hundreds of millions of unique users. One of the fastest growing segments of Gemini native is dynamic-product-ads (DPA), where major advertisers, such as Amazon and Walmart, provide catalogs with millions of products for the system to choose from and present to users. The subject of this work is finding and expanding the right audience for each DPA ad, which is one of the many challenges DPA presents. Approaches such as targeting various user groups, e.g., users who have already visited the advertisers' websites (Retargeting), users who searched for certain products (Search-Prospecting), or users who reside in preferred locations (Location-Prospecting), have limited audience expansion capabilities. In this work we present two new approaches for audience expansion that also maintain predefined performance goals. The Conversion-Prospecting approach predicts DPA conversion rates based on Gemini native logged data, and calculates the expected cost-per-action (CPA) for determining users' eligibility for products and optimizing DPA bids in Gemini native auctions. To support new advertisers and products, the Trending-Prospecting approach matches trending products to users by learning their tendency towards products from events logged on advertisers' sites. The tendency scores indicate the popularity of the product and the similarity of the user to those who have previously engaged with this product. The two new prospecting approaches were tested online, serving real Gemini native traffic, and demonstrated DPA delivery and DPA revenue lifts while maintaining most traffic within the acceptable CPA range (i.e., the performance goal). After a successful testing phase, the proposed approaches are currently in production and serve all Gemini native traffic. Comment: In Proc. IEEE BigData 2023 (Industry and Government Program)
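
    To make the eligibility step concrete, the following is a minimal sketch of filtering products by expected CPA. The abstract does not give the formula used in Gemini native; here the expected CPA is assumed to be the cost per click divided by a predicted per-click conversion rate, and the function names, the `user_cvrs` mapping and the `target_cpa` threshold are hypothetical.

```python
def expected_cpa(cost_per_click, predicted_cvr):
    """Expected cost per action, assuming CPA = cost per click / conversion rate."""
    if predicted_cvr <= 0:
        return float("inf")  # no predicted conversions: the action is never reached
    return cost_per_click / predicted_cvr


def eligible_products(user_cvrs, cost_per_click, target_cpa):
    """Return the products whose expected CPA for this user meets the goal.

    user_cvrs maps product id -> predicted conversion rate for one user
    (hypothetical input; a real system would obtain it from a trained model).
    """
    return [
        product
        for product, cvr in sorted(user_cvrs.items())
        if expected_cpa(cost_per_click, cvr) <= target_cpa
    ]
```

    For example, with a cost per click of 0.5 and a target CPA of 20, a product with a predicted conversion rate of 0.05 would be eligible, while one with a rate of 0.01 would not.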

    Privacy in online advertising platforms

    Online advertising is consistently considered the pillar of the "free" content on the Web, since it is commonly the funding source of websites. Furthermore, the option of delivering personalized ads has turned advertising into a really valuable service for users, who receive ads tailored to their interests. Given its success in getting paying customers, online advertising is fueling a billion-dollar business. The current advertising model builds upon an intricate infrastructure whose main aim is to deliver personalized ads. For this purpose, a wealth of user data is collected, aggregated, processed and traded at an unprecedented rate. However, the intrusiveness and ubiquity of these practices prompt serious privacy concerns. In view of the inherent complexity behind the operation of ad platforms, privacy risks in the online advertising ecosystem could be studied from multiple perspectives. Naturally, most of the efforts unveiling these privacy issues concentrate on a specific entity, technology, behavior or context. However, such a segmented approach might underestimate the benefits of a wider vision of a systemic problem. Many privacy protection mechanisms have been proposed by industry and academia. The most popular ones resort to radical strategies that hinder the ad distribution process, thus seriously affecting the online advertising ecosystem. Others involve significantly changing the ecosystem, which unfortunately may not be feasible at present. Consequently, to encourage the adoption of privacy protection in this context, it is fundamental to propose mechanisms that balance the trade-off between user privacy and the web business model. First, this thesis deals with the need to have a wide perspective of the privacy risks for users within the online advertising ecosystem and the protection approaches available.
We survey the online advertising infrastructure and its supporting technologies, and present a thorough overview of the underlying privacy risks and the solutions that may mitigate them. Through a systematic effort, we analyze the threats and potential privacy attackers in this scenario of online advertising. Then, we conduct a comprehensive survey of the most relevant privacy mechanisms, and classify and compare them on the basis of their privacy guarantees and impact on the Web. Subsequently, we study the privacy risks derived from real-time bidding, a key enabling technology of modern online advertising. We experimentally explore the potential abuse of the process of user data sharing, necessary to support the auction-based system in online advertising. Accordingly, we propose a system to regulate the distribution of user tracking data to potentially interested entities, depending on their previous behavior. This consists in reducing the number of advertising agencies receiving user data. Doing so may affect the ad platform's revenue, so the proposed system is designed to maximize revenue while preventing abuse by advertising agencies to a large degree. Experimentally, the evaluation results suggest that this system is able to correct misbehaving entities, consequently enhancing user privacy. Finally, we analyze the impact of online advertising and tracking from the particular perspective of Ibero-America. We study the third-party and ad tracking triggered within local websites in this heterogeneous region, which has not previously been studied. We find that user location in this context affects privacy, since the intensity of third-party traffic, including advertising-related flows of information, varies from country to country when local web traffic is simulated, although the total number of entities behind this traffic seems stable. The type of content served by websites also affects the level of third-party tracking: publishers associated with the news and shopping categories generate more third-party traffic, and this intensity is exacerbated for the most popular sites.
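
    The behavior-dependent distribution of user data described in the thesis abstract can be sketched as follows. This is an illustration only and not the thesis's actual system: "previous behavior" is assumed here to mean how often a bidder actually bids after receiving user data, and the `Bidder` fields, the `min_participation` threshold and the top-`k` rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bidder:
    name: str
    expected_bid: float      # expected bid value, a proxy for platform revenue
    requests_received: int   # past bid requests (each one carries user data)
    bids_placed: int         # how many of those requests were answered with a bid

    @property
    def participation(self):
        """Fraction of received requests answered with a bid (1.0 if no history)."""
        if self.requests_received == 0:
            return 1.0
        return self.bids_placed / self.requests_received


def select_recipients(bidders, k, min_participation=0.1):
    """Share user data with at most k bidders: drop those that mostly collect
    data without bidding, then keep the ones with the highest expected bids."""
    trusted = [b for b in bidders if b.participation >= min_participation]
    return sorted(trusted, key=lambda b: b.expected_bid, reverse=True)[:k]
```

    Limiting the recipients to `k` reduces the number of agencies that see the data, while ranking by expected bid is one simple way to keep the revenue loss small.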

    Strategies of closed online advertising platforms: the cases of Google and Facebook

    This thesis studies closed ad platforms in the modern online advertising industry. Research in the field is still nascent, and the concept of a closed ad platform has not yet been established. The objective of the research was to discover the main factors determining the revenue of online advertising platforms and to understand why some publishers choose to establish their own closed ad platforms instead of selling their inventory to third-party ad platforms. The concept of a closed ad platform is defined by drawing on the existing online advertising literature and platform governance structure theory. Using the case study method, Google and Facebook were chosen as the cases, as they have driven most of the innovation in the field and quickly gained significant market share. In total, 47 people were interviewed for this study, most of them working for advanced online advertisers. Based on the interviews, a microeconomic mathematical formula is created for modeling an ad platform's net advertising revenue. The formula is used to identify the five main drivers of an ad platform's revenue, and each of them is studied in depth. The results suggest that the most important revenue drivers the ad platforms can affect are access to an active user base, the efficiency of ad serving and the comprehensiveness of measurement. Setting up a closed ad platform requires significant investments from a publisher and should only be done if it can improve the advertisers' results. After it has been established, a closed platform can leverage its position to collect user data and structured business data to optimize its performance further. The results provide a structured understanding of the main dynamics in the industry that can be used in decision-making and as a basis for future research on closed ad platforms.