Optimizing the frequency capping: a robust and reliable methodology to define the number of ads to Maximize ROAS
The goal of digital marketing is to connect advertisers with users who are interested in their products. This means serving ads to users, which can lead to a single user receiving hundreds of impressions of the same ad. Consequently, advertisers can set a maximum threshold on the number of impressions a user can receive, referred to as the frequency cap. However, low frequency caps mean that many users do not engage with the advertiser. By contrast, with high frequency caps, users may receive many ads, leading to annoyance and wasted budget. We build a robust and reliable methodology to define the number of ads that should be delivered to different users to maximize the ROAS and reduce the possibility that users get annoyed with the advertised brand. The methodology uses a novel technique to find the optimal frequency cap based on the number of non-clicked impressions rather than the traditional number of received impressions. This methodology is validated using simulations and large-scale datasets obtained from real ad campaign data. To sum up, our work proves that it is feasible to address frequency capping optimization as a business problem, and we provide a framework that can be used to configure efficient frequency capping values. The research leading to these results received funding from the European Union’s Horizon 2020 innovation action programme under the grant agreement No 871370 (PIMCITY project); the Ministerio de Economía, Industria y Competitividad, Spain, and the European Social Fund (EU), under the Ramón y Cajal programme (Grant RyC-2015-17732); the Ministerio de Ciencia e Innovación under the project ACHILLES (Grant PID2019-104207RB-I00); the Community of Madrid synergic project EMPATIA-CM (Grant Y2018/TCS-5046); and the Fundación BBVA under the project AERIS.
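As a rough illustration of the idea in the abstract above, the sketch below caps delivery on the number of non-clicked impressions rather than on total impressions, and then picks the cap with the highest ROAS from a set of candidates. The class `UserAdState`, the functions `can_serve`, `roas` and `best_cap`, and the shape of the `results` input are hypothetical and are not taken from the paper, whose actual methodology may differ.

```python
from dataclasses import dataclass


@dataclass
class UserAdState:
    """Per-user delivery counters for one ad (hypothetical structure)."""
    impressions: int = 0
    clicks: int = 0

    @property
    def non_clicked(self) -> int:
        # Impressions that did not result in a click.
        return self.impressions - self.clicks


def can_serve(state: UserAdState, cap: int) -> bool:
    """Allow another impression only while non-clicked impressions are below the cap."""
    return state.non_clicked < cap


def roas(revenue: float, spend: float) -> float:
    """Return on ad spend: revenue divided by spend (0.0 when nothing was spent)."""
    return revenue / spend if spend > 0 else 0.0


def best_cap(results: dict) -> int:
    """Pick the candidate cap with the highest ROAS.

    `results` maps each candidate cap to a (revenue, spend) pair that the
    caller measured or simulated for that cap (assumed input format).
    """
    return max(results, key=lambda cap: roas(*results[cap]))
```

Under this sketch, a user with 5 impressions and 2 clicks has 3 non-clicked impressions, so the user would still be eligible under a cap of 4 but not under a cap of 3.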
A large-scale analysis of Facebook's user-base and user engagement growth
Understanding the evolution of the user base as well as the user engagement of online services
is critical not only for the service operators but also for customers, investors, and users. While we can
find research works addressing this issue in online services, such as Twitter, MySpace, or Google+, such
detailed analysis is missing for Facebook, which is currently the largest online social network. This paper
presents the first detailed study on the demographic and geographic composition and evolution of the user
base and user engagement in Facebook over a period of three years. To this end, we have implemented a
measurement methodology that leverages the marketing API of Facebook to retrieve actual information about
the number of total users and the number of daily active users across 230 countries and age groups ranging
between 13 and 65+. The conducted analysis reveals that Facebook is still growing and geographically
expanding. Moreover, the growth pattern is heterogeneous across age groups, genders, and geographical
regions. In particular, from a demographic perspective, Facebook shows the lowest growth among
adolescents. The gender-based analysis shows that growth among men is still higher than growth among women.
Our geographical analysis reveals that while Facebook growth is slower in western countries, it is fastest
in developing countries, mainly located in Africa and Central Asia; analyzing penetration in
these countries also shows that they are at earlier stages of Facebook adoption. Leveraging
external socioeconomic datasets, we also show that this heterogeneous growth can be characterized
by indicators such as availability of and access to the Internet, Facebook popularity, and factors related to
population growth and gender inequality. The work of Y. M. Kassa was supported by the European H2020 Project TYPES under Grant 653449. The work of R. Cuevas was
supported in part by the European H2020 Project SMOOTH under Grant 786741, in part by the Spanish Ministry of Economy and
Competitiveness, through the 5GCity Project, under Grant TEC2016-76795-C6-3-R, and in part by the La Caixa Foundation under
Agreement LCF/PR/MIT17/11820009. The work of A. Cuevas was supported in part by the Ministerio de Economía, Industria y
Competitividad, Spain, in part by the European Social Fund through the Ramón y Cajal programme under Grant RyC-2015-17732, and in part by the
Ministerio de Economía, Industria y Competitividad, Spain, through the Project TEXEO, under Grant TEC2016-80339-R.
An ad-driven measurement technique for monitoring the browser marketplace
In this paper we present a novel active measurement methodology for monitoring the browser market landscape. It leverages the display ads delivered through online advertising campaigns to collect the browser brand and version of the device receiving the ad. While providing accuracy similar to that of traditional techniques based on passive measurements, our methodology offers some advantages: (i) a lower entry barrier for researchers and practitioners interested in measuring the browser marketplace; (ii) support for targeted measurements, which can be useful to fix biases in the data sample or to analyze specific aspects of the browser market. We analyze the performance, accuracy, and capabilities of our methodology through real experiments that overall produced more than 6M measurements. This work was supported in part by Ministerio de Economía y Empresa, Spain, under the Grant RyC-2015-17732, and in part by the
European H2020 projects SMOOTH (786741) and PIMCITY (871370).
Digital Marketing Attribution: Understanding the User Path
This article belongs to the Section Computer Science & Engineering. Digital marketing is a profitable business generating annual revenue of over USD 200B and year-over-year growth of over 20%. The definition of efficient marketing investment strategies across different types of channels and campaigns is a key task in digital marketing. Attribution models are an instrument used to assess the return on investment of different channels and campaigns so that they can assist in the decision-making process. A new generation of more powerful data-driven attribution models has entered the market in recent years. Unfortunately, its adoption is slower than expected. One of the main reasons is that the industry lacks a proper understanding of these models and how to configure them. To solve this issue, in this paper, we present an empirical study to better understand the key properties of user-paths and their impact on attribution models. Our analysis is based on a large-scale dataset including more than 95M user-paths from real advertising campaigns of an international hotel group.
The main contribution of the paper is a set of recommendations to build accurate, interpretable and computationally efficient attribution models, such as: (i) the use of linear regression, an interpretable machine learning algorithm, to build accurate attribution models; (ii) user-paths including around 12 events are enough to produce accurate models; (iii) the recency of events considered in the user-paths is important for the accuracy of the model. The research leading to these results has received funding from: the European Union’s Horizon 2020 innovation action programme under grant agreement No 786741 (SMOOTH project) and grant agreement No 871370 (PIMCITY project); the Ministerio de Economía, Industria y Competitividad, Spain, and the European Social Fund (EU), under the Ramón y Cajal programme (Grant RyC-2015-17732); the Ministerio de Ciencia e Innovación under the project ACHILLES (Grant PID2019-104207RB-I00); the Community of Madrid synergic project EMPATIA-CM (Grant Y2018/TCS-5046).
Digital contact tracing: large-scale geolocation data as an alternative to bluetooth-based Apps failure
The currently deployed contact-tracing mobile apps have failed as an efficient solution in the context of the COVID-19 pandemic. None of them has managed to attract the number of active users required to achieve efficient operation. This urges the research community to re-open the debate and explore new avenues that lead to efficient contact-tracing solutions. In this paper, we contribute to this debate with an alternative contact-tracing solution that leverages the geolocation information already available to BigTech companies that have large penetration rates in most of the countries adopting contact-tracing mobile apps. Our solution provides sufficient privacy guarantees to protect the identity of infected users as well as to preclude Health Authorities from obtaining the contact graph of individuals. The research leading to these results received funding from the European Union’s Horizon 2020 innovation action programme under the grant agreement No 871370 (PIMCITY project); the Ministerio de Economía, Industria y Competitividad, Spain, and the European Social Fund (EU), under the Ramón y Cajal programme (Grant RyC-2015-17732); the Ministerio de Educación, Cultura y Deporte, Spain, through the FPU programme (Grant FPU16/05852); the Ministerio de Ciencia e Innovación under the project ACHILLES (Grant PID2019-104207RB-I00); the Community of Madrid synergic project EMPATIA-CM (Grant Y2018/TCS-5046); the Fundación BBVA under the project AERIS; and the NSERC Discovery Grant 2016-04521.
A deep dive into the accuracy of IP geolocation databases and its impact on online advertising
The quest for an ever more personalized Internet experience relies on enriched contextual information about each user. Online advertising also follows this approach. Location is one of the pieces of context information that advertising stakeholders leverage. However, when this information is not directly available from the end users, advertising stakeholders infer it using geolocation databases, which map IP addresses to positions on Earth. The accuracy of this approach has often been questioned in the past; however, a reality check with an advertising stakeholder shows that this technique accounts for a large fraction of the served advertisements. In this paper, we revisit prior work in the field, most of which dates from almost a decade ago, through the lens of big data. More specifically, we (i) benchmark two commercial Internet geolocation databases, evaluating the quality of their information using a ground-truth database of user positions containing over 2 billion samples, (ii) analyze the internals of these databases, devising a theoretical upper bound for the quality of the Internet geolocation approach, and (iii) run an empirical study that unveils the monetary impact of this technology by considering the costs associated with a real-world ad impressions dataset. This work was supported in part by European Union's Horizon 2020 innovation action programme under the PIMCITY Project under Grant 871370, in part by TESTABLE Project under Grant 101019206, in part by Agencia Estatal de Investigacion (AEI) under the ACHILLES Project under Grants PID2019-104207RB-I00/AEI/10.13039/501100011033, in part by the Spanish Ministry of Economic Affairs and Digital Transformation and European Union-Next GenerationEU through the UNICO 5G I+D 6G-RIEMANN-FR, in part by the agreement between the Community of Madrid and the Universidad Carlos III de Madrid for the funding of research projects on SARS-CoV-2 and COVID-19 disease, through the project "Multi-source and multi-method prediction to support COVID-19 policy decision making", which was supported with REACT-EU funds from the European Regional Development Fund "A way of making Europe", and in part by TAPTAP-UC3M Chair in advanced AI and Data Science applied to advertising and marketing.
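To make the benchmarking step in the abstract above more concrete, the sketch below computes the great-circle distance between a database-reported position and a ground-truth position, and the share of samples whose error falls under a threshold. The haversine formula is standard, but using it as the error metric, along with the function names `haversine_km` and `share_within`, is an assumption for illustration; the paper may evaluate accuracy differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def share_within(errors_km, threshold_km):
    """Fraction of geolocation errors that are at most `threshold_km`."""
    if not errors_km:
        return 0.0
    return sum(1 for e in errors_km if e <= threshold_km) / len(errors_km)
```

A caller would compute one error per (database position, ground-truth position) pair with `haversine_km` and then summarize the list with `share_within` at thresholds of interest, for example a city-scale radius.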
A Model to Quantify the Success of a Sybil Attack Targeting RELOAD/Chord Resources
The Sybil attack is one of the most harmful security threats for distributed hash tables (DHTs). This attack is not only a theoretical one; it has been spotted "in the wild", and even performed by researchers themselves to demonstrate its feasibility. In this letter we analyse the Sybil attack whose objective is to make the targeted resource inaccessible to any user of a Chord DHT by replacing all the replica nodes that store it with sybils. In particular, we propose a simple yet complete model that provides the number of random node-IDs that an attacker would need to generate in order to succeed with a certain probability. Therefore, our model makes it possible to quantify the cost of performing a Sybil resource attack on RELOAD/Chord DHTs more accurately than previous works, and thus establishes the basis to measure the effectiveness of different solutions proposed in the literature to prevent or mitigate Sybil attacks. This work has been
partially supported by the EU FP7 TREND project (257740), the Spanish
T2C2 project (TIN2008-06739-C04-01) and the Madrid MEDIANET project
(S-2009/TIC-1468). European Community's Seventh Framework Program.
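The question posed in the abstract above (how many random node-IDs an attacker needs in order to succeed with a given probability) can be explored with a small Monte Carlo sketch. This is not the model from the letter: it assumes a unit-length ring, uniformly random honest and sybil IDs, and a resource replicated on the first `num_replicas` successors of its key, so the attack succeeds when at least that many sybil IDs fall between the key and its closest honest successor. All function and parameter names are hypothetical.

```python
import random


def attack_succeeds(num_honest, num_replicas, num_sybil_ids, rng):
    """Simulate one attack on a unit-length ring with the target key at position 0."""
    # Clockwise distance from the key to its closest honest successor
    # (requires num_honest >= 1).
    gap = min(rng.random() for _ in range(num_honest))
    # Count the random sybil IDs that land between the key and that node.
    inside = sum(1 for _ in range(num_sybil_ids) if rng.random() < gap)
    # Assumption: the resource lives on the first `num_replicas` successors
    # of the key, so the attacker needs that many sybils inside the gap.
    return inside >= num_replicas


def success_probability(num_honest, num_replicas, num_sybil_ids,
                        trials=1000, seed=0):
    """Estimate the attack success probability by repeated simulation."""
    rng = random.Random(seed)
    wins = sum(
        attack_succeeds(num_honest, num_replicas, num_sybil_ids, rng)
        for _ in range(trials)
    )
    return wins / trials
```

Under these simplifying assumptions and with a single replica, the estimate should approach m / (N + m) for N honest nodes and m sybil IDs, which gives a quick sanity check for the simulation.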
On exploiting social relationship and personal background for content discovery in P2P networks
Content discovery is a critical issue in unstructured Peer-to-Peer (P2P) networks, as nodes maintain only local network information. Similarly, without global information about a human network, one can still find a specific person through friends by using social information. Therefore, in this paper, we investigate how social information (i.e., friends and background information) could benefit content discovery in P2P networks. We collect social information from 384,494 user profiles on Facebook, and build a social P2P network model based on the empirical analysis. In this model, we enrich nodes in P2P networks with social information and link nodes via their friendships. Each node extracts two types of social features (Knowledge and Similarity) and assigns more weight to the friends that have higher similarity and more knowledge. Furthermore, we present a novel content discovery algorithm which can explore the latent relationships among a node’s friends. A node computes stable scores for all its friends based on their weight and the latent relationships. It then selects the top friends with the highest scores to query for content. Extensive experiments validate the performance of the proposed mechanism. In particular, for personal interest searches, the proposed mechanism can achieve a Search Success Rate of 100% by selecting the top 20 friends within two hops. It also achieves 6.5 Hits on average, an 8x improvement over the compared methods. This work has been funded by the European Union under the project eCOUSIN (EU-FP7-318398) and the project SITAC (ITEA2-11020). It also has been partially funded by the Spanish Government through the MINEC eeCONTENT project (TEC2011-29688-C02-02).
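The friend-selection step described in the abstract above can be sketched as a simple top-k ranking. This sketch assumes each friend comes with a similarity value and a knowledge value, and it combines them by multiplication. Both the combination rule and the names `friend_weight` and `top_friends` are hypothetical. It also leaves out the latent relationships among friends and the stable-score computation that the paper's algorithm uses.

```python
import heapq


def friend_weight(similarity, knowledge):
    """Combine the two social features into one weight.

    Assumption: a plain product, so a friend needs both high similarity
    and high knowledge to rank well. The paper's weighting may differ.
    """
    return similarity * knowledge


def top_friends(friends, k):
    """Return the k friends with the highest weight, best first.

    `friends` maps a friend identifier to a (similarity, knowledge) pair.
    """
    return heapq.nlargest(
        k, friends, key=lambda name: friend_weight(*friends[name])
    )
```

A node would then forward its content query only to the friends returned by `top_friends`, instead of flooding all of its neighbours.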
Unique on Facebook: Formulation and Evidence of (Nano)targeting Individual Users with non-PII Data
Proceedings of: ACM Internet Measurement Conference (IMC '21), November 2-4, 2021, Virtual Event, USA. The privacy of an individual is bounded by the ability of a third party to reveal their identity. Certain data items such as a passport ID or a mobile phone number may be used to uniquely identify a person. These are referred to as Personal Identifiable Information (PII) items. Previous literature has also reported that, in datasets including millions of users, a combination of several non-PII items (which alone are not enough to identify an individual) can uniquely identify an individual within the dataset. In this paper, we define a data-driven model to quantify the number of interests from a user that make them unique on Facebook. To the best of our knowledge, this represents the first study of individuals’ uniqueness at the world population scale. Moreover, users’ interests are actionable non-PII items that can be used to define ad campaigns and deliver tailored ads to Facebook users. We run an experiment through 21 Facebook ad campaigns that target three of the authors of this paper to prove that, if an advertiser knows enough interests from a user, the Facebook Advertising Platform can be systematically exploited to deliver ads exclusively to a specific user. We refer to this practice as nanotargeting.
Finally, we discuss the risks associated with nanotargeting, such as psychological persuasion, user manipulation, or blackmailing, and provide easily implementable countermeasures to preclude attacks based on nanotargeting campaigns on Facebook. This research received funding from the European Union’s Horizon 2020 innovation action programme under the PIMCITY project (Grant 871370) and the TESTABLE project (Grant 101019206); the Ministerio de Economía, Industria y Competitividad, Spain, and the European Social Fund (EU), under the Ramón y Cajal programme (Grant RyC-2015-17732); the Ministerio de Educación, Cultura y Deporte, Spain, through the FPU programme (Grant FPU16/05852); the Agencia Estatal de Investigación (AEI) under the ACHILLES project (Grant PID2019-104207RB-I00/AEI/10.13039/501100011033); the Community of Madrid synergic project EMPATIA-CM (Grant Y2018/TCS-5046); the Fundación BBVA under the project AERIS; and the Vienna Science and Technology Fund through the project “Emotional Well-Being in the Digital Society” (Grant VRG16-005).
Does Facebook use sensitive data for advertising purposes? worldwide analysis and GDPR Impact
Citizens worldwide have demonstrated serious concerns regarding the management of personal information by online services. For instance, the 2015 Eurobarometer about data protection reveals that 63% of citizens within the European Union (EU) do not trust online businesses, more than half do not like providing personal information in return for free services, and 53% do not like that Internet companies use their personal information in tailored advertising