14 research outputs found

    Estimating the maximum expected value in continuous reinforcement learning problems

    Get PDF
    This paper is about the estimation of the maximum expected value of an infinite set of random variables. This estimation problem is relevant in many fields, such as Reinforcement Learning (RL). In RL it is well known that, in some stochastic environments, a bias in the estimation error can increase the approximation error step by step, leading to large overestimates of the true action values. Recently, some approaches have been proposed to reduce such bias in order to obtain better action-value estimates, but they are limited to finite problems. In this paper, we leverage the recently proposed weighted estimator and Gaussian process regression to derive a new method that natively handles infinitely many random variables. We show how these techniques can be used to face both continuous-state and continuous-action RL problems. To evaluate the effectiveness of the proposed approach, we perform empirical comparisons with related approaches.
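The overestimation phenomenon this paper addresses can be reproduced in a few lines: the Maximum Estimator, which takes the maximum of the sample means, is positively biased even when every true expected value is zero. A minimal sketch of that bias (the variable counts and sample sizes below are illustrative, not taken from the paper):

```python
import random
import statistics

def max_of_means_bias(n_vars=10, n_samples=20, n_trials=2000, seed=0):
    """Estimate E[max_i mu_hat_i] when every true mean is 0.

    Each trial draws n_samples Gaussian samples per variable, computes the
    sample means, and takes their maximum; averaging over trials estimates
    the expected value of this Maximum Estimator. Since the true maximum
    expected value is 0, any positive result is pure estimation bias.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        sample_means = [
            statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n_samples))
            for _ in range(n_vars)
        ]
        estimates.append(max(sample_means))
    return statistics.fmean(estimates)

bias = max_of_means_bias()
# The true maximum expected value is 0, yet the estimate is clearly positive.
```

The weighted estimator studied in the paper reduces exactly this kind of bias; the snippet only demonstrates why a correction is needed.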

    A Combinatorial-Bandit Algorithm for the Online Joint Bid/Budget Optimization of Pay-per-Click Advertising Campaigns

    Get PDF
    Pay-per-click advertising includes various formats (e.g., search, contextual, and social) with a total investment of more than 140 billion USD per year. An advertising campaign is composed of several subcampaigns, each with a different ad, and a cumulative daily budget. The allocation of the ads is governed by auction mechanisms. In this paper, we propose, for the first time to the best of our knowledge, an algorithm for the online joint bid/budget optimization of pay-per-click multi-channel advertising campaigns. We formulate the optimization problem as a combinatorial bandit problem, in which we use Gaussian Processes to estimate stochastic functions, Bayesian bandit techniques to address the exploration/exploitation problem, and a dynamic programming technique to solve a variation of the Multiple-Choice Knapsack problem. We experimentally evaluate our algorithm both in simulation, using a synthetic setting generated from real Yahoo! data, and in a real-world application over an advertising period of two months.
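The dynamic-programming step can be illustrated on a deterministic Multiple-Choice Knapsack instance: each subcampaign offers a set of (budget, expected value) options, exactly one option must be chosen per subcampaign, and the chosen budgets must fit within the cumulative daily budget. A minimal sketch with integer budgets (function and data names are illustrative; the paper's algorithm additionally estimates the option values with Gaussian Processes and Bayesian bandit techniques rather than taking them as given):

```python
def allocate_budget(subcampaigns, total_budget):
    """Multiple-Choice Knapsack via dynamic programming.

    subcampaigns: list of option lists; each option is a pair
                  (budget_cost, expected_value) with integer cost.
    Assumes at least one feasible assignment exists.
    Returns (best_total_value, chosen_option_index_per_subcampaign).
    """
    NEG = float("-inf")
    # dp[b] = best value achievable spending exactly budget b so far.
    dp = [0.0] + [NEG] * total_budget
    backtracks = []
    for options in subcampaigns:
        new_dp = [NEG] * (total_budget + 1)
        back = [None] * (total_budget + 1)
        for b in range(total_budget + 1):
            if dp[b] == NEG:
                continue
            for i, (cost, value) in enumerate(options):
                nb = b + cost
                if nb <= total_budget and dp[b] + value > new_dp[nb]:
                    new_dp[nb] = dp[b] + value
                    back[nb] = (b, i)
        dp = new_dp
        backtracks.append(back)
    # Pick the best reachable budget, then backtrack the chosen options.
    best_b = max(range(total_budget + 1), key=lambda b: dp[b])
    picks = []
    b = best_b
    for back in reversed(backtracks):
        prev_b, i = back[b]
        picks.append(i)
        b = prev_b
    picks.reverse()
    return dp[best_b], picks
```

For example, with two subcampaigns offering options `[(1, 3.0), (2, 5.0)]` and `[(1, 4.0), (3, 7.0)]` and a daily budget of 3, the best feasible choice spends 2 + 1 units for a value of 9.0.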

    Targeting Optimization for Internet Advertising by Learning from Logged Bandit Feedback

    No full text
    In the last two decades, online advertising has become the most effective way to sponsor a product or an event. The success of this advertising format is mainly due to the capability of Internet channels to reach a broad audience and to target different groups of users with specific sponsored announcements. This is of paramount importance for media agencies, companies whose primary goal is to design ad campaigns that target only those users who are interested in the sponsored product, thus avoiding unnecessary costs due to displaying ads to uninterested users. In the present work, we develop an automatic method to find the best user targets (a.k.a. contexts) that a media agency can use in a given Internet advertising campaign. More specifically, we formulate the problem of target optimization as a Learning from Logged Bandit Feedback (LLBF) problem, and we propose the TargOpt algorithm, which uses a tree expansion of the target space to learn the partition that efficiently maximizes the campaign revenue. Furthermore, since the problem of finding the optimal target is intrinsically exponential in the number of features, we propose a tree-search method, called A-TargOpt, and two heuristics to drive the tree expansion, aiming at providing an anytime solution. Finally, we present empirical evidence, on both synthetically generated and real-world data, that our algorithms provide a practical solution for finding effective targets for Internet advertising.
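The tree-expansion idea can be sketched on binary features: starting from the whole audience, a node is split on the feature whose better side improves the average per-user profit estimated from the logged feedback, and only leaves with positive estimated profit are kept as targets. A toy sketch (the function names and the greedy splitting rule are illustrative simplifications, not TargOpt itself):

```python
def best_targets(logs, features):
    """Greedy tree expansion over a binary feature space (sketch).

    logs: list of (context, profit) pairs, where context maps each
          feature name to 0 or 1 and profit is the logged reward.
    Returns a list of (feature_assignment, mean_profit) leaves whose
    average logged profit is positive, i.e. the contexts worth targeting.
    """
    def mean_profit(subset):
        return sum(p for _, p in subset) / len(subset)

    def expand(subset, remaining, fixed):
        base = mean_profit(subset)
        best = None
        for f in remaining:
            left = [r for r in subset if r[0][f] == 0]
            right = [r for r in subset if r[0][f] == 1]
            if not left or not right:
                continue
            # Split only if targeting the better side beats the whole node.
            gain = max(mean_profit(left), mean_profit(right))
            if gain > base and (best is None or gain > best[0]):
                best = (gain, f, left, right)
        if best is None:
            # Leaf: keep it as a target only if it is profitable on average.
            return [(dict(fixed), base)] if base > 0 else []
        _, f, left, right = best
        rest = [g for g in remaining if g != f]
        return (expand(left, rest, {**fixed, f: 0})
                + expand(right, rest, {**fixed, f: 1}))

    return expand(logs, list(features), {})
```

On logged data where only users with feature `a = 1` yield positive profit, the sketch recovers exactly those contexts as targets.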

    Dynamic Pricing with Volume Discounts in Online Settings

    No full text
    According to the main international reports, more pervasive industrial and business-process automation, enabled by machine learning and advanced analytic tools, will unlock more than 14 trillion USD worldwide annually by 2030. In the specific case of pricing problems, the class of problems we investigate in this paper, the estimated unlocked value will be about 0.5 trillion USD per year. In particular, this paper focuses on pricing in e-commerce when the objective function is profit maximization and only transaction data are available. This setting is one of the most common in real-world applications. Our work aims to find a pricing strategy that defines optimal prices at different volume thresholds to serve different classes of users. Furthermore, we face the major challenge, common in real-world settings, of dealing with the limited data available. We design a two-phase online learning algorithm, namely PVD-B, capable of exploiting the data incrementally in an online fashion. The algorithm first estimates the demand curve and retrieves the optimal average price, and subsequently offers discounts to differentiate the prices for each volume threshold. We ran a real-world 4-month-long A/B testing experiment in collaboration with an Italian e-commerce company, in which our algorithm PVD-B (the A configuration) was compared with human pricing specialists (the B configuration). At the end of the experiment, our algorithm produced a total turnover of about 300,000 EUR, outperforming the B configuration by about 55%. The Italian company we collaborated with has adopted our algorithm for more than 1,200 products since January 2022.
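The two phases can be sketched under a simple parametric assumption: phase one fits a demand curve to the transaction data and derives the profit-maximizing base price, and phase two derives discounted prices per volume threshold. All function names, the linear-demand assumption, and the fixed 5% discount step below are illustrative, not the paper's actual estimator:

```python
def fit_linear_demand(prices, demands):
    """Least-squares fit of demand(p) = a - b * p from transaction data."""
    n = len(prices)
    mp = sum(prices) / n
    md = sum(demands) / n
    cov = sum((p - mp) * (d - md) for p, d in zip(prices, demands))
    var = sum((p - mp) ** 2 for p in prices)
    slope = cov / var                 # expected to be negative
    return md - slope * mp, -slope    # (a, b) with demand = a - b * p

def optimal_price(a, b, unit_cost):
    """Maximise profit (p - c)(a - b p); the closed form is
    p* = (a + b c) / (2 b), from setting the derivative to zero."""
    return (a + b * unit_cost) / (2 * b)

def volume_prices(base_price, thresholds, discount_step=0.05):
    """Phase two (illustrative): a fixed relative discount per volume tier."""
    return {t: base_price * (1 - discount_step * i)
            for i, t in enumerate(thresholds)}
```

For instance, transactions at prices 10 and 20 with demands 80 and 60 fit `demand(p) = 100 - 2p`; with unit cost 10 the profit-maximizing base price is 30, and higher volume tiers then receive progressively discounted prices.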