Search CORE

8 research outputs found

Optimism Based Exploration in Large-Scale Recommender Systems

Author: Guo Hongbo
Naeff Ruben
Nikulkov Alex
Zhu Zheqing
Publication date: 05/04/2023

Bandit learning algorithms have been an increasingly popular design choice for recommender systems. Despite the strong interest in bandit learning from the community, there remain multiple bottlenecks that prevent many bandit learning approaches from productionalization. Two of the most important bottlenecks are scaling to multi-task and A/B testing. Classic bandit algorithms, especially those leveraging contextual information, often require reward for uncertainty estimation, which hinders their adoption in multi-task recommender systems. Moreover, unlike supervised learning algorithms, bandit learning algorithms place great emphasis on the data collection process through their explorative nature. Such explorative behavior induces unfair evaluation for bandit learning agents in a classic A/B test setting. In this work, we present a novel design of production bandit learning life-cycle for recommender systems, along with a novel set of metrics to measure their efficiency in user exploration. We show through large-scale production recommender system experiments and in-depth analysis that our bandit agent design improves personalization for the production recommender system and our experiment design fairly evaluates the performance of bandit learning algorithms.

arXiv.org e-Print Archive

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

Author: Bhandari Jalaj
He Yuchen
Korenkevych Dmytro
Liu Fan
Nikulkov Alex
Xu Ruiyang
Zhu Zheqing
Publication date: 30/07/2023

Auction-based recommender systems are prevalent in online advertising platforms, but they are typically optimized to allocate recommendation slots based on immediate expected return metrics, neglecting the downstream effects of recommendations on user behavior. In this study, we employ reinforcement learning to optimize for long-term return metrics in an auction-based recommender system. Utilizing temporal difference learning, a fundamental reinforcement learning algorithm, we implement a one-step policy improvement approach that biases the system towards recommendations with higher long-term user engagement metrics. This optimizes value over long horizons while maintaining compatibility with the auction framework. Our approach is grounded in dynamic programming ideas which show that our method provably improves upon the existing auction-based base policy. Through an online A/B test conducted on an auction-based recommender system which handles billions of impressions and users daily, we empirically establish that our proposed method outperforms the current production system in terms of long-term user engagement metrics.

arXiv.org e-Print Archive

Assessing the Impact of U.S. Food Assistance Delivery Policies on Child Mortality in Northern Kenya

Author: Barrett Christopher B.
Mude Andrew G.
Nikulkov Alex
Wein Lawrence M.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 10/03/2023

CGSpace

Assessing the Impact of U.S. Food Assistance Delivery Policies on Child Mortality in Northern Kenya

Author: Alex Nikulkov (3595562)
Andrew G. Mude (3595565)
Christopher B. Barrett (545374)
Lawrence M. Wein (558302)
Publication date: 20/12/2016

The U.S. is the main country in the world that delivers its food assistance primarily via transoceanic shipments of commodity-based in-kind food. This approach is costlier and less timely than cash-based assistance, which includes cash transfers, food vouchers, and local and regional procurement, where food is bought in or nearby the recipient country. The U.S.’s approach is exacerbated by a requirement that half of its transoceanic food shipments need to be sent on U.S.-flag vessels. We estimate the effect of these U.S. food assistance distribution policies on child mortality in northern Kenya by formulating and optimizing a supply chain model. In our model, monthly orders of transoceanic shipments and cash-based interventions are chosen to minimize child mortality subject to an annual budget constraint and to policy constraints on the allowable proportions of cash-based interventions and non-US-flag shipments. By varying the restrictiveness of these policy constraints, we assess the impact of possible changes in U.S. food aid policies on child mortality. The model includes an existing regression model that uses household survey data and geospatial data to forecast the mean mid-upper-arm circumference Z scores among children in a community, and allows food assistance to increase Z scores, and Z scores to influence mortality rates. We find that cash-based interventions are a much more powerful policy lever than the U.S.-flag vessel requirement: switching to cash-based interventions reduces child mortality from 4.4% to 3.7% (a 16.2% relative reduction) in our model, whereas eliminating the U.S.-flag vessel restriction without increasing the use of cash-based interventions generates a relative reduction in child mortality of only 1.1%.
The great majority of the gains achieved by cash-based interventions are due to their reduced cost, not their reduced delivery lead times; i.e., the reduction of shipping expenses allows for more food to be delivered, which reduces child mortality.

Directory of Open Access Journals

PubMed Central

FigShare

Parameter values.

Author: Alex Nikulkov (3595562)
Andrew G. Mude (3595565)
Christopher B. Barrett (545374)
Lawrence M. Wein (558302)

Parameter values.

FigShare

The interrelationships among the components of the model.

Author: Alex Nikulkov (3595562)
Andrew G. Mude (3595565)
Christopher B. Barrett (545374)
Lawrence M. Wein (558302)

The interrelationships among the components of the model.

FigShare

Dependence of the annual mortality rate on the proportion of food assistance utilizing cash-based interventions (l) and the proportion of transoceanic shipments employing non-US-flag carriers (p).

Author: Alex Nikulkov (3595562)
Andrew G. Mude (3595565)
Christopher B. Barrett (545374)
Lawrence M. Wein (558302)

The current U.S. policy is represented by l = 0.65 and p = 0.5, the elimination of the U.S.-flag vessel requirement corresponds to p = 1.0, and l = 1.0 corresponds to the U.S. switching entirely to cash-based interventions.

FigShare

Assessing the Impact of U.S. Food Assistance Delivery Policies on Child Mortality in Northern Kenya

Author: A Abdulai
AG Mude
Alex Nikulkov
Andrew G. Mude
CB Barrett
Christopher B. Barrett
DC Heath
DP Bertsekas
E Bageant
E Lentz
EC Lentz
G Smith
G Tadesse
H Michelson
J Upton
JH Requeja
L Liu
Lawrence M. Wein
M De Onis
M Garenne
N Nunn
P Harvey
R Lozano
R Schnepf
RN Srivastava
S Isanaka
SM Fishman
WJ Violette
Y Yang
Zulfiqar A. Bhutta
Publication venue: 'Public Library of Science (PLoS)'

Crossref