
    Optimal No-regret Learning in Repeated First-price Auctions

    Full text link
    We study online learning in repeated first-price auctions with censored feedback, where a bidder, observing only the winning bid at the end of each auction, learns to bid adaptively in order to maximize her cumulative payoff. To achieve this goal, the bidder faces a challenging dilemma: if she wins the auction (the only way to achieve a positive payoff), then she cannot observe the highest bid of the other bidders, which we assume is drawn i.i.d. from an unknown distribution. This dilemma, despite being reminiscent of the exploration-exploitation trade-off in contextual bandits, cannot be directly addressed by the existing UCB or Thompson sampling algorithms in that literature, mainly because, contrary to the standard bandit setting, nothing about the environment can be learned here when a positive reward is obtained. In this paper, by exploiting the structural properties of first-price auctions, we develop the first learning algorithm that achieves an $O(\sqrt{T}\log^2 T)$ regret bound when the bidder's private values are stochastically generated. We do so by providing an algorithm for a general class of problems, which we call monotone group contextual bandits, for which the same regret bound is established under stochastically generated contexts. Further, by a novel lower-bound argument, we establish an $\Omega(T^{2/3})$ lower bound for the case where the contexts are adversarially generated, thus highlighting the impact of the context-generation mechanism on the fundamental learning limit. Despite this, we further exploit the structure of first-price auctions and develop a learning algorithm that operates sample-efficiently (and computationally efficiently) in the presence of adversarially generated private values. We establish an $O(\sqrt{T}\log^3 T)$ regret bound for this algorithm, hence providing a complete characterization of optimal learning guarantees for this problem.
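    As an illustration of the censored-feedback structure described above (not the paper's algorithm), the sketch below simulates repeated first-price auctions with a naive bidder that estimates the competing-bid distribution only from losing rounds; every name and parameter in it (simulate_auctions, the Beta competing-bid distribution, the exploration rule) is a hypothetical choice for exposition.

```python
# Illustrative sketch only (not the paper's method): repeated first-price
# auctions with censored feedback and a naive empirical-CDF bidder.
import numpy as np

rng = np.random.default_rng(0)

def simulate_auctions(n_rounds=10_000, bid_grid=np.linspace(0, 1, 101)):
    observed_m = []          # competing bids revealed only on losing rounds
    total_payoff = 0.0
    for _ in range(n_rounds):
        v = rng.uniform()                 # stochastic private value
        m = rng.beta(2, 2)                # unknown highest competing bid
        if len(observed_m) < 50 or rng.uniform() < 0.05:
            b = rng.uniform(0, v)         # occasional random exploration bid
        else:
            emp = np.asarray(observed_m)
            # empirical estimate of P(win) = P(m < b); note it is biased,
            # since it uses losing rounds only: exactly the censoring
            # dilemma that the paper's algorithm is designed to handle
            win_prob = (emp[None, :] < bid_grid[:, None]).mean(axis=1)
            b = bid_grid[int(np.argmax((v - bid_grid) * win_prob))]
        if b > m:
            total_payoff += v - b         # win: payoff earned, m stays hidden
        else:
            observed_m.append(m)          # lose: the winning bid m is observed
    return total_payoff

print("cumulative payoff:", simulate_auctions())
```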

    Development of Hypoxia Trapping Enhanced BB2R-Targeted Radiopharmaceutics for Prostate Cancer

    Get PDF
    The Gastrin-Releasing Peptide Receptor (BB2r) has been investigated as a diagnostic and therapeutic target for prostate and other cancers due to its high expression level on neoplastic relative to normal tissues. A variety of BB2r-targeted agents have been developed utilizing the bombesin (BBN) peptide, which has shown nanomolar binding affinity for human BB2r. However, as with most low-molecular-weight, receptor-targeted drugs, a major challenge to clinical translation of BB2r-targeted agents is low retention at the tumor site due to intrinsically high diffusion and efflux rates. Our laboratory seeks to address this deficiency by developing synthetic approaches to selectively increase retention of BB2r-targeted agents in prostate cancer. Hypoxic regions commonly exist in prostate tumors and many other cancers due to a chaotic vascular architecture which impedes delivery of oxygen. In this dissertation, we explore the incorporation of nitroimidazoles, a class of hypoxia-selective prodrugs that irreversibly bind to intracellular nucleophiles in hypoxic tissues, into the BB2r-targeted agent paradigm. We seek to determine whether these agents can increase long-term retention in the tumor and thereby increase the efficacy and clinical potential of BB2r-targeted agents. To that end, we have developed several generations of hypoxia-trapping-enhanced BBN analogs. Our first in vitro investigation of hypoxia-enhanced 111In-labeled BBN conjugates demonstrated significantly improved retention in hypoxic PC-3 human prostate cancer cells. However, it was determined that the proximity of the 2-nitroimidazole relative to the pharmacophore had a detrimental impact on BB2r binding affinity. To address this problem, our next generation of radioconjugates contained an extended linker to eliminate steric inhibition. The new design demonstrated substantially improved binding affinity and a lower clearance rate of the 2-nitroimidazole-containing radioconjugates under hypoxic conditions. In vivo biodistribution studies using a PC-3 xenograft mouse model revealed significant tumor retention enhancement. Further work is needed to clarify the mechanisms of cellular retention and to correlate the tumor hypoxia burden with the retention efficacy.

    Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises

    Full text link
    Recently, several studies have considered the stochastic optimization problem in a heavy-tailed noise regime, i.e., the difference between the stochastic gradient and the true gradient is assumed to have a finite $p$-th moment (say, upper bounded by $\sigma^{p}$ for some $\sigma\geq 0$) with $p\in(1,2]$, which not only generalizes the traditional finite-variance assumption ($p=2$) but also has been observed in practice for several different tasks. Under this challenging assumption, much progress has been made for both convex and nonconvex problems; however, most of it considers only smooth objectives. In contrast, the nonsmooth case has not been fully explored or well understood. This paper aims to fill this crucial gap by providing a comprehensive analysis of stochastic nonsmooth convex optimization with heavy-tailed noises. We revisit a simple clipping-based algorithm which was previously only proved to converge in expectation, and only under an additional strong convexity assumption. Under appropriate choices of parameters, for both convex and strongly convex functions, we not only establish the first high-probability rates but also give refined in-expectation bounds compared with existing works. Remarkably, all of our results are optimal (or nearly optimal up to logarithmic factors) with respect to the time horizon $T$, even when $T$ is unknown in advance. Additionally, we show how to make the algorithm parameter-free with respect to $\sigma$; in other words, the algorithm can still guarantee convergence without any prior knowledge of $\sigma$.
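    The sketch below is a minimal illustration of a clipping-based stochastic subgradient step under heavy-tailed noise, applied to the nonsmooth convex objective f(x) = ||x||_1; the step sizes and clipping levels are placeholder choices, not the parameter settings analyzed in the paper.

```python
# Minimal sketch (assumptions: L1 objective, Student-t gradient noise,
# illustrative step-size and clipping schedules; none taken from the paper).
import numpy as np

rng = np.random.default_rng(1)
d, T = 20, 5_000
x = rng.normal(size=d)

def noisy_subgradient(x):
    g = np.sign(x)                              # subgradient of ||x||_1
    noise = rng.standard_t(df=1.5, size=d)      # infinite variance; finite p-th moment only for p < 1.5
    return g + noise

for t in range(1, T + 1):
    g = noisy_subgradient(x)
    tau = np.sqrt(t)                            # illustrative clipping level
    g = g * min(1.0, tau / (np.linalg.norm(g) + 1e-12))  # clip to norm <= tau
    x -= g / np.sqrt(T)                         # illustrative constant step size

print("final objective:", np.abs(x).sum())
```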

    Dynamic Batch Learning in High-Dimensional Sparse Linear Contextual Bandits

    Full text link
    We study the problem of dynamic batch learning in high-dimensional sparse linear contextual bandits, where a decision maker, under a given maximum-number-of-batch constraint and only able to observe rewards at the end of each batch, can dynamically decide how many individuals to include in the next batch (at the end of the current batch) and what personalized action-selection scheme to adopt within each batch. Such batch constraints are ubiquitous in a variety of practical contexts, including personalized product offerings in marketing and medical treatment selection in clinical trials. We characterize the fundamental learning limit in this problem via a regret lower bound and provide a matching upper bound (up to log factors), thus prescribing an optimal scheme for this problem. To the best of our knowledge, our work provides the first inroad into a theoretical understanding of dynamic batch learning in high-dimensional sparse linear contextual bandits. Notably, even a special case of our result (when no batch constraint is present) yields the first minimax-optimal $\tilde{O}(\sqrt{s_0 T})$ regret bound for standard online learning in high-dimensional linear contextual bandits (for the no-margin case), where $s_0$ is the sparsity parameter (or an upper bound thereof) and $T$ is the learning horizon. This result (both that $\tilde{O}(\sqrt{s_0 T})$ is achievable and that $\Omega(\sqrt{s_0 T})$ is a lower bound) appears to be unknown in the emerging literature of high-dimensional contextual bandits.
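    As a rough illustration of the batch constraint (rewards are only used to refit the model at the end of each batch), here is a simple Lasso-based explore-then-commit loop; it is not the paper's optimal batching or action-selection scheme, and the batch sizes, penalty, and problem dimensions are arbitrary assumptions.

```python
# Illustrative sketch only: batched loop for a sparse linear contextual bandit,
# refitting a Lasso at the end of each batch.  NOT the paper's optimal scheme.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
d, K, s0 = 100, 5, 3
theta = np.zeros(d); theta[:s0] = 1.0            # sparse ground-truth parameter

X_hist, y_hist = [], []
theta_hat = np.zeros(d)
batch_sizes = [200, 800, 4000]                   # fixed up front here; the paper
                                                 # allows choosing batches dynamically
for batch in batch_sizes:
    for _ in range(batch):
        contexts = rng.normal(size=(K, d))       # one context per arm
        if len(y_hist) < 50:
            a = int(rng.integers(K))             # explore early on
        else:
            a = int(np.argmax(contexts @ theta_hat))
        r = contexts[a] @ theta + rng.normal()
        X_hist.append(contexts[a]); y_hist.append(r)
    # rewards are only used to refit the estimate at the end of each batch
    model = Lasso(alpha=0.1).fit(np.array(X_hist), np.array(y_hist))
    theta_hat = model.coef_

print("recovered support:", np.nonzero(np.abs(theta_hat) > 0.1)[0])
```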

    On the convergence of mirror descent beyond stochastic convex programming

    Get PDF
    In this paper, we examine the convergence of mirror descent in a class of stochastic optimization problems that are not necessarily convex (or even quasi-convex), and which we call variationally coherent. Since the standard technique of "ergodic averaging" offers no tangible benefits beyond convex programming, we focus directly on the algorithm's last generated sample (its "last iterate"), and we show that it converges with probability 1 if the underlying problem is coherent. We further consider a localized version of variational coherence which ensures local convergence of stochastic mirror descent (SMD) with high probability. These results contribute to the landscape of non-convex stochastic optimization by showing that (quasi-)convexity is not essential for convergence to a global minimum: rather, variational coherence, a much weaker requirement, suffices. Finally, building on the above, we reveal an interesting insight regarding the convergence speed of SMD: in problems with sharp minima (such as generic linear programs or concave minimization problems), SMD reaches a minimum point in a finite number of steps (a.s.), even in the presence of persistent gradient noise. This result is to be contrasted with existing black-box convergence rate estimates that are only asymptotic.
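    For concreteness, below is a minimal sketch of stochastic mirror descent with the entropic mirror map on the probability simplex, applied to a noisy linear objective (a sharp-minimum instance of the kind mentioned above); the step-size schedule and noise model are illustrative assumptions, not taken from the paper.

```python
# Minimal SMD sketch: entropic mirror map on the simplex, noisy linear objective.
# Tracks the LAST iterate, the quantity whose convergence the paper studies.
import numpy as np

rng = np.random.default_rng(3)
d, T = 10, 20_000
c = rng.uniform(size=d)                          # minimize <c, x> over the simplex
x = np.full(d, 1.0 / d)                          # uniform starting point

for t in range(1, T + 1):
    g = c + rng.normal(scale=0.5, size=d)        # persistent gradient noise
    eta = 1.0 / np.sqrt(t)                       # illustrative step-size schedule
    w = x * np.exp(-eta * g)                     # entropic (exponentiated) update
    x = w / w.sum()                              # project back to the simplex

print("last iterate concentrates on argmin c:", int(np.argmax(x)) == int(np.argmin(c)))
```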