Optimal No-regret Learning in Repeated First-price Auctions
We study online learning in repeated first-price auctions with censored
feedback, where a bidder, only observing the winning bid at the end of each
auction, learns to adaptively bid in order to maximize her cumulative payoff.
To achieve this goal, the bidder faces a challenging dilemma: if she wins the
auction (the only way to achieve positive payoffs), then she is not able to
observe the highest bid of the other bidders, which we assume is drawn i.i.d.
from an unknown distribution. This dilemma, despite being reminiscent of the
exploration-exploitation trade-off in contextual bandits, cannot be directly
addressed by the existing UCB or Thompson sampling algorithms in that
literature, mainly because, contrary to the standard bandit setting, nothing
about the environment can be learned here when a positive reward is obtained.
In this paper, by exploiting the structural properties of first-price
auctions, we develop the first learning algorithm that achieves an
$\widetilde{O}(\sqrt{T})$ regret bound when the bidder's private values are
stochastically generated. We do so by providing an algorithm on a general class
of problems, which we call monotone group contextual bandits, where the same
regret bound is established under stochastically generated contexts. Further,
by a novel lower bound argument, we characterize an $\Omega(T^{2/3})$ lower
bound for the case where the contexts are adversarially generated, thus
highlighting the impact of the context generation mechanism on the fundamental
learning limit. Despite this, we further exploit the structure of first-price
auctions and develop a learning algorithm that operates sample-efficiently (and
computationally efficiently) in the presence of adversarially generated private
values. We establish an $\widetilde{O}(T^{2/3})$ regret bound for this
algorithm, hence providing a complete characterization of optimal learning
guarantees for this problem.
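The censored feedback structure described above is easy to simulate. Below is a minimal Python sketch of a toy bidder facing this feedback; the bid grid, the epsilon-greedy rule, and the Beta-distributed rival bid are all illustrative assumptions, and this naive baseline is not the paper's (near-optimal) algorithm.

```python
import numpy as np

# Toy simulator of repeated first-price auctions with censored feedback.
# A naive epsilon-greedy baseline over a discretized bid grid, NOT the
# paper's algorithm; it only illustrates the feedback structure: the bidder
# always sees whether she won, observes the highest competing bid m only
# when she loses, and earns v - b only when she wins.

rng = np.random.default_rng(0)
T = 50_000                         # learning horizon
grid = np.linspace(0.0, 1.0, 21)   # candidate bids as fractions of the value
wins = np.zeros_like(grid)         # per-arm win counts
pulls = np.ones_like(grid)         # per-arm pull counts (init 1 to avoid /0)

total_payoff = 0.0
for t in range(T):
    v = rng.uniform()              # private value, stochastically generated
    m = rng.beta(2, 2)             # unknown distribution of the rival max bid
    if rng.uniform() < 0.05:       # epsilon-greedy exploration
        k = rng.integers(len(grid))
    else:                          # exploit: maximize estimated expected payoff
        k = int(np.argmax((wins / pulls) * (v - grid * v)))
    b = grid[k] * v                # shade the bid as a fraction of the value
    won = b > m                    # the win/loss indicator is always observed
    pulls[k] += 1
    wins[k] += won
    if won:
        total_payoff += v - b      # payoff is observed only on a win
    # else: the winning bid m is revealed, which a smarter algorithm would use

print(f"average payoff per round: {total_payoff / T:.4f}")
```

A smarter learner would exploit the monotone structure the paper identifies: a win at bid $b$ implies a win at any higher bid, so a single observation informs many bid levels at once.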
Development of Hypoxia Trapping Enhanced BB2R-Targeted Radiopharmaceutics for Prostate Cancer
The Gastrin-Releasing Peptide Receptor (BB2r) has been investigated as a diagnostic and therapeutic target for prostate and other cancers due to its high expression on neoplastic relative to normal tissues. A variety of BB2r-targeted agents have been developed utilizing the bombesin (BBN) peptide, which has shown nanomolar binding affinity to human BB2r. However, as with most low-molecular-weight, receptor-targeted drugs, a major challenge to clinical translation of BB2r-targeted agents is low retention at the tumor site due to intrinsically high diffusion and efflux rates. Our laboratory seeks to address this deficiency by developing synthetic approaches that selectively increase retention of BB2r-targeted agents in prostate cancer. Hypoxic regions commonly exist in prostate tumors and many other cancers due to a chaotic vascular architecture that impedes the delivery of oxygen. In this dissertation, we explore the incorporation of nitroimidazoles, hypoxia-selective prodrugs that irreversibly bind to intracellular nucleophiles in hypoxic tissue, into the BB2r-targeted agent paradigm. We seek to determine whether these agents can increase long-term retention in the tumor and thereby increase the efficacy and clinical potential of BB2r-targeted agents.
To that end, we have developed several generations of hypoxia-trapping-enhanced BBN analogs. Our first in vitro investigation of hypoxia-enhanced 111In-labeled BBN conjugates demonstrated significantly improved retention in hypoxic PC-3 human prostate cancer cells. However, it was determined that the proximity of the 2-nitroimidazole to the pharmacophore had a detrimental impact on BB2r binding affinity. To address this problem, our next generation of radioconjugates contained an extended linker to eliminate the steric inhibition. The new design demonstrated substantially improved binding affinity and a lower clearance rate of the 2-nitroimidazole-containing radioconjugates under hypoxic conditions. In vivo biodistribution studies using a PC-3 xenograft mouse model revealed significant enhancement of tumor retention. Further work is needed to clarify the mechanisms of cellular retention and to correlate tumor hypoxia burden with retention efficacy.
Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises
Recently, several studies have considered the stochastic optimization problem
in a heavy-tailed noise regime, i.e., the difference between the stochastic
gradient and the true gradient is assumed to have a finite $p$-th moment (say
being upper bounded by $\sigma^{p}$ for some $\sigma \geq 0$) where
$p \in (1,2]$, which not only generalizes the traditional finite variance
assumption ($p = 2$)
but also has been observed in practice for several different tasks. Under this
challenging assumption, much new progress has been made for both convex and
nonconvex problems; however, most of it considers only smooth objectives. In
contrast, the setting where the objective is nonsmooth remains far less
explored and understood. This paper aims to fill this crucial gap by providing
a comprehensive analysis of stochastic nonsmooth convex optimization with
heavy-tailed noises. We revisit a simple clipping-based algorithm which, to
date, has been proved to converge only in expectation and only under an
additional strong convexity assumption. Under appropriate choices of
parameters, for both convex
and strongly convex functions, we not only establish the first high-probability
rates but also give refined in-expectation bounds compared with existing works.
Remarkably, all of our results are optimal (or nearly optimal up to logarithmic
factors) with respect to the time horizon $T$ even when $T$ is unknown in
advance. Additionally, we show how to make the algorithm parameter-free with
respect to $\sigma$; in other words, the algorithm can still guarantee
convergence without any prior knowledge of $\sigma$.
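The clipping operation at the core of such algorithms is simple to state. The following sketch runs clipped subgradient descent on a toy nonsmooth convex objective under heavy-tailed (Student-t) noise; the objective, step size, clipping schedule, and noise model are illustrative assumptions, not the paper's exact parameter choices.

```python
import numpy as np

# Minimal sketch of clipped (sub)gradient descent under heavy-tailed noise.
# The objective, schedules, and noise model below are illustrative
# assumptions, not the paper's exact parameter choices.

rng = np.random.default_rng(1)
d, T = 10, 10_000
x = np.ones(d)                        # initial point

def subgrad(x):
    # Nonsmooth convex toy objective: f(x) = ||x||_1, subgradient sign(x).
    return np.sign(x)

for t in range(1, T + 1):
    # Heavy-tailed noise: a Student-t with 1.5 degrees of freedom has a
    # finite p-th moment only for p < 1.5, so its variance is infinite.
    g = subgrad(x) + rng.standard_t(df=1.5, size=d)
    tau = 10.0 * np.sqrt(t)           # growing clipping threshold (assumed)
    g = g * min(1.0, tau / (np.linalg.norm(g) + 1e-12))  # clip ||g|| <= tau
    x = x - (1.0 / np.sqrt(T)) * g    # fixed ~1/sqrt(T) step size (assumed)

print(f"f(x_T) = {np.abs(x).sum():.4f}")  # should be near the optimum 0
```

Clipping bounds each update even when the raw noise is enormous, which is what makes high-probability guarantees possible without a finite variance.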
Dynamic Batch Learning in High-Dimensional Sparse Linear Contextual Bandits
We study the problem of dynamic batch learning in high-dimensional sparse
linear contextual bandits, where a decision maker, under a given
maximum-number-of-batch constraint and only able to observe rewards at the end
of each batch, can dynamically decide how many individuals to include in the
next batch (at the end of the current batch) and what personalized
action-selection scheme to adopt within each batch. Such batch constraints are
ubiquitous in a variety of practical contexts, including personalized product
offerings in marketing and medical treatment selection in clinical trials. We
characterize the fundamental learning limit in this problem via a regret lower
bound and provide a matching upper bound (up to log factors), thus prescribing
an optimal scheme for this problem. To the best of our knowledge, our work
provides the first inroad into a theoretical understanding of dynamic batch
learning in high-dimensional sparse linear contextual bandits. Notably, even a
special case of our result (when no batch constraint is present) yields the
first minimax optimal $\widetilde{O}(\sqrt{s_0 T})$ regret bound for standard
online learning in high-dimensional linear contextual bandits (for the
no-margin case), where $s_0$ is the sparsity parameter (or an upper bound
thereof) and $T$ is the learning horizon. This result (both that
$\widetilde{O}(\sqrt{s_0 T})$ is achievable and that $\Omega(\sqrt{s_0 T})$ is
a lower bound) appears to be unknown in the emerging literature of
high-dimensional contextual bandits.
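As a heavily simplified illustration of the batched setting, the sketch below runs a uniform batch schedule with a Lasso-estimated reward model: actions within a batch are chosen greedily from the current estimate, and the estimate is refit only when the batch's rewards are revealed. The batch grid, Lasso penalty, and greedy rule are assumptions for illustration, not the paper's optimal scheme.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Sketch of batched learning in a sparse linear contextual bandit.
# Rewards are revealed only at the end of each batch; the uniform batch
# grid, Lasso penalty, and greedy rule are illustrative assumptions.

rng = np.random.default_rng(2)
d, K, T, M = 100, 5, 5_000, 10        # dim, arms, horizon, batch budget
theta = np.zeros(d); theta[:5] = 1.0  # s_0 = 5 sparse true parameter

theta_hat = np.zeros(d)
X_hist, y_hist = [], []
batch_ends = np.linspace(T // M, T, M, dtype=int)  # uniform schedule (assumed)
start = 0
for end in batch_ends:
    for t in range(start, end):
        contexts = rng.normal(size=(K, d))        # one context per arm
        a = int(np.argmax(contexts @ theta_hat))  # greedy within the batch
        X_hist.append(contexts[a])
        y_hist.append(contexts[a] @ theta + rng.normal())  # revealed later
    # End of batch: rewards become observable; refit the sparse model.
    lasso = Lasso(alpha=0.05, max_iter=5_000)
    lasso.fit(np.array(X_hist), np.array(y_hist))
    theta_hat = lasso.coef_
    start = end

print("support recovered:", np.flatnonzero(np.abs(theta_hat) > 0.1))
```

The key constraint the paper studies is visible here: no feedback arrives mid-batch, so the action rule inside a batch cannot adapt, and the choice of batch boundaries governs how quickly the sparse estimate improves.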
On the convergence of mirror descent beyond stochastic convex programming
In this paper, we examine the convergence of mirror descent in a class of
stochastic optimization problems that are not necessarily convex (or even
quasi-convex), and which we call variationally coherent. Since the standard
technique of "ergodic averaging" offers no tangible benefits beyond convex
programming, we focus directly on the algorithm's last generated sample (its
"last iterate"), and we show that it converges with probabiility if the
underlying problem is coherent. We further consider a localized version of
variational coherence which ensures local convergence of stochastic mirror
descent (SMD) with high probability. These results contribute to the landscape
of non-convex stochastic optimization by showing that (quasi-)convexity is not
essential for convergence to a global minimum: rather, variational coherence, a
much weaker requirement, suffices. Finally, building on the above, we reveal an
interesting insight regarding the convergence speed of SMD: in problems with
sharp minima (such as generic linear programs or concave minimization
problems), SMD reaches a minimum point in a finite number of steps (a.s.), even
in the presence of persistent gradient noise. This result is to be contrasted
with existing black-box convergence rate estimates that are only asymptotic.
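To make the last-iterate behavior concrete, here is a minimal sketch of SMD with the entropic mirror map (multiplicative-weights updates) on a toy linear program over the simplex, a problem with a sharp minimum; the objective, noise level, and step sizes are illustrative assumptions, not tied to the paper's experiments.

```python
import numpy as np

# Sketch of stochastic mirror descent (SMD) with the entropic mirror map
# on the probability simplex, i.e., multiplicative-weights updates.
# The objective, noise level, and step sizes are illustrative assumptions.

rng = np.random.default_rng(3)
d, T = 5, 20_000
c = np.arange(1.0, d + 1.0)       # linear objective f(x) = <c, x> on simplex
x = np.full(d, 1.0 / d)           # start at the uniform distribution

for t in range(1, T + 1):
    g = c + rng.normal(scale=2.0, size=d)  # persistent gradient noise
    step = 1.0 / np.sqrt(t)
    x = x * np.exp(-step * g)     # mirror step under negative entropy
    x = x / x.sum()               # map back onto the simplex
    # Linear programs have sharp minima, so the last iterate is expected
    # to lock onto the optimal vertex e_1 despite the gradient noise.

print("last iterate:", np.round(x, 3))   # should concentrate on index 0
```

The last iterate, not an ergodic average, is what concentrates on the optimal vertex here, matching the abstract's point that averaging offers no tangible benefit in this regime.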