Autonomous Drug Design with Multi-Armed Bandits
Recent developments in artificial intelligence and automation support a new
drug design paradigm: autonomous drug design. Under this paradigm, generative
models can provide suggestions on thousands of molecules with specific
properties, and automated laboratories can potentially make, test and analyze
molecules with minimal human supervision. However, since only a limited
number of molecules can be synthesized and tested, an obvious challenge is how
to efficiently select among provided suggestions in a closed-loop system. We
formulate this task as a stochastic multi-armed bandit problem with multiple
plays, volatile arms and similarity information. To solve this task, we adapt
previous work on multi-armed bandits to this setting, and compare our solution
with random sampling, greedy selection and decaying-epsilon-greedy selection
strategies. According to our simulation results, our approach has the potential
to perform better exploration and exploitation of the chemical space for
autonomous drug design.
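As a rough illustration of one of the baselines named above, the following is a minimal sketch of decaying-epsilon-greedy selection with multiple plays. It is not the authors' method or code: the function names, the `eps0` and `decay` parameters, and the 1/(1 + decay * round) schedule are all assumptions made for this example. Arms are kept in a dictionary so that candidates can be added or removed between rounds, which loosely mirrors volatile arms.

```python
import random


def select_batch(estimates, batch_size, round_index, eps0=1.0, decay=0.1, rng=random):
    """Pick up to `batch_size` distinct arms with decaying-epsilon-greedy.

    `estimates` maps arm id -> current mean-reward estimate (hypothetical layout).
    """
    # Assumed schedule: exploration probability shrinks as rounds go by.
    epsilon = eps0 / (1.0 + decay * round_index)
    remaining = list(estimates)  # arms still available in this round
    chosen = []
    for _ in range(min(batch_size, len(remaining))):
        if rng.random() < epsilon:
            arm = rng.choice(remaining)  # explore: uniform random arm
        else:
            arm = max(remaining, key=lambda a: estimates[a])  # exploit: best estimate
        chosen.append(arm)
        remaining.remove(arm)  # multiple plays: no arm is picked twice per round
    return chosen


def update(estimates, counts, arm, reward):
    """Incremental sample-mean update for one observed reward."""
    counts[arm] = counts.get(arm, 0) + 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]
```

With `eps0=0.0` the sketch reduces to pure greedy selection, and replacing the exploit branch with a random choice gives random sampling, the other two baselines mentioned in the abstract.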
Bandits on graphs and structures
We investigate the structural properties of certain sequential decision-making problems with limited feedback (bandits) in order to bring the known algorithmic solutions closer to practical use. In the first part, we put special emphasis on structures that can be represented as graphs on actions. In the second part, we study large action spaces that can be of exponential size in the number of base actions, or even infinite. We show how to take advantage of structures over the actions and (provably) learn faster.
Harnessing Context for Budget-Limited Crowdsensing with Massive Uncertain Workers
Crowdsensing is an emerging paradigm of ubiquitous sensing, through which a
crowd of workers are recruited to perform sensing tasks collaboratively.
Although it has stimulated many applications, an open fundamental problem is
how to select among a massive number of workers to perform a given sensing task
under a limited budget. Nevertheless, due to the proliferation of smart devices
equipped with various sensors, it is very difficult to profile the workers in
terms of sensing ability. Although the uncertainties of the workers can be
addressed by standard Combinatorial Multi-Armed Bandit (CMAB) framework through
a trade-off between exploration and exploitation, we do not have sufficient
allowance to directly explore and exploit the workers under the limited budget.
Furthermore, since the sensor devices usually have quite limited resources, the
workers may be able to perform the sensing task only a few
times, which further restricts our opportunities to learn the uncertainty. To
address the above issues, we propose a Context-Aware Worker Selection (CAWS)
algorithm in this paper. By leveraging the correlation between the context
information of the workers and their sensing abilities, CAWS aims at maximizing
the expected total sensing revenue efficiently with both budget constraint and
capacity constraints respected, even when the number of uncertain workers
is massive. The efficacy of CAWS is verified by rigorous theoretical
analysis and extensive experiments.
Combinatorial Neural Bandits
We consider a contextual combinatorial bandit problem where in each round a
learning agent selects a subset of arms and receives feedback on the selected
arms according to their scores. The score of an arm is an unknown function of
the arm's feature. Approximating this unknown score function with deep neural
networks, we propose two algorithms: Combinatorial Neural UCB (CN-UCB)
and Combinatorial Neural Thompson Sampling (CN-TS). We prove regret bounds
for CN-UCB in terms of the effective dimension of a neural tangent kernel
matrix, the size of a subset of arms, and the time horizon. For CN-TS, we
adapt an optimistic sampling technique to ensure the optimism of the sampled
combinatorial action, achieving a worst-case (frequentist) regret guarantee.
To the best of our knowledge, these
are the first combinatorial neural bandit algorithms with regret performance
guarantees. In particular, CN-TS is the first Thompson sampling
algorithm with worst-case regret guarantees for the general contextual
combinatorial bandit problem. The numerical experiments demonstrate the
superior performance of our proposed algorithms. Comment: Accepted in ICML 202
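To make the "select a subset of arms by optimistic score" idea concrete, here is a minimal sketch of top-K selection by a UCB index. It is deliberately not CN-UCB: the neural score estimator is swapped for plain per-arm empirical means, and the function name, argument layout and the sqrt(2 ln t / n) bonus are assumptions chosen for illustration.

```python
import math


def ucb_top_k(means, counts, t, k):
    """Return the k arm indices with the largest UCB index at round t (t >= 1).

    `means[i]` is the empirical mean score of arm i and `counts[i]` the number
    of times it has been observed (hypothetical bookkeeping for this sketch).
    """
    def index(i):
        if counts[i] == 0:
            return float("inf")  # unobserved arms are tried first
        bonus = math.sqrt(2.0 * math.log(t) / counts[i])  # shrinks with more data
        return means[i] + bonus

    # Highest index first; keep the top k as the chosen subset.
    return sorted(range(len(means)), key=index, reverse=True)[:k]
```

In a contextual variant, `means[i]` would be replaced by a learned score of the arm's feature vector, and the bonus by a model-based confidence width.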
Design and Analysis of Beamforming in mmWave Networks
To support increasingly data-intensive wireless applications, millimeter-wave (mmWave) communication emerges as the most promising wireless technology, offering high data rate connections by exploiting a large swath of spectrum. Beamforming (BF), which focuses the radio frequency power in a narrow direction, is adopted in mmWave communication to overcome the hostile path loss. However, the high directionality introduced by BF poses new challenges: 1) Beam alignment (BA) latency, the processing delay incurred while the transmitter and the receiver align their beams to establish a reliable link. Existing BA methods incur significant BA latency, on the order of seconds for a large number of beams; 2) Medium access control (MAC) degradation. To coordinate the BF training for multiple users, the 802.11ad standard specifies a new MAC protocol in which all the users contend for BF training resources in a distributed manner. Due to the “deafness” problem caused by directional transmission, i.e., a user may not sense the transmissions of other users, severe collisions occur in high user density scenarios, which significantly degrades the MAC performance; and 3) Backhaul congestion. All the base stations (BSs) in mmWave dense networks are connected to the backbone network via backhaul links in order to access remote content servers. Although BF technology can increase the data rate of the fronthaul links between users and the BS, the congested backhaul link becomes a new bottleneck, since deploying unconstrained wired backhaul links in mmWave dense networks is infeasible due to high costs. In this dissertation, we address these challenges respectively by 1) proposing an efficient BA algorithm; 2) evaluating and enhancing the 802.11ad MAC performance; and 3) designing an effective backhaul alleviation scheme.
Firstly, we propose an efficient BA algorithm to reduce processing latency. Existing BA methods search the entire beam space to identify the optimal transmit-receive beam pair, which leads to significant latency. Thus, an efficient BA algorithm that does not search the entire beam space is desired. Accordingly, we propose a learning-based BA algorithm, namely the hierarchical BA (HBA) algorithm, which takes advantage of the correlation structure among beams so that information from nearby beams is used to identify the optimal beam, instead of searching the entire beam space. Furthermore, prior knowledge of the channel fluctuation is incorporated into the proposed algorithm to further accelerate the BA process. Theoretical analysis indicates that the proposed algorithm can effectively identify the optimal beam pair with low latency.
Secondly, we analyze and enhance the performance of BF training MAC (BFT-MAC) in 802.11ad. Existing analytical models for traditional omni-directional systems are unsuitable for BFT-MAC due to the distinct directional transmission feature of mmWave networks. Therefore, a thorough theoretical framework for BFT-MAC is needed. To this end, we develop a simple yet accurate analytical model to evaluate the performance of BFT-MAC. Based on our analytical model, we derive closed-form expressions for the average successful BF training probability, the normalized throughput, and the BF training latency. Asymptotic analysis indicates that the maximum normalized throughput of BFT-MAC is barely 1/e. Then, we propose an enhancement scheme which adaptively adjusts MAC parameters in tune with user density. The proposed scheme can effectively improve MAC performance in high user density scenarios.
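The 1/e ceiling mentioned above is typical of slotted contention, and a toy model shows where such a limit can come from. The sketch below is not the dissertation's analytical model: it assumes that each of N users independently picks one of M training slots uniformly at random, and that a slot succeeds only if exactly one user picks it. The function name and this contention rule are assumptions for illustration.

```python
import math


def normalized_throughput(num_users, num_slots):
    """Expected fraction of slots chosen by exactly one user (toy model)."""
    p = 1.0 / num_slots  # probability that a given user picks a given slot
    # Exactly one of the N users picks the slot and the other N - 1 do not.
    return num_users * p * (1.0 - p) ** (num_users - 1)


if __name__ == "__main__":
    slots = 1000
    best = max(normalized_throughput(n, slots) for n in range(1, 3 * slots))
    print(f"best throughput with {slots} slots: {best:.4f}")
    print(f"1/e for comparison: {math.exp(-1):.4f}")
```

Under these assumptions the throughput is (N/M)(1 - 1/M)^(N-1), which peaks when the number of users is about equal to the number of slots and then equals (1 - 1/M)^(M-1), a quantity that tends to 1/e as M grows. This is consistent with the motivation for adapting MAC parameters to user density.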
Thirdly, to alleviate the backhaul burden in mmWave dense networks, we employ edge caching, which proactively caches popular contents at the edge of mmWave networks. Since the cache of an individual BS can store only limited contents, caching performance is significantly throttled. We propose a cooperative edge caching policy, namely device-to-device assisted cooperative edge caching (DCEC), to enlarge the set of cached contents by jointly utilizing the cache resources of adjacent users and nearby BSs. In addition, the proposed caching policy brings an extra advantage: the highly directional transmission in mmWave communications naturally mitigates the interference issue in cooperative caching. We theoretically analyze the performance of the DCEC scheme, taking the network density, the practical directional antenna model and the stochastic information of network topology into consideration. Theoretical results demonstrate that the proposed policy achieves higher performance in offloading the backhaul traffic and reducing the content retrieval delay, compared with the benchmark policy.
The research outcomes of this dissertation provide insights into the fundamental performance of mmWave networks from the perspectives of BA, MAC, and backhaul. The schemes developed in the dissertation should offer practical and efficient solutions for building and optimizing mmWave networks.