10 research outputs found

    Autonomous Drug Design with Multi-Armed Bandits

    Recent developments in artificial intelligence and automation support a new drug design paradigm: autonomous drug design. Under this paradigm, generative models can suggest thousands of molecules with specific properties, and automated laboratories can potentially make, test and analyze molecules with minimal human supervision. However, since only a limited number of molecules can be synthesized and tested, an obvious challenge is how to select efficiently among the provided suggestions in a closed-loop system. We formulate this task as a stochastic multi-armed bandit problem with multiple plays, volatile arms and similarity information. To solve this task, we adapt previous work on multi-armed bandits to this setting, and compare our solution with random sampling, greedy selection and decaying-epsilon-greedy selection strategies. According to our simulation results, our approach has the potential to perform better exploration and exploitation of the chemical space for autonomous drug design.
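As a rough illustration of one baseline named in the abstract above, the following is a minimal sketch of decaying-epsilon-greedy selection with multiple plays per round. It is not the authors' method: it omits volatile arms and similarity information, assumes Bernoulli rewards, and every name and parameter (`select_arms`, `run`, `eps0`, `decay`) is hypothetical.

```python
import random


def select_arms(estimates, k, epsilon, rng):
    """Pick k distinct arms; each pick explores with probability epsilon,
    otherwise it takes the best remaining arm by estimated mean."""
    remaining = sorted(estimates, key=estimates.get, reverse=True)
    chosen = []
    for _ in range(min(k, len(remaining))):
        if rng.random() < epsilon:
            arm = rng.choice(remaining)   # explore: uniform over remaining arms
        else:
            arm = remaining[0]            # exploit: best remaining estimate
        remaining.remove(arm)
        chosen.append(arm)
    return chosen


def run(true_means, k=2, rounds=500, eps0=1.0, decay=0.99, seed=0):
    """Simulate the bandit loop with assumed Bernoulli rewards."""
    rng = random.Random(seed)
    counts = {arm: 0 for arm in true_means}
    estimates = {arm: 0.0 for arm in true_means}
    for t in range(rounds):
        epsilon = eps0 * decay ** t       # exploration rate decays geometrically
        for arm in select_arms(estimates, k, epsilon, rng):
            reward = 1.0 if rng.random() < true_means[arm] else 0.0
            counts[arm] += 1
            # incremental update of the sample mean
            estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


if __name__ == "__main__":
    est, cnt = run({"mol_a": 0.2, "mol_b": 0.8, "mol_c": 0.5})
    print(est, cnt)
```

Setting `epsilon` to 0 in `select_arms` gives greedy selection and setting it to 1 gives random sampling, the other two baselines the abstract mentions.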

    Bandits on graphs and structures

    We investigate the structural properties of certain sequential decision-making problems with limited feedback (bandits) in order to bring the known algorithmic solutions closer to practical use. In the first part, we put a special emphasis on structures that can be represented as graphs on actions; in the second part, we study large action spaces that can be of exponential size in the number of base actions or even infinite. We show how to take advantage of structures over the actions and (provably) learn faster.

    Harnessing Context for Budget-Limited Crowdsensing with Massive Uncertain Workers

    Crowdsensing is an emerging paradigm of ubiquitous sensing, through which a crowd of workers is recruited to perform sensing tasks collaboratively. Although it has stimulated many applications, an open fundamental problem is how to select among a massive number of workers to perform a given sensing task under a limited budget. Moreover, due to the proliferation of smart devices equipped with various sensors, it is very difficult to profile the workers in terms of sensing ability. Although the uncertainties of the workers can be addressed by the standard Combinatorial Multi-Armed Bandit (CMAB) framework through a trade-off between exploration and exploitation, we do not have sufficient allowance to directly explore and exploit the workers under the limited budget. Furthermore, since the sensor devices usually have quite limited resources, the workers may be able to perform the sensing task only a few times, which further restricts our opportunities to learn the uncertainty. To address the above issues, we propose a Context-Aware Worker Selection (CAWS) algorithm in this paper. By leveraging the correlation between the context information of the workers and their sensing abilities, CAWS aims at maximizing the expected total sensing revenue efficiently with both budget and capacity constraints respected, even when the number of uncertain workers is massive. The efficacy of CAWS is verified by rigorous theoretical analysis and extensive experiments.

    Combinatorial Neural Bandits

    We consider a contextual combinatorial bandit problem where in each round a learning agent selects a subset of arms and receives feedback on the selected arms according to their scores. The score of an arm is an unknown function of the arm's feature. Approximating this unknown score function with deep neural networks, we propose algorithms: Combinatorial Neural UCB (CN-UCB) and Combinatorial Neural Thompson Sampling (CN-TS). We prove that CN-UCB achieves $\tilde{\mathcal{O}}(\tilde{d} \sqrt{T})$ or $\tilde{\mathcal{O}}(\sqrt{\tilde{d} T K})$ regret, where $\tilde{d}$ is the effective dimension of a neural tangent kernel matrix, $K$ is the size of a subset of arms, and $T$ is the time horizon. For CN-TS, we adapt an optimistic sampling technique to ensure the optimism of the sampled combinatorial action, achieving a worst-case (frequentist) regret of $\tilde{\mathcal{O}}(\tilde{d} \sqrt{TK})$. To the best of our knowledge, these are the first combinatorial neural bandit algorithms with regret performance guarantees. In particular, CN-TS is the first Thompson sampling algorithm with the worst-case regret guarantees for the general contextual combinatorial bandit problem. The numerical experiments demonstrate the superior performances of our proposed algorithms. Comment: Accepted in ICML 202

    Design and Analysis of Beamforming in mmWave Networks

    To support increasingly data-intensive wireless applications, millimeter-wave (mmWave) communication emerges as the most promising wireless technology, offering high-data-rate connections by exploiting a large swath of spectrum. Beamforming (BF), which focuses the radio frequency power in a narrow direction, is adopted in mmWave communication to overcome the hostile path loss. However, the high directionality caused by BF poses new challenges: 1) Beam alignment (BA) latency, the processing delay incurred while the transmitter and the receiver align their beams to establish a reliable link. Existing BA methods incur significant BA latency, on the order of seconds, for a large number of beams; 2) Medium access control (MAC) degradation. To coordinate the BF training for multiple users, the 802.11ad standard specifies a new MAC protocol in which all the users contend for BF training resources in a distributed manner. Due to the “deafness” problem caused by directional transmission, i.e., a user may not sense the transmission of other users, severe collisions occur in high user density scenarios, which significantly degrades the MAC performance; and 3) Backhaul congestion. All the base stations (BSs) in mmWave dense networks are connected to the backbone network via backhaul links in order to access remote content servers. Although BF technology can increase the data rate of the fronthaul links between users and the BS, the congested backhaul link becomes a new bottleneck, since deploying unconstrained wired backhaul links in mmWave dense networks is infeasible due to high costs. In this dissertation, we address each challenge respectively by 1) proposing an efficient BA algorithm; 2) evaluating and enhancing the 802.11ad MAC performance; and 3) designing an effective backhaul alleviation scheme. Firstly, we propose an efficient BA algorithm to reduce processing latency.
The existing BA methods search the entire beam space to identify the optimal transmit-receive beam pair, which leads to significant latency. Thus, an efficient BA algorithm that does not search the entire beam space is desired. Accordingly, a learning-based BA algorithm, namely the hierarchical BA (HBA) algorithm, is proposed. It takes advantage of the correlation structure among beams, extracting information from nearby beams to identify the optimal beam instead of searching the entire beam space. Furthermore, prior knowledge of the channel fluctuation is incorporated in the proposed algorithm to further accelerate the BA process. Theoretical analysis indicates that the proposed algorithm can effectively identify the optimal beam pair with low latency. Secondly, we analyze and enhance the performance of BF training MAC (BFT-MAC) in 802.11ad. Existing analytical models for traditional omni-directional systems are unsuitable for BFT-MAC due to the distinct directional transmission feature in mmWave networks. Therefore, a thorough theoretical framework for BFT-MAC is necessary and significant. To this end, we develop a simple yet accurate analytical model to evaluate the performance of BFT-MAC. Based on our analytical model, we derive closed-form expressions for the average successful BF training probability, the normalized throughput, and the BF training latency. Asymptotic analysis indicates that the maximum normalized throughput of BFT-MAC is only 1/e. Then, we propose an enhancement scheme which adaptively adjusts MAC parameters in tune with user density. The proposed scheme can effectively improve MAC performance in high user density scenarios. Thirdly, to alleviate the backhaul burden in mmWave dense networks, edge caching, which proactively caches popular contents at the edge of mmWave networks, is employed. Since the cache resource of an individual BS can only store limited contents, this significantly throttles the caching performance.
We propose a cooperative edge caching policy, namely device-to-device assisted cooperative edge caching (DCEC), to enlarge cached contents by jointly utilizing cache resources of adjacent users and BSs in proximity. In addition, the proposed caching policy brings an extra advantage: the highly directional transmission in mmWave communications can naturally tackle the interference issue in the cooperative caching policy. We theoretically analyze the performance of the DCEC scheme, taking the network density, the practical directional antenna model and the stochastic information of network topology into consideration. Theoretical results demonstrate that the proposed policy can achieve higher performance in offloading the backhaul traffic and reducing the content retrieval delay, compared with the benchmark policy. The research outcomes from the dissertation can provide insight into the fundamental performance of mmWave networks from the perspectives of BA, MAC, and backhaul. The schemes developed in the dissertation should offer practical and efficient solutions to build and optimize mmWave networks.
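The abstract above does not state the analytical model behind the 1/e throughput limit for BFT-MAC. One textbook setting in which the same limit appears is slotted random access, sketched below under that assumption: each of `n_users` users independently picks one of `n_slots` slots uniformly at random, and a slot succeeds only if exactly one user picks it. The function name and parameters are hypothetical and not taken from the dissertation.

```python
import math


def normalized_throughput(n_users: int, n_slots: int) -> float:
    """Expected fraction of slots chosen by exactly one user, assuming each
    user independently picks one slot uniformly at random."""
    if n_users == 0:
        return 0.0
    p = 1.0 / n_slots
    # A given slot succeeds when exactly one of the n users selects it.
    return n_users * p * (1.0 - p) ** (n_users - 1)


if __name__ == "__main__":
    for m in (8, 64, 1024):
        # Search over user counts for the best achievable throughput.
        best = max(normalized_throughput(n, m) for n in range(1, 4 * m))
        print(m, best, 1 / math.e)
```

In this simplified model the throughput peaks when the number of users is about equal to the number of slots, and the peak value (1 - 1/m)^(m-1) decreases toward 1/e as the slot count m grows.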