
    A Definition of Non-Stationary Bandits

    Despite the subject of non-stationary bandit learning having attracted much recent attention, we have yet to identify a formal definition of non-stationarity that can consistently distinguish non-stationary bandits from stationary ones. Prior work has characterized non-stationary bandits as bandits for which the reward distribution changes over time. We demonstrate that this definition can ambiguously classify the same bandit as both stationary and non-stationary; the ambiguity arises from the existing definition's dependence on the latent sequence of reward distributions. Moreover, the definition has given rise to two widely used notions of regret: the dynamic regret and the weak regret. In some bandits, these notions fail to reflect qualitative differences in agent performance. Additionally, this definition of non-stationary bandits has led to the design of agents that explore excessively. We introduce a formal definition of non-stationary bandits that resolves these issues. Our new definition provides a unified approach, applicable seamlessly to both Bayesian and frequentist formulations of bandits. Furthermore, our definition ensures that two bandits offering agents indistinguishable experiences are classified consistently, as either both stationary or both non-stationary. This advancement provides a more robust framework for non-stationary bandit learning.
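
    For concreteness, the two regret notions mentioned above are conventionally defined as follows; the notation is a standard assumption, not quoted from the paper. Writing \mathcal{A} for the action set, T for the horizon, \mu_t(a) for the mean reward of action a at time t, and A_t for the agent's action at time t,

        \mathrm{Regret}_{\mathrm{dynamic}}(T) = \sum_{t=1}^{T} \Big( \max_{a \in \mathcal{A}} \mu_t(a) - \mu_t(A_t) \Big),
        \qquad
        \mathrm{Regret}_{\mathrm{weak}}(T) = \max_{a \in \mathcal{A}} \sum_{t=1}^{T} \mu_t(a) \; - \; \sum_{t=1}^{T} \mu_t(A_t).

    The dynamic regret benchmarks the agent against the best action at every round, while the weak regret benchmarks it against the single best fixed action in hindsight, which is why the two can disagree about how well an agent is doing in a changing environment.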

    Non-Stationary Bandit Learning via Predictive Sampling

    Thompson sampling has proven effective across a wide range of stationary bandit environments. However, as we demonstrate in this paper, it can perform poorly when applied to non-stationary environments. We show that such failures arise because, when exploring, the algorithm does not differentiate actions based on how quickly the acquired information loses its usefulness due to non-stationarity. Building on this insight, we propose predictive sampling, an algorithm that deprioritizes acquiring information that quickly loses usefulness. A theoretical guarantee on the performance of predictive sampling is established through a Bayesian regret bound. We provide versions of predictive sampling whose computations scale tractably to complex bandit environments of practical interest. Through numerical simulations, we demonstrate that predictive sampling outperforms Thompson sampling in all non-stationary environments examined.
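
    For reference, the sketch below shows the standard Thompson-sampling baseline on a Bernoulli bandit; it is not the paper's predictive-sampling algorithm, and the two-armed environment, Beta(1, 1) priors, and horizon are illustrative assumptions. A comment marks the behaviour that breaks down under non-stationarity.

import numpy as np

def thompson_sampling(true_probs, horizon, seed=0):
    """Standard Thompson sampling for a stationary Bernoulli bandit with Beta(1, 1) priors."""
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    alpha = np.ones(k)  # posterior success counts + 1
    beta = np.ones(k)   # posterior failure counts + 1
    total_reward = 0.0
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its posterior and play the argmax.
        # In a non-stationary environment the true probabilities would drift, yet this
        # posterior keeps weighting stale observations -- the failure mode predictive
        # sampling addresses by deprioritizing information that quickly loses usefulness.
        sampled_means = rng.beta(alpha, beta)
        arm = int(np.argmax(sampled_means))
        reward = float(rng.random() < true_probs[arm])
        alpha[arm] += reward
        beta[arm] += 1.0 - reward
        total_reward += reward
    return total_reward

# Example: two arms with fixed success probabilities 0.3 and 0.7.
print(thompson_sampling([0.3, 0.7], horizon=1000))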

    UPSCALE: Unconstrained Channel Pruning

    As neural networks grow in size and complexity, inference speeds decline. To combat this, one of the most effective compression techniques -- channel pruning -- removes channels from weights. However, for multi-branch segments of a model, channel removal can introduce inference-time memory copies. In turn, these copies increase inference latency -- so much so that the pruned model can be slower than the unpruned model. As a workaround, pruners conventionally constrain certain channels to be pruned together. This fully eliminates memory copies but, as we show, significantly impairs accuracy. We now have a dilemma: remove constraints but increase latency, or add constraints and impair accuracy. In response, our insight is to reorder channels at export time, (1) reducing latency by reducing memory copies and (2) improving accuracy by removing constraints. Using this insight, we design a generic algorithm, UPSCALE, to prune models with any pruning pattern. By removing constraints from existing pruners, we improve ImageNet accuracy for post-training pruned models by 2.1 points on average -- benefiting DenseNet (+16.9), EfficientNetV2 (+7.9), and ResNet (+6.2). Furthermore, by reordering channels, UPSCALE improves inference speeds by up to 2x over a baseline export.
    Comment: 29 pages, 26 figures, accepted to ICML 2023.
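
    The sketch below is a toy illustration of the export-time reordering idea described above, not the UPSCALE implementation; the channel indices and permutation are hypothetical. It shows why unconstrained pruning forces a gather (a memory copy) at a branch point, and how a one-time channel permutation lets each branch take a zero-copy slice instead.

import numpy as np

# Toy NCHW activation produced by a layer that feeds two pruned branches.
x = np.random.rand(1, 8, 16, 16)

kept_a = [0, 2, 5, 7]   # channels branch A keeps after pruning (hypothetical)
kept_b = [1, 2, 6, 7]   # channels branch B keeps after pruning (hypothetical)

# Naive export of unconstrained pruning: each branch gathers its own channels
# at inference time, and fancy indexing copies the data on every forward pass.
branch_a_in = x[:, kept_a]
branch_b_in = x[:, kept_b]

# Reordering idea: pick a permutation in which each branch's kept channels form
# one contiguous block (A-only, shared, B-only, dropped). In a real export the
# producing layer's weights would be permuted once so its output already has
# this layout; here the permutation is applied to the activation to simulate it.
order = [0, 5, 2, 7, 1, 6, 3, 4]
x_reordered = x[:, order]

branch_a_view = x_reordered[:, 0:4]   # basic slice -> a view, no memory copy
branch_b_view = x_reordered[:, 2:6]   # overlapping slice covers the shared channels

print(branch_a_view.base is x_reordered, branch_b_view.base is x_reordered)  # True True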

    Prediction and optimization of a desulphurization system using CMAC neural network and genetic algorithm

    In this paper, taking the desulphurization ratio and economic cost as two objectives, a ten-input, two-output prediction model was built and validated for a desulphurization system. A cerebellar model articulation controller (CMAC) neural network and a genetic algorithm (GA) were used for model building and cost optimization, respectively. In the model-building process, grey relational entropy analysis and the uniform design method were used to screen the input variables and to study the model parameters, respectively. Traditional regression analysis and the proposed location-number analysis method were adopted to analyze the output errors of the experiment group and to predict the results of the test group. The results show that the regression analyses fit the experiment-group results closely, while their fitting accuracies for the test group vary considerably. For the location-number analysis, a power function relating output errors to location numbers fit both the experiment-group and test-group data for SO2 well. The prediction model was initialized with the location-number analysis method and then validated, and a cost-optimization case was subsequently carried out with the GA. The result shows that, under the same constraints, the optimal cost obtained from the GA is more than 30% lower than that of the original optimal operating parameters.
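
    As a rough sketch of the optimization step, a genetic algorithm can search the operating parameters of a fitted prediction model for the lowest predicted cost subject to a minimum desulphurization ratio. The surrogate below is a hypothetical stand-in for the trained CMAC model, and the parameter scaling, constraint, and GA settings are assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(42)

def surrogate(params):
    """Hypothetical stand-in for the trained ten-input, two-output prediction model.

    Takes operating parameters scaled to [0, 1] and returns
    (desulphurization_ratio, cost); the paper uses a CMAC neural network here.
    """
    ratio = 0.7 + 0.25 * params.mean()
    cost = 100.0 + 80.0 * np.sum(params ** 2)
    return ratio, cost

def fitness(params, min_ratio=0.9):
    ratio, cost = surrogate(params)
    penalty = 1e4 * max(0.0, min_ratio - ratio)   # penalize constraint violations
    return cost + penalty                         # lower is better

def genetic_algorithm(dim=10, pop_size=40, generations=200,
                      crossover_rate=0.9, mutation_rate=0.1):
    pop = rng.random((pop_size, dim))
    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        # Tournament selection: keep the better of two randomly paired individuals.
        i, j = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((scores[i] < scores[j])[:, None], pop[i], pop[j])
        # Uniform crossover between each parent and a shifted copy of the parent pool.
        mates = np.roll(parents, 1, axis=0)
        swap = rng.random((pop_size, dim)) < 0.5 * crossover_rate
        children = np.where(swap, mates, parents)
        # Gaussian mutation, clipped back to the [0, 1] box.
        mutate = rng.random((pop_size, dim)) < mutation_rate
        children = np.clip(children + mutate * rng.normal(0.0, 0.1, (pop_size, dim)), 0.0, 1.0)
        pop = children
    best = min(pop, key=fitness)
    return best, surrogate(best)

best_params, (best_ratio, best_cost) = genetic_algorithm()
print(f"predicted ratio={best_ratio:.3f}, predicted cost={best_cost:.1f}")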