33 research outputs found

    Best-Arm Identification in Linear Bandits

    We study the best-arm identification problem in linear bandits, where the rewards of the arms depend linearly on an unknown parameter $\theta^*$ and the objective is to return the arm with the largest reward. We characterize the complexity of the problem and introduce sample allocation strategies that pull arms to identify the best arm with a fixed confidence, while minimizing the sample budget. In particular, we show the importance of exploiting the global linear structure to improve the estimate of the reward of near-optimal arms. We analyze the proposed strategies and compare their empirical performance. Finally, as a by-product of our analysis, we point out the connection to the G-optimality criterion used in optimal experimental design. Comment: In Advances in Neural Information Processing Systems 27 (NIPS), 201
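The G-optimality criterion mentioned in the abstract can be illustrated with a small sketch. It assumes the standard textbook form of the criterion, the largest value of $x^\top A^{-1} x$ over the arms, where $A$ is the design matrix of an allocation; the toy arm set, the function names, and the restriction to two dimensions are illustrative assumptions, not taken from the paper.

```python
# Sketch: G-optimality value of a sample allocation in a 2-D linear bandit.
# The arm set, helper names, and 2x2 restriction are illustrative assumptions.

def design_matrix(arms, counts):
    """A = sum_i counts[i] * x_i x_i^T for arm feature vectors x_i."""
    dim = len(arms[0])
    A = [[0.0] * dim for _ in range(dim)]
    for x, n in zip(arms, counts):
        for r in range(dim):
            for c in range(dim):
                A[r][c] += n * x[r] * x[c]
    return A

def inverse_2x2(A):
    """Closed-form inverse of a 2x2 matrix (assumes it is invertible)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def g_value(arms, counts):
    """Largest x^T A^{-1} x over the arms: worst-case prediction variance
    of the least-squares estimate, up to the noise scale."""
    A_inv = inverse_2x2(design_matrix(arms, counts))

    def quad(x):
        return sum(x[r] * A_inv[r][c] * x[c] for r in range(2) for c in range(2))

    return max(quad(x) for x in arms)

arms = [(1.0, 0.0), (0.0, 1.0)]
balanced = g_value(arms, [5, 5])  # ten pulls split evenly
skewed = g_value(arms, [9, 1])    # ten pulls concentrated on one arm
```

With the same budget of ten pulls, the balanced allocation gives a smaller worst-case variance than the skewed one, which is the quantity a G-optimal design tries to minimize.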

    Influence of the characteristics of the experimental data set used to identify anisotropy parameters

    This work presents an investigation into the effect of the number and type of experimental input data used in parameter identification of Hill’48, Barlat’91 (Yld91) and Cazacu and Barlat’2001 (CB2001) yield criteria on the accuracy of the finite element simulation results. Different sets of experimental data are used to identify the anisotropy parameters of two metal sheets, exhibiting different anisotropic behaviour and hardening characteristics: a mild steel (DC06) and an aluminium alloy (AA6016-T4). Although it has been shown that the CB2001 yield criterion can lead to an accurate description of anisotropic behaviour of metallic sheets, its calibration requires a large set of experimental input data. A calibration procedure is proposed for CB2001 based on a reduced set of experimental data, i.e. where the results are limited to three uniaxial tensile tests, combined with artificial data obtained using the Barlat’91 yield criterion. Evaluation of the predictive capacity of the studied yield criteria, calibrated using different sets of experimental data, is made by comparing finite element simulation results with experimental results for the deep drawing of a cross-shaped part. A satisfying agreement is observed between experimental and numerical thickness distributions, with a negligible effect of the number and type of experimental data for the Hill’48 and Yld91 yield criteria. On the contrary, CB2001 calibration is quite sensitive to the experimental data available, particularly biaxial values. 
Nevertheless, CB2001 calibration based on the combination of effective and artificial experimental data achieves satisfying results, which in the worst case are similar to the ones obtained with the Yld91. The authors gratefully acknowledge the financial support of the Portuguese Foundation for Science and Technology (FCT) via the projects PTDC/EMS-TEC/1805/2012 and PEst-C/EME/UI0285/2013 and by FEDER funds through the program COMPETE – Programa Operacional Factores de Competitividade, under the project CENTRO-07-0224-FEDER-002001 (MT4MOBI). The first author is also grateful for the Post-Doc grant.

    Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge

    Motivation: Precision medicine requires the ability to predict the efficacies of different treatments for a given individual using high-dimensional genomic measurements. However, identifying predictive features remains a challenge when the sample size is small. Incorporating expert knowledge offers a promising approach to improve predictions, but collecting such knowledge is laborious if the number of candidate features is very large. Results: We introduce a probabilistic framework to incorporate expert feedback about the impact of genomic measurements on the outcome of interest and present a novel approach to collect the feedback efficiently, based on Bayesian experimental design. The new approach outperformed other recent alternatives in two medical applications: prediction of metabolic traits and prediction of sensitivity of cancer cells to different drugs, both using genomic features as predictors. Furthermore, the intelligent approach to collect feedback reduced the workload of the expert to approximately 11%, compared to a baseline approach.

    Allocation séquentielle de ressources dans le modèle de bandit linéaire

    This thesis is dedicated to the study of resource allocation problems in uncertain environments, where an agent can sequentially select which action to take. After each step, the environment returns a noisy observation of the value of the selected action. These observations guide the agent in adapting his resource allocation strategy towards reaching a given objective. In the most typical setting of this kind, the stochastic multi-armed bandit (MAB), it is assumed that each observation is drawn from an unknown probability distribution associated with the selected action and gives no information on the expected value of the other actions. The MAB setting has been widely studied and optimal allocation strategies were proposed to solve various objectives under the MAB assumptions, notably for the case where the agent's goal is to maximize the sum of the observations. Here, we consider a variant of the MAB setting where there exists a global linear structure in the environment and by selecting an action, the agent also gathers information on the value of the other actions. Therefore, the agent needs to adapt his resource allocation strategy to exploit the structure in the environment. In particular, we study the design of sequences of actions that the agent should take to reach objectives such as: (i) identifying the best value with a fixed confidence and using a minimum number of pulls, or (ii) minimizing the prediction error on the value of each action. In addition, we investigate how the knowledge gathered by a bandit algorithm in a given environment can be transferred to improve the performance in other similar environments.

    Implementing Linear Bandits in Off-the-Shelf SQLite

    The linear multi-armed bandit is a reinforcement learning model that is widely used for sequential decision making in applications such as online advertising and recommender systems. We show that LinUCB, a well-known cumulative reward maximization algorithm for linear bandits, can be implemented in off-the-shelf SQLite. Additionally, our empirical study shows that, when dealing with small bandit data, our SQLite implementation is faster than an implementation in off-the-shelf Python. We believe that our findings open the door for many promising research directions on the topic of in-DBMS federated learning because (i) in the federated learning paradigm, many data owners contribute to the same learning task while locally storing their small data, and (ii) SQLite is a DBMS embedded in billions of devices, hence being able to implement federated learning on top of SQLite is of great practical interest.
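For readers unfamiliar with LinUCB, the following is a minimal pure-Python sketch of the algorithm with a single shared parameter vector. It is not the authors' SQLite implementation and makes no claim about how theirs is structured; the class name, the `alpha` and `reg` parameters, the toy arm set, and the simulated rewards are all assumptions made for illustration. The inverse design matrix is maintained with the Sherman-Morrison rank-one update so that no linear-algebra library is needed.

```python
import math
import random

class LinUCB:
    """Illustrative LinUCB with one shared parameter vector (hypothetical helper)."""

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.dim = dim
        self.alpha = alpha  # width of the confidence bonus (assumed tuning knob)
        # A_inv is the inverse of A = reg * I + sum of x x^T over pulled arms.
        self.A_inv = [[(1.0 / reg if r == c else 0.0) for c in range(dim)]
                      for r in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * x

    def _matvec(self, x):
        return [sum(row[c] * x[c] for c in range(self.dim)) for row in self.A_inv]

    def theta(self):
        """Ridge-regression estimate A^{-1} b of the unknown parameter."""
        return self._matvec(self.b)

    def select(self, arms):
        """Index of the arm with the largest upper confidence bound."""
        theta = self.theta()

        def ucb(x):
            mean = sum(t * xi for t, xi in zip(theta, x))
            Ax = self._matvec(x)
            width = math.sqrt(max(sum(xi * v for xi, v in zip(x, Ax)), 0.0))
            return mean + self.alpha * width

        return max(range(len(arms)), key=lambda i: ucb(arms[i]))

    def update(self, x, reward):
        # Sherman-Morrison: (A + x x^T)^{-1} = A^{-1} - (A^{-1} x)(A^{-1} x)^T / (1 + x^T A^{-1} x)
        Ax = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, Ax))
        for r in range(self.dim):
            for c in range(self.dim):
                self.A_inv[r][c] -= Ax[r] * Ax[c] / denom
        for r in range(self.dim):
            self.b[r] += reward * x[r]

# Toy simulation: three fixed arms, Gaussian noise, hidden parameter true_theta.
rng = random.Random(0)
true_theta = (1.0, -0.5)
arms = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
agent = LinUCB(dim=2, alpha=1.0)
pulls = [0] * len(arms)
for _ in range(500):
    i = agent.select(arms)
    x = arms[i]
    reward = sum(t * xi for t, xi in zip(true_theta, x)) + rng.gauss(0.0, 0.1)
    agent.update(x, reward)
    pulls[i] += 1
```

In this toy setting the first arm has the highest expected reward, so `pulls` should concentrate on index 0 as the confidence widths shrink.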

    Combining Users Feedback as a Source of Knowledge for Feature Selection in Regression

    In many machine learning applications we face the curse of dimensionality: the number of features is too large while the number of training examples is too small, so selecting relevant features is critical before applying any prediction model to such small data. To extract the maximum information from the data in this setting, a possible and interesting source of useful information is the experts of the field. Previous work introduced the benefits of querying one expert on the relevance and weights of the coefficients in sparse linear regression and showed how this feedback improves the prediction. But this method rests on the assumption that the voter has sufficient knowledge. Naturally, it is very hard to ensure the level of accuracy, which exposes the feedback to the risk of being biased. We define a way to calculate the accuracy of a user's feedback and propose two methods to combine the votes of multiple users on the relevance of features, calculate the accuracy of the final votes, and study the reduction in the prediction error in a synthetic setting.
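The abstract does not say which two combination methods are proposed, so the sketch below shows only one generic possibility: an accuracy-weighted vote in which each user's vote counts with a log-odds weight. The function name, the vote encoding, and the accuracy values are assumptions for illustration and should not be read as the paper's method.

```python
import math

def combine_votes(votes, accuracies):
    """Combine binary relevance votes with log-odds weights (illustrative only).

    votes[u][j] is 1 if user u marks feature j as relevant, else 0.
    accuracies[u] is an assumed probability in (0.5, 1) that user u votes correctly.
    Returns one combined 0/1 vote per feature.
    """
    n_features = len(votes[0])
    weights = [math.log(a / (1.0 - a)) for a in accuracies]
    combined = []
    for j in range(n_features):
        # Relevant votes add the user's weight, irrelevant votes subtract it.
        score = sum(w if v[j] == 1 else -w for v, w in zip(votes, weights))
        combined.append(1 if score > 0 else 0)
    return combined

votes = [
    [1, 0, 1],  # user A
    [0, 0, 1],  # user B
    [0, 1, 1],  # user C
]
accuracies = [0.9, 0.6, 0.6]  # hypothetical per-user accuracies
result = combine_votes(votes, accuracies)
```

On the first feature a simple majority would side with users B and C, whereas the weighted vote follows the single more accurate user A, which shows why an estimate of per-user accuracy matters when combining feedback.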

    Probabilistic Expert Knowledge Elicitation of Feature Relevances in Sparse Linear Regression
