
    Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation

    The goal of counterfactual learning for statistical machine translation (SMT) is to optimize a target SMT system from logged data that consist of user feedback on translations that were predicted by another, historic SMT system. A challenge arises from the fact that risk-averse commercial SMT systems deterministically log the most probable translation. The lack of sufficient exploration of the SMT output space seemingly contradicts the theoretical requirements for counterfactual learning. We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning. This can be achieved by additive and multiplicative control variates that avoid degenerate behavior in empirical risk minimization. Our simulation experiments show improvements of up to 2 BLEU points by counterfactual learning from deterministic bandit feedback. Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017, Copenhagen, Denmark.
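The degenerate behavior mentioned in this abstract can be illustrated with a small sketch. It assumes that deterministic logging makes every logging propensity equal to 1, so the plain estimate is just the average of reward times target-policy probability, and it uses self-normalisation as one example of a multiplicative control variate. The function names and the toy numbers are hypothetical, and this is not necessarily the exact estimator used in the paper.

```python
from typing import Sequence


def plain_estimate(rewards: Sequence[float], target_probs: Sequence[float]) -> float:
    """Average of reward * target-policy probability over the log.

    Assumption: logging was deterministic, so no importance weights appear.
    """
    weighted = sum(r * p for r, p in zip(rewards, target_probs))
    return weighted / len(rewards)


def reweighted_estimate(rewards: Sequence[float], target_probs: Sequence[float]) -> float:
    """Self-normalised variant: divide by the sum of target probabilities.

    Scaling all probabilities up together no longer changes the value.
    """
    weighted = sum(r * p for r, p in zip(rewards, target_probs))
    return weighted / sum(target_probs)


rewards = [0.2, 0.8]  # hypothetical feedback for two logged translations

# Raising the probability of every logged output inflates the plain estimate,
# even though the policy has not learned to prefer the better translation.
low_plain = plain_estimate(rewards, [0.1, 0.1])
high_plain = plain_estimate(rewards, [0.9, 0.9])

# The reweighted estimate is unchanged by that uniform scaling and only
# improves when probability mass shifts toward the higher-reward output.
low_rw = reweighted_estimate(rewards, [0.1, 0.1])
high_rw = reweighted_estimate(rewards, [0.9, 0.9])
shifted_rw = reweighted_estimate(rewards, [0.1, 0.9])
```

In this toy log, the plain objective can be increased without distinguishing good from bad translations, which is the kind of degenerate solution a control variate is meant to rule out.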

    Identifying reliable traits across laboratory mouse exploration arenas: A meta-analysis

    This study is a meta-analysis of 367 mice from a collection of behavioural neuroscience and behaviour genetics studies run in the same lab in Zurich, Switzerland. We employed correlation-based statistics to confirm and quantify consistencies in behaviour across the testing environments. All 367 mice ran exactly the same behavioural arenas: the light/dark box, the null maze, the open field arena, an emergence task and finally an object exploration task. We analysed the consistency of three movement types across those arenas (resting, scanning, progressing), and the mice's relative preference for three zones of the arenas (home, transition, exploration). Five of the six measures showed strong individual-differences consistency across the tests. Mean inter-arena correlations for these five measures ranged from +.12 to +.53. Unrotated principal component factor analysis (UPCFA) and Cronbach’s alpha measures showed these traits to be reliable and substantial (32-63% of variance across the five arenas). UPCFA loadings then indicate which tasks give the best information about these cross-task traits. One measure (that of time spent in “intermediate” zones) was not reliable across arenas. Conclusions centre on the use of individual differences research and behavioural batteries to revise understandings of what measures in one task predict for behaviour in others. Developing better behaviour measures also makes sound scientific and ethical sense.
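As a reminder of what a Cronbach's alpha reliability figure measures, the standard formula is alpha = k/(k-1) * (1 - sum of per-item variances / variance of the row totals), where k is the number of items (here, arenas). The sketch below applies it to made-up scores; the function name and the data are hypothetical and are not taken from the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row per animal and one column per arena."""
    k = len(scores[0])
    # Sample variance of each arena's scores across animals.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each animal's total score across all arenas.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: three animals that rank identically in every arena.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
# Hypothetical data: rankings partly disagree between two arenas.
noisy = [[1, 2], [2, 1], [3, 3]]

alpha_consistent = cronbach_alpha(consistent)
alpha_noisy = cronbach_alpha(noisy)
```

Perfectly consistent rankings give an alpha of 1, and disagreement between arenas pulls the value down.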

    Closed-loop optimization of fast-charging protocols for batteries with machine learning.

    Simultaneously optimizing many design parameters in time-consuming experiments causes bottlenecks in a broad range of scientific and engineering disciplines [1,2]. One such example is process and control optimization for lithium-ion batteries during materials selection, cell manufacturing and operation. A typical objective is to maximize battery lifetime; however, conducting even a single experiment to evaluate lifetime can take months to years [3-5]. Furthermore, both large parameter spaces and high sampling variability [3,6,7] necessitate a large number of experiments. Hence, the key challenge is to reduce both the number and the duration of the experiments required. Here we develop and demonstrate a machine learning methodology to efficiently optimize a parameter space specifying the current and voltage profiles of six-step, ten-minute fast-charging protocols for maximizing battery cycle life, which can alleviate range anxiety for electric-vehicle users [8,9]. We combine two key elements to reduce the optimization cost: an early-prediction model [5], which reduces the time per experiment by predicting the final cycle life using data from the first few cycles, and a Bayesian optimization algorithm [10,11], which reduces the number of experiments by balancing exploration and exploitation to efficiently probe the parameter space of charging protocols. Using this methodology, we rapidly identify high-cycle-life charging protocols among 224 candidates in 16 days (compared with over 500 days using exhaustive search without early prediction), and subsequently validate the accuracy and efficiency of our optimization approach. Our closed-loop methodology automatically incorporates feedback from past experiments to inform future decisions and can be generalized to other applications in battery design and, more broadly, other scientific domains that involve time-intensive experiments and multi-dimensional design spaces.
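The closed-loop idea (test a batch, update beliefs from cheap early predictions, pick the next batch) can be sketched as follows. This is a deliberately simplified stand-in: it treats candidates as independent and uses an upper-confidence-bound rule rather than the Bayesian optimization algorithm the abstract refers to, and the early prediction is simulated as the true value plus Gaussian noise. All names, parameters, and numbers are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def closed_loop_search(
    true_life: Sequence[float],
    noise_sd: float,
    rounds: int,
    batch_size: int,
    beta: float = 2.0,
    seed: int = 0,
) -> Tuple[int, List[int]]:
    """Return (index of best candidate by estimated life, tests run per candidate)."""
    rng = random.Random(seed)
    n = len(true_life)
    counts = [0] * n
    means = [0.0] * n

    def ucb(i: int) -> float:
        # Untested candidates get an infinite bonus so each is tried once (exploration).
        if counts[i] == 0:
            return math.inf
        # Estimated life plus an uncertainty bonus that shrinks with repeated tests.
        return means[i] + beta * noise_sd / math.sqrt(counts[i])

    for _ in range(rounds):
        # Choose the batch with the highest upper confidence bounds.
        batch = sorted(range(n), key=ucb, reverse=True)[:batch_size]
        for i in batch:
            # Simulated early prediction: a noisy but fast estimate of final cycle life.
            predicted = true_life[i] + rng.gauss(0.0, noise_sd)
            counts[i] += 1
            means[i] += (predicted - means[i]) / counts[i]  # running mean update

    tested = [i for i in range(n) if counts[i] > 0]
    best = max(tested, key=lambda i: means[i])
    return best, counts


# Hypothetical cycle lives for five charging protocols, with noise-free predictions.
best, counts = closed_loop_search([500, 900, 700, 1200, 800], noise_sd=0.0,
                                  rounds=3, batch_size=2)
```

With noise-free predictions and this budget, every candidate is tested once and the remaining slot goes back to the current leader, which is the exploration/exploitation trade-off in miniature.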