
    Counterfactual Estimation and Optimization of Click Metrics for Search Engines

    Optimizing an interactive system against a predefined online metric is particularly challenging when the metric is computed from user feedback such as clicks and payments. The key challenge is the counterfactual nature of the problem: in the case of Web search, any change to a component of the search engine may result in a different search result page for the same query, but we normally cannot infer reliably from search logs how users would react to the new result page. Consequently, it appears impossible to accurately estimate online metrics that depend on user feedback unless the new engine is run to serve users and compared with a baseline in an A/B test. This approach, while valid and successful, is unfortunately expensive and time-consuming. In this paper, we propose to address this problem using causal inference techniques under the contextual-bandit framework. This approach effectively allows one to run (potentially infinitely) many A/B tests offline from search logs, making it possible to estimate and optimize online metrics quickly and inexpensively. Focusing on an important component in a commercial search engine, we show how these ideas can be instantiated and applied, and obtain very promising results that suggest the wide applicability of these techniques.
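The abstract above does not say which estimator the authors use. As a minimal sketch of the general idea of evaluating a new policy offline from logged data, the snippet below implements the standard inverse propensity scoring (IPS) estimator for contextual bandits. The record layout, the function name `ips_estimate`, and the toy log are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical log record: (context, action shown, probability the logging
# policy chose that action, observed reward such as a click indicator).
LogRecord = Tuple[str, str, float, float]


def ips_estimate(
    logs: Sequence[LogRecord],
    target_policy: Callable[[str, str], float],
) -> float:
    """Estimate the average reward of target_policy from logged data.

    target_policy(context, action) returns the probability that the new
    policy would choose `action` in `context`.
    """
    if not logs:
        raise ValueError("logs must not be empty")
    total = 0.0
    for context, action, propensity, reward in logs:
        if propensity <= 0.0:
            raise ValueError("logging propensities must be positive")
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        weight = target_policy(context, action) / propensity
        total += weight * reward
    return total / len(logs)


# Toy log: the logging policy picked "a" or "b" uniformly at random.
toy_logs = [
    ("q1", "a", 0.5, 1.0),
    ("q1", "b", 0.5, 0.0),
    ("q2", "a", 0.5, 0.0),
    ("q2", "b", 0.5, 1.0),
]


def always_a(context: str, action: str) -> float:
    """A deterministic candidate policy that always shows action "a"."""
    return 1.0 if action == "a" else 0.0


estimate = ips_estimate(toy_logs, always_a)
```

The estimate is only unbiased when the logging policy gave every action the new policy might take a nonzero probability, which is why randomized logging matters for this kind of offline evaluation.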

    A Learning-Based Guidance Selection Mechanism for a Formally Verified Sense and Avoid Algorithm

    This paper describes a learning-based strategy for selecting conflict avoidance maneuvers for autonomous unmanned aircraft systems. The selected maneuvers are provided by a formally verified algorithm and are guaranteed to solve any impending conflict under general assumptions about aircraft dynamics. The decision-making logic that selects the appropriate maneuvers is encoded in a stochastic policy encapsulated as a neural network. The network's parameters are optimized to maximize a reward function. The reward function penalizes loss of separation with other aircraft while rewarding resolutions that result in minimum excursions from the nominal flight plan. This paper provides a description of the technique and presents preliminary simulation results.

    Finding kernel function for stock market prediction with support vector regression

    Stock market prediction is one of the most fascinating problems in stock market research. Accurate stock prediction is the biggest challenge in the investment industry because the distribution of stock data changes over time. Time series forecasting, Neural Networks (NN) and Support Vector Machines (SVM) are commonly used for stock price prediction. In this study, the data mining operation called time series forecasting is implemented. A large amount of stock data collected from the Kuala Lumpur Stock Exchange (KLSE) is used in experiments to test the validity of SVM regression. SVM is a new machine learning technique based on the principle of structural risk minimization, which has greater generalization ability and has proved successful in time series prediction. Two kernel functions, namely the Radial Basis Function and the polynomial kernel, are compared to find the more accurate prediction values. In addition, a backpropagation neural network is used to compare prediction performance. Several experiments are conducted and the experimental results are analysed. The results show that SVM with polynomial kernels provides a promising alternative tool for KLSE stock market prediction.
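For readers unfamiliar with the two kernels being compared, the sketch below writes out their standard textbook definitions in plain Python. The abstract does not report the kernel parameters used in the study, so `gamma`, `degree`, and `coef0` and their default values here are illustrative assumptions only.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1.0) -> float:
    """Radial Basis Function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(
    x: Sequence[float],
    y: Sequence[float],
    degree: int = 3,
    gamma: float = 1.0,
    coef0: float = 1.0,
) -> float:
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot_product = sum(a * b for a, b in zip(x, y))
    return (gamma * dot_product + coef0) ** degree


# Two hypothetical feature vectors, e.g. windows of lagged prices.
window_a = [1.0, 2.0]
window_b = [1.0, 0.0]

similarity_rbf = rbf_kernel(window_a, window_b)
similarity_poly = polynomial_kernel(window_a, window_b, degree=2)
```

The RBF kernel depends only on the distance between the two inputs, while the polynomial kernel depends on their inner product, so the two can rank the same pair of price windows quite differently.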

    The role of risk aversion in non-conscious decision making

    To what extent can people choose advantageously without knowing why they are making those choices? This hotly debated question has capitalized on the Iowa Gambling Task (IGT), in which people often learn to choose advantageously without appearing to know why. However, because the IGT is unconstrained in many respects, this finding remains debated and other interpretations are possible (e.g., risk aversion, ambiguity aversion, limits of working memory, or insensitivity to reward/punishment can explain the finding of the IGT). Here we devised an improved variant of the IGT in which the deck-payoff contingency switches after subjects repeatedly choose from a good deck, offering the statistical power of repeated within-subject measures based on learning the reward contingencies associated with each deck. We found that participants exhibited low confidence in their choices, as probed with post-decision wagering, despite high accuracy in selecting advantageous decks in the task, which is putative evidence for non-conscious decision making. However, such a behavioral dissociation could also be explained by risk aversion, a tendency to avoid risky decisions under uncertainty. By explicitly measuring risk aversion for each individual, we predicted subjects’ post-decision wagering using Bayesian modeling. We found that risk aversion indeed does play a role, but that it did not explain the entire effect. Moreover, independently measured risk aversion was uncorrelated with risk aversion exhibited during our version of the IGT, raising the possibility that the latter risk aversion may be non-conscious. Our findings support the idea that people can make optimal choices without being fully aware of the basis of their decision. We suggest that non-conscious decision making may be mediated by emotional feelings of risk that are based on mechanisms distinct from those that support cognitive assessment of risk

    Sustainability experiments in the agri-food system: uncovering the factors of new governance and collaboration success

    In recent years, research, society and industry have recognized the need to transform the agri-food system towards sustainability. Within this process, sustainability experiments play a crucial role in transforming its structure, culture and practices. In the literature, much attention is given to new business models, even though the transformation of conventional firms towards sustainability may offer opportunities to accelerate the transformation. Further acceleration could be achieved through collaboration of multiple actors across the agri-food system, but this calls for a systems approach. Therefore, we developed and applied a new sustainability experiment systems approach (SESA) consisting of an analytical framework that allows a reflective evaluation and cross-case analysis of multi-actor governance networks based on business and learning evaluation criteria. We performed a cross-case analysis of four agri-food sustainability experiments in Flanders to test and validate SESA. In doing so, we identified the key factors at the beginning of a sustainability experiment that determine the success of collaboration and its performance. Some of the key factors identified were risk sharing and the drivers to participate. We are convinced that these results may be used as an analytical tool for researchers, a tool to support and design new initiatives for policymakers, and a reflective tool for participating actors.

    R-UCB: a Contextual Bandit Algorithm for Risk-Aware Recommender Systems

    Mobile Context-Aware Recommender Systems can be naturally modelled as an exploration/exploitation trade-off (exr/exp) problem, where the system has to choose between maximizing its expected rewards using its current knowledge (exploitation) and learning more about the unknown user's preferences to improve its knowledge (exploration). This problem has been addressed by the reinforcement learning community, but existing approaches do not consider the risk level of the current user's situation, where it may be dangerous to recommend items the user may not desire if the risk level is high. In this paper we introduce an algorithm named R-UCB that considers the risk level of the user's situation to adaptively balance between exr and exp. The detailed analysis of the experimental results reveals several important discoveries about the exr/exp behaviour.
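The abstract gives no formula for R-UCB, so the following is not the authors' algorithm. It is only a hedged illustration of the idea of letting a risk level damp exploration: a standard UCB1 score whose exploration bonus is scaled by `1 - risk_level`. The function name `select_arm`, the `risk_level` range of 0 to 1, and the linear scaling are all assumptions made for this sketch.

```python
import math
from typing import Sequence


def select_arm(
    counts: Sequence[int],
    mean_rewards: Sequence[float],
    risk_level: float,
) -> int:
    """Pick an item index using a UCB1 score with a risk-damped bonus.

    risk_level is assumed to lie in [0, 1]: 0 means a safe situation
    (full UCB1 exploration), 1 means a critical situation (pure
    exploitation of the current mean estimates).
    """
    if not 0.0 <= risk_level <= 1.0:
        raise ValueError("risk_level must be in [0, 1]")
    # Try every item once so that each mean estimate is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total_pulls = sum(counts)
    exploration_scale = 1.0 - risk_level
    scores = [
        mean + exploration_scale * math.sqrt(2.0 * math.log(total_pulls) / n)
        for mean, n in zip(mean_rewards, counts)
    ]
    # Return the index of the highest score (first index on ties).
    return max(range(len(scores)), key=scores.__getitem__)


# Item 0 is well explored and slightly better; item 1 was tried once.
counts = [100, 1]
mean_rewards = [0.6, 0.5]

choice_when_safe = select_arm(counts, mean_rewards, risk_level=0.0)
choice_when_risky = select_arm(counts, mean_rewards, risk_level=1.0)
```

In this toy setting the low-risk call favours the rarely tried item because of its large exploration bonus, while the high-risk call falls back on the item with the best observed mean.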