
    Few-Shot Bayesian Imitation Learning with Logical Program Policies

    Humans can learn many novel tasks from a very small number (1--5) of demonstrations, in stark contrast to the data requirements of nearly tabula rasa deep learning methods. We propose an expressive class of policies, a strong but general prior, and a learning algorithm that, together, can learn interesting policies from very few examples. We represent policies as logical combinations of programs drawn from a domain-specific language (DSL), define a prior over policies with a probabilistic grammar, and derive an approximate Bayesian inference algorithm to learn policies from demonstrations. In experiments, we study five strategy games played on a 2D grid with one shared DSL. After a few demonstrations of each game, the inferred policies generalize to new game instances that differ substantially from the demonstrations. Our policy learning is 20--1,000x more data efficient than convolutional and fully convolutional policy learning and many orders of magnitude more computationally efficient than vanilla program induction. We argue that the proposed method is an apt choice for tasks that have scarce training data and feature significant, structured variation between task instances.
    Comment: AAAI 2020
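
    The recipe described above factors into three pieces that are easy to see in miniature: a space of candidate policies (logical combinations of DSL programs), a grammar-derived prior favoring short formulas, and a likelihood scoring each candidate against the demonstrations. Below is a minimal, hypothetical sketch of that recipe; the toy grid primitives, the conjunction-only policy class, and the noise model are illustrative assumptions, not the paper's DSL or its inference algorithm.

    import itertools
    import math

    # Toy DSL: primitives map a (state, action) pair to True/False.
    PRIMITIVES = {
        "at_wall":    lambda s, a: s["cell"] == "wall",
        "is_empty":   lambda s, a: s["cell"] == "empty",
        "move_right": lambda s, a: a == "right",
        "move_up":    lambda s, a: a == "up",
    }
    ACTIONS = ["up", "down", "left", "right"]

    def conjunctions(max_size):
        """Enumerate candidate policies as conjunctions of DSL primitives."""
        names = list(PRIMITIVES)
        for k in range(1, max_size + 1):
            yield from itertools.combinations(names, k)

    def log_prior(combo):
        # Grammar-style prior: each extra conjunct costs a factor of 1/|DSL|.
        return -len(combo) * math.log(len(PRIMITIVES))

    def log_likelihood(combo, demos, eps=1e-3):
        # The policy induces an action distribution: uniform over the
        # actions it accepts in a state, with eps mass left for the rest.
        total = 0.0
        for s, a in demos:
            accepted = [act for act in ACTIONS
                        if all(PRIMITIVES[n](s, act) for n in combo)]
            if a in accepted:
                total += math.log((1 - eps) / len(accepted))
            else:
                total += math.log(eps / len(ACTIONS))
        return total

    def map_policy(demos, max_size=2):
        """Maximum a posteriori conjunction given the demonstrations."""
        return max(conjunctions(max_size),
                   key=lambda c: log_prior(c) + log_likelihood(c, demos))

    demos = [({"cell": "empty"}, "right"), ({"cell": "empty"}, "right")]
    print(map_policy(demos))  # ('move_right',): simplest consistent rule

    Because the prior charges each extra conjunct, the simplest rule consistent with the demonstrations wins, mirroring the Occam-style tradeoff behind the few-shot generalization the abstract reports.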

    Does consistency predict accuracy of beliefs?: Economists surveyed about PSA

    Subjective beliefs and behavior regarding the Prostate Specific Antigen (PSA) test for prostate cancer were surveyed among attendees of the 2006 meeting of the American Economic Association. Logical inconsistency was measured in percentage deviations from a restriction imposed by Bayes’ Rule on pairs of conditional beliefs. Economists with inconsistent beliefs tended to be more accurate than average, and consistent Bayesians were substantially less accurate. Within a loss function framework, we look for and cannot find evidence that inconsistent beliefs cause economic losses. Subjective beliefs about cancer risks do not predict PSA testing decisions, but social influences do.
    Keywords: logical consistency, predictive accuracy, elicitation, non-Bayesian, ecological rationality
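
    The consistency measure has a simple reading: Bayes' Rule forces P(A|B) P(B) = P(B|A) P(A), so three of the four elicited beliefs imply the fourth, and the percentage gap between the implied and stated values quantifies inconsistency. A hedged sketch of such a check follows; the survey's exact deviation formula is not given in the abstract.

    def bayes_deviation(p_a_given_b, p_b_given_a, p_a, p_b):
        """Percentage deviation of a belief quadruple from Bayes' Rule.

        Bayes' Rule forces P(A|B) * P(B) == P(B|A) * P(A), so the elicited
        P(A|B) can be compared with the value the other three beliefs imply.
        """
        implied = p_b_given_a * p_a / p_b
        return 100.0 * abs(p_a_given_b - implied) / implied

    # Example: A = prostate cancer, B = elevated PSA. Stating P(A|B) = 0.30
    # when the other three beliefs imply 0.40 is a 25% deviation.
    print(bayes_deviation(0.30, 0.80, 0.10, 0.20))  # 25.0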

    Learning Neuro-symbolic Programs for Language Guided Robot Manipulation

    Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot. Prior approaches for this task possess one of the following limitations: (i) they rely on hand-coded symbols for concepts, limiting generalization beyond those seen during training [1]; (ii) they infer action sequences from instructions but require dense sub-goal supervision [2]; or (iii) they lack the semantics required for the deeper object-centric reasoning inherent in interpreting complex instructions [3]. In contrast, our approach handles linguistic as well as perceptual variations, is end-to-end trainable, and requires no intermediate supervision. The proposed model uses symbolic reasoning constructs that operate on a latent neural object-centric representation, allowing for deeper reasoning over the input scene. Central to our approach is a modular structure consisting of a hierarchical instruction parser and an action simulator to learn disentangled action representations. Our experiments on a simulated environment with a 7-DOF manipulator, consisting of instructions with varying numbers of steps and scenes with different numbers of objects, demonstrate that our model is robust to such variations and significantly outperforms baselines, particularly in the generalization settings. The code, dataset and experiment videos are available at https://nsrmp.github.io
    Comment: International Conference on Robotics and Automation (ICRA), 2023
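
    The modular structure described above, a hierarchical parser producing a symbolic program whose operators run over a (latent, object-centric) scene representation before an action simulator executes the steps, can be caricatured in a few lines. The sketch below is an illustrative assumption: the function names, the attribute-matching stand-in for neural object embeddings, and the toy action format are not the released nsrmp code.

    def parse(instruction):
        """Stand-in for the hierarchical instruction parser."""
        if "red block" in instruction and "tray" in instruction:
            return [("filter", {"color": "red", "type": "block"}),
                    ("move_onto", {"type": "tray"})]
        raise ValueError("cannot parse: " + instruction)

    def op_filter(scene, concept):
        # In the paper this step reasons over latent neural object
        # embeddings; plain attribute matching stands in for it here.
        return [o for o in scene
                if all(o.get(k) == v for k, v in concept.items())]

    def execute(program, scene):
        """Toy action simulator: emits pick-and-place actions."""
        selected, actions = scene, []
        for op, arg in program:
            if op == "filter":
                selected = op_filter(scene, arg)
            elif op == "move_onto":
                target = op_filter(scene, arg)[0]
                actions += [("pick", o["id"]) for o in selected]
                actions.append(("place", target["id"]))
        return actions

    scene = [{"id": 0, "color": "red",  "type": "block"},
             {"id": 1, "color": "blue", "type": "block"},
             {"id": 2, "color": "grey", "type": "tray"}]
    print(execute(parse("put the red block on the tray"), scene))
    # [('pick', 0), ('place', 2)]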