5 research outputs found

    Predictive Sampling: Real-time Behaviour Synthesis with MuJoCo

    We introduce MuJoCo MPC (MJPC), an open-source, interactive application and software framework for real-time predictive control, based on MuJoCo physics. MJPC allows the user to easily author and solve complex robotics tasks, and currently supports three shooting-based planners: derivative-based iLQG and Gradient Descent, and a simple derivative-free method we call Predictive Sampling. Predictive Sampling was designed as an elementary baseline, mostly for its pedagogical value, but turned out to be surprisingly competitive with the more established algorithms. This work does not present algorithmic advances, and instead prioritises performant algorithms, simple code, and accessibility of model-based methods via intuitive and interactive software. MJPC is available at github.com/deepmind/mujoco_mpc, and a video summary can be viewed at dpmd.ai/mjpc.

    RoboPianist: A Benchmark for High-Dimensional Robot Control

    We introduce a new benchmarking suite for high-dimensional control, targeted at testing high spatial and temporal precision, coordination, and planning, all with an underactuated system frequently making-and-breaking contacts. The proposed challenge is mastering the piano through bi-manual dexterity, using a pair of simulated anthropomorphic robot hands. We call it RoboPianist, and the initial version covers a broad set of 150 variable-difficulty songs. We investigate both model-free and model-based methods on the benchmark, characterizing their performance envelopes. We observe that while certain existing methods, when well-tuned, can achieve impressive levels of performance in certain aspects, there is significant room for improvement. RoboPianist provides a rich quantitative benchmarking environment, with human-interpretable results, high ease of expansion by simply augmenting the repertoire with new songs, and opportunities for further research, including in multi-task learning, zero-shot generalization, multimodal (sound, vision, touch) learning, and imitation. Supplementary information, including videos of our control policies, can be found at https://kzakka.com/robopianist

    Cost evaluation during decision making in patients at early stages of psychosis

    Jumping to conclusions during probabilistic reasoning is a cognitive bias reliably observed in psychosis, and linked to delusion formation. Although the reasons for this cognitive bias are unknown, one suggestion is that psychosis patients may view sampling information as more costly. However, previous computational modelling has provided evidence that patients with chronic schizophrenia jump to conclusions because of noisy decision making. We developed a novel version of the classical beads task, systematically manipulating the cost of information gathering in four blocks. For 31 individuals with early symptoms of psychosis and 31 healthy volunteers, we examined the numbers of ‘draws to decision’ when information sampling had no, a fixed, or an escalating cost. Computational modelling involved estimating a cost of information sampling parameter and a cognitive noise parameter. Overall, patients sampled less information than controls. However, group differences in numbers of draws became less prominent at higher cost trials, where less information was sampled. The attenuation of group difference was not due to floor effects, as in the most costly block participants sampled more information than an ideal Bayesian agent. Computational modelling showed that, in the condition with no objective cost to information sampling, patients attributed higher costs to information sampling than controls (Mann-Whitney U=289, p=0.007), with marginal evidence of differences in noise parameter estimates (t=1.86, df=60, p=0.07). In patients, individual differences in severity of psychotic symptoms were statistically significantly associated with higher cost of information sampling (rho=0.6, p=0.001) but not with more cognitive noise (rho=0.27, p=0.14); in controls, cognitive noise predicted aspects of schizotypy (preoccupation and distress associated with delusion-like ideation on the Peters Delusion Inventory).
    Using a psychological manipulation and computational modelling, we provide evidence that early psychosis patients jump to conclusions because of attributing higher costs to sampling information, not because of being primarily noisy decision makers.

    Language to Rewards for Robotic Skill Synthesis

    Large language models (LLMs) have demonstrated exciting progress in acquiring diverse new capabilities through in-context learning, ranging from logical reasoning to code-writing. Robotics researchers have also explored using LLMs to advance the capabilities of robotic control. However, since low-level robot actions are hardware-dependent and underrepresented in LLM training corpora, existing efforts in applying LLMs to robotics have largely treated LLMs as semantic planners or relied on human-engineered control primitives to interface with the robot. On the other hand, reward functions are shown to be flexible representations that can be optimized for control policies to achieve diverse tasks, while their semantic richness makes them suitable to be specified by LLMs. In this work, we introduce a new paradigm that harnesses this realization by utilizing LLMs to define reward parameters that can be optimized and accomplish a variety of robotic tasks. Using reward as the intermediate interface generated by LLMs, we can effectively bridge the gap between high-level language instructions or corrections and low-level robot actions. Meanwhile, combining this with a real-time optimizer, MuJoCo MPC, empowers an interactive behavior creation experience where users can immediately observe the results and provide feedback to the system. To systematically evaluate the performance of our proposed method, we designed a total of 17 tasks for a simulated quadruped robot and a dexterous manipulator robot. We demonstrate that our proposed method reliably tackles 90% of the designed tasks, while a baseline using primitive skills as the interface with Code-as-policies achieves 50% of the tasks. We further validated our method on a real robot arm where complex manipulation skills such as non-prehensile pushing emerge through our interactive system. Comment: https://language-to-reward.github.io

    Catalytic residues in hydrolases: analysis of methods designed for ligand-binding site prediction

    A comparison of eight tools applicable to ligand-binding site prediction is presented. The methods examined cover three types of approaches: the geometrical (CASTp, PASS, Pocket-Finder), the physicochemical (Q-SiteFinder, FOD) and the knowledge-based (ConSurf, SuMo, WebFEATURE). The accuracy of predictions was measured in reference to the catalytic residues documented in the Catalytic Site Atlas. The test was performed on a set comprising selected chains of hydrolases. The results were analysed with regard to size, polarity, secondary structure, and accessible solvent area of predicted sites, as well as parameters commonly used in machine learning (F-measure, MCC). The relative accuracies of predictions are presented in the ROC space, allowing determination of the optimal methods by means of the ROC convex hull. Additionally, a minimum expected cost analysis was performed. Both advantages and disadvantages of the eight methods are presented. A characterization of protein chains with respect to the level of difficulty in active site prediction is introduced. The main reasons for failures are discussed. Overall, SuMo offers the best performance, followed by FOD, while Pocket-Finder is the best method among the geometrical approaches.
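
    For readers unfamiliar with the two machine-learning measures named in the abstract above, the sketch below shows how F-measure and MCC are conventionally computed from confusion-matrix counts. It assumes the standard F1 and Matthews correlation coefficient definitions and residue-level counts (predicted residues versus annotated catalytic residues); the abstract does not state the exact variant used, and the function names `f_measure` and `mcc` are hypothetical.

```python
import math


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Standard F1 score: harmonic mean of precision and recall.

    Assumption: the F-measure in the abstract is the balanced F1 variant.
    Returns 0.0 when there are no positives at all (undefined case).
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to +1.

    Returns 0.0 when any marginal total is zero (undefined case),
    which is a common convention rather than part of the definition.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for one protein chain, for illustration only.
    print(f_measure(tp=2, fp=1, fn=1))
    print(mcc(tp=2, tn=5, fp=1, fn=1))
```

    Because catalytic residues are a small fraction of any chain, MCC is often preferred over plain accuracy here: it uses all four counts and is not inflated by the large number of true negatives.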