4,635 research outputs found

    Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems

    Full text link
    This paper is motivated by the problem of how to make robots fuse and transfer their experience so that they can effectively use prior knowledge and quickly adapt to new environments. To address this problem, we present a learning architecture for navigation in cloud robotic systems: Lifelong Federated Reinforcement Learning (LFRL). In this work, we propose a knowledge fusion algorithm for upgrading a shared model deployed on the cloud, and then introduce effective transfer learning methods in LFRL. LFRL is consistent with human cognitive science and fits well in cloud robotic systems. Experiments show that LFRL greatly improves the efficiency of reinforcement learning for robot navigation, and the cloud robotic system deployment shows that LFRL is capable of fusing prior knowledge. In addition, we release a cloud robotic navigation-learning website based on LFRL.
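
    The abstract does not spell out the fusion rule; as a rough illustration only, the sketch below shows a generic federated-averaging-style fusion of locally trained navigation policies into a shared cloud model. The function names, weighting scheme, and toy parameter shapes are assumptions for illustration, not the algorithm proposed in the paper.

```python
import numpy as np

def fuse_models(local_models, weights=None):
    """Fuse locally trained parameter sets into one shared cloud model.

    local_models: list of dicts mapping layer name -> np.ndarray
    weights:      optional per-robot weights (e.g. proportional to the
                  amount of navigation experience each robot collected)
    """
    if weights is None:
        weights = np.ones(len(local_models))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()

    fused = {}
    for name in local_models[0]:
        # Weighted average of each parameter tensor across robots.
        fused[name] = sum(w * m[name] for w, m in zip(weights, local_models))
    return fused

# Two robots with tiny toy "policy networks" (one weight matrix each).
robot_a = {"fc.weight": np.ones((2, 2)), "fc.bias": np.zeros(2)}
robot_b = {"fc.weight": 3 * np.ones((2, 2)), "fc.bias": np.ones(2)}

cloud_model = fuse_models([robot_a, robot_b], weights=[1, 1])
print(cloud_model["fc.weight"])  # -> [[2. 2.] [2. 2.]]
```

    Each robot would download the fused cloud model as its prior before adapting to a new environment, which is the transfer step the abstract alludes to.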

    Rapid Policy Learning Through Imitation Learning and Reinforcement Learning in Lifting Operations

    Get PDF
    In this research, the implementation and evaluation of a novel learning approach for autonomous crane operation called LOKI-G (Locally Optimal search after K-step Imitation - Generalized), using a Closed-form Continuous-time (CfC) Artificial Neural Network (ANN), was explored. The study revolved around addressing the sim-to-real gap by allowing the model to learn on the edge with minimal examples, mitigating the need for simulators. An emphasis was placed on creating a sparse, robust, reliable, and explainable model that could be trained for real-world applications. The research involved five experiments in which the model's performance under varying conditions was scrutinized. The model's response under baseline conditions, sensory deprivation, an altered target position, and object generalization provided significant insights into the model's capabilities and potential areas for improvement. The results demonstrated the CfC ANN's ability to learn the fundamental task with high accuracy, exhibiting reliable behaviour and excellent performance during zero-shot learning. The model, however, showed limitations with regard to understanding depth. These findings have significant implications for accelerating the development of autonomy in cranes, increasing industrial efficiency and safety, reducing carbon emissions, and paving the way for the wide-scale adoption of autonomous lifting operations.
    Future research directions suggest the potential of improving the model by optimizing hyperparameters, extending the model to multimodal operation, ensuring safety through the application of BarrierNet, and adopting new learning methods for faster convergence. Reflections on the importance of waiting during tasks and on the quantity and quality of training data also surfaced during the study. In conclusion, this work provides an experimental proof of concept and a springboard for future research into the development of adaptable, robust, and trustworthy AI models for autonomous industrial operations.
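
    As a purely hypothetical illustration of the two-stage idea suggested by the name LOKI-G (K-step imitation followed by locally optimal search), the sketch below warm-starts a tiny linear policy by behavioural cloning and then refines it with random local search. The toy dynamics, dimensions, and search procedure are placeholders, not the thesis implementation, which uses a CfC network trained on real crane data.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout_return(policy_w, episodes=5, horizon=20):
    """Toy stand-in for running the crane policy and scoring its behaviour."""
    total = 0.0
    for _ in range(episodes):
        state = rng.normal(size=4)
        for _ in range(horizon):
            action = np.tanh(policy_w @ state)            # 2-D action from a 4-D state
            state = 0.9 * state + 0.1 * np.concatenate([action, action])
            total -= float(np.sum(state ** 2))            # reward: drive the state to zero
    return total / episodes

# --- Stage 1: K-step imitation (least-squares behavioural cloning) --------
K = 200
demo_states = rng.normal(size=(K, 4))
expert_w = np.array([[0.5, 0.0, 0.0, 0.0],
                     [0.0, 0.5, 0.0, 0.0]])               # pretend "expert" demonstrator
demo_actions = demo_states @ expert_w.T
policy_w = np.linalg.lstsq(demo_states, demo_actions, rcond=None)[0].T

# --- Stage 2: locally optimal search around the imitated policy -----------
best_return = rollout_return(policy_w)
for _ in range(50):
    candidate = policy_w + 0.05 * rng.normal(size=policy_w.shape)
    candidate_return = rollout_return(candidate)
    if candidate_return > best_return:
        policy_w, best_return = candidate, candidate_return

print(f"return after imitation + local search: {best_return:.3f}")
```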

    On inferring intentions in shared tasks for industrial collaborative robots

    Get PDF
    Inferring human operators' actions in shared collaborative tasks plays a crucial role in enhancing the cognitive capabilities of industrial robots. In these incipient collaborative robotic applications, humans and robots should share not only the workspace but also forces and the execution of a task. In this article, we present a robotic system that is able to identify different human intentions and to adapt its behavior accordingly, using only force data. To accomplish this aim, three major contributions are presented: (a) force-based operator intent recognition, (b) a force-based dataset of physical human-robot interaction, and (c) validation of the whole system in a scenario inspired by a realistic industrial application. This work is an important step towards a more natural and user-friendly manner of physical human-robot interaction in scenarios where humans and robots collaborate in the accomplishment of a task.
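
    As an illustrative sketch only, intent recognition from force data can be framed as classifying short windows of force/torque measurements. The feature set, classifier, synthetic data, and intent labels below are assumptions for illustration and are not the dataset or method presented in the article.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
INTENTS = ["pull_towards", "push_away", "hold_still"]   # hypothetical labels

def window_features(wrench_window):
    """Summarise a (T, 6) window of force/torque samples into a feature vector."""
    return np.concatenate([wrench_window.mean(axis=0),
                           wrench_window.std(axis=0),
                           wrench_window.max(axis=0) - wrench_window.min(axis=0)])

# Synthetic stand-in data: 300 windows of 50 force/torque samples each,
# with a different force bias along the x-axis for each intent class.
X, y = [], []
for label, bias in zip(range(3), [(2, 0, 0), (-2, 0, 0), (0, 0, 0)]):
    for _ in range(100):
        window = rng.normal(size=(50, 6)) + np.r_[bias, 0, 0, 0]
        X.append(window_features(window))
        y.append(label)
X, y = np.array(X), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
print("predicted intent:", INTENTS[clf.predict(X_te[:1])[0]])
```

    The robot would run such a classifier on a sliding window of its wrist force/torque sensor and switch behaviours when the predicted intent changes.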

    Anticipating Daily Intention using On-Wrist Motion Triggered Sensing

    Full text link
    Anticipating human intention by observing one's actions has many applications. For instance, picking up a cellphone and then a charger (actions) implies that one wants to charge the cellphone (intention). By anticipating the intention, an intelligent system can guide the user to the closest power outlet. We propose an on-wrist motion-triggered sensing system for anticipating daily intentions, where the on-wrist sensors help us to persistently observe one's actions. The core of the system is a novel Recurrent Neural Network (RNN) and Policy Network (PN), where the RNN encodes visual and motion observations to anticipate intention, and the PN parsimoniously triggers the process of visual observation to reduce the computation requirement. We jointly train the whole network using policy gradient and cross-entropy losses. To evaluate, we collect the first daily "intention" dataset, consisting of 2379 videos with 34 intentions and 164 unique action sequences. Our method achieves 92.68%, 90.85%, and 97.56% accuracy on three users while processing only 29% of the visual observations on average.
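
    A minimal sketch of the two-branch idea, assuming PyTorch and toy dimensions: a recurrent model anticipates the intention while a small policy head stochastically decides at each step whether the visual features are consumed, trained with a cross-entropy term plus a REINFORCE-style term that penalizes camera usage. The architecture sizes, gating mechanism, and reward shaping below are simplified placeholders rather than the paper's exact model.

```python
import torch
import torch.nn as nn

NUM_INTENTIONS, MOTION_DIM, VISUAL_DIM, HIDDEN = 34, 6, 128, 64

class AnticipationModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRUCell(MOTION_DIM + VISUAL_DIM, HIDDEN)
        self.intent_head = nn.Linear(HIDDEN, NUM_INTENTIONS)   # intention logits
        self.policy_head = nn.Linear(HIDDEN, 2)                # trigger camera? yes/no

    def forward(self, motion_seq, visual_seq):
        """motion_seq: (T, B, MOTION_DIM), visual_seq: (T, B, VISUAL_DIM)."""
        B = motion_seq.size(1)
        h = motion_seq.new_zeros(B, HIDDEN)
        log_probs, triggers = [], []
        for t in range(motion_seq.size(0)):
            # Policy network decides (stochastically) whether to use the camera.
            dist = torch.distributions.Categorical(logits=self.policy_head(h))
            use_vision = dist.sample()
            log_probs.append(dist.log_prob(use_vision))
            triggers.append(use_vision.float())
            vis = visual_seq[t] * use_vision.float().unsqueeze(1)  # gated visual input
            h = self.rnn(torch.cat([motion_seq[t], vis], dim=1), h)
        return self.intent_head(h), torch.stack(log_probs), torch.stack(triggers)

model = AnticipationModel()
motion = torch.randn(10, 4, MOTION_DIM)       # 10 time steps, batch of 4
visual = torch.randn(10, 4, VISUAL_DIM)
labels = torch.randint(0, NUM_INTENTIONS, (4,))

logits, log_probs, triggers = model(motion, visual)
ce_loss = nn.functional.cross_entropy(logits, labels)
# REINFORCE-style term: reward correct anticipation, penalise camera usage.
reward = (logits.argmax(1) == labels).float() - 0.1 * triggers.mean(0)
pg_loss = -(log_probs * reward.detach().unsqueeze(0)).mean()
(ce_loss + pg_loss).backward()
```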

    Flexibly Instructable Agents

    Full text link
    This paper presents an approach to learning from situated, interactive tutorial instruction within an ongoing agent. Tutorial instruction is a flexible (and thus powerful) paradigm for teaching tasks because it allows an instructor to communicate whatever types of knowledge an agent might need in whatever situations might arise. To support this flexibility, however, the agent must be able to learn multiple kinds of knowledge from a broad range of instructional interactions. Our approach, called situated explanation, achieves such learning through a combination of analytic and inductive techniques. It combines a form of explanation-based learning that is situated for each instruction with a full suite of contextually guided responses to incomplete explanations. The approach is implemented in an agent called Instructo-Soar that learns hierarchies of new tasks and other domain knowledge from interactive natural language instructions. Instructo-Soar meets three key requirements of flexible instructability that distinguish it from previous systems: (1) it can take known or unknown commands at any instruction point; (2) it can handle instructions that apply to either its current situation or to a hypothetical situation specified in language (as in, for instance, conditional instructions); and (3) it can learn, from instructions, each class of knowledge it uses to perform tasks.
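
    To make the "known or unknown commands at any instruction point" requirement concrete, here is a toy, hypothetical sketch (not Instructo-Soar itself, which is built on the Soar architecture) of an agent that executes known commands, asks the instructor to decompose unknown ones, and retains the learned decomposition for later reuse.

```python
class InstructableAgent:
    """Toy agent: executes known commands, learns unknown ones from instruction."""

    def __init__(self):
        # Primitive commands the agent already knows how to perform.
        self.skills = {"grasp": lambda obj: print(f"grasping {obj}"),
                       "move-to": lambda obj: print(f"moving to {obj}")}
        self.learned_tasks = {}   # task name -> list of (sub-command, argument)

    def execute(self, command, arg):
        if command in self.skills:
            self.skills[command](arg)
        elif command in self.learned_tasks:
            for sub_cmd, sub_arg in self.learned_tasks[command]:
                self.execute(sub_cmd, sub_arg or arg)
        else:
            # Unknown command: ask the instructor and learn the decomposition.
            steps = self.ask_instructor(command)
            self.learned_tasks[command] = steps          # retained for next time
            self.execute(command, arg)

    def ask_instructor(self, command):
        print(f"I don't know how to '{command}'. Please instruct me.")
        # Stand-in for interactive natural-language instruction.
        return [("move-to", None), ("grasp", None)]

agent = InstructableAgent()
agent.execute("pick-up", "red block")   # learned from instruction, then executed
agent.execute("pick-up", "blue block")  # now a known task, no instruction needed
```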