402 research outputs found

    Daily Assistive Modular Robot Design Based on Multi-Objective Black-Box Optimization

    The range of robot activities is expanding from industries with fixed environments to diverse and changing environments, such as nursing care support and daily life support. In particular, autonomous construction of robots that are personalized for each user and task is required. Therefore, we develop an actuator module that can be reconfigured to various link configurations, can carry heavy objects using a locking mechanism, and can be easily operated by human teaching using a releasing mechanism. Given multiple target coordinates, a modular robot configuration that satisfies these coordinates and minimizes the required torque is automatically generated by Tree-structured Parzen Estimator (TPE), a type of black-box optimization. Based on the obtained results, we show that the robot can be reconfigured to perform various functions such as moving monitors and lights, serving food, and so on. Comment: Accepted at IROS2023, website - https://haraduka.github.io/auto-modular-design

    A method for Selecting Scenes and Emotion-based Descriptions for a Robot's Diary

    In this study, we examined scene selection methods and emotion-based descriptions for a robot's daily diary. We proposed a scene selection method and an emotion description method that take into account semantic and affective information, and created several types of diaries. Experiments were conducted to examine the change in sentiment values and preference of each diary, and it was found that the robot's feelings and impressions changed more from date to date when scenes were selected using the affective captions. Furthermore, we found that the robot's emotion generally improves the preference of the robot's diary regardless of the scene it describes. However, presenting negative or mixed emotions at once may decrease the preference of the diary or reduce the robot's robot-likeness, and thus the method of presenting emotions still needs further investigation. Comment: 6 pages, 5 figures, ROMAN 202

    Robotic Applications of Pre-Trained Vision-Language Models to Various Recognition Behaviors

    In recent years, a number of models that learn the relations between vision and language from large datasets have been released. These models perform a variety of tasks, such as answering questions about images, retrieving sentences that best correspond to images, and finding regions in images that correspond to phrases. Although there are some examples, the connection between these pre-trained vision-language models and robotics is still weak. If they are directly connected to robot motions, they lose their versatility due to the embodiment of the robot and the difficulty of data collection, and become inapplicable to a wide range of bodies and situations. Therefore, in this study, we categorize and summarize the methods to utilize the pre-trained vision-language models flexibly and easily in a way that the robot can understand, without directly connecting them to robot motions. We discuss how to use these models for robot motion selection and motion planning without re-training the models. We consider five types of methods to extract information understandable for robots, and show the results of state recognition, object recognition, affordance recognition, relation recognition, and anomaly detection based on the combination of these five methods. We expect that this study will add flexibility and ease-of-use, as well as new applications, to the recognition behavior of existing robots.

    Online Estimation of Self-Body Deflection With Various Sensor Data Based on Directional Statistics

    In this paper, we propose a method for online estimation of the robot's posture. Our method uses von Mises and Bingham distributions as probability distributions of joint angles and 3D orientation, which are used in directional statistics. We constructed a particle filter using these distributions and configured a system to estimate the robot's posture from various sensor information (e.g., joint encoders, IMU sensors, and cameras). Furthermore, unlike tangent space approximations, these distributions can handle global features and represent sensor characteristics as observation noises. As an application, we show that the yaw drift of a 6-axis IMU sensor can be represented probabilistically to prevent adverse effects on attitude estimation. For the estimation, we used an approximate model that assumes the actual robot posture can be reproduced by correcting the joint angles of a rigid body model. In the experiment part, we tested the estimator's effectiveness by examining that the joint angles generated with the approximate model can be estimated using the link pose of the same model. We then applied the estimator to the actual robot and confirmed that the gripper position could be estimated, thereby verifying the validity of the approximate model in our situation. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
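The idea of a particle filter built on a von Mises distribution can be illustrated with a minimal sketch for a single joint angle. This is not the authors' implementation: the function names, the concentration values, and the simulated encoder readings are all assumptions made for illustration, and the Bingham distribution for 3D orientation is not covered. It uses only the Python standard library.

```python
import math
import random


def wrap_angle(a):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def circular_mean(angles):
    """Mean direction of equally weighted angles (handles wraparound)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


def von_mises_filter_step(particles, measurement, kappa_process, kappa_obs, rng):
    """One predict / weight / resample cycle for a single joint angle."""
    # Predict: diffuse each particle with von Mises process noise.
    predicted = [rng.vonmisesvariate(p, kappa_process) for p in particles]
    # Weight: unnormalized von Mises likelihood of the measurement.
    # Subtracting 1 keeps the exponent <= 0, which avoids overflow.
    weights = [
        math.exp(kappa_obs * (math.cos(measurement - p) - 1.0))
        for p in predicted
    ]
    # Resample particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate_joint_angle(true_angle, steps=30, n_particles=500, seed=0):
    """Run the filter on simulated readings of a fixed (hypothetical) joint angle."""
    rng = random.Random(seed)
    kappa_process, kappa_obs = 200.0, 50.0  # assumed concentration parameters
    particles = [rng.uniform(-math.pi, math.pi) for _ in range(n_particles)]
    for _ in range(steps):
        # Simulated noisy encoder reading; a real system would read a sensor.
        z = rng.vonmisesvariate(true_angle, kappa_obs)
        particles = von_mises_filter_step(
            particles, z, kappa_process, kappa_obs, rng
        )
    return circular_mean(particles)


if __name__ == "__main__":
    estimate = estimate_joint_angle(3.0)
    print("estimated angle [rad]:", estimate)
    print("wrapped error [rad]:", wrap_angle(estimate - 3.0))
```

The true angle of 3.0 rad is chosen close to pi on purpose: because the likelihood and the mean are both computed on the circle, particles on either side of the ±pi boundary are treated as neighbors, which a naive arithmetic mean would get wrong.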

    Recognition of Heat-Induced Food State Changes by Time-Series Use of Vision-Language Model for Cooking Robot

    Cooking tasks are characterized by large changes in the state of the food, which is one of the major challenges in robot execution of cooking tasks. In particular, cooking using a stove to apply heat to the foodstuff causes many special state changes that are not seen in other tasks, making it difficult to design a recognizer. In this study, we propose a unified method for recognizing changes in the cooking state of robots by using the vision-language model that can discriminate open-vocabulary objects in a time-series manner. We collected data on four typical state changes in cooking using a real robot and confirmed the effectiveness of the proposed method. We also compared the conditions and discussed the types of natural language prompts and the image regions that are suitable for recognizing the state changes. Comment: Accepted at IAS18-202

    Binary State Recognition by Robots using Visual Question Answering of Pre-Trained Vision-Language Model

    Recognition of the current state is indispensable for the operation of a robot. There are various states to be recognized, such as whether an elevator door is open or closed, whether an object has been grasped correctly, and whether the TV is turned on or off. Until now, these states have been recognized by programmatically describing the state of a point cloud or raw image, by annotating and learning images, by using special sensors, etc. In contrast to these methods, we apply Visual Question Answering (VQA) from a Pre-Trained Vision-Language Model (PTVLM) trained on a large-scale dataset, to such binary state recognition. This idea allows us to intuitively describe state recognition in language without any re-training, thereby improving the recognition ability of robots in a simple and general way. We summarize various techniques in questioning methods and image processing, and clarify their properties through experiments.

    Automatic Diary Generation System including Information on Joint Experiences between Humans and Robots

    In this study, we propose an automatic diary generation system that uses information from past joint experiences with the aim of increasing the favorability for robots through shared experiences between humans and robots. For the verbalization of the robot's memory, the system applies a large-scale language model, which is a rapidly developing field. Since this model does not have memories of experiences, it generates a diary by receiving information from joint experiences. As an experiment, a robot and a human went for a walk and generated a diary with interaction and dialogue history. The proposed diary achieved high scores in comfort and performance in the evaluation of the robot's impression. In the survey of diaries giving more favorable impressions, diaries with information on joint experiences were selected higher than diaries without such information, because diaries with information on joint experiences showed more cooperation between the robot and the human and more intimacy from the robot. Comment: 12 pages, 5 figures, IAS-1

    HumanMimic: Learning Natural Locomotion and Transitions for Humanoid Robot via Wasserstein Adversarial Imitation

    Transferring human motion skills to humanoid robots remains a significant challenge. In this study, we introduce a Wasserstein adversarial imitation learning system, allowing humanoid robots to replicate natural whole-body locomotion patterns and execute seamless transitions by mimicking human motions. First, we present a unified primitive-skeleton motion retargeting to mitigate morphological differences between arbitrary human demonstrators and humanoid robots. An adversarial critic component is integrated with Reinforcement Learning (RL) to guide the control policy to produce behaviors aligned with the data distribution of mixed reference motions. Additionally, we employ a specific Integral Probabilistic Metric (IPM), namely the Wasserstein-1 distance with a novel soft boundary constraint to stabilize the training process and prevent model collapse. Our system is evaluated on a full-sized humanoid JAXON in the simulator. The resulting control policy demonstrates a wide range of locomotion patterns, including standing, push-recovery, squat walking, human-like straight-leg walking, and dynamic running. Notably, even in the absence of transition motions in the demonstration dataset, robots showcase an emerging ability to transit naturally between distinct locomotion patterns as desired speed changes.
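For reference, the Wasserstein-1 distance named in the abstract has a standard dual form (Kantorovich-Rubinstein duality), which is what makes it usable as a critic objective. The symbols below are assumptions chosen for illustration: P stands for the distribution of reference motions, Q for the distribution of motions produced by the policy, and f for a 1-Lipschitz critic. The paper's soft boundary constraint is not described in the abstract and is not reproduced here.

```latex
W_1(P, Q) \;=\; \sup_{\|f\|_{L} \le 1} \; \mathbb{E}_{x \sim P}\,[f(x)] \;-\; \mathbb{E}_{x \sim Q}\,[f(x)]
```

In this reading, the critic is trained to make the gap between the two expectations as large as possible, and the policy is trained to shrink it.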

    Muscle-Tendon Complex-Inspired Deformable Exteriors as a Wire-Drive Extension

    The 11th International Symposium on Adaptive Motion of Animals and Machines. Kobe University, Japan. 2023-06-06/09. Adaptive Motion of Animals and Machines Organizing Committee. Poster Session P5

    Semantic Scene Difference Detection in Daily Life Patroling by Mobile Robots using Pre-Trained Large-Scale Vision-Language Model

    It is important for daily life support robots to detect changes in their environment and perform tasks. In the field of anomaly detection in computer vision, probabilistic and deep learning methods have been used to calculate the image distance. These methods calculate distances by focusing on image pixels. In contrast, this study aims to detect semantic changes in the daily life environment using the current development of large-scale vision-language models. Using its Visual Question Answering (VQA) model, we propose a method to detect semantic changes by applying multiple questions to a reference image and a current image and obtaining answers in the form of sentences. Unlike deep learning-based methods in anomaly detection, this method does not require any training or fine-tuning, is not affected by noise, and is sensitive to semantic state changes in the real world. In our experiments, we demonstrated the effectiveness of this method by applying it to a patrol task in a real-life environment using a mobile robot, Fetch Mobile Manipulator. In the future, it may be possible to add explanatory power to changes in the daily life environment through spoken language. Comment: Accepted to 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)