61 research outputs found
Dynamic Manipulation of Flexible Objects with Torque Sequence Using a Deep Neural Network
For dynamic manipulation of flexible objects, we propose a method that acquires
a motion equation model of a flexible object using a deep neural network, and a
control method that realizes a target state by calculating an optimized
time-series joint torque command. The proposed method requires no physics model
of the target object, and the object can be controlled as intended. We applied
this method to manipulations of a rigid object, a flexible object with and
without environmental contact, and a cloth, and verified its effectiveness.
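The idea of optimizing a torque sequence through a forward model can be
illustrated with a minimal sketch. Everything here is an assumption made for
illustration: the abstract does not give the network or the optimizer, so a
hand-written unit point mass (`forward_model`) stands in for the learned motion
model, and plain gradient descent with numerical gradients stands in for
whatever optimization the authors use.

```python
def forward_model(state, torque, dt=0.1):
    """Hypothetical stand-in for the learned model: a unit point mass."""
    pos, vel = state
    vel = vel + torque * dt
    pos = pos + vel * dt
    return (pos, vel)


def rollout(torques, state=(0.0, 0.0)):
    """Apply the torque sequence step by step and return the final state."""
    for u in torques:
        state = forward_model(state, u)
    return state


def loss(torques, target):
    """Squared distance between the final state and the target state."""
    pos, vel = rollout(torques)
    return (pos - target[0]) ** 2 + (vel - target[1]) ** 2


def optimize_torques(target, horizon=10, steps=300, lr=5.0, eps=1e-5):
    """Gradient descent on the torque sequence using central differences."""
    torques = [0.0] * horizon
    for _ in range(steps):
        grad = []
        for i in range(horizon):
            up = list(torques)
            down = list(torques)
            up[i] += eps
            down[i] -= eps
            grad.append((loss(up, target) - loss(down, target)) / (2 * eps))
        torques = [u - lr * g for u, g in zip(torques, grad)]
    return torques


if __name__ == "__main__":
    target_state = (0.5, 0.0)  # reach position 0.5 and come to rest
    best = optimize_torques(target_state)
    print(rollout(best))
```

With a neural network as the forward model, the numerical gradient would
typically be replaced by automatic differentiation, but the loop structure
stays the same: roll out, compare with the target state, update the torques.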
Daily Assistive Modular Robot Design Based on Multi-Objective Black-Box Optimization
The range of robot activities is expanding from industries with fixed
environments to diverse and changing environments, such as nursing care support
and daily life support. In particular, autonomous construction of robots that
are personalized for each user and task is required. Therefore, we develop an
actuator module that can be reconfigured to various link configurations, can
carry heavy objects using a locking mechanism, and can be easily operated by
human teaching using a releasing mechanism. Given multiple target coordinates,
a modular robot configuration that satisfies these coordinates and minimizes
the required torque is automatically generated by Tree-structured Parzen
Estimator (TPE), a type of black-box optimization. Based on the obtained
results, we show that the robot can be reconfigured to perform various
functions such as moving monitors and lights, serving food, and so on.
Comment: Accepted at IROS2023. Website:
https://haraduka.github.io/auto-modular-design
A method for Selecting Scenes and Emotion-based Descriptions for a Robot's Diary
In this study, we examined scene selection methods and emotion-based
descriptions for a robot's daily diary. We proposed a scene selection method
and an emotion description method that take into account semantic and affective
information, and created several types of diaries. Experiments were conducted
to examine the change in sentiment values and preference of each diary, and it
was found that the robot's feelings and impressions changed more from date to
date when scenes were selected using the affective captions. Furthermore, we
found that the robot's emotion generally improves the preference of the robot's
diary regardless of the scene it describes. However, presenting negative or
mixed emotions at once may decrease the preference of the diary or reduce the
robot's robot-likeness, and thus the method of presenting emotions still needs
further investigation.
Comment: 6 pages, 5 figures, ROMAN 202
Recognition of Heat-Induced Food State Changes by Time-Series Use of Vision-Language Model for Cooking Robot
Cooking tasks are characterized by large changes in the state of the food,
which is one of the major challenges in robot execution of cooking tasks. In
particular, cooking using a stove to apply heat to the foodstuff causes many
special state changes that are not seen in other tasks, making it difficult to
design a recognizer. In this study, we propose a unified method for recognizing
state changes during robot cooking by applying, in a time-series manner, a
vision-language model that can discriminate open-vocabulary objects. We collected
data on four typical state changes in cooking using a real robot and confirmed
the effectiveness of the proposed method. We also compared the conditions and
discussed the types of natural language prompts and the image regions that are
suitable for recognizing the state changes.
Comment: Accepted at IAS18-202
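One way to picture time-series use of per-frame scores is the sketch below. It
is an assumption for illustration, not the paper's procedure: the scores are
made-up numbers standing in for how well each frame matches a text prompt, and
the moving average with a fixed threshold is a generic smoothing choice. The
names `moving_average` and `detect_change` are hypothetical.

```python
def moving_average(scores, window):
    """Average each score with up to `window - 1` preceding scores."""
    smoothed = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def detect_change(scores, window=3, threshold=0.6):
    """Return the first frame index whose smoothed score reaches the threshold."""
    for i, value in enumerate(moving_average(scores, window)):
        if value >= threshold:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical per-frame scores; frame 2 is a one-off spike (noise).
    frame_scores = [0.1, 0.2, 0.9, 0.1, 0.2, 0.7, 0.8, 0.9, 0.9]
    print(detect_change(frame_scores, window=1))  # raw scores react to the spike
    print(detect_change(frame_scores, window=3))  # smoothing waits for a trend
```

Looking at several consecutive frames rather than one makes the decision less
sensitive to a single noisy frame, at the cost of a short detection delay.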
Binary State Recognition by Robots using Visual Question Answering of Pre-Trained Vision-Language Model
Recognition of the current state is indispensable for the operation of a
robot. There are various states to be recognized, such as whether an elevator
door is open or closed, whether an object has been grasped correctly, and
whether the TV is turned on or off. Until now, these states have been
recognized by programmatically describing the state of a point cloud or raw
image, by annotating and learning images, by using special sensors, etc. In
contrast to these methods, we apply Visual Question Answering (VQA) from a
Pre-Trained Vision-Language Model (PTVLM) trained on a large-scale dataset, to
such binary state recognition. This idea allows us to intuitively describe
state recognition in language without any re-training, thereby improving the
recognition ability of robots in a simple and general way. We summarize various
techniques in questioning methods and image processing, and clarify their
properties through experiments.
VQA-based Robotic State Recognition Optimized with Genetic Algorithm
State recognition of objects and environment in robots has been conducted in
various ways. In most cases, this is executed by processing point clouds,
learning images with annotations, and using specialized sensors. In contrast,
in this study, we propose a state recognition method that applies Visual
Question Answering (VQA) in a Pre-Trained Vision-Language Model (PTVLM) trained
on a large-scale dataset. By using VQA, it is possible to intuitively
describe robotic state recognition in natural language. On the other hand,
there are various possible ways to ask about the same event, and the
performance of state recognition differs depending on the question. Therefore,
in order to improve the performance of state recognition using VQA, we search
for an appropriate combination of questions using a genetic algorithm. We show
that our system can recognize not only the open/closed state of a refrigerator
door and the on/off state of a display, but also the open/closed state of a
transparent door and the state of water, which have been difficult to recognize.
Comment: Accepted at ICRA202
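The search over question combinations can be sketched as a small genetic
algorithm. This is a minimal illustration under stated assumptions, not the
authors' implementation: the answer table `ANSWERS` and ground truth `LABELS`
are invented data, combining the selected questions by majority vote is an
assumed aggregation rule, and the GA settings (tournament selection, one-point
crossover, bit-flip mutation, elitism) are generic choices.

```python
import random

# Hypothetical data: ANSWERS[q][i] is True if question q was answered "yes"
# on image i. LABELS[i] is the true binary state of image i.
ANSWERS = [
    [True, True, False, False, True, True],
    [True, False, False, True, True, False],
    [False, True, False, False, True, True],
    [True, True, True, False, True, False],
    [True, True, False, False, False, False],
]
LABELS = [True, True, False, False, True, False]


def fitness(mask, answers=ANSWERS, labels=LABELS):
    """Accuracy of a majority vote over the questions selected by `mask`."""
    chosen = [row for keep, row in zip(mask, answers) if keep]
    if not chosen:
        return 0.0
    correct = 0
    for i, label in enumerate(labels):
        yes_votes = sum(row[i] for row in chosen)
        predicted = yes_votes * 2 > len(chosen)  # strict majority says "yes"
        correct += predicted == label
    return correct / len(labels)


def evolve(n_questions, pop_size=12, generations=20, mutation_rate=0.1, seed=0):
    """Search for a question subset (boolean mask) with high voting accuracy."""
    rng = random.Random(seed)
    # Start from "use every question" as a baseline, plus random subsets.
    population = [[True] * n_questions]
    while len(population) < pop_size:
        population.append([rng.random() < 0.5 for _ in range(n_questions)])
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:2]  # elitism: keep the two best unchanged
        while len(next_gen) < pop_size:
            a = max(rng.sample(population, 3), key=fitness)  # tournament
            b = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n_questions)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]  # bit-flip mutation
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    mask, accuracy = evolve(len(ANSWERS))
    print(mask, accuracy)
```

Because the all-questions mask is in the initial population and the best
individuals are carried over each generation, the returned combination is never
worse on this data than simply asking every question.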