25 research outputs found
Framework of active robot learning
A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, cognitive robots have become an attractive research area of Artificial Intelligence (AI). High-order beliefs of cognitive robots concern the robots' reasoning about their users' intentions and preferences. Existing approaches to developing such beliefs through machine learning rely on particular social cues or specifically defined reward functions, so their applications can be limited.
This study carried out primary research on active robot learning (ARL), which enables a robot to develop high-order beliefs by actively collecting and discovering the evidence it needs. The emphasis is on active learning, not teaching; hence social cues and reward functions are not necessary. In this study, the framework of ARL was developed. Fuzzy logic was employed in the framework for controlling the robot and for identifying high-order beliefs. A simulation environment was set up in which a human and a cognitive robot were modelled using MATLAB, and ARL was implemented through simulation.
Simulations were also performed in which the human and the robot tried to jointly lift a stick and keep it level. The simulation results show that, under the framework, a robot is able to discover the evidence it needs to confirm its user's intention.
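The kind of fuzzy control the framework employs can be sketched as follows. This is a minimal illustration, not the thesis code: the membership functions, tilt ranges, and output scale are assumptions chosen for the stick-levelling scenario.

```python
# Illustrative fuzzy controller for the joint stick-lifting task.
# All linguistic terms and ranges are assumptions, not the thesis's rule base.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_lift_correction(tilt_deg):
    """Map stick tilt (degrees; positive = robot end too low) to a force correction."""
    # Fuzzify the tilt into three linguistic terms.
    low = tri(tilt_deg, 0.0, 10.0, 20.0)      # robot end too low  -> push up
    level = tri(tilt_deg, -10.0, 0.0, 10.0)   # stick is level     -> no change
    high = tri(-tilt_deg, 0.0, 10.0, 20.0)    # robot end too high -> ease off
    # Defuzzify with a weighted average of the rule outputs (+1, 0, -1).
    num = low * 1.0 + level * 0.0 + high * -1.0
    den = low + level + high
    return num / den if den else 0.0
```

A tilt of zero yields no correction, while increasing tilt blends smoothly toward a full upward correction, which is the behaviour a crisp threshold controller would lack.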
Test moment determination design in active robot learning
A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, service robots have been increasingly used in people's daily lives.
These robots are autonomous or semi-autonomous and are able to cooperate with their human users. Active robot learning (ARL) is an approach to developing the robots' beliefs about their users' intentions and preferences, which the robots need for seamless cooperation with humans. This approach allows a robot to perform tests on its users and to build up high-order beliefs according to the users' responses.
This study carried out primary research on designing the test moment determination component of the ARL framework. This component decides the right moment to take a test action. In this study, an action plan theory was proposed to synthesize actions into a sequence, that is, an action plan, for a given task.
All actions are defined in a special format of precondition, action, post-condition and testing time. Forward chaining reasoning was introduced to establish connections between the actions and to synthesize individual actions into an action plan corresponding to the given task. A simulation environment was set up in which a human user and a service robot were modelled using MATLAB. Fuzzy control was employed to control the robot carrying out the cooperative action.
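The action format and forward-chaining step described above can be sketched as follows. The state names, action names, and testing times are invented for illustration; the thesis's actual action definitions and reasoning engine are not shown here.

```python
# Actions in the (precondition, action, post-condition, testing time) format;
# all entries are illustrative assumptions, not the thesis's action set.
ACTIONS = [
    ("at_shelf", "grasp_object", "holding_object", 0.5),
    ("start", "move_to_shelf", "at_shelf", 0.0),
    ("holding_object", "hand_over", "user_has_object", 1.0),
]

def forward_chain(start, goal, actions):
    """Chain actions whose precondition matches the current state until the goal holds."""
    state, plan = start, []
    while state != goal:
        # Find an action applicable in the current state.
        step = next((a for a in actions if a[0] == state), None)
        if step is None:
            raise ValueError(f"no action applicable in state {state!r}")
        plan.append(step[1])   # record the action name in the plan
        state = step[2]        # the post-condition becomes the new state
    return plan
```

For the hand-over task, `forward_chain("start", "user_has_object", ACTIONS)` links the three actions into the plan `["move_to_shelf", "grasp_object", "hand_over"]`; the testing-time field would mark where a test action may be inserted.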
To examine the effect of the test moment determination component, simulations were performed of a scenario in which a robot passes an object to a human user. The simulation results show that an action plan can be formed from the given conditions and executed properly by the simulated models. Test actions were taken at the moments determined by the component to discover the human user's intention.
The development of test action bank for active robot learning
A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In the rapidly expanding service robotics research area, interactions between robots and humans become increasingly common, as more and more jobs will require cooperation between robots and their human users. It is therefore important to address cooperation between a robot and its user. ARL is a promising approach that enables a robot to develop high-order beliefs by actively performing test actions in order to infer its user's intention from the user's responses to those actions. Test actions are crucial to ARL.
This study carried out primary research on developing a Test Action Bank (TAB) to provide test actions for ARL. In this study, a verb-based task classifier was developed to extract tasks from users' commands. Taught tasks and their corresponding test actions were proposed and stored in a database to establish the TAB. A backward test action retrieval method was used to locate a task in a task tree and retrieve its test actions from the TAB. A simulation environment was set up with a service robot model and a user model to test the TAB and demonstrate some test actions.
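A minimal sketch of the two TAB components described above follows; the verbs, task names, tree structure, and test actions are invented examples, not the thesis's database.

```python
# Verb-to-task mapping for the verb-based classifier (illustrative entries).
TASK_OF_VERB = {"bring": "fetch_task", "pass": "handover_task", "lift": "lift_task"}

# Task tree as child -> parent links, with test actions stored per task.
PARENT = {"handover_task": "manipulation_task", "manipulation_task": None}
TEST_ACTIONS = {
    "manipulation_task": ["pause_and_observe"],
    "handover_task": ["extend_arm_halfway"],
}

def classify(command):
    """Verb-based task classification: the first known verb in the command wins."""
    for word in command.lower().split():
        if word in TASK_OF_VERB:
            return TASK_OF_VERB[word]
    return None

def retrieve_test_actions(task):
    """Backward retrieval: walk from the task up the tree, collecting test actions."""
    actions = []
    while task is not None:
        actions.extend(TEST_ACTIONS.get(task, []))
        task = PARENT.get(task)
    return actions
```

Given the command "please pass me the cup", `classify` returns `handover_task`, and the backward walk collects that task's own test action before the more generic one inherited from its parent.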
Simulations were also performed in this study; the results show that the TAB can successfully provide test actions for different tasks and that the proposed service robot model can demonstrate those test actions.
Continual Active Robot Learning using Self-organizing Neural Network
Thesis (Master's), 2021.

In this thesis, a continual and active machine learning method is proposed to make artificial intelligence (AI) robots adapt to real environments and form concepts of nearby objects. Recent advances in the field of AI have led to the development of smart home appliances and AI speakers, but most of these products may suffer performance degradation in actual use, because they rely on functions such as voice or face recognition without adjusting them to the individual operating environments. The deep learning techniques behind these functions must be trained repeatedly with big data over a long time, and they risk catastrophic forgetting when encountering increasingly diverse objects. Meanwhile, AI robots need to continuously learn skills and concepts from a relatively small amount of newly acquired data. Since humans are the best-known agents that learn this way, imitating human learning is one of the most promising routes to the desired robot learning. The proposed model, CARLSON, integrates the strengths of previous human-like machine learning methods.
CARLSON is a self-organizing neural network that expands its knowledge by comparing each incoming object image to the learned concepts. To increase the efficiency and stability of learning, the model first reduces the size and noise of high-dimensional input images by extracting informative features, or representations, from them. The feature extraction is carried out by an encoder that is jointly trained with a decoder reconstructing images from the representations. CARLSON divides the representations into groups such that each group represents a single kind of object, i.e., an individual concept. The groups are implemented as nodes with means and variances that are created or adjusted by considering both top-down prediction and bottom-up activation, as in Adaptive Resonance Theory (Grossberg 1987). The whole model, including the encoder and decoder, is trained end-to-end and updated upon every new input. Using a label propagation method, CARLSON makes similar nodes share information so that it can infer object categories even when labels are provided rarely. It can also actively ask a human operator about uncertain concepts to make up for insufficient information.
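The ART-style node creation and adjustment can be illustrated with a minimal sketch. This is not the CARLSON implementation: feature extraction, variances, label propagation, and active querying are omitted, and the vigilance threshold is an assumption.

```python
# Minimal ART-like clustering step: match a feature vector against existing
# nodes; update the best node on resonance, otherwise grow a new node.
import math

def art_step(nodes, x, vigilance=0.5):
    """Assign feature vector x to the nearest node, or create a new node."""
    best, best_d = None, float("inf")
    for node in nodes:
        d = math.dist(node["mean"], x)   # bottom-up activation as a distance
        if d < best_d:
            best, best_d = node, d
    if best is not None and best_d <= vigilance:
        # Resonance: the top-down prediction matches, so adjust the node mean.
        n = best["count"]
        best["mean"] = [(m * n + xi) / (n + 1) for m, xi in zip(best["mean"], x)]
        best["count"] = n + 1
    else:
        # Mismatch: the input is a new concept, so create a node for it.
        nodes.append({"mean": list(x), "count": 1})
    return nodes
```

Feeding two nearby vectors and then a distant one yields two nodes, mirroring how the network grows only when an input resonates with no existing concept.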
To evaluate the model, a visual object dataset was constructed by collecting images with the humanoid robot NAO and used for object recognition experiments. CARLSON clearly outperformed a convolutional neural network (CNN) model and maintained stable performance even when labels were given rarely and each data point could be accessed only once during training. It also performed better than the CNN in online semi-supervised recognition tasks on well-known digit and object classification datasets: MNIST, SVHN, Fashion-MNIST, and CIFAR-10.
Multimodal Hierarchical Dirichlet Process-based Active Perception
In this paper, we propose an active perception method for recognizing object
categories based on the multimodal hierarchical Dirichlet process (MHDP). The
MHDP enables a robot to form object categories using multimodal information,
e.g., visual, auditory, and haptic information, which can be observed by
performing actions on an object. However, performing many actions on a target
object requires a long time. In a real-time scenario, i.e., when the time is
limited, the robot has to determine the set of actions that is most effective
for recognizing a target object. We propose an MHDP-based active perception
method that uses the information gain (IG) maximization criterion and lazy
greedy algorithm. We show that the IG maximization criterion is optimal in the
sense that the criterion is equivalent to a minimization of the expected
Kullback--Leibler divergence between a final recognition state and the
recognition state after the next set of actions. However, a straightforward
calculation of IG is practically impossible. Therefore, we derive an efficient
Monte Carlo approximation method for IG by making use of a property of the
MHDP. We also show that the IG has submodular and non-decreasing properties as
a set function because of the structure of the graphical model of the MHDP.
Therefore, the IG maximization problem is reduced to a submodular maximization
problem. This means that greedy and lazy greedy algorithms are effective and
have a theoretical justification for their performance. We conducted an
experiment using an upper-torso humanoid robot and a second one using synthetic
data. The experimental results show that the method enables the robot to select
a set of actions that allow it to recognize target objects quickly and
accurately. The results support our theoretical outcomes.
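The lazy greedy selection the paper builds on can be sketched with a toy monotone submodular objective (coverage of observable cues) standing in for the Monte Carlo information-gain estimate; the action names and cue sets are illustrative assumptions.

```python
# Lazy greedy maximization of a monotone submodular set function.
# Stale marginal-gain bounds are re-evaluated only when they reach the heap top.
import heapq

def lazy_greedy(items, gain, budget):
    """Pick `budget` items maximizing a monotone submodular marginal `gain(selected, item)`."""
    selected = []
    # Max-heap of (negated gain bound, item, selection round when it was computed).
    heap = [(-gain([], it), it, 0) for it in items]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        neg_g, it, rnd = heapq.heappop(heap)
        if rnd == len(selected):
            selected.append(it)          # bound is current: take the item
        else:
            # Submodularity guarantees the old bound only overestimates,
            # so recomputing lazily is safe.
            heapq.heappush(heap, (-gain(selected, it), it, len(selected)))
    return selected

# Toy objective: each "action" observes a set of cues; gain = newly covered cues.
OBS = {"look": {"color", "shape"}, "grasp": {"shape", "weight"}, "shake": {"sound"}}
covered = lambda sel: set().union(*(OBS[a] for a in sel)) if sel else set()
gain = lambda sel, a: len(OBS[a] - covered(sel))
```

With a budget of two actions, the algorithm picks two of the cue-rich actions and skips the redundant coverage, the same behaviour that lets the robot choose informative actions under a time limit.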
A Dataset of Anatomical Environments for Medical Robots: Modeling Respiratory Deformation
Anatomical models of a medical robot's environment can significantly help
guide design and development of a new robotic system. These models can be used
for benchmarking motion planning algorithms, evaluating controllers, optimizing
mechanical design choices, simulating procedures, and even as resources for
data generation. Currently, the time-consuming task of generating these
environments is repeatedly performed by individual research groups and rarely
shared broadly. This not only leads to redundant efforts, but also makes it
challenging to compare systems and algorithms accurately. In this work, we
present a collection of clinically-relevant anatomical environments for medical
robots operating in the lungs. Since anatomical deformation is a fundamental
challenge for medical robots operating in the lungs, we describe a way to model
respiratory deformation in these environments using patient-derived data. We
share the environments and deformation data publicly by adding them to the
Medical Robotics Anatomical Dataset (Med-RAD), our public dataset of anatomical
environments for medical robots.
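One simple way to model such deformation (not necessarily the Med-RAD representation, whose format is not described here) is to interpolate corresponding mesh vertices between exhale and inhale states over a breathing phase:

```python
# Blend corresponding mesh vertices between two respiratory extremes.
# The linear blend and the [0, 1] phase parameterization are assumptions.

def deform(exhale_pts, inhale_pts, phase):
    """Interpolate point lists; phase 0 = full exhale, 1 = full inhale."""
    if not 0.0 <= phase <= 1.0:
        raise ValueError("phase must lie in [0, 1]")
    return [tuple(e + phase * (i - e) for e, i in zip(pe, pi))
            for pe, pi in zip(exhale_pts, inhale_pts)]
```

Sweeping the phase through a breathing cycle then yields a moving environment against which a motion planner or controller can be benchmarked.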
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
``single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.

Comment: Portions of this paper were published in the Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper consists of 50 pages with 11 figures and 4 tables.
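The pairwise ranking formulation can be sketched as follows; the task features, preference pairs, and perceptron-style update are illustrative assumptions, not the paper's exact learner.

```python
# Learn a linear scorer from expert "scheduled a before b" pairs: each pair
# demands w·f(preferred) > w·f(other), enforced with a perceptron-style update.

def fit_ranker(pairs, features, dim, epochs=20):
    """Fit weights w so each preferred task outscores its alternative."""
    w = [0.0] * dim
    for _ in range(epochs):
        for preferred, other in pairs:
            fp, fo = features[preferred], features[other]
            # Margin on this pair; update only when the ordering is violated.
            if sum(wi * (a - b) for wi, a, b in zip(w, fp, fo)) <= 0:
                w = [wi + (a - b) for wi, a, b in zip(w, fp, fo)]
    return w

# Toy demonstration: the expert always schedules the more urgent task first.
features = {"t1": [1.0, 0.2], "t2": [0.1, 0.9], "t3": [0.8, 0.4]}  # [urgency, slack]
pairs = [("t1", "t2"), ("t3", "t2"), ("t1", "t3")]
w = fit_ranker(pairs, features, dim=2)
score = lambda t: sum(wi * fi for wi, fi in zip(w, features[t]))
```

Because the formulation only compares pairs of scheduling decisions, it avoids enumerating the full scheduling state space, which is the model-free property the abstract emphasizes.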