25 research outputs found

    Framework of active robot learning

    A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, cognitive robots have become an attractive research area of Artificial Intelligence (AI). High-order beliefs for cognitive robots concern the robots' thoughts about their users' intentions and preferences. Existing approaches to developing such beliefs through machine learning rely on particular social cues or specifically defined reward functions, so their applications can be limited. This study carried out primary research on active robot learning (ARL), which enables a robot to develop high-order beliefs by actively collecting and discovering the evidence it needs. The emphasis is on active learning, not teaching; hence, social cues and reward functions are not necessary. In this study, the framework of ARL was developed. Fuzzy logic was employed in the framework for controlling the robot and for identifying high-order beliefs. A simulation environment was set up in which a human and a cognitive robot were modelled using MATLAB, and ARL was implemented through simulation. Simulations were performed in which the human and the robot tried to jointly lift a stick and keep it level. The simulation results show that, under the framework, a robot is able to discover the evidence it needs to confirm its user's intention.
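The fuzzy control used in the stick-levelling simulation can be sketched as a minimal Mamdani-style controller. The membership functions, rule set, and output values below are illustrative assumptions for the sketch, not the thesis's actual design.

```python
# Minimal sketch of a Mamdani-style fuzzy controller for the joint
# stick-lifting task. Membership functions and rule outputs are invented
# for illustration; positive output means "increase the lift force".

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_lift_force(tilt):
    """Map stick tilt (radians; positive means the robot's end is too low)
    to a force correction via three fuzzy rules."""
    # Fuzzify: degree to which the tilt is negative, near zero, or positive.
    neg = tri(tilt, -1.0, -0.5, 0.0)
    zero = tri(tilt, -0.5, 0.0, 0.5)
    pos = tri(tilt, 0.0, 0.5, 1.0)
    # Rule consequents as crisp singletons: lower force / hold / raise force.
    strengths = [neg, zero, pos]
    outputs = [-1.0, 0.0, 1.0]
    total = sum(strengths)
    if total == 0.0:
        return 0.0
    # Defuzzify by the weighted average of the singleton outputs.
    return sum(s * o for s, o in zip(strengths, outputs)) / total
```

For example, a small positive tilt of 0.25 rad fires the "zero" and "positive" rules equally and yields a moderate upward correction of 0.5.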

    Test moment determination design in active robot learning

    A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In recent years, service robots have been increasingly used in people's daily lives. These robots are autonomous or semi-autonomous and are able to cooperate with their human users. Active robot learning (ARL) is an approach to developing beliefs for the robots about their users' intentions and preferences, which the robots need for seamless cooperation with humans. The approach allows a robot to perform tests on its users and to build up high-order beliefs according to the users' responses. This study carried out primary research on designing the test moment determination component of the ARL framework, which decides the right moment for taking a test action. In this study, an action plan theory was suggested to synthesise actions into a sequence, that is, an action plan, for a given task. All actions are defined in a special format of precondition, action, post-condition and testing time. Forward chaining reasoning was introduced to establish connections between actions and to synthesise individual actions into an action plan corresponding to the given task. A simulation environment was set up in which a human user and a service robot were modelled using MATLAB, and fuzzy control was employed for controlling the robot to carry out the cooperative action. To examine the effect of the test moment determination component, simulations were performed of a scenario in which a robot passes an object to a human user. The simulation results show that an action plan can be formed according to the provided conditions and executed properly by the simulated models, and that test actions were taken at the moments determined by the test moment determination component to find the human user's intention.
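The forward-chaining synthesis of precondition/action/post-condition triples into an action plan can be sketched as follows. The action names and conditions are invented for the example and are not taken from the thesis.

```python
# Minimal sketch of forward-chaining plan synthesis: each action is a
# (precondition, name, postcondition) triple; actions whose preconditions
# hold in the current state are chained until the goal condition is reached.
# The hand-over scenario below is illustrative only.

def forward_chain(actions, state, goal):
    """Greedily chain applicable actions into a plan that reaches `goal`."""
    plan = []
    state = set(state)
    while goal not in state:
        for pre, name, post in actions:
            # Applicable (precondition satisfied) and useful (adds new facts).
            if pre <= state and not post <= state:
                plan.append(name)
                state |= post
                break
        else:
            return None  # no applicable action: planning failed
    return plan

actions = [
    ({"at_shelf"}, "grasp_object", {"holding"}),
    ({"holding"}, "move_to_user", {"near_user"}),
    ({"holding", "near_user"}, "extend_arm", {"object_offered"}),
]
plan = forward_chain(actions, {"at_shelf"}, "object_offered")
# plan == ["grasp_object", "move_to_user", "extend_arm"]
```

A test moment determination component would then attach a testing time to each step of the synthesised plan.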

    The development of test action bank for active robot learning

    A thesis submitted to the University of Bedfordshire, in fulfilment of the requirements for the degree of Master of Science by research. In the rapidly expanding service robotics research area, interactions between robots and humans become increasingly common as more and more jobs require cooperation between robots and their human users, so it is important to address cooperation between a robot and its user. Active robot learning (ARL) is a promising approach in which a robot develops high-order beliefs by actively performing test actions in order to infer its user's intention from the user's responses to those actions. Test actions are crucial to ARL. This study carried out primary research on developing a Test Action Bank (TAB) to provide test actions for ARL. In this study, a verb-based task classifier was developed to extract tasks from users' commands. Taught tasks and their corresponding test actions were proposed and stored in a database to establish the TAB. A backward test action retrieval method was used to locate a task in a task tree and retrieve its test actions from the TAB. A simulation environment was set up with a service robot model and a user model to test the TAB and demonstrate some test actions. Simulations were also performed in this study; the results show that the TAB can successfully provide test actions for different tasks and that the proposed service robot model can demonstrate them.
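The verb-based classification and backward retrieval described above can be sketched with a toy task tree. The task names, verbs, and test actions below are invented for the example; the thesis's actual TAB is a database.

```python
# Illustrative sketch of a verb-based task classifier plus backward test
# action retrieval. All task names, verbs, and actions are hypothetical.

TASK_TREE = {                  # task -> parent task (None = root of task tree)
    "fetch_drink": "fetch_object",
    "fetch_object": None,
}
TEST_ACTIONS = {               # task -> test actions stored in the TAB
    "fetch_object": ["pause_before_handover", "offer_alternative_grip"],
}
VERB_TO_TASK = {"bring": "fetch_drink", "fetch": "fetch_object"}

def classify(command):
    """Extract the task from a user command by its leading verb."""
    verb = command.lower().split()[0]
    return VERB_TO_TASK.get(verb)

def retrieve_test_actions(task):
    """Walk backwards up the task tree until test actions are found."""
    while task is not None:
        if task in TEST_ACTIONS:
            return TEST_ACTIONS[task]
        task = TASK_TREE.get(task)
    return []

task = classify("Bring me a cola")
# "fetch_drink" has no test actions of its own, so retrieval falls back
# to its parent task "fetch_object".
found = retrieve_test_actions(task)
```

The backward walk is what lets a specific task inherit test actions defined for a more general task higher in the tree.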

    Continual Active Robot Learning using Self-organizing Neural Network

    Master's thesis (MSc) -- Seoul National University, Graduate School, Department of Computer Science and Engineering, 2021. 2. Supervised by Byoung-Tak Zhang. In this thesis, a continual and active machine learning method is proposed to make artificial intelligence (AI) robots adapt to real environments and form concepts of nearby objects. Recent advances in the field of AI have led to the development of smart home appliances and AI speakers, but most of these products may suffer performance degradation in actual use, because they rely on functions such as voice or face recognition without adjusting them to the individual operating environment. The deep learning techniques used for these functions must be trained repeatedly on big data for a long time, and they risk catastrophic forgetting when encountering increasingly diverse objects. AI robots instead need to continuously learn skills and concepts from a relatively small number of newly acquired data. Since humans are the most well-known agents that learn this way, imitating human learning is one of the most effective routes to the desired robot learning. The proposed model, CARLSON, integrates the strengths of previous human-like machine learning methods. CARLSON is a self-organizing neural network that expands its knowledge by comparing each incoming object image to the concepts learned so far. To increase the efficiency and stability of learning, the model first reduces the size and noise of high-dimensional input images by extracting informative features, or representations, from them. The feature extraction is carried out by an encoder that is jointly trained with a decoder reconstructing images from representations. CARLSON divides the representations into groups such that each group represents a single kind of object, i.e., an individual concept. The groups are implemented as nodes with means and variances that are created or adjusted by considering both top-down prediction and bottom-up activation, as in Adaptive Resonance Theory (Grossberg 1987). The whole model, including the encoder and decoder, is trained end-to-end and updated upon every new input. Using a label propagation method, CARLSON makes similar nodes share information so that it can infer object categories even when labels are provided only rarely. It can also actively query a human operator about uncertain concepts to make up for insufficient information. To evaluate the model, a visual object dataset was constructed by collecting images with the humanoid robot NAO and used for object recognition experiments. CARLSON clearly outperformed a convolutional neural network (CNN) model and remained stable even when labels were given rarely and each data point could be accessed only once during training. It also performed better than the CNN in online semi-supervised recognition tasks on well-known digit and object classification datasets: MNIST, SVHN, Fashion-MNIST, and CIFAR-10.
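The ART-style node creation and adjustment described above can be sketched as an online prototype clusterer with a vigilance test: an input joins the best-matching concept node if it resonates closely enough, otherwise a new node is created. The similarity measure, vigilance value, and learning rate below are illustrative assumptions, not CARLSON's actual settings.

```python
import numpy as np

# Minimal sketch of ART-style incremental concept formation: each node keeps
# a prototype representation; inputs either adjust the best-matching node
# (resonance) or spawn a new node (mismatch). Parameters are illustrative.

class ConceptNodes:
    def __init__(self, vigilance=0.9, lr=0.2):
        self.prototypes = []       # one representative vector per concept node
        self.vigilance = vigilance # minimum similarity to join an existing node
        self.lr = lr               # learning rate for prototype adjustment

    def observe(self, x):
        """Return the index of the node that absorbs x, creating one if needed."""
        x = np.asarray(x, dtype=float)
        if self.prototypes:
            sims = [self._sim(p, x) for p in self.prototypes]
            best = int(np.argmax(sims))
            if sims[best] >= self.vigilance:        # resonance: move prototype
                p = self.prototypes[best]
                self.prototypes[best] = p + self.lr * (x - p)
                return best
        self.prototypes.append(x)                   # mismatch: new concept node
        return len(self.prototypes) - 1

    @staticmethod
    def _sim(p, x):
        """Cosine similarity between a prototype and an input representation."""
        return float(p @ x / (np.linalg.norm(p) * np.linalg.norm(x) + 1e-12))
```

In the full model the nodes would also carry variances and label information for propagation; this sketch covers only the create-or-adjust dynamics.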

    Multimodal Hierarchical Dirichlet Process-based Active Perception

    Full text link
    In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object requires a long time. In a real-time scenario, i.e., when time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an MHDP-based active perception method that uses the information gain (IG) maximization criterion and the lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that it is equivalent to minimizing the expected Kullback-Leibler divergence between the final recognition state and the recognition state after the next set of actions. However, a straightforward calculation of IG is practically impossible, so we derive an efficient Monte Carlo approximation of IG by making use of a property of the MHDP. We also show that the IG has submodular and non-decreasing properties as a set function because of the structure of the MHDP's graphical model. The IG maximization problem is therefore reduced to a submodular maximization problem, which means that the greedy and lazy greedy algorithms are effective and have a theoretical justification for their performance. We conducted one experiment using an upper-torso humanoid robot and a second using synthetic data. The experimental results show that the method enables the robot to select a set of actions that allow it to recognize target objects quickly and accurately, supporting our theoretical outcomes.
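The lazy greedy algorithm the paper relies on can be sketched for any monotone submodular set function. The toy coverage function below stands in for the MHDP's information gain (which the paper approximates by Monte Carlo); the action names are invented.

```python
import heapq

# Sketch of the lazy greedy algorithm for maximizing a monotone submodular
# set function f under a cardinality budget k. Submodularity guarantees
# marginal gains only shrink, so a stale heap entry that still tops the heap
# after being refreshed is the true argmax, which avoids most re-evaluations.

def lazy_greedy(items, f, k):
    """Select up to k items, near-optimally maximizing monotone submodular f."""
    selected, value = [], f(set())
    # Max-heap (via negation) of upper bounds on each item's marginal gain.
    heap = [(-(f({i}) - value), i) for i in items]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        _, i = heapq.heappop(heap)
        gain = f(set(selected) | {i}) - value      # refresh the stale bound
        if not heap or gain >= -heap[0][0]:        # still the best: take it
            selected.append(i)
            value += gain
        else:
            heapq.heappush(heap, (-gain, i))       # re-insert with fresh bound
    return selected

# Toy stand-in for information gain: size of the covered observation set,
# which is monotone and submodular.
cover = {"look": {1, 2}, "grasp": {2, 3, 4}, "shake": {4}}
f = lambda S: len(set().union(*(cover[a] for a in S)))
best_actions = lazy_greedy(list(cover), f, 2)
# best_actions == ["grasp", "look"]
```

The classic (1 - 1/e) approximation guarantee for greedy maximization of monotone submodular functions is what gives the action selection its theoretical justification.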

    A Dataset of Anatomical Environments for Medical Robots: Modeling Respiratory Deformation

    Anatomical models of a medical robot's environment can significantly help guide the design and development of a new robotic system. These models can be used for benchmarking motion planning algorithms, evaluating controllers, optimizing mechanical design choices, simulating procedures, and even as resources for data generation. Currently, the time-consuming task of generating these environments is repeatedly performed by individual research groups and rarely shared broadly. This not only leads to redundant effort but also makes it challenging to compare systems and algorithms accurately. In this work, we present a collection of clinically relevant anatomical environments for medical robots operating in the lungs. Since anatomical deformation is a fundamental challenge for such robots, we describe a way to model respiratory deformation in these environments using patient-derived data. We share the environments and deformation data publicly by adding them to the Medical Robotics Anatomical Dataset (Med-RAD), our public dataset of anatomical environments for medical robots.

    Human-Machine Collaborative Optimization via Apprenticeship Scheduling

    Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, which makes codifying this knowledge laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstrations via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem and show that it generates solutions substantially superior to those produced by human domain experts, at a rate up to 9.5 times faster than an optimization approach, and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator. Comment: Portions of this paper were published in the Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper consists of 50 pages with 11 figures and 4 tables.
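The pairwise ranking formulation described above can be sketched as follows: each expert decision yields (chosen task, passed-over task) pairs, and a linear scorer is fit on the difference of their feature vectors so that chosen tasks rank higher. The features, data, and perceptron-style update are invented for illustration; the paper's actual model is richer.

```python
import numpy as np

# Minimal sketch of learning a scheduling heuristic by pairwise ranking.
# Each demonstration pair (chosen, other) constrains w @ (chosen - other) > 0;
# a perceptron-style update on the feature difference fits the weights.
# All feature vectors below are hypothetical.

def fit_ranker(pairs, dim, epochs=50, lr=0.1):
    """Fit linear weights so chosen tasks score above passed-over tasks."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for chosen, other in pairs:
            diff = np.asarray(chosen) - np.asarray(other)
            if w @ diff <= 0:       # ranked the wrong way: nudge w toward diff
                w += lr * diff
    return w

def pick_next(w, candidates):
    """Schedule next the candidate the learned heuristic scores highest."""
    return max(candidates, key=lambda feats: w @ np.asarray(feats))

# Toy demonstrations: the expert always chose the more urgent task
# (feature 0 = urgency, feature 1 = distraction).
demos = [([0.9, 0.1], [0.2, 0.5]), ([0.7, 0.3], [0.1, 0.8])]
w = fit_ranker(demos, dim=2)
choice = pick_next(w, [[0.3, 0.4], [0.8, 0.2]])
# choice == [0.8, 0.2]: the learned heuristic favours the urgent candidate
```

Because the model ranks candidate tasks pairwise rather than scoring full schedules, it never needs to enumerate the scheduling state space.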