1,048 research outputs found

    A Survey of Brain Inspired Technologies for Engineering

    Full text link
    Cognitive engineering is a multi-disciplinary field, and hence it is difficult to find a review article consolidating the leading developments in the field. The incredible pace at which technology is advancing pushes the boundaries of what is achievable in cognitive engineering. There are also differing approaches to cognitive engineering, brought about by the multi-disciplinary nature of the field and the vastness of possible applications. Thus research communities require more frequent reviews to keep up to date with the latest trends. In this paper we discuss some of the approaches to cognitive engineering holistically to clarify the reasoning behind the different approaches and to highlight their strengths and weaknesses. We then show how developments from seemingly disjointed views could be integrated to achieve the same goal of creating cognitive machines. By reviewing the major contributions in the different fields and showing the potential for a combined approach, this work intends to assist the research community in devising more unified methods and techniques for developing cognitive machines.

    A Neural Framework for Organization and Flexible Utilization of Episodic Memory in Cumulatively Learning Baby Humanoids

    Get PDF
    Cumulatively developing robots offer a unique opportunity to reenact the constant interplay between neural mechanisms related to learning, memory, prospection, and abstraction from the perspective of an integrated system that acts, learns, remembers, reasons, and makes mistakes. Situated within such interplay lie some of the computationally elusive and fundamental aspects of cognitive behavior: the ability to recall and flexibly exploit diverse experiences of one's past in the context of the present to realize goals, simulate the future, and keep learning further. This article is an adventurous exploration in this direction using a simple engaging scenario of how the humanoid iCub learns to construct the tallest possible stack given an arbitrary set of objects to play with. The learning takes place cumulatively, with the robot interacting with different objects (some previously experienced, some novel) in an open-ended fashion. Since the solution itself depends on what objects are available in the "now," multiple episodes of past experiences have to be remembered and creatively integrated in the context of the present to be successful. Starting from zero, where the robot knows nothing, we explore the computational basis of organizing episodic memory in a cumulatively learning humanoid and address (1) how relevant past experiences can be reconstructed based on the present context, (2) how multiple stored episodic memories compete to survive in the neural space and not be forgotten, (3) how remembered past experiences can be combined with explorative actions to learn something new, and (4) how multiple remembered experiences can be recombined to generate novel behaviors (without exploration). Through the resulting behaviors of the robot as it builds, breaks, learns, and remembers, we emphasize that mechanisms of episodic memory are fundamental design features necessary to enable the survival of autonomous robots in a real world where neither everything can be known nor can everything be experienced.
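    The context-driven recall and memory competition described above can be made concrete with a toy sketch. The Python fragment below is a minimal illustration, assuming episodes are stored as context feature vectors; the class name `EpisodicMemory`, the cosine-similarity retrieval rule, and the forget-the-oldest policy are illustrative assumptions, not the paper's neural implementation.

```python
import numpy as np

class EpisodicMemory:
    """Toy store of past episodes, each keyed by a context feature vector."""

    def __init__(self, capacity=50):
        self.capacity = capacity
        self.episodes = []  # list of (context_vector, outcome) pairs

    def remember(self, context, outcome):
        # When full, one memory "loses the competition" and is forgotten;
        # the simplest stand-in policy is to drop the oldest episode.
        if len(self.episodes) >= self.capacity:
            self.episodes.pop(0)
        self.episodes.append((np.asarray(context, dtype=float), outcome))

    def recall(self, present_context, k=3):
        # Reconstruct the past experiences most relevant to the present
        # context, ranked by cosine similarity.
        present = np.asarray(present_context, dtype=float)
        def similarity(episode):
            ctx, _ = episode
            return ctx @ present / (np.linalg.norm(ctx) * np.linalg.norm(present) + 1e-9)
        return sorted(self.episodes, key=similarity, reverse=True)[:k]

# Usage: recall the stacking experience most similar to the objects in the "now".
memory = EpisodicMemory()
memory.remember([1.0, 0.2, 0.0], "cube stacked on cube: stable")
memory.remember([0.1, 0.9, 0.3], "ball on cube: rolled off")
print(memory.recall([0.9, 0.3, 0.1], k=1))
```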

    Task-adaptable, Pervasive Perception for Robots Performing Everyday Manipulation

    Get PDF
    Intelligent robotic agents that help us in our day-to-day chores have been an aspiration of robotics researchers for decades. More than fifty years since the creation of the first intelligent mobile robotic agent, robots are still struggling to perform seemingly simple tasks, such as setting or cleaning a table. One of the reasons for this is that the unstructured environments these robots are expected to work in impose demanding requirements on a robot's perception system. Depending on the manipulation task the robot is required to execute, different parts of the environment need to be examined, the objects in it found, and the functional parts of these identified. This is a challenging task, since the visual appearance of the objects and the variety of scenes they are found in are large. This thesis proposes to treat robotic visual perception for everyday manipulation tasks as an open question-answering problem. To this end, RoboSherlock, a framework for creating task-adaptable, pervasive perception systems, is presented. Using the framework, robot perception is addressed from a system's perspective and contributions to the state of the art are proposed that introduce several enhancements which scale robot perception toward the needs of human-level manipulation. The contributions of the thesis center around task-adaptability and pervasiveness of perception systems. A perception task-language and a language interpreter that generates task-relevant perception plans are proposed. The task-language and task-interpreter leverage the power of knowledge representation and knowledge-based reasoning in order to enhance the question-answering capabilities of the system. Pervasiveness, a seamless integration of past, present and future percepts, is achieved through three main contributions: a novel way of recording, replaying and inspecting perceptual episodic memories, a new perception component that enables pervasive operation and maintains an object belief state, and a novel prospection component that enables robots to relive their past experiences and anticipate possible future scenarios. The contributions are validated through several real-world robotic experiments that demonstrate how the proposed system enhances robot perception.
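    To make the question-answering framing concrete, here is a schematic Python stand-in for a task-language interpreter: a manipulation-oriented question is decomposed into a task-relevant plan of perception components. The registry and component names are hypothetical; RoboSherlock's actual task language and APIs are not reproduced here.

```python
# Schematic stand-in: a manipulation task is posed as a perception question,
# and an interpreter assembles a task-relevant plan of perception components
# (detectors, classifiers, pose estimators).
QUESTION = {"detect": {"type": "cup", "location": "on table", "for": "grasping"}}

# Hypothetical registry mapping question keys to perception components.
COMPONENT_REGISTRY = {
    "type": ["ObjectDetector", "ShapeClassifier"],
    "location": ["RegionFilter"],
    "for": ["GraspPoseEstimator"],
}

def interpret(question):
    """Turn a perception question into an ordered perception plan."""
    plan = ["CollectionReader"]  # every plan starts from the current percepts
    for key in question["detect"]:
        plan.extend(COMPONENT_REGISTRY.get(key, []))
    return plan

print(interpret(QUESTION))
# ['CollectionReader', 'ObjectDetector', 'ShapeClassifier',
#  'RegionFilter', 'GraspPoseEstimator']
```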

    From focused thought to reveries: A memory system for a conscious robot

    Full text link
    © 2018 Balkenius, Tjøstheim, Johansson and Gärdenfors. We introduce a memory model for robots that can account for many aspects of an inner world, ranging from object permanence, episodic memory, and planning to imagination and reveries. It is modeled after neurophysiological data and includes parts of the cerebral cortex together with models of arousal systems that are relevant for consciousness. The three central components are an identification network, a localization network, and a working memory network. Attention serves as the interface between the inner and the external world. It directs the flow of information from sensory organs to memory, as well as controlling top-down influences on perception. It also compares external sensations to internal top-down expectations. The model is tested in a number of computer simulations that illustrate how it can operate as a component in various cognitive tasks including perception, the A-not-B test, delayed matching to sample, episodic recall, and vicarious trial and error.
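    A toy sketch of how the three central components and attention could be wired is given below. The functions are placeholders for the paper's neural networks; only the flow of information (bottom-up sensation gated against top-down expectation into working memory) is illustrated.

```python
# Toy wiring of the model's three central components. The paper uses neural
# networks; these stand-ins only show the information flow that attention
# gates between the external and the inner world.
def identification(sensory_input):
    # "What" pathway: map raw input to object-identity scores (placeholder).
    return {"red_ball": 0.9, "cup": 0.1}

def localization(sensory_input):
    # "Where" pathway: map raw input to a location estimate (placeholder).
    return (0.3, 0.7)

def attend(top_down_expectation, bottom_up, gain=0.5):
    # Attention compares internal expectations with external sensation and
    # decides how much of each drives working memory.
    keys = set(bottom_up) | set(top_down_expectation)
    return {k: gain * bottom_up.get(k, 0) + (1 - gain) * top_down_expectation.get(k, 0)
            for k in keys}

working_memory = {"red_ball": 1.0}   # expectation: the ball persists (object permanence)
percept = identification(sensory_input=None)
state = attend(working_memory, percept)
print(state, localization(None))
```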

    EEG theta and Mu oscillations during perception of human and robot actions.

    Get PDF
    The perception of others' actions supports important skills such as communication, intention understanding, and empathy. Are mechanisms of action processing in the human brain specifically tuned to process biological agents? Humanoid robots can perform recognizable actions, but can look and move differently from humans, and as such can be used in experiments to address such questions. Here, we recorded EEG as participants viewed actions performed by three agents. In the Human condition, the agent had biological appearance and motion. The other two conditions featured a state-of-the-art robot in two different appearances: Android, which had biological appearance but mechanical motion, and Robot, which had mechanical appearance and motion. We explored whether sensorimotor mu (8-13 Hz) and frontal theta (4-8 Hz) activity exhibited selectivity for biological entities, in particular for whether the visual appearance and/or the motion of the observed agent was biological. Sensorimotor mu suppression has been linked to the motor simulation aspect of action processing (and the human mirror neuron system, MNS), and frontal theta to semantic and memory-related aspects. For all three agents, action observation induced significant attenuation in the power of mu oscillations, with no difference between agents. Thus, mu suppression, considered an index of MNS activity, does not appear to be selective for biological agents. Observation of the Robot resulted in greater frontal theta activity compared to the Android and the Human, whereas the latter two did not differ from each other. Frontal theta thus appears to be sensitive to visual appearance, suggesting agents that are not sufficiently biological in appearance may impose greater memory processing demands on the observer. Studies combining robotics and neuroscience such as this one can allow us to explore the neural basis of action processing on the one hand, and inform the design of social robots on the other.
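    Mu suppression of the kind measured here is commonly quantified as the log ratio of band power during action observation to band power at baseline. The sketch below computes such an index with scipy's Welch estimator on synthetic single-channel data; the sampling rate and pipeline details are assumptions for illustration, not the study's actual analysis.

```python
import numpy as np
from scipy.signal import welch

FS = 250  # sampling rate in Hz (assumed)
rng = np.random.default_rng(0)

def band_power(signal, fs, lo, hi):
    """Average power spectral density within [lo, hi] Hz, via Welch's method."""
    freqs, psd = welch(signal, fs=fs, nperseg=fs * 2)
    band = (freqs >= lo) & (freqs <= hi)
    return psd[band].mean()

# Synthetic stand-ins for one sensorimotor channel: baseline vs. observation.
baseline = rng.normal(size=5 * FS)
observation = rng.normal(size=5 * FS) * 0.8  # attenuated, mimicking suppression

# Suppression index: log ratio of observation to baseline power in mu (8-13 Hz).
# Negative values indicate attenuation (suppression) relative to baseline.
mu_suppression = np.log(band_power(observation, FS, 8, 13) /
                        band_power(baseline, FS, 8, 13))
theta_power = band_power(observation, FS, 4, 8)  # frontal theta band (4-8 Hz)
print(f"mu suppression index: {mu_suppression:.3f}, theta power: {theta_power:.3e}")
```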

    DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self

    Get PDF
    This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot's behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.

    A Survey of Embodied AI: From Simulators to Research Tasks

    Full text link
    There has been an emerging paradigm shift from the era of "internet AI" to "embodied AI", where AI algorithms and agents no longer learn from datasets of images, videos or text curated primarily from the internet. Instead, they learn through interactions with their environments from an egocentric perception similar to humans. Consequently, there has been substantial growth in the demand for embodied AI simulators to support various embodied AI research tasks. This growing interest in embodied AI is beneficial to the greater pursuit of Artificial General Intelligence (AGI), but there has not been a contemporary and comprehensive survey of this field. This paper aims to provide an encyclopedic survey of the field of embodied AI, from its simulators to its research. By evaluating nine current embodied AI simulators against seven proposed features, it examines what the simulators provide for embodied AI research and where their limitations lie. It then surveys the three main research tasks in embodied AI -- visual exploration, visual navigation and embodied question answering (QA) -- covering the state-of-the-art approaches, evaluation metrics and datasets. Finally, with the new insights revealed through surveying the field, the paper provides suggestions for simulator-for-task selections and recommendations for the future directions of the field.

    Human-Robot Cooperation Based on Semantic Scene Understanding

    Get PDF
    Doctoral dissertation, Graduate School of Seoul National University, College of Engineering, Department of Electrical and Computer Engineering, February 2020. Advisor: Beom-Hee Lee. Human-robot cooperation is unavoidable in various applications ranging from manufacturing to field robotics, owing to the advantages of adaptability and high flexibility. In particular, complex task planning in large, unstructured, and uncertain environments can employ the complementary capabilities of humans and diverse robots. For a team to be effective, knowledge regarding team goals and the current situation needs to be effectively shared, as it affects decision making. In this respect, semantic scene understanding in natural language is one of the most fundamental components for information sharing between humans and heterogeneous robots, as robots can perceive the surrounding environment in a form that both humans and other robots can understand. Moreover, natural-language-based scene understanding can reduce network congestion and improve the reliability of acquired data. In field robotics especially, transmission of raw sensor data increases network load and decreases quality of service. We can resolve this problem by transmitting information in the form of natural language that encodes semantic representations of environments. In this dissertation, I introduce a human and heterogeneous robot cooperation scheme based on semantic scene understanding. I generate sentences and scene graphs, natural-language-grounded graphs over the detected objects and their relationships, from the graph map produced by a robot mapping algorithm. Subsequently, a framework that can utilize the results for cooperative mission planning between humans and robots is proposed. Experiments were performed to verify the effectiveness of the proposed methods. This dissertation comprises two parts: graph-based scene understanding, and cooperation between humans and heterogeneous robots based on that understanding. For the former, I introduce a novel natural language processing method using a semantic graph map. Although semantic graph maps have been widely applied to study the perceptual aspects of the environment, such maps have not found extensive application in natural language processing tasks. Conversely, in computer vision, several studies have automatically generated sentences from workspace images, but sequential scenes have not yet been utilized for sentence generation. A graph-based convolutional neural network, which comprises spectral graph convolution and graph coarsening layers, and a recurrent neural network are therefore employed to generate sentences with attention over graphs. The proposed method outperforms conventional methods on a publicly available dataset for single scenes and can also be utilized for sequential scenes. Recently, deep learning has driven impressive developments in scene understanding using natural language. However, it has not been extensively applied to high-level processes such as causal reasoning, analogical reasoning, or planning. The symbolic approach, which calculates a sequence of appropriate actions by combining the available skills of agents, excels at reasoning and planning, but it does not fully address semantic knowledge acquisition for human-robot information sharing.
An architecture that combines deep learning techniques with a symbolic planner is accordingly proposed, enabling humans and heterogeneous robots to achieve a shared goal based on semantic scene understanding. In this part, the graph-based perception introduced above is used for scene understanding. A Planning Domain Definition Language (PDDL) planner and JENA-TDB are utilized for mission planning and for storing acquired knowledge, respectively. The effectiveness of the proposed method is verified in two situations: a mission failure, in which the dynamic environment changes, and object detection in a large, unseen environment.
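A minimal sketch of the spectral graph-convolution building block mentioned above, in its common first-order (GCN) form: features on scene-graph nodes are propagated through the normalized adjacency and linearly transformed. This is a generic layer for illustration, not the dissertation's exact network; in the dissertation, such node embeddings would pass through graph coarsening and a recurrent decoder to produce sentences.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One spectral graph-convolution layer in its common first-order form:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])        # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    propagated = d_inv_sqrt @ a_hat @ d_inv_sqrt @ features
    return np.maximum(propagated @ weights, 0.0)          # ReLU

# Tiny scene graph: 3 nodes (e.g. "robot", "cup", "table"), 4-d word features.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.default_rng(1).normal(size=(3, 4))
W = np.random.default_rng(2).normal(size=(4, 2))

node_embeddings = gcn_layer(A, H, W)   # would feed graph coarsening,
print(node_embeddings.shape)           # then an RNN decoder for sentences
```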
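To illustrate the PDDL planning side, the sketch below assembles a toy PDDL problem from scene facts of the kind a natural-language scene graph would provide. The domain, predicates, and goal are invented for illustration; the dissertation's actual planning domain and JENA-TDB integration are not reproduced.

```python
# Minimal sketch: turn scene facts into a toy PDDL problem string.
DOMAIN = """(define (domain scene-demo)
  (:predicates (at ?agent ?loc) (object-at ?obj ?loc) (holding ?agent ?obj))
  (:action move
    :parameters (?a ?from ?to)
    :precondition (at ?a ?from)
    :effect (and (not (at ?a ?from)) (at ?a ?to)))
  (:action pick
    :parameters (?a ?o ?loc)
    :precondition (and (at ?a ?loc) (object-at ?o ?loc))
    :effect (and (holding ?a ?o) (not (object-at ?o ?loc)))))"""

def problem_from_scene(facts, goal):
    """Assemble a PDDL problem from scene facts like ('cup', 'kitchen')."""
    objects = sorted({name for fact in facts for name in fact} | {"robot"})
    init = ["(at robot base)"] + [f"(object-at {o} {loc})" for o, loc in facts]
    return ("(define (problem find-object) (:domain scene-demo)\n"
            f"  (:objects {' '.join(objects)} base)\n"
            f"  (:init {' '.join(init)})\n"
            f"  (:goal {goal}))")

problem = problem_from_scene([("cup", "kitchen")], "(holding robot cup)")
# These strings would be written to files and handed to any off-the-shelf
# PDDL planner (e.g. Fast Downward) for mission planning.
print(problem)
```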