2,574 research outputs found
Semantic path planning for indoor navigation and household tasks
Assisting people with daily living tasks in their own homes with a robot requires navigation through a cluttered and varying environment. Sometimes the only possible path is blocked by an obstacle which needs to be moved away, but not into other obstructing regions such as the space required for opening a door. This paper presents semantic-assisted path planning, in which a gridded semantic map is used to improve navigation among movable obstacles (NAMO) and to partially plan simple household tasks like cleaning a carpet or moving objects to another location. Semantic planning allows the execution of tasks expressed in human-like form instead of mathematical concepts like coordinates. In our numerical experiments, spatial planning was completed well within a typical human-human dialogue response time, allowing for an immediate response by the robot.
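A planner over a gridded semantic map of the kind this abstract describes can be sketched as a shortest-path search where cell labels carry traversal semantics. This is a minimal illustrative sketch, not the paper's algorithm: the labels, costs, and the rule that movable obstacles are passable at a penalty while door-swing regions are hard constraints are all assumptions for illustration.

```python
# Hypothetical sketch of planning on a gridded semantic map (NAMO-style):
# movable obstacles may be pushed aside at a cost; walls and the space
# needed to open a door may never be entered.
import heapq

FREE, WALL, MOVABLE, DOOR_SWING = 0, 1, 2, 3

# Traversal cost per semantic label; None means the cell is forbidden.
COST = {FREE: 1, WALL: None, MOVABLE: 5, DOOR_SWING: None}

def plan(grid, start, goal):
    """Dijkstra over the semantic grid, returning a list of (row, col) cells."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    pq = [(0, start)]
    while pq:
        d, cell = heapq.heappop(pq)
        if cell == goal:
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return path[::-1]
        if d > dist.get(cell, float("inf")):
            continue
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            step = COST[grid[nr][nc]]
            if step is None:
                continue  # semantically forbidden cell
            nd = d + step
            if nd < dist.get((nr, nc), float("inf")):
                dist[(nr, nc)] = nd
                prev[(nr, nc)] = (r, c)
                heapq.heappush(pq, (nd, (nr, nc)))
    return None  # goal unreachable

grid = [
    [FREE, FREE,    WALL, FREE],
    [FREE, MOVABLE, WALL, FREE],
    [FREE, MOVABLE, FREE, FREE],
]
path = plan(grid, (0, 0), (0, 3))
```

Here the only routes to the goal cross a movable obstacle, so the planner accepts the penalty rather than failing, which is the essence of navigation among movable obstacles.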
Embodied Question Answering
We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where
an agent is spawned at a random location in a 3D environment and asked a
question ("What color is the car?"). In order to answer, the agent must first
intelligently navigate to explore the environment, gather information through
first-person (egocentric) vision, and then answer the question ("orange").
This challenging task requires a range of AI skills -- active perception,
language understanding, goal-driven navigation, commonsense reasoning, and
grounding of language into actions. In this work, we develop the environments,
end-to-end-trained reinforcement learning agents, and evaluation protocols for
EmbodiedQA.
Comment: 20 pages, 13 figures. Webpage: https://embodiedqa.org
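The perceive-navigate-answer structure of an EmbodiedQA episode can be illustrated with a toy loop. This is a hypothetical skeleton, not the paper's environments or agents: the one-dimensional hallway, the `ToyEnv` interface, and the rule-based policy and answerer are invented for illustration.

```python
# Toy EmbodiedQA episode: the agent explores with egocentric observations,
# stops when the queried object is in view, then grounds its answer in
# what it has seen. All interfaces here are illustrative assumptions.
from dataclasses import dataclass

STOP = "stop"

@dataclass
class ToyEnv:
    """Toy 1-D hallway: the 'car' is visible only from position 3."""
    pos: int = 0

    def observe(self):
        return {"sees_car": self.pos == 3, "car_color": "orange"}

    def step(self, action):
        if action == "forward":
            self.pos += 1
        return self.observe()

def policy(obs):
    # Goal-driven navigation: move forward until the target is in view.
    return STOP if obs["sees_car"] else "forward"

def answer(question, frames):
    # Ground the question in the gathered egocentric observations.
    for f in frames:
        if f["sees_car"]:
            return f["car_color"]
    return "unknown"

env = ToyEnv()
obs, frames = env.observe(), []
while True:
    frames.append(obs)
    action = policy(obs)
    if action == STOP:
        break
    obs = env.step(action)

print(answer("What color is the car?", frames))  # prints: orange
```

In the actual task both the policy and the answerer are learned end to end with reinforcement learning rather than hand-written rules, but the episode structure is the same.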
Semantic information for robot navigation: a survey
There is a growing trend in robotics for implementing behavioural mechanisms based on human psychology, such as the processes associated with thinking. Semantic knowledge has opened new paths in robot navigation, allowing a higher level of abstraction in the representation of information. In contrast with the early years, when navigation relied on geometric navigators that interpreted the environment as a series of accessible areas, or later developments that led to the use of graph theory, semantic information has moved robot navigation one step further. This work presents a survey on the concepts, methodologies and techniques that allow including semantic information in robot navigation systems. The techniques involved have to deal with a range of tasks, from modelling the environment and building a semantic map, to including methods to learn new concepts and represent the knowledge acquired, in many cases through interaction with users. As understanding the environment is essential to achieve high-level navigation, this paper reviews techniques for acquisition of semantic information, paying attention to the two main groups: human-assisted and autonomous techniques. Some state-of-the-art semantic knowledge representations are also studied, including ontologies, cognitive maps and semantic maps. All of this leads to a recent concept, semantic navigation, which integrates the previous topics to generate high-level navigation systems able to deal with real-world complex situations.
The research leading to these results has received funding from HEROITEA: Heterogeneous Intelligent Multi-Robot Team for Assistance of Elderly People (RTI2018-095599-B-C21), funded by the Spanish Ministerio de Economía y Competitividad. The research leading to this work was also supported by the project "Robots sociales para estimulación física, cognitiva y afectiva de mayores", funded by the Spanish State Research Agency under grant 2019/00428/001. It is also funded by WASP-AI Sweden, and by the Spanish project Robotic-Based Well-Being Monitoring and Coaching for Elderly People during Daily Life Activities (RTI2018-095599-A-C22).
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
Developing embodied agents in simulation has been a key research topic in
recent years. Exciting new tasks, algorithms, and benchmarks have been
developed in various simulators. However, most of them assume deaf agents in
silent environments, while we humans perceive the world with multiple senses.
We introduce Sonicverse, a multisensory simulation platform with integrated
audio-visual simulation for training household agents that can both see and
hear. Sonicverse models realistic continuous audio rendering in 3D environments
in real-time. Together with a new audio-visual VR interface that allows humans
to interact with agents with audio, Sonicverse enables a series of embodied AI
tasks that need audio-visual perception. For semantic audio-visual navigation
in particular, we also propose a new multi-task learning model that achieves
state-of-the-art performance. In addition, we demonstrate Sonicverse's realism
via sim-to-real transfer, which has not been achieved by other simulators: an
agent trained in Sonicverse can successfully perform audio-visual navigation in
real-world environments. Sonicverse is available at: https://github.com/StanfordVL/Sonicverse.
Comment: In ICRA 2023. Project page: https://ai.stanford.edu/~rhgao/sonicverse/. Code: https://github.com/StanfordVL/sonicverse. Gao and Li contributed equally to this work and are in alphabetical order.
Preferential Multi-Target Search in Indoor Environments using Semantic SLAM
In recent years, the demand for service robots capable of executing tasks
beyond autonomous navigation has grown. In the future, service robots will be
expected to perform complex tasks like 'Set table for dinner'. High-level tasks
like these require, among other capabilities, the ability to retrieve multiple
targets. This paper delves into the challenge of locating multiple targets in
an environment, termed 'Find my Objects.' We present a novel heuristic designed
to facilitate robots in conducting a preferential search for multiple targets
in indoor spaces. Our approach involves a Semantic SLAM framework that combines
semantic object recognition with geometric data to generate a multi-layered
map. We fuse the semantic maps with probabilistic priors for efficient
inference. Recognizing the challenges introduced by obstacles that might
obscure a navigation goal and render standard point-to-point navigation
strategies less viable, our methodology offers resilience to such factors.
Importantly, our method is adaptable to various object detectors, RGB-D SLAM
techniques, and local navigation planners. We demonstrate the 'Find my Objects'
task in real-world indoor environments, yielding quantitative results that
attest to the effectiveness of our methodology. This strategy can be applied in
scenarios where service robots need to locate, grasp, and transport objects,
taking into account user preferences. For a brief summary, please refer to our
video: https://tinyurl.com/PrefTargetSearch
Comment: 6 pages, 8 figures
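Fusing a semantic map with probabilistic priors, as this abstract describes, can be sketched as a Bayesian belief over rooms per target, with the search order chosen preferentially from the fused beliefs. This is an illustrative sketch only: the room priors, detection and false-positive probabilities, and the summed-probability room ranking are made-up assumptions, not the paper's numbers or heuristic.

```python
# Hypothetical 'Find my Objects' sketch: semantic priors over rooms are
# updated with Bayes' rule after each scan, and the robot preferentially
# visits the room most likely to contain any remaining target.

# Prior belief over which room contains each target (semantic knowledge).
prior = {
    "mug":  {"kitchen": 0.7, "living_room": 0.2, "bedroom": 0.1},
    "book": {"kitchen": 0.1, "living_room": 0.4, "bedroom": 0.5},
}

def update(belief, room, detected, p_detect=0.9, p_false=0.05):
    """Bayes update of a per-room belief after scanning `room`.
    A miss lowers that room's probability; a detection raises it."""
    posterior = {}
    for r, p in belief.items():
        if r == room:
            like = p_detect if detected else (1 - p_detect)
        else:
            like = p_false if detected else (1 - p_false)
        posterior[r] = like * p
    z = sum(posterior.values())
    return {r: p / z for r, p in posterior.items()}

def next_room(beliefs):
    """Preferential search order: pick the room with the highest summed
    probability of containing any target still being sought."""
    rooms = {r: 0.0 for r in next(iter(beliefs.values()))}
    for b in beliefs.values():
        for r, p in b.items():
            rooms[r] += p
    return max(rooms, key=rooms.get)

beliefs = dict(prior)
first = next_room(beliefs)           # kitchen scores 0.7 + 0.1 = 0.8
beliefs["mug"] = update(beliefs["mug"], first, detected=False)
```

After a failed scan the belief mass shifts to the remaining rooms, so the search order adapts online instead of following a fixed point-to-point route.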