Machine Learning through Exploration for Perception-Driven Robotics
The ability of robots to perform tasks in human environments has
largely been limited to rather simple and specific tasks, such as lawn mowing
and vacuum cleaning. As such, current robots are far away from the robot butlers, assistants,
and housekeepers that are depicted in science fiction movies. Part of this gap can be
explained by the fact that human environments are hugely varied, complex, and unstructured. Since
every home that a domestic robot might end up in has a different layout, with different objects and furniture, it is impossible for
a human designer to anticipate all the challenges a robot might
face and to equip the robot a priori with all the necessary perceptual and manipulation skills.
Instead, robots could be programmed in a way that allows them to adapt to any
environment that they are in. In that case, the robot designer would not
need to precisely anticipate such environments. The ability to adapt can be provided by
robot learning techniques, which can be applied to learn skills for perception and manipulation.
Many of the current robot learning techniques, however, rely on human supervisors to provide annotations or demonstrations, and to fine-tune the methods' parameters and heuristics. As such,
making a robot perform a task in a novel environment can require a significant investment of human time, even if statistical learning techniques are used.
In this thesis, I focus on another way of obtaining the data a robot needs to
learn about the environment and how to successfully
perform skills in it. By exploring the environment using its own sensors and actuators, rather than
passively waiting for annotations or demonstrations, a
robot can obtain this data by itself. I investigate multiple approaches that allow a robot
to explore its environment autonomously, while trying to minimize the design effort
required to deploy such algorithms in different situations.
First, I consider an unsupervised robot with minimal prior knowledge
about its environment. It can only learn through observed
sensory feedback obtained through interactive exploration of its
environment. In a bottom-up, probabilistic approach, the robot tries to segment
the objects in its environment through clustering with minimal prior knowledge. This clustering is
based on static visual scene features and observed movement. Information-theoretic principles are used to autonomously select actions that maximize
the expected information gain, and thus the learning speed. Our evaluations
on a real robot system equipped with an on-board camera show that the proposed
method handles noisy inputs better than previous methods, and that
action selection according to the information gain criterion does increase the learning speed.
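The information gain criterion can be sketched in its simplest form (an illustrative example with hypothetical names and data, not the model used in the thesis): for a discrete belief over scene hypotheses, the expected information gain of an action is the current entropy of the belief minus the expected entropy of the posterior under that action's observation model.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def expected_information_gain(belief, likelihood):
    """likelihood[o][h] = P(observation o | hypothesis h, action).
    Returns H(belief) minus the expected posterior entropy."""
    eig = entropy(belief)
    for lik_o in likelihood:
        p_o = sum(l * b for l, b in zip(lik_o, belief))
        if p_o == 0:
            continue
        posterior = [l * b / p_o for l, b in zip(lik_o, belief)]
        eig -= p_o * entropy(posterior)
    return eig

# Two candidate pushes on a scene with two segmentation hypotheses.
belief = [0.5, 0.5]
informative = [[1.0, 0.0], [0.0, 1.0]]    # observation identifies the hypothesis
uninformative = [[0.5, 0.5], [0.5, 0.5]]  # observation is independent of it
best = max([informative, uninformative],
           key=lambda a: expected_information_gain(belief, a))
```

Selecting the action with maximal expected information gain (here the informative push, worth one full bit) is what drives the learning-speed improvement described above.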
Often, however, the goal of a robot is not just to learn the structure of the environment, but to learn
how to perform a task encoded by a reward signal.
In addition to the weak feedback provided by reward signals, the robot has access to rich sensory data that, even for
simple tasks, is often non-linear and high-dimensional. Sensory data can be
leveraged to learn a system model, but in high-dimensional sensory spaces this
step often requires manually designing features. I propose a robot
reinforcement learning algorithm with learned non-parametric models, value
functions, and policies that can deal with high-dimensional state representations.
As such, the proposed algorithm is well-suited to deal with high-dimensional signals
such as camera images. To prevent the robot from converging prematurely to a sub-optimal solution,
the information loss of policy updates is bounded. This constraint ensures that the robot keeps exploring the effects
of its behavior on the environment. The experiments show that the proposed non-parametric
relative entropy policy search algorithm performs better than prior methods that either do not employ bounded updates,
or that try to cover the state-space with general-purpose radial basis functions. Furthermore,
the method is validated on a
real-robot setup with high-dimensional camera image inputs.
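The bounded update at the heart of relative entropy policy search can be sketched in its simplest, episodic form (a simplified illustration, not the non-parametric algorithm itself; the function name, the uniform old weighting, and the grid search are my own assumptions): sampled returns are reweighted by exp(R/η), where the temperature η is chosen by minimizing the REPS dual so that the KL divergence from the previous weighting stays below a bound ε.

```python
import math

def reps_weights(returns, epsilon):
    """Episodic REPS sketch: pick a temperature eta by minimizing the dual
    g(eta) = eta * epsilon + eta * log(mean(exp(R / eta))), then weight the
    samples by exp(R / eta).  This keeps the KL divergence between the new
    sample weighting and the old (here: uniform) one below epsilon."""
    r_max = max(returns)
    shifted = [r - r_max for r in returns]  # shift for numerical stability

    def dual(eta):
        mean_exp = sum(math.exp(r / eta) for r in shifted) / len(shifted)
        return eta * epsilon + eta * math.log(mean_exp) + r_max

    # Coarse log-spaced grid search in place of a proper dual optimizer.
    eta = min((10 ** (k / 10) for k in range(-20, 31)), key=dual)
    w = [math.exp(r / eta) for r in shifted]
    z = sum(w)
    return [x / z for x in w], eta

weights, eta = reps_weights([1.0, 2.0, 3.0, 4.0], epsilon=0.5)
```

Better episodes receive exponentially larger weights, but the KL bound keeps the new weighting close enough to the old one that exploration does not collapse prematurely.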
One problem with typical exploration strategies is that the behavior is perturbed independently
in each time step, for example through selecting a random action or random policy parameters.
As such, the resulting exploration behavior might be incoherent. Incoherent exploration leads to
inefficient random-walk behavior, makes the system less robust, and causes wear and tear on the robot.
A typical solution is to perturb the policy parameters directly, and use the same perturbation for an entire episode. However, this
strategy
tends to increase the number of episodes needed, since only a single perturbation can be evaluated per episode. I introduce a
strategy that can make a more balanced trade-off between the advantages of these two approaches.
The experiments show that intermediate trade-offs, rather than fully independent or fully episode-based exploration,
are beneficial across different tasks and learning algorithms.
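One simple way to realize such an intermediate trade-off (an illustrative sketch; the generalized exploration framework in the thesis is more general, and the function name is my own) is first-order autoregressive noise: β = 0 recovers independent per-step perturbations, β = 1 holds a single perturbation for the entire episode, and intermediate β yields temporally coherent exploration.

```python
import random

def correlated_noise(horizon, beta, sigma=1.0, rng=None):
    """AR(1) exploration noise: beta = 0 gives independent per-step noise,
    beta = 1 holds one perturbation for the whole episode, and intermediate
    beta gives temporally coherent noise with stationary variance sigma**2."""
    rng = rng or random.Random(0)
    eps = rng.gauss(0.0, sigma)
    sequence = [eps]
    scale = (1.0 - beta * beta) ** 0.5  # keeps the variance constant over time
    for _ in range(horizon - 1):
        eps = beta * eps + scale * rng.gauss(0.0, sigma)
        sequence.append(eps)
    return sequence
```

These perturbations would be added to the actions (or policy parameters) at each step; tuning β trades off coherence against the number of distinct perturbations evaluated per episode.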
This thesis thus addresses how robots can learn autonomously by exploring the world through
unsupervised learning and reinforcement learning. Throughout the thesis, new approaches
and algorithms are introduced: a probabilistic interactive segmentation approach, the non-parametric
relative entropy policy search algorithm, and a framework for generalized exploration.
To allow the learning algorithms to be applied in different and unknown environments,
the design effort and supervision required from human designers or users is minimized.
These approaches and algorithms contribute
towards the capability of robots to autonomously learn useful skills in human environments in a practical manner.
Registration of 3D Point Clouds and Meshes: A Survey From Rigid to Non-Rigid
Three-dimensional surface registration transforms multiple three-dimensional data sets into the same coordinate system so as to align overlapping components of these sets. Recent surveys have covered different aspects of either rigid or nonrigid registration, but seldom discuss them as a whole. Our study serves two purposes: 1) to give a comprehensive survey of both types of registration, focusing on three-dimensional point clouds and meshes, and 2) to provide a better understanding of registration from the perspective of data fitting. Registration is closely related to data fitting, and comprises three core interwoven components: model selection, correspondences and constraints, and optimization. Study of these components 1) provides a basis for comparing the novelties of different techniques, 2) reveals the similarity of rigid and nonrigid registration in terms of problem representations, and 3) shows how overfitting arises in nonrigid registration and the reasons for the increasing interest in intrinsic techniques. We further summarize some practical issues of registration, including initialization and evaluation, and discuss some of our own observations, insights, and foreseeable research trends.
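As a concrete anchor for the rigid case (an illustrative sketch; the function name, the 2D restriction, and the example points are my own, not from the survey): the least-squares rigid alignment step that ICP-style methods alternate with correspondence search has a closed form, shown here for 2D point sets with known correspondences.

```python
import math

def rigid_align_2d(src, dst):
    """Closed-form least-squares rigid transform (rotation angle, translation)
    mapping src onto dst, given point correspondences.  This is the inner
    alignment step that ICP alternates with nearest-neighbor matching."""
    n = len(src)
    cs = (sum(p[0] for p in src) / n, sum(p[1] for p in src) / n)  # centroids
    cd = (sum(p[0] for p in dst) / n, sum(p[1] for p in dst) / n)
    # Correlations of the centered point sets give the optimal angle.
    sxx = sum((a[0] - cs[0]) * (b[0] - cd[0]) + (a[1] - cs[1]) * (b[1] - cd[1])
              for a, b in zip(src, dst))
    sxy = sum((a[0] - cs[0]) * (b[1] - cd[1]) - (a[1] - cs[1]) * (b[0] - cd[0])
              for a, b in zip(src, dst))
    theta = math.atan2(sxy, sxx)
    c, s = math.cos(theta), math.sin(theta)
    t = (cd[0] - (c * cs[0] - s * cs[1]), cd[1] - (s * cs[0] + c * cs[1]))
    return theta, t

# Recover a known rotation (0.3 rad) and translation (1, 2).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
c0, s0 = math.cos(0.3), math.sin(0.3)
dst = [(c0 * x - s0 * y + 1.0, s0 * x + c0 * y + 2.0) for x, y in src]
theta, t = rigid_align_2d(src, dst)
```

Nonrigid registration replaces this single rigid transform with a richer deformation model, which is exactly where the model-selection and overfitting issues discussed in the survey arise.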
Embodied learning for visual recognition
The field of visual recognition in recent years has come to rely on large, expensively curated, and manually labeled "bags of disembodied images". In the wake of this, my focus has been on understanding and exploiting alternate "free" sources of supervision available to visual learning agents that are situated within real environments. For example, even simply moving from orderless image collections to continuous visual observations offers opportunities to understand the dynamics and other physical properties of the visual world. Further, embodied agents may have the abilities to move around their environment and/or effect changes within it, in which case these abilities offer new means to acquire useful supervision. In this dissertation, I present my work along these and related directions.
Electrical and Computer Engineering
Advances in Graph-Cut Optimization: Multi-Surface Models, Label Costs, and Hierarchical Costs
Computer vision is full of problems that are elegantly expressed in terms of mathematical optimization, or energy minimization. This is particularly true of low-level inference problems such as cleaning up noisy signals, clustering and classifying data, or estimating 3D points from images. Energies let us state each problem as a clear, precise objective function. Minimizing the correct energy would, hypothetically, yield a good solution to the corresponding problem. Unfortunately, even for low-level problems we are confronted by energies that are computationally hard—often NP-hard—to minimize. As a consequence, a rather large portion of computer vision research is dedicated to proposing better energies and better algorithms for energies. This dissertation presents work along the same line, specifically new energies and algorithms based on graph cuts.
We present three distinct contributions. First we consider biomedical segmentation where the object of interest comprises multiple distinct regions of uncertain shape (e.g. blood vessels, airways, bone tissue). We show that this common yet difficult scenario can be modeled as an energy over multiple interacting surfaces, and can be globally optimized by a single graph cut. Second, we introduce multi-label energies with label costs and provide algorithms to minimize them. We show how label costs are useful for clustering and robust estimation problems in vision. Third, we characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm with improved approximation guarantees. Hierarchical costs are natural for modeling an array of difficult problems, e.g. segmentation with hierarchical context, simultaneous estimation of motions and homographies, or detecting hierarchies of patterns
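To make the "single graph cut" idea concrete (an illustrative sketch with hypothetical names, using a plain Edmonds-Karp max-flow solver rather than the specialized algorithms used in vision practice): a binary labeling energy with unary costs and a pairwise smoothness term over a 1D chain can be minimized globally by one s-t minimum cut.

```python
from collections import deque

def max_flow(cap, source, sink):
    """Edmonds-Karp max flow on an adjacency-dict graph; returns the flow
    value and the set of nodes on the source side of a minimum cut."""
    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, set(parent)  # reachable nodes = source side of the cut
        # Push the bottleneck capacity along the path, updating residuals.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] = cap[v].get(u, 0.0) + bottleneck
        flow += bottleneck

def binary_graph_cut(unary, smoothness):
    """Globally minimize sum_i D_i(x_i) + smoothness * sum_i [x_i != x_{i+1}]
    over binary labels x on a 1D chain, by a single s-t minimum cut."""
    n = len(unary)
    s, t = "s", "t"
    cap = {s: {}, t: {}}
    for i in range(n):
        cap[i] = {}
    for i, (cost0, cost1) in enumerate(unary):
        cap[s][i] = cost0  # paid when node i ends up with label 0 (sink side)
        cap[i][t] = cost1  # paid when node i ends up with label 1 (source side)
    for i in range(n - 1):
        cap[i][i + 1] = smoothness  # paid when neighbors take different labels
        cap[i + 1][i] = smoothness
    _, source_side = max_flow(cap, s, t)
    return [1 if i in source_side else 0 for i in range(n)]

# Denoise a noisy binary signal: unary costs pull toward the observations,
# while the smoothness term suppresses isolated label flips.
observations = [0.9, 0.8, 0.2, 0.85, 0.95, 0.1, 0.15, 0.2]
unary = [(v, 1.0 - v) for v in observations]
labels = binary_graph_cut(unary, smoothness=0.5)
```

The dissertation's contributions extend this basic construction to multiple interacting surfaces, multi-label energies with label costs, and hierarchical costs, where a single cut no longer suffices and approximation algorithms are needed.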