Self-Supervised Intrinsic Image Decomposition
Intrinsic decomposition from a single image is a highly challenging task, due
to its inherent ambiguity and the scarcity of training data. In contrast to
traditional fully supervised learning approaches, in this paper we propose
learning intrinsic image decomposition by explaining the input image. Our
model, the Rendered Intrinsics Network (RIN), joins together an image
decomposition pipeline, which predicts reflectance, shape, and lighting
conditions given a single image, with a recombination function, a learned
shading model used to recompose the original input from the intrinsic image
predictions. Our network can then use unsupervised reconstruction error as an
additional signal to improve its intermediate representations. This allows
large-scale unlabeled data to be useful during training, and also enables
transferring learned knowledge to images of unseen object categories, lighting
conditions, and shapes. Extensive experiments demonstrate that our method
performs well on both intrinsic image decomposition and knowledge transfer.
Comment: NIPS 2017 camera-ready version, project page:
http://rin.csail.mit.edu
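The reconstruction signal described in this abstract can be illustrated with a toy sketch. It is not the RIN implementation: a fixed Lambertian shader stands in for the paper's learned shading model, and the names `lambertian_shading`, `recompose`, and `reconstruction_error` are hypothetical.

```python
import numpy as np

def lambertian_shading(normals, light_dir):
    """Per-pixel shading max(0, n . l); a fixed stand-in for a learned shader."""
    light = np.asarray(light_dir, dtype=float)
    light = light / np.linalg.norm(light)
    return np.clip(normals @ light, 0.0, None)

def recompose(reflectance, normals, light_dir):
    """Rebuild an H x W x 3 image as reflectance times shading."""
    shading = lambertian_shading(normals, light_dir)
    return reflectance * shading[..., None]

def reconstruction_error(image, reflectance, normals, light_dir):
    """Mean squared error between the input image and its recomposition."""
    rebuilt = recompose(reflectance, normals, light_dir)
    return float(np.mean((image - rebuilt) ** 2))

# Toy scene (assumed values): a flat surface facing the light.
h, w = 4, 4
normals = np.zeros((h, w, 3))
normals[..., 2] = 1.0
reflectance = np.full((h, w, 3), 0.5)
light = [0.0, 0.0, 1.0]
image = recompose(reflectance, normals, light)

exact_error = reconstruction_error(image, reflectance, normals, light)
wrong_error = reconstruction_error(image, reflectance - 0.1, normals, light)
```

Because the error needs only the input image, it can be computed on unlabeled data; a wrong reflectance estimate (`wrong_error`) is penalized even without ground-truth intrinsics.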
Relate to Predict: Towards Task-Independent Knowledge Representations for Reinforcement Learning
Reinforcement Learning (RL) can enable agents to learn complex tasks.
However, it is difficult to interpret the knowledge and reuse it across tasks.
Inductive biases can address such issues by explicitly providing generic yet
useful decomposition that is otherwise difficult or expensive to learn
implicitly. For example, object-centered approaches decompose a high
dimensional observation into individual objects. Expanding on this, we utilize
an inductive bias for explicit object-centered knowledge separation that
provides further decomposition into semantic representations and dynamics
knowledge. For this, we introduce a semantic module that predicts an object's
semantic state based on its context. The resulting affordance-like object state
can then be used to enrich perceptual object representations. With a minimal
setup and an environment that enables puzzle-like tasks, we demonstrate the
feasibility and benefits of this approach. Specifically, we compare three
different methods of integrating semantic representations into a model-based RL
architecture. Our experiments show that the degree of explicitness in knowledge
separation correlates with faster learning, better accuracy, better
generalization, and better interpretability.
Comment: submitted to IJCNN 202
Understanding deep features with computer-generated imagery
We introduce an approach for analyzing the variation of features generated by
convolutional neural networks (CNNs) with respect to scene factors that occur
in natural images. Such factors may include object style, 3D viewpoint, color,
and scene lighting configuration. Our approach analyzes CNN feature responses
corresponding to different scene factors by controlling for them via rendering
using a large database of 3D CAD models. The rendered images are presented to a
trained CNN and responses for different layers are studied with respect to the
input scene factors. We perform a decomposition of the responses based on
knowledge of the input scene factors and analyze the resulting components. In
particular, we quantify their relative importance in the CNN responses and
visualize them using principal component analysis. We show qualitative and
quantitative results of our study on three CNNs trained on large image
datasets: AlexNet, Places, and Oxford VGG. We observe important differences
across the networks and CNN layers for different scene factors and object
categories. Finally, we demonstrate that our analysis based on
computer-generated imagery translates to the network representation of natural
images
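One way to picture the decomposition of responses by scene factor is an additive split over a balanced grid of rendered images. The sketch below is an assumption, not the authors' exact procedure: it takes features indexed as `[style, viewpoint, dimension]`, uses random numbers in place of real CNN responses, and all function names are hypothetical.

```python
import numpy as np

def decompose_by_factors(features):
    """Split features[style, viewpoint, dim] into additive factor components."""
    mean = features.mean(axis=(0, 1))
    style = features.mean(axis=1) - mean   # (S, D): effect of each style
    view = features.mean(axis=0) - mean    # (V, D): effect of each viewpoint
    residual = features - mean - style[:, None, :] - view[None, :, :]
    return mean, style, view, residual

def variance_fractions(features):
    """Share of the total feature variance attributed to each factor."""
    n_style, n_view, _ = features.shape
    mean, style, view, residual = decompose_by_factors(features)
    total = np.sum((features - mean) ** 2)
    return {
        "style": n_view * np.sum(style ** 2) / total,
        "viewpoint": n_style * np.sum(view ** 2) / total,
        "residual": np.sum(residual ** 2) / total,
    }

def principal_components(component, k=2):
    """Project an (N, D) component onto its top-k principal directions."""
    centered = component - component.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Placeholder data: 5 styles, 8 viewpoints, 16-dimensional features.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8, 16))
fractions = variance_fractions(features)
style_2d = principal_components(decompose_by_factors(features)[1])
```

On a balanced grid the three fractions sum to one, so they can be read as relative importance; `style_2d` gives 2D coordinates suitable for a scatter plot.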
Mathematics for Visual Guidance of Robots
Vision can be used to position a robot relative to a known object or a known environment in 3D. If the object has enough feature points, one view is sufficient for determining the relative position between the object and the camera; otherwise, multiple views are required. We discuss the mathematics of viewpoint determination, using a combination of calibration matrix decomposition and space resection. The combined method has low noise sensitivity and does not require knowledge of camera parameters. If the object does not have enough features, multiple views are required to determine its position and orientation; an example of this is given. The formulation of homogeneous transform equations to drive the manipulator to the goal position is also given
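The calibration matrix decomposition step can be sketched as follows. This is a generic textbook-style illustration, not the paper's formulation: it assumes a 3x4 projection matrix `P = K [R | t]` has already been estimated from point correspondences, and the helpers `rq3` and `decompose_projection` are hypothetical names.

```python
import numpy as np

def rq3(M):
    """RQ-decompose a 3x3 matrix: M = K @ R, K upper triangular, R orthogonal."""
    flip = np.eye(3)[::-1]
    q, r = np.linalg.qr((flip @ M).T)
    K = flip @ r.T @ flip
    R = flip @ q.T
    signs = np.diag(np.sign(np.diag(K)))  # force a positive diagonal on K
    return K @ signs, signs @ R

def decompose_projection(P):
    """Recover intrinsics K, rotation R, translation t, and camera center."""
    P = np.asarray(P, dtype=float)
    if np.linalg.det(P[:, :3]) < 0:  # P is only defined up to scale
        P = -P
    K, R = rq3(P[:, :3])
    t = np.linalg.solve(K, P[:, 3])
    center = -R.T @ t
    return K / K[2, 2], R, t, center

# Synthetic camera (assumed values) used to check the round trip.
K_true = np.array([[800.0, 0.0, 320.0], [0.0, 780.0, 240.0], [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
c, s = np.cos(0.2), np.sin(0.2)
Rx = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
R_true = Rz @ Rx
t_true = np.array([0.1, -0.2, 2.0])
P = 2.5 * K_true @ np.hstack([R_true, t_true[:, None]])  # arbitrary scale

K_est, R_est, t_est, center = decompose_projection(P)
```

Because the camera parameters fall out of the decomposition, they do not need to be known in advance; the recovered `R_est` and `t_est` give the pose of the object frame relative to the camera.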
Query processing of geometric objects with free form boundaries in spatial databases
The increasing demand for the use of database systems as an integrating
factor in CAD/CAM applications has necessitated the development of database
systems with appropriate modelling and retrieval capabilities. One essential
problem is the treatment of geometric data which has led to the development of
spatial databases. Unfortunately, most proposals only deal with simple geometric
objects like multidimensional points and rectangles. On the other hand, there has
been a rapid development in the field of representing geometric objects with free
form curves or surfaces, initiated by engineering applications such as mechanical
engineering, aviation or astronautics. Therefore, we propose a concept for the realization
of spatial retrieval operations on geometric objects with free form
boundaries, such as B-spline or Bezier curves, which can easily be integrated in
a database management system. The key concept is the encapsulation of geometric
operations in a so-called query processor. First, this enables the definition of
an interface allowing the integration into the data model and the definition of the
query language of a database system for complex objects. Second, the approach
allows the use of an arbitrary representation of the geometric objects. After a
short description of the query processor, we propose some representations for free
form objects determined by B-spline or Bezier curves. The goal of efficient query
processing in a database environment is achieved using a combination of decomposition
techniques and spatial access methods. Finally, we present some experimental
results indicating that the performance of decomposition techniques is
clearly superior to traditional query processing strategies for geometric objects
with free form boundaries
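The abstract does not say which decomposition technique is used, so the following is only one plausible sketch: a Bezier curve is split recursively with de Casteljau subdivision, and the bounding boxes of the pieces' control points serve as a conservative filter for a window query. All function names and the example curve are assumptions.

```python
def split_bezier(ctrl, t=0.5):
    """Split a Bezier curve at parameter t into two control polygons."""
    left, right = [ctrl[0]], [ctrl[-1]]
    pts = ctrl
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
        left.append(pts[0])
        right.append(pts[-1])
    return left, right[::-1]

def decompose(ctrl, depth):
    """Recursively halve the curve, yielding 2**depth pieces."""
    if depth == 0:
        return [ctrl]
    left, right = split_bezier(ctrl)
    return decompose(left, depth - 1) + decompose(right, depth - 1)

def bbox(ctrl):
    """Box (xmin, ymin, xmax, ymax) of the control points; encloses the piece."""
    xs = [x for x, _ in ctrl]
    ys = [y for _, y in ctrl]
    return (min(xs), min(ys), max(xs), max(ys))

def overlaps(a, b):
    """True if two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def may_intersect(ctrl, window, depth):
    """Filter step: False means the curve certainly misses the window."""
    return any(overlaps(bbox(piece), window) for piece in decompose(ctrl, depth))

curve = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
window = (1.5, 1.8, 2.5, 2.0)  # lies above the curve, inside its coarse box

coarse = may_intersect(curve, window, depth=0)
refined = may_intersect(curve, window, depth=2)
```

The undecomposed bounding box reports a false candidate (`coarse`), while the decomposed pieces reject the window (`refined`). In a database setting the piece boxes could be stored in a spatial access method such as an R-tree, which is again an assumption rather than something the abstract specifies.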
The role of functional prototyping within the KADS methodology : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University
Knowledge-based systems have until recent times lacked a clear and complete methodology for their construction. KADS was the result of an early 1980s project (ESPRIT-I P1098) which had the aim of developing a comprehensive, commercially viable methodology for knowledge-based system construction. KADS has subsequently proved to be one of the more popular approaches, focusing on the modelling approach to knowledge-based system development. One area of the KADS methodology that has not been examined to any great depth is model validation. Model validation is the process of ensuring that a derived model is an accurate representation of the domain from which it has been derived. The two approaches which have been suggested for this purpose within the KADS framework are protocol analysis and functional prototyping. This project seeks to apply the second of these, functional prototyping, to the model of expertise created by da Silva (1994) for model validation purposes. The problem domain is farm management, under a joint program of research between the Computer Science, Information Systems and Agricultural Management departments of Massey University. The project took the model of expertise and created a knowledge representation model in compliance with the selected object-oriented paradigm. A functional prototype was then created in a Microsoft Windows based PC environment, using Kappa-PC as the application development tool. The validation took place through a demonstration session to a number of domain experts. Conclusions drawn from the experience gained through the creation and use of the prototype are presented, outlining the reasons why functional prototyping was deemed to be an appropriate method for model validation
Bayesian Robot Programming
We propose a new method to program robots based on Bayesian inference and learning. The capacities of this programming method are demonstrated through a succession of increasingly complex experiments. Starting from the learning of simple reactive behaviors, we present instances of behavior combinations, sensor fusion, hierarchical behavior composition, situation recognition and temporal sequencing. This series of experiments comprises the steps in the incremental development of a complex robot program. The advantages and drawbacks of this approach are discussed along with these different experiments and summed up as a conclusion. These different robotics programs may be seen as an illustration of probabilistic programming applicable whenever one must deal with problems based on uncertain or incomplete knowledge. The scope of possible applications is obviously much broader than robotics
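Sensor fusion under uncertain knowledge, one of the experiments listed in this abstract, can be illustrated with a small Bayesian update. The sketch assumes that sensor readings are conditionally independent given the state; the states, sensor names, and probabilities are invented for illustration and do not come from the paper.

```python
def fuse_sensors(prior, likelihoods, observations):
    """Posterior over states, assuming sensors are independent given the state.

    prior: {state: P(state)}
    likelihoods: one {state: {reading: P(reading | state)}} table per sensor
    observations: one reading per sensor, in the same order
    """
    posterior = {}
    for state, p in prior.items():
        for sensor, reading in zip(likelihoods, observations):
            p *= sensor[state][reading]
        posterior[state] = p
    total = sum(posterior.values())
    return {state: p / total for state, p in posterior.items()}

# Hypothetical two-sensor setup with made-up probabilities.
prior = {"obstacle": 0.5, "free": 0.5}
sonar = {
    "obstacle": {"near": 0.8, "far": 0.2},
    "free": {"near": 0.3, "far": 0.7},
}
infrared = {
    "obstacle": {"near": 0.9, "far": 0.1},
    "free": {"near": 0.2, "far": 0.8},
}

posterior = fuse_sensors(prior, [sonar, infrared], ["near", "near"])
```

With both sensors reporting "near", the posterior for "obstacle" is 0.36 / 0.39, higher than either sensor would give alone; a reactive behavior could then be chosen from this distribution rather than from a hard threshold.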