Learning Interpretable Spatial Operations in a Rich 3D Blocks World
In this paper, we study the problem of mapping natural language instructions
to complex spatial actions in a 3D blocks world. We first introduce a new
dataset that pairs complex 3D spatial operations with rich natural language
descriptions requiring complex spatial and pragmatic interpretations such as
"mirroring", "twisting", and "balancing". This dataset, built on the simulation
environment of Bisk, Yuret, and Marcu (2016), contains language that is
significantly richer and more complex than that of the original 2D environment,
while also doubling the original dataset's size with 100 new world
configurations and 250,000 tokens. In addition, we propose a new neural
architecture that achieves competitive results while automatically discovering
an inventory of interpretable spatial operations (Figure 5).
Comment: AAAI 201
Hypernetwork functional image representation
Motivated by the human way of memorizing images, we introduce a functional
representation in which an image is represented by a neural network. For this
purpose, we construct a hypernetwork that takes an image and returns the
weights of a target network, which maps points in the plane (representing
pixel positions) to their corresponding colors in the image. Since the obtained
representation is continuous, one can easily inspect the image at various
resolutions and perform arbitrary continuous operations on it. Moreover, by
inspecting interpolations we show that such a representation has some
properties characteristic of generative models. To evaluate the proposed
mechanism experimentally, we apply it to the image super-resolution problem.
Despite using a single model for various scaling factors, we obtain results
comparable to those of existing super-resolution methods.
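The hypernetwork idea above can be sketched in a few lines: an image is mapped to the weights of a small "target network" that turns (x, y) coordinates into colors, and because that mapping is continuous it can be sampled on a grid of any resolution. This is a minimal NumPy sketch under assumed details (the layer sizes, activations, and the toy stand-in for the trained hypernetwork are illustrative, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def hypernetwork(image, hidden=16):
    """Toy stand-in for the hypernetwork: derives target-network weights
    from image statistics. A real hypernetwork would be a trained model
    conditioned on the image."""
    seed = int(image.sum() * 1e6) % (2**32)
    r = np.random.default_rng(seed)
    return {
        "W1": r.standard_normal((2, hidden)) * 0.5,
        "b1": np.zeros(hidden),
        "W2": r.standard_normal((hidden, 3)) * 0.5,
        "b2": np.zeros(3),
    }

def target_network(coords, weights):
    """Maps (x, y) positions in [0, 1]^2 to RGB colors in [0, 1]."""
    h = np.tanh(coords @ weights["W1"] + weights["b1"])
    return 1.0 / (1.0 + np.exp(-(h @ weights["W2"] + weights["b2"])))

def render(weights, height, width):
    """The representation is continuous, so we can query it on a pixel
    grid of any resolution -- the basis of the super-resolution use case."""
    ys, xs = np.meshgrid(np.linspace(0, 1, height),
                         np.linspace(0, 1, width), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
    return target_network(coords, weights).reshape(height, width, 3)

image = rng.random((8, 8, 3))   # source image
w = hypernetwork(image)         # image -> target-network weights
low = render(w, 8, 8)           # sample at native resolution
high = render(w, 32, 32)        # sample the same function 4x denser
print(low.shape, high.shape)    # (8, 8, 3) (32, 32, 3)
```

In the paper's setup, the hypernetwork would be trained so that the emitted target network reproduces the input image at the training coordinates; sampling the same continuous function on a denser grid then yields the upscaled image, with one model serving all scaling factors.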