67 research outputs found
Invariant Tensor Feature Coding
We propose a novel feature coding method that exploits invariance. We
consider the setting where the transformations that preserve the image content
form a finite group of orthogonal matrices, as is the case for many image
transformations such as rotations and flips. We prove that the
group-invariant feature vector contains sufficient discriminative information
when learning a linear classifier using convex loss minimization. From this
result, we propose novel feature models for principal component analysis and
k-means clustering, which underlie most feature coding methods, together with
global feature functions that explicitly account for the group action. Although
the global feature functions are complex nonlinear functions in general, the
group action on this space is easy to compute when the functions are
constructed as tensor product representations of basic representations, which
yields an explicit form for the invariant feature functions. We demonstrate the
effectiveness of our methods on several image datasets.
Comment: 14 pages, 5 figures
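The core invariance construction described in the abstract can be illustrated with a toy example (my own sketch, not the paper's code): for a finite group G of orthogonal matrices, averaging a feature vector over G gives a group-invariant feature, since replacing x with g0 @ x only permutes the terms of the sum.

```python
import numpy as np

# Toy sketch, assuming a 2-D feature and the order-2 flip group
# G = {I, F}; the group average
#     x_bar = (1/|G|) * sum_{g in G} g @ x
# is invariant under every g in G.
I = np.eye(2)
F = np.diag([1.0, -1.0])   # orthogonal reflection
group = [I, F]

def invariant_feature(x, group):
    """Average x over the group action (the invariant projection)."""
    return sum(g @ x for g in group) / len(group)

x = np.array([3.0, 5.0])
x_bar = invariant_feature(x, group)   # -> [3., 0.]

# Invariance check: transforming the input first changes nothing.
for g in group:
    assert np.allclose(invariant_feature(g @ x, group), x_bar)
```

For the flip group the average simply zeroes out the flipped coordinate, which makes the loss of non-invariant information visible; the paper's discriminability result concerns what survives this projection.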
A General Theory of Equivariant CNNs on Homogeneous Spaces
We present a general theory of Group equivariant Convolutional Neural
Networks (G-CNNs) on homogeneous spaces such as Euclidean space and the sphere.
Feature maps in these networks represent fields on a homogeneous base space,
and layers are equivariant maps between spaces of fields. The theory enables a
systematic classification of all existing G-CNNs in terms of their symmetry
group, base space, and field type. We also consider a fundamental question:
what is the most general kind of equivariant linear map between feature spaces
(fields) of given types? Following Mackey, we show that such maps correspond
one-to-one with convolutions using equivariant kernels, and characterize the
space of such kernels.
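The convolution-is-all-you-need claim can be checked in the simplest homogeneous space, the cyclic group Z/n under shifts (a toy illustration of mine, not the paper's construction): circular correlation with a kernel commutes with the group action, i.e. shifting the input shifts the output.

```python
import numpy as np

# Sketch, assuming signals on Z/n with cyclic shifts as the symmetry
# group; circular correlation is an equivariant linear map.

def correlate(f, psi):
    """Circular cross-correlation: out[x] = sum_y f[(x+y) % n] * psi[y]."""
    n = len(f)
    return np.array([sum(f[(x + y) % n] * psi[y] for y in range(n))
                     for x in range(n)])

def shift(f, t):
    """Action of the shift t: (t.f)[x] = f[x - t]."""
    return np.roll(f, t)

f = np.array([1.0, 2.0, 3.0, 4.0])
psi = np.array([0.5, -1.0, 0.0, 0.25])

# Equivariance: correlate(shift(f)) == shift(correlate(f)).
lhs = correlate(shift(f, 1), psi)
rhs = shift(correlate(f, psi), 1)
assert np.allclose(lhs, rhs)
```

The theorem in the paper generalizes this picture: on other homogeneous spaces and with non-scalar field types, the equivariant kernels must additionally satisfy a constraint tied to the stabilizer subgroup.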
Learning to Convolve: A Generalized Weight-Tying Approach
Recent work (Cohen & Welling, 2016) has shown that generalizations of
convolutions, based on group theory, provide powerful inductive biases for
learning. In these generalizations, filters are not only translated but can
also be rotated, flipped, etc. However, coming up with exact models of how to
rotate a 3 x 3 filter on a square pixel-grid is difficult. In this paper, we
learn how to transform filters for use in the group convolution, focusing on
roto-translation. For this, we learn a filter basis and all rotated versions of
that filter basis. Filters are then encoded by a set of rotation invariant
coefficients. To rotate a filter, we switch the basis. We demonstrate we can
produce feature maps with low sensitivity to input rotations, while achieving
high performance on MNIST and CIFAR-10.
Comment: Accepted to ICML 201
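The "rotate a filter by switching the basis" idea can be sketched in a few lines (a toy version using exact 90-degree rotations, whereas the paper learns the rotated basis to handle arbitrary angles on a pixel grid): a filter is stored as coefficients c over a filter basis B, and the rotated filter is obtained by applying the same c to the rotated basis.

```python
import numpy as np

# Sketch, assuming 3x3 filters and a full basis of 9 random filters;
# names (B, B_rot, c) are illustrative, not from the paper.
rng = np.random.default_rng(0)
B = rng.standard_normal((9, 9))   # columns = basis filters, flattened 3x3

# Rotated basis: rotate each basis filter by 90 degrees.
B_rot = np.stack([np.rot90(b.reshape(3, 3)).ravel() for b in B.T], axis=1)

c = rng.standard_normal(9)        # rotation-invariant coefficients
w = B @ c                         # filter in the original basis
w_rot = B_rot @ c                 # same coefficients, rotated basis

# Because rotation is linear, switching the basis rotates the filter.
assert np.allclose(w_rot.reshape(3, 3), np.rot90(w.reshape(3, 3)))
```

The coefficients c never change under rotation, which is what lets the network tie weights across all rotated copies of a filter.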
CB2: Collaborative Natural Language Interaction Research Platform
CB2 is a multi-agent platform to study collaborative natural language
interaction in a grounded task-oriented scenario. It includes a 3D game
environment, a backend server designed to serve trained models to human agents,
and various tools and processes to enable scalable studies. We deploy CB2 at
https://cb2.ai as a system demonstration with a learned instruction following
model.
- …