29 research outputs found
Efficient end-to-end learning for quantizable representations
Embedding representation learning via neural networks is at the core
foundation of modern similarity based search. While much effort has been put in
developing algorithms for learning binary hamming code representations for
search efficiency, this still requires a linear scan of the entire dataset per
each query and trades off the search accuracy through binarization. To this
end, we consider the problem of directly learning a quantizable embedding
representation and the sparse binary hash code end-to-end which can be used to
construct an efficient hash table not only providing significant search
reduction in the number of data but also achieving the state of the art search
accuracy outperforming previous state of the art deep metric learning methods.
We also show that finding the optimal sparse binary hash code in a mini-batch
can be computed exactly in polynomial time by solving a minimum cost flow
problem. Our results on Cifar-100 and on ImageNet datasets show the state of
the art search accuracy in precision@k and NMI metrics while providing up to
98X and 478X search speedup respectively over exhaustive linear search. The
source code is available at
https://github.com/maestrojeong/Deep-Hash-Table-ICML18Comment: Accepted and to appear at ICML 2018. Camera ready versio
VMesh: Hybrid Volume-Mesh Representation for Efficient View Synthesis
With the emergence of neural radiance fields (NeRFs), view synthesis quality
has reached an unprecedented level. Compared to traditional mesh-based assets,
this volumetric representation is more powerful in expressing scene geometry
but inevitably suffers from high rendering costs and can hardly be involved in
further processes like editing, posing significant difficulties in combination
with the existing graphics pipeline. In this paper, we present a hybrid
volume-mesh representation, VMesh, which depicts an object with a textured mesh
along with an auxiliary sparse volume. VMesh retains the advantages of
mesh-based assets, such as efficient rendering, compact storage, and easy
editing, while also incorporating the ability to represent subtle geometric
structures provided by the volumetric counterpart. VMesh can be obtained from
multi-view images of an object and renders at 2K 60FPS on common consumer
devices with high fidelity, unleashing new opportunities for real-time
immersive applications.Comment: Project page: https://bennyguo.github.io/vmesh
A Transfer Principle: Universal Approximators Between Metric Spaces From Euclidean Universal Approximators
We build universal approximators of continuous maps between arbitrary Polish
metric spaces and using universal approximators
between Euclidean spaces as building blocks. Earlier results assume that the
output space is a topological vector space. We overcome this
limitation by "randomization": our approximators output discrete probability
measures over . When and are Polish
without additional structure, we prove very general qualitative guarantees;
when they have suitable combinatorial structure, we prove quantitative
guarantees for H\"older-like maps, including maps between finite graphs,
solution operators to rough differential equations between certain Carnot
groups, and continuous non-linear operators between Banach spaces arising in
inverse problems. In particular, we show that the required number of Dirac
measures is determined by the combinatorial structure of and
. For barycentric , including Banach spaces,
-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric
spaces, our approximators reduce to -valued functions. When the
Euclidean approximators are neural networks, our constructions generalize
transformer networks, providing a new probabilistic viewpoint of geometric deep
learning.Comment: 14 Figures, 3 Tables, 78 Pages (Main 40, Proofs 26, Acknowledgments
and References 12
Action in Mind: Neural Models for Action and Intention Perception
To notice, recognize, and ultimately perceive the others’ actions and to discern the intention behind those observed actions is an essential skill for social communications and improves markedly the chances of survival. Encountering dangerous behavior, for instance, from a person or an animal requires an immediate and suitable reaction. In addition, as social creatures, we need to perceive, interpret, and judge correctly the other individual’s actions as a fundamental skill for our social life. In other words, our survival and success in adaptive social behavior and nonverbal communication depends heavily on our ability to thrive in complex social situations. However, it has been shown that humans spontaneously can decode animacy and social interactions even from strongly impoverished stimuli and this is a fundamental part of human experience that develops early in infancy and is shared with other primates. In addition, it is well established that perceptual and motor representations of actions are tightly coupled and both share common mechanisms. This coupling between action perception and action execution plays a critical role in action understanding as postulated in various studies and they are potentially important for our social cognition. This interaction likely is mediated by action-selective neurons in the superior temporal sulcus (STS), premotor and parietal cortex. STS and TPJ have been identified also as coarse neural substrate for the processing of social interactions stimuli. Despite this localization, the underlying exact neural circuits of this processing remain unclear. The aim of this thesis is to understand the neural mechanisms behind the action perception coupling and to investigate further how human brain perceive different classes of social interactions. To achieve this goal, first we introduce a neural model that provides a unifying account for multiple experiments on the interaction between action execution and action perception. The model reproduces correctly the interactions between action observation and execution in several experiments and provides a link towards electrophysiological detailed models of relevant circuits. This model might thus provide a starting point for the detailed quantitative investigation how motor plans interact with perceptual action representations at the level of single-cell mechanisms. Second we present a simple neural model that reproduces some of the key observations in psychophysical experiments about the perception of animacy and social interactions from stimuli. Even in its simple form the model proves that animacy and social interaction judgments partly might be derived by very elementary operations in hierarchical neural vision systems, without a need of sophisticated or accurate probabilistic inference