3,134 research outputs found
Input and Weight Space Smoothing for Semi-supervised Learning
We propose regularizing the empirical loss for semi-supervised learning by
acting on both the input (data) space, and the weight (parameter) space. We
show that the two are not equivalent, and in fact are complementary, one
affecting the minimality of the resulting representation, the other
insensitivity to nuisance variability. We propose a method to perform such
smoothing, which combines known input-space smoothing with a novel weight-space
smoothing, based on a min-max (adversarial) optimization. The resulting
Adversarial Block Coordinate Descent (ABCD) algorithm performs gradient ascent
with a small learning rate for a random subset of the weights, and standard
gradient descent on the remaining weights in the same mini-batch. It achieves
comparable performance to the state-of-the-art without resorting to heavy data
augmentation, using a relatively simple architecture
Multi-Entity Dependence Learning with Rich Context via Conditional Variational Auto-encoder
Multi-Entity Dependence Learning (MEDL) explores conditional correlations
among multiple entities. The availability of rich contextual information
requires a nimble learning scheme that tightly integrates with deep neural
networks and has the ability to capture correlation structures among
exponentially many outcomes. We propose MEDL_CVAE, which encodes a conditional
multivariate distribution as a generating process. As a result, the variational
lower bound of the joint likelihood can be optimized via a conditional
variational auto-encoder and trained end-to-end on GPUs. Our MEDL_CVAE was
motivated by two real-world applications in computational sustainability: one
studies the spatial correlation among multiple bird species using the eBird
data and the other models multi-dimensional landscape composition and human
footprint in the Amazon rainforest with satellite images. We show that
MEDL_CVAE captures rich dependency structures, scales better than previous
methods, and further improves on the joint likelihood taking advantage of very
large datasets that are beyond the capacity of previous methods.Comment: The first two authors contribute equall
- …