Adversarial Discriminative Heterogeneous Face Recognition
The gap between sensing patterns of different face modalities remains a
challenging problem in heterogeneous face recognition (HFR). This paper
proposes an adversarial discriminative feature learning framework to close the
sensing gap via adversarial learning on both raw-pixel space and compact
feature space. This framework integrates cross-spectral face hallucination and
discriminative feature learning into an end-to-end adversarial network. In the
pixel space, we make use of generative adversarial networks to perform
cross-spectral face hallucination. An elaborate two-path model is introduced to
alleviate the lack of paired images, which gives consideration to both global
structures and local textures. In the feature space, an adversarial loss and a
high-order variance discrepancy loss are employed to measure the global and
local discrepancy between two heterogeneous distributions respectively. These
two losses enhance domain-invariant feature learning and modality independent
noise removing. Experimental results on three NIR-VIS databases show that our
proposed approach outperforms state-of-the-art HFR methods, without requiring
a complex network or a large-scale training dataset.
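The "high-order variance discrepancy loss" above measures how far apart two modalities' feature distributions are. A minimal sketch of such a moment-matching discrepancy, using NumPy and a hypothetical `variance_discrepancy` helper (not the authors' code), could compare per-dimension means and variances of NIR and VIS embedding batches:

```python
import numpy as np

def variance_discrepancy(feat_a, feat_b):
    """Second-order (mean/variance) discrepancy between two feature batches.

    A simplified stand-in for the moment-matching loss described in the
    abstract: compare per-dimension statistics of the two modalities'
    embeddings; zero only when the batch statistics coincide.
    """
    mean_gap = np.mean((feat_a.mean(axis=0) - feat_b.mean(axis=0)) ** 2)
    var_gap = np.mean((feat_a.var(axis=0) - feat_b.var(axis=0)) ** 2)
    return mean_gap + var_gap

rng = np.random.default_rng(0)
nir = rng.normal(0.0, 1.0, size=(256, 64))   # stand-in NIR embeddings
vis = rng.normal(0.5, 1.5, size=(256, 64))   # stand-in VIS embeddings

print(variance_discrepancy(nir, nir))  # identical batches: exactly 0
print(variance_discrepancy(nir, vis))  # cross-modality gap is larger
```

Minimizing such a quantity during training pushes the two modality distributions toward matching statistics, which is the domain-invariance effect the abstract describes.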
Interpret Federated Learning with Shapley Values
Federated Learning is introduced to protect privacy by distributing the
training data across multiple parties. Each party trains its own model, and a
meta-model is
constructed from the sub-models. In this way, the details of the data are not
disclosed between the parties. In this paper, we investigate model
interpretation methods for Federated Learning, specifically on the measurement
of feature importance in vertical Federated Learning, where the feature space
of the data is divided between two parties, namely the host and the guest. For
the host party to interpret a single prediction of a vertical Federated
Learning model, the
interpretation results, namely the feature importance, are very likely to
reveal the protected data from guest party. We propose a method to balance the
model interpretability and data privacy in vertical Federated Learning by using
Shapley values to reveal detailed feature importance for host features and a
unified importance value for federated guest features. Our experiments indicate
robust and informative results for interpreting Federated Learning models.
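The core of the proposal is to treat the guest party's features as a single Shapley "player" so their individual values are never exposed. A minimal exact-Shapley sketch (hypothetical player names; the real system would use an approximation for large feature sets) could look like:

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(value_fn, players):
    """Exact Shapley values for a small set of players.

    `value_fn(coalition)` returns the model payoff for a frozenset of
    player names. Grouping all guest features into one player yields a
    single unified importance for the guest party.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for r in range(len(others) + 1):
            for coal in combinations(others, r):
                s = frozenset(coal)
                # Standard Shapley coalition weight |S|!(n-|S|-1)!/n!
                weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                total += weight * (value_fn(s | {p}) - value_fn(s))
        phi[p] = total
    return phi

# Hypothetical additive scorer: two host features, one pooled guest block.
contrib = {"host_age": 1.0, "host_income": 2.0, "guest_block": 3.0}
value = lambda coal: sum(contrib[p] for p in coal)

phi = shapley_values(value, list(contrib))
print(phi)  # additive game: each Shapley value equals the player's own weight
```

For an additive value function the Shapley value of each player reduces to its own contribution, which makes the sketch easy to sanity-check; real federated models require evaluating the meta-model with features masked or re-sampled.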
Graph Spectral Feature Learning for Mixed Data of Categorical and Numerical Type
Feature learning in the presence of a mixed type of variables, numerical and
categorical types, is an important issue for related modeling problems. For
simple neighborhood queries in a mixed data space, the standard practice is to
consider numerical and categorical variables separately and to combine them
based on suitable distance functions. Alternatives, such as kernel
learning or principal component analysis, do not explicitly consider the inter-dependence
structure among the mixed type of variables. In this work, we propose a novel
strategy to explicitly model the probabilistic dependence structure among the
mixed type of variables by an undirected graph. Spectral decomposition of the
graph Laplacian provides the desired feature transformation. The eigenspectrum
of the transformed feature space shows increased separability and more
prominent clusterability among the observations. The main novelty of our paper
lies in capturing interactions of the mixed feature type in an unsupervised
framework using a graphical model. We numerically validate the implications of
the feature learning strategy.
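The spectral step above is the standard graph-Laplacian embedding. A small sketch under a toy dependence graph (hypothetical `spectral_embedding` helper; the paper's contribution is in how the graph over mixed variables is built, not in this decomposition):

```python
import numpy as np

def spectral_embedding(adj, dim):
    """Embed nodes via the lowest nonzero eigenvectors of the graph Laplacian.

    `adj` is a symmetric (weighted) adjacency matrix of the dependence
    graph; eigenvectors of L = D - A give the feature transformation.
    """
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    vals, vecs = np.linalg.eigh(lap)      # eigenvalues in ascending order
    return vecs[:, 1:1 + dim]             # skip the constant eigenvector

# Toy dependence graph: two triangles joined by one weak edge.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    adj[i, j] = adj[j, i] = 1.0
adj[2, 3] = adj[3, 2] = 0.1

emb = spectral_embedding(adj, dim=1)
# The Fiedler vector separates the two clusters by sign.
print(np.sign(emb[:3, 0]), np.sign(emb[3:, 0]))
```

The sign structure of the Fiedler vector is what produces the "increased separability and more prominent clusterability" the abstract reports: tightly dependent variable groups land on opposite sides of the embedding.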
Learning to Rank Using Localized Geometric Mean Metrics
Many learning-to-rank (LtR) algorithms focus on the query-independent model, in
which the query and documents do not lie in the same feature space and the
rankers rely on an ensemble of features of the query-document pair instead of
the similarity between the query instance and the documents. However, existing
algorithms
do not consider local structures in query-document feature space, and are
fragile to irrelevant noise features. In this paper, we propose a novel
Riemannian metric learning algorithm to capture the local structures and
develop a robust LtR algorithm. First, we design a concept called \textit{ideal
candidate document} to introduce metric learning into the query-independent
model. Previous metric learning algorithms aiming to find an optimal metric
space are only suitable for query-dependent model, in which query instance and
documents belong to the same feature space and the similarity is directly
computed from the metric space. Then we extend the new and extremely fast
global Geometric Mean Metric Learning (GMML) algorithm to develop a localized
GMML, namely L-GMML. Based on the combination of local learned metrics, we
employ the popular Normalized Discounted Cumulative Gain~(NDCG) scorer and
Weighted Approximate Rank Pairwise (WARP) loss to optimize the \textit{ideal
candidate document} for each query candidate set. Finally, we can quickly
evaluate all candidates via the similarity between the \textit{ideal candidate
document} and other candidates. By leveraging the ability of metric learning
algorithms to describe the complex structural information, our approach gives
us a principled and efficient way to perform LtR tasks. The experiments on
real-world datasets demonstrate that our proposed L-GMML algorithm outperforms
the state-of-the-art metric learning to rank methods and the popular
query-independent LtR algorithms in terms of accuracy and computational
efficiency.
Comment: To appear in SIGIR'1
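The NDCG scorer used to optimize the ideal candidate document is standard and easy to state. A minimal sketch (gain convention 2^rel - 1, log2 discount; the paper may use a different cutoff or gain):

```python
import numpy as np

def ndcg_at_k(relevances, k=10):
    """Normalized Discounted Cumulative Gain for one ranked list.

    `relevances` holds graded relevance labels in the order the ranker
    placed the documents; the score is DCG normalized by the DCG of the
    ideal (descending-relevance) ordering.
    """
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = np.sum((2.0 ** rel - 1.0) * discounts)
    ideal = np.sort(np.asarray(relevances, dtype=float))[::-1][:k]
    idcg = np.sum((2.0 ** ideal - 1.0) * discounts)
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg_at_k([3, 2, 3, 0, 1, 2]))   # imperfect ranking: below 1.0
print(ndcg_at_k([3, 3, 2, 2, 1, 0]))   # ideal ordering: exactly 1.0
```

Because NDCG is bounded by 1 and position-discounted, optimizing the ideal candidate document against it rewards placing highly relevant documents near the top of each query's candidate set.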
Inverse Reinforcement Learning via Deep Gaussian Process
We propose a new approach to inverse reinforcement learning (IRL) based on
the deep Gaussian process (deep GP) model, which is capable of learning
complicated reward structures with few demonstrations. Our model stacks
multiple latent GP layers to learn abstract representations of the state
feature space, which is linked to the demonstrations through the Maximum
Entropy learning framework. Incorporating the IRL engine into the nonlinear
latent structure renders existing deep GP inference approaches intractable. To
tackle this, we develop a non-standard variational approximation framework
which extends previous inference schemes. This allows for approximate Bayesian
treatment of the feature space and guards against overfitting. Carrying out
representation and inverse reinforcement learning simultaneously within our
model outperforms state-of-the-art approaches, as we demonstrate with
experiments on standard benchmarks ("object world", "highway driving") and a new
benchmark ("binary world").
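A deep GP stacks layers whose individual building block is an ordinary GP. As a rough single-layer sketch (RBF kernel, posterior mean only; the paper's model adds stacked latent layers and a non-standard variational scheme), assuming hypothetical helper names:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel, a standard GP covariance choice."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior_mean(x_train, y_train, x_test, noise=1e-3):
    """Posterior mean of a single GP layer; deep GPs stack such layers."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    return k_st @ np.linalg.solve(k_tt, y_train)

x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x)           # stand-in reward structure over states
pred = gp_posterior_mean(x, y, x)
print(np.max(np.abs(pred - y)))     # near-interpolation of training targets
```

Feeding the output of one such layer as the input of the next is what makes exact inference intractable and motivates the variational approximation the abstract describes.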
Domain-Invariant Projection Learning for Zero-Shot Recognition
Zero-shot learning (ZSL) aims to recognize unseen object classes without any
training samples, which can be regarded as a form of transfer learning from
seen classes to unseen ones. This is made possible by learning a projection
between a feature space and a semantic space (e.g. attribute space). Key to ZSL
is thus to learn a projection function that is robust against the often large
domain gap between the seen and unseen classes. In this paper, we propose a
novel ZSL model termed domain-invariant projection learning (DIPL). Our model
has two novel components: (1) A domain-invariant feature self-reconstruction
task is introduced to the seen/unseen class data, resulting in a simple linear
formulation that casts ZSL into a min-min optimization problem. Solving the
problem is non-trivial, and a novel iterative algorithm is formulated as the
solver, with rigorous theoretic algorithm analysis provided. (2) To further
align the two domains via the learned projection, shared semantic structure
among seen and unseen classes is explored via forming superclasses in the
semantic space. Extensive experiments show that our model outperforms the
state-of-the-art alternatives by significant margins.
Comment: Accepted to NIPS 201
A Novel Perspective to Zero-shot Learning: Towards an Alignment of Manifold Structures via Semantic Feature Expansion
Zero-shot learning aims at recognizing unseen classes (no training example)
with knowledge transferred from seen classes. This is typically achieved by
exploiting a semantic feature space shared by both seen and unseen classes,
i.e., attribute or word vector, as the bridge. One common practice in zero-shot
learning is to train a projection between the visual and semantic feature
spaces with labeled examples of seen classes. At inference time, this learned
projection is applied to unseen classes, and class labels are predicted by some
distance metric. However, the visual and semantic feature spaces are mutually
independent and have quite different manifold structures. Under such a
paradigm, most existing methods suffer from the domain shift problem, which
weakens zero-shot recognition performance. To address this issue, we
propose a novel model called AMS-SFE. It considers the alignment of manifold
structures by semantic feature expansion. Specifically, we build upon an
autoencoder-based model to expand the semantic features from the visual inputs.
Additionally, the expansion is jointly guided by an embedded manifold extracted
from the visual feature space of the data. Our model is the first attempt to
align both feature spaces by expanding semantic features and derives two
benefits: first, we expand some auxiliary features that enhance the semantic
feature space; second and more importantly, we implicitly align the manifold
structures between the visual and semantic feature spaces; thus, the projection
can be better trained and mitigate the domain shift problem. Extensive
experiments show significant performance improvement, which verifies the
effectiveness of our model.
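The idea of expanding semantic features from visual inputs can be illustrated with a crude linear stand-in (the actual AMS-SFE uses an autoencoder and manifold guidance; everything below, including the residual-direction construction, is a hypothetical simplification):

```python
import numpy as np

rng = np.random.default_rng(2)

V = rng.normal(size=(100, 8))                  # visual features
S = np.tanh(V @ rng.normal(size=(8, 3)))       # given semantic features

# Fit a linear "decoder" from semantics back to visual space, then take
# the top directions of what it fails to explain as auxiliary semantic
# dimensions -- visual structure the original semantics miss.
D, *_ = np.linalg.lstsq(S, V, rcond=None)      # semantic -> visual map
resid = V - S @ D                              # unexplained visual content
_, _, vt = np.linalg.svd(resid, full_matrices=False)
aux = resid @ vt[:2].T                         # top-2 residual directions
S_expanded = np.hstack([S, aux])
print(S.shape, S_expanded.shape)               # semantic space grows 3 -> 5
```

The expanded dimensions encode visual-manifold information by construction, which is the mechanism behind the abstract's claim that expansion implicitly aligns the two feature spaces' manifold structures.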
Learning from Between-class Examples for Deep Sound Recognition
Deep learning methods have achieved high performance in sound recognition
tasks. Deciding how to feed the training data is important for further
performance improvement. We propose a novel learning method for deep sound
recognition: Between-Class learning (BC learning). Our strategy is to learn a
discriminative feature space by recognizing the between-class sounds as
between-class sounds. We generate between-class sounds by mixing two sounds
belonging to different classes with a random ratio. We then input the mixed
sound to the model and train the model to output the mixing ratio. The
advantages of BC learning are not limited to the increase in variation of
the training data; BC learning leads to an enlargement of Fisher's criterion in
the feature space and a regularization of the positional relationship among the
feature distributions of the classes. The experimental results show that BC
learning improves the performance on various sound recognition networks,
datasets, and data augmentation schemes, proving consistently beneficial.
Furthermore, we construct a new deep sound recognition network (EnvNet-v2) and
train it with BC learning. As a result, we achieved performance surpassing the
human level.
Comment: 13 pages, 6 figures, published as a conference paper at ICLR 201
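The mixing step at the core of BC learning is simple to sketch. Plain linear mixing is shown below as a simplification (the paper's mixing additionally accounts for the sounds' levels; names here are illustrative):

```python
import numpy as np

def bc_mix(x1, x2, rng):
    """Mix two sounds of different classes with a random ratio.

    Following BC learning, the model is then trained to output the
    mixing ratio (r, 1 - r) rather than a one-hot class label.
    """
    r = rng.uniform()
    return r * x1 + (1.0 - r) * x2, np.array([r, 1.0 - r])

rng = np.random.default_rng(3)
dog = rng.normal(size=16000)     # stand-in 1-second waveforms
rain = rng.normal(size=16000)
mixed, target = bc_mix(dog, rain, rng)
print(mixed.shape, target)       # waveform plus a soft two-class target
```

Training against the soft ratio target, instead of hard labels, is what produces the enlarged Fisher's criterion and the regularized class layout the abstract describes.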
Feature grouping from spatially constrained multiplicative interaction
We present a feature learning model that learns to encode relationships
between images. The model is defined as a Gated Boltzmann Machine, which is
constrained such that hidden units that are nearby in space can gate each
other's connections. We show how frequency/orientation "columns" as well as
topographic filter maps follow naturally from training the model on image
pairs. The model also helps explain why square-pooling models yield feature
groups with similar grouping properties. Experimental results on synthetic
image transformations show that spatially constrained gating is an effective
way to reduce the number of parameters and thereby to regularize a
transformation-learning model.
Comment: (new version) added training formulae; added minor clarification
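The multiplicative ("gated") interaction that lets hidden units encode relationships between two images, rather than either image alone, can be sketched as follows (a plain feed-forward factored response, not the full Gated Boltzmann Machine with spatial gating constraints; all weights are random stand-ins):

```python
import numpy as np

def gated_response(x, y, wx, wy, wh):
    """Hidden responses of a factored multiplicative feature model.

    Each hidden unit pools products of filter responses on the two
    images, so it responds to the transformation relating x and y.
    """
    fx = wx @ x                   # factor responses on image 1
    fy = wy @ y                   # factor responses on image 2
    return wh @ (fx * fy)         # multiplicative interaction, then pooling

rng = np.random.default_rng(4)
d, f, h = 64, 32, 16
x = rng.normal(size=d)
y = np.roll(x, 3)                 # y is a shifted copy of x
wx = rng.normal(size=(f, d))
wy = rng.normal(size=(f, d))
wh = rng.normal(size=(h, f))
print(gated_response(x, y, wx, wy, wh).shape)   # h relational responses
```

The paper's spatial constraint restricts which factor products each hidden unit may pool, which is what yields the frequency/orientation columns and topographic maps it reports.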
Deep Learning for Multi-label Classification
In multi-label classification, the main focus has been to develop ways of
learning the underlying dependencies between labels, and to take advantage of
this at classification time. Developing better feature-space representations
has been predominantly employed to reduce complexity, e.g., by eliminating
non-helpful feature attributes from the input space prior to (or during)
training. This is an important task, since many multi-label methods typically
create many different copies or views of the same input data as they transform
it, and considerable memory can be saved by taking advantage of redundancy. In
this paper, we show that a proper development of the feature space can make
labels less interdependent and easier to model and predict at inference time.
For this task we use a deep learning approach with restricted Boltzmann
machines. We present a deep network that, in an empirical evaluation,
outperforms a number of competitive methods from the literature.
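The restricted Boltzmann machine underlying such deep networks is trained layer-wise with contrastive divergence. A minimal CD-1 update is sketched below (biases omitted for brevity; a hypothetical single-sample version, not the authors' implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(v0, W, lr, rng):
    """One contrastive-divergence (CD-1) weight update for a binary RBM.

    Sample hidden units from the data, reconstruct the visibles, and
    move W toward the data statistics and away from the reconstruction's.
    """
    h0 = (sigmoid(v0 @ W) > rng.uniform(size=W.shape[1])).astype(float)
    v1 = sigmoid(h0 @ W.T)                        # reconstruction
    h1 = sigmoid(v1 @ W)
    W += lr * (np.outer(v0, h0) - np.outer(v1, h1))
    return W, v1

rng = np.random.default_rng(5)
v = (rng.uniform(size=12) > 0.5).astype(float)    # one binary input vector
W = 0.01 * rng.normal(size=(12, 6))
W, recon = cd1_step(v, W, lr=0.1, rng=rng)
print(recon.shape)                                # reconstructed visibles
```

Stacking such RBMs and fine-tuning yields the deep feature-space representation that, per the abstract, makes the labels less interdependent and easier to predict.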