Knowledge Consistency between Neural Networks and Beyond
This paper aims to analyze knowledge consistency between pre-trained deep
neural networks. We propose a generic definition for knowledge consistency
between neural networks at different fuzziness levels. A task-agnostic method
is designed to disentangle feature components, which represent the consistent
knowledge, from raw intermediate-layer features of each neural network. As a
generic tool, our method can be broadly used for different applications. In
preliminary experiments, we have used knowledge consistency as a tool to
diagnose representations of neural networks. Knowledge consistency provides new
insights to explain the success of existing deep-learning techniques, such as
knowledge distillation and network compression. More crucially, knowledge
consistency can also be used to refine pre-trained networks and boost
performance
End-to-End Video Classification with Knowledge Graphs
Video understanding has attracted much research attention especially since
the recent availability of large-scale video benchmarks. In this paper, we
address the problem of multi-label video classification. We first observe that
there exists a significant knowledge gap between how machines and humans learn.
That is, while current machine learning approaches including deep neural
networks largely focus on the representations of the given data, humans often
look beyond the data at hand and leverage external knowledge to make better
decisions. Towards narrowing the gap, we propose to incorporate external
knowledge graphs into video classification. In particular, we unify traditional
"knowledgeless" machine learning models and knowledge graphs in a novel
end-to-end framework. The framework is flexible to work with most existing
video classification algorithms including state-of-the-art deep models.
Finally, we conduct extensive experiments on the largest public video dataset
YouTube-8M. The results are promising across the board, improving mean average
precision by up to 2.9%.
Comment: 9 pages, 5 figure
Training Deep Neural Networks on Noisy Labels with Bootstrapping
Current state-of-the-art deep learning systems for visual object recognition
and detection use purely supervised training with regularization such as
dropout to avoid overfitting. The performance depends critically on the amount
of labeled examples, and in current practice the labels are assumed to be
unambiguous and accurate. However, this assumption often does not hold; e.g. in
recognition, class labels may be missing; in detection, objects in the image
may not be localized; and in general, the labeling may be subjective. In this
work we propose a generic way to handle noisy and incomplete labeling by
augmenting the prediction objective with a notion of consistency. We consider a
prediction consistent if the same prediction is made given similar percepts,
where the notion of similarity is between deep network features computed from
the input data. In experiments we demonstrate that our approach yields
substantial robustness to label noise on several datasets. On MNIST handwritten
digits, we show that our model is robust to label corruption. On the Toronto
Face Database, we show that our model handles well the case of subjective
labels in emotion recognition, achieving state-of-the-art results, and can
also benefit from unlabeled face images with no modification to our method. On
the ILSVRC2014 detection challenge data, we show that our approach extends to
very deep networks, high resolution images and structured outputs, and results
in improved scalable detection
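
The abstract above describes augmenting the prediction objective with a consistency term but does not give its form. As a rough sketch only, one common way to express the idea is a "soft" target that blends the (possibly noisy) label with the model's own prediction. The function name `soft_bootstrap_loss` and the mixing weight `beta` are assumptions made for illustration and are not taken from the listing.

```python
import math

def soft_bootstrap_loss(noisy_label, predicted_probs, beta=0.95):
    """Cross-entropy against a blend of the given label and the model's own prediction.

    noisy_label: one-hot (or soft) label vector, possibly corrupted.
    predicted_probs: model class probabilities (should sum to 1).
    beta: hypothetical mixing weight; beta = 1.0 recovers plain cross-entropy.
    """
    eps = 1e-12  # guards against log(0)
    loss = 0.0
    for t, q in zip(noisy_label, predicted_probs):
        target = beta * t + (1.0 - beta) * q  # blended training target
        loss -= target * math.log(q + eps)
    return loss

# A confident prediction that disagrees with the label is penalized less
# when the model's own belief is mixed into the target.
plain = soft_bootstrap_loss([1.0, 0.0], [0.1, 0.9], beta=1.0)
blended = soft_bootstrap_loss([1.0, 0.0], [0.1, 0.9], beta=0.8)
```

Lowering `beta` trusts the model more and the labels less, which is the trade-off a consistency-augmented objective has to tune.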
KG-GAN: Knowledge-Guided Generative Adversarial Networks
Can generative adversarial networks (GANs) generate roses of various colors
given only roses of red petals as input? The answer is negative, since GANs'
discriminator would reject all roses of unseen petal colors. In this study, we
propose knowledge-guided GAN (KG-GAN) to fuse domain knowledge with the GAN
framework. KG-GAN trains two generators; one learns from data whereas the other
learns from knowledge with a constraint function. Experimental results
demonstrate the effectiveness of KG-GAN in generating unseen flower categories
from seen categories given textual descriptions of the unseen ones
Physics-guided Neural Networks (PGNN): An Application in Lake Temperature Modeling
This paper introduces a novel framework for combining scientific knowledge of
physics-based models with neural networks to advance scientific discovery. This
framework, termed as physics-guided neural network (PGNN), leverages the output
of physics-based model simulations along with observational features to
generate predictions using a neural network architecture. Further, this paper
presents a novel framework for using physics-based loss functions in the
learning objective of neural networks, to ensure that the model predictions not
only show lower errors on the training set but are also scientifically
consistent with the known physics on the unlabeled set. We illustrate the
effectiveness of PGNN for the problem of lake temperature modeling, where
physical relationships between the temperature, density, and depth of water are
used to design a physics-based loss function. By using scientific knowledge to
guide the construction and learning of neural networks, we are able to show
that the proposed framework ensures better generalizability as well as
scientific consistency of results.
Comment: submitted to ACM SIGKDD 201
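
To make the idea of a physics-based loss concrete, here is a minimal sketch assuming that denser water should sit at greater depth. The density formula is a standard empirical approximation for freshwater and is an assumption here, as are the function names and the weight `lam`; the abstract itself only states that temperature, density, and depth relationships are used.

```python
def water_density(temp_c):
    """Approximate freshwater density (kg/m^3) at temperature temp_c (deg C).

    Assumed empirical relation; density peaks near 4 deg C.
    """
    return 1000.0 * (1.0 - (temp_c + 288.9414) * (temp_c - 3.9863) ** 2
                     / (508929.2 * (temp_c + 68.12963)))

def physics_loss(pred_temps_by_depth):
    """Mean penalty for density decreasing with depth.

    pred_temps_by_depth: predicted temperatures ordered from shallow to deep.
    No observed labels are needed, so this can be computed on unlabeled data.
    """
    rho = [water_density(t) for t in pred_temps_by_depth]
    violations = [max(0.0, rho[i] - rho[i + 1]) for i in range(len(rho) - 1)]
    return sum(violations) / len(violations) if violations else 0.0

def total_loss(pred_temps, observed_temps, lam=1.0):
    """Mean squared error plus lam times the physics penalty (lam is hypothetical)."""
    mse = sum((p - o) ** 2 for p, o in zip(pred_temps, observed_temps)) / len(pred_temps)
    return mse + lam * physics_loss(pred_temps)

consistent = physics_loss([20.0, 15.0, 10.0, 5.0])    # cooler, denser water below
inconsistent = physics_loss([5.0, 10.0, 15.0, 20.0])  # warmer, lighter water below
```

A profile that places lighter water beneath denser water receives a positive penalty even when no ground-truth temperatures are available.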
Are Nearby Neighbors Relatives?: Testing Deep Music Embeddings
Deep neural networks have frequently been used to directly learn
representations useful for a given task from raw input data. In terms of
overall performance metrics, machine learning solutions employing deep
representations frequently have been reported to greatly outperform those using
hand-crafted feature representations. At the same time, they may pick up on
aspects that are predominant in the data, yet not actually meaningful or
interpretable. In this paper, we therefore propose a systematic way to test the
trustworthiness of deep music representations, considering musical semantics.
The underlying assumption is that in case a deep representation is to be
trusted, distance consistency between known related points should be maintained
both in the input audio space and corresponding latent deep space. We generate
known related points through semantically meaningful transformations, both
considering imperceptible and graver transformations. Then, we examine within-
and between-space distance consistencies, both considering audio space and
latent embedded space, the latter either being a result of a conventional
feature extractor or a deep encoder. We illustrate how our method, as a
complement to task-specific performance, provides interpretable insight into
what a network may have captured from training data signals.
Comment: this work was accepted for publication in the "Frontiers in Applied
Mathematics and Statistics (Deep Learning: Status, Applications and
Algorithms)
PortraitGAN for Flexible Portrait Manipulation
Previous methods have dealt with discrete manipulation of facial attributes
such as smile, sad, angry, surprise etc, out of canonical expressions and they
are not scalable, operating in single modality. In this paper, we propose a
novel framework that supports continuous edits and multi-modality portrait
manipulation using adversarial learning. Specifically, we adapt
cycle-consistency into the conditional setting by leveraging additional facial
landmarks information. This has two effects: first, cycle mapping induces
bidirectional manipulation and identity preservation; second, pairing samples from
different modalities can thus be utilized. To ensure high-quality synthesis, we
adopt texture-loss that enforces texture consistency and multi-level
adversarial supervision that facilitates gradient flow. Quantitative and
qualitative experiments show the effectiveness of our framework in performing
flexible and multi-modality portrait manipulation with photo-realistic effects
Joint Learning of Neural Networks via Iterative Reweighted Least Squares
In this paper, we introduce the problem of jointly learning feed-forward
neural networks across a set of relevant but diverse datasets. Compared to
learning a separate network from each dataset in isolation, joint learning
enables us to extract correlated information across multiple datasets to
significantly improve the quality of learned networks. We formulate this
problem as joint learning of multiple copies of the same network architecture
and enforce the network weights to be shared across these networks. Instead of
hand-encoding the shared network layers, we solve an optimization problem to
automatically determine how layers should be shared between each pair of
datasets. Experimental results show that our approach outperforms baselines
without joint learning and those using pretraining-and-fine-tuning. We show the
effectiveness of our approach on three tasks: image classification, learning
auto-encoders, and image generation
Every Node Counts: Self-Ensembling Graph Convolutional Networks for Semi-Supervised Learning
Graph convolutional network (GCN) provides a powerful means for graph-based
semi-supervised tasks. However, as a localized first-order approximation of
spectral graph convolution, the classic GCN cannot take full advantage of
unlabeled data, especially when the unlabeled node is far from labeled ones. To
capitalize on the information from unlabeled nodes to boost the training for
GCN, we propose a novel framework named Self-Ensembling GCN (SEGCN), which
marries GCN with Mean Teacher - another powerful model in semi-supervised
learning. SEGCN contains a student model and a teacher model. As a student, it
not only learns to correctly classify the labeled nodes, but also tries to be
consistent with the teacher on unlabeled nodes in more challenging situations,
such as a high dropout rate and graph collapse. As a teacher, it averages the
student model weights and generates more accurate predictions to lead the
student. In such a mutual-promoting process, both labeled and unlabeled samples
can be fully utilized for backpropagating effective gradients to train GCN. In
three article classification tasks, i.e. Citeseer, Cora and Pubmed, we validate
that the proposed method matches the state of the art in classification
accuracy.
Comment: 9 pages, 4 figure
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
The recently proposed Temporal Ensembling has achieved state-of-the-art
results in several semi-supervised learning benchmarks. It maintains an
exponential moving average of label predictions on each training example, and
penalizes predictions that are inconsistent with this target. However, because
the targets change only once per epoch, Temporal Ensembling becomes unwieldy
when learning large datasets. To overcome this problem, we propose Mean
Teacher, a method that averages model weights instead of label predictions. As
an additional benefit, Mean Teacher improves test accuracy and enables training
with fewer labels than Temporal Ensembling. Without changing the network
architecture, Mean Teacher achieves an error rate of 4.35% on SVHN with 250
labels, outperforming Temporal Ensembling trained with 1000 labels. We also
show that a good network architecture is crucial to performance. Combining Mean
Teacher and Residual Networks, we improve the state of the art on CIFAR-10 with
4000 labels from 10.55% to 6.28%, and on ImageNet 2012 with 10% of the labels
from 35.24% to 9.11%.
Comment: In this version: Corrected hyperparameters of the 4000-label CIFAR-10
ResNet experiment. Changed Antti's contact info, Advances in Neural
Information Processing Systems 30 (NIPS 2017) pre-proceeding
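
The weight-averaging idea in the Mean Teacher abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the teacher is an exponential moving average of the student's weights, uses plain Python lists in place of network parameters, and the names `ema_update`, `consistency_cost`, and the decay `alpha` are hypothetical.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights: alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

def consistency_cost(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n

# Toy run: the teacher drifts toward a fixed student after every step,
# rather than waiting for a full epoch as a per-example prediction average would.
teacher = [0.0, 0.0]
student = [1.0, -1.0]
for _ in range(3):
    teacher = ema_update(teacher, student, alpha=0.5)

gap = consistency_cost([0.7, 0.3], [0.6, 0.4])
```

Because the update touches only weights, it can run after every training step, which is the scaling advantage over once-per-epoch targets that the abstract points to.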