Achieving Adversarial Robustness via Sparsity
Network pruning is known to produce compact models without much accuracy degradation. However, how the pruning process affects a network's robustness, and the mechanism behind this effect, remain unresolved. In this work, we theoretically prove that the sparsity of network weights is closely associated with model robustness. Through experiments on a variety of adversarial pruning methods, we find that weight sparsity does not hurt but rather improves robustness, and that both weight inheritance from the lottery ticket and adversarial training improve model robustness during network pruning. Based on these findings, we propose a novel adversarial training method called inverse weights inheritance, which imposes a sparse weight distribution on a large network by inheriting weights from a small network, thereby improving the robustness of the large network.
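As a rough illustration of the inverse weights inheritance idea, the sketch below initializes a larger layer from a smaller, separately trained one and zeroes the remaining entries, yielding a sparse starting point for subsequent adversarial training. The layer sizes, block placement of the inherited weights, and training details are illustrative assumptions, not the authors' exact procedure.

```python
# Minimal sketch of inheriting a small network's weights into a large one,
# assuming simple fully connected layers; sizes and masking are illustrative.
import torch
import torch.nn as nn

def inherit_weights(small_fc: nn.Linear, large_fc: nn.Linear) -> None:
    """Copy the small layer's weights into the top-left block of the large
    layer and zero the remaining entries, giving a sparse initialization."""
    with torch.no_grad():
        large_fc.weight.zero_()
        large_fc.bias.zero_()
        r, c = small_fc.weight.shape
        large_fc.weight[:r, :c] = small_fc.weight
        large_fc.bias[:r] = small_fc.bias

# Hypothetical small (pre-trained) and large layers with compatible shapes.
small = nn.Linear(in_features=784, out_features=64)
large = nn.Linear(in_features=784, out_features=256)

inherit_weights(small, large)
# The large layer now starts from a sparse weight distribution; adversarial
# training (e.g. PGD-based) would proceed on `large` from this initialization.
print((large.weight != 0).float().mean())  # fraction of non-zero weights
```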
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
In this paper, we present a two-stage model for multi-hop question answering. The first stage is a hierarchical graph network, which is used to reason over the multi-hop question and is capable of capturing different levels of granularity using the natural structure of documents (i.e., paragraphs, questions, sentences, and entities). The reasoning process is cast as a node classification task (i.e., over paragraph nodes and sentence nodes). The second stage is a language model fine-tuning task. In short, stage one uses a graph neural network to select and concatenate supporting sentences into one paragraph, and stage two finds the answer span in a language model fine-tuning paradigm. Comment: the experimental results are not as good as I expected.
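A minimal sketch of the two-stage pipeline, assuming pre-computed sentence embeddings, a toy adjacency matrix, and a placeholder span-extraction model; the hierarchical graph construction and node types in the paper are richer than shown here.

```python
# Stage 1: score sentence nodes with a simple message-passing layer and keep
# the top-scoring ones. Stage 2: hand the concatenated sentences to a QA model.
# Embeddings, adjacency, and the QA model are placeholders for illustration.
import torch
import torch.nn as nn

class SimpleGraphLayer(nn.Module):
    """One round of mean-aggregation message passing over sentence nodes."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, dim), adj: (num_nodes, num_nodes), row-normalized
        neighbours = adj @ x
        return torch.relu(self.proj(torch.cat([x, neighbours], dim=-1)))

dim, num_sentences = 32, 6
sent_emb = torch.randn(num_sentences, dim)                 # placeholder embeddings
adj = torch.full((num_sentences, num_sentences), 1.0 / num_sentences)

gnn = SimpleGraphLayer(dim)
classifier = nn.Linear(dim, 1)
scores = classifier(gnn(sent_emb, adj)).squeeze(-1)
selected = scores.topk(k=2).indices                        # supporting sentences

# Stage 2: concatenate the selected sentences into one paragraph and pass it,
# with the question, to a fine-tuned span-extraction language model.
sentences = [f"sentence {i}" for i in range(num_sentences)]
context = " ".join(sentences[i] for i in selected.tolist())
# answer_start, answer_end = qa_model(question, context)   # hypothetical call
```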
Self Normalizing Flows
Efficient gradient computation of the Jacobian determinant term is a core
problem in many machine learning settings, and especially so in the normalizing
flow framework. Most proposed flow models therefore either restrict to a function class with an easily evaluated Jacobian determinant, or rely on an efficient estimator thereof. However, these restrictions limit the performance
of such density models, frequently requiring significant depth to reach desired
performance levels. In this work, we propose Self Normalizing Flows, a flexible
framework for training normalizing flows by replacing expensive terms in the
gradient by learned approximate inverses at each layer. This reduces the
computational complexity of each layer's exact update from O(D^3) to O(D^2), allowing for the training of flow architectures which
were otherwise computationally infeasible, while also providing efficient
sampling. We show experimentally that such models are remarkably stable and
optimize to similar data likelihood values as their exact gradient
counterparts, while training more quickly and surpassing the performance of
functionally constrained counterparts.
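The core trick can be illustrated for a single linear flow layer: the exact gradient of the log-determinant term is the transposed inverse of the weight matrix, and a learned approximate inverse, kept close to the true inverse by a reconstruction loss, stands in for it. The sketch below assumes this linear-layer setting; dimensions and the reconstruction objective are illustrative.

```python
# Minimal sketch of the self-normalization idea for one linear flow layer:
# a learned approximate inverse R replaces the exact gradient of log|det W|.
# Shapes and the reconstruction term are illustrative assumptions.
import torch
import torch.nn as nn

D = 8
W = nn.Parameter(torch.eye(D) + 0.01 * torch.randn(D, D))  # forward weight
R = nn.Parameter(torch.eye(D) + 0.01 * torch.randn(D, D))  # learned inverse

x = torch.randn(16, D)
z = x @ W.T

# Exact gradient of log|det W| w.r.t. W is (W^{-1})^T, an O(D^3) computation.
exact_grad = torch.linalg.inv(W).T

# Self-normalizing surrogate: use the learned inverse's transpose instead.
approx_grad = R.detach().T

# R is kept close to W^{-1} via a reconstruction loss on the data itself.
recon_loss = ((z.detach() @ R.T) - x).pow(2).mean()
print("inverse-approximation error:",
      (exact_grad - approx_grad).abs().max().item())
print("reconstruction loss:", recon_loss.item())
```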
ES-ENAS: Blackbox Optimization over Hybrid Spaces via Combinatorial and Continuous Evolution
We consider the problem of efficient blackbox optimization over a large
hybrid search space, consisting of a mixture of a high dimensional continuous
space and a complex combinatorial space. Such examples arise commonly in
evolutionary computation, but also more recently, neuroevolution and
architecture search for Reinforcement Learning (RL) policies. Unfortunately, previous mutation-based approaches suffer in high dimensional continuous spaces both theoretically and practically. We thus instead propose ES-ENAS, a simple joint optimization procedure that combines Evolutionary
Strategies (ES) and combinatorial optimization techniques in a highly scalable
and intuitive way, inspired by the one-shot or supernet paradigm introduced in
Efficient Neural Architecture Search (ENAS). Through this relatively simple
marriage between two different lines of research, we are able to gain the best
of both worlds, and empirically demonstrate our approach by optimizing BBOB
functions over hybrid spaces as well as combinatorial neural network
architectures via edge pruning and quantization on popular RL benchmarks. Due
to the modularity of the algorithm, we are also able to incorporate a wide variety of popular techniques, ranging from the use of different continuous and combinatorial optimizers to constrained optimization. Comment: 22 pages. See https://github.com/google-research/google-research/tree/master/es_enas for associated code.
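A toy rendering of the joint loop, under the assumption that antithetic ES updates the shared continuous parameters while a simple mutate-and-select rule plays the role of the combinatorial optimizer over a discrete edge mask; the objective, mutation rule, and hyperparameters are placeholders, not the paper's setup.

```python
# Sketch of a joint continuous + combinatorial blackbox loop in the spirit of
# ES-ENAS: ES on weights, greedy mutation over a discrete mask. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
dim = 20

def objective(weights: np.ndarray, mask: np.ndarray) -> float:
    """Toy blackbox reward: prefer weights near 1 on active edges, sparse mask."""
    return -np.sum(mask * (weights - 1.0) ** 2) - 0.1 * mask.sum()

weights = np.zeros(dim)
mask = rng.integers(0, 2, size=dim)
sigma, lr, num_workers = 0.1, 0.05, 16

for step in range(200):
    grad = np.zeros(dim)
    best_mask, best_reward = mask, -np.inf
    for _ in range(num_workers):
        eps = rng.normal(size=dim)
        # Each worker also proposes a mutated architecture (flip one edge).
        cand = mask.copy()
        cand[rng.integers(dim)] ^= 1
        r_plus = objective(weights + sigma * eps, cand)
        r_minus = objective(weights - sigma * eps, cand)
        grad += (r_plus - r_minus) * eps              # antithetic ES estimator
        if max(r_plus, r_minus) > best_reward:
            best_reward, best_mask = max(r_plus, r_minus), cand
    weights += lr * grad / (2 * sigma * num_workers)  # continuous ES step
    mask = best_mask                                  # greedy combinatorial step

print("final reward:", objective(weights, mask))
```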
Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey
While Deep Neural Networks (DNNs) achieve state-of-the-art results in many
different problem settings, they are affected by some crucial weaknesses. On
the one hand, DNNs depend on exploiting a vast amount of training data, whose
labeling process is time-consuming and expensive. On the other hand, DNNs are
often treated as black box systems, which complicates their evaluation and
validation. Both problems can be mitigated by incorporating prior knowledge
into the DNN.
One promising field, inspired by the success of convolutional neural networks
(CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations of the problem to be solved. This promises increased data efficiency and filter responses that are more easily interpretable. In this survey, we try to give a concise overview of different approaches for incorporating geometrical prior knowledge into DNNs. Additionally, we try to connect those methods to the field of 3D object detection for autonomous driving, where we expect these methods to yield promising results. Comment: Survey paper.
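As a small concrete instance of such a geometric prior, the sketch below checks the translation equivariance that underlies the CNN example mentioned above: shifting the input before a (circular) convolution gives the same result as shifting its output. The signal and filter are arbitrary placeholders.

```python
# Demonstrates translation equivariance of a circular 1D convolution.
import numpy as np

def conv1d(signal: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """'Same'-size circular 1D correlation, kept simple for the demo."""
    n = len(signal)
    return np.array([
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ])

signal = np.random.rand(16)
kernel = np.array([1.0, -2.0, 1.0])

shifted_then_conv = conv1d(np.roll(signal, 3), kernel)
conv_then_shifted = np.roll(conv1d(signal, kernel), 3)

# Equivariance: applying the shift before or after the convolution agrees.
print(np.allclose(shifted_then_conv, conv_then_shifted))  # True
```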