SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness
We introduce SPLASH units, a class of learnable activation functions shown to
simultaneously improve the accuracy of deep neural networks while also
improving their robustness to adversarial attacks. SPLASH units have both a
simple parameterization and maintain the ability to approximate a wide range of
non-linear functions. SPLASH units are: 1) continuous; 2) grounded (f(0) = 0);
3) use symmetric hinges; and 4) the locations of the hinges are derived
directly from the data (i.e. no learning required). Compared to nine other
learned and fixed activation functions, including ReLU and its variants, SPLASH
units show superior performance across three datasets (MNIST, CIFAR-10, and
CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and
Network-in-Network). Furthermore, we show that SPLASH units significantly
increase the robustness of deep neural networks to adversarial attacks. Our
experiments on both black-box and white-box adversarial attacks show that
commonly-used architectures, namely LeNet5, All-CNN, ResNet-20, and
Network-in-Network, can be up to 31% more robust to adversarial attacks by
simply using SPLASH units instead of ReLUs. Finally, we show the benefits of
using SPLASH activation functions in larger architectures designed for
non-trivial datasets such as ImageNet.
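The properties listed above can be illustrated with a minimal sketch of a SPLASH-style activation: a sum of shifted ReLU terms with hinges placed symmetrically about zero. This is an illustrative reading of the abstract, not the authors' exact parameterization; the argument names are invented for the sketch. In the paper, the slopes are the learned parameters while the hinge locations are fixed from the data.

```python
def splash(xs, hinges, a_pos, a_neg):
    """SPLASH-style piecewise-linear activation (illustrative sketch).

    f(x) = sum_i a_pos[i] * max(0, x - b_i) + a_neg[i] * max(0, -x - b_i)

    With hinge locations b_i >= 0 the function is continuous, grounded
    (f(0) = 0), and its hinges sit symmetrically about zero.
    """
    return [
        sum(ap * max(0.0, x - b) + an * max(0.0, -x - b)
            for b, ap, an in zip(hinges, a_pos, a_neg))
        for x in xs
    ]

# With a single hinge at 0, positive slope 1 and negative slope 0,
# the function reduces to ReLU.
relu_like = splash([-2.0, 0.0, 3.0], hinges=[0.0], a_pos=[1.0], a_neg=[0.0])
```

Because the hinge locations are fixed, each output is linear in the slope parameters, which keeps the parameterization simple while still approximating a wide range of non-linearities.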
Independent Modular Networks
Monolithic neural networks that make use of a single set of weights to learn
useful representations for downstream tasks explicitly dismiss the
compositional nature of data generation processes. This characteristic exists
in data where every instance can be regarded as the combination of an identity
concept, such as the shape of an object, combined with modifying concepts, such
as orientation, color, and size. The dismissal of compositionality is
especially detrimental in robotics, where state estimation relies heavily on
the compositional nature of physical mechanisms (e.g., rotations and
transformations) to model interactions. To accommodate this data
characteristic, modular networks have been proposed. However, a lack of
structure in each module's role, and modular network-specific issues such as
module collapse have restricted their usability. We propose a modular network
architecture that accommodates this compositional structure by splitting the
modules into predetermined roles.
Additionally, we provide regularizations that improve the resiliency of the
modular network to the problem of module collapse while improving the
decomposition accuracy of the model. Comment: ICRA23 RAP4Robots Workshop
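The abstract does not specify the regularizations used; one generic technique against module (or expert) collapse is a load-balancing penalty that pushes the router to use all modules rather than routing every input to one. The sketch below is a standard formulation of that idea, not the paper's method, and all names in it are invented for illustration.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def load_balance_penalty(batch_scores):
    """Entropy-based load-balancing penalty (generic sketch).

    batch_scores[i] holds sample i's routing logits over the modules.
    Averaging the routing distributions over the batch and penalizing
    low entropy of that average discourages collapse onto one module.
    Returns 0 when module usage is perfectly uniform.
    """
    probs = [softmax(s) for s in batch_scores]
    n_modules = len(batch_scores[0])
    avg = [sum(p[j] for p in probs) / len(probs) for j in range(n_modules)]
    entropy = -sum(a * math.log(a + 1e-12) for a in avg)
    return math.log(n_modules) - entropy

# Balanced routing (each sample prefers a different module) incurs
# almost no penalty; collapsed routing (everything to module 0) does.
balanced = load_balance_penalty([[5.0, 0.0], [0.0, 5.0]])
collapsed = load_balance_penalty([[5.0, 0.0], [5.0, 0.0]])
```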
On Solving the Rubik's Cube with Domain-Independent Planners Using Standard Representations
Rubik's Cube (RC) is a well-known and computationally challenging puzzle that
has motivated AI researchers to explore efficient alternative representations
and problem-solving methods. The ideal situation for planning is that a
problem be represented in a standard notation and solved optimally and
efficiently by a general-purpose solver with heuristics. The fastest solver
today for RC is DeepCubeA, which uses a custom representation; another
approach uses the Scorpion planner with a State-Action-Space+ (SAS+)
representation. In this paper,
we present the first RC representation in the popular PDDL language so that the
domain becomes more accessible to PDDL planners, competitions, and knowledge
engineering tools, and is more human-readable. We then bridge across existing
approaches and compare performance. We find that in one comparable experiment,
DeepCubeA (trained with 12 RC actions) solves all problems with varying
complexities, albeit only 78.5% of its plans are optimal. For the same problem
set, Scorpion with the SAS+ representation and pattern-database heuristics
solves 61.50% of the problems optimally, while FastDownward with the PDDL
representation and the FF heuristic solves 56.50% of the problems, of which
79.64% of the generated plans are optimal. Our study provides valuable
insights into the trade-offs between
representational choice and plan optimality that can help researchers design
future strategies for challenging domains combining general-purpose solving
methods (planning, reinforcement learning), heuristics, and representations
(standard or custom).
A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks
A* search is an informed search algorithm that uses a heuristic function to
guide the order in which nodes are expanded. Since the computation required to
expand a node and compute the heuristic values for all of its generated
children grows linearly with the size of the action space, A* search can become
impractical for problems with large action spaces. This computational burden
becomes even more apparent when heuristic functions are learned by general, but
computationally expensive, deep neural networks. To address this problem, we
introduce DeepCubeAQ, a deep reinforcement learning and search algorithm that
builds on the DeepCubeA algorithm and deep Q-networks. DeepCubeAQ learns a
heuristic function that, with a single forward pass through a deep neural
network, computes the sum of the transition cost and the heuristic value of all
of the children of a node without explicitly generating any of the children,
eliminating the need for node expansions. DeepCubeAQ then uses a novel variant
of A* search, called AQ* search, that uses the deep Q-network to guide search.
We use DeepCubeAQ to solve the Rubik's cube when formulated with a large action
space that includes 1872 meta-actions and show that this 157-fold increase in
the size of the action space incurs less than a 4-fold increase in computation
time when performing AQ* search and that AQ* search is orders of magnitude
faster than A* search.
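The key mechanism described above — scoring all of a node's children with one Q-network call and deferring child generation until an entry is popped — can be sketched in a few lines. This is an illustrative toy, not the paper's implementation; `q_values`, `step`, and the toy domain below are invented stand-ins (with `q_values(s)` playing the role of a single forward pass of the deep Q-network).

```python
import heapq

def aq_star(start, q_values, step, is_goal):
    """AQ*-style best-first search (illustrative sketch).

    q_values(s) returns, for every action, an estimate of the transition
    cost plus the cost-to-go of the resulting child. A child state is
    generated only when its (state, action) entry is popped, so nodes
    are never fully expanded up front.
    """
    frontier = [(q, 0.0, start, a) for a, q in enumerate(q_values(start))]
    heapq.heapify(frontier)
    seen = {start}
    while frontier:
        f, g, state, action = heapq.heappop(frontier)
        child, cost = step(state, action)  # generate just this one child
        g_child = g + cost
        if is_goal(child):
            return g_child
        if child in seen:
            continue
        seen.add(child)
        for a, q in enumerate(q_values(child)):
            heapq.heappush(frontier, (g_child + q, g_child, child, a))
    return None

# Toy domain: walk on the integers toward a goal of 4, unit step cost,
# with an exact Q-function built from the remaining distance.
GOAL = 4
def step(s, a):
    return (s - 1 if a == 0 else s + 1), 1.0
def q_values(s):
    return [1.0 + abs(GOAL - (s - 1)), 1.0 + abs(GOAL - (s + 1))]

plan_cost = aq_star(0, q_values, step, lambda s: s == GOAL)
```

Note that per popped entry the search does constant work plus one `q_values` call, which is what keeps the cost of a large action space close to a single network evaluation rather than one per child.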
A CdZnTeSe Gamma Spectrometer Trained by Deep Convolutional Neural Network for Radioisotope Identification
We report the implementation of a deep convolutional neural network to train a high-resolution room-temperature CdZnTeSe-based gamma ray spectrometer for accurate and precise determination of gamma ray energies for radioisotope identification. The prototype learned spectrometer consists of a NI PCI 5122 fast digitizer connected to a pre-amplifier to recognize spectral features in a sequence of data. We used simulated preamplifier pulses that resemble actual data for various gamma photon energies to train a CNN on the equivalent of 90 seconds' worth of data and validated it on 10 seconds' worth of simulated data.
Deep Learning for Puzzles and Circadian Rhythms
The combination of deep learning with reinforcement learning and the application of deep learning to the sciences is a relatively new and flourishing field. We show how deep reinforcement learning techniques can learn to solve problems, often in the most efficient way possible, when faced with many possibilities but little information by designing an algorithm that can learn to solve seven different combinatorial puzzles, including the Rubik's cube. Furthermore, we show how deep learning can be applied to the field of circadian rhythms. Circadian rhythms are fundamental for all forms of life. Using deep learning, we can gain insight into circadian rhythms on the molecular level. Finally, we propose new deep learning algorithms that yield significant performance improvements on computer vision and high energy physics tasks.