    SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness

    We introduce SPLASH units, a class of learnable activation functions shown to simultaneously improve the accuracy of deep neural networks while also improving their robustness to adversarial attacks. SPLASH units have both a simple parameterization and maintain the ability to approximate a wide range of non-linear functions. SPLASH units are: (1) continuous; (2) grounded (f(0) = 0); (3) use symmetric hinges; and (4) their hinges are placed at fixed locations which are derived from the data (i.e. no learning required). Compared to nine other learned and fixed activation functions, including ReLU and its variants, SPLASH units show superior performance across three datasets (MNIST, CIFAR-10, and CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and Network-in-Network). Furthermore, we show that SPLASH units significantly increase the robustness of deep neural networks to adversarial attacks. Our experiments on both black-box and white-box adversarial attacks show that commonly-used architectures, namely LeNet5, All-CNN, Network-in-Network, and ResNet-20, can be up to 31% more robust to adversarial attacks by simply using SPLASH units instead of ReLUs. Finally, we show the benefits of using SPLASH activation functions in bigger architectures designed for non-trivial datasets such as ImageNet.
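
    The abstract lists the functional properties of SPLASH units but not their exact parameterization. A minimal sketch of one piecewise-linear activation with those properties is given below, assuming that the function is a sum of shifted ReLU basis terms with learnable slopes and fixed, symmetric, non-negative hinge offsets; the hinge values, the initialization, and the class name SplashLike are illustrative assumptions, not the authors' implementation.

        import torch
        import torch.nn as nn

        class SplashLike(nn.Module):
            """Piecewise-linear activation: continuous, grounded (f(0) = 0), symmetric hinges."""
            def __init__(self, hinges=(0.0, 1.0, 2.0)):  # fixed hinge offsets, e.g. derived from data
                super().__init__()
                self.register_buffer("b", torch.tensor(hinges))       # non-negative hinge locations
                self.a_pos = nn.Parameter(torch.zeros(len(hinges)))   # learnable slopes for x > 0
                self.a_neg = nn.Parameter(torch.zeros(len(hinges)))   # learnable slopes for x < 0
                with torch.no_grad():
                    self.a_pos[0] = 1.0                               # start out as a plain ReLU

            def forward(self, x):
                xb = x.unsqueeze(-1)                          # broadcast over the hinge dimension
                pos = torch.relu(xb - self.b) @ self.a_pos    # every basis term is zero at x = 0,
                neg = torch.relu(-xb - self.b) @ self.a_neg   # so the output stays grounded
                return pos + neg

    Because each basis term max(0, +/-x - b_i) vanishes at x = 0 whenever b_i >= 0, the activation remains grounded for any learned slopes, which is one way to satisfy property (2) above.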

    Independent Modular Networks

    Monolithic neural networks that make use of a single set of weights to learn useful representations for downstream tasks explicitly dismiss the compositional nature of data generation processes. This characteristic exists in data where every instance can be regarded as the combination of an identity concept, such as the shape of an object, with modifying concepts, such as orientation, color, and size. The dismissal of compositionality is especially detrimental in robotics, where state estimation relies heavily on the compositional nature of physical mechanisms (e.g., rotations and transformations) to model interactions. To accommodate this data characteristic, modular networks have been proposed. However, a lack of structure in each module's role, as well as modular network-specific issues such as module collapse, has restricted their usability. We propose a modular network architecture that accommodates this decompositional view of the data through a structure that splits the modules into predetermined roles. Additionally, we provide regularizations that improve the resiliency of the modular network to the problem of module collapse while improving the decomposition accuracy of the model. (Comment: ICRA23 RAP4Robots Workshop.)
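
    As an illustration only, a role-split modular network of the kind described above might dedicate one module to the identity concept and one module per modifying concept; everything below (module structure, role names, the way representations are composed) is a hypothetical sketch, not the architecture or the regularizers from the paper.

        import torch
        import torch.nn as nn

        class RoleSplitModularNet(nn.Module):
            """Hypothetical sketch: one module for the identity concept (e.g. shape),
            separate modules for modifying concepts (e.g. orientation, color, size)."""
            def __init__(self, in_dim, hid_dim, roles=("orientation", "color", "size")):
                super().__init__()
                self.identity_module = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
                self.modifier_modules = nn.ModuleDict(
                    {r: nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU()) for r in roles}
                )

            def forward(self, x):
                z_identity = self.identity_module(x)
                z_modifiers = [m(x) for m in self.modifier_modules.values()]
                # Compose the identity representation with the modifying ones;
                # summation is only one simple choice of composition.
                return z_identity + torch.stack(z_modifiers, dim=0).sum(dim=0)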

    On Solving the Rubik's Cube with Domain-Independent Planners Using Standard Representations

    Rubik's Cube (RC) is a well-known and computationally challenging puzzle that has motivated AI researchers to explore efficient alternative representations and problem-solving methods. The ideal situation for planning here is that a problem be represented in a standard notation and solved optimally and efficiently by a general-purpose solver and heuristics. The fastest solver today for RC is DeepCubeA, which uses a custom representation; another approach is the Scorpion planner with the State-Action-Space+ (SAS+) representation. In this paper, we present the first RC representation in the popular PDDL language, so that the domain becomes more accessible to PDDL planners, competitions, and knowledge engineering tools, and is more human-readable. We then bridge across the existing approaches and compare performance. We find that in one comparable experiment, DeepCubeA (trained with 12 RC actions) solves all problems of varying complexity, albeit only 78.5% of its plans are optimal. For the same problem set, Scorpion with the SAS+ representation and pattern database heuristics solves 61.50% of the problems optimally, while FastDownward with the PDDL representation and the FF heuristic solves 56.50% of the problems, of which 79.64% of the generated plans are optimal. Our study provides valuable insights into the trade-offs between representational choice and plan optimality that can help researchers design future strategies for challenging domains, combining general-purpose solving methods (planning, reinforcement learning), heuristics, and representations (standard or custom).
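
    The percentages above mix coverage (how many problems are solved) with optimality (how many of the returned plans are optimal). A small calculation, derived only from the numbers quoted in the abstract, puts the three approaches on the same scale of "fraction of all problems solved with an optimal plan":

        # Fraction of the full problem set solved with an optimal plan, per approach,
        # computed from the percentages quoted in the abstract above.
        deepcubea    = 1.000 * 0.785     # solves everything, 78.5% of plans optimal  -> 0.785
        scorpion     = 0.615             # 61.50% of problems solved optimally        -> 0.615
        fastdownward = 0.565 * 0.7964    # 56.50% solved, 79.64% of those optimal     -> ~0.450
        print(round(deepcubea, 3), round(scorpion, 3), round(fastdownward, 3))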

    A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

    A* search is an informed search algorithm that uses a heuristic function to guide the order in which nodes are expanded. Since the computation required to expand a node and compute the heuristic values for all of its generated children grows linearly with the size of the action space, A* search can become impractical for problems with large action spaces. This computational burden becomes even more apparent when heuristic functions are learned by general, but computationally expensive, deep neural networks. To address this problem, we introduce DeepCubeAQ, a deep reinforcement learning and search algorithm that builds on the DeepCubeA algorithm and deep Q-networks. DeepCubeAQ learns a heuristic function that, with a single forward pass through a deep neural network, computes the sum of the transition cost and the heuristic value of all of the children of a node without explicitly generating any of the children, eliminating the need for node expansions. DeepCubeAQ then uses a novel variant of A* search, called AQ* search, that uses the deep Q-network to guide search. We use DeepCubeAQ to solve the Rubik's cube when formulated with a large action space that includes 1872 meta-actions and show that this 157-fold increase in the size of the action space incurs less than a 4-fold increase in computation time when performing AQ* search and that AQ* search is orders of magnitude faster than A* search.
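
    A minimal sketch of an AQ*-style best-first search is shown below, under the assumption that a learned function q_values(state) returns, in a single forward pass, an estimate of transition cost plus cost-to-go for every action. The helper names (q_values, apply_action, is_goal) are assumed interfaces, states are assumed hashable, and path reconstruction is omitted for brevity; this illustrates the idea rather than reproducing the authors' implementation.

        import heapq

        def aq_star_like(start, q_values, apply_action, is_goal):
            """Best-first search that queues (parent, action) pairs scored by the Q-network,
            so a child state is only generated when its entry is popped from the queue."""
            frontier, tie = [], 0
            g = {start: 0.0}                                      # best known path cost per state
            for a, q in enumerate(q_values(start)):
                heapq.heappush(frontier, (0.0 + q, tie, 0.0, start, a)); tie += 1
            while frontier:
                _, _, g_parent, parent, action = heapq.heappop(frontier)
                child, step_cost = apply_action(parent, action)   # child generated only now
                g_child = g_parent + step_cost
                if child in g and g[child] <= g_child:
                    continue                                      # already reached more cheaply
                g[child] = g_child
                if is_goal(child):
                    return g_child                                # cost of the path found
                for a, q in enumerate(q_values(child)):           # one network call scores all actions
                    heapq.heappush(frontier, (g_child + q, tie, g_child, child, a)); tie += 1
            return None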

    A CdZnTeSe Gamma Spectrometer Trained by Deep Convolutional Neural Network for Radioisotope Identification

    We report the implementation of a deep convolutional neural network to train a high-resolution, room-temperature CdZnTeSe-based gamma-ray spectrometer for accurate and precise determination of gamma-ray energies for radioisotope identification. The prototype learned spectrometer consists of an NI PCI-5122 fast digitizer connected to a pre-amplifier to recognize spectral features in a sequence of data. We used simulated preamplifier pulses that resemble actual data for various gamma-photon energies to train a CNN on the equivalent of 90 seconds' worth of data and validated it on 10 seconds' worth of simulated data.
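
    For illustration only, a compact 1D convolutional classifier over digitized preamplifier pulse traces could look like the sketch below; the trace length, layer sizes, number of energy classes, and the framing as classification (rather than regression of the energy value) are all assumptions, not details from the paper.

        import torch
        import torch.nn as nn

        class PulseClassifier(nn.Module):
            """Hypothetical 1D CNN mapping a digitized preamplifier pulse to an energy class."""
            def __init__(self, trace_len=1024, num_energies=8):   # illustrative sizes
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
                    nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
                )
                self.head = nn.Linear(32 * (trace_len // 16), num_energies)

            def forward(self, pulses):                    # pulses: (batch, trace_len)
                x = self.features(pulses.unsqueeze(1))    # add a channel dimension
                return self.head(x.flatten(1))            # logits over energy classes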