14 research outputs found

    Learning Combinations of Activation Functions

    In the last decade, an active area of research has been devoted to designing novel activation functions that help deep neural networks converge and achieve better performance. The training procedure of these architectures usually involves optimizing only the weights of their layers, while the non-linearities are generally pre-specified and their (possible) parameters are treated as hyper-parameters to be tuned manually. In this paper, we introduce two approaches to automatically learn different combinations of base activation functions (such as the identity function, ReLU, and tanh) during the training phase. We present a thorough comparison of our novel approaches with well-known architectures (such as LeNet-5, AlexNet, and ResNet-56) on three standard datasets (Fashion-MNIST, CIFAR-10, and ILSVRC-2012), showing substantial improvements in overall performance, such as an increase of 3.01 percentage points in top-1 accuracy for AlexNet on ILSVRC-2012. (Comment: 6 pages, 3 figures. Published as a conference paper at ICPR 2018. Code: https://bitbucket.org/francux/learning_combinations_of_activation_function)
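    A minimal sketch, not the authors' implementation, of the core idea of learning a combination of base activation functions; the softmax-normalized mixing weights below are an assumption about one possible parameterization:

```python
import torch
import torch.nn as nn

class LearnedActivationCombination(nn.Module):
    """Combination of base activations (identity, ReLU, tanh) with learnable weights.

    Illustrative sketch only; the paper's exact parameterization
    (e.g. affine vs. normalized combinations) may differ.
    """
    def __init__(self):
        super().__init__()
        # One learnable coefficient per base activation function.
        self.logits = nn.Parameter(torch.zeros(3))

    def forward(self, x):
        w = torch.softmax(self.logits, dim=0)  # keep the combination normalized
        return w[0] * x + w[1] * torch.relu(x) + w[2] * torch.tanh(x)

# Usage: drop in wherever a fixed non-linearity would go.
layer = nn.Sequential(nn.Linear(16, 32), LearnedActivationCombination())
out = layer(torch.randn(4, 16))
```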

    Quantum Inspired Genetic Programming Model to Predict Toxicity Degree for Chemical Compounds

    Cheminformatics plays a vital role in maintaining large amounts of chemical data. Reliable prediction of the toxic effects of chemicals in living systems is highly desirable in domains such as cosmetics, drug design, food safety, and the manufacture of chemical compounds. Toxicity prediction requires new approaches for knowledge discovery from data in order to model composite associations between the modules of a chemical compound, and such techniques become more computationally expensive as the number of chemical compounds increases. State-of-the-art prediction methods such as neural networks and multi-layer regression, which require either parameter tuning or complex transformations of the predictor or outcome variables, do not achieve high accuracy. This paper proposes a Quantum Inspired Genetic Programming (QIGP) model to improve prediction accuracy. Genetic Programming is used to produce a linear equation that calculates the toxicity degree more accurately. Quantum computing is employed to improve the selection of the best-of-run individuals and to handle parsimony pressure, reducing the complexity of the solutions. The results of the internal validation analysis indicate that the QIGP model has better goodness-of-fit statistics and significantly outperforms the Neural Network model.
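    A rough sketch, not drawn from the paper, of the quantum-inspired ingredient as it commonly appears in quantum-inspired evolutionary algorithms: candidate terms of the evolved equation are tracked by qubit-like probability amplitudes, selection "measures" those amplitudes, and a rotation step biases them toward the current best-of-run individual. The placeholder fitness, population size, and rotation angle below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Qubit-like register: one probability amplitude per candidate term of the equation.
# amplitude**2 is the probability that the term is kept in a sampled individual.
n_terms = 8
amplitudes = np.full(n_terms, 1 / np.sqrt(2))  # start in equal superposition

def measure(amplitudes):
    """Collapse the register into a concrete individual (bit mask of kept terms)."""
    return rng.random(n_terms) < amplitudes ** 2

def rotate_towards(amplitudes, best_mask, delta=0.05 * np.pi):
    """Quantum-inspired rotation: nudge amplitudes toward the best-of-run individual."""
    theta = np.arcsin(np.clip(amplitudes, 0.0, 1.0))
    theta = theta + np.where(best_mask, delta, -delta)
    return np.sin(np.clip(theta, 0.05, np.pi / 2 - 0.05))

def fitness(mask):
    """Placeholder: prefer parsimonious term sets, standing in for goodness of fit
    of the evolved linear toxicity equation."""
    return -mask.sum() + rng.normal(scale=0.1)

best_mask, best_fit = None, -np.inf
for generation in range(50):
    population = [measure(amplitudes) for _ in range(20)]
    for mask in population:
        f = fitness(mask)
        if f > best_fit:
            best_mask, best_fit = mask, f
    amplitudes = rotate_towards(amplitudes, best_mask)
```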

    Euclidean Space Data Projection Classifier with Cartesian Genetic Programming (CGP)

    Most evolutionary classifiers are built from generated rule sets that categorize the data into their respective classes. This preliminary work proposes an evolutionary classifier based on a simplified Cartesian Genetic Programming (CGP) algorithm. Instead of using evolutionarily generated rule sets, the CGP generates (i) a reference coordinate and (ii) projection functions that map the data into a new three-dimensional Euclidean space. A distance-boundary function between the projected data and the reference coordinate is then applied to classify the data into their respective classes. The evolutionary algorithm is a simplified CGP using a 1+4 evolutionary strategy. The data projection functions were evolved with CGP for 1000 generations before the best functions were extracted. The classifier was tested on three PROBEN1 benchmark datasets, namely the PIMA Indians Diabetes dataset, the Heart Disease dataset, and the Wisconsin Breast Cancer (WBC) dataset, using 10-fold cross-validation. The results show that the evolved data projection functions achieve competitive classification rates: 97.71% on the Cancer dataset, 77.92% on the PIMA Indians dataset, and 85.86% on the Heart Disease dataset.
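    A minimal Python sketch of the distance-boundary classification step described above, assuming the CGP-evolved projection functions and reference coordinate are already available; the threshold-based decision rule is an illustrative guess rather than the paper's exact formulation.

```python
import numpy as np

def classify(samples, project, reference, boundary):
    """Distance-boundary classifier on CGP-projected data.

    samples   : (n, d) array of input feature vectors
    project   : callable mapping a feature vector to a 3D point
                (stands in for the CGP-evolved projection functions)
    reference : (3,) reference coordinate, also produced by the CGP
    boundary  : distance threshold separating the two classes
    """
    projected = np.array([project(x) for x in samples])        # (n, 3)
    distances = np.linalg.norm(projected - reference, axis=1)  # Euclidean distances
    return (distances > boundary).astype(int)                  # 0 = inside, 1 = outside

# Toy usage with a hand-written projection in place of an evolved one.
toy_project = lambda x: np.array([x[0] + x[1], x[0] * x[1], x[0] - x[1]])
labels = classify(np.random.rand(5, 2), toy_project, np.zeros(3), boundary=0.8)
```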

    SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness

    We introduce SPLASH units, a class of learnable activation functions shown to simultaneously improve the accuracy of deep neural networks and their robustness to adversarial attacks. SPLASH units have a simple parameterization while retaining the ability to approximate a wide range of non-linear functions. SPLASH units are: (1) continuous; (2) grounded (f(0) = 0); (3) use symmetric hinges; and (4) place their hinges at fixed locations derived from the data (i.e. no learning required). Compared to nine other learned and fixed activation functions, including ReLU and its variants, SPLASH units show superior performance across three datasets (MNIST, CIFAR-10, and CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and Network-in-Network). Furthermore, we show that SPLASH units significantly increase the robustness of deep neural networks to adversarial attacks. Our experiments on both black-box and white-box adversarial attacks show that commonly-used architectures, namely LeNet5, All-CNN, Network-in-Network, and ResNet-20, can be up to 31% more robust to adversarial attacks simply by using SPLASH units instead of ReLUs. Finally, we show the benefits of using SPLASH activation functions in larger architectures designed for non-trivial datasets such as ImageNet.
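    A hedged PyTorch sketch of a piecewise-linear activation with the properties listed above (continuous, f(0) = 0, symmetric fixed hinge locations, learnable slopes); the exact SPLASH parameterization may differ in details such as the number of hinges, how hinge locations are chosen from the data, and initialization.

```python
import torch
import torch.nn as nn

class SplashLike(nn.Module):
    """Piecewise-linear activation with fixed symmetric hinges and learnable slopes.

    Illustrative reconstruction from the abstract, not the authors' reference code.
    """
    def __init__(self, hinges=(0.0, 1.0, 2.0)):
        super().__init__()
        # Hinge locations are fixed (e.g. chosen from data statistics), not learned.
        self.register_buffer("hinges", torch.tensor(hinges))
        # One learnable slope per hinge on each side of the origin.
        pos_init = torch.zeros(len(hinges))
        pos_init[0] = 1.0  # start close to ReLU
        self.pos = nn.Parameter(pos_init)
        self.neg = nn.Parameter(torch.zeros(len(hinges)))

    def forward(self, x):
        x = x.unsqueeze(-1)  # broadcast against the hinge dimension
        right = torch.clamp(x - self.hinges, min=0.0)   # hinges on the positive side
        left = torch.clamp(-x - self.hinges, min=0.0)   # mirrored hinges, keeps f(0) = 0
        return (right * self.pos + left * self.neg).sum(-1)

# Usage: swap in for ReLU in an existing network.
act = SplashLike()
y = act(torch.linspace(-3, 3, 7))
```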