14 research outputs found

    Learning Combinations of Activation Functions

    In the last decade, an active area of research has been devoted to designing novel activation functions that help deep neural networks converge and achieve better performance. The training procedure of these architectures usually involves optimizing only the weights of their layers, while the non-linearities are generally pre-specified and their (possible) parameters are treated as hyper-parameters to be tuned manually. In this paper, we introduce two approaches to automatically learn different combinations of base activation functions (such as the identity function, ReLU, and tanh) during the training phase. We present a thorough comparison of our novel approaches with well-known architectures (such as LeNet-5, AlexNet, and ResNet-56) on three standard datasets (Fashion-MNIST, CIFAR-10, and ILSVRC-2012), showing substantial improvements in overall performance, such as an increase of 3.01 percentage points in top-1 accuracy for AlexNet on ILSVRC-2012. (Comment: 6 pages, 3 figures. Published as a conference paper at ICPR 2018. Code: https://bitbucket.org/francux/learning_combinations_of_activation_function)
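    A minimal sketch, not the authors' implementation, of the core idea of learning a combination of base activation functions; the softmax-normalized mixing weights below are an assumption about one possible parameterization:

```python
import torch
import torch.nn as nn

class LearnedActivationCombination(nn.Module):
    """Combination of base activations (identity, ReLU, tanh) with learnable weights.

    Illustrative sketch only; the paper's exact parameterization
    (e.g. affine vs. normalized combinations) may differ.
    """
    def __init__(self):
        super().__init__()
        # One learnable coefficient per base activation function.
        self.logits = nn.Parameter(torch.zeros(3))

    def forward(self, x):
        w = torch.softmax(self.logits, dim=0)  # keep the combination normalized
        return w[0] * x + w[1] * torch.relu(x) + w[2] * torch.tanh(x)

# Usage: drop in wherever a fixed non-linearity would go.
layer = nn.Sequential(nn.Linear(16, 32), LearnedActivationCombination())
out = layer(torch.randn(4, 16))
```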

    Quantum Inspired Genetic Programming Model to Predict Toxicity Degree for Chemical Compounds

    Cheminformatics plays a vital role in maintaining large amounts of chemical data. Reliable prediction of the toxic effects of chemicals in living systems is highly desirable in domains such as cosmetics, drug design, food safety, and the manufacture of chemical compounds. Toxicity prediction requires new approaches for knowledge discovery from data in order to model composite associations between the modules of a chemical compound, and such techniques become more computationally expensive as the number of chemical compounds increases. State-of-the-art prediction methods such as neural networks and multi-layer regression, which require either parameter tuning or complex transformations of the predictor or outcome variables, do not achieve high accuracy. This paper proposes a Quantum Inspired Genetic Programming (QIGP) model to improve prediction accuracy. Genetic Programming is used to produce a linear equation that calculates the toxicity degree more accurately. Quantum computing is employed to improve the selection of the best-of-run individuals and to handle parsimony pressure, reducing the complexity of the solutions. The results of the internal validation analysis indicate that the QIGP model has better goodness-of-fit statistics and significantly outperforms the Neural Network model.
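    A rough sketch, not drawn from the paper, of the quantum-inspired ingredient as it commonly appears in quantum-inspired evolutionary algorithms: candidate terms of the evolved equation are tracked by qubit-like probability amplitudes, selection "measures" those amplitudes, and a rotation step biases them toward the current best-of-run individual. The placeholder fitness, population size, and rotation angle below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Qubit-like register: one probability amplitude per candidate term of the equation.
# amplitude**2 is the probability that the term is kept in a sampled individual.
n_terms = 8
amplitudes = np.full(n_terms, 1 / np.sqrt(2))  # start in equal superposition

def measure(amplitudes):
    """Collapse the register into a concrete individual (bit mask of kept terms)."""
    return rng.random(n_terms) < amplitudes ** 2

def rotate_towards(amplitudes, best_mask, delta=0.05 * np.pi):
    """Quantum-inspired rotation: nudge amplitudes toward the best-of-run individual."""
    theta = np.arcsin(np.clip(amplitudes, 0.0, 1.0))
    theta = theta + np.where(best_mask, delta, -delta)
    return np.sin(np.clip(theta, 0.05, np.pi / 2 - 0.05))

def fitness(mask):
    """Placeholder: prefer parsimonious term sets, standing in for goodness of fit
    of the evolved linear toxicity equation."""
    return -mask.sum() + rng.normal(scale=0.1)

best_mask, best_fit = None, -np.inf
for generation in range(50):
    population = [measure(amplitudes) for _ in range(20)]
    for mask in population:
        f = fitness(mask)
        if f > best_fit:
            best_mask, best_fit = mask, f
    amplitudes = rotate_towards(amplitudes, best_mask)
```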

    Euclidean Space Data Projection Classifier with Cartesian Genetic Programming (CGP)

    Most evolutionary classifiers are built from generated rule sets that categorize the data into their respective classes. This preliminary work proposes an evolutionary classifier based on a simplified Cartesian Genetic Programming (CGP) algorithm. Instead of using evolutionarily generated rule sets, the CGP generates (i) a reference coordinate and (ii) projection functions that map the data into a new three-dimensional Euclidean space. A distance-boundary function between the projected data and the reference coordinate is then applied to classify the data into their respective classes. The evolutionary algorithm is a simplified CGP using a 1+4 evolutionary strategy. The data projection functions were evolved with CGP for 1000 generations before the best functions were extracted. The classifier was tested on three PROBEN1 benchmark datasets, namely the PIMA Indians Diabetes dataset, the Heart Disease dataset, and the Wisconsin Breast Cancer (WBC) dataset, using 10-fold cross-validation. The results show that the evolved data projection functions achieve competitive classification rates: 97.71% on the Cancer dataset, 77.92% on the PIMA Indians dataset, and 85.86% on the Heart Disease dataset.
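    A minimal Python sketch of the distance-boundary classification step described above, assuming the CGP-evolved projection functions and reference coordinate are already available; the threshold-based decision rule is an illustrative guess rather than the paper's exact formulation.

```python
import numpy as np

def classify(samples, project, reference, boundary):
    """Distance-boundary classifier on CGP-projected data.

    samples   : (n, d) array of input feature vectors
    project   : callable mapping a feature vector to a 3D point
                (stands in for the CGP-evolved projection functions)
    reference : (3,) reference coordinate, also produced by the CGP
    boundary  : distance threshold separating the two classes
    """
    projected = np.array([project(x) for x in samples])        # (n, 3)
    distances = np.linalg.norm(projected - reference, axis=1)  # Euclidean distances
    return (distances > boundary).astype(int)                  # 0 = inside, 1 = outside

# Toy usage with a hand-written projection in place of an evolved one.
toy_project = lambda x: np.array([x[0] + x[1], x[0] * x[1], x[0] - x[1]])
labels = classify(np.random.rand(5, 2), toy_project, np.zeros(3), boundary=0.8)
```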

    SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness

    We introduce SPLASH units, a class of learnable activation functions shown to simultaneously improve the accuracy of deep neural networks and their robustness to adversarial attacks. SPLASH units have a simple parameterization while retaining the ability to approximate a wide range of non-linear functions. SPLASH units are: (1) continuous; (2) grounded (f(0) = 0); (3) use symmetric hinges; and (4) place their hinges at fixed locations derived from the data (i.e. no learning required). Compared to nine other learned and fixed activation functions, including ReLU and its variants, SPLASH units show superior performance across three datasets (MNIST, CIFAR-10, and CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and Network-in-Network). Furthermore, we show that SPLASH units significantly increase the robustness of deep neural networks to adversarial attacks. Our experiments on both black-box and white-box adversarial attacks show that commonly-used architectures, namely LeNet5, All-CNN, Network-in-Network, and ResNet-20, can be up to 31% more robust to adversarial attacks simply by using SPLASH units instead of ReLUs. Finally, we show the benefits of using SPLASH activation functions in larger architectures designed for non-trivial datasets such as ImageNet.
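    A hedged PyTorch sketch of a piecewise-linear activation with the properties listed above (continuous, f(0) = 0, symmetric fixed hinge locations, learnable slopes); the exact SPLASH parameterization may differ in details such as the number of hinges, how hinge locations are chosen from the data, and initialization.

```python
import torch
import torch.nn as nn

class SplashLike(nn.Module):
    """Piecewise-linear activation with fixed symmetric hinges and learnable slopes.

    Illustrative reconstruction from the abstract, not the authors' reference code.
    """
    def __init__(self, hinges=(0.0, 1.0, 2.0)):
        super().__init__()
        # Hinge locations are fixed (e.g. chosen from data statistics), not learned.
        self.register_buffer("hinges", torch.tensor(hinges))
        # One learnable slope per hinge on each side of the origin.
        pos_init = torch.zeros(len(hinges))
        pos_init[0] = 1.0  # start close to ReLU
        self.pos = nn.Parameter(pos_init)
        self.neg = nn.Parameter(torch.zeros(len(hinges)))

    def forward(self, x):
        x = x.unsqueeze(-1)  # broadcast against the hinge dimension
        right = torch.clamp(x - self.hinges, min=0.0)   # hinges on the positive side
        left = torch.clamp(-x - self.hinges, min=0.0)   # mirrored hinges, keeps f(0) = 0
        return (right * self.pos + left * self.neg).sum(-1)

# Usage: swap in for ReLU in an existing network.
act = SplashLike()
y = act(torch.linspace(-3, 3, 7))
```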