76 research outputs found

    Precise Multi-Neuron Abstractions for Neural Network Certification

    Formal verification of neural networks is critical for their safe adoption in real-world applications. However, designing a verifier which can handle realistic networks in a precise manner remains an open and difficult challenge. In this paper, we take a major step in addressing this challenge and present a new framework, called PRIMA, that computes precise convex approximations of arbitrary non-linear activations. PRIMA is based on novel approximation algorithms that compute the convex hull of polytopes, leveraging concepts from computational geometry. The algorithms have polynomial complexity, yield fewer constraints, and minimize precision loss. We evaluate the effectiveness of PRIMA on challenging neural networks with ReLU, Sigmoid, and Tanh activations. Our results show that PRIMA is significantly more precise than the state-of-the-art, verifying robustness for up to 16%, 30%, and 34% more images than prior work on ReLU-, Sigmoid-, and Tanh-based networks, respectively
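    For context, the sketch below shows the standard single-neuron "triangle" relaxation of ReLU, the kind of per-neuron convex approximation that multi-neuron methods aim to tighten. It is not PRIMA's algorithm; the function name `relu_triangle` and the bound arguments `l` and `u` are illustrative assumptions.

```python
from typing import Callable, Tuple

Bound = Callable[[float], float]


def relu_triangle(l: float, u: float) -> Tuple[Bound, Bound]:
    """Return (lower, upper) bounds on ReLU(x) valid for x in [l, u].

    Hypothetical helper: single-neuron triangle relaxation, not PRIMA itself.
    """
    if l > u:
        raise ValueError("expected l <= u")
    if u <= 0:
        # Neuron is always inactive: ReLU(x) == 0 on the whole interval.
        return (lambda x: 0.0), (lambda x: 0.0)
    if l >= 0:
        # Neuron is always active: ReLU(x) == x on the whole interval.
        return (lambda x: x), (lambda x: x)
    # Unstable neuron: the upper bound is the chord from (l, 0) to (u, u).
    slope = u / (u - l)
    lower = lambda x: max(0.0, x)      # constraints y >= 0 and y >= x
    upper = lambda x: slope * (x - l)  # constraint y <= u * (x - l) / (u - l)
    return lower, upper


lower, upper = relu_triangle(-1.0, 3.0)
print(upper(0.0))  # gap between the chord and ReLU at x = 0
```

    The gap between `upper` and the true ReLU is the precision loss that tighter, multi-neuron abstractions try to reduce.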

    Artificial neural networks and their applications to intelligent fault diagnosis of power transmission lines

    Over the past thirty years, the idea of computing based on models inspired by human brains and biological neural networks has emerged. Artificial neural networks play an important role in the field of machine learning and hold the key to the success of performing many intelligent tasks by machines. They are used in various applications such as pattern recognition, data classification, stock market prediction, aerospace, weather forecasting, control systems, intelligent automation, robotics, and healthcare. Their architectures generally consist of an input layer, multiple hidden layers, and one output layer. They can be implemented in software or hardware. Nowadays, various structures with various names exist for artificial neural networks, each of which has its own particular applications. The types used in this study include feedforward neural networks, convolutional neural networks, and general regression neural networks. Increasing the number of layers in artificial neural networks, as needed for large datasets, implies increased computational expense. Therefore, besides these basic structures in deep learning, advanced techniques such as transfer learning, federated learning, and reinforcement learning have been proposed to overcome the drawbacks of the original structures. Furthermore, implementing artificial neural networks in hardware gives scientists and engineers the chance to perform high-dimensional and big data-related tasks because it removes the constraints of memory access time known as the von Neumann bottleneck. Accordingly, analog and digital circuits are used for artificial neural network implementations without using general-purpose CPUs. In this study, the problem of fault detection, identification, and location estimation for transmission lines is studied, and various deep learning approaches are designed and implemented as solutions. 
This research work focuses on transmission line datasets, their faults, and the importance of identifying, detecting, and locating them. It also includes a comprehensive review of previous studies that perform these three tasks. The application of various artificial neural networks, such as feedforward neural networks, convolutional neural networks, and general regression neural networks, to fault identification, detection, and location estimation on transmission line datasets is also discussed. Some advanced methods based on artificial neural networks, such as the transfer learning technique, are taken into account in this thesis. These methodologies are designed and applied to transmission line datasets so that scientists and engineers can train with fewer data points and spend less time on the training step. This work also proposes a transfer learning-based technique for distinguishing faulty and non-faulty insulators in transmission line images. Besides, an effective design for an activation function of artificial neural networks is proposed in this thesis. Using the hyperbolic tangent as an activation function in artificial neural networks has several benefits, including inclusiveness and high accuracy.

    Finite element interpolated neural networks for solving forward and inverse problems

    We propose a general framework for solving forward and inverse problems constrained by partial differential equations, where we interpolate neural networks onto finite element spaces to represent the (partial) unknowns. The framework overcomes the challenges related to the imposition of boundary conditions, the choice of collocation points in physics-informed neural networks, and the integration of variational physics-informed neural networks. A numerical experiment set confirms the framework's capability of handling various forward and inverse problems. In particular, the trained neural network generalises well for smooth problems, beating finite element solutions by some orders of magnitude. We finally propose an effective one-loop solver with an initial data fitting step (to obtain a cheap initialisation) to solve inverse problems

    Analogue neuromorphic systems.

    This thesis addresses a new area of science and technology, that of neuromorphic systems, namely the problems and prospects of analogue neuromorphic systems. The subject is subdivided into three chapters. Chapter 1 is an introduction. It formulates the oncoming problem of the creation of highly computationally costly systems of nonlinear information processing (such as artificial neural networks and artificial intelligence systems). It shows that an analogue technology could make a vital contribution to the creation of such systems. The basic principles of creation of analogue neuromorphic systems are formulated. The importance of the principle of orthogonality for future highly efficient complex information processing systems is emphasised. Chapter 2 reviews the basics of neural and neuromorphic systems and reports on the present situation in this field of research, including both the experimental and theoretical knowledge gained to date. The chapter provides the necessary background for correct interpretation of the results reported in Chapter 3 and for a realistic decision on the direction for future work. Chapter 3 describes my own experimental and computational results within the framework of the subject, obtained at De Montfort University. These include the building of (i) an analogue polynomial approximator/interpolator/extrapolator, (ii) a synthesiser of orthogonal functions, (iii) an analogue real-time video filter (performing homomorphic filtration), (iv) an adaptive polynomial compensator of geometrical distortions of CRT monitors, and (v) an analogue parallel-learning neural network (backpropagation algorithm). Thus, this thesis makes a dual contribution to the chosen field: it summarises the present knowledge on the possibility of utilising analogue technology in current and future computational systems, and it reports new results within the framework of the subject. 
The main conclusion is that, due to their promising power characteristics, small size, and high tolerance to degradation, analogue neuromorphic systems will play a more and more important role in future computational systems (in particular in systems of artificial intelligence).

    Efficient FPGA-Based Inference Architectures for Deep Learning Networks

    Deep learning has evolved to become the state-of-the-art technique for numerous classification and regression applications. Deep learning models, such as Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs), deploy dozens of hidden layers with hundreds of neurons to learn a meaningful representation of the input data. The power of DNNs and CNNs comes from the fact that they are trained through feature learning rather than task-specific algorithms. However, this comes at the expense of high computational cost for both training and inference processes. This necessitates high-performance and energy-efficient accelerators, especially for inference where real-time processing matters. 
FPGAs offer an appealing platform for accelerating the inference of DNNs and CNNs due to their performance, configurability, and energy efficiency. In this thesis, we address three main problems. Firstly, we consider the problem of realizing a precise but efficient implementation of traditional fully connected DNNs in FPGAs. Although Binary Neural Networks (BNNs) use a compact data representation (1-bit) compared to the fixed-point and floating-point representations in traditional DNNs and CNNs, they may still need too many computational and memory resources. Therefore, we study the problem of implementing BNNs in FPGAs as the second problem. Finally, we focus on introducing FPGAs as accelerators to a wider range of software developers, especially those who do not possess FPGA programming knowledge. To address the first problem, and since efficient implementation of non-linear activation functions is essential to the implementation of deep learning models on FPGAs, we introduce a non-linear activation function implementation based on the Discrete Cosine Transform Interpolation Filter (DCTIF). The proposed interpolation architecture combines arithmetic operations on the stored samples of the hyperbolic tangent function and on input data. It achieves almost 3× better precision than previous works while using a similar amount of computational resources and a small amount of memory. Various combinations of DCTIF parameters can be chosen to trade off the accuracy and the overall circuit complexity of the tanh function. In an attempt to address the first and third problems, we introduce a Single hidden layer Neural Network (SNN) multiplication-free overlay architecture with fully connected DNN-level performance. This FPGA inference overlay can be used for applications that are normally solved with fully connected DNNs. The overlay avoids the time needed to synthesize, place, route, and regenerate a new bitstream when the application changes. 
The SNN overlay inputs and activations are quantized to power-of-two values, which allows utilizing shift units instead of multipliers. Since the overlay is an SNN, we fill the FPGA chip with the maximum possible number of neurons that can work in parallel in the hidden layer. We evaluate the proposed architecture on typical benchmark datasets and demonstrate higher throughput with respect to the state-of-the-art while achieving the same accuracy. In addition, the SNN overlay makes the power and versatility of FPGAs available to a wider DNN user community and improves DNN design efficiency.
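    The idea of replacing multipliers with shift units can be illustrated in software. The sketch below is a minimal illustration under assumed conventions, not the thesis's hardware design: the helper names `quantize_pow2` and `shift_multiply`, the exponent range, and the round-in-log-domain rule are all assumptions.

```python
import math
from typing import Tuple


def quantize_pow2(x: float, min_exp: int = -8, max_exp: int = 0) -> Tuple[int, int]:
    """Approximate x by sign * 2**exp and return (sign, exp).

    Hypothetical helper: rounds log2(|x|) to the nearest integer and clamps
    the exponent to [min_exp, max_exp]. Zero is encoded with sign 0.
    """
    if x == 0:
        return 0, min_exp
    sign = 1 if x > 0 else -1
    exp = round(math.log2(abs(x)))
    exp = max(min_exp, min(max_exp, exp))
    return sign, exp


def shift_multiply(value: int, sign: int, exp: int) -> int:
    """Multiply a non-negative fixed-point integer by sign * 2**exp with shifts only."""
    if sign == 0:
        return 0
    # Left shift for non-negative exponents, right shift (truncating) otherwise.
    shifted = value << exp if exp >= 0 else value >> -exp
    return sign * shifted


sign, exp = quantize_pow2(0.3)          # 0.3 is approximated by 2**-2
print(shift_multiply(64, sign, exp))    # 64 * 0.25 computed as 64 >> 2
```

    Because every quantized operand is a signed power of two, each product reduces to a shift and an optional negation, which is why no hardware multiplier is needed.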

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before the applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable from both a real-time operation and a battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors, and carry-save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy-efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.

    Applications in Electronics Pervading Industry, Environment and Society

    This book features the manuscripts accepted for the Special Issue “Applications in Electronics Pervading Industry, Environment and Society—Sensing Systems and Pervasive Intelligence” of the MDPI journal Sensors. Most of the papers come from a selection of the best papers of the 2019 edition of the “Applications in Electronics Pervading Industry, Environment and Society” (APPLEPIES) Conference, which was held in November 2019. All these papers have been significantly enhanced with novel experimental results. The papers give an overview of the trends in research and development activities concerning the pervasive application of electronics in industry, the environment, and society. The focus of these papers is on cyber physical systems (CPS), with research proposals for new sensor acquisition and ADC (analog to digital converter) methods, high-speed communication systems, cybersecurity, big data management, and data processing including emerging machine learning techniques. Physical implementation aspects are discussed as well as the trade-off found between functional performance and hardware/system costs

    Feature Subset Selection in Intrusion Detection Using Soft Computing Techniques

    Intrusions on computer network systems are major security issues these days. Therefore, it is of utmost importance to prevent such intrusions. The prevention of such intrusions is entirely dependent on their detection, which is a main part of any security tool such as an Intrusion Detection System (IDS), Intrusion Prevention System (IPS), Adaptive Security Appliance (ASA), checkpoints, and firewalls. Therefore, accurate detection of network attacks is imperative. A variety of intrusion detection approaches are available, but the main problem is their performance, which can be enhanced by increasing the detection rates and reducing false positives. Such weaknesses of the existing techniques have motivated the research presented in this thesis. One of the weaknesses of the existing intrusion detection approaches is the use of a raw dataset for classification: the classifier may get confused due to redundancy and hence may not classify correctly. To overcome this issue, Principal Component Analysis (PCA) has been employed to transform raw features into the principal feature space and select the features based on their sensitivity. The sensitivity is determined by the eigenvalues. Recent approaches use PCA to project the feature space onto the principal feature space and select the features corresponding to the highest eigenvalues, but these features may not have the optimal sensitivity for the classifier because many sensitive features are ignored. Instead of the traditional PCA approach of selecting the features with the highest eigenvalues, this research applied a Genetic Algorithm (GA) to search the principal feature space for a subset of features with optimal sensitivity and the highest discriminatory power. Based on the selected features, the classification is performed. The Support Vector Machine (SVM) and Multilayer Perceptron (MLP) are used for classification due to their proven ability in classification. 
This research work uses the Knowledge Discovery and Data mining (KDD) cup dataset, which is considered a benchmark for evaluating security detection mechanisms. The performance of this approach was analyzed and compared with existing approaches. The results show that the proposed method provides an optimal intrusion detection mechanism that outperforms the existing approaches and has the capability to minimize the number of features and maximize the detection rates.
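    The GA-based search over principal components can be sketched as a search over bit masks, where bit i says whether component i is kept. This is a minimal illustration, not the thesis's implementation: `ga_select`, its parameters, and `toy_fitness` are hypothetical, and the toy fitness stands in for the classifier accuracy (SVM or MLP) that the described method would evaluate.

```python
import random
from typing import Callable, List

Mask = List[int]


def ga_select(n_features: int, fitness: Callable[[Mask], float],
              pop_size: int = 20, generations: int = 30,
              mutation_rate: float = 0.1, seed: int = 0) -> Mask:
    """Search for a 0/1 mask over n_features (>= 2) that maximizes fitness."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


def toy_fitness(mask: Mask) -> float:
    """Stand-in for classifier accuracy: components 1, 4, 6 are 'informative'."""
    informative = (1, 4, 6)
    return sum(mask[i] for i in informative) - 0.1 * sum(mask)


best = ga_select(8, toy_fitness)
print(best, toy_fitness(best))
```

    In a real pipeline, the fitness function would project the data onto the selected components, train the classifier, and return its validation score, which is far more expensive than the toy score used here.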

    Towards an integrated understanding of neural networks

    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 123-136). By David Rolnick. Neural networks underpin both biological intelligence and modern AI systems, yet there is relatively little theory for how the observed behavior of these networks arises. Even the connectivity of neurons within the brain remains largely unknown, and popular deep learning algorithms lack theoretical justification or reliability guarantees. This thesis aims towards a more rigorous understanding of neural networks. We characterize and, where possible, prove essential properties of neural algorithms: expressivity, learning, and robustness. We show how observed emergent behavior can arise from network dynamics, and we develop algorithms for learning more about the network structure of the brain.

    Review: Deep learning in electron microscopy

    Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Next, we discuss the hardware and software needed to get started with deep learning and to interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.