    Towards Lightweight AI: Leveraging Stochasticity, Quantization, and Tensorization for Forecasting

    The deep neural network is an intriguing prognostic model capable of learning meaningful patterns that generalize to new data. The deep learning paradigm has been widely adopted across many domains, including natural language processing, genomics, and automatic music transcription. However, deep neural networks rely on a plethora of underlying computational units and data, collectively demanding a wealth of compute and memory resources for practical tasks. This model complexity prohibits the use of larger deep neural networks for resource-critical applications, such as edge computing. In order to reduce model complexity, several research groups are actively studying compression methods, hardware accelerators, and alternative computing paradigms. These orthogonal research explorations often leave a gap in understanding the interplay of the optimization mechanisms and their overall feasibility for a given task. In this thesis, we address this gap by developing a holistic solution to assess the model complexity reduction theoretically and quantitatively at both high-level and low-level abstractions for training and inference. At the algorithmic level, a novel deep, yet lightweight, recurrent architecture is proposed that extends the conventional echo state network. The architecture employs random dynamics, brain-inspired plasticity mechanisms, tensor decomposition, and hierarchy as the key features to enrich learning. Furthermore, the hyperparameter landscape is optimized via a particle swarm optimization algorithm. To deploy these networks efficiently onto low-end edge devices, both ultra-low and mixed-precision numerical formats are studied within our feedforward deep neural network hardware accelerator. More importantly, the tapered-precision posit format with a novel exact-dot-product algorithm is employed in the low-level digital architectures to study its efficacy in resource utilization.
The dynamics of the architecture are characterized through neuronal partitioning and Lyapunov stability, and we show that superlative networks emerge beyond the edge of chaos with an agglomeration of weak learners. We also demonstrate that tensorization improves model performance by preserving correlations present in multi-way structures. Low-precision posits are found to consistently outperform other formats on various image classification tasks and, in conjunction with compression, we achieve orders of magnitude of speedup and memory savings for both training and inference for the forecasting of chaotic time series and polyphonic music tasks. This culmination of methods greatly improves the feasibility of deploying rich predictive models on edge devices.
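The abstract above builds on the conventional echo state network, in which a fixed random recurrent reservoir is driven by the input and only a readout is trained. As a rough illustration of the reservoir part only, the following is a minimal sketch in plain Python. All names (`make_reservoir`, `run_reservoir`, `scale`, `state0`) are hypothetical, and the weight rescaling by the maximum absolute row sum is an assumption chosen to keep the dynamics contractive; it is not the thesis's architecture, which adds plasticity, tensor decomposition, and hierarchy and studies regimes beyond the edge of chaos.

```python
import math
import random


def make_reservoir(n_in, n_res, scale=0.9, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_res)]
    w = [[rng.uniform(-1, 1) for _ in range(n_res)] for _ in range(n_res)]
    # Assumption: rescale so the largest absolute row sum equals `scale`.
    # With scale < 1 the state update is a contraction, a simple
    # sufficient condition for fading memory of the initial state.
    norm = max(sum(abs(v) for v in row) for row in w)
    w = [[scale * v / norm for v in row] for row in w]
    return w_in, w


def run_reservoir(w_in, w, inputs, state0=None):
    """Apply x(t) = tanh(W_in u(t) + W x(t-1)) and collect all states."""
    state = list(state0) if state0 is not None else [0.0] * len(w)
    states = []
    for u in inputs:
        pre = [
            sum(a * b for a, b in zip(w_in[i], u))
            + sum(a * b for a, b in zip(w[i], state))
            for i in range(len(w))
        ]
        state = [math.tanh(p) for p in pre]
        states.append(state)
    return states


# Usage: drive a 10-unit reservoir with a scalar sine wave.
w_in, w = make_reservoir(n_in=1, n_res=10)
signal = [[math.sin(0.1 * t)] for t in range(50)]
states = run_reservoir(w_in, w, signal)
```

A trained model would fit a linear readout (for example by ridge regression) on the collected states; that step is omitted here.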

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning. This is partly because the classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, by contrast, guides the course of low-level heuristics to search beyond local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    No abstract available

    Analysis of physiological signals using machine learning methods

    Technological advances in data collection enable scientists to suggest novel approaches, such as Machine Learning algorithms, to process and make sense of this information. However, during this process of collection, data loss and damage can occur for reasons such as faulty device sensors or miscommunication. In the context of time-series data such as multi-channel bio-signals, there is a possibility of losing a whole channel. In such cases, existing research suggests imputing the missing parts when the majority of data is available. One way of understanding and classifying complex signals is by using deep neural networks. The hyper-parameters of such models have been optimised using the process of back propagation. Over time, improvements have been suggested to enhance this algorithm. However, an essential drawback of back propagation can be its sensitivity to noisy data. This thesis proposes two novel approaches to address the missing data challenge and back propagation drawbacks: First, suggesting a gradient-free model in order to discover the optimal hyper-parameters of a deep neural network. The complexity of deep networks and high-dimensional optimisation parameters presents challenges to finding a suitable network structure and hyper-parameter configuration. This thesis proposes the use of a minimalist swarm optimiser, Dispersive Flies Optimisation (DFO), to enable the selected model to achieve better results in comparison with the traditional back propagation algorithm in certain conditions, such as a limited number of training samples. The DFO algorithm offers a robust search process for finding and determining the hyper-parameter configurations. Second, imputing whole missing bio-signals within a multi-channel sample. This approach comprises two experiments, namely the two-signal and five-signal imputation models. The first experiment attempts to implement and evaluate the performance of a model mapping bio-signals from A to B and vice versa.
Conceptually, this is an extension to transfer learning using Cycle Generative Adversarial Networks (CycleGANs). The second experiment attempts to suggest a mechanism imputing missing signals in instances where multiple data channels are available for each sample. The capability to map to a target signal through multiple source domains achieves a more accurate estimate for the target domain. The results of the experiments performed indicate that in certain circumstances, such as having a limited number of samples, finding the optimal hyper-parameters of a neural network using gradient-free algorithms outperforms traditional gradient-based algorithms, leading to more accurate classification results. In addition, Generative Adversarial Networks could be used to impute the missing data channels in multi-channel bio-signals, and the generated data used for further analysis and classification tasks.
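The abstract above names Dispersive Flies Optimisation (DFO) as its gradient-free optimiser. The following is a minimal sketch of a DFO-style loop as the algorithm is commonly described: each fly moves relative to its best ring neighbour and the swarm best, individual dimensions are occasionally reset at random, and the best fly is left unchanged. The function name `dfo_minimise`, the default parameter values, the clamping to the search box, and the toy objective are all assumptions for illustration; the thesis applies the optimiser to a neural network, which is not reproduced here.

```python
import random


def dfo_minimise(objective, bounds, n_flies=20, iterations=200,
                 delta=0.001, seed=0):
    """Minimise `objective` inside the box `bounds` with a basic DFO loop."""
    rng = random.Random(seed)
    # Random initial swarm inside the search box.
    flies = [[rng.uniform(lo, hi) for lo, hi in bounds]
             for _ in range(n_flies)]
    for _ in range(iterations):
        fitness = [objective(f) for f in flies]
        best = min(range(n_flies), key=fitness.__getitem__)
        new_flies = [f[:] for f in flies]
        for i in range(n_flies):
            if i == best:
                continue  # elitism: the swarm-best fly is kept as is
            # Best of the two ring neighbours.
            left, right = (i - 1) % n_flies, (i + 1) % n_flies
            nb = left if fitness[left] < fitness[right] else right
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < delta:
                    # Disturbance: restart this dimension at random.
                    new_flies[i][d] = rng.uniform(lo, hi)
                else:
                    u = rng.random()
                    x = flies[nb][d] + u * (flies[best][d] - flies[i][d])
                    # Assumption: clamp to the box to stay feasible.
                    new_flies[i][d] = min(max(x, lo), hi)
        flies = new_flies
    fitness = [objective(f) for f in flies]
    best = min(range(n_flies), key=fitness.__getitem__)
    return flies[best], fitness[best]


# Usage: a toy 3-dimensional sphere function stands in for a training loss.
position, value = dfo_minimise(lambda x: sum(v * v for v in x),
                               [(-5.0, 5.0)] * 3)
```

To use such a loop for network training, `objective` would map a flat parameter vector to a loss on the training data, so no gradients are required.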

    Efficient Design, Training, and Deployment of Artificial Neural Networks

    Over the last decade, artificial neural networks, especially deep neural networks, have emerged as the main modeling tool in Machine Learning, allowing us to tackle an increasing number of real-world problems in various fields, most notably in computer vision, natural language processing, and biomedical and financial analysis. The success of deep neural networks can be attributed to many factors, namely the increasing amount of data available, the development of dedicated hardware, the advancements in optimization techniques, and especially the invention of novel neural network architectures. Nowadays, state-of-the-art neural networks that achieve the best performance in any field are usually formed by several layers, comprising millions, or even billions of parameters. Despite spectacular performances, optimizing a single state-of-the-art neural network often requires a tremendous amount of computation, which can take several days using high-end hardware. More importantly, it took several years of experimentation for the community to gradually discover effective neural network architectures, moving from AlexNet and VGGNet to ResNet, and then DenseNet. In addition to the expensive and time-consuming experimentation process, deep neural networks, which require powerful processors to operate during the deployment phase, cannot be easily deployed to mobile or embedded devices. For these reasons, improving the design, training, and deployment of deep neural networks has become an important area of research in the Machine Learning field. This thesis makes several contributions in the aforementioned research area, which can be grouped into two main categories. The first category consists of research works that focus on designing neural network architectures that are efficient not only in terms of accuracy but also computational complexity.
In the first contribution under this category, the computational efficiency is first addressed at the filter level through the incorporation of a handcrafted design for convolutional neural networks, which are the basis of most deep neural networks. More specifically, the multilinear convolution filter is proposed to replace the linear convolution filter, which is a fundamental element in a convolutional neural network. The new filter design not only better captures multidimensional structures inherent in CNNs but also requires far fewer parameters to be estimated. While using efficient algebraic transforms and approximation techniques to tackle the design problem can significantly reduce the memory and computational footprint of neural network models, this approach requires a lot of trial and error. In addition, the simple neuron model used in most neural networks nowadays, which only performs a linear transformation followed by a nonlinear activation, cannot effectively mimic the diverse activities of biological neurons. For this reason, the second and third contributions transition from a handcrafted, manual design approach to an algorithmic approach in which the type of transformations performed by each neuron as well as the topology of neural networks are optimized in a systematic and completely data-dependent manner. As a result, the algorithms proposed in the second and third contributions are capable of designing highly accurate and compact neural networks while requiring minimal human effort or intervention in the design process. Although significant progress has been made in reducing the runtime complexity of neural network models on embedded devices, most of these solutions have been demonstrated on powerful embedded devices, which are costly in applications that require large-scale deployment such as surveillance systems. In these scenarios, complete on-device processing solutions can be infeasible.
By contrast, hybrid solutions, where some preprocessing steps are conducted on the client side while the heavy computation takes place on the server side, are more practical. The second category of contributions made in this thesis focuses on efficient learning methodologies for hybrid solutions that take into account both the signal acquisition and inference steps. More concretely, the first contribution under this category is the formulation of the Multilinear Compressive Learning framework in which multidimensional signals are compressively acquired, and inference is made based on the compressed signals, bypassing the signal reconstruction step. In the second contribution, the relationships between the input signal resolution, the compression rate, and the learning performance of Multilinear Compressive Learning systems are empirically and systematically analyzed, leading to the discovery of a surrogate performance indicator that can be used to approximately rank the learning performances of different sensor configurations without conducting the entire optimization process. Nowadays, many communication protocols provide support for adaptive data transmission to maximize the data throughput and minimize energy consumption depending on the network’s strength. The last contribution of this thesis proposes an extension of the Multilinear Compressive Learning framework with an adaptive compression capability, which enables us to take advantage of the adaptive rate transmission feature in existing communication protocols to maximize the informational content throughput of the whole system. Finally, all methodological contributions of this thesis are accompanied by extensive empirical analyses demonstrating their performance and computational advantages over existing methods in different computer vision applications such as object recognition, face verification, human activity classification, and visual information retrieval.
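The abstract above describes compressively acquiring multidimensional signals without flattening them. One way to picture this for a two-dimensional signal is a separable measurement with one small sensing matrix per dimension, Y = Φ_rows · X · Φ_colsᵀ. The following is a minimal sketch under that assumption. The helper names and the fixed Gaussian sensing matrices are hypothetical; the thesis's framework handles general tensors and optimizes the sensing operators jointly with the learning task, which is not shown.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def random_matrix(rows, cols, rng):
    """Gaussian sensing matrix (an illustrative choice, not learned)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def multilinear_measure(x, phi_rows, phi_cols):
    """Compress a 2-D signal along each dimension separately."""
    return matmul(matmul(phi_rows, x), transpose(phi_cols))


# Usage: compress a 4x6 signal to 2x3 measurements.
rng = random.Random(0)
signal = [[float(r * 6 + c) for c in range(6)] for r in range(4)]
phi_rows = random_matrix(2, 4, rng)   # acts on the row dimension
phi_cols = random_matrix(3, 6, rng)   # acts on the column dimension
measurements = multilinear_measure(signal, phi_rows, phi_cols)
```

In this toy setting the two separable matrices hold 2·4 + 3·6 = 26 entries, whereas a single unstructured matrix mapping the flattened 24-element signal to 6 measurements would hold 6·24 = 144.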