64 research outputs found

    Extended evaluation of the Volterra-Neural Network for model compression

    When large models are used for a classification task, model compression becomes necessary to satisfy transmission, space, time, or computing constraints. Multilayer Perceptron (MLP) models are traditionally used as classifiers, but depending on the problem they may need a large number of parameters (neuron functions, weights, and biases) to achieve acceptable performance. This work extends the evaluation of a technique for compressing an array of MLPs through the outputs of a Volterra-Neural Network (Volterra-NN) while maintaining classification performance. The results show that these outputs can be used to build an array of Volterra-NNs that needs significantly fewer parameters than the original array of MLPs, while achieving the same high accuracy in most cases. The compression capabilities of the Volterra-NN have been tested on several kinds of classification problems. Experimental results are presented on three well-known databases: Letter Recognition, Pen-Based Recognition of Handwritten Digits, and Face Recognition. (Sociedad Argentina de Informática e Investigación Operativa)
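    To make the compression idea concrete, here is a minimal Python sketch of the underlying principle, not the paper's actual Volterra-NN construction: a trained model's response is approximated by a truncated second-order Volterra (polynomial) expansion fitted by least squares, which typically needs far fewer parameters than a large MLP. The mlp_output stand-in and all settings are illustrative assumptions.

        import numpy as np

        def volterra_features(X):
            """Constant, linear, and quadratic (second-order kernel) terms."""
            n, d = X.shape
            quad = np.stack([X[:, i] * X[:, j]
                             for i in range(d) for j in range(i, d)], axis=1)
            return np.hstack([np.ones((n, 1)), X, quad])

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 4))

        def mlp_output(X):  # hypothetical stand-in for the large trained MLP
            return np.tanh(X @ np.array([0.5, -1.0, 0.3, 0.8]))

        # Fit the Volterra kernels so the expansion reproduces the MLP's outputs.
        Phi = volterra_features(X)
        kernels, *_ = np.linalg.lstsq(Phi, mlp_output(X), rcond=None)
        print("Volterra parameters:", kernels.size)  # 15, versus an MLP's many weights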

    Machine learning for optical fiber communication systems: An introduction and overview

    Optical networks generate a vast amount of diagnostic, control, and performance-monitoring data. When information is extracted from this data, reconfigurable network elements and reconfigurable transceivers allow the network to adapt both to changes in the physical infrastructure and to changing traffic conditions. Machine learning is emerging as a disruptive technology for extracting useful information from this raw data to enable enhanced planning, monitoring, and dynamic control. We provide a survey of the recent literature and highlight numerous promising avenues for machine learning applied to optical networks, including explainable machine learning, digital twins, and approaches that embed domain knowledge into the machine learning, such as physics-informed machine learning for the physical layer and graph-based machine learning for the networking layer.
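    As a generic illustration of the physics-informed direction mentioned above, not a specific method from the survey, the sketch below trains a small PyTorch network whose loss combines a data-misfit term with the residual of a governing equation; the toy ODE du/dt = -k*u stands in for a real fiber-propagation model, and all settings are assumptions.

        import torch

        k = 1.5  # known coefficient of the toy ODE du/dt = -k*u
        net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                                  torch.nn.Linear(32, 1))
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)

        t_data = torch.tensor([[0.0], [0.5], [1.0]])
        u_data = torch.exp(-k * t_data)  # sparse "measurements"
        t_phys = torch.linspace(0, 1, 50).reshape(-1, 1).requires_grad_(True)

        for _ in range(1000):
            opt.zero_grad()
            u = net(t_phys)
            du_dt = torch.autograd.grad(u.sum(), t_phys, create_graph=True)[0]
            loss_phys = ((du_dt + k * u) ** 2).mean()         # equation residual
            loss_data = ((net(t_data) - u_data) ** 2).mean()  # data misfit
            (loss_data + loss_phys).backward()
            opt.step()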

    Optics for AI and AI for Optics

    Artificial intelligence is deeply involved in our daily lives, reinforcing the digital transformation of modern economies and infrastructure. It relies on powerful computing clusters, which face power-consumption bottlenecks for both data transmission and intensive computing. Meanwhile, optics (especially optical communications, which underpin today's telecommunications) is penetrating short-reach connections down to the chip level, thus meeting AI technology and creating numerous opportunities. This book is about the marriage of optics and AI and how each can benefit from the other. Optics facilitates on-chip neural networks based on fast optical computing and energy-efficient interconnects and communications. In turn, AI provides efficient tools to address the challenges of today's optical communication networks, which behave in an increasingly complex manner. The book collects contributions from pioneering researchers in both academia and industry to discuss the challenges and solutions in each of the respective fields.

    A Hybrid Approach of Traffic Flow Prediction Using Wavelet Transform and Fuzzy Logic

    The rapid development of urban areas and the increasing size of vehicle fleets are causing severe traffic congestion. According to traffic index data (TomTom Traffic Index 2016), most of the larger cities in Canada rank between 30th and 100th among the most traffic-congested cities in the world. A recent study by the CAA (Canadian Automobile Association) concludes that traffic congestion costs drivers 11.5 million hours and 22 million litres of fuel each year, causing billions of dollars in lost revenue. Although active research to improve transportation management has been going on for four decades, statistical data show a demand for new methods that predict traffic flow with improved accuracy. This research presents a hybrid approach that applies a wavelet transform to a time-frequency (traffic count/hour) signal to determine sharp variation points of traffic flow. Data between sharp variation points reveal segments with similar trends. These segments are used to construct fuzzy membership sets by categorizing the processed data together with other recorded information such as time, season, and weather. When real-time data is compared with the historical data using fuzzy IF-THEN rules, a matched dataset represents a reliable source of information for traffic prediction. In addition to the proposed method, this work also includes experimental results demonstrating improved accuracy for long-term traffic flow prediction.
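    The first stage of this pipeline can be sketched in Python with PyWavelets: detail coefficients of a discrete wavelet transform flag the sharp variation points in an hourly traffic-count signal. The synthetic signal, the Haar wavelet, and the threshold below are illustrative assumptions, not the paper's settings.

        import numpy as np
        import pywt  # PyWavelets

        hours = np.arange(168)                               # one week, hourly counts
        counts = 200 + 150 * np.sin(2 * np.pi * hours / 24)  # daily rhythm
        counts[60:66] += 300                                 # sudden congestion burst

        cA, cD = pywt.dwt(counts, 'haar')  # single-level decomposition
        threshold = 3 * np.std(cD)
        variation_idx = np.where(np.abs(cD) > threshold)[0] * 2  # map back to hours
        print("sharp variation points near hours:", variation_idx)

    Segments between successive variation points would then be grouped with time, season, and weather attributes into fuzzy membership sets and matched against real-time data with fuzzy IF-THEN rules.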

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, and outlines their applications in machine learning and data analytics. Particular emphasis is placed on the tensor train (TT) and Hierarchical Tucker (HT) decompositions and their physically meaningful interpretations, which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks can perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher-order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone texts or as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. (Comment: 232 pages)
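    For readers unfamiliar with the TT format, the following minimal numpy sketch shows the standard TT-SVD idea: a higher-order tensor is factored into a chain of 3-way cores by sequential truncated SVDs, so storage grows with the sum rather than the product of mode sizes. The rank cap r_max and the random tensor are illustrative.

        import numpy as np

        def tt_svd(tensor, r_max):
            """Factor 'tensor' into TT cores by sequential truncated SVDs."""
            dims, cores, r_prev = tensor.shape, [], 1
            mat = tensor.reshape(dims[0], -1)
            for n_k in dims[:-1]:
                mat = mat.reshape(r_prev * n_k, -1)
                U, S, Vt = np.linalg.svd(mat, full_matrices=False)
                r = min(r_max, len(S))
                cores.append(U[:, :r].reshape(r_prev, n_k, r))
                mat = S[:r, None] * Vt[:r]  # carry the remainder forward
                r_prev = r
            cores.append(mat.reshape(r_prev, dims[-1], 1))
            return cores

        T = np.random.default_rng(1).normal(size=(4, 5, 6, 7))
        cores = tt_svd(T, r_max=3)
        print([c.shape for c in cores])
        print(sum(c.size for c in cores), "core parameters vs", T.size, "entries")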

    Inferential stability in systems biology

    The modern biological sciences are fraught with statistical difficulties. Biomolecular stochasticity, experimental noise, and the "large p, small n" problem all contribute to the challenge of data analysis. Nevertheless, we routinely seek to draw robust, meaningful conclusions from observations. In this thesis, we explore methods for assessing the effects of data variability upon downstream inference, in an attempt to quantify and promote the stability of the inferences we make. We start with a review of existing methods for addressing this problem, focusing on the bootstrap and similar methods. The key requirement for all such approaches is a statistical model that approximates the data-generating process. We move on to consider biomarker discovery problems, presenting a novel algorithm for proposing putative biomarkers on the strength of both their predictive ability and the stability with which they are selected. In a simulation study, our approach performs favourably in comparison to strategies that select on the basis of predictive performance alone. We then consider the real problem of identifying protein peak biomarkers for HAM/TSP, an inflammatory condition of the central nervous system caused by HTLV-1 infection. We apply our algorithm to a set of SELDI mass-spectral data and identify a number of putative biomarkers. Additional experimental work, together with known results from the literature, provides corroborating evidence for the validity of these putative biomarkers. Having focused on static observations, we then make the natural progression to time-course data sets. We propose a (Bayesian) bootstrap approach for such data and apply our method in the context of gene network inference and the estimation of parameters in ordinary differential equation models. We find that the inferred gene networks are relatively unstable, and we demonstrate the importance of finding distributions of ODE parameter estimates rather than single point estimates.
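    The selection-stability idea can be sketched with a bootstrap over a sparse selector: a feature is retained only if it is chosen consistently across resamples, not merely because it predicts well on one draw. This is a generic illustration using scikit-learn's Lasso, not the thesis's exact algorithm; the penalty and the 0.8 frequency threshold are assumptions.

        import numpy as np
        from sklearn.linear_model import Lasso
        from sklearn.utils import resample

        rng = np.random.default_rng(0)
        n, p = 60, 200  # "large p, small n"
        X = rng.normal(size=(n, p))
        y = X[:, 0] - 2 * X[:, 1] + 0.5 * rng.normal(size=n)

        B = 100
        selection_counts = np.zeros(p)
        for b in range(B):
            Xb, yb = resample(X, y, random_state=b)  # bootstrap resample
            selection_counts += Lasso(alpha=0.1).fit(Xb, yb).coef_ != 0

        stable = np.where(selection_counts / B > 0.8)[0]
        print("stably selected features:", stable)   # ideally features 0 and 1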

    Model Parameter Calibration in Power Systems

    In power systems, accurate device modeling is crucial for grid reliability, availability, and resiliency. Many critical tasks, such as planning and even real-time operation decisions, rely on accurate models. This research presents an approach to model parameter calibration in power system models using deep learning. Existing calibration methods are based on mathematical approaches that suffer from being ill-posed and may thus admit multiple solutions. We address this problem with a deep learning architecture trained to estimate model parameters from simulated Phasor Measurement Unit (PMU) data; data recorded after system disturbances has proved to contain valuable information for verifying power system devices. A quantitative evaluation of the system is provided: the model estimates parameters with an MSE of 0.017 on the test dataset. We also show that the proposed system scales under the same topology. We consider these promising results a basis for further exploration and development of additional tools for parameter calibration.
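    A hedged sketch of the approach as described: a network is trained on simulated disturbance recordings to learn the inverse map from PMU-like time series back to device parameters. The toy damped-oscillator simulator and the network below are illustrative assumptions, not the actual power-system model or architecture.

        import torch

        def simulate(params, t):  # toy "device": damped oscillation
            d, f = params[:, :1], params[:, 1:]
            return torch.exp(-d * t) * torch.cos(2 * torch.pi * f * t)

        t = torch.linspace(0, 2, 120).reshape(1, -1)
        true_params = torch.rand(1000, 2) * torch.tensor([2.0, 3.0])
        signals = simulate(true_params, t)  # simulated "PMU" training data

        # Regression network: disturbance waveform in, model parameters out.
        net = torch.nn.Sequential(torch.nn.Linear(120, 64), torch.nn.ReLU(),
                                  torch.nn.Linear(64, 2))
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        for _ in range(500):
            opt.zero_grad()
            loss = torch.nn.functional.mse_loss(net(signals), true_params)
            loss.backward()
            opt.step()
        print("training MSE:", loss.item())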

    Analogue neuromorphic systems.

    This thesis addresses a new area of science and technology, that of neuromorphic systems, namely the problems and prospects of analogue neuromorphic systems. The subject is divided into three chapters. Chapter 1 is an introduction. It formulates the oncoming problem of creating highly computationally costly systems for nonlinear information processing (such as artificial neural networks and artificial intelligence systems) and shows that analogue technology could make a vital contribution to the creation of such systems. The basic principles for creating analogue neuromorphic systems are formulated, and the importance of the principle of orthogonality for future highly efficient complex information-processing systems is emphasised. Chapter 2 reviews the basics of neural and neuromorphic systems and surveys the present state of this field of research, covering both experimental and theoretical knowledge gained to date. The chapter provides the necessary background for correct interpretation of the results reported in Chapter 3 and for a realistic decision on the direction of future work. Chapter 3 describes my own experimental and computational results within the framework of the subject, obtained at De Montfort University. These include the building of (i) an analogue polynomial approximator/interpolator/extrapolator, (ii) a synthesiser of orthogonal functions, (iii) an analogue real-time video filter (performing homomorphic filtering), (iv) an adaptive polynomial compensator of geometrical distortions of CRT monitors, and (v) an analogue parallel-learning neural network (backpropagation algorithm). Thus, this thesis makes a dual contribution to the chosen field: it summarises present knowledge on the possibility of utilising analogue technology in current and future computational systems, and it reports new results within the framework of the subject. The main conclusion is that, owing to their promising power characteristics, small size, and high tolerance to degradation, analogue neuromorphic systems will play an increasingly important role in future computational systems (in particular, in systems of artificial intelligence).
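    As a digital illustration of the orthogonality principle highlighted in Chapter 1, which items (i) and (ii) realise in analogue hardware, the sketch below approximates a waveform in an orthogonal Legendre basis by least squares; the waveform and the degree are arbitrary choices, not taken from the thesis.

        import numpy as np
        from numpy.polynomial import legendre

        x = np.linspace(-1, 1, 400)
        signal = np.exp(x) * np.sin(3 * x)          # arbitrary waveform

        coeffs = legendre.legfit(x, signal, deg=5)  # least squares in Legendre basis
        approx = legendre.legval(x, coeffs)
        print("max approximation error:", np.max(np.abs(signal - approx)))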