Extended evaluation of the Volterra-Neural Network for model compression
When large models are used for a classification task, model compression is necessary to meet transmission, space, time or computing constraints. Multilayer Perceptron (MLP) models are traditionally used as classifiers, but depending on the problem they may need a large number of parameters (neuron functions, weights and biases) to achieve acceptable performance. This work extends the evaluation of a technique for compressing an array of MLPs through the outputs of a Volterra-Neural Network (Volterra-NN) while maintaining its classification performance. The results show that these outputs can be used to build an array of Volterra-NNs that needs significantly fewer parameters than the original array of MLPs while achieving the same high accuracy in most cases. The Volterra-NN compression capabilities have been tested on several kinds of classification problems. Experimental results are presented on three well-known databases: Letter Recognition, Pen-Based Recognition of Handwritten Digits, and Face Recognition.
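A minimal sketch of the compression idea, under stated assumptions: a trained MLP is mimicked by a truncated second-order Volterra (polynomial) expansion feeding a linear read-out, which typically needs far fewer parameters. The scikit-learn stand-ins, toy data, and second-order truncation are illustrative, not the paper's exact Volterra-NN construction.

```python
# Sketch: compress an MLP classifier into a second-order Volterra surrogate.
# All model choices below are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 16))                    # toy inputs
y = (X[:, 0] * X[:, 1] + X[:, 2] > 0).astype(int)  # toy labels

# "Large" reference model: an MLP classifier.
mlp = MLPClassifier(hidden_layer_sizes=(128, 128), max_iter=300).fit(X, y)

# Compressed surrogate: second-order Volterra terms (1, x_i, x_i * x_j)
# feeding a linear read-out, trained to reproduce the MLP's decisions.
volterra = PolynomialFeatures(degree=2, include_bias=True)
Z = volterra.fit_transform(X)
surrogate = LogisticRegression(max_iter=1000).fit(Z, mlp.predict(X))

n_mlp = sum(w.size for w in mlp.coefs_) + sum(b.size for b in mlp.intercepts_)
n_vnn = surrogate.coef_.size + surrogate.intercept_.size
print(f"MLP parameters: {n_mlp}, Volterra surrogate parameters: {n_vnn}")
print("agreement with MLP:", (surrogate.predict(Z) == mlp.predict(X)).mean())
```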
Machine learning for optical fiber communication systems: An introduction and overview
Optical networks generate a vast amount of diagnostic, control and performance monitoring data. When information is extracted from this data, reconfigurable network elements and reconfigurable transceivers allow the network to adapt both to changes in the physical infrastructure and to changing traffic conditions. Machine learning is emerging as a disruptive technology for extracting useful information from this raw data to enable enhanced planning, monitoring and dynamic control. We provide a survey of the recent literature and highlight numerous promising avenues for machine learning applied to optical networks, including explainable machine learning, digital twins, and approaches in which we embed our knowledge into the machine learning, such as physics-informed machine learning for the physical layer and graph-based machine learning for the networking layer.
Optics for AI and AI for Optics
Artificial intelligence is deeply involved in our daily lives, reinforcing the digital transformation of modern economies and infrastructure. It relies on powerful computing clusters, which face bottlenecks of power consumption for both data transmission and intensive computing. Meanwhile, optics (especially optical communications, which underpin today's telecommunications) is penetrating short-reach connections down to the chip level, thus meeting AI technology and creating numerous opportunities. This book is about the marriage of optics and AI and how each part can benefit from the other. Optics facilitates on-chip neural networks based on fast optical computing and energy-efficient interconnects and communications. In turn, AI enables efficient tools to address the challenges of today's optical communication networks, which behave in an increasingly complex manner. The book collects contributions from pioneering researchers from both academia and industry to discuss the challenges and solutions in each of the respective fields.
A Hybrid Approach of Traffic Flow Prediction Using Wavelet Transform and Fuzzy Logic
The rapid development of urban areas and the increasing size of vehicle fleets are causing severe traffic congestion. According to traffic index data (TomTom Traffic Index 2016), most of the larger cities in Canada rank between 30th and 100th among the most traffic-congested cities in the world. A recent study by the CAA (Canadian Automobile Association) concludes that traffic congestion costs drivers 11.5 million hours and 22 million litres of fuel each year, causing billions of dollars in lost revenue. Although active research has been going on for four decades to improve transportation management, statistical data shows the demand for new methods to predict traffic flow with improved accuracy. This research presents a hybrid approach that applies a wavelet transform to a time-frequency (traffic count/hour) signal to determine sharp variation points of traffic flow. Datasets between sharp variation points reveal segments of data with similar trends. These data segments are used to construct fuzzy membership sets by categorizing the processed data together with other recorded information such as time, season, and weather. When real-time data is compared with the historical data using fuzzy IF-THEN rules, a matched dataset represents a reliable source of information for traffic prediction. In addition to the proposed new method, this work also includes experimental results demonstrating improved accuracy for long-term traffic flow prediction.
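A rough sketch of the two stages, under stated assumptions: PyWavelets flags sharp variation points via large level-1 detail coefficients, and a hand-rolled triangular membership function scores how well a live reading matches each historical segment. The synthetic data, the db4 wavelet, and the 3-sigma threshold are hypothetical choices, not the paper's configuration.

```python
# Sketch: wavelet-based change detection plus a simple fuzzy match.
import numpy as np
import pywt

rng = np.random.default_rng(0)
# Synthetic hourly counts with one sharp regime change (stand-in for real data).
counts = np.concatenate([200 + 20 * rng.normal(size=240),
                         420 + 20 * rng.normal(size=240)])

# Level-1 detail coefficients respond to abrupt changes in the signal.
_, detail = pywt.dwt(counts, "db4")
threshold = detail.mean() + 3 * detail.std()
change_idx = np.flatnonzero(np.abs(detail) > threshold) * 2  # undo downsampling

# Split the series at sharp variation points into similar-trend segments.
segments = [s for s in np.split(counts, change_idx) if len(s) > 2]

def triangular_membership(x, seg):
    """Degree to which flow x belongs to a segment's typical flow level."""
    lo, peak, hi = float(seg.min()), float(seg.mean()), float(seg.max())
    if hi == lo:                        # degenerate (flat) segment
        return 1.0 if x == peak else 0.0
    if x <= lo or x >= hi:
        return 0.0
    return (x - lo) / (peak - lo) if x < peak else (hi - x) / (hi - peak)

# IF the live count matches a historical segment THEN reuse its trend.
live = 412.0                            # current hourly count (example value)
scores = [triangular_membership(live, s) for s in segments]
best = int(np.argmax(scores))
print(f"best-matching historical segment: {best}, membership {scores[best]:.2f}")
```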
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.
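As a concrete illustration of the TT format discussed above, the sketch below implements a plain TT-SVD in NumPy: sequential truncated SVDs factor a higher-order tensor into 3-way cores whose storage grows linearly with the order. The threshold-based rank truncation is a simplification of the schemes covered in the monograph.

```python
# Sketch: TT-SVD decomposition and reconstruction in plain NumPy.
import numpy as np

def tt_svd(tensor, eps=1e-10):
    """Decompose `tensor` into TT cores G_k of shape (r_{k-1}, n_k, r_k)."""
    dims = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k, n in enumerate(dims[:-1]):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        keep = max(1, int((S > eps * S[0]).sum()))   # truncation rank
        cores.append(U[:, :keep].reshape(rank, n, keep))
        mat = (S[:keep, None] * Vt[:keep]).reshape(keep * dims[k + 1], -1)
        rank = keep
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

# A rank-1 test tensor; contract the train back and check the error.
T = np.einsum("i,j,k,l->ijkl", *[np.random.rand(n) for n in (4, 5, 6, 7)])
cores = tt_svd(T)
approx = cores[0]
for G in cores[1:]:
    approx = np.tensordot(approx, G, axes=([-1], [0]))
print("TT ranks:", [G.shape[2] for G in cores[:-1]])
print("relative error:",
      np.linalg.norm(approx.reshape(T.shape) - T) / np.linalg.norm(T))
```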
Inferential stability in systems biology
The modern biological sciences are fraught with statistical difficulties. Biomolecular
stochasticity, experimental noise, and the “large p, small n” problem all contribute to
the challenge of data analysis. Nevertheless, we routinely seek to draw robust, meaningful
conclusions from observations. In this thesis, we explore methods for assessing
the effects of data variability upon downstream inference, in an attempt to quantify and
promote the stability of the inferences we make.
We start with a review of existing methods for addressing this problem, focusing upon the
bootstrap and similar methods. The key requirement for all such approaches is a statistical
model that approximates the data generating process.
We move on to consider biomarker discovery problems. We present a novel algorithm for
proposing putative biomarkers on the strength of both their predictive ability and the stability
with which they are selected. In a simulation study, we find our approach to perform
favourably in comparison to strategies that select on the basis of predictive performance
alone.
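A hedged sketch of this kind of stability-aware selection, assuming an L1-penalised logistic model as the underlying selector: each feature is scored by how often a sparse predictive model picks it across bootstrap resamples, and only frequently selected features are proposed. The lasso selector and the 50% cut-off are illustrative stand-ins for the thesis's actual algorithm.

```python
# Sketch: score features by selection frequency across bootstrap resamples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 500))                    # "large p, small n" data
y = (X[:, 0] - X[:, 1] + 0.5 * rng.normal(size=120) > 0).astype(int)

n_boot, counts = 100, np.zeros(X.shape[1])
for _ in range(n_boot):
    idx = rng.integers(0, len(y), len(y))          # bootstrap resample
    model = LogisticRegression(penalty="l1", C=0.1, solver="liblinear")
    model.fit(X[idx], y[idx])
    counts += model.coef_[0] != 0                  # which features were kept

stable = np.flatnonzero(counts / n_boot > 0.5)     # kept in >50% of resamples
print("stable putative markers (feature indices):", stable)
```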
We then consider the real problem of identifying protein peak biomarkers for HAM/TSP,
an inflammatory condition of the central nervous system caused by HTLV-1 infection.
We apply our algorithm to a set of SELDI mass spectral data, and identify a number of
putative biomarkers. Additional experimental work, together with known results from the
literature, provides corroborating evidence for the validity of these putative biomarkers.
Having focused on static observations, we then make the natural progression to time
course data sets. We propose a (Bayesian) bootstrap approach for such data, and then
apply our method in the context of gene network inference and the estimation of parameters
in ordinary differential equation models. We find that the inferred gene networks
are relatively unstable, and demonstrate the importance of finding distributions of ODE
parameter estimates, rather than single point estimates.
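A minimal sketch of the Bayesian bootstrap idea for ODE parameters, assuming noisy observations of exponential decay dx/dt = -kx (solved analytically here for brevity): each Dirichlet weight draw yields a weighted fit, and the spread of the fitted k values, rather than a single point estimate, quantifies inferential stability. The model and weighting scheme are illustrative.

```python
# Sketch: Bayesian bootstrap over weighted least-squares ODE fits.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
t = np.linspace(0, 5, 30)
x_obs = np.exp(-0.8 * t) + 0.05 * rng.normal(size=t.size)   # true k = 0.8

def fit_k(weights):
    """Weighted least-squares estimate of the decay rate k."""
    loss = lambda k: np.sum(weights * (np.exp(-k * t) - x_obs) ** 2)
    return minimize_scalar(loss, bounds=(0.01, 5.0), method="bounded").x

# Bayesian bootstrap: Dirichlet(1,...,1) weights instead of resampling rows.
draws = [fit_k(rng.dirichlet(np.ones(t.size))) for _ in range(500)]
print(f"k: {np.mean(draws):.3f} +/- {np.std(draws):.3f} (posterior-like spread)")
```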
Model Parameter Calibration in Power Systems
In power systems, accurate device modeling is crucial for grid reliability, availability, and resiliency. Many critical tasks, such as planning and even real-time operational decisions, rely on accurate models. This research presents an approach to model parameter calibration in power system models using deep learning. Existing calibration methods are based on mathematical approaches that suffer from being ill-posed and thus may admit multiple solutions. We address this problem by applying a deep learning architecture trained to estimate model parameters from simulated Phasor Measurement Unit (PMU) data; data recorded after system disturbances has proved to contain valuable information for verifying power system devices. A quantitative evaluation of the system is provided. Results show high accuracy in estimating model parameters, with an MSE of 0.017 on the testing dataset. We also show that the proposed system scales under the same topology. We consider these promising results a basis for further exploration and development of additional tools for parameter calibration.
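A hedged sketch of the calibration idea: simulate post-disturbance PMU-like traces under known parameters, then train a network to invert the mapping from trace to parameters. The damped-oscillator "device", the small scikit-learn MLP, and the parameter ranges are illustrative assumptions, not the paper's simulator or architecture.

```python
# Sketch: learn device parameters from simulated post-disturbance traces.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
t = np.linspace(0, 2, 100)

def simulate_trace(damping, freq):
    """Toy post-disturbance PMU trace: decaying oscillation plus sensor noise."""
    return (np.exp(-damping * t) * np.cos(2 * np.pi * freq * t)
            + 0.01 * rng.normal(size=t.size))

params = rng.uniform([0.5, 0.5], [3.0, 2.0], size=(5000, 2))  # (damping, freq)
traces = np.array([simulate_trace(d, f) for d, f in params])

# Train a regressor to invert trace -> parameters, then evaluate held-out MSE.
model = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500)
model.fit(traces[:4000], params[:4000])
pred = model.predict(traces[4000:])
print("test MSE:", np.mean((pred - params[4000:]) ** 2))
```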
Analogue neuromorphic systems.
This thesis addresses a new area of science and technology, neuromorphic systems, and specifically the problems and prospects of analogue neuromorphic systems. The subject is subdivided into three chapters.
Chapter 1 is an introduction. It formulates the oncoming problem of creating highly computationally costly systems for nonlinear information processing (such as artificial neural networks and artificial intelligence systems), and shows that analogue technology could make a vital contribution to the creation of such systems. The basic principles for creating analogue neuromorphic systems are formulated, and the importance of the principle of orthogonality for future highly efficient complex information processing systems is emphasised.
Chapter 2 reviews the basics of neural and neuromorphic systems and informs on
the present situation in this field of research, including both experimental and theoretical
knowledge gained to date. The chapter provides the necessary background for
correct interpretation of the results reported in Chapter 3 and for a realistic decision on
the direction for future work.
Chapter 3 describes my own experimental and computational results within the framework of the subject, obtained at De Montfort University. These include the building of: (i) an analogue polynomial approximator/interpolator/extrapolator, (ii) a synthesiser of orthogonal functions, (iii) an analogue real-time video filter (performing homomorphic filtering), (iv) an adaptive polynomial compensator of geometrical distortions of CRT monitors, and (v) an analogue parallel-learning neural network (backpropagation algorithm).
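As a digital illustration of the orthogonality principle behind items (i) and (ii), the sketch below fits a nonlinear response with orthogonal Chebyshev basis functions, whose coefficients are largely decoupled from one another; that decoupling is what makes orthogonal bases attractive for independently tunable analogue stages. The target function and degrees are example choices.

```python
# Sketch: polynomial approximation in an orthogonal (Chebyshev) basis.
import numpy as np
from numpy.polynomial import chebyshev

x = np.linspace(-1, 1, 200)
target = np.sign(x) * np.sqrt(np.abs(x))       # example nonlinear response

# Raising the degree refines the fit; in an orthogonal basis the new
# coefficients barely disturb those already found.
for degree in (3, 7, 15):
    coeffs = chebyshev.chebfit(x, target, degree)
    approx = chebyshev.chebval(x, coeffs)
    print(f"degree {degree:2d}: max error {np.max(np.abs(approx - target)):.4f}")
```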
Thus, this thesis makes a dual contribution to the chosen field: it summarises present knowledge on the possibility of utilising analogue technology in current and future computational systems, and it reports new results within the framework of the subject. The main conclusion is that, due to their promising power characteristics, small size and high tolerance to degradation, analogue neuromorphic systems will play an increasingly important role in future computational systems (in particular in systems of artificial intelligence).