Analyzing sparse dictionaries for online learning with kernels
Many signal processing and machine learning methods share essentially the
same linear-in-the-parameters model, with as many parameters as available
samples, as in kernel-based machines. Sparse approximation is essential in many
disciplines, and new challenges emerge in online learning with kernels. To
this end, several sparsity measures have been proposed in the literature to
quantify the sparsity of dictionaries and to construct relevant ones, the most
prominent being the distance, approximation, coherence, and Babel
measures. In this paper, we analyze sparse dictionaries based on these
measures. By conducting an eigenvalue analysis, we show that these sparsity
measures share many properties, including the linear independence condition and
inducing a well-posed optimization problem. Furthermore, we prove that there
exists a quasi-isometry between the parameter (i.e., dual) space and the
dictionary's induced feature space.
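As a concrete illustration of two of these notions, the coherence of a dictionary and the eigenvalue-based well-posedness check can both be read off the Gram matrix. The sketch below is illustrative only (not the paper's code) and uses plain linear dictionaries rather than kernel feature maps:

```python
import numpy as np

def coherence(D):
    # Coherence: largest absolute inner product between distinct
    # normalized dictionary atoms (columns of D). Lies in [0, 1].
    Dn = D / np.linalg.norm(D, axis=0)
    G = Dn.T @ Dn
    np.fill_diagonal(G, 0.0)
    return np.abs(G).max()

def is_well_posed(D, tol=1e-10):
    # The atoms are linearly independent iff the Gram matrix is
    # positive definite, i.e. its smallest eigenvalue is > 0,
    # which makes the least-squares problem well-posed.
    G = D.T @ D
    return np.linalg.eigvalsh(G).min() > tol

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 5))  # 5 atoms in R^50
print(coherence(D), is_well_posed(D))
```

A small coherence bounds the eigenvalue spread of the Gram matrix, which is the link between the coherence measure and well-posedness exploited by the eigenvalue analysis.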
A reduced labeled samples (RLS) framework for classification of imbalanced concept-drifting streaming data
Stream processing frameworks are designed to process streaming data that arrives over time. An example of such data is the stream of emails a user receives every day. Most real-world data streams are also imbalanced, as in the stream of emails, which contains few spam emails compared to many legitimate ones. Classifying an imbalanced data stream is challenging for several reasons. First, data streams are huge and cannot be stored in memory for one-time processing. Second, if the data is imbalanced, the accuracy of the majority class mostly dominates the results. Third, data streams change over time, which degrades model performance; hence the model should be updated when such changes are detected. Finally, the true labels of all samples are not available immediately after classification, and only a fraction of the data can be labeled in real-world applications, because labeling is expensive and time-consuming. In this thesis, a framework for modeling streaming data whose classes are imbalanced is proposed, called Reduced Labeled Samples (RLS). RLS is a chunk-based learning framework that builds a model from a partially labeled data stream and updates it when the characteristics of the data change. In RLS, only a fraction of the samples are labeled and used in modeling, yet the performance is not significantly different from that of 100% labeling. RLS maintains an ensemble of classifiers to boost performance. It uses the information from labeled data in a supervised fashion and is also extended to use the information from unlabeled data in a semi-supervised fashion. RLS addresses both binary and multi-class partially labeled data streams, and the results show that the basis of RLS is effective even in the context of multi-class classification problems.
Overall, RLS is shown to be an effective framework for processing imbalanced and partially labeled data streams.
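The chunk-based, bounded-ensemble idea can be sketched as follows. This is a minimal illustration, not the authors' RLS implementation: the base learner (a nearest-centroid classifier), the ensemble size, and all names are chosen here for brevity:

```python
import numpy as np

class Centroid:
    # Tiny base learner: predict the class of the nearest centroid.
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        return self.classes_[d.argmin(axis=1)]

class ChunkEnsemble:
    """Chunk-based ensemble sketch: train one member per chunk on its
    labelled fraction only, keep the most recent members (bounded
    memory), and predict by majority vote."""
    def __init__(self, max_members=5):
        self.max_members = max_members
        self.members = []

    def update(self, X_labelled, y_labelled):
        # A new chunk arrived: fit a member on its labelled samples,
        # dropping the oldest member once the ensemble is full.
        self.members.append(Centroid().fit(X_labelled, y_labelled))
        self.members = self.members[-self.max_members:]

    def predict(self, X):
        votes = np.stack([m.predict(X) for m in self.members])
        # Majority vote over ensemble members, per sample.
        return np.array([np.bincount(col).argmax() for col in votes.T])
```

Retiring old members is one simple way the ensemble adapts when the characteristics of the stream change; the labelled fraction per chunk is the knob that trades labeling cost against accuracy.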
Stream-based active learning for sliding windows under the influence of verification latency
Stream-based active learning (AL) strategies minimize the labeling effort by querying the labels that improve the classifier's performance the most. So far, these strategies neglect the fact that an oracle or expert requires time to provide a queried label. We show that existing AL methods deteriorate or even fail under the influence of such verification latency. The problem with these methods is that they estimate a label's utility on the currently available labeled data. However, by the time this label arrives, some of the current data may have become outdated and new labels may have arrived. In this article, we propose to simulate the data that will be available at the time the label arrives. To this end, our method Forgetting and Simulating (FS) forgets outdated information and simulates the delayed labels to obtain more realistic utility estimates. We assume that the label's arrival date is known a priori and that the classifier's training data is bounded by a sliding window. Our extensive experiments show that FS improves stream-based AL strategies in settings with both constant and variable verification latency.
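The forget-and-simulate step can be sketched in a few lines. All names here are illustrative (this is not the paper's API): the sliding window holds timestamped labelled samples, pending queries whose labels are in flight are filled in with the current classifier's prediction, and samples that will have left the window by the label's arrival time are dropped:

```python
def simulate_training_set(window, in_flight, t_arrival, window_len, predict):
    """Sketch of the Forgetting-and-Simulating idea.

    window     : list of (t, x, y) labelled samples currently in the window
    in_flight  : list of (t_due, x) queries whose labels are still pending
    t_arrival  : time at which the candidate label would arrive
    window_len : length of the sliding window
    predict    : current classifier, used to simulate pending labels
    """
    # Forget: drop samples that will have fallen out of the sliding
    # window by the time the candidate label arrives.
    horizon = t_arrival - window_len
    kept = [(t, x, y) for (t, x, y) in window if t >= horizon]
    # Simulate: pending labels due before t_arrival are filled in with
    # the classifier's current prediction.
    simulated = [(t_due, x, predict(x))
                 for (t_due, x) in in_flight if t_due <= t_arrival]
    return kept + simulated
```

The utility of a candidate query is then estimated on this simulated training set rather than on the stale current one, which is what makes the estimate realistic under verification latency.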
Semi-Supervised Learning for Diagnosing Faults in Electromechanical Systems
Safe and reliable operation of such systems relies on online condition monitoring and diagnostic systems that aim to take immediate action upon the occurrence of a fault. Machine learning techniques are widely used for designing data-driven diagnostic models. The training procedure of a data-driven model usually requires a large amount of labeled data, which may not always be practical. This problem can be untangled by resorting to semi-supervised learning approaches, which enable decision making using only a small number of labeled samples coupled with a large number of unlabeled samples. Thus, it is crucial to conduct a critical study of the use of semi-supervised learning for the purpose of fault diagnosis. Another issue of concern is fault diagnosis in non-stationary environments, where data streams evolve over time and, as a result, model-based approaches and most data-driven models are impractical. In this work, this is addressed by means of an adaptive data-driven diagnostic model.
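One common semi-supervised scheme that works from few labels is self-training. The abstract does not name the specific method used, so the sketch below is a generic illustration with a nearest-centroid base learner (two-class, for brevity); all names are chosen here:

```python
import numpy as np

def centroid_fit(X, y):
    # Base learner: one mean vector (centroid) per class.
    classes = np.unique(y)
    C = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, C

def centroid_predict(model, X):
    classes, C = model
    d = np.linalg.norm(X[:, None, :] - C[None], axis=2)
    # Crude two-class confidence: normalized margin between the
    # distances to the two centroids.
    conf = np.abs(d[:, 0] - d[:, 1]) / (d.sum(axis=1) + 1e-12)
    return classes[d.argmin(axis=1)], conf

def self_train(X_lab, y_lab, X_unlab, rounds=3, tau=0.3):
    # Self-training: repeatedly pseudo-label the confident unlabeled
    # points and refit on the enlarged labeled set.
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        model = centroid_fit(X, y)
        if len(X_unlab) == 0:
            break
        pred, conf = centroid_predict(model, X_unlab)
        take = conf > tau
        if not take.any():
            break
        X = np.vstack([X, X_unlab[take]])
        y = np.concatenate([y, pred[take]])
        X_unlab = X_unlab[~take]
    return centroid_fit(X, y)
```

The confidence threshold `tau` controls how aggressively unlabeled samples are absorbed; in a diagnostic setting, too low a threshold risks reinforcing early misclassifications of rare fault classes.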
Explicit diversity in Extreme Learning Machine ensemble models
Extreme Learning Machine (ELM) has proven to be a fast machine learning algorithm, suitable for regression and classification problems. In order to generalize the results of the standard ELM, several ensemble methods have been developed. These ensemble methods are meta-algorithms that generalize ELM results by generating several base predictors whose predictions are combined into a final ensemble prediction. Most of these methods rely on data sampling to generate different predictors and thereby achieve generalization. They assume that the training data are heterogeneous enough for the generated predictors to be diverse. In this thesis, ensemble methods that promote diversity explicitly are proposed, avoiding the assumption that the data must be sampled in a diverse way. Diversity is promoted through the objective functions of the ELMs, using ideas from the Negative Correlation Learning (NCL) framework. Formulating diversity through the ELM objective function makes it possible to derive an analytical solution for the parameters of the base ELMs. This significantly reduces the computational cost compared with the classical NCL version for artificial neural networks. Additionally, the proposed ensemble methods have been validated through experimental studies on benchmark datasets, comparing them with ensemble methods existing in the ELM literature.
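The analytic training step that makes ELM fast can be sketched as follows: random, fixed hidden-layer weights followed by a closed-form ridge solution for the output weights. This is a generic single-ELM sketch, not the thesis's code; the NCL diversity penalty described above would add a coupling term between ensemble members to the objective while, per the thesis, keeping the solution analytic:

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, reg=1e-3, rng=None):
    """Train a single ELM: random input weights, analytic (ridge)
    solution for the output weights beta."""
    rng = rng or np.random.default_rng(0)
    W = rng.standard_normal((X.shape[1], n_hidden))  # fixed random input weights
    b = rng.standard_normal(n_hidden)                # fixed random biases
    H = np.tanh(X @ W + b)                           # random hidden-layer features
    # Closed-form ridge regression: beta = (H'H + reg*I)^-1 H'y
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta
```

Because the only trained parameters solve a linear system, there is no iterative backpropagation, which is the source of the computational advantage over the classical NCL training of neural networks.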
A Survey on Semi-Supervised Learning for Delayed Partially Labelled Data Streams
Unlabelled data appear in many domains and are particularly relevant to
streaming applications, where even though data is abundant, labelled data is
rare. To address the learning problems associated with such data, one can
ignore the unlabelled data and focus only on the labelled data (supervised
learning); use the labelled data and attempt to leverage the unlabelled data
(semi-supervised learning); or assume some labels will be available on request
(active learning). The first approach is the simplest, yet the amount of
labelled data available will limit the predictive performance. The second
relies on finding and exploiting the underlying characteristics of the data
distribution. The third depends on an external agent to provide the required
labels in a timely fashion. This survey pays special attention to methods that
leverage unlabelled data in a semi-supervised setting. We also discuss the
delayed labelling issue, which impacts both fully supervised and
semi-supervised methods. We propose a unified problem setting, discuss the
learning guarantees and existing methods, and explain the differences between
related problem settings. Finally, we review current benchmarking practices
and propose adaptations to enhance them.
Sequential nonlinear learning
We study sequential nonlinear learning in an individual-sequence manner, where
we provide results that are guaranteed to hold without any statistical assumptions.
We address the convergence and undertraining issues of conventional nonlinear
regression methods and introduce algorithms that elegantly mitigate these
issues using nested tree structures. To this end, in the second chapter, we introduce
algorithms that adapt not only their regression functions but also the complete
tree structure while achieving the performance of the best linear mixture of
a doubly exponential number of partitions, with a computational complexity only
polynomial in the number of nodes of the tree. In the third chapter, we propose an
incremental decision tree structure and using this model, we introduce an online
regression algorithm that partitions the regressor space in a data driven manner.
We prove that the proposed algorithm sequentially and asymptotically achieves
the performance of the optimal twice differentiable regression function for any
data sequence with an unknown and arbitrary length. The computational complexity
of the introduced algorithm is only logarithmic in the data length under
certain regularity conditions. In the fourth chapter, we construct an online finite
state (FS) predictor over hierarchical structures, whose computational complexity
is only linear in the hierarchy level. We prove that the introduced algorithm
asymptotically achieves the performance of the best linear combination of all FS
predictors defined over the hierarchical model in a deterministic manner and
in a mean square error sense in the steady-state for certain nonstationary models.
In the fifth chapter, we introduce a distributed subgradient based extreme learning
machine algorithm to train single hidden layer feedforward neural networks
(SLFNs). We show that using the proposed algorithm, each of the individual
SLFNs asymptotically achieves the performance of the optimal centralized batch
SLFN in a strong deterministic sense.
Vanlı, Nuri Denizcan (M.S. thesis)
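The individual-sequence guarantees above typically rest on exponentially weighted mixtures of experts; over the tree structures, the mixture is computed implicitly for a doubly exponential number of partitions. The sketch below shows only the basic weighting scheme over a handful of explicit experts (illustrative, not the thesis's tree algorithms):

```python
import numpy as np

def exp_weight_mixture(predictions, targets, eta=0.5):
    """Exponentially weighted mixture for online regression.

    predictions : (T, K) array, the K experts' predictions at each step
    targets     : (T,) array of observed values
    eta         : learning rate of the weight update
    Returns the per-step squared losses of the mixture and the final weights.
    """
    n_experts = predictions.shape[1]
    w = np.ones(n_experts) / n_experts
    losses = np.zeros(len(targets))
    for t, (p, y) in enumerate(zip(predictions, targets)):
        yhat = w @ p                         # mixture prediction
        losses[t] = (yhat - y) ** 2
        w = w * np.exp(-eta * (p - y) ** 2)  # down-weight poor experts
        w = w / w.sum()
    return losses, w
```

The cumulative loss of such a mixture provably tracks the best expert in hindsight for any data sequence; the contribution of the tree-based algorithms is to run this competition against exponentially many partition-defined experts at polynomial cost.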
Adaptive Sampling For Efficient Online Modelling
This thesis examines methods enabling autonomous systems to make active sampling and planning decisions in real time. Gaussian Process (GP) regression is chosen as a framework for its non-parametric approach, allowing flexibility in unknown environments. The first part of the thesis focuses on depth-constrained full-coverage bathymetric surveys in unknown environments. Algorithms are developed to find and follow a depth contour, modelled with a GP, and produce a depth-constrained boundary. An extension to the Boustrophedon Cellular Decomposition, Discrete Monotone Polygonal Partitioning, is developed, allowing efficient planning for coverage within this boundary. Efficient computational methods such as incremental Cholesky updates are implemented to allow online hyperparameter optimisation and fitting of the GPs. This is demonstrated in simulation and in the field on a platform built for the purpose. The second part of this thesis focuses on modelling the surface salinity profiles of estuarine tidal fronts. The standard GP model assumes evenly distributed noise, which does not always hold; this can be handled with a heteroscedastic noise model. An efficient new method, Parametric Heteroscedastic Gaussian Process regression, is proposed. This is applied to active sample selection on stationary fronts and adaptive planning on moving fronts, where a number of information-theoretic methods are compared. The use of a mean function is shown to increase the accuracy of predictions whilst reducing optimisation time. These algorithms are validated in simulation. Algorithmic development is focused on efficient methods allowing deployment on platforms with constrained computational resources. Whilst the application of this thesis is Autonomous Surface Vessels, it is hoped the issues discussed and solutions provided have relevance to other applications in robotics and wider fields such as spatial statistics and machine learning in general.
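The incremental Cholesky update mentioned above is the standard trick for online GP fitting: when one training point is appended, the factor of the kernel matrix grows by one row in O(n^2) instead of being refactorised in O(n^3). A minimal sketch (illustrative, not the thesis's implementation; a dedicated triangular solve would be used in practice):

```python
import numpy as np

def chol_append(L, k_new_col, k_new_diag):
    """Grow the Cholesky factor of a kernel matrix by one point.

    L          : (n, n) lower-triangular factor of the current K
    k_new_col  : (n,) kernel values between the new point and old points
    k_new_diag : kernel value of the new point with itself (incl. noise)
    """
    n = L.shape[0]
    l = np.linalg.solve(L, k_new_col)  # forward substitution for the new row
    d = np.sqrt(k_new_diag - l @ l)    # new diagonal entry
    L_new = np.zeros((n + 1, n + 1))
    L_new[:n, :n] = L
    L_new[n, :n] = l
    L_new[n, n] = d
    return L_new
```

Since GP predictions and the marginal likelihood are both computed from this factor, keeping it up to date incrementally is what makes online hyperparameter optimisation feasible on a computationally constrained platform.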