Broad Learning System Based on Maximum Correntropy Criterion
As an effective and efficient discriminative learning method, Broad Learning
System (BLS) has received increasing attention due to its outstanding
performance in various regression and classification problems. However, the
standard BLS is derived under the minimum mean square error (MMSE) criterion,
which is, of course, not always a good choice due to its sensitivity to
outliers. To enhance the robustness of BLS, we propose in this work to adopt
the maximum correntropy criterion (MCC) to train the output weights, obtaining
a correntropy based broad learning system (C-BLS). Thanks to the inherent
superiorities of MCC, the proposed C-BLS is expected to achieve excellent
robustness to outliers while maintaining the original performance of the
standard BLS in Gaussian or noise-free environments. In addition, three
alternative incremental learning algorithms for C-BLS are developed, derived
from a weighted regularized least-squares solution rather than the
pseudoinverse formula. With these incremental learning algorithms, the system
can be updated quickly, without retraining from scratch, when new samples
arrive or the network is deemed to need expansion. Experiments on
various regression and classification datasets are reported to demonstrate the
desirable performance of the new methods.
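As a rough illustration of why a correntropy objective resists outliers, the sketch below fits a straight line by repeatedly solving a weighted regularized least-squares problem whose weights come from a Gaussian kernel on the residuals. This fixed-point scheme is one common way to optimize a correntropy criterion; it is not taken from the C-BLS paper, and the function name `mcc_line_fit`, the kernel width `sigma`, and the regularizer `lam` are assumptions made for this example only.

```python
import math

def mcc_line_fit(xs, ys, sigma=2.0, lam=1e-6, iters=50):
    """Fit y ~ w*x + b under a correntropy objective (illustrative sketch).

    Each pass solves (A^T D A + lam*I) theta = A^T D y for rows A_i = [x_i, 1],
    then refreshes the diagonal weights D from the Gaussian kernel of the
    residuals. With iters=1 the result is the ordinary regularized
    least-squares fit, because the initial weights are all 1.
    """
    weights = [1.0] * len(xs)
    w, b = 0.0, 0.0
    for _ in range(iters):
        # Entries of the 2x2 weighted normal equations.
        sxx = sum(d * x * x for d, x in zip(weights, xs)) + lam
        sx = sum(d * x for d, x in zip(weights, xs))
        s1 = sum(weights) + lam
        sxy = sum(d * x * y for d, x, y in zip(weights, xs, ys))
        sy = sum(d * y for d, y in zip(weights, ys))
        det = sxx * s1 - sx * sx
        w = (sxy * s1 - sx * sy) / det
        b = (sxx * sy - sx * sxy) / det
        # Samples with large residuals get weights close to zero.
        weights = [
            math.exp(-((y - (w * x + b)) ** 2) / (2.0 * sigma ** 2))
            for x, y in zip(xs, ys)
        ]
    return w, b

# Points on y = 2x + 1, with the last target replaced by a gross outlier.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[9] = 100.0

ls_w, ls_b = mcc_line_fit(xs, ys, iters=1)   # plain least squares
mcc_w, mcc_b = mcc_line_fit(xs, ys)          # correntropy-style reweighting
```

On this toy data the single-pass least-squares slope is pulled far from 2 by the outlier, while the reweighted fit settles close to the underlying line because the outlier's kernel weight decays to essentially zero.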
Information Theoretical Estimators Toolbox
We present ITE (information theoretical estimators), a free and open source,
multi-platform, Matlab/Octave toolbox that is capable of estimating many
different variants of entropy, mutual information, divergence, association
measures, cross quantities, and kernels on distributions. Thanks to its highly
modular design, ITE additionally supports (i) the combinations of the
estimation techniques, (ii) the easy construction and embedding of novel
information theoretical estimators, and (iii) their immediate application in
information theoretical optimization problems. ITE also includes a prototype
application in a central problem class of signal processing, independent
subspace analysis and its extensions.
Comment: 5 pages; ITE toolbox: https://bitbucket.org/szzoli/ite
Application of entropy concepts to power system state estimation
Integrated master's thesis. Electrical and Computer Engineering (Energy Major). Faculdade de Engenharia, Universidade do Porto. 200
Density Preserving Sampling: Robust and Efficient Alternative to Cross-validation for Error Estimation
Estimation of the generalization ability of a classification
or regression model is an important issue, as it indicates
the expected performance on previously unseen data and is
also used for model selection. Currently used generalization
error estimation procedures, such as cross-validation (CV) or
bootstrap, are stochastic and, thus, require multiple repetitions
in order to produce reliable results, which can be computationally
expensive, if not prohibitive. The correntropy-inspired
density-preserving sampling (DPS) procedure proposed in this paper
eliminates the need for repeating the error estimation procedure
by dividing the available data into subsets that are guaranteed to
be representative of the input dataset. This allows the production
of low-variance error estimates with an accuracy comparable to
10 times repeated CV at a fraction of the computations required
by CV. This method can also be used for model ranking and
selection. This paper derives the DPS procedure and investigates
its usability and performance using a set of public benchmark
datasets and standard classifiers.
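The stochastic nature of cross-validation that motivates DPS can be seen in a small experiment. The sketch below is not the DPS procedure; it only shows that k-fold CV of a trivial mean predictor returns a different error estimate for each random shuffle of the same data. The helper `kfold_mse`, the synthetic data, and the choice of 5 folds are all assumptions made for illustration.

```python
import random

def kfold_mse(ys, k, seed):
    """k-fold CV estimate of the MSE of a predictor that outputs the training mean."""
    rng = random.Random(seed)
    idx = list(range(len(ys)))
    rng.shuffle(idx)                      # the only source of randomness
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [ys[i] for i in idx if i not in held_out]
        mean = sum(train) / len(train)
        errors.append(sum((ys[i] - mean) ** 2 for i in fold) / len(fold))
    return sum(errors) / k

# Fixed synthetic dataset; only the fold assignment changes between runs.
data_rng = random.Random(0)
ys = [data_rng.gauss(0.0, 1.0) for _ in range(30)]

estimates = [kfold_mse(ys, k=5, seed=s) for s in range(10)]
spread = max(estimates) - min(estimates)
```

The nonzero `spread` is the variability that repeated CV averages away at extra computational cost, and that a deterministic, representative split aims to avoid.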
Physically inspired methods and development of data-driven predictive systems.
Traditionally, building predictive models is perceived as a combination of both science and art. Although the designer of a predictive system effectively follows a prescribed procedure, the designer's domain knowledge, as well as expertise and intuition in the field of machine learning, are often irreplaceable. However, in many practical situations it is possible to build well-performing predictive systems by following a rigorous methodology and using computational power to offset not only the lack of domain knowledge but also a partial lack of expertise and intuition. The generalised predictive model development cycle discussed in this thesis is an example of such a methodology, which, despite being computationally expensive, has been successfully applied to real-world problems. The proposed predictive system design cycle is a purely data-driven approach. The quality of the data used to build the system is thus of crucial importance. In practice, however, the data is rarely perfect. Common problems include missing values, high dimensionality or a very limited number of labelled exemplars. In order to address these issues, this work investigated and exploited inspirations coming from physics. The novel use of well-established physical models in the form of potential fields has resulted in the derivation of a comprehensive Electrostatic Field Classification Framework for supervised and semi-supervised learning from incomplete data. Although computational power constantly becomes cheaper and more accessible, it is not infinite. Therefore, efficient techniques able to exploit the finite amount of predictive information in the data and limit the computational requirements of the resource-hungry predictive system design procedure are very desirable. In designing such techniques, this work once again investigated and exploited inspirations coming from physics. By using an analogy with a set of interacting particles and the resulting Information Theoretic Learning framework, the Density Preserving Sampling technique has been derived. This technique acts as a computationally efficient alternative to cross-validation, which fits well within the proposed methodology. All methods derived in this thesis have been thoroughly tested on a number of benchmark datasets. The proposed generalised predictive model design cycle has been successfully applied to two real-world environmental problems, in which a comparative study of Density Preserving Sampling and cross-validation has also been performed, confirming the great potential of the proposed methods.