4,534 research outputs found
Bankruptcy Prediction: A Comparison of Some Statistical and Machine Learning Techniques
We are interested in forecasting bankruptcies in a probabilistic way. Specifically, we compare the classification performance of several statistical and machine-learning techniques, namely discriminant analysis (Altman's Z-score), logistic regression, least-squares support vector machines and different instances of Gaussian processes (GP's) -that is GP's classifiers, Bayesian Fisher discriminant and Warped GP's. Our contribution to the field of computational finance is to introduce GP's as a potentially competitive probabilistic framework for bankruptcy prediction. Data from the repository of information of the US Federal Deposit Insurance Corporation is used to test the predictions.Bankruptcy prediction, Artificial intelligence, Supervised learning, Gaussian processes, Z-score.
NIPS - Not Even Wrong? A Systematic Review of Empirically Complete Demonstrations of Algorithmic Effectiveness in the Machine Learning and Artificial Intelligence Literature
Objective: To determine the completeness of argumentative steps necessary to
conclude effectiveness of an algorithm in a sample of current ML/AI supervised
learning literature.
Data Sources: Papers published in the Neural Information Processing Systems
(NeurIPS, n\'ee NIPS) journal where the official record showed a 2017 year of
publication.
Eligibility Criteria: Studies reporting a (semi-)supervised model, or
pre-processing fused with (semi-)supervised models for tabular data.
Study Appraisal: Three reviewers applied the assessment criteria to determine
argumentative completeness. The criteria were split into three groups,
including: experiments (e.g real and/or synthetic data), baselines (e.g
uninformed and/or state-of-art) and quantitative comparison (e.g. performance
quantifiers with confidence intervals and formal comparison of the algorithm
against baselines).
Results: Of the 121 eligible manuscripts (from the sample of 679 abstracts),
99\% used real-world data and 29\% used synthetic data. 91\% of manuscripts did
not report an uninformed baseline and 55\% reported a state-of-art baseline.
32\% reported confidence intervals for performance but none provided references
or exposition for how these were calculated. 3\% reported formal comparisons.
Limitations: The use of one journal as the primary information source may not
be representative of all ML/AI literature. However, the NeurIPS conference is
recognised to be amongst the top tier concerning ML/AI studies, so it is
reasonable to consider its corpus to be representative of high-quality
research.
Conclusion: Using the 2017 sample of the NeurIPS supervised learning corpus
as an indicator for the quality and trustworthiness of current ML/AI research,
it appears that complete argumentative chains in demonstrations of algorithmic
effectiveness are rare
Spatio-temporal learning with the online finite and infinite echo-state Gaussian processes
Successful biological systems adapt to change. In this paper, we are principally concerned with adaptive systems that operate in environments where data arrives sequentially and is multivariate in nature, for example, sensory streams in robotic systems. We contribute two reservoir inspired methods: 1) the online echostate Gaussian process (OESGP) and 2) its infinite variant, the online infinite echostate Gaussian process (OIESGP) Both algorithms are iterative fixed-budget methods that learn from noisy time series. In particular, the OESGP combines the echo-state network with Bayesian online learning for Gaussian processes. Extending this to infinite reservoirs yields the OIESGP, which uses a novel recursive kernel with automatic relevance determination that enables spatial and temporal feature weighting. When fused with stochastic natural gradient descent, the kernel hyperparameters are iteratively adapted to better model the target system. Furthermore, insights into the underlying system can be gleamed from inspection of the resulting hyperparameters. Experiments on noisy benchmark problems (one-step prediction and system identification) demonstrate that our methods yield high accuracies relative to state-of-the-art methods, and standard kernels with sliding windows, particularly on problems with irrelevant dimensions. In addition, we describe two case studies in robotic learning-by-demonstration involving the Nao humanoid robot and the Assistive Robot Transport for Youngsters (ARTY) smart wheelchair
Is SGD a Bayesian sampler? Well, almost
Overparameterised deep neural networks (DNNs) are highly expressive and so
can, in principle, generate almost any function that fits a training dataset
with zero error. The vast majority of these functions will perform poorly on
unseen data, and yet in practice DNNs often generalise remarkably well. This
success suggests that a trained DNN must have a strong inductive bias towards
functions with low generalisation error. Here we empirically investigate this
inductive bias by calculating, for a range of architectures and datasets, the
probability that an overparameterised DNN, trained with
stochastic gradient descent (SGD) or one of its variants, converges on a
function consistent with a training set . We also use Gaussian processes
to estimate the Bayesian posterior probability that the DNN
expresses upon random sampling of its parameters, conditioned on .
Our main findings are that correlates remarkably well with
and that is strongly biased towards low-error and
low complexity functions. These results imply that strong inductive bias in the
parameter-function map (which determines ), rather than a special
property of SGD, is the primary explanation for why DNNs generalise so well in
the overparameterised regime.
While our results suggest that the Bayesian posterior is the
first order determinant of , there remain second order
differences that are sensitive to hyperparameter tuning. A function probability
picture, based on and/or , can shed new light
on the way that variations in architecture or hyperparameter settings such as
batch size, learning rate, and optimiser choice, affect DNN performance
Quantum machine learning: a classical perspective
Recently, increased computational power and data availability, as well as
algorithmic advances, have led machine learning techniques to impressive
results in regression, classification, data-generation and reinforcement
learning tasks. Despite these successes, the proximity to the physical limits
of chip fabrication alongside the increasing size of datasets are motivating a
growing number of researchers to explore the possibility of harnessing the
power of quantum computation to speed-up classical machine learning algorithms.
Here we review the literature in quantum machine learning and discuss
perspectives for a mixed readership of classical machine learning and quantum
computation experts. Particular emphasis will be placed on clarifying the
limitations of quantum algorithms, how they compare with their best classical
counterparts and why quantum resources are expected to provide advantages for
learning problems. Learning in the presence of noise and certain
computationally hard problems in machine learning are identified as promising
directions for the field. Practical questions, like how to upload classical
data into quantum form, will also be addressed.Comment: v3 33 pages; typos corrected and references adde
Double-descent curves in neural networks: a new perspective using Gaussian processes
Double-descent curves in neural networks describe the phenomenon that the
generalisation error initially descends with increasing parameters, then grows
after reaching an optimal number of parameters which is less than the number of
data points, but then descends again in the overparameterised regime. Here we
use a neural network Gaussian process (NNGP) which maps exactly to a fully
connected network (FCN) in the infinite width limit, combined with techniques
from random matrix theory, to calculate this generalisation behaviour, with a
particular focus on the overparameterised regime. An advantage of our NNGP
approach is that the analytical calculations are easier to interpret. We argue
that neural network generalization performance improves in the
overparameterised regime precisely because that is where they converge to their
equivalent Gaussian process
Online Ensemble Learning of Sensorimotor Contingencies
Forward models play a key role in cognitive agents by providing predictions of the sensory consequences of motor commands, also known as sensorimotor contingencies (SMCs). In continuously evolving environments, the ability to anticipate is fundamental in distinguishing cognitive from reactive agents, and it is particularly relevant for autonomous robots, that must be able to adapt their models in an online manner. Online learning skills, high accuracy of the forward models and multiple-step-ahead predictions are needed to enhance the robots’ anticipation capabilities. We propose an online heterogeneous ensemble learning method for building accurate forward models of SMCs relating motor commands to effects in robots’ sensorimotor system, in particular considering proprioception and vision. Our method achieves up to 98% higher accuracy both in short and long term predictions, compared to single predictors and other online and offline homogeneous ensembles. This method is validated on two different humanoid robots, namely the iCub and the Baxter
- …