Big Learning with Bayesian Methods
Explosive growth in data and availability of cheap computing resources have
sparked increasing interest in Big learning, an emerging subfield that studies
scalable machine learning algorithms, systems, and applications with Big Data.
Bayesian methods represent one important class of statistical methods for machine
learning, with substantial recent developments on adaptive, flexible and
scalable Bayesian learning. This article provides a survey of the recent
advances in Big learning with Bayesian methods, termed Big Bayesian Learning,
including nonparametric Bayesian methods for adaptively inferring model
complexity, regularized Bayesian inference for improving the flexibility via
posterior regularization, and scalable algorithms and systems based on
stochastic subsampling and distributed computing for dealing with large-scale
applications.
Comment: 21 pages, 6 figures
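Among the scalable algorithms the survey covers, stochastic-subsampling methods estimate full-data gradients from minibatches. As a hedged illustration of this family (not code from the survey; the function names and step-size schedule are illustrative assumptions), here is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), one representative subsampling-based sampler:

```python
import numpy as np

def sgld_sample(grad_log_prior, grad_log_lik, theta0, data,
                n_iters=1000, batch_size=100, step0=1e-3):
    """Draw approximate posterior samples using minibatch gradients
    plus injected Gaussian noise (Welling & Teh-style SGLD)."""
    theta = theta0.copy()
    N = len(data)
    samples = []
    for t in range(1, n_iters + 1):
        eps = step0 / t ** 0.55  # polynomially decaying step size (assumed schedule)
        idx = np.random.choice(N, batch_size, replace=False)
        # Unbiased minibatch estimate of the log-posterior gradient
        grad = grad_log_prior(theta) + (N / batch_size) * sum(
            grad_log_lik(theta, data[i]) for i in idx)
        theta = theta + 0.5 * eps * grad \
            + np.random.normal(0.0, np.sqrt(eps), size=theta.shape)
        samples.append(theta.copy())
    return samples
```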
Indirect inference through prediction
By recasting indirect inference estimation as a prediction rather than a
minimization and by using regularized regressions, we can bypass the three
major problems of estimation: selecting the summary statistics, defining the
distance function, and minimizing it numerically. By replacing regression
with classification, we can extend this approach to model selection as well. We
present three examples: a statistical fit, the parametrization of a simple real
business cycle model and heuristics selection in a fishery agent-based model.
The outcome is a method that automatically chooses summary statistics, weights
them, and uses them to parametrize models without running any direct
minimization.
Comment: Rmarkdown code to replicate the paper is available at
https://www.dropbox.com/s/zk0fi8dp5i18jav/indirectinference.Rmd?dl=
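To make the recasting concrete, a minimal sketch under stated assumptions: `simulate` and `summarize` are hypothetical placeholders for the model simulator and the summary-statistic map, and cross-validated ridge stands in for the regularized regressions the paper allows. Estimation becomes prediction: regress parameters on simulated summary statistics, then predict at the observed statistics.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

def estimate_by_prediction(simulate, summarize, param_draws, observed_data):
    """Indirect inference as prediction: fit a regularized regression
    from summary statistics to parameters on simulated data, then
    'estimate' by predicting at the observed summary statistics."""
    X = np.array([summarize(simulate(theta)) for theta in param_draws])
    y = np.array(param_draws)
    # The regularization path implicitly selects and weights the statistics
    reg = RidgeCV(alphas=np.logspace(-4, 2, 20)).fit(X, y)
    s_obs = np.asarray(summarize(observed_data)).reshape(1, -1)
    return reg.predict(s_obs)[0]  # no direct minimization over parameters
```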
The unified maximum a posteriori (MAP) framework for neuronal system identification
The functional relationship between an input and a sensory neuron's response
can be described by the neuron's stimulus-response mapping function. A general
approach for characterizing the stimulus-response mapping function is called
system identification. Many different names have been used for the
stimulus-response mapping function: kernel or transfer function, transducer,
spatiotemporal receptive field. Many algorithms have been developed to estimate
a neuron's mapping function from an ensemble of stimulus-response pairs. These
include the spike-triggered average, normalized reverse correlation, linearized
reverse correlation, ridge regression, local spectral reverse correlation,
spike-triggered covariance, artificial neural networks, maximally informative
dimensions, kernel regression, boosting, and models based on leaky
integrate-and-fire neurons. Because many of these system identification
algorithms were developed in other disciplines, they seem superficially very
different and bear little relationship to one another. Each algorithm
makes different assumptions about the neuron and how the data is generated.
Without a unified framework it is difficult to select the most suitable
algorithm for estimating the neuron's mapping function. In this review, we
present a unified framework for describing these algorithms called maximum a
posteriori estimation (MAP). In the MAP framework, the implicit assumptions
built into any system identification algorithm are made explicit in three MAP
constituents: model class, noise distributions, and priors. Understanding the
interplay between these three MAP constituents will simplify the task of
selecting the most appropriate algorithms for a given data set. The MAP
framework can also facilitate the development of novel system identification
algorithms by incorporating biophysically plausible assumptions and mechanisms
into the MAP constituents.
Comment: affiliations change
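As one concrete instance of the framework (a sketch, not the review's code): choosing a linear model class, Gaussian noise, and a zero-mean Gaussian prior makes the MAP estimate coincide with ridge regression, and with the regularizer sent to zero it recovers normalized reverse correlation:

```python
import numpy as np

def map_linear_kernel(X, y, lam=1.0):
    """MAP estimate of a linear stimulus-response mapping.
    Model class: r = X w; noise: Gaussian; prior: w ~ N(0, (1/lam) I).
    The posterior mode solves argmin_w ||y - X w||^2 + lam ||w||^2,
    i.e. ridge regression (lam -> 0 gives normalized reverse correlation).
    X: (n_samples, n_dims) stimulus ensemble; y: (n_samples,) responses."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```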
Multiclass Common Spatial Pattern for EEG based Brain Computer Interface with Adaptive Learning Classifier
In Brain Computer Interface (BCI), data generated from Electroencephalogram
(EEG) is non-stationary, has a low signal-to-noise ratio, and is contaminated
with artifacts. The Common Spatial Pattern (CSP) algorithm has proved
effective in BCI for extracting features in motor imagery tasks, but it is
prone to overfitting. Many algorithms have been devised to regularize CSP for
the two-class problem; however, they have not been effective when applied to
multiclass CSP. Outliers present in the data affect the extracted CSP features
and reduce the performance of the system. In addition, the non-stationarity
present in the features extracted by CSP poses a challenge in classification.
We propose a method to identify and remove artifacts present in the data
during the pre-processing stage; this helps in calculating eigenvectors, which
in turn generate better CSP features. To handle the non-stationarity, the
Self-Regulated Interval Type-2 Neuro-Fuzzy Inference System (SRIT2NFIS) was
proposed in the literature for the two-class EEG classification problem. This
paper extends SRIT2NFIS to the multiclass setting using Joint Approximate
Diagonalization (JAD). The results on a standard dataset from BCI Competition
IV show a significant increase in accuracy over current state-of-the-art
methods for multiclass classification.
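For context, a hedged sketch of the standard two-class CSP that the paper's JAD-based multiclass extension generalizes (trial layout and normalization are illustrative assumptions): spatial filters come from a generalized eigendecomposition of the class covariance matrices.

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_filters=3):
    """Two-class CSP. trials_*: lists of (channels, samples) arrays.
    Solves C_a w = lambda (C_a + C_b) w; eigenvectors at the extreme
    eigenvalues maximize variance for one class while minimizing it
    for the other."""
    def avg_cov(trials):
        return np.mean([t @ t.T / np.trace(t @ t.T) for t in trials], axis=0)
    Ca, Cb = avg_cov(trials_a), avg_cov(trials_b)
    vals, vecs = eigh(Ca, Ca + Cb)          # generalized eigenproblem
    order = np.argsort(vals)
    pick = np.r_[order[:n_filters], order[-n_filters:]]
    return vecs[:, pick].T                  # (2*n_filters, channels) filters
```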
On the resolution of misspecified convex optimization and monotone variational inequality problems
We consider a misspecified optimization problem that requires minimizing a
function f(x;q*) over a closed and convex set X where q* is an unknown vector
of parameters that may be learnt by a parallel learning process. In this
context, we examine the development of coupled schemes that generate iterates
{x_k, q_k} such that, as k goes to infinity, {x_k} converges to x*, a minimizer
of f(x;q*) over X, and {q_k} converges to q*. In the first part of the paper, we
consider the solution of problems where f is either smooth or nonsmooth under
various convexity assumptions on the function f. In addition, rate statements
are also provided to quantify the degradation in rate resulting from the learning
process. In the second part of the paper, we consider the solution of
misspecified monotone variational inequality problems to contend with more
general equilibrium problems as well as the possibility of misspecification in
the constraints. We first present a constant steplength misspecified
extragradient scheme and prove its asymptotic convergence. This scheme is
reliant on problem parameters (such as Lipschitz constants) and leads us to
present a misspecified variant of iterative Tikhonov regularization. Numerics
support the asymptotic and rate statements.
Comment: 35 pages, 5 figures
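The flavor of such coupled schemes can be sketched as follows (a sketch under assumptions: `proj_X`, `grad_x`, and `learn_step` are hypothetical placeholders for the projection onto X, the gradient of f in x, and one step of the parallel learning process; the constant steplength only loosely mirrors the paper's setting):

```python
import numpy as np

def coupled_scheme(grad_x, learn_step, proj_X, x0, q0,
                   gamma=0.1, n_iters=500):
    """Coupled iterates {x_k, q_k}: a projected gradient step in x is
    taken at the current, still-misspecified estimate q_k, while a
    parallel learning process drives q_k toward q*."""
    x, q = np.asarray(x0, float), np.asarray(q0, float)
    for k in range(n_iters):
        x = proj_X(x - gamma * grad_x(x, q))  # optimize at current q_k
        q = learn_step(q)                     # learning step: q_k -> q*
    return x, q
```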
Bridging belief function theory to modern machine learning
Machine learning is a quickly evolving field which now looks very different
from what it was 15 years ago, when classification and clustering were the
major issues. This document proposes several trends for exploring the new
questions of modern machine learning, with the underlying conviction that the
belief function framework has a major role to play.
Correlated Parameters to Accurately Measure Uncertainty in Deep Neural Networks
In this article a novel approach for training deep neural networks using
Bayesian techniques is presented. The Bayesian methodology allows for an easy
evaluation of model uncertainty and additionally is robust to overfitting.
These are commonly the two main problems that classical, i.e. non-Bayesian,
architectures have to struggle with. The proposed approach applies variational
inference in order to approximate the intractable posterior distribution. In
particular, the variational distribution is defined as a product of multiple
multivariate normal distributions with tridiagonal covariance matrices. Each
single normal distribution belongs either to the weights or to the biases
corresponding to one network layer. The layer-wise a posteriori variances are
defined based on the corresponding expectation values, and the correlations
are further assumed to be identical. Therefore, only a few additional
parameters need to be optimized compared to non-Bayesian settings. The novel
approach is successfully evaluated on the basis of the popular benchmark
datasets MNIST and CIFAR-10.
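To illustrate the variational family (a sketch with an illustrative parameterization; the paper's tying of layer-wise variances to expectation values and its shared correlation are not reproduced here), sampling from a multivariate normal with tridiagonal covariance is cheap because its Cholesky factor is lower bidiagonal:

```python
import numpy as np

def sample_tridiag_normal(mean, diag_var, offdiag_cov, n_samples=1):
    """Sample from N(mean, Sigma) with tridiagonal Sigma: diag_var
    (length d) on the diagonal, offdiag_cov (length d-1) on the first
    off-diagonals. Sigma is assumed positive definite. The Cholesky
    factor of a tridiagonal matrix is lower bidiagonal, so a banded
    factorization would make this O(d); the dense call below is kept
    for clarity."""
    d = len(mean)
    Sigma = (np.diag(diag_var)
             + np.diag(offdiag_cov, 1) + np.diag(offdiag_cov, -1))
    L = np.linalg.cholesky(Sigma)               # lower bidiagonal factor
    z = np.random.normal(size=(n_samples, d))
    return np.asarray(mean) + z @ L.T           # reparameterized samples
```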
Big Data Analytics for Dynamic Energy Management in Smart Grids
The smart electricity grid enables a two-way flow of power and data between
suppliers and consumers in order to facilitate the power flow optimization in
terms of economic efficiency, reliability and sustainability. This
infrastructure permits the consumers and the micro-energy producers to take a
more active role in the electricity market and the dynamic energy management
(DEM). The most important challenge in a smart grid (SG) is how to take
advantage of the users' participation in order to reduce the cost of power.
However, effective DEM depends critically on load and renewable production
forecasting. This calls for intelligent methods and solutions for the real-time
exploitation of the large volumes of data generated by a vast amount of smart
meters. Hence, robust data analytics, high performance computing, efficient
data network management, and cloud computing techniques are critical towards
the optimized operation of SGs. This research aims to highlight the big data
issues and challenges faced by the DEM employed in SG networks. It also
provides a brief description of the most commonly used data processing methods
in the literature, and proposes a promising direction for future research in
the field.
Comment: Published in Elsevier Big Data Research
A Survey on Data Collection for Machine Learning: a Big Data -- AI Integration Perspective
Data collection is a major bottleneck in machine learning and an active
research topic in multiple communities. There are largely two reasons data
collection has recently become a critical issue. First, as machine learning is
becoming more widely used, we are seeing new applications that do not
necessarily have enough labeled data. Second, unlike traditional machine
learning, deep learning techniques automatically generate features, which saves
feature engineering costs, but in return may require larger amounts of labeled
data. Interestingly, recent research in data collection comes not only from the
machine learning, natural language, and computer vision communities, but also
from the data management community due to the importance of handling large
amounts of data. In this survey, we perform a comprehensive study of data
collection from a data management point of view. Data collection largely
consists of data acquisition, data labeling, and improvement of existing data
or models. We provide a research landscape of these operations, provide
guidelines on which technique to use when, and identify interesting research
challenges. The integration of machine learning and data management for data
collection is part of a larger trend of Big data and Artificial Intelligence
(AI) integration and opens many opportunities for new research.
Comment: 20 pages
Connections Between Adaptive Control and Optimization in Machine Learning
This paper demonstrates many immediate connections between adaptive control
and optimization methods commonly employed in machine learning. Starting from
common output error formulations, similarities in update law modifications are
examined. Concepts in stability, performance, and learning, common to both
fields are then discussed. Building on the similarities in update laws and
common concepts, new intersections and opportunities for improved algorithm
analysis are provided. In particular, a specific problem related to higher
order learning is solved through insights obtained from these intersections.
Comment: 18 pages
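One illustrative parallel of the kind the paper examines (a hedged sketch, not taken from the paper): for a linear output-error model y = theta^T x, a plain stochastic-gradient update and a normalized adaptive-control update law differ only in a normalization term that bounds the parameter step for stability.

```python
import numpy as np

def sgd_update(theta, x, y, gamma=0.01):
    """Machine-learning view: SGD on the squared output error."""
    e = theta @ x - y                 # output error
    return theta - gamma * e * x

def normalized_adaptive_update(theta, x, y, gamma=0.01):
    """Adaptive-control view: normalized gradient update law; the
    1 + x^T x denominator keeps the parameter step bounded no matter
    how large the regressor x grows."""
    e = theta @ x - y
    return theta - gamma * e * x / (1.0 + x @ x)
```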