
    Big Learning with Bayesian Methods

    Explosive growth in data and the availability of cheap computing resources have sparked increasing interest in Big learning, an emerging subfield that studies scalable machine learning algorithms, systems, and applications with Big Data. Bayesian methods represent one important class of statistical methods for machine learning, with substantial recent developments on adaptive, flexible and scalable Bayesian learning. This article provides a survey of the recent advances in Big learning with Bayesian methods, termed Big Bayesian Learning, including nonparametric Bayesian methods for adaptively inferring model complexity, regularized Bayesian inference for improving flexibility via posterior regularization, and scalable algorithms and systems based on stochastic subsampling and distributed computing for dealing with large-scale applications.
    Comment: 21 pages, 6 figures
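    As a concrete illustration of the stochastic-subsampling idea, here is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), one representative scalable Bayesian algorithm, applied to a toy Gaussian mean model; the model, step size, and data are illustrative assumptions rather than anything prescribed by the survey:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=10_000)  # toy dataset
N, batch = len(data), 100

def grad_log_prior(theta):
    return -theta  # standard normal prior N(0, 1)

def grad_log_lik(theta, x):
    return x - theta  # per-point gradient for an N(theta, 1) likelihood

theta, samples = 0.0, []
for t in range(2_000):
    eps = 1e-3  # small constant step size (a decaying schedule is typical)
    idx = rng.choice(N, size=batch, replace=False)
    # Unbiased full-data gradient estimate from the mini-batch, rescaled by N/batch.
    grad = grad_log_prior(theta) + (N / batch) * grad_log_lik(theta, data[idx]).sum()
    # Langevin update: a gradient step plus Gaussian noise of variance eps.
    theta += 0.5 * eps * grad + np.sqrt(eps) * rng.normal()
    samples.append(theta)

print("posterior mean estimate:", np.mean(samples[500:]))
```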

    Indirect inference through prediction

    By recasting indirect inference estimation as a prediction problem rather than a minimization, and by using regularized regressions, we can bypass the three major problems of estimation: selecting the summary statistics, defining the distance function, and minimizing it numerically. By substituting classification for regression we can extend this approach to model selection as well. We present three examples: a statistical fit, the parametrization of a simple real business cycle model, and heuristics selection in a fishery agent-based model. The outcome is a method that automatically chooses summary statistics, weighs them, and uses them to parametrize models without running any direct minimization.
    Comment: Rmarkdown code to replicate the paper is available at https://www.dropbox.com/s/zk0fi8dp5i18jav/indirectinference.Rmd?dl=
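    A minimal sketch of the prediction-based idea under illustrative assumptions (an AR(1) simulator, a deliberately redundant set of candidate summary statistics, and a cross-validated ridge regression from statistics to parameters); the paper's own replication code is the Rmarkdown file linked above, so everything named here is hypothetical:

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(1)

def simulate_ar1(rho, n=200):
    """Toy simulator: AR(1) process x_t = rho * x_{t-1} + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.normal()
    return x

def summaries(x):
    """A redundant set of candidate summary statistics."""
    return [x.var(), np.corrcoef(x[:-1], x[1:])[0, 1], x.mean(), np.abs(x).mean()]

# Step 1: simulate a training set of (summary statistics, parameter) pairs.
thetas = rng.uniform(-0.9, 0.9, size=500)
X = np.array([summaries(simulate_ar1(r)) for r in thetas])

# Step 2: regress parameters on statistics; the regularization weighs the
# statistics automatically, so none need to be hand-picked.
model = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X, thetas)

# Step 3: predict the parameter directly from the observed data's statistics,
# with no direct minimization anywhere.
observed = simulate_ar1(0.6)
print("estimated rho:", model.predict([summaries(observed)])[0])
```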

    The unified maximum a posteriori (MAP) framework for neuronal system identification

    The functional relationship between an input and a sensory neuron's response can be described by the neuron's stimulus-response mapping function. A general approach for characterizing the stimulus-response mapping function is called system identification. Many different names have been used for the stimulus-response mapping function: kernel or transfer function, transducer, spatiotemporal receptive field. Many algorithms have been developed to estimate a neuron's mapping function from an ensemble of stimulus-response pairs. These include the spike-triggered average, normalized reverse correlation, linearized reverse correlation, ridge regression, local spectral reverse correlation, spike-triggered covariance, artificial neural networks, maximally informative dimensions, kernel regression, boosting, and models based on leaky integrate-and-fire neurons. Because many of these system identification algorithms were developed in other disciplines, they appear superficially very different and seem to bear little relationship to one another. Each algorithm makes different assumptions about the neuron and how the data are generated. Without a unified framework it is difficult to select the algorithm best suited to estimating the neuron's mapping function. In this review, we present a unified framework for describing these algorithms: maximum a posteriori (MAP) estimation. In the MAP framework, the implicit assumptions built into any system identification algorithm are made explicit in three MAP constituents: model class, noise distributions, and priors. Understanding the interplay between these three MAP constituents simplifies the task of selecting the most appropriate algorithm for a given data set. The MAP framework can also facilitate the development of novel system identification algorithms by incorporating biophysically plausible assumptions and mechanisms into the MAP constituents.
    Comment: affiliations change
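    To make the three MAP constituents concrete, the following sketch recovers ridge regression as the MAP estimate under a linear model class, Gaussian noise, and a Gaussian prior on the receptive-field weights; the synthetic stimulus ensemble and filter are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Model class: linear, response r = S @ k + noise.
n_samples, n_dims = 1_000, 20
S = rng.normal(size=(n_samples, n_dims))           # stimulus ensemble
k_true = np.sin(np.linspace(0, np.pi, n_dims))     # toy "receptive field"
r = S @ k_true + rng.normal(scale=0.5, size=n_samples)  # Gaussian noise

# Prior: zero-mean Gaussian on k with precision lam -> quadratic penalty.
lam = 10.0

# MAP estimate: argmax_k [log likelihood + log prior] has the closed form
# k_map = (S^T S + lam * I)^{-1} S^T r, i.e. exactly ridge regression.
k_map = np.linalg.solve(S.T @ S + lam * np.eye(n_dims), S.T @ r)

print("error vs true filter:", np.linalg.norm(k_map - k_true))
```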

    Multiclass Common Spatial Pattern for EEG based Brain Computer Interface with Adaptive Learning Classifier

    In a Brain Computer Interface (BCI), data generated from the Electroencephalogram (EEG) are non-stationary, have a low signal-to-noise ratio, and are contaminated with artifacts. The Common Spatial Pattern (CSP) algorithm has proven effective in BCI for extracting features in motor imagery tasks, but it is prone to overfitting. Many algorithms have been devised to regularize CSP for the two-class problem, but they have not been effective when applied to multiclass CSP. Outliers present in the data affect the extracted CSP features and reduce the performance of the system. In addition, the non-stationarity of the features extracted by CSP presents a challenge in classification. We propose a method to identify and remove artifacts from the data during the pre-processing stage; this helps in calculating eigenvectors which in turn generate better CSP features. To handle the non-stationarity, the Self-Regulated Interval Type-2 Neuro-Fuzzy Inference System (SRIT2NFIS) was proposed in the literature for the two-class EEG classification problem. This paper extends SRIT2NFIS to the multiclass case using Joint Approximate Diagonalization (JAD). Results on a standard data set from BCI Competition IV show a significant increase in accuracy over current state-of-the-art methods for multiclass classification.
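    For reference, a minimal two-class CSP sketch via the generalized eigenvalue problem on class covariance matrices (synthetic EEG-like trials; the paper's multiclass extension replaces this pairwise step with Joint Approximate Diagonalization):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)

def class_covariance(trials):
    """Average normalized spatial covariance over trials shaped (n_trials, channels, samples)."""
    covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
    return np.mean(covs, axis=0)

# Synthetic two-class EEG-like trials: 8 channels, 50 trials x 200 samples,
# with opposite per-channel variance profiles so CSP has something to find.
trials_a = rng.normal(size=(50, 8, 200)) * np.linspace(1.0, 2.0, 8)[None, :, None]
trials_b = rng.normal(size=(50, 8, 200)) * np.linspace(2.0, 1.0, 8)[None, :, None]

Ca, Cb = class_covariance(trials_a), class_covariance(trials_b)

# CSP: solve the generalized eigenvalue problem Ca w = lambda (Ca + Cb) w.
# Filters at the extreme eigenvalues maximize variance for one class
# while minimizing it for the other.
eigvals, W = eigh(Ca, Ca + Cb)
filters = np.column_stack([W[:, :2], W[:, -2:]])  # two filters per extreme

def features(trial):
    """Log-variance of the spatially filtered trial."""
    return np.log(np.var(filters.T @ trial, axis=1))

print("class-A feature example:", features(trials_a[0]))
```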

    On the resolution of misspecified convex optimization and monotone variational inequality problems

    We consider a misspecified optimization problem that requires minimizing a function f(x;q*) over a closed and convex set X, where q* is an unknown vector of parameters that may be learnt by a parallel learning process. In this context, we examine the development of coupled schemes that generate iterates {x_k, q_k} such that, as k goes to infinity, {x_k} converges to x*, a minimizer of f(x;q*) over X, and {q_k} converges to q*. In the first part of the paper, we consider the solution of problems where f is either smooth or nonsmooth, under various convexity assumptions on f. In addition, rate statements are provided to quantify the degradation in rate resulting from the learning process. In the second part of the paper, we consider the solution of misspecified monotone variational inequality problems, to contend both with more general equilibrium problems and with the possibility of misspecification in the constraints. We first present a constant-steplength misspecified extragradient scheme and prove its asymptotic convergence. This scheme relies on problem parameters (such as Lipschitz constants), which leads us to present a misspecified variant of iterative Tikhonov regularization. Numerics support the asymptotic and rate statements.
    Comment: 35 pages, 5 figures
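    A minimal sketch of such a coupled scheme under illustrative assumptions: f(x;q) = ||x - q||^2 is minimized over the box X = [0,1]^n by projected gradient steps, while q_k is simultaneously learned from streaming noisy observations of q* by running averaging:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
q_star = rng.uniform(0.2, 0.8, size=n)  # unknown true parameter vector

def grad_f(x, q):
    return 2.0 * (x - q)  # gradient of f(x; q) = ||x - q||^2

def project(x):
    return np.clip(x, 0.0, 1.0)  # projection onto X = [0, 1]^n

x = np.zeros(n)
q = np.zeros(n)
for k in range(1, 5_001):
    # Parallel learning step: running average of noisy observations of q*.
    obs = q_star + rng.normal(scale=0.1, size=n)
    q += (obs - q) / k
    # Optimization step uses the *current* estimate q_k, never q* itself.
    x = project(x - 0.1 * grad_f(x, q))

# Here x* = q* because q* lies inside X, so the error below should be small.
print("||x_k - x*||:", np.linalg.norm(x - q_star))
```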

    Bridging belief function theory to modern machine learning

    Machine learning is a quickly evolving field which now looks markedly different from what it was 15 years ago, when classification and clustering were the major issues. This document proposes several trends worth exploring among the new questions of modern machine learning, in the firm conviction that the belief function framework has a major role to play.

    Correlated Parameters to Accurately Measure Uncertainty in Deep Neural Networks

    In this article a novel approach for training deep neural networks using Bayesian techniques is presented. The Bayesian methodology allows for an easy evaluation of model uncertainty and is additionally robust to overfitting, commonly the two main problems that classical, i.e. non-Bayesian, architectures struggle with. The proposed approach applies variational inference in order to approximate the intractable posterior distribution. In particular, the variational distribution is defined as a product of multiple multivariate normal distributions with tridiagonal covariance matrices. Each normal distribution corresponds either to the weights or to the biases of one network layer. The layer-wise a posteriori variances are defined based on the corresponding expectation values, and the correlations are further assumed to be identical. Therefore, only a few additional parameters need to be optimized compared to non-Bayesian settings. The novel approach is successfully evaluated on the popular benchmark datasets MNIST and CIFAR-10.
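    A minimal sketch of the key sampling step implied by this structure: drawing weights from a multivariate normal whose covariance is tridiagonal, with off-diagonal entries tied to the variances through a single shared correlation (the exact construction here is an illustrative assumption, not the paper's parametrization):

```python
import numpy as np

rng = np.random.default_rng(5)

def sample_tridiag_gaussian(mean, var, rho, n_samples=1):
    """Sample from N(mean, C) where C is tridiagonal:
    C[i, i] = var[i], C[i, i+1] = rho * sqrt(var[i] * var[i+1])."""
    d = len(mean)
    C = np.diag(var).astype(float)
    off = rho * np.sqrt(var[:-1] * var[1:])
    C[np.arange(d - 1), np.arange(1, d)] = off
    C[np.arange(1, d), np.arange(d - 1)] = off
    # Cholesky sampling; with equal variances, |rho| < 0.5 guarantees
    # positive definiteness by diagonal dominance.
    L = np.linalg.cholesky(C)
    return mean + rng.normal(size=(n_samples, d)) @ L.T

# One "layer" of weights: per-weight means and variances, one shared correlation.
mean = rng.normal(size=10)
var = np.full(10, 0.05)
w = sample_tridiag_gaussian(mean, var, rho=0.3, n_samples=4)
print(w.shape)  # (4, 10) -- four weight samples for a Monte Carlo estimate
```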

    Big Data Analytics for Dynamic Energy Management in Smart Grids

    The smart electricity grid enables a two-way flow of power and data between suppliers and consumers in order to facilitate power flow optimization in terms of economic efficiency, reliability and sustainability. This infrastructure permits consumers and micro-energy producers to take a more active role in the electricity market and in dynamic energy management (DEM). The most important challenge in a smart grid (SG) is how to take advantage of the users' participation in order to reduce the cost of power. However, effective DEM depends critically on load and renewable production forecasting. This calls for intelligent methods and solutions for the real-time exploitation of the large volumes of data generated by a vast number of smart meters. Hence, robust data analytics, high performance computing, efficient data network management, and cloud computing techniques are critical towards the optimized operation of SGs. This research aims to highlight the big data issues and challenges faced by the DEM employed in SG networks. It also provides a brief description of the most commonly used data processing methods in the literature, and proposes a promising direction for future research in the field.
    Comment: Published in ELSEVIER Big Data Research

    A Survey on Data Collection for Machine Learning: a Big Data -- AI Integration Perspective

    Data collection is a major bottleneck in machine learning and an active research topic in multiple communities. There are largely two reasons data collection has recently become a critical issue. First, as machine learning becomes more widely used, we are seeing new applications that do not necessarily have enough labeled data. Second, unlike traditional machine learning, deep learning techniques automatically generate features, which saves feature engineering costs, but in return may require larger amounts of labeled data. Interestingly, recent research in data collection comes not only from the machine learning, natural language, and computer vision communities, but also from the data management community, owing to the importance of handling large amounts of data. In this survey, we perform a comprehensive study of data collection from a data management point of view. Data collection largely consists of data acquisition, data labeling, and improvement of existing data or models. We provide a research landscape of these operations, offer guidelines on which technique to use when, and identify interesting research challenges. The integration of machine learning and data management for data collection is part of a larger trend of Big Data and Artificial Intelligence (AI) integration and opens many opportunities for new research.
    Comment: 20 pages

    Connections Between Adaptive Control and Optimization in Machine Learning

    This paper demonstrates many immediate connections between adaptive control and optimization methods commonly employed in machine learning. Starting from common output error formulations, similarities in update law modifications are examined. Concepts in stability, performance, and learning that are common to both fields are then discussed. Building on the similarities in update laws and common concepts, new intersections and opportunities for improved algorithm analysis are provided. In particular, a specific problem related to higher order learning is solved through insights obtained from these intersections.
    Comment: 18 pages
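    To illustrate the kind of connection examined, the sketch below shows that, for a linear-in-parameters output error, the classic gradient adaptive law and online gradient descent on the squared error yield literally the same update (the plant, regressor, and gains are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
theta_star = np.array([1.5, -0.7])  # unknown plant parameters
theta_ml = np.zeros(2)   # "machine learning" estimate, updated by SGD
theta_ac = np.zeros(2)   # "adaptive control" estimate, gradient adaptive law
gamma = 0.05             # learning rate / adaptation gain

for t in range(2_000):
    x = rng.normal(size=2)   # regressor (input features)
    y = theta_star @ x       # plant output
    # SGD on the instantaneous loss 0.5 * e^2 with output error e = theta^T x - y:
    e_ml = theta_ml @ x - y
    theta_ml -= gamma * e_ml * x
    # Discrete-time gradient adaptive law, theta_dot = -gamma * e * x:
    e_ac = theta_ac @ x - y
    theta_ac -= gamma * e_ac * x

# For this output-error formulation the two updates coincide exactly.
print(np.allclose(theta_ml, theta_ac), theta_ml)
```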