Evaluation of machine-learning methods for ligand-based virtual screening
Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and here we discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it differs little in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training set contains only very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed.
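The group-fusion idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes binary fingerprints stored as sets of on-bit indices, the Tanimoto coefficient as the similarity measure, and the maximum as the fusion rule; the abstract specifies none of these, and the molecule names and bit sets below are made up for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def group_fusion_scores(references, database, fuse=max):
    """Score each database molecule by fusing its similarities to all reference actives.

    `fuse` is the fusion rule (assumed here to be max; sum is another common choice).
    """
    return {
        name: fuse(tanimoto(fp, ref) for ref in references)
        for name, fp in database.items()
    }


def rank(scores):
    """Return molecule names ordered from most to least similar."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: two known actives and three database molecules.
references = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
database = {
    "mol_a": frozenset({1, 2, 3, 5}),
    "mol_b": frozenset({10, 11, 13}),
    "mol_c": frozenset({20, 21}),
}

scores = group_fusion_scores(references, database)
ranking = rank(scores)
print(ranking)
```

Because each database molecule is scored against its best-matching reference, structurally different actives can each pull their own neighbours to the top of the ranking, which is one plausible reading of why such a scheme copes with heterogeneous datasets.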
Bayesian kernel-based system identification with quantized output data
In this paper we introduce a novel method for linear system identification
with quantized output data. We model the impulse response as a zero-mean
Gaussian process whose covariance (kernel) is given by the recently proposed
stable spline kernel, which encodes information on regularity and exponential
stability. This serves as a starting point to cast our system identification
problem into a Bayesian framework. We employ Markov Chain Monte Carlo (MCMC)
methods to provide an estimate of the system. In particular, we show how to
design a Gibbs sampler which quickly converges to the target distribution.
Numerical simulations show a substantial improvement in the accuracy of the
estimates over state-of-the-art kernel-based methods when employed in
identification of systems with quantized data. Comment: Submitted to IFAC SysId 201
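As a rough illustration of the setting in this abstract, the sketch below builds a stable spline prior covariance and passes a simulated output through a quantizer. It assumes the first-order stable spline form K(i, j) = scale * exp(-beta * max(i, j)) and a uniform rounding quantizer; the abstract states neither, and the function names, the impulse response and the parameter values are hypothetical. It does not implement the Gibbs sampler.

```python
import math


def stable_spline_kernel(n, beta, scale=1.0):
    """Assumed first-order stable spline kernel on lags 1..n.

    K[i][j] = scale * exp(-beta * max(i, j)), so the prior variance decays
    exponentially with the lag, which encodes exponential stability.
    """
    return [
        [scale * math.exp(-beta * max(i, j)) for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


def simulate_output(g, u):
    """Noise-free output of a linear system with impulse response g driven by input u."""
    return [
        sum(g[k] * u[t - k] for k in range(min(t + 1, len(g))))
        for t in range(len(u))
    ]


def quantize(y, step):
    """Assumed uniform quantizer: round y to the nearest multiple of step."""
    return step * round(y / step)


# Prior covariance of the impulse response (hypothetical hyperparameters).
K = stable_spline_kernel(n=4, beta=0.5)

# Hypothetical stable impulse response and a unit-impulse input.
g = [1.0, 0.6, 0.36, 0.216]
u = [1.0, 0.0, 0.0, 0.0]

y = simulate_output(g, u)                # unquantized output, equals g here
z = [quantize(v, step=0.5) for v in y]   # what the identification method would observe
print(z)
```

Only `z` would be available to the estimator, so the impulse response has to be inferred from coarse measurements; this is the gap the Bayesian treatment described above is meant to fill.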
Parametric and nonparametric inference in equilibrium job search models
Equilibrium job search models allow for labor markets with homogeneous workers and firms to yield nondegenerate wage densities. However, the resulting wage densities do not accord well with empirical regularities. Accordingly, many extensions to the basic equilibrium search model have been considered (e.g., heterogeneity in productivity or heterogeneity in the value of leisure). It is increasingly common to use nonparametric forms for these extensions and, hence, researchers can obtain a perfect fit (in a kernel smoothed sense) between theoretical and empirical wage densities. This makes it difficult to carry out model comparison of different model extensions. In this paper, we first develop Bayesian parametric and nonparametric methods which are comparable to the existing non-Bayesian literature. We then show how Bayesian methods can be used to compare various nonparametric equilibrium search models in a statistically rigorous sense.
The Harmonic Analysis of Kernel Functions
Kernel-based methods have been recently introduced for linear system
identification as an alternative to parametric prediction error methods.
Adopting the Bayesian perspective, the impulse response is modeled as a
non-stationary Gaussian process with zero mean and with a certain kernel (i.e.
covariance) function. Choosing the kernel is one of the most challenging and
important issues. In the present paper we introduce the harmonic analysis of
this non-stationary process, and argue that this is an important tool which
helps in designing such a kernel. Furthermore, this analysis also suggests an
effective way to approximate the kernel, which reduces the
computational burden of the identification procedure.
EigenGP: Gaussian Process Models with Adaptive Eigenfunctions
Gaussian processes (GPs) provide a nonparametric representation of functions.
However, classical GP inference suffers from high computational cost for big
data. In this paper, we propose a new Bayesian approach, EigenGP, that learns
both basis dictionary elements--eigenfunctions of a GP prior--and prior
precisions in a sparse finite model. It is well known that, among all
orthogonal basis functions, eigenfunctions can provide the most compact
representation. Unlike other sparse Bayesian finite models where the basis
function has a fixed form, our eigenfunctions live in a reproducing kernel
Hilbert space as a finite linear combination of kernel functions. We learn the
dictionary elements--eigenfunctions--and the prior precisions over these
elements as well as all the other hyperparameters from data by maximizing the
model marginal likelihood. We explore computational linear algebra to simplify
the gradient computation significantly. Our experimental results demonstrate
improved predictive performance of EigenGP over alternative sparse GP methods
as well as the relevance vector machine. Comment: Accepted by IJCAI 201
